Analytical ultracentrifugation as a tool in the studies of aggregation of the fluorescent marker, Enhanced Green Fluorescent Protein

Enhanced green fluorescent protein (EGFP) is a fluorescent marker used in bio-imaging applications, including as an indicator of folding or aggregation of a fused partner. However, the limited maturation, low folding efficiency, and presence of non-fluorescent states of EGFP can influence the interpretation of experimental data. Analytical ultracentrifugation was used to measure the aggregation associated with de novo folding of EGFP from a high GdnHCl concentration. Absorption detection at 280 nm allowed monitoring of monomers and aggregated forms. Fluorescence detection enabled the observation of only properly folded molecules with a functional chromophore. The results showed intensive aggregation of EGFP at low concentrations of GdnHCl, with a continuous distribution of aggregated forms. The properly folded monomers with a mature chromophore were fluorescent, while the conglomerates of EGFP molecules were not. These facts are essential for a proper interpretation of data obtained with EGFP labelling.


INTRODUCTION
Wild type green fluorescent protein (wtGFP) is a small, 27 kDa globular protein, isolated from the Pacific jellyfish, Aequorea victoria (Shimomura et al., 1962). The native structure consists of 11 β-strands forming a β-barrel and a single α-helix running inside the molecule. In the centre of the protein a unique chromophore is formed after protein folding upon cyclisation, dehydration, and oxidation of residues Ser65-Tyr66-Gly67 (Reid & Flynn, 1997; Bartkiewicz et al., 2018). This chromophore is responsible for wtGFP fluorescence in the range of visible light. After pioneering experiments in which the fluorescent protein was expressed in nematode cells (Chalfie et al., 1994), wtGFP became the source of a large number of mutants with altered spectral and biophysical properties. They developed into widespread biomolecular markers. Enhanced GFP (EGFP, S65T/F64L-GFP) belongs to this group of mutants and has better fluorescent properties than wtGFP (i.e. 35-times higher fluorescence intensity, quantum yield of 0.6) (Cormack et al., 1996; Tsien, 1998).
EGFP fluorescence is used in bio-imaging in vitro and in vivo (Chalfie & Kain, 2006), in a number of biotechnological and biophysical applications, including observation of gene expression, protein-protein interactions, localisation and migration of proteins (e.g. Tsien, 1998;Skaar et al., 2015;Tan et al., 2018). It can also be an indicator of folding (Chang et al., 2005) or aggregation of a fused partner (Cabantous et al., 2008;Gregoire & Kwon, 2012;Higgins et al., 2018). Data based on green emission of EGFP, used as an aggregation indicator, can be interpreted in two ways. On the one hand, a decrease in fluorescence was interpreted as aggregation of fused partners (Cabantous et al., 2008;Gregoire & Kwon, 2012), on the other hand the observed green emission in aggregates was proof of the target protein's presence (Higgins et al., 2018). In the first case, it was assumed that the decrease in fluorescence was directly related to the aggregation of the fused partner, not GFP. In the second case, it was not obvious that lack of fluorescence indicated that the target protein aggregates were not present. In both cases, the data describing the behaviour of the fused partners could be affected by the photophysical and biochemical properties of EGFP. There are still unanswered questions about how the measured parameters (such as fluorescence intensity and diffusion coefficient) can be properly interpreted considering the behaviour of EGFP, which is dependent on the environment (e.g. pH, ionic strength, buffer composition).
In this context, the fluorescence properties of EGFP are important parameters that serve as a direct measure of protein behaviour. All EGFP molecules are assumed to emit a signal in the visible range of electromagnetic radiation. However, fluorescent proteins can exist in non-fluorescent states caused by the photophysical properties of the chromophore. For example, only the anionic form of the EGFP chromophore is fluorescent. Therefore, it is sensitive to the pH of the environment, and its brightness (defined as the product of extinction coefficient and quantum yield) decreases with decreasing pH. This is significant when this biomarker is found in different cell compartments, especially those with acidic pH (Patterson et al., 1997). Additionally, the low folding efficiencies reported for EGFP (Krasowska et al., 2014), including incomplete chromophore maturation, can influence interpretation of the data obtained for fused proteins (Dunsing et al., 2018). Thus, detailed knowledge of EGFP fluorescent properties is necessary to properly interpret data obtained when EGFP is used as a marker. The analytical ultracentrifugation method was used in this study to measure the aggregation accompanying de novo folding of EGFP, obtained from inclusion bodies and dissolved in a buffer with 6 M guanidine hydrochloride (GdnHCl). Absorption detection at 280 nm allowed for the observation of the monomeric and aggregated protein. The more sensitive fluorescence detection (excitation at 488 nm, emission in the range 505-565 nm) enabled identification of the molecules with a functional chromophore that was formed after the proper folding of EGFP. The obtained results show that these techniques can be useful as a tool in the studies of aggregation of enhanced green fluorescent protein and as proof of the existence of its non-fluorescent states.

EGFP preparation.
The protein was expressed in E. coli (BL21) strains transformed with the pRSET B plasmid containing the EGFP gene with a 6xHis-tag. Half (50%) of the EGFP was found in inclusion bodies. In contrast to the soluble fluorescent fraction, this protein had no chromophore. It was purified in the unfolded form in a phosphate buffer at pH 8.0 with 6 M GdnHCl and 300 mM NaCl using the IMAC method with a gradient of imidazole, as described earlier (Krasowska et al., 2010). The imidazole was then removed by washing the sample with imidazole-free buffer on Amicon filters, and the protein was concentrated to ~1 mM. It was stored in a phosphate buffer, pH 8.0, with 6 M GdnHCl and 300 mM NaCl. The folding of this EGFP was initiated by diluting the sample at least 100x in a 50 mM phosphate buffer, pH 8.0, containing 300 mM NaCl, 14 mM β-mercaptoethanol (Krasowska et al., 2014) and the desired concentration of denaturant, giving a final GdnHCl concentration in the range from 0.06 M (the residual concentration of GdnHCl) to 2.5 M. The measurement conditions varied from those in which the protein folds or aggregates immediately to those in which it should remain unfolded (high concentrations of GdnHCl). Dilution was performed by rapidly injecting the buffer into the tube containing a small volume of concentrated protein.
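The dilution arithmetic behind these numbers can be checked with a short sketch. It assumes an exactly 100-fold dilution into a denaturant-free buffer and a protein stock of exactly 1 mM (the text gives "at least 100x" and "~1 mM"); the function name `dilute` is hypothetical.

```python
def dilute(stock_conc, dilution_factor):
    """Return the concentration after diluting a stock by the given factor."""
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return stock_conc / dilution_factor


# Assumption: exactly 100-fold dilution into buffer without GdnHCl.
residual_gdnhcl_M = dilute(6.0, 100)   # 6 M GdnHCl stock
# Assumption: protein stock of exactly 1 mM, expressed here in uM.
protein_uM = dilute(1000.0, 100)

print(f"residual GdnHCl: {residual_gdnhcl_M:.2f} M")
print(f"protein after dilution: {protein_uM:.0f} uM")
```

Under these assumptions the residual denaturant is 0.06 M, which matches the lower limit of the GdnHCl range quoted above, and the protein concentration is 10 μM.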
The protein found in the soluble fraction was in a folded form with a mature fluorescent chromophore and was purified, washed and concentrated analogously to the unfolded EGFP in a 50 mM phosphate buffer at pH 8.0 and 300 mM NaCl, but without GdnHCl.
Absorption and fluorescence characteristics of EGFP. The absorption spectrum of the purified EGFP sample was measured on a Cary 100 UV-Vis spectrophotometer (Agilent Technologies, Australia) using a quartz cuvette. The concentration of the protein was calculated using an extinction coefficient of 21 900 M⁻¹ cm⁻¹ at 280 nm for folded EGFP (Mach et al., 1992) and 19 800 M⁻¹ cm⁻¹ at 280 nm for denatured protein (Gill & von Hippel, 1989); in the stationary absorption measurements it was 8.4 μM and 8.2 μM, respectively. Stationary fluorescence spectra were recorded using an LS-55 spectrofluorometer (Perkin-Elmer, UK). The quartz cuvette (Hellma, Germany) had a path length of 2 mm for excitation and 5 mm for emission. Samples with a concentration of 0.1 μM (folded and denatured EGFP) were excited at 295 nm or 489 nm, and spectra were recorded at a bandwidth of 2.5 nm for both excitation and emission in the range of 300-550 nm and 500-550 nm, respectively. The measurements were performed at 20°C in 50 mM phosphate buffer, pH 8.0, with 300 mM NaCl for native fluorescent protein; for unfolded EGFP the buffer additionally contained 6 M GdnHCl.
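The concentration calculation follows the Beer-Lambert law, c = A / (ε·l). The sketch below is only an illustration: the absorbance readings and the 1 cm path length are hypothetical values chosen for the example, not data from this study. Only the two extinction coefficients come from the text.

```python
EPS_FOLDED = 21900.0     # M^-1 cm^-1 at 280 nm, folded EGFP (from the text)
EPS_DENATURED = 19800.0  # M^-1 cm^-1 at 280 nm, denatured EGFP (from the text)


def concentration_uM(absorbance, epsilon_M_cm, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in micromolar."""
    if epsilon_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (epsilon_M_cm * path_cm) * 1e6


# Hypothetical A280 readings in an assumed 1 cm cuvette.
folded_uM = concentration_uM(0.184, EPS_FOLDED)
denatured_uM = concentration_uM(0.162, EPS_DENATURED)

print(f"folded EGFP: {folded_uM:.2f} uM")
print(f"denatured EGFP: {denatured_uM:.2f} uM")
```

Because the two extinction coefficients differ by about 10%, using the wrong one for a given folding state would shift the calculated concentration by a similar amount.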
Analytical ultracentrifugation with absorbance (AUC) or fluorescence (F-AUC) detection system. Sedimentation velocity (SV) experiments were conducted in an Optima XL-I analytical ultracentrifuge equipped with the absorbance (Beckman Coulter, USA) and fluorescence (Aviv Biomedical, Inc., USA) detection systems in a four- or eight-position AN-Ti rotor. The fluorescence detection system contains a 10 mW laser emitting at a wavelength of 488 nm. The protein samples were loaded into a double-sector 1.2 cm cell with an epon charcoal centrepiece and either quartz or sapphire windows. For absorbance measurements, a sample solution (390 μl) and a reference buffer (400 μl) were loaded into the right and the left sector, respectively. In the fluorescence measurements, no reference solution was required, hence both sectors were used for the protein samples, with up to 14 samples per run in an eight-hole rotor.
After equilibration at 20°C and 3 000 or 5 000 rpm, the radial calibration was performed both for absorbance and fluorescence measurements. In the fluorescence measurements, photomultiplier voltage and gain were adjusted for each cell, and focus scans were conducted for all samples. An appropriate focusing depth was selected to maximise the signal, typically around 5 000 µm. After initial procedures, the rotor was stopped, and the temperature was equilibrated to 20°C. The ultracentrifuge was then accelerated to 50 000 rpm and radial absorption at 280 nm or fluorescence scans of protein-concentration profiles in the cell were collected. The samples were measured about three hours after sample dilution (see EGFP preparation), because of the time needed for temperature equilibration. After each SV experiment, the samples were stored at 20°C in the AUC cells and measured again the next day.
Concentration of protein in AUC measurements. In statistical measurements of the sedimentation coefficient for folded, fluorescent EGFP, a concentration of 5 μM was used. It was a compromise between the detection accuracy of absorption (acceptable S/N ratio) and that of fluorescence (the signal intensity becomes too high at higher concentrations of fluorescent protein) (see Fig. 2).
De novo folding of EGFP was initiated by dilution of the unfolded protein solution containing 6 M GdnHCl to a protein concentration of 5 μM (fluorescence detection) or 10 μM (absorption detection) in 50 mM phosphate buffer with 300 mM NaCl and 14 mM β-mercaptoethanol and a final concentration of GdnHCl in the range of 0-2.5 M. In these measurements, part of the aggregated protein sedimented to the bottom of the cell during rotor acceleration. Thus, the initial EGFP concentration of 5 μM was too low to monitor the sedimentation in the absorption detection system, and the signal in statistical measurements was much noisier than for folded EGFP. Aggregation depends on protein concentration and could be more extensive at higher protein concentrations, but our main goal was to check whether aggregation of the enhanced green fluorescent protein can be observed in analytical ultracentrifugation experiments and whether the aggregates formed during EGFP folding are fluorescent.
Data analysis. Density and viscosity of buffers with different GdnHCl concentrations and the partial specific volume of EGFP (0.72689 cm³/g, from amino acid composition) were calculated using the SednTerp programme (Hayes et al., 1995; Hayes et al., 2006).
The sedimentation velocity profiles were analysed using the Sedfit programme with the continuous sedimentation coefficient distribution c(s) model (Schuck, 2000; Lebowitz et al., 2002). Meniscus positions and frictional ratios were treated as adjustable parameters in the nonlinear regression of c(s).
A sedimentation coefficient distribution c(s) can be defined according to Equation 1:
$$\alpha(r,t) = \int c(s)\,\chi(s,D(s),r,t)\,ds + \varepsilon \qquad (1)$$
where α(r,t) is the experimentally observed signal, with an error of measurement ε; c(s) is the concentration of species with sedimentation coefficients between s and s+ds; and χ(s,D(s),r,t) is the Lamm equation solution.
The integration of c(s) in the range between s and s+ds gives the total value of signal corresponding to the number of molecules with average sedimentation coefficient s. This value is called population of molecules, c, for calculated sedimentation coefficient. This value is expressed in absorption or fluorescence units, depending on the detection system.
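The integration of a c(s) peak can be illustrated numerically. The sketch below is not Sedfit output: it builds a synthetic distribution from two Gaussian peaks (a hypothetical monomer at 2.6 S carrying 80% of the signal and a hypothetical larger species at 4.2 S carrying 20%) and integrates it with the trapezoidal rule. The peak positions, widths and integration limits are all assumptions chosen for illustration.

```python
import math


def gaussian(s, centre, sigma, area):
    """Gaussian peak with the given integrated area."""
    norm = area / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((s - centre) / sigma) ** 2)


def integrate_peak(s_values, c_values, s_lo, s_hi):
    """Trapezoidal integral of c(s) over the points lying in [s_lo, s_hi]."""
    total = 0.0
    for i in range(len(s_values) - 1):
        if s_values[i] >= s_lo and s_values[i + 1] <= s_hi:
            width = s_values[i + 1] - s_values[i]
            total += 0.5 * (c_values[i] + c_values[i + 1]) * width
    return total


# Synthetic distribution on a 0-10 S grid (hypothetical, for illustration only).
s_grid = [i * 0.01 for i in range(1001)]
c_grid = [gaussian(s, 2.6, 0.15, 0.8) + gaussian(s, 4.2, 0.15, 0.2) for s in s_grid]

total_signal = integrate_peak(s_grid, c_grid, 0.0, 10.0)
monomer_signal = integrate_peak(s_grid, c_grid, 2.0, 3.2)
monomer_fraction = monomer_signal / total_signal

print(f"monomer fraction: {monomer_fraction:.3f}")
```

The integrated value carries the units of the detector signal, so a fraction computed this way compares populations within one detection system only; absorbance and fluorescence populations are not directly interchangeable.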
All s-values were corrected to water at 20°C as a solvent (Equation 2):
$$s_{20,w} = s\,\frac{(1-\bar{v}\,\rho_{20,w})\,\eta_{b,T}}{(1-\bar{v}\,\rho_{b,T})\,\eta_{20,w}} \qquad (2)$$
where $s$ is the measured sedimentation coefficient value, $\rho_{b,T}$ is the buffer density, $\eta_{b,T}$ is the buffer viscosity, $\rho_{20,w}$ is the water density in standard solution conditions at 20°C, $\eta_{20,w}$ is the water viscosity in standard solution conditions at 20°C, and $\bar{v}$ is the partial specific volume of the protein.
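The standard correction to water at 20°C can be written as a small function. This is a minimal sketch of the usual s(20,w) formula, not the SednTerp or Sedfit implementation. The water density and viscosity are common reference values, the partial specific volume is the one quoted in the text, and the buffer density and viscosity in the example are hypothetical.

```python
RHO_20W = 0.99823    # g/cm^3, density of water at 20 C (reference value)
ETA_20W = 1.002      # cP, viscosity of water at 20 C (reference value)
VBAR_EGFP = 0.72689  # cm^3/g, partial specific volume quoted in the text


def s20w(s_obs, rho_buffer, eta_buffer, vbar=VBAR_EGFP):
    """Correct a measured sedimentation coefficient to water at 20 C.

    rho_buffer is in g/cm^3; eta_buffer must use the same unit as ETA_20W.
    """
    buoyancy_water = 1.0 - vbar * RHO_20W
    buoyancy_buffer = 1.0 - vbar * rho_buffer
    if buoyancy_buffer <= 0:
        raise ValueError("particle would not sediment in this buffer")
    return s_obs * (buoyancy_water * eta_buffer) / (buoyancy_buffer * ETA_20W)


# Hypothetical buffer that is denser and more viscous than water.
corrected = s20w(2.0, rho_buffer=1.05, eta_buffer=1.10)
print(f"s(20,w) = {corrected:.2f} S")
```

Both corrections act in the same direction for a dense, viscous buffer such as a GdnHCl solution: the measured s is smaller than s(20,w), so uncorrected values from different denaturant concentrations cannot be compared directly.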

Steady-state measurements
The enhanced green fluorescent protein (EGFP) found in inclusion bodies occurred in an unfolded form without a functional fluorescent chromophore. UV-Vis absorption and fluorescence spectra of this form confirmed that the chromophore was not present in the EGFP structure. The absorption spectra measured in the 240-550 nm range showed only one maximum at 280 nm resulting from the presence of aromatic amino acids (1 Trp and 11 Tyr, Fig. 1A). The fluorescence spectrum of the protein excited at 295 nm had only one maximum at about 350 nm (Fig. 1B, inset).
The protein present in the soluble fraction was folded and had a mature fluorescent chromophore. The absorption spectrum revealed an additional band at 489 nm (Fig. 1A). In the emission spectrum upon 295 nm excitation, the tryptophan fluorescence was suppressed, most probably because of Förster resonance energy transfer (FRET) between tryptophan and the chromophore in the β-barrel, owing to the small overlap between the tryptophan emission spectrum and the absorption spectrum of the EGFP chromophore (compare Figs. 1A and 1B). When Trp is excited at 295 nm in the denatured protein, Trp emission can be observed with maximum fluorescence at about 365 nm, while for the native protein in the same conditions the Trp peak is negligible and the chromophore emission dominates (Fig. 1B, inset). The green fluorescence of the chromophore was observed for excitation both at 295 nm (Trp absorption region) and at 489 nm (maximum of chromophore absorption, Fig. 1B).

Statistical measurements of EGFP sedimentation coefficient
The sedimentation coefficient of the folded, fluorescent EGFP monomer has been measured by several labs. The obtained value varies in the range 2.52 S-2.65 S (Vámosi et al., 2016;Zhao et al., 2013). For His-tagged EGFP, two values were reported: 2.81 S for absorption and 2.73 S for fluorescence detection systems (MacGregor et al., 2004).
To confirm these results and to compare the accuracy and repeatability of measurements made using the SV method with absorption or fluorescence detection, experiments with His-tagged EGFP were conducted (see example in Fig. 2). Each of the six measured samples contained native, properly folded and fluorescent protein at 5 μM concentration. The measurements were performed at 20°C and 50 000 rpm.
The sedimentation coefficients obtained using the two detection systems were equivalent within the range of the errors: 2.63±0.02 S for absorption and 2.64±0.08 S for fluorescence. The sedimentation coefficient calculated with HYDROPRO, 2.56 S (PDB ID: 2Y0G), was slightly lower than that obtained from the experiments.
The population of EGFP inferred from the distribution plot in the absorbance measurements showed that monomers represented 81±10% of absorbing objects in solution. However, the fluorescence detection system showed 93.7±1.1% of monomers in the mixture of fluorescent molecules. In the fluorescence detection system, a laser emitting at a wavelength of 488 nm was used. Therefore, only the molecules with a correctly formed chromophore were visible during fluorescence measurements. On the other hand, the absorption detection at 280 nm allowed observation of all the molecules that had aromatic amino acids in the primary structure. For this reason, molecules without functional chromophore were also detected.

The EGFP de novo folding in vitro and aggregation
Protein folding, chromophore maturation, and aggregation of EGFP occur on different time scales. Folding proceeds in microseconds, maturation takes hours, and aggregation takes days (Krasowska et al., 2014; Krasowska et al., 2010). Therefore, sedimentation velocity experiments were conducted over the span of a few days to monitor the aggregation that accompanied EGFP de novo folding as a function of decreasing concentration of GdnHCl. The existence of molecules with a functional fluorescent chromophore was monitored by observing the emission in the 505-565 nm range upon excitation at 488 nm. The presence of all forms of protein, including non-fluorescent or misfolded monomers and aggregates, was identified by measuring absorption at 280 nm.

AUC measurements in the absorption detection system
The dynamic process of aggregate formation began after mixing unfolded EGFP with the appropriate buffer containing no denaturant or a low concentration of it. The aggregates appeared and grew during the measurements. Thus, a one-day experiment was performed to establish the statistics of this process using the same conditions in six cells. A small volume of highly concentrated unfolded EGFP was rapidly diluted to a concentration of 10 μM in a buffer with 14 mM β-mercaptoethanol and without GdnHCl.
The s(20,w) was in most cases higher than for properly folded EGFP. The mean values of s(20,w) and c (the population of macromolecules) were calculated from the six measurements with the standard deviations, as follows: The standard deviation obtained for c was unsatisfactory, at 40%. This was likely due to the rapid protein aggregation, which was not reproducible. Thus, aggregation was a stochastic process, and it was not possible to accurately predict the yield of the correctly folded EGFP form in every cell. In addition, the largest aggregates, which fall to the bottom of the cell during the centrifuge acceleration, were not observed.
The value of the standardised sedimentation coefficient s(20,w) was higher than the sedimentation coefficient of properly folded EGFP, which was 2.63 S. In the absorption system, all molecules that possess aromatic amino acids, which absorb at 280 nm, were observed, so the presence of a mature chromophore in a folded molecule cannot be taken for granted. The peak obtained from data analysis and interpreted as a monomer may not represent an individual state of the EGFP molecule, but a dynamic mixture of different forms of the protein. Probably, not only folded monomers but also initial forms of aggregates (misfolded molecules prone to aggregation) are hidden under this one peak. The Sedfit programme could not separate these species.
Next, seven-day folding experiments were conducted, in which unfolded EGFP was initially rapidly diluted to a concentration of 10 μM in buffers with different concentrations of GdnHCl (0-2.5 M).
The population of protein remaining in the solution depends on the denaturant concentration. In all samples the initial concentration of protein was the same. A high concentration of GdnHCl hinders the folding and aggregation of EGFP. Consequently, in the presence of GdnHCl, a high population of protein remained in the solution (for 2 M GdnHCl see Fig. 3E). With a decrease in GdnHCl concentration, an acceleration in the folding and accompanying aggregation was observed. The huge conglomerates of EGFP aggregates sedimented to the bottom of the cell during rotor acceleration. The results presented in Fig. 3 show that the population of monomeric EGFP remaining in solution was much lower at low concentrations of GdnHCl than at 2 M GdnHCl (Fig. 3A, C). As expected, the monomer sedimentation coefficient (s) depended on the buffer density, viscosity and stage of the protein unfolding and varied with the concentration of GdnHCl. It is noteworthy that with the absorbance detection system, molecules with a lower sedimentation coefficient were observed in several, but not all, samples; these could be interpreted as unfolded molecules in the mixture. There were also species with higher sedimentation coefficients, probably dimers and larger aggregates of EGFP with an immature, non-fluorescent chromophore (Fig. 3A, C, and E).

AUC measurements in the fluorescence detection system
The fluorescence detection system showed mainly the stable monomers with mature chromophore, while the non-fluorescent conglomerates of EGFP molecules were not visible.
The analysis of data obtained for different GdnHCl concentrations showed that the fluorescent molecules were mainly monomers (99% of observed molecules, Fig. 3B, D, F for 0, 0.5, 2 M GdnHCl, respectively). There was a trace population of fluorescent protein aggregates (likely dimers). In contrast with absorption measurements, the calculated s(20,w) for fluorescent monomers was similar to that for folded EGFP with mature chromophore. For example, for protein in 2 M GdnHCl, s(20,w) was in the range of 1.78-1.83 S for absorption detection and 2.40-2.49 S for fluorescence (Table 1).
The folded monomers were very stable, and the sedimentation coefficient changed negligibly over time (Fig. 4A), but as the concentration of GdnHCl increased, a decrease in the fluorescent monomer content was observed (inset in Fig. 4B).

CONCLUSIONS
Analytical ultracentrifugation with a fluorescence detection system is useful for monitoring events during the folding of fluorescent proteins (FPs). Only mature, properly folded EGFP can be detected by this system, while unfolded or largely unfolded protein is not fluorescent but is detectable by absorption. The unfolded protein tends to form aggregates over time. The huge aggregates fall to the bottom of the cell when the centrifuge accelerates. Therefore, sedimentation velocity (SV) in combination with absorbance detection reports the s-value distribution of the total protein remaining in the solution, while SV with fluorescence detection shows only the sedimentation behaviour of the folded protein with an active fluorophore. A combination of these two detection methods could allow determination of the fraction of an EGFP sample that is active, the fraction of monomeric unfolded EGFP, and the fraction of aggregates in solution.
Our results showed a slow aggregation of EGFP in the solution with 1.6-2 M GdnHCl concentration. Aggregation accelerated at low concentrations of denaturant. The sedimentation coefficient s(20,w) changed from 1.66 S (for unfolded protein) to 2.8 S (for folded protein) in the 50 mM phosphate buffer at pH 8.0, with 300 mM NaCl and GdnHCl concentrations in the range of 0-2.5 M.
This study has shown that in experiments where EGFP is used as a non-interacting fluorescent biomarker, the existence of states without a mature emitting chromophore must be considered for a proper interpretation of data obtained with EGFP labelling. In addition, measurements in which the folding of an FP partner is monitored by fluorescence, and which disregard non-fluorescent FP forms, could lead to misinterpretation of the data. Thus, the aggregation of the marker itself should also be considered.

Table 1. Sedimentation coefficients s(20,w) [S] obtained for EGFP folding and aggregation in the sedimentation velocity measurements as a function of GdnHCl concentration for absorbance (A) or fluorescence (F) detection system. The unit [S] (Svedberg) is equal to 10⁻¹³ s.

Figure 4. Inset shows c as a function of GdnHCl.