Black rice cultivar from Java Island of Indonesia revealed genomic, proteomic, and anthocyanin nutritional value

1Research Center of Smart Molecule of Natural Genetics Resources, Brawijaya University, Indonesia; 2Department of Biology, Faculty of Mathematics and Natural Sciences, Brawijaya University, Indonesia; 3National OMICS Center, National Science and Development Agency, Pathum Thani, Thailand; 4Functional Ingredients and Food Innovation Research Group, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Pathum Thani, Thailand; 5School of Chemistry, Institute of Science, & Center for Biomolecular Structure, Function and Application, Suranaree University of Technology, Nakhon Ratchasima, Thailand; 6Department of Chemistry, Faculty of Mathematics and Natural Sciences, Brawijaya University, Indonesia

Anthocyanin accumulation is influenced by several environmental factors and regulated by genetic variability and transcription factors (Chin et al., 2016; Zheng et al., 2019). Well-studied genes related to anthocyanin production include structural genes encoding the enzymes phenylalanine ammonia-lyase (PAL), chalcone synthase (CHS), chalcone isomerase (CHI), anthocyanidin synthase (ANS), flavanone 3-hydroxylase (F3H), dihydroflavonol reductase (DFR), and anthocyanidin reductase (ANR) (Chen et al., 2013). Furthermore, these anthocyanin synthesis genes are regulated by several transcription factors, such as MYB, MYC, Rc, and C-S-A (Himi & Taketa, 2015). Red coleoptile (Rc) is a gene encoding a basic helix-loop-helix (bHLH) transcription factor protein mapped on chromosome 7 (Xu et al., 2015). The Rc gene is associated with the rice domestication process. The Rc gene has seven exons, and a mutation in Rc exon 7 causes a frameshift in the open reading frame, thereby producing a dysfunctional bHLH protein and switching off anthocyanin gene expression in white rice (Furukawa et al., 2006; Sweeney et al., 2006; Zhu et al., 2019). However, mutations and variations in other exons of the Rc gene are less well characterized. In their functional forms, bHLH proteins combine with WD40 and MYB to form an MBW (MYB, bHLH, and WD40) complex that activates anthocyanin gene expression, including that of DFR, LAR, ANS, ANR, and glycosyltransferase (GT) (Albert et al., 2014).
The GT gene encodes the glycosyltransferase protein that transfers sugar residues to anthocyanidins. Glycosyltransferases constitute a large group of proteins that have several functions in plant metabolism and physiology (Cao et al., 2008). The GT genes contribute to rice anther growth and development (Moon et al., 2013), hormone inactivation (Luang et al., 2013), structural polysaccharide formation, and anthocyanin biosynthesis (Sun et al., 2016; Wang et al., 2018; Liu et al., 2020). In anthocyanin synthesis, glycosyltransferases add a sugar group onto carbon number three of the anthocyanidin. Glycosylation stabilizes anthocyanins, allows them to be stored in the vacuole, and increases their solubility in water (Wang et al., 2018). Glycosyltransferase enzymes have been widely studied in many plants. Nevertheless, the genes that encode anthocyanin glycosyltransferases in rice have not been described.
Genomic variability also influences the anthocyanin content in rice and correlates with rice cultivars. Genomic variability can be assessed by targeting simple sequence repeats (SSR), which present some advantages over other markers. SSR markers are highly suitable for genetic diversity analysis and show high reproducibility (Chen et al., 2013). SSR markers were used to effectively assess the genomic variability of African cultivated and wild-type rice (Chen et al., 2017). Both of these rice populations showed SSR allele polymorphisms that separated them well. Park and others (Park et al., 2019) used sixteen selective SSR primers to evaluate the genetic diversity of several black-purple and red rice cultivars. The cultivars separated into several clades that were associated with their morphology and the region where they were grown. Differential gene expression might alter the proteomic profiles and anthocyanin accumulation in pigmented rice. Proteomic studies have been conducted to assess the differences in protein expression due to cultivar differences and environmental effects. Maksup and others (Maksup et al., 2017) compared the proteomic profiles of germinated and non-germinated brown rice and found that germinated brown rice exhibited higher protein expression than non-germinated brown rice. A previous study reported that at least six proteins related to anthocyanins were expressed in black but not white glutinous rice leaves (Phonsakhan & Kong-Ngern, 2015). Using sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) with Coomassie Brilliant Blue (CBB) staining, Sari and others (Sari et al., 2019b) detected specific protein bands in black rice seeds from Java that were not detected in white or red rice. However, SDS-PAGE with CBB staining is not the most sensitive method for measuring protein expression levels.
In the current study, ultra-high-resolution time-of-flight liquid chromatography-mass spectrometry (UHR-TOF LC-MS) of tryptic peptides was used to identify and quantify protein expression in five varieties of pigmented and non-pigmented rice from Java Island.
Hence, to evaluate the differences between Javanese pigmented and non-pigmented rice, we assessed the genomic variability using SSR markers and partial sequencing of genes related to anthocyanin synthesis (Rc and GT genes), examined proteomic profiles using UHR-TOF LC-MS, and measured their anthocyanin components.

Genomic analysis of pigmented rice from Java Island
Genomic DNA was extracted from 20-day-old seedling leaves of five cultivars of pigmented rice by the hexadecyltrimethylammonium bromide (CTAB) method (Fatchiyah et al., 2011). DNA concentration was measured with a NanoDrop spectrophotometer (ND-1000, NanoDrop Inc., USA). DNA quality was checked by electrophoresis on a 0.8% agarose gel in 1× TBE buffer (Tris base, boric acid, and EDTA, pH 8.3) at 100 V for 30 min, after which the gel was observed on a UV transilluminator. PCR was conducted to analyze the genomic variability of pigmented rice. The SSR primer set was taken from Chen and others (Chen et al., 2017), while the Rc gene (KX549256) and GT gene (XM015777298.2) primers were designed to bind at specific sites (Table S1 in Supplementary Data at https://ojs.ptbioch.edu.pl/index.php/abp). The 50 µL PCR reaction mix consisted of 25 µL of 2× GoTaq® Green Master Mix (Promega, Cat. M712), 0.2 µM of each primer, 50 ng/µL genomic DNA, and deionized water. The PCR program consisted of 35 cycles of 95°C for 30 s, 51-57°C for 30 s, and extension at 72°C for 45 s. The Rc and GT gene products were separated by electrophoresis on 1.5% agarose gels, while the SSR profiles were resolved on 5% non-denaturing polyacrylamide gels. All DNA gels were stained with ethidium bromide and then visualized on a UV transilluminator (BioRad, Cat. No. 161-0433).
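The reaction set-up arithmetic can be sketched as follows. This is only an illustrative calculation: the 10 µM primer stock and the 1 µL template volume are assumptions that are not stated in the protocol, while the 50 µL total, 25 µL master mix, 0.2 µM final primer concentration, and cycling times come from the text.

```python
def primer_volume_ul(final_um, total_ul, stock_um):
    """Solve C1*V1 = C2*V2 for the stock volume V1 (in microliters)."""
    return final_um * total_ul / stock_um


def water_volume_ul(total_ul, *component_volumes_ul):
    """Deionized water needed to bring the mix up to the total volume."""
    water = total_ul - sum(component_volumes_ul)
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


def cycling_seconds(step_seconds, cycles):
    """Total hold time of the repeated steps (ramp times are ignored)."""
    return sum(step_seconds) * cycles


TOTAL_UL = 50.0
MASTER_MIX_UL = 25.0
PRIMER_STOCK_UM = 10.0  # assumption: stock concentration is not given in the text
TEMPLATE_UL = 1.0       # assumption: template volume is not given in the text

primer_ul = primer_volume_ul(0.2, TOTAL_UL, PRIMER_STOCK_UM)
water_ul = water_volume_ul(TOTAL_UL, MASTER_MIX_UL, primer_ul, primer_ul, TEMPLATE_UL)
total_s = cycling_seconds([30, 30, 45], 35)

print(f"each primer: {primer_ul:.1f} uL, water: {water_ul:.1f} uL")
print(f"cycling hold time: {total_s / 60:.2f} min")
```

Under these assumptions each primer contributes 1 µL and water makes up the remaining 22 µL; the 35 cycles account for 3675 s of hold time.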

Extraction and determination of total protein from pigmented rice from Java Island
Total protein was extracted from the ground rice seeds in Tris-HCl buffer followed by 10% trichloroacetic acid (TCA)/acetone precipitation. Approximately 500 mg of pigmented rice powder was extracted with lysis buffer solution (20 mM Tris-HCl pH 8.0, 2% NP-40, 1 mM EDTA) and centrifuged at 10,000×g for 15 min. The supernatant was mixed with 10% TCA in cold acetone and incubated overnight at -20°C. The insoluble protein was washed with cold acetone five times and air-dried. The dried proteins were re-suspended in 0.5% SDS in Tris-HCl buffer, and the protein concentration was measured by the Lowry method with bovine serum albumin as a protein standard (Shen, 2019).
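Protein concentration from a colorimetric assay such as the Lowry method is usually read off a standard curve. A minimal sketch of that step is given below, assuming a linear response over the standard range; the absorbance values, the BSA concentrations, and the variable names are hypothetical and are not data from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration(absorbance, slope, intercept, dilution=1.0):
    """Invert the standard curve and correct for sample dilution."""
    return (absorbance - intercept) / slope * dilution


# Hypothetical BSA standards (ug/mL) and their absorbance readings.
bsa_ug_per_ml = [0, 50, 100, 200, 400]
absorbance = [0.02, 0.12, 0.22, 0.42, 0.82]

slope, intercept = fit_line(bsa_ug_per_ml, absorbance)
sample_ug_per_ml = concentration(0.32, slope, intercept)
print(f"sample: {sample_ug_per_ml:.1f} ug/mL")
```

Real Lowry curves flatten at high protein concentrations, so samples are normally diluted into the linear part of the curve before this calculation is applied.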

Tryptic digestion and analysis using Impact II UHR-TOF LC-MS
Five micrograms of crude protein were reduced using 5 mM dithiothreitol (DTT) in 10 mM ammonium bicarbonate at 60°C for 1 h and alkylated using 15 mM iodoacetamide (IAA) in 10 mM ammonium bicarbonate at room temperature for 45 min in the dark. Then, the samples were digested with sequencing-grade trypsin (Promega, Germany) at a protein to enzyme ratio of 1:25 for 4 h at 37°C. The peptides were dried at 30°C under vacuum and analyzed on an Impact II UHR-TOF MS System (Bruker Daltonics Ltd., Germany). Pigmented rice peptides were enriched on a C18, 5 µm, 100 Å PepMap 100 µ-precolumn (5 mm × 300 µm i.d.; Thermo Scientific, UK) and separated on an analytical column (75 µm i.d. × 15 cm) packed with Acclaim PepMap RSLC C18, 2 µm, 100 Å, nanoViper (Thermo Scientific, UK). Peptides were eluted with a gradient of 5-55% B over 30 min, with solvent A (0.1% formic acid) and solvent B (0.1% formic acid in 80% acetonitrile). The flow rate was 0.3 µL/min. Mass spectra (MS) and MS/MS spectra were acquired in the positive-ion mode over m/z 150-2200 with the Captive Spray source at 1.6 kV (Compass 1.9 for TOF Series software, Bruker Daltonics).
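To make the elution conditions concrete, the sketch below computes the solvent composition at a given time point, assuming the 5-55% B gradient is linear over the 30 min (the text does not state the gradient shape) and that solvent A contains no acetonitrile. Function names are illustrative only.

```python
def percent_b(t_min, start=5.0, end=55.0, duration=30.0):
    """Percentage of solvent B at time t for a linear gradient."""
    if not 0 <= t_min <= duration:
        raise ValueError("time is outside the gradient window")
    return start + (end - start) * t_min / duration


def percent_acetonitrile(t_min, acn_in_b=80.0):
    """Effective acetonitrile percentage; only solvent B carries acetonitrile."""
    return percent_b(t_min) * acn_in_b / 100.0


def gradient_volume_ul(flow_ul_per_min=0.3, duration=30.0):
    """Mobile-phase volume delivered over the whole gradient."""
    return flow_ul_per_min * duration


for t in (0, 15, 30):
    print(f"t={t:>2} min: {percent_b(t):.1f}% B, "
          f"{percent_acetonitrile(t):.1f}% acetonitrile")
print(f"volume delivered: {gradient_volume_ul():.1f} uL")
```

At the midpoint this gives 30% B, which corresponds to 24% acetonitrile, and the gradient delivers about 9 µL of mobile phase at 0.3 µL/min.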

Protein quantification and identification
Mass spectra of peptides were analyzed with MaxQuant 1.6.3.3 software and its associated Andromeda search engine (Tyanova et al., 2015). The peptide search parameters were a maximum of three missed cleavages, 0.07 Da and 0.006 Da as first and main search tolerances, 30 as threshold intensity, trypsin as digesting enzyme, carbamidomethylation of cysteines as a fixed modification, and oxidation of methionine and acetylation of the protein N-terminus as variable modifications. For protein identification, peptides of at least 7 amino acids and at least one unique peptide per protein were required.
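The identification criteria (peptides of at least 7 residues and at least one unique peptide per protein) can be illustrated with a simplified filter. This is a sketch only and not MaxQuant's actual protein-grouping logic; the `Peptide` record, the accessions, and the sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peptide:
    sequence: str
    proteins: tuple  # accessions of every protein the peptide maps to


def identified_proteins(peptides, min_length=7, min_unique=1):
    """Return accessions supported by enough unique, sufficiently long peptides."""
    unique_counts = {}
    for pep in peptides:
        if len(pep.sequence) < min_length:
            continue  # too short to be considered
        if len(pep.proteins) == 1:  # unique: maps to exactly one protein
            acc = pep.proteins[0]
            unique_counts[acc] = unique_counts.get(acc, 0) + 1
    return {acc for acc, n in unique_counts.items() if n >= min_unique}


peptides = [
    Peptide("AVLDGEK", ("P1",)),            # 7 residues, unique to P1
    Peptide("GLSK", ("P2",)),               # unique but too short
    Peptide("TTPELVNAAR", ("P2", "P3")),    # long enough but shared
]
print(identified_proteins(peptides))
```

In this toy input only P1 passes, because P2 and P3 have no unique peptide of sufficient length.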

Data analysis
The Venn diagram of total proteins in pigmented rice from Java Island was constructed with the Jvenn tool (http://jvenn.toulouse.inra.fr/app/index.html). A heat map was generated using the online Heatmapper software (http://heatmapper.ca), and the biological functions of each protein were predicted using the protein informatics resource (PIR) database (https://proteininformationresource.org/). Anthocyanin contents were reported as mean ± standard deviation. All of the experiments were conducted in triplicate and analyzed using either one-way analysis of variance (ANOVA) for cyanidin-3-O-glucoside among three black rice cultivars, or a t-test for other anthocyanins between BREJ F5 and BRWJ F15. A p-value of <0.05 was considered statistically significant. All genomic data were scored with the values 0 for a conserved band/sequence, 1 for similarity > variability, and 2 for similarity < variability (Supplementary Data, Table S2). Phylogenetic analyses were done using the Multi-Variate Statistical Package (MVSP) with the unweighted pair group method with arithmetic averages (UPGMA) applied to similarity coefficients.
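The UPGMA step can be sketched in plain Python as below. This is an illustrative re-implementation, not the MVSP procedure: the text does not say which similarity coefficient was used, so a simple matching coefficient is assumed, and the cultivar labels and 0/1/2 score vectors are hypothetical.

```python
from itertools import combinations


def simple_matching(a, b):
    """Fraction of scored characters on which two cultivars agree."""
    if len(a) != len(b):
        raise ValueError("score vectors must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def upgma(scores):
    """Return merge steps as (cluster_a, cluster_b, average_similarity)."""
    sim = {frozenset(pair): simple_matching(scores[pair[0]], scores[pair[1]])
           for pair in combinations(scores, 2)}

    def average_similarity(c1, c2):
        # Unweighted average over every member pair of the two clusters.
        values = [sim[frozenset((a, b))] for a in c1 for b in c2]
        return sum(values) / len(values)

    clusters = [(name,) for name in scores]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_similarity(*pair))
        merges.append((c1, c2, average_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band/sequence scores (0, 1, or 2) for four cultivars.
scores = {
    "cv1": [0, 1, 2, 0, 1, 0, 2, 1],
    "cv2": [0, 1, 2, 0, 1, 0, 2, 0],
    "cv3": [0, 0, 1, 0, 1, 0, 0, 1],
    "cv4": [0, 0, 1, 0, 2, 0, 1, 1],
}
merges = upgma(scores)
for left, right, similarity in merges:
    print(left, right, round(similarity, 3))
```

Each merge joins the two clusters with the highest average pairwise similarity, so the similarity values at which clusters join correspond to the branch points of the resulting dendrogram.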

Genomic characterization of local black rice from Java Island, Indonesia
To evaluate the genomic profiles of three black rice cultivars, the simple sequence repeat (SSR) patterns were assessed by 5% polyacrylamide gel electrophoresis. A preliminary study identified nine sets of SSR primers that showed polymorphic bands in black rice samples. A total of 74 SSR alleles were identified using nine SSR markers across the five cultivars of rice from Java Island, 29 of which were polymorphic (Fig. 1a). The SSR markers produced 100-800 bp bands, and specific bands were seen in three black rice cultivars, as shown in Fig. 1a. We found specific bands in all black rice samples, including RM318 and RM224 markers (both of which are 200 bp) in BRWJ, RM202 at around 350 bp in BREJ, and RM251 at around 500 bp in BRCJ and 800 bp in BREJ. A 100 bp RM1369 band was observed in BRCJ and BRWJ, and the 120 bp and 200 bp bands from RM223 appeared in the three cultivars of black rice. RM6364 showed a specific band at around 300-400 bp that was only identified in BRWJ, and 200-300 bp bands that were detected in BREJ and BRCJ.
The nucleotide sequences of the Rc and GT genes showed specific variations in the three black rice cultivars from Java Island. The Rc gene segment from exons 1-2 was amplified to give a 990 bp fragment (Fig. 1b-c).
All three cultivars of black rice had a c.1797A>T substitution in the Rc gene exon 1. Moreover, other mutations were found in Rc gene exon 1 in BRCJ, namely, a c.1781C>T substitution and a c.1782delC deletion, while a c.1800G>A substitution was detected in BRWJ. These particular mutations were not found in black rice hull from Cambodia (gene accession KX549109.1) and appear to be specific to the black rice cultivars from Java Island. The Rc gene exon 2 in the three black rice cultivars showed similar sequences, not found in white rice. The GT gene consisted of one exon of 1693 bp, as shown in Fig. 1d. A partial sequence of the GT gene covering 624 bp that encoded an N-terminal region of the glycosyltransferase protein was compared for the black, red, and white rice cultivars (Fig. 1d-e). We found some mutations in the BRWJ GT gene compared to the other cultivars, which included substitutions c.130G>C, c.135C>A, c.144C>A, c.148G>A, c.174C>T, c.264A>T, c.297C>A, c.366C>T, c.377G>A, c.499G>A, a c.162delA deletion and the insertions c.150_150insT, c.207_208insA, and c.695insG. The mutations observed in the BRWJ GT gene might be unique to this black rice cultivar. This study found several DNA polymorphisms in rice plants from Java Island occurring at 305, 328, 454, 479, 543, and 667 bp compared to the database sequence for black rice japonica. This sequence variation may be specific to the rice from Java Island. The genetic relationship between the three black rice cultivars compared to red rice and white rice is illustrated as a phylogenetic tree in Fig. 1f.
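The variant notation used above (for example c.1797A>T, c.1782delC, or c.207_208insA) can be read mechanically. The sketch below applies such variants to a short toy sequence; it handles only these three simple forms, uses 1-based positions counted from the first base of the toy string, and does not use the actual Rc or GT reference sequences.

```python
import re

SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")
DELETION = re.compile(r"^c\.(\d+)del([ACGT])$")
INSERTION = re.compile(r"^c\.(\d+)_(\d+)ins([ACGT]+)$")


def _check_ref(seq, pos, ref):
    """Make sure the stated reference base is really at the 1-based position."""
    if not 1 <= pos <= len(seq) or seq[pos - 1] != ref:
        raise ValueError(f"reference base {ref} not found at position {pos}")


def apply_variant(seq, variant):
    """Apply one simple coding-DNA variant to a sequence."""
    m = SUBSTITUTION.match(variant)
    if m:
        pos, ref, alt = int(m.group(1)), m.group(2), m.group(3)
        _check_ref(seq, pos, ref)
        return seq[:pos - 1] + alt + seq[pos:]
    m = DELETION.match(variant)
    if m:
        pos, ref = int(m.group(1)), m.group(2)
        _check_ref(seq, pos, ref)
        return seq[:pos - 1] + seq[pos:]
    m = INSERTION.match(variant)
    if m:
        left, right, inserted = int(m.group(1)), int(m.group(2)), m.group(3)
        if right != left + 1 or right > len(seq):
            raise ValueError("insertion must name two adjacent positions")
        return seq[:left] + inserted + seq[left:]
    raise ValueError(f"unsupported variant notation: {variant}")


def shifts_frame(reference, mutated):
    """True when the length change is not a multiple of three."""
    return (len(mutated) - len(reference)) % 3 != 0


toy = "ATGGCCAAGT"  # toy sequence, not a rice gene
print(apply_variant(toy, "c.4G>C"))
print(apply_variant(toy, "c.5delC"))
print(apply_variant(toy, "c.3_4insA"))
```

A substitution leaves the reading frame intact, whereas a single-base deletion or insertion shifts it, which is why such changes can disrupt the encoded protein.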
According to the SSR patterns and Rc and GT gene sequences, two black rice cultivars, BRWJ and BRCJ, clustered in one group with a similarity coefficient of 0.935, while black rice cultivar BREJ clustered with red rice and white rice in another group, with a similarity coefficient of 0.936. Genomic variability analysis showed that BRWJ had a high GT gene sequence variability, and in addition, BRWJ differed in color and morphological traits. BRWJ has purple color in the stem, auricle, leaf blade, bran, and whole grain (Fig. 1g). In comparison, BREJ and BRCJ are purple only in whole grains, primarily in the bran.

Anthocyanin compositions in local Javanese black rice
The black rice anthocyanin profile was analyzed with UHPLC-DAD, and the chromatogram of absorbance at 520 nm is shown in Fig. 2. Cyanidin-3-O-glucoside was identified in fractions of all three cultivars of black rice, including F5 in BREJ, F9 in BRCJ, and F15 in BRWJ (Fig. 2a-2c). In BREJ F5 and BRWJ F15, cyanidin, peonidin, and peonidin-3-O-glucoside were also found. Interestingly, the cyanidin level was higher in BRWJ F15 than in other cultivars, and the peonidin-3-O-glucoside level was higher in BREJ F5 than in other cultivars.

Proteomic profiling of local black rice from Java island
When the rice seed proteins were extracted and trypsin-digested, and the resulting peptides analyzed with Impact II UHR-TOF LC-MS, a total of 3434 proteins were identified in the pigmented rice from Java Island. Of these, 2316 proteins were conserved in all three black rice cultivars from Java Island. A Venn diagram compares the proteins identified in the three black rice cultivars from Java Island, red rice (RREJ), and white rice (WREJ) (Fig. 3a). Among the black rice cultivars, the most proteins were identified in BREJ, followed by BRWJ and BRCJ. Some proteins were identified in only one rice cultivar, including 166 in BREJ, 72 in BRCJ, and 88 in BRWJ. Figure 3b shows the numbers of common proteins between different cultivars. Out of 2124 proteins in pigmented rice, black rice BRWJ had 166 proteins in common with BRCJ, and 314 proteins in common with BREJ. Black rice BREJ and BRCJ had 266 proteins in common. In comparison, BRWJ had 15 proteins in common with RREJ, and 10 with WREJ.
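The counts reported in a Venn comparison of this kind reduce to set operations on the protein accessions identified per cultivar. A minimal sketch follows; the cultivar labels and accession sets are hypothetical placeholders rather than the study's protein lists.

```python
def shared_by_all(protein_sets):
    """Proteins identified in every cultivar."""
    return set.intersection(*protein_sets.values())


def unique_to(name, protein_sets):
    """Proteins identified in one cultivar and in no other."""
    others = set().union(
        *(ids for other, ids in protein_sets.items() if other != name))
    return protein_sets[name] - others


def shared_only_by(a, b, protein_sets):
    """Proteins common to cultivars a and b but absent from all the rest."""
    others = set().union(
        *(ids for other, ids in protein_sets.items() if other not in (a, b)))
    return (protein_sets[a] & protein_sets[b]) - others


# Hypothetical accession sets for three cultivars.
protein_sets = {
    "cultivar_A": {"P1", "P2", "P3", "P4"},
    "cultivar_B": {"P1", "P2", "P5"},
    "cultivar_C": {"P1", "P3", "P5", "P6"},
}
print(sorted(shared_by_all(protein_sets)))
print(sorted(unique_to("cultivar_A", protein_sets)))
print(sorted(shared_only_by("cultivar_A", "cultivar_B", protein_sets)))
```

Whether a pairwise "in common" count includes or excludes proteins also found in other cultivars changes the numbers, so the two definitions are kept as separate functions here.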
The relative expression levels of proteins related to anthocyanin biosynthesis in different rice cultivars are shown in Fig. 3c. Several enzymes related to anthocyanin biosynthesis were more abundant in BRWJ and BREJ than in other cultivars (Fig. 3c). These included phenylalanine ammonia-lyase (PAL2), OsCAD, flavonoid monoxydase, UDP-Glc dehydrogenase (UDP-Glc DH), 3-glycosyltransferase (Os3GT), acyltransferase, cycloartenol 24-C methyltransferase, chalcone isomerase, and UDP-Glc DH5. The predicted biological functions of the 2316 black rice proteins were clustered into 13 categories, including biological phase, cell proliferation, immune system process, metabolic process, multicellular organism process, rhythmic process, developmental process, response to the stimulus, biological regulation, reproduction, localization, nitrogen utilization, cellular component organization or biogenesis, and cellular process. Metabolic and cellular processes were the most common functions of black rice proteins (Fig. 3d).

DISCUSSION
Genomic profiling of Java black rice plants revealed some specific bands for RM markers, such as RM318, RM224, RM202, RM1369, and RM6364 in the case of BRWJ. Furthermore, RM202 showed a specific band in BREJ and RM251 showed specific bands in BRCJ and BREJ samples. RM223 product bands were conserved in the Java black rice cultivars. Fatimah and others (Fatimah et al., 2016) reported that the SSR markers with high numbers of polymorphic alleles in Indonesian paddy rice included RM162, RM287, RM541, RM144, RM474, and RM171. Another study (Ladjao et al., 2019) revealed that Toraja paddy rice showed high polymorphism in RM259, RM224, RM334, and RM552 products. The different polymorphic patterns of SSR rice markers may be associated with paddy rice traits. For instance, an RM224 pattern was linked to germination rate, shoot dry weight, and shoot length, while the RM223 pattern was related to seed weight, germination rate, shoot length, and seedling early vigor (Anandan et al., 2016). Fukuta and others (Fukuta et al., 2012) mapped RM1369 on chromosome 6 and correlated it with panicle weight in new African (NERICA) rice varieties. The genomic profiling of unique alleles related to specific rice traits in Oryza glaberrima or wild type African rice revealed bands of 170-187 bp in RM250, 117 bp in RM312, and 140-150 bp in RM223 markers (Chen et al., 2017). Karmakar and others (Karmakar et al., 2012) reported that 10 of 22 studied SSR markers showed unique alleles in the Bengal rice cultivar. The different SSR allele profile in Java black rice compared to other Asian rice cultivars may reflect adaptation to different environmental factors. The phylogenetic tree of five rice cultivars provided a model of the genetic relationship of these rice cultivars. The clustering of BREJ with WREJ might suggest that this white rice is a product of the BREJ black rice domestication process.
The domestication process, breeding activities, and landrace condition greatly impact the high similarity coefficient in genetic diversity analysis (Reig-Valiente et al., 2016). The similarity coefficient among the five rice cultivars from Java proved greater than 90%, suggesting that the different cultivars are closely related. Our study revealed that the black rice color was caused by anthocyanin accumulation. Figure 4 shows a summary of the genomic, proteomic and anthocyanin profile components that promote biological activities.
Anthocyanin accumulation in pigmented rice grains is regulated by several genes, such as the Rc and GT genes. The Rc gene encodes a basic helix-loop-helix (bHLH) protein that interacts with WD40 and MYB proteins to form the MYB-bHLH-WD40 (MBW) complex (Albert et al., 2014; Peña-Sanhueza et al., 2017). The MBW complexes switch the structural genes on and off, including those for enzymes involved in anthocyanidin synthesis (OsPAL, OsCHS, OsC4H, Os4CL, OsCHI, OsF3H, OsF3'H, OsANS, OsANR, and OsLDOX), decoration with sugars and methyl groups (Os3GT and OsMT), and anthocyanin transport (OsGST). Our study identified similar sequences of the Rc gene in three black rice cultivars, indicating that it has the potential to activate anthocyanin synthesis genes. Previous studies revealed that a 14-bp deletion mutation and a transversion <3% in exon 7 of the Rc gene inactivated DFR gene expression and resulted in white pericarp in rice (Sweeney et al., 2006; Maeda et al., 2014; Zhu et al., 2019). Another anthocyanin synthesis gene is the GT gene, which encodes a glycosyltransferase that can add sugar to stabilize the anthocyanin structure. Some black rice GT gene mutations might be related to specific morphological characters of BRWJ, which has purple leaves, blades, stems, and grains. Li and others (Li et al., 2017a) stated that overexpression of a GT gene was associated with higher anthocyanin contents in Arabidopsis thaliana. However, GT gene overexpression decreased glycosyltransferase activity and reduced color intensity in Rosa rugosa (Sui et al., 2019).
In the present study, proteomic data identified some transferases in black rice, including proteins with putative methyltransferase, acyltransferase, and glycosyltransferase activities. These transferases may contribute to anthocyanin modifications (Sasaki et al., 2014; Provenzano et al., 2014; Li et al., 2017b; Wang et al., 2018; Sui et al., 2019). We found that some proteins related to anthocyanin synthesis had higher expression in black rice than in white rice. Phenylalanine is an amino acid precursor for secondary metabolite synthesis in plants, including that of flavanols and anthocyanins (Cheng et al., 2014). Phenylalanine is processed to leucoanthocyanidin, which is oxidized to cyanidin by leucoanthocyanidin oxidase (LDOX) (Poustka et al., 2007). Fatchiyah and others described that black rice has higher phenylalanine, flavonoids, and leucoanthocyanidin than red and white rice. Cyanidin is glycosylated to cyanidin-3-O-glucoside by glycosyltransferase (Os3GT), while it is methylated to peonidin by a methyltransferase (Cheng et al., 2014; Provenzano et al., 2014; Olivas-Aguirre et al., 2016; Peng et al., 2017; Zheng et al., 2019b). The acyltransferase in black rice transfers an acyl group to produce more complex anthocyanins (Bontpart et al., 2015). Anthocyanins and proanthocyanidins are synthesized in different pathways. Proanthocyanidins are derived from leucoanthocyanidins and cyanidin through flavan-3-ol (Olivas-Aguirre et al., 2016). Anthocyanins and proanthocyanidins are carried to the vacuole with glutathione-S-transferase (OsGST) serving as their protein transporter (Gomez et al., 2011; Chanoca et al., 2015). Cyanidin-3-O-glucoside and peonidin-3-O-glucoside are the main anthocyanins in black rice (Hou et al., 2013; Pengkumsri et al., 2015; Pedro et al., 2016).
Similarly, the present study identified cyanidin-3-O-glucoside in the three kinds of black rice from Java, while other anthocyanins (cyanidin, peonidin, and peonidin-3-O-glucoside) were detected in BREJ F5 and BRWJ F15. The profiles of anthocyanins in black rice from Java might correlate with their biological function. Fatchiyah and others reported that BREJ and BRWJ extracts have high total anthocyanins and antioxidant activity. An in silico study showed that peonidin-3-O-glucoside may have anti-inflammatory activity via inhibiting the TNF-α receptor (Sari et al., 2019a). Cyanidin-3-O-glucoside and peonidin-3-O-glucoside are predicted to have an anti-apoptosis effect by inhibiting caspase-3. In vivo and in silico studies have shown that black rice anthocyanins act against obesity and adipogenesis (Fatchiyah et al., 2020c; Sari et al., 2020b). In the current study, anthocyanins were not detected in the RREJ red rice fraction containing pigment. Several studies identified malvidin (Chen et al., 2012) and proanthocyanidin, rather than anthocyanins, as the pigments found in high amounts in red rice (Vargas et al., 2018; Laokuldilok et al., 2011; Olivas-Aguirre et al., 2016).

Figure 4. Summary of genomic, proteomic, and anthocyanin profiles in black rice from Java Island. Numerical superscripts indicate the data derived from previous studies: 1 , 2 , 3 (Sari et al., 2020b), 4 (Sari et al., 2019a).

CONCLUSIONS
The three black rice cultivars demonstrated different genomic, proteomic, and anthocyanin profiles. The SSR profiles identified specific bands in the three black rice cultivars. The Rc gene exon 2 showed a similar sequence in all black rice cultivars from Java, while the GT gene carried several mutations that suggest a new variant of the gene in the BRWJ cultivar. Proteomic profiles revealed that the levels of proteins related to anthocyanin synthesis varied among black rice cultivars. Anthocyanins from Java black rice cultivars have shown some biological activities, and Java black rice can therefore be recommended as a functional food.