Biological functions of natural antisense transcripts

Natural antisense transcripts (NATs) are RNA molecules that originate from opposite DNA strands of the same genomic locus (cis-NAT) or unlinked genomic loci (trans-NAT). NATs may perform various regulatory functions at the transcriptional level via transcriptional interference. NATs may also regulate gene expression levels post-transcriptionally via induction of epigenetic changes or double-stranded RNA formation, which may lead to endogenous RNA interference, RNA editing or RNA masking. The true biological significance of natural antisense transcripts remains controversial despite many years of research. Here, we summarize the current state of knowledge and discuss the sense-antisense overlap regulatory mechanisms and their potential.


INTRODUCTION
Natural antisense transcripts (NATs) are separated into two main categories, cis or trans, depending on their genomic origin. The most popular definition of cis-NATs describes these molecules as RNA sequences that originate from the opposite DNA strand of the same genomic locus, such that they physically share some genetic sequence. The overlap may be complete (Fig. 1C) or partial (Fig. 1A, B), and it may also be described as head-to-head (Fig. 1A), tail-to-tail (Fig. 1B) or embedded (Fig. 1C) (Makalowska et al., 2005). The sequences of cis-NATs within the overlap region are perfectly complementary between the sense and antisense RNAs, unlike trans-NATs, where transcripts originate from different genomic loci (Fig. 1D) (Vanhée-Brossollet & Vaquero, 1998).
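The three cis overlap types can be told apart from genomic coordinates and strand alone. The following is a minimal sketch, not a published method: the `Transcript` type and the `classify_cis_overlap` function are hypothetical names introduced for illustration, and coordinates are assumed to be inclusive positions on the forward strand of a single chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical minimal transcript record (illustration only)."""
    start: int   # leftmost genomic coordinate, inclusive
    end: int     # rightmost genomic coordinate, inclusive
    strand: str  # "+" or "-"


def classify_cis_overlap(a: Transcript, b: Transcript) -> str:
    """Classify the overlap of two transcripts on opposite strands.

    Returns "head-to-head" (5' ends overlap), "tail-to-tail" (3' ends
    overlap), "embedded" (one transcript lies fully within the other)
    or "none" (no shared sequence).
    """
    if {a.strand, b.strand} != {"+", "-"}:
        raise ValueError("cis-NAT pairs must lie on opposite strands")
    plus, minus = (a, b) if a.strand == "+" else (b, a)

    # No shared coordinates: not a cis sense-antisense pair.
    if plus.end < minus.start or minus.end < plus.start:
        return "none"

    plus_in_minus = minus.start <= plus.start and plus.end <= minus.end
    minus_in_plus = plus.start <= minus.start and minus.end <= plus.end
    if plus_in_minus or minus_in_plus:
        return "embedded"

    # The 5' end of a "+" transcript is its start; the 5' end of a "-"
    # transcript is its end. If the "-" transcript extends to the left of
    # the "+" transcript, the overlap contains both 5' ends.
    if minus.start < plus.start:
        return "head-to-head"
    return "tail-to-tail"
```

For instance, under these assumptions a forward-strand transcript at 100-500 and a reverse-strand transcript at 50-200 share their 5' regions and are classified as head-to-head.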
Almost half a century has passed since the first natural antisense RNAs were found within the b2 region of coliphage λ (Bovre & Szybalski, 1969). Bovre and Szybalski concluded that double-stranded RNA (dsRNA) may be produced from overlapping transcripts under some conditions. They also suggested that RNA polymerases transcribing opposite strands in the b2 region may collide with each other, which would lead to premature transcription termination (Bovre & Szybalski, 1969). Today this is known as polymerase collision, one of the transcriptional interference scenarios (Shearwin et al., 2005). Despite this early discovery, the golden age of natural antisense transcripts was still far ahead. Only 11 pairs of protein-coding genes with non-protein coding natural antisense counterparts were described until the late 1980s, all in viral genomes (Inouye, 1988). These studies consolidated the hypothesis that this type of genomic architecture was rather rare and probably limited to bacterial and viral genomes (Barrell et al., 1976; Sanger et al., 1977; Szekely, 1977). However, the first discoveries of overlapping genes in eukaryotic genomes were published in 1986. Henikoff and co-workers found that the pupal cuticle protein (Pcp) gene of Drosophila melanogaster was embedded on the opposite DNA strand of the Gart gene within its first intron (Henikoff et al., 1986). Further overlapping genes in fruit fly (Spencer et al., 1986) and mouse (Williams & Fried, 1986) were identified in the same year. The first overlapping pairs were found in humans and yeast three years later (van Duin et al., 1989).

BIOLOGICAL FUNCTIONS OF NATURAL ANTISENSE TRANSCRIPTS
The biological significance of natural antisense transcripts remains controversial despite many years of research. Some groups describe natural antisense transcripts as transcriptional noise with the potential to acquire a secondary function. Other groups posit that this latent regulatory potential is underestimated and should be considered as another level of gene expression regulation. The overlap between natural sense and antisense transcripts may regulate expression at the transcriptional level (via transcriptional interference) and/or post-transcriptional level. Regulation may be achieved via modulation of chromatin changes by NATs or the formation of double-stranded RNA, which leads to RNA masking, RNA interference or RNA editing (Faghihi & Wahlestedt, 2009; Lu et al., 2012; Celton et al., 2014). Expression level regulation in cis, described in sections 1 and 2, may occur at the transcriptional and post-transcriptional levels. In contrast, regulation in trans, discussed in section 3, may act only post-transcriptionally (Vanhée-Brossollet & Vaquero, 1998).

Transcriptional interference
Natural antisense transcripts may regulate expression at the level of transcription via transcriptional interference (TI). This term describes the down-regulatory influence of two ongoing transcription processes in relatively close proximity (Shearwin et al., 2005). TI may result in a transcriptional downturn, transcriptional inhibition or early transcription termination. Four main mechanisms of transcriptional interference have been proposed. The first, promoter competition, occurs when promoter regions overlap and transcription may start at only one of them at a time (Fig. 2A). The second mechanism is called "sitting duck interference", and it describes a situation in which RNA polymerase II (RNAPII) progresses too slowly to the elongation phase and is dislodged by another RNA polymerase II (Fig. 2B). The third scenario, occlusion, involves the temporary blocking of transcription initiation at a particular promoter region by the ongoing elongation of RNAPII originating from a different promoter (Fig. 2C). The last mechanism, polymerase collision, occurs when two RNAPIIs, which are transcribing genes in opposite directions, block each other's passage and collide in a head-to-head manner (Fig. 2D). Shearwin et al. thoroughly discussed these mechanisms in a review (Shearwin et al., 2005).
Transcriptional interference has been intensively studied in recent years. A transcriptional collision model was inferred from analyses of human and mouse genomes, which observed lower expression levels of sense-antisense transcripts with longer overlapping regions (Shearwin et al., 2005; Osato et al., 2007). RNAPII collisions were described in yeast in in vitro and in vivo models. RNAPIIs were temporarily suspended during transcriptional collision events, but the elongation complexes were stable, which extended the half-life of the RNAPII involved in this process (Hobson et al., 2012). Notably, RNAPII collision was also linked with the "off-targeting" of the activation-induced cytidine deaminase (AID), which initiates somatic hypermutation (SHM) and class switch recombination (CSR) (Muramatsu et al., 2000; Meng et al., 2014; Pefanis et al., 2014). AID activity initiated on non-Ig targets is related to human B cell lymphomas (Alt et al., 2013). TI was also linked with Huntington's disease, in which CAG repeat expansion within the first exon of the huntingtin (HTT) gene is associated with the disease pathology (DiFiglia et al., 1997; Chung et al., 2011). HTT expression level is down-regulated by the huntingtin antisense transcript HTTAS via transcriptional interference and/or a Dicer-dependent mechanism. Growing CAG repeat expansion in huntingtin weakens the HTTAS promoter and lowers the antisense expression level, which results in the up-regulation of HTT in Huntington's disease patients (Chung et al., 2011). TI also regulates the expression level of the frequency (frq) gene in Neurospora crassa, which is a central component of the circadian clock (Xue et al., 2014; Cha et al., 2015). The expression of the antisense non-protein coding gene qrf leads to the premature transcription termination of the frq gene via transcriptional interference and mediation of chromatin modifications (Xue et al., 2014). The core circadian clock of animals is regulated in a similar manner (Koike et al., 2012; Menet et al., 2012; Vollmers et al., 2012).
How often transcriptional interference truly controls the expression level of genes is debatable (Hobson et al., 2012). This mechanism may control the vast majority of genes because recent studies discovered that antisense non-coding RNAs are counterparts of a substantial number of genes in animals (Lehner et al., 2002; Yelin et al., 2003; Chen et al., 2004; Lapidot & Pilpel, 2006; Conley et al., 2008) and plants (Yamada et al., 2003; Stolc et al., 2005; Li et al., 2006; Matsui et al., 2008; Lu et al., 2012; Luo et al., 2013). Models of transcriptional interference mostly suggest a negative correlation between sense and antisense RNA expression levels (Shearwin et al., 2005). However, studies indicate that overlapping transcripts generally do not exhibit expression level correlations, and where correlations exist, they tend to be positive rather than negative (Oeder et al., 2007; Grigoriadis et al., 2009; Conley & Jordan, 2012; Ling et al., 2013). Transcriptional interference may therefore not be a general regulator of gene expression, but it may subtly regulate the expression levels of at least some genes where regulatory functions have emerged (Brophy & Voigt, 2016).
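The correlation argument can be illustrated with a Pearson coefficient computed over matched sense and antisense expression values. This is only a sketch under stated assumptions: the `pearson` helper is a hypothetical name, the expression values would come from the reader's own data, and the cited studies may have used other correlation measures or normalizations.

```python
import math


def pearson(sense, antisense):
    """Pearson correlation between two equally long expression vectors.

    Each position is assumed to hold the expression of the sense and the
    antisense transcript in the same sample or tissue.
    """
    if len(sense) != len(antisense) or len(sense) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_s = sum(sense) / len(sense)
    mean_a = sum(antisense) / len(antisense)
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(sense, antisense))
    var_s = sum((s - mean_s) ** 2 for s in sense)
    var_a = sum((a - mean_a) ** 2 for a in antisense)
    if var_s == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant vectors")
    return cov / math.sqrt(var_s * var_a)
```

A coefficient near -1 would be the pattern expected if interference dominated a pair, whereas values near 0 or above 0 correspond to the absent or positive correlations reported for most overlapping transcripts.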

RNA masking
Simultaneous transcription of antisense RNAs may lead to the formation of double-stranded RNA, which may interfere with the accessibility of the target sequences of various miRNAs (Fig. 3A). This mechanism was discovered recently for the Sirt1 gene, which possesses a target sequence for miR-34a. The miRNA target sequence is located within the overlap region between the Sirt1 gene and its antisense, Sirt1-AS. This positioning results in competition between Sirt1-AS and miR-34a for Sirt1 transcript binding (Wang et al., 2016). Overexpression of Sirt1-AS stabilized Sirt1 mRNA and increased its half-life from 2 to 10 hours (Wang et al., 2016). Similarly, beta-secretase-1 (BACE1) expression level is negatively controlled by miR-485-5p binding and positively controlled by the formation of dsRNA by BACE1 sense and antisense (BACE1-AS) transcripts. Knockdown of BACE1-AS exhibits the same effect as silencing of BACE1 (Modarresi et al., 2011). The imbalance of BACE1, BACE1-AS and miR-485-5p expression leads to an up-regulation of BACE1, which was linked to pathological states in patients with Alzheimer's disease (Faghihi et al., 2008; Faghihi et al., 2010). NATs are also involved in Parkinson's disease, where the short splice variant of PTEN-induced putative kinase 1 (PINK1), called svPINK1, may form a dsRNA with its antisense, which is complementary to the sense RNA over almost its full length. dsRNA formation leads to stabilization of the sense transcript, and the antisense knockout resulted in the loss of the svPINK1 splice variant (Scheele et al., 2007).
Formation of dsRNA may also increase the stability of the involved RNA molecules by protecting them from digestion by ribonucleases that degrade single-stranded RNA, which was demonstrated for RNase E in the cyanobacterium Prochlorococcus sp. (Stazic et al., 2011). Sense-antisense duplexes may protect RNA molecules from entering nonsense-mediated decay (NMD), which was demonstrated in yeast (Wery et al., 2016). Protection against single-stranded RNases was also demonstrated for nds-2a, which is a stable, naturally occurring human dsRNA of sense-antisense transcription origin. Knockdown of nds-2a dsRNA using strand-specific locked nucleic acid (LNA) gapmers resulted in numerous mitotic-related effects, which suggests a functional role of these RNA duplexes (Portal et al., 2015). dsRNA formation may also influence alternative splicing of the rat and human α-thyroid hormone receptor (TRα) gene, which encodes two splice variants, TRα1 (active, hormone-binding variant) and TRα2 (inactive, non-hormone-binding variant). The NAT binds only to the longer, non-hormone-binding TRα2 splice variant. TRα2-NAT dsRNA formation is presumably responsible for the negative regulation of TRα2 expression level, which prevents the inactive variant expression (Munroe & Lazar, 1991; Hastings et al., 2000). Munroe and co-workers recently demonstrated that the TRα2 variant was not present in marsupials and platypus, and comparative analysis revealed that only TRα2 was adopted as a TRα expression level regulator in the eutherian lineage (Munroe et al., 2015).
Another example involves regulation of the E-cadherin protein complex, which plays a key role in cellular adhesion. Dysfunctions of this complex are associated with increased tumor metastasis (Beavon, 2000). Zeb2 is a transcriptional repressor of E-cadherin. Expression of the Zeb2 NAT induces an alternative splicing of Zeb2, which results in the retention of an intron containing an internal ribosome entry site (IRES) that is necessary for Zeb2 protein synthesis (Beltran et al., 2008).
RNA masking may also inhibit expression at the translational level, which was demonstrated for the BCMA gene. The amount of BCMA mRNA in cells is not altered by the expression level of the antisense (normal or increased expression). However, increased expression of BCMA-antisense RNA results in a lower amount of the BCMA protein (Hatzoglou et al., 2002). The functional significance of NATs at the translational level was demonstrated in more detail for PU.1 transcription factor expression regulation, where the PU.1 antisense interferes with the formation of the PU.1 elongating complex (eEF1A-mRNA) (Ebralidze et al., 2008).

RNA editing (A-to-I)
Natural antisense transcripts that form double-stranded RNA could become a target for the adenosine deaminase acting on RNA (ADAR) in a process called RNA editing (Fig. 3B). ADAR editing leads to adenosine (A) deamination into inosine (I), which is further interpreted as guanosine (G) by the cellular translational and splicing machinery (Nigita et al., 2015). Peters and co-workers investigated the 162-nt long overlap region of the 4f-rnp and sas-10 overlapping gene sequences in Drosophila melanogaster and discovered that approximately 20% of the 4f-rnp and sas-10 transcripts show marks of RNA editing at random positions (Peters et al., 2003). The extent to which RNA editing plays a biologically significant role in natural antisense transcripts is not fully understood (Wight & Werner, 2013). Nevertheless, thousands of human, mouse and fly RNA editing sites have been reported in different contexts (Laganà et al., 2012; Wang et al., 2013; Ramaswami & Li, 2014; Zhang et al., 2016). Therefore, the discovery of the links of these sites with NATs on a broader scale is likely a matter of time.
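Because inosine is interpreted as guanosine, an edited site appears as an A-to-G change in the sequence that the cellular machinery effectively reads. The sketch below illustrates only that readout; `read_after_editing` is a hypothetical helper, positions are 0-based, and the edited positions are supplied by the caller rather than predicted.

```python
def read_after_editing(rna: str, edited_positions) -> str:
    """Return the sequence as read after A-to-I editing.

    Each edited adenosine (A) becomes inosine, which is read as
    guanosine (G). Positions are 0-based indices into `rna`.
    """
    bases = list(rna.upper())
    for pos in edited_positions:
        if not 0 <= pos < len(bases):
            raise ValueError(f"position {pos} is outside the sequence")
        if bases[pos] != "A":
            raise ValueError(f"position {pos} is not an adenosine")
        bases[pos] = "G"  # inosine is interpreted as guanosine
    return "".join(bases)
```

In this toy representation, editing the adenosine at index 3 of AUGAAC yields AUGGAC, which is how a single edit inside a sense-antisense duplex could alter a codon or a splice signal.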
Formation of dsRNA may protect natural antisense transcripts from single-stranded RNase activity and simultaneously limit the prevalence of NATs primarily to the nucleus. Formation or transport of a double-stranded RNA to the cytoplasm could trigger an immune response. Cellular machinery recognizes dsRNA as a viral infection and promotes the inhibition of protein synthesis and the transcriptional induction of interferon and other cytokines, which may ultimately lead to cell death (Wang & Carmichael, 2004; Kumar et al., 2004; Gantier & Williams, 2007). NAT-related dsRNAs are primarily located in the nucleus, possibly to avoid the above-mentioned immune response (Faghihi & Wahlestedt, 2006; van Heesch et al., 2014; Portal et al., 2015). The regulatory function of NATs in the cytoplasm is limited to regulation by endo-siRNAs, which are present in the nucleus and cytoplasm (Portal et al., 2015). However, stable sense-antisense duplexes that were predominantly located in the cytoplasm were also reported (Dallosso et al., 2007; Michael et al., 2011). Therefore, the extent to which the cellular interferon pathway is activated in response to naturally occurring dsRNAs is debatable (Wang & Carmichael, 2004).
Genome-wide analyses using RNA-Seq and single-stranded RNA-Seq protocols, strengthened by parallel analyses of small RNA or degradome sequencing, were proposed in recent years to expand our knowledge of the NATs involved in endogenous RNA interference (Lu et al., 2012; Li et al., 2013; Werner et al., 2014; Yu et al., 2016). These studies revealed that nearly 4% of the Arabidopsis thaliana cis-NATs produce putative endo-siRNAs, and approximately 200 of these endo-siRNAs exhibit relatively high expression levels of ≥ 10 RPKM (Reads Per Kilobase per Million mapped reads) (Li et al., 2013). Studies in orchid (Dendrobium officinale) identified 63 natural antisense transcripts that produced endo-siRNAs (Yu et al., 2016). A total of 2292 NATs were reported as a source of endo-siRNA in rice (Oryza sativa) (Lu et al., 2012). A large-scale analysis of the small RNA transcriptome in human embryonic kidney cells revealed that the sense-antisense transcription of 169 RefSeq genes may lead to the formation of endo-siRNAs. These endo-siRNAs were also mostly enriched by AGO1 and RNAPII, which correlated with the actively transcribed endo-siRNA precursors (Werner et al., 2014). Our understanding of the biological impact of these findings requires further study, but the use of next-generation sequencing for large-scale endo-siRNA studies has already revealed their widespread occurrence and regulatory potential.
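The RPKM unit mentioned above normalizes a raw read count by feature length and by sequencing depth. A minimal sketch of the standard formula follows; the function name `rpkm` and the example numbers are made up for illustration and are not taken from the cited studies.

```python
def rpkm(read_count: int, feature_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase per Million mapped reads.

    RPKM = read_count / (feature_length_bp / 1e3) / (total_mapped_reads / 1e6)
         = read_count * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)
```

With these hypothetical inputs, 500 reads on a 2,000-bp feature in a library of 25 million mapped reads correspond to 10 RPKM, the threshold quoted for the highly expressed Arabidopsis endo-siRNAs.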

Epigenetic modifications
Natural antisense transcripts may regulate expression levels of the sense genes and mediate chromatin modifications within the gene sequence, promoter or enhancer regions, the entire locus or even surrounding genomic loci (Li & Ramchandran, 2010; Halley et al., 2013; Wight & Werner, 2013). For example, expression of the HBA2 gene may be down-regulated by a repressive chromatin modification within the HBA2 promoter regions by its antisense transcript, LUC7 (Tufarelli et al., 2003). Another example is brain-derived neurotrophic factor (BDNF) and its antisense, BDNF-AS, which may play a role in the guidance, introduction and maintenance of the histone H3K27me3 modification within the BDNF locus (Modarresi et al., 2012b). BDNF-AS was associated with the recruitment of polycomb repressive complex 2 (PRC2), which locally induces the trimethylation of histone H3K27 within the locus, without exerting any effect on the surrounding loci (Modarresi et al., 2012b). Decreased expression of BDNF is associated with Alzheimer's, Parkinson's, and Huntington's diseases (Bathina & Das, 2015). Knockout of BDNF-AS results in an up-regulation of BDNF expression levels, which supports the therapeutic potential of this mechanism (Modarresi et al., 2012a). In contrast to BDNF-AS, which locally induces histone modifications, an antisense of the mouse Kcnq1 gene, Kcnq1ot1, may impact the entire Kcnq1 domain. Kcnq1ot1 is responsible for the recruitment of PRC2 and G9a methyltransferases and the induction of repressive histone modifications of the Kcnq1 gene and several up- and down-stream localized genes (Pandey et al., 2008). Notably, large-scale interactions of Kcnq1ot1 were present in a lineage-specific manner in mouse placenta, but not fetal liver. The regulatory significance of Kcnq1ot1 is emphasized by the insertion of a premature transcriptional stop signal and synthesis of a truncated Kcnq1ot1 transcript, which results in an up-regulation of all genes in the Kcnq1 domain (Mancini-Dinardo et al., 2006). NAT recruitment of PRC2 is also involved in the X-chromosome inactivation by X-inactive specific transcript (Xist) and its antisense, Tsix (Halley et al., 2013). Tsix biallelic expression before the X-inactivation leads to the silencing of Xist on both X chromosomes via H3K27me3 histone modifications over the Xist promoter regions (Ohhata et al., 2015). Tsix expression becomes monoallelic at the early stage of chromosome inactivation, which results in a de-repression of the Xist promoter and heterochromatization of the X chromosome on which Tsix was silenced (Lee et al., 1999; Ohhata et al., 2015). Tsix dysfunctions may lead to several X-linked diseases (Chaligné & Heard, 2014; Charles Richard & Ogawa, 2016).

NAT REGULATORY FUNCTIONS IN TRANS
Natural antisense transcripts that function in a trans arrangement (trans-NATs) have not been studied as intensively as cis-NATs. Trans-NAT sequences within the "overlap" region may not be fully complementary to the target sequence because these sequences originate from different genomic loci (Vanhée-Brossollet & Vaquero, 1998). However, this partial complementarity may still lead to the formation of double-stranded RNAs. Recent studies demonstrated that thousands of human (Szcześniak & Makałowska, 2016) and plant (Szcześniak et al., 2016) transcripts exhibit the potential to form lncRNA-RNA duplexes, and new functional trans-NATs are continuously being discovered (Roberts & Morris, 2013). Every DNA-mediated duplication and retrotransposition event is generally a source of a sequence that is complementary to the original sequence. The emerged copy possesses the potential for expression, which may lead to trans-NAT formation (Muro & Andrade-Navarro, 2010; Roberts & Morris, 2013). One well-studied example is the nitric oxide (NO) neurotransmitter in Lymnaea stagnalis, which is involved in long-term memory formation and associated with food-reward conditioning (Kemenes et al., 2002). NO production is catalyzed by NO-synthase (NOS), which is negatively regulated in trans by the NOS pseudogene antisense transcript (antiNOS-2) via dsRNA formation between NOS mRNA and antiNOS-2. Decreased antiNOS-2 expression levels facilitate memory formation in classical conditioning (Korneev et al., 1999; Korneev et al., 2002; Korneev et al., 2013). Pseudogenes in mouse oocytes are also a source of endo-siRNAs that originate from dsRNA that forms between parental mRNA and homologous pseudogene antisense sequences (Tam et al., 2008; Watanabe et al., 2008). The identified endo-siRNA complexes are Dicer-dependent and enriched with Ago2. Target sequence expression levels of the detected complexes increased in Dicer and Ago2 knockout mutants (Watanabe et al., 2008).
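The difference between the perfect complementarity of cis-NATs and the partial complementarity of trans-NATs can be quantified as the fraction of paired positions in an aligned region. The sketch below is a simplification under stated assumptions: `complementarity` is a hypothetical helper, both sequences are written 5' to 3' and have equal length, the alignment is ungapped, and only Watson-Crick pairs are counted (G-U wobble pairs are ignored).

```python
# Watson-Crick pairs only; G-U wobble pairing is deliberately ignored here.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def complementarity(sense: str, antisense: str) -> float:
    """Fraction of positions that base-pair in an ungapped antiparallel duplex.

    Both sequences are given 5'->3'. The antisense strand is reversed so
    that position i of the sense strand faces its pairing partner.
    """
    sense, antisense = sense.upper(), antisense.upper()
    if not sense or len(sense) != len(antisense):
        raise ValueError("sequences must be non-empty and of equal length")
    partner = antisense[::-1]  # read the antisense strand 3'->5'
    paired = sum((s, a) in WATSON_CRICK for s, a in zip(sense, partner))
    return paired / len(sense)
```

Under these assumptions a cis-NAT overlap scores 1.0, while a trans-NAT derived from a diverged pseudogene copy would score below 1.0 yet might still be able to form a duplex.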
Three potential endo-siRNA precursor regions, esiRNA1, esiRNA2 and esiRNA3, were identified in the human hepatocellular carcinoma pseudogene ψPPM1K sequence. Endo-siRNA may arise from these precursors in two ways. The first mechanism is related to esiRNA3 and based on dsRNA formation by ψPPM1K antisense and the cognate gene (PPM1K) transcripts. The second mechanism involves esiRNA1 and leads to endo-siRNA maturation from the hairpin structure formed by the ψPPM1K transcript. esiRNA1 may down-regulate the expression of the cognate PPM1K (protein phosphatase, Mg2+/Mn2+-dependent) gene and the NEK8 (NIMA-related kinase 8) gene. These effects were not observed in a ψPPM1K mutant with deletion of the esiRNA1 precursor region (Chan et al., 2013).
Trans-NATs also induce chromatin epigenetic changes. Methylation of the Oct4 gene promoter region by the recruitment of Ezh2 methyltransferase is linked to an antisense transcript of Oct4-pseudogene 5, asOct4-pg5. Separate knockdowns of Ezh2 and asOct4-pg5 resulted in Oct4 up-regulation via demethylation of its promoter regions. The down-regulatory influence of asOct4-pg5 was also RNAi independent (Hawkins & Morris, 2010).
Natural antisense transcripts are able to regulate the sense gene expression levels in both a cis and trans manner in some genomic arrangements. For example, the DHRS4 gene cluster is composed of three highly homologous genes, DHRS4, DHRS4L1 and DHRS4L2, and the DHRS4 gene is regulated in cis by AS1DHRS4, a head-to-head antisense transcript. AS1DHRS4 also regulates the DHRS4L1 and DHRS4L2 genes in trans. AS1DHRS4 controls the epigenetic silencing of all promoter regions within the DHRS4 gene cluster via interaction with EZH2 and G9a methyltransferases. Knockout of AS1DHRS4 increases the expression of genes in the DHRS4 gene cluster (Li et al., 2012).
The number of known pseudogenes that gained a new regulatory function by antisense transcription is small, but yearly discoveries refine our understanding of the potential of trans-NATs to regulate gene expression on another level.

CONCLUDING REMARKS
Natural antisense transcripts possess a great potential to regulate the sense gene expression at transcriptional and post-transcriptional levels, and their functional relevance is supported by the numerous reports of their tissue-specific expression (Lu et al., 2012;Conley & Jordan, 2012;Ling et al., 2013).
The biological significance of natural antisense transcripts will likely be debatable for a long time.However, our increasing understanding of the NAT biology sheds new light on the functional importance of antisense transcription, which may not regulate every single natural sense-antisense pair, but is surely essential for the proper functioning of all living organisms.

Figure 1. Types of natural antisense transcript overlap. (A) Head-to-head overlap in cis; (B) Tail-to-tail overlap in cis; (C) Embedded overlap in cis; (D) Overlap in trans; green and blue arrows represent transcripts originating from different genomic loci and forming double-stranded RNA. Full or partial complementarity between transcripts is indicated by a regularly spaced or disrupted "ladder" of grey vertical lines within the overlap region of cis and trans overlapping transcripts, respectively.

Figure 2. Mechanisms of transcriptional interference. (A) Promoter competition; (B) Sitting duck interference; (C) Occlusion; (D) Polymerase collision. RNAPII, RNA polymerase II. Blue/green boxes with arrows indicate the transcription direction and promoter regions of genes A and B, respectively. Arrows next to RNAPIIs indicate the premature end of transcription of a particular RNAPII. Based on Shearwin et al., 2005.

Figure 3. Regulatory roles of double-stranded RNA (dsRNA) formation. dsRNA formation may lead to: (A) RNA masking, which may protect the transcript from RNase activity, interfere with the translation and splicing machinery, or interfere with the accessibility of miRNA binding sites; (B) dsRNA editing by the adenosine deaminase acting on RNA (ADAR); (C) RNA interference by Dicer-dependent post-processing of dsRNA to short siRNA, followed by Argonaute (AGO) loading into the RNA-induced silencing complex.