Retroposition as a source of antisense long non-coding RNAs with possible regulatory functions

  • Oleksii Bryzghalov, Adam Mickiewicz University in Poznan, Poznan, Poland
  • Michał Wojciech Szcześniak, Adam Mickiewicz University in Poznan, Poznan, Poland
  • Izabela Makałowska, Adam Mickiewicz University in Poznan, Poznan, Poland
Keywords: lncRNAs, long non-coding RNAs, retroposition, retrocopies, antisense transcription, RNA, RNA duplexes


Long non-coding RNAs (lncRNAs) are a class of intensively studied yet enigmatic molecules that make up a substantial portion of the human transcriptome. In this work, we link the origins and functions of some lncRNAs to retroposition, a process resulting in the creation of intronless copies (retrocopies) of the so-called parental genes. We found 35 human retrocopies transcribed in antisense and giving rise to 58 lncRNA transcripts. These lncRNAs share sequence similarity with the corresponding parental genes but in the sense/antisense orientation, meaning they have the potential to interact with each other and to form RNA:RNA duplexes. We took a closer look at these duplexes and found that 10 of the lncRNAs might regulate parental gene expression and processing at the pre-mRNA and mRNA levels. Further analysis of the co-expression and expression correlation provided support for the existence of functional coupling between lncRNAs and their mate parental gene transcripts.
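The reasoning behind duplex formation can be illustrated with a small sketch: a retrocopy carries the sequence of its parental mRNA, so a transcript read from the opposite strand of the retrocopy is the reverse complement of part of that mRNA and can base-pair with it. The Python below is only an illustration under simplifying assumptions, not the pipeline used in this work: the sequences are invented toy data, the function names `reverse_complement` and `longest_duplex` are hypothetical, and only perfect Watson-Crick pairing is considered (no G:U wobble pairs, mismatches, or thermodynamics).

```python
# Watson-Crick complements for RNA (G:U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def longest_duplex(sense: str, antisense: str) -> int:
    """Length of the longest perfectly paired stretch between two RNAs.

    Two RNAs pair where one reads as the reverse complement of the other,
    so this is the longest common substring of `sense` and
    reverse_complement(antisense), found by dynamic programming.
    """
    target = reverse_complement(antisense)
    best = 0
    prev = [0] * (len(target) + 1)
    for s in sense.upper():
        curr = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if s == t:
                # Extend the paired stretch that ended at the previous positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Toy sequences, invented for illustration only.
parental_mrna = "AUGGCUAGCUUAGGCAUCG"
# A transcript read from the opposite strand of a retrocopy covering
# positions 3..14 of the parental mRNA.
antisense_lncrna = reverse_complement(parental_mrna[3:15])

print(longest_duplex(parental_mrna, antisense_lncrna))
```

In this toy case the antisense transcript pairs with the parental mRNA along its entire length. Real retrocopies diverge from their parental genes over time, so actual duplexes would typically be shorter and interrupted by mismatches.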

