StructAnalyzer - a tool for sequence vs. structure similarity analysis

  • Jakub Wiedemann, Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 2, 60-965 Poznan, Poland
  • Maciej Miłostan, Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 2, 60-965 Poznan, Poland, and Institute of Bioorganic Chemistry, Polish Academy of Sciences, Z. Noskowskiego 12/14, 61-704 Poznan, Poland
Keywords: sequence similarity, structural similarity, RNA


In the world of RNAs and proteins, similarities at the level of the primary structures of two compared molecules usually correspond to structural similarities at the tertiary level. In other words, measures of sequence and structure similarity are in general correlated: a high value of sequence similarity usually implies a high value of structural similarity. However, important exceptions that contradict this general rule can be identified. It is possible to find similar structures with very different sequences, and also similar sequences with very different structures. In this paper we focus on the latter case and propose a tool, called StructAnalyzer, that supports the analysis of relations between sequence and structure similarities. Recognizing the diversity of tertiary structures among molecules with very similar primary structures may be the key to a better understanding of the mechanisms that influence the folding of RNA or proteins and, as a result, their function. StructAnalyzer allows exploration and visualization of structural diversity in relation to sequence similarity. We show how the tool can be used to screen RNA structures in the PDB for sequences with structural variants.
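
The kind of screening described above can be illustrated with a minimal sketch. It is not StructAnalyzer's implementation: the function names, the use of percent identity and RMSD as the two similarity measures, and the threshold values are all assumptions made for illustration, and the inputs are assumed to be already aligned sequences and already superimposed coordinates.

```python
import math


def sequence_identity(aligned_a, aligned_b):
    """Fraction of identical residues over columns where neither sequence has a gap.

    Assumes the two sequences were aligned beforehand ('-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    return sum(x == y for x, y in columns) / len(columns)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of 3D points.

    Assumes the structures were superimposed beforehand; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((p - q) ** 2
                for a, b in zip(coords_a, coords_b)
                for p, q in zip(a, b))
    return math.sqrt(total / len(coords_a))


def similar_sequence_different_structure(pairs, min_identity=0.9, min_rmsd=5.0):
    """Return names of pairs with high sequence identity but large structural distance.

    `pairs` holds (name, identity, rmsd) tuples; both thresholds are hypothetical.
    """
    return [name for name, identity, distance in pairs
            if identity >= min_identity and distance >= min_rmsd]


# Illustrative values only, not taken from any real structure.
pairs = [
    ("pair_1", sequence_identity("GGCA-U", "GGCAAU"), 7.5),
    ("pair_2", sequence_identity("ACGU", "ACGA"), 8.0),
    ("pair_3", sequence_identity("ACGU", "ACGU"), 0.4),
]
flagged = similar_sequence_different_structure(pairs)
```

Here only `pair_1` would be flagged: its aligned residues are identical while its assumed structural distance exceeds the threshold. `pair_2` fails the identity threshold and `pair_3` is structurally close.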

