University of Washington Computer Science & Engineering
  Modeling and Searching for Noncoding RNA

These lectures are part of the 2014 GENOME 540 / GENOME 541 course series, Introduction to Computational Molecular Biology.

Core Reading:

  1. Durbin, Richard and Eddy, Sean R. and Krogh, Anders and Mitchison, Graeme, Biological Sequence Analysis: Probabilistic models of proteins and nucleic acids, Cambridge, 1998. Sections 9.5-9.7 and Chapter 10.
  2. RR Breaker, "Complex riboswitches." Science, 319, #5871 (2008) 1795-7. [offcampus]
  3. PP Amaral, ME Dinger, TR Mercer, JS Mattick, "The eukaryotic genome as an RNA machine." Science, 319, #5871 (2008) 1787-9. [offcampus]

Background: Some background papers if you want to brush up. The Eddy papers are short gems; the RNA, HMM and dynamic programming papers are especially relevant.

  1. SR Eddy, "What is dynamic programming?" Nat. Biotechnol., 22, #7 (2004) 909-10. [offcampus]
  2. SR Eddy, "Where did the BLOSUM62 alignment score matrix come from?" Nat. Biotechnol., 22, #8 (2004) 1035-6. [offcampus]
  3. SR Eddy, "What is Bayesian statistics?" Nat. Biotechnol., 22, #9 (2004) 1177-8. [offcampus]
  4. SR Eddy, "What is a hidden Markov model?" Nat. Biotechnol., 22, #10 (2004) 1315-6. [offcampus]
  5. SR Eddy, "How do RNA folding algorithms work?" Nat. Biotechnol., 22, #11 (2004) 1457-8. [offcampus]

Recommended:

  1. J Gorodkin, IL Hofacker, E Torarinsson, Z Yao, JH Havgaard, WL Ruzzo, "De novo prediction of structured RNAs from genomic sequences." Trends Biotechnol., 28, #1 (2010) 9-19. [offcampus]

Optional Reading: Additional background on various topics I touched on. Sample as much or as little of this as you like.

    Books:

    1. J. Gorodkin, W.L. Ruzzo, RNA Sequence, Structure and Function: Computational and Bioinformatic Methods, Methods in Molecular Biology, vol. 1097. Humana Press, 2014. (Springer Protocols) 533pp.

    Reviews: Some reviews about non-coding RNAs in general

    1. G Storz, "An expanding universe of noncoding RNAs." Science, 296, #5571 (2002) 1260-3. [offcampus]
    2. SR Eddy, "Computational genomics of noncoding RNA genes." Cell, 109, #2 (2002) 137-40. [offcampus]
    3. A Hüttenhofer, P Schattner, N Polacek, "Non-coding RNAs: hope or hype?" Trends Genet., 21, #5 (2005) 289-97. [offcampus]
    4. JS Mattick, IV Makunin, "Non-coding RNA." Hum. Mol. Genet., 15 Spec No 1, (2006) R17-29. [offcampus]
    5. ME Dinger, TR Mercer, JS Mattick, "RNAs as extracellular signaling molecules." J. Mol. Endocrinol., 40, #4 (2008) 151-9. [offcampus]

    Riboswitches: Some of the 698 (as of 4/20/2014) papers in PubMed with this keyword.

    1. A Peselis, A Serganov, "Themes and variations in riboswitch structure and function." Biochim. Biophys. Acta, (2014). [offcampus]
    2. JW Nelson, N Sudarsan, K Furukawa, Z Weinberg, JX Wang, RR Breaker, "Riboswitches in eubacteria sense the second messenger c-di-AMP." Nat. Chem. Biol., 9, #12 (2013) 834-9. [offcampus]
    3. A Serganov, E Nudler, "A decade of riboswitches." Cell, 152, #1-2 (2013) 17-24. [offcampus]
    4. RR Breaker, "Prospects for riboswitch discovery and analysis." Mol. Cell, 43, #6 (2011) 867-79. [offcampus]
    5. JE Barrick, RR Breaker, "The distributions, mechanisms, and structures of metabolite-binding riboswitches." Genome Biol., 8, #11 (2007) R239. [offcampus]
    6. MD Kazanov, AG Vitreschak, MS Gelfand, "Abundance and functional diversity of riboswitches in microbial communities." BMC Genomics, 8, (2007) 347. [offcampus]
    7. TE Edwards, DJ Klein, AR Ferré-D'Amaré, "Riboswitches: small-molecule recognition by gene regulatory RNAs." Curr. Opin. Struct. Biol., 17, #3 (2007) 273-9. [offcampus]
    8. BJ Tucker, RR Breaker, "Riboswitches as versatile gene control elements." Curr. Opin. Struct. Biol., 15, #3 (2005) 342-8. [offcampus]
    9. R Welz, RR Breaker, "Ligand binding and gene control characteristics of tandem riboswitches in Bacillus anthracis." RNA, 13, #4 (2007) 573-82. [offcampus]
    10. N Sudarsan, MC Hammond, KF Block, R Welz, JE Barrick, A Roth, RR Breaker, "Tandem riboswitch architectures exhibit complex gene control functions." Science, 314, #5797 (2006) 300-4. [offcampus]
    11. KF Blount, RR Breaker, "Riboswitches as antibacterial drug targets." Nat. Biotechnol., 24, #12 (2006) 1558-64. [offcampus]
    12. KF Blount, JX Wang, J Lim, N Sudarsan, RR Breaker, "Antibacterial lysine analogs that target lysine riboswitches." Nat. Chem. Biol., 3, #1 (2007) 44-9. [offcampus]
    13. N Sudarsan, S Cohen-Chalamish, S Nakamura, GM Emilsson, RR Breaker, "Thiamine pyrophosphate riboswitches are targets for the antimicrobial compound pyrithiamine." Chem. Biol., 12, #12 (2005) 1325-35. [offcampus]

    Other Bacterial ncRNA: Recent transcriptomic evidence for widespread ncRNA in bacteria

    1. JM Liu, J Livny, MS Lawrence, MD Kimball, MK Waldor, A Camilli, "Experimental discovery of sRNAs in Vibrio cholerae by direct cloning, 5S/tRNA depletion and parallel sequencing." Nucleic Acids Res., 37, #6 (2009) e46. [offcampus]

    RNA Folding: Single sequence secondary structure prediction

    1. SR Eddy, "How do RNA folding algorithms work?" Nat. Biotechnol., 22, #11 (2004) 1457-8. [offcampus]
    2. JS McCaskill, "The equilibrium partition function and base pair binding probabilities for RNA secondary structure." Biopolymers, 29, #6-7 (1990) 1105-19. [offcampus]
    3. RB Lyngsø, M Zuker, CN Pedersen, "Fast evaluation of internal loops in RNA secondary structure prediction." Bioinformatics, 15, #6 (1999) 440-5. [offcampus]

    Comparison: Surveys and comparisons of RNA alignment and structure prediction

    1. PP Gardner, R Giegerich, "A comprehensive comparison of comparative RNA structure prediction approaches." BMC Bioinformatics, 5, (2004) 140. [offcampus]
    2. PP Gardner, A Wilm, S Washietl, "A benchmark of multiple sequence alignment programs upon structural RNAs." Nucleic Acids Res., 33, #8 (2005) 2433-9. [offcampus]
    3. EK Freyhult, JP Bollback, PP Gardner, "Exploring genomic dark matter: a critical assessment of the performance of homology search methods on noncoding RNA." Genome Res., 17, #1 (2007) 117-25. [offcampus]
    4. T Babak, BJ Blencowe, TR Hughes, "Considerations in the identification of functional RNA structural elements in genomic alignments." BMC Bioinformatics, 8, (2007) 33. [offcampus]

    CMs: The fundamentals of covariance models (in addition to Durbin, et al. above)

    1. SR Eddy, R Durbin, "RNA sequence analysis using covariance models." Nucleic Acids Res., 22, #11 (1994) 2079-88. [offcampus]
    2. EP Nawrocki, DL Kolbe, SR Eddy, "Infernal 1.0: inference of RNA alignments." Bioinformatics, 25, #10 (2009) 1335-7. [offcampus]

    CM Details: Some algorithmic details

    1. SR Eddy, "A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure." BMC Bioinformatics, 3, (2002) 18. [offcampus]
    2. EP Nawrocki, SR Eddy, "Query-dependent banding (QDB) for faster RNA similarity searches." PLoS Comput. Biol., 3, #3 (2007) e56. [offcampus]
    3. DL Kolbe, SR Eddy, "Local RNA structure alignment with incomplete sequence." Bioinformatics, 25, #10 (2009) 1236-43. [offcampus]

    Rfam: The RNA family database. A great resource for work in this area.

    1. S Griffiths-Jones, A Bateman, M Marshall, A Khanna, SR Eddy, "Rfam: an RNA family database." Nucleic Acids Res., 31, #1 (2003) 439-41. [offcampus]
    2. S Griffiths-Jones, S Moxon, M Marshall, A Khanna, SR Eddy, A Bateman, "Rfam: annotating non-coding RNAs in complete genomes." Nucleic Acids Res., 33, #Database issue (2005) D121-4. [offcampus]
    3. J Daub, PP Gardner, J Tate, D Ramsköld, M Manske, WG Scott, Z Weinberg, S Griffiths-Jones, A Bateman, "The RNA WikiProject: community annotation of RNA families." RNA, 14, #12 (2008) 2462-4. [offcampus]
    4. PP Gardner, J Daub, JG Tate, EP Nawrocki, DL Kolbe, S Lindgreen, AC Wilkinson, RD Finn, S Griffiths-Jones, SR Eddy, A Bateman, "Rfam: updates to the RNA families database." Nucleic Acids Res., 37, #Database issue (2009) D136-40. [offcampus]
    5. PP Gardner, J Daub, J Tate, BL Moore, IH Osuch, S Griffiths-Jones, RD Finn, EP Nawrocki, DL Kolbe, SR Eddy, A Bateman, "Rfam: Wikipedia, clans and the 'decimal' release." Nucleic Acids Res., 39, #Database issue (2011) D141-5. [offcampus]
    6. SW Burge, J Daub, R Eberhardt, J Tate, L Barquist, EP Nawrocki, SR Eddy, PP Gardner, A Bateman, "Rfam 11.0: 10 years of RNA families." Nucleic Acids Res., 41, #Database issue (2013) D226-32. [offcampus]

    Pfold, RNAz, and EvoFold: Some of the better comparative folding programs; for aligned sequences

    1. B Knudsen, J Hein, "RNA secondary structure prediction using stochastic context-free grammars and evolutionary history." Bioinformatics, 15, #6 (1999) 446-54. [offcampus]
    2. B Knudsen, J Hein, "Pfold: RNA secondary structure prediction using stochastic context-free grammars." Nucleic Acids Res., 31, #13 (2003) 3423-8. [offcampus]
    3. S Washietl, IL Hofacker, PF Stadler, "Fast and reliable prediction of noncoding RNAs." Proc. Natl. Acad. Sci. U.S.A., 102, #7 (2005) 2454-9. [offcampus]
    4. JS Pedersen, G Bejerano, A Siepel, K Rosenbloom, K Lindblad-Toh, ES Lander, J Kent, W Miller, D Haussler, "Identification and classification of conserved RNA secondary structures in the human genome." PLoS Comput. Biol., 2, #4 (2006) e33. [offcampus]

    CM Filters: My student's work on accelerating CM searches

    1. Z Weinberg, WL Ruzzo, "Faster Genome Annotation of Non-coding RNA Families Without Loss of Accuracy." Eighth Annual International Conference on Research in Computational Molecular Biology (RECOMB 2004), pp. 243-251, March 2004, San Diego, CA. Preprint.
    2. Z Weinberg, WL Ruzzo, "Exploiting conserved structure for faster annotation of non-coding RNAs without loss of accuracy." Bioinformatics, 20 Suppl 1, (2004) i334-41. [offcampus]
    3. Z Weinberg, WL Ruzzo, "Sequence-based heuristics for faster annotation of non-coding RNA families." Bioinformatics, 22, #1 (2006) 35-9. [offcampus]

    CM Inference: My students' work on inference of CM motifs from unaligned sequences

    1. Z Yao, Z Weinberg, WL Ruzzo, "CMfinder--a covariance model based RNA motif finding algorithm." Bioinformatics, 22, #4 (2006) 445-52. [offcampus]

    Applications: Two of the biological examples I discussed. The interplay between computational and experimental approaches is probably clearer in the 6S papers, and the differences in the approaches/results are also interesting.

    1. M Mandal, M Lee, JE Barrick, Z Weinberg, GM Emilsson, WL Ruzzo, RR Breaker, "A glycine-dependent riboswitch that uses cooperative binding to control gene expression." Science, 306, #5694 (2004) 275-9. [offcampus]
    2. JE Barrick, N Sudarsan, Z Weinberg, WL Ruzzo, RR Breaker, "6S RNA is a widespread regulator of eubacterial RNA polymerase that resembles an open promoter." RNA, 11, #5 (2005) 774-84. [offcampus]
    3. AE Trotochaud, KM Wassarman, "A highly conserved 6S RNA structure is required for regulation of transcription." Nat. Struct. Mol. Biol., 12, #4 (2005) 313-9. [offcampus]
    4. DK Willkomm, J Minnerup, A Hüttenhofer, RK Hartmann, "Experimental RNomics in Aquifex aeolicus: identification of small non-coding RNAs and the putative 6S RNA homolog." Nucleic Acids Res., 33, #6 (2005) 1949-60. [offcampus]

    A Tangent: More on the 6S story; not particularly computational, but interesting: not only does 6S mimic an open promoter, it's apparently sometimes a transcription template.

    1. N Gildehaus, T Neusser, R Wurm, R Wagner, "Studies on the function of the riboregulator 6S RNA from E. coli: RNA polymerase binding, inhibition of in vitro transcription and synthesis of RNA-directed de novo transcripts." Nucleic Acids Res., 35, #6 (2007) 1885-96. [offcampus]
    2. KM Wassarman, RM Saecker, "Synthesis-mediated release of a small RNA inhibitor of RNA polymerase." Science, 314, #5805 (2006) 1601-3. [offcampus]
    3. KM Wassarman, "6S RNA: a small RNA regulator of transcription." Curr. Opin. Microbiol., 10, #2 (2007) 164-8. [offcampus]

    More Applications: Additional examples of computational searches, mainly aimed at riboswitches and similar cis-regulatory elements

    1. Z Yao, J Barrick, Z Weinberg, S Neph, R Breaker, M Tompa, WL Ruzzo, "A computational pipeline for high-throughput discovery of cis-regulatory noncoding RNA in prokaryotes." PLoS Comput. Biol., 3, #7 (2007) e126. [offcampus]
    2. Z Weinberg, JE Barrick, Z Yao, A Roth, JN Kim, J Gore, JX Wang, ER Lee, KF Block, N Sudarsan, S Neph, M Tompa, WL Ruzzo, RR Breaker, "Identification of 22 candidate structured RNAs in bacteria using the CMfinder comparative genomics pipeline." Nucleic Acids Res., 35, #14 (2007) 4809-19. [offcampus]
    3. S Zhang, I Borovok, Y Aharonowitz, R Sharan, V Bafna, "A sequence-based filtering method for ncRNA identification and its application to searching for riboswitch elements." Bioinformatics, 22, #14 (2006) e557-65. [offcampus]

    RNAs with backbones: Recent genome-scale searches in vertebrates.

    1. JS Pedersen, G Bejerano, A Siepel, K Rosenbloom, K Lindblad-Toh, ES Lander, J Kent, W Miller, D Haussler, "Identification and classification of conserved RNA secondary structures in the human genome." PLoS Comput. Biol., 2, #4 (2006) e33. [offcampus]
    2. S Washietl, IL Hofacker, M Lukasser, A Hüttenhofer, PF Stadler, "Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome." Nat. Biotechnol., 23, #11 (2005) 1383-90. [offcampus]
    3. E Torarinsson, M Sawera, JH Havgaard, M Fredholm, J Gorodkin, "Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure." Genome Res., 16, #7 (2006) 885-9. [offcampus]
    4. S Washietl, JS Pedersen, JO Korbel, C Stocsits, AR Gruber, J Hackermüller, J Hertel, M Lindemeyer, K Reiche, A Tanzer, C Ucla, C Wyss, SE Antonarakis, F Denoeud, J Lagarde, J Drenkow, P Kapranov, TR Gingeras, R Guigó, M Snyder, MB Gerstein, A Reymond, IL Hofacker, PF Stadler, "Structured RNAs in the ENCODE selected regions of the human genome." Genome Res., 17, #6 (2007) 852-64. [offcampus]
    5. E Torarinsson, Z Yao, ED Wiklund, JB Bramsen, C Hansen, J Kjems, N Tommerup, WL Ruzzo, J Gorodkin, "Comparative genomics beyond sequence-based alignments: RNA structures in the ENCODE regions." Genome Res., 18, #2 (2008) 242-51. [offcampus]
    6. AX Wang, WL Ruzzo, M Tompa, "How accurately is ncRNA aligned within whole-genome multiple alignments?" BMC Bioinformatics, 8, (2007) 417. [offcampus]
    7. G Lunter, CP Ponting, J Hein, "Genome-wide identification of human functional DNA using a neutral indel model." PLoS Comput. Biol., 2, #1 (2006) e5. [offcampus]
    8. J Ponjavic, CP Ponting, G Lunter, "Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs." Genome Res., 17, #5 (2007) 556-65. [offcampus]


    Larry Ruzzo

    Computer Science & Engineering
    University of Washington
    Box 352350
    Seattle, WA  98195-2350
    (206) 543-1695 voice, (206) 543-2969 FAX