Modeling and Searching for Noncoding RNA
These lectures are part of the 2014 GENOME 540 / GENOME 541 course series, Introduction to Computational Molecular Biology.
Lecture Slides
Homework
Some Web Resources
Software
Core Reading:
- R Durbin, SR Eddy, A Krogh, G Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998. Sections 9.5-9.7 and Chapter 10.
- RR Breaker, "Complex riboswitches." Science, 319, #5871 (2008) 1795-7.
- PP Amaral, ME Dinger, TR Mercer, JS Mattick, "The eukaryotic genome as an RNA machine." Science, 319, #5871 (2008) 1787-9.
Background:
Some background papers if you want to brush up. The Eddy papers are short gems; the RNA, HMM
and dynamic programming papers are especially relevant.
- SR Eddy, "What is dynamic programming?" Nat. Biotechnol., 22, #7 (2004) 909-10.
- SR Eddy, "Where did the BLOSUM62 alignment score matrix come from?" Nat. Biotechnol., 22, #8 (2004) 1035-6.
- SR Eddy, "What is Bayesian statistics?" Nat. Biotechnol., 22, #9 (2004) 1177-8.
- SR Eddy, "What is a hidden Markov model?" Nat. Biotechnol., 22, #10 (2004) 1315-6.
- SR Eddy, "How do RNA folding algorithms work?" Nat. Biotechnol., 22, #11 (2004) 1457-8.
Recommended:
- J Gorodkin, IL Hofacker, E Torarinsson, Z Yao, JH Havgaard, WL Ruzzo, "De novo prediction of structured RNAs from genomic sequences." Trends Biotechnol., 28, #1 (2010) 9-19.
Optional Reading:
Additional background on several of the topics I touched on. Sample as much or as little of
this as you like.
Books:
- J. Gorodkin, W.L. Ruzzo, RNA Sequence, Structure and Function: Computational and Bioinformatic Methods, Methods in Molecular Biology, vol. 1097. Humana Press, 2014. (Springer Protocols) 533 pp.
Reviews:
Some reviews about non-coding RNAs in general
- G Storz, "An expanding universe of noncoding RNAs." Science, 296, #5571 (2002) 1260-3.
- SR Eddy, "Computational genomics of noncoding RNA genes." Cell, 109, #2 (2002) 137-40.
- A Hüttenhofer, P Schattner, N Polacek, "Non-coding RNAs: hope or hype?" Trends Genet., 21, #5 (2005) 289-97.
- JS Mattick, IV Makunin, "Non-coding RNA." Hum. Mol. Genet., 15 Spec No 1, (2006) R17-29.
- ME Dinger, TR Mercer, JS Mattick, "RNAs as extracellular signaling molecules." J. Mol. Endocrinol., 40, #4 (2008) 151-9.
Riboswitches:
Some of the 698 (as of 4/20/2014) papers in PubMed with this keyword.
- A Peselis, A Serganov, "Themes and variations in riboswitch structure and function." Biochim. Biophys. Acta, (2014).
- JW Nelson, N Sudarsan, K Furukawa, Z Weinberg, JX Wang, RR Breaker, "Riboswitches in eubacteria sense the second messenger c-di-AMP." Nat. Chem. Biol., 9, #12 (2013) 834-9.
- A Serganov, E Nudler, "A decade of riboswitches." Cell, 152, #1-2 (2013) 17-24.
- RR Breaker, "Prospects for riboswitch discovery and analysis." Mol. Cell, 43, #6 (2011) 867-79.
- JE Barrick, RR Breaker, "The distributions, mechanisms, and structures of metabolite-binding riboswitches." Genome Biol., 8, #11 (2007) R239.
- MD Kazanov, AG Vitreschak, MS Gelfand, "Abundance and functional diversity of riboswitches in microbial communities." BMC Genomics, 8, (2007) 347.
- TE Edwards, DJ Klein, AR Ferré-D'Amaré, "Riboswitches: small-molecule recognition by gene regulatory RNAs." Curr. Opin. Struct. Biol., 17, #3 (2007) 273-9.
- BJ Tucker, RR Breaker, "Riboswitches as versatile gene control elements." Curr. Opin. Struct. Biol., 15, #3 (2005) 342-8.
- R Welz, RR Breaker, "Ligand binding and gene control characteristics of tandem riboswitches in Bacillus anthracis." RNA, 13, #4 (2007) 573-82.
- N Sudarsan, MC Hammond, KF Block, R Welz, JE Barrick, A Roth, RR Breaker, "Tandem riboswitch architectures exhibit complex gene control functions." Science, 314, #5797 (2006) 300-4.
- KF Blount, RR Breaker, "Riboswitches as antibacterial drug targets." Nat. Biotechnol., 24, #12 (2006) 1558-64.
- KF Blount, JX Wang, J Lim, N Sudarsan, RR Breaker, "Antibacterial lysine analogs that target lysine riboswitches." Nat. Chem. Biol., 3, #1 (2007) 44-9.
- N Sudarsan, S Cohen-Chalamish, S Nakamura, GM Emilsson, RR Breaker, "Thiamine pyrophosphate riboswitches are targets for the antimicrobial compound pyrithiamine." Chem. Biol., 12, #12 (2005) 1325-35.
Other Bacterial ncRNA:
Recent transcriptomic evidence for widespread ncRNA in bacteria
- JM Liu, J Livny, MS Lawrence, MD Kimball, MK Waldor, A Camilli, "Experimental discovery of sRNAs in Vibrio cholerae by direct cloning, 5S/tRNA depletion and parallel sequencing." Nucleic Acids Res., 37, #6 (2009) e46.
RNA Folding:
Single sequence secondary structure prediction
- SR Eddy, "How do RNA folding algorithms work?" Nat. Biotechnol., 22, #11 (2004) 1457-8.
- JS McCaskill, "The equilibrium partition function and base pair binding probabilities for RNA secondary structure." Biopolymers, 29, #6-7 (1990) 1105-19.
- RB Lyngsø, M Zuker, CN Pedersen, "Fast evaluation of internal loops in RNA secondary structure prediction." Bioinformatics, 15, #6 (1999) 440-5.
Comparison:
Surveys and comparisons of RNA alignment and structure prediction
- PP Gardner, R Giegerich, "A comprehensive comparison of comparative RNA structure prediction approaches." BMC Bioinformatics, 5, (2004) 140.
- PP Gardner, A Wilm, S Washietl, "A benchmark of multiple sequence alignment programs upon structural RNAs." Nucleic Acids Res., 33, #8 (2005) 2433-9.
- EK Freyhult, JP Bollback, PP Gardner, "Exploring genomic dark matter: a critical assessment of the performance of homology search methods on noncoding RNA." Genome Res., 17, #1 (2007) 117-25.
- T Babak, BJ Blencowe, TR Hughes, "Considerations in the identification of functional RNA structural elements in genomic alignments." BMC Bioinformatics, 8, (2007) 33.
CMs:
The fundamentals of covariance models (in addition to Durbin, et al. above)
- SR Eddy, R Durbin, "RNA sequence analysis using covariance models." Nucleic Acids Res., 22, #11 (1994) 2079-88.
- EP Nawrocki, DL Kolbe, SR Eddy, "Infernal 1.0: inference of RNA alignments." Bioinformatics, 25, #10 (2009) 1335-7.
CM Details:
Some algorithmic details
- SR Eddy, "A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure." BMC Bioinformatics, 3, (2002) 18.
- EP Nawrocki, SR Eddy, "Query-dependent banding (QDB) for faster RNA similarity searches." PLoS Comput. Biol., 3, #3 (2007) e56.
- DL Kolbe, SR Eddy, "Local RNA structure alignment with incomplete sequence." Bioinformatics, 25, #10 (2009) 1236-43.
Rfam:
The RNA family database, a great resource for work in this area.
- S Griffiths-Jones, A Bateman, M Marshall, A Khanna, SR Eddy, "Rfam: an RNA family database." Nucleic Acids Res., 31, #1 (2003) 439-41.
- S Griffiths-Jones, S Moxon, M Marshall, A Khanna, SR Eddy, A Bateman, "Rfam: annotating non-coding RNAs in complete genomes." Nucleic Acids Res., 33, #Database issue (2005) D121-4.
- J Daub, PP Gardner, J Tate, D Ramsköld, M Manske, WG Scott, Z Weinberg, S Griffiths-Jones, A Bateman, "The RNA WikiProject: community annotation of RNA families." RNA, 14, #12 (2008) 2462-4.
- PP Gardner, J Daub, JG Tate, EP Nawrocki, DL Kolbe, S Lindgreen, AC Wilkinson, RD Finn, S Griffiths-Jones, SR Eddy, A Bateman, "Rfam: updates to the RNA families database." Nucleic Acids Res., 37, #Database issue (2009) D136-40.
- PP Gardner, J Daub, J Tate, BL Moore, IH Osuch, S Griffiths-Jones, RD Finn, EP Nawrocki, DL Kolbe, SR Eddy, A Bateman, "Rfam: Wikipedia, clans and the "decimal" release." Nucleic Acids Res., 39, #Database issue (2011) D141-5.
- SW Burge, J Daub, R Eberhardt, J Tate, L Barquist, EP Nawrocki, SR Eddy, PP Gardner, A Bateman, "Rfam 11.0: 10 years of RNA families." Nucleic Acids Res., 41, #Database issue (2013) D226-32.
Pfold, RNAz, and EvoFold:
Some of the better comparative folding programs, which take aligned sequences as input
- B Knudsen, J Hein, "RNA secondary structure prediction using stochastic context-free grammars and evolutionary history." Bioinformatics, 15, #6 (1999) 446-54.
- B Knudsen, J Hein, "Pfold: RNA secondary structure prediction using stochastic context-free grammars." Nucleic Acids Res., 31, #13 (2003) 3423-8.
- S Washietl, IL Hofacker, PF Stadler, "Fast and reliable prediction of noncoding RNAs." Proc. Natl. Acad. Sci. U.S.A., 102, #7 (2005) 2454-9.
- JS Pedersen, G Bejerano, A Siepel, K Rosenbloom, K Lindblad-Toh, ES Lander, J Kent, W Miller, D Haussler, "Identification and classification of conserved RNA secondary structures in the human genome." PLoS Comput. Biol., 2, #4 (2006) e33.
CM Filters:
My student's work on accelerating CM searches
- Z Weinberg, WL Ruzzo, "Faster Genome Annotation of Non-coding RNA Families Without Loss of Accuracy." Eighth Annual International Conference on Research in Computational Molecular Biology (RECOMB 2004), pp. 243-251, March 2004, San Diego, CA. Preprint.
- Z Weinberg, WL Ruzzo, "Exploiting conserved structure for faster annotation of non-coding RNAs without loss of accuracy." Bioinformatics, 20 Suppl 1, (2004) i334-41.
- Z Weinberg, WL Ruzzo, "Sequence-based heuristics for faster annotation of non-coding RNA families." Bioinformatics, 22, #1 (2006) 35-9.
CM Inference:
My students' work on inference of CM motifs from unaligned sequences
- Z Yao, Z Weinberg, WL Ruzzo, "CMfinder--a covariance model based RNA motif finding algorithm." Bioinformatics, 22, #4 (2006) 445-52.
Applications:
Two of the biological examples I discussed. The interplay between computational and
experimental approaches is probably clearer in the 6S papers, and the differences in the approaches/results are also
interesting.
- M Mandal, M Lee, JE Barrick, Z Weinberg, GM Emilsson, WL Ruzzo, RR Breaker, "A glycine-dependent riboswitch that uses cooperative binding to control gene expression." Science, 306, #5694 (2004) 275-9.
- JE Barrick, N Sudarsan, Z Weinberg, WL Ruzzo, RR Breaker, "6S RNA is a widespread regulator of eubacterial RNA polymerase that resembles an open promoter." RNA, 11, #5 (2005) 774-84.
- AE Trotochaud, KM Wassarman, "A highly conserved 6S RNA structure is required for regulation of transcription." Nat. Struct. Mol. Biol., 12, #4 (2005) 313-9.
- DK Willkomm, J Minnerup, A Hüttenhofer, RK Hartmann, "Experimental RNomics in Aquifex aeolicus: identification of small non-coding RNAs and the putative 6S RNA homolog." Nucleic Acids Res., 33, #6 (2005) 1949-60.
A Tangent:
More on the 6S story; not particularly computational, but interesting: not only does 6S mimic an
open promoter, it's apparently sometimes a transcription template.
- N Gildehaus, T Neusser, R Wurm, R Wagner, "Studies on the function of the riboregulator 6S RNA from E. coli: RNA polymerase binding, inhibition of in vitro transcription and synthesis of RNA-directed de novo transcripts." Nucleic Acids Res., 35, #6 (2007) 1885-96.
- KM Wassarman, RM Saecker, "Synthesis-mediated release of a small RNA inhibitor of RNA polymerase." Science, 314, #5805 (2006) 1601-3.
- KM Wassarman, "6S RNA: a small RNA regulator of transcription." Curr. Opin. Microbiol., 10, #2 (2007) 164-8.
More Applications:
Additional examples of computational searches, mainly aimed at riboswitches and similar
cis-regulatory elements
- Z Yao, J Barrick, Z Weinberg, S Neph, R Breaker, M Tompa, WL Ruzzo, "A computational pipeline for high-throughput discovery of cis-regulatory noncoding RNA in prokaryotes." PLoS Comput. Biol., 3, #7 (2007) e126.
- Z Weinberg, JE Barrick, Z Yao, A Roth, JN Kim, J Gore, JX Wang, ER Lee, KF Block, N Sudarsan, S Neph, M Tompa, WL Ruzzo, RR Breaker, "Identification of 22 candidate structured RNAs in bacteria using the CMfinder comparative genomics pipeline." Nucleic Acids Res., 35, #14 (2007) 4809-19.
- S Zhang, I Borovok, Y Aharonowitz, R Sharan, V Bafna, "A sequence-based filtering method for ncRNA identification and its application to searching for riboswitch elements." Bioinformatics, 22, #14 (2006) e557-65.
RNAs with backbones:
Recent genome-scale searches in vertebrates.
- JS Pedersen, G Bejerano, A Siepel, K Rosenbloom, K Lindblad-Toh, ES Lander, J Kent, W Miller, D Haussler, "Identification and classification of conserved RNA secondary structures in the human genome." PLoS Comput. Biol., 2, #4 (2006) e33.
- S Washietl, IL Hofacker, M Lukasser, A Hüttenhofer, PF Stadler, "Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome." Nat. Biotechnol., 23, #11 (2005) 1383-90.
- E Torarinsson, M Sawera, JH Havgaard, M Fredholm, J Gorodkin, "Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure." Genome Res., 16, #7 (2006) 885-9.
- S Washietl, JS Pedersen, JO Korbel, C Stocsits, AR Gruber, J Hackermüller, J Hertel, M Lindemeyer, K Reiche, A Tanzer, C Ucla, C Wyss, SE Antonarakis, F Denoeud, J Lagarde, J Drenkow, P Kapranov, TR Gingeras, R Guigó, M Snyder, MB Gerstein, A Reymond, IL Hofacker, PF Stadler, "Structured RNAs in the ENCODE selected regions of the human genome." Genome Res., 17, #6 (2007) 852-64.
- E Torarinsson, Z Yao, ED Wiklund, JB Bramsen, C Hansen, J Kjems, N Tommerup, WL Ruzzo, J Gorodkin, "Comparative genomics beyond sequence-based alignments: RNA structures in the ENCODE regions." Genome Res., 18, #2 (2008) 242-51.
- AX Wang, WL Ruzzo, M Tompa, "How accurately is ncRNA aligned within whole-genome multiple alignments?" BMC Bioinformatics, 8, (2007) 417.
- G Lunter, CP Ponting, J Hein, "Genome-wide identification of human functional DNA using a neutral indel model." PLoS Comput. Biol., 2, #1 (2006) e5.
- J Ponjavic, CP Ponting, G Lunter, "Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs." Genome Res., 17, #5 (2007) 556-65.
Larry Ruzzo
Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA 98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX