CSE 527 Computational Biology

Possible project topics

Examples of project topics include:

  • Predicting survival time of cancer patients based on their RNA gene expression levels.

  • Predicting sensitivity to chemotherapy drugs based on RNA gene expression levels.

    • Data: We have two datasets consisting of gene expression data (genes x patients) and drug sensitivity profiles (drugs x patients) from cancer patients. [Gene Expression Data] [Drug Sensitivity Data]

    • Goal: Our goal is to develop a system that predicts drug sensitivity from the RNA levels of genes.

  • Identifying genetic factors for metabolic traits

    • Data: We have one dataset that consists of genotype profiles collected from 3,000 subjects in a genome-wide association study (GWAS), along with metabolic trait measurements such as cholesterol and insulin levels. The metabolic traits were measured at 7 time points over a 25-year period. The data contain detailed information on each individual, including age, gender, smoking status, and the medications each individual has taken. [Genotype Data; first 4 rows are not genetic markers] [Phenotype Data]

    • Goal: Our goal is to identify genetic loci that contribute to these important phenotypes and to determine how they interact with gender or smoking status.

  • Understanding how genes are wired differently in the transcriptional regulatory networks in different subtypes of cancer

    • Data: We have an expression dataset measuring RNA levels of 20,000 genes from 2096 patients suffering from leukemia. There are 18 subtypes of leukemia, and it is important to understand the expression signature that characterizes each subtype. [Gene Expression Data] [Classification Labels]

    • Goal: Our goal is to understand how genes regulate each other's expression levels differently in each subtype. One way is to learn the regulatory network (we will cover this in class) in each subtype and interpret how the networks are similar or different. A different approach is to build a classifier that can predict the subtype of leukemia based on the expression data from a patient, which will enable molecular diagnosis of leukemia.

  • Understanding the evolutionary changes in the regulatory network between two yeast species

  • Clustering genes in microarray data

  • Understanding the evolutionary change of transcriptional regulatory networks in yeast.

  • Identifying causal sequence variations for related phenotypes.

  • Detecting genetic interactions from genome-wide association study data

  • Detecting epistasis from microarray gene expression data.
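
To make the classifier option in the leukemia topic above more concrete, here is a minimal sketch of one possible baseline, a nearest-centroid classifier. This is only an illustration, not a required or recommended method: the function names, the subtype labels, and the tiny 3-gene toy matrix are all made up, and the real data would have about 20,000 genes per patient and 18 subtypes.

```python
from collections import defaultdict
from math import sqrt


def fit_centroids(samples, labels):
    """Average the expression vectors of each subtype into one centroid."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(samples, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[lab][i] += value
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(centroids, vec):
    """Return the subtype whose centroid is closest in Euclidean distance."""
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda lab: dist(centroids[lab], vec))


# Toy data (hypothetical): rows are patients, columns are 3 genes.
train = [
    [5.0, 1.0, 0.2],
    [4.8, 1.2, 0.1],
    [0.9, 4.9, 3.0],
    [1.1, 5.1, 3.2],
]
labels = ["subtype_A", "subtype_A", "subtype_B", "subtype_B"]

centroids = fit_centroids(train, labels)
prediction = predict(centroids, [4.5, 1.5, 0.3])
print(prediction)
```

A baseline like this is mainly useful as a reference point: a more elaborate classifier should do better than it on held-out patients, and the per-gene differences between centroids give a first, rough look at each subtype's expression signature.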