Erin Wilson

PhD student, Computer Science and Engineering

I’m a fourth-year PhD student at the University of Washington, where I work at the intersection of data science, synthetic biology, and sustainability!

I'm advised by Mary Lidstrom and Dave Beck.

See my Research


4th year PhD student, Computer Science & Engineering
      University of Washington, Seattle, WA
MS Computer Science & Engineering
      University of Washington, Seattle, WA
      Fall 2019  
BA Computational Biology
      Carleton College, Northfield, MN
      June 2014  

Awards & Honors

NSF Graduate Research Fellow
      University of Washington     2019
Marilyn Fries Fellow
      University of Washington     2017-2018
      First year funding from UW CSE
Graduation Honors
      Carleton College     2014
      Magna Cum Laude
      Received "Distinction" on Senior Thesis
Clare Boothe Luce Scholar
      Carleton College     Summer 2012
      Summer research funding for women in physics and computer science

Research Interests & Projects

I am interested in engineering microorganisms into tiny biological factories that can sustainably produce everyday molecules. To do this, we can edit microorganism genomes to convert renewable feedstocks (sugar) or waste streams (methane) into a desired target molecule such as a medicine, a biofuel, or theoretically any molecule found in nature. My current research focus is to use computational methods to better understand the "genetic grammar" underlying how these organisms control gene expression, and to use these insights to engineer them more efficiently for sustainable molecule production.

Current Projects

  Using deep learning approaches to identify regulatory motifs involved in methanotroph gene regulation

  with Mary Lidstrom & Dave Beck  
Building a framework to apply deep learning tools (CNNs and LSTMs) to predict RNA-seq expression levels from DNA sequences (upstream promoter regions) in the methanotroph Methylotuvimicrobium buryatense 5GB1. Eventually we would like to apply feature attribution methods to identify the sub-sequences within promoters that are particularly important for influencing expression changes across a variety of growth conditions.

Past Projects

 Developing a computational framework to identify strong promoters in non-model organisms

  with Mary Lidstrom & Dave Beck  
Analyzing genomic sequence and RNA-seq data in the methanotroph Methylotuvimicrobium buryatense 5GB1 to identify promoter sequence patterns that confer constitutive, strong expression. Our computational pipeline may be similarly applied to other non-model organisms that lack extensive genetic characterization to help identify key pieces of their regulatory grammars.
      Project Page
      Git Repo

 Decoding yeast gene regulation from millions of random sequences

  with Georg Seelig  
Training deep and machine learning models on massively parallel reporter data from millions of randomized sequences to characterize gene regulation in yeast.

 Understanding gene expression patterns in developing heart tissue

  with Georg Seelig  
Analyzed single-cell RNA-sequencing data to understand gene expression patterns in differentiating cardiomyocytes. (In collaboration with the Allen Institute for Cell Science)

Industry & Work Experience

I began my research career as a biologist and have since grown into a computer scientist with an interest in understanding biological data. I am excited about opportunities that span fields and require computational skills to dig into outstanding challenges in biology.

Zymergen, Intern, Data Science

  Seattle, WA     June 2018 - August 2018
Used scikit-learn and Keras/TensorFlow to build machine learning and convolutional neural network models for predicting DNA regulatory features in non-standard microbe genomes.

Amyris, Associate Scientist, Scientific Computing

  Emeryville, CA     July 2014 - July 2017
In the Scientific Computing group at Amyris, I applied my background in genetics and computer science to various computational projects in R&D. My role ranged from serving as the designated computational resource for a given project, to working as a member of a team of computational experts, to acting as a communication bridge between software engineers and biologists. Specific projects I worked on include:
  • characterizing the genomic impact of chemical mutagens
  • maintaining the company's whole genome sequencing pipeline
  • developing Genotype Specification Language (a DNA design tool invented at Amyris) and training the Amyris community to use it
  • building a Genotype Generator tool to translate high level designs for metabolic pathways into concrete build instructions for strains that can carry out pathway designs

Amyris, Intern, Scientific Computing

  Emeryville, CA     December 2013
Coded a data visualization tool to help strain engineers overlay experimental data onto yeast metabolic pathways.

 University of Minnesota, Research Assistant, Myers Lab (Computational Biology)

  Minneapolis, MN     June 2013 - August 2013
Used genetic interaction and chemical genetic interaction data to code a target prediction pipeline in Python. Developed a benchmark standard for accurately predicting gene targets for chemicals of interest.

 Carleton College, Research Assistant, Goings Lab (Evolutionary Computing)

  Northfield, MN     June 2012 - August 2012
Performed experiments on evolving populations of digital organisms to examine the effects of limited CPU resources on the populations’ ability to evolve complex Boolean logic functions.

 UCSF, Research Assistant, Ahituv Lab (Genetics)

  San Francisco, CA     June 2011 - August 2011
Performed chromatin immunoprecipitation sequencing experiments on mouse limb tissue to find enhancer candidates involved in limb patterning and development.

Scientific Communication


  • E. H. Wilson, M. E. Lidstrom, and D. A. C. Beck. (2021) "A multi-task learning approach to enhance sustainable biomolecule production in engineered microorganisms." Tackling Climate Change with Machine Learning, workshop at ICML 2021. [Video Recording] [Proposal]

  • E. H. Wilson, J. D. Groom, M. C. Sarfatis, S. M. Ford, M. E. Lidstrom, and D. A. C. Beck. (2021) "A Computational Framework for Identifying Promoter Sequences in Nonmodel Organisms Using RNA-seq Data Sets." ACS Synthetic Biology. Article

  • E. H. Wilson, C. Macklin, and D. Platt. (2018) "Engineering genomes with Genotype Specification Language." In Methods in Molecular Biology, Synthetic Biology. J.C. Braman, ed. Springer Publishing Company, New York, NY. PubMed

  • E. H. Wilson, S. Sagawa, J. Weis, M. Shubert, M. Bissell, B. Hawthorne, C. Reeves, J. Dean, and D. Platt. (2016) "Genotype Specification Language." ACS Synthetic Biology. 5(6), pp 471-478. PubMed

  • S. W. Simpkins, J. Nelson, R. Deshpande, S. C. Li, J. S. Piotrowski, E. H. Wilson, A. A. Gebre, R. Okamoto, M. Yoshimura, M. Costanzo, Y. Yashiroda, Y. Ohya, H. Osada, M. Yoshida, C. Boone, and C. L. Myers. (2018) "Predicting bioprocess targets of chemical compounds through integration of chemical-genetic and genetic interactions." PLoS Computational Biology. PubMed


  • E. H. Wilson and D. Platt. "Genotype Specification Language: Programming in DNA!" Poster presentation at the Synthetic Biology, Engineering, Evolution & Design (SEED) conference in Chicago, July 2016.


  • "Using microorganisms to mitigate macro problems." Virtual Women's Research Day, University of Washington, 2020. [Video Recording]

  • "Using microorganisms to solve macro problems: untangling the genetic circuitry of methane-eating bacteria." MIDAS Data Science Symposium, University of Michigan, 2019.

  • "Can deep learning help us program biology?" Industry Affiliates Research Day, University of Washington, 2018.


General Audience

Youth Education

  • Programming Organisms with DNA Puzzles! - Developed an interactive activity to teach elementary/middle schoolers about genetic engineering.
    • Engineering Discovery Days, University of Washington
    • Introduce a Girl to CoRDS (Coding, Robotics, and Data Science), University of Washington


  • PAWS - Wildlife Hospital Volunteer
  • MeadoWatch - Field Data Collector, Mt. Rainier National Park

Contact Me

I'm always excited to learn more about how a computer/data scientist can help solve problems in biology and sustainability! Feel free to connect :)

Also, if you're considering exploring the intersection of Biology and Computer Science, I'd be happy to chat about my experience navigating undergrad, working in industry, and transitioning back to grad school.

You can reach me at ewilson6

I also have a LinkedIn.