Preprints

  • BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once.
    Theodore Zhao, Yu Gu, Jianwei Yang, Naoto Usuyama, Ho Hin Lee, Tristan Naumann, Jianfeng Gao, Angela Crabtree, Brian Piening, Carlo Bifulco, Mu Wei#, Hoifung Poon#, Sheng Wang#
    arxiv
  • BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs.
    Sheng Zhang, Yanbo Xu, Naoto Usuyama, Hanwen Xu, Jaspreet Bagga, Robert Tinn, Sam Preston, Rajesh Rao, Mu Wei, Naveen Valluri, Cliff Wong, Andrea Tupini, Yu Wang, Matt Mazzola, Swadheen Shukla, Lars Liden, Jianfeng Gao, Matthew P. Lungren, Tristan Naumann, Sheng Wang#, Hoifung Poon#.
    arxiv Under review.
  • Training Small Multimodal Models to Bridge Biomedical Competency Gap: A Case Study in Radiology Imaging.
    Juan Manuel Zambrano Chaves, Shih-Cheng Huang, Yanbo Xu, Hanwen Xu, Naoto Usuyama, Sheng Zhang, Fei Wang, Yujia Xie, Mahmoud Khademi, Ziyi Yang, Hany Awadalla, Julia Gong, Houdong Hu, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Yu Gu, Cliff Wong, Mu Wei, Tristan Naumann, Muhao Chen, Matthew P. Lungren, Serena Yeung-Levy, Curtis P. Langlotz, Sheng Wang#, Hoifung Poon#.
    arxiv
  • Pisces: A multi-modal data augmentation approach for drug combination synergy prediction.
    Hanwen Xu*, Jiacheng Lin*, Addie Woicik, Zixuan Liu, Jianzhu Ma, Sheng Zhang, Hoifung Poon, Liewei Wang, Sheng Wang#
    bioRxiv In revision at Cell Genomics.
  • Poisoning scientific knowledge using large language models.
    Junwei Yang, Hanwen Xu, Srbuhi Mirzoyan, Tong Chen, Zixuan Liu, Wei Ju, Luchen Liu, Ming Zhang#, Sheng Wang#
    bioRxiv In revision at Nature Machine Intelligence.
  • A foundation model for bioactivity prediction using pairwise meta-learning.
    Bin Feng, Zequn Liu, Nanlan Huang, Zhiping Xiao, Haomiao Zhang, Srbuhi Mirzoyan, Hanwen Xu, Jiaran Hao, Yinghui Xu#, Ming Zhang#, Sheng Wang#
    bioRxiv In revision at Nature Machine Intelligence.

    Publications (since tenure-track at UW)


    1. A whole-slide foundation model for digital pathology from real-world data.
      Hanwen Xu, Naoto Usuyama, Jaspreet Bagga, Sheng Zhang, Rajesh Rao, Tristan Naumann, Cliff Wong, Zelalem Gero, Javier González, Yu Gu, Yanbo Xu, Mu Wei, Wenhui Wang, Shuming Ma, Furu Wei, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Jaylen Rosemon, Tucker Bower, Soohee Lee, Roshanthi Weerasinghe, Bill J. Wright, Ari Robicsek, Brian Piening, Carlo Bifulco#, Sheng Wang#, Hoifung Poon#.
      Nature, 2024
      Link to the paper.
      Media coverage: Forbes, Yahoo, Becker’s Hospital Review, Fierce Biotech, CTOL Digital Solutions, HIT Consultant, GeekWire, Cosmic Log, HealthXL, RamaOnHealthcare, Providence, Nikkei, CryptoRank

    2. Enhancing Hi-C contact matrices for loop detection with Capricorn, a multi-view diffusion model.
      Tangqi Fang*, Yifeng Liu*, Addie Woicik*, Minsi Lu, Anupama Jha, Xiao Wang, Gang Li, Borislav Hristov, Zixuan Liu, Hanwen Xu, William S. Noble#, Sheng Wang#
      ISMB, 2024. bioRxiv

    3. MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning.
      Haozhe Zhao, Zefan Cai, Shuzheng Si, Xiaojian Ma, Kaikai An, Liang Chen, Zixuan Liu, Sheng Wang, Wenjuan Han, Baobao Chang
      ICLR, 2024.

    4. Retrieval augmentation of large language models for lay language generation.
      Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, and Trevor Cohen
      Journal of Biomedical Informatics, 2024.

    5. An organism-wide atlas of hormonal signaling based on the mouse lemur single-cell transcriptome.
      Shixuan Liu, Camille Ezran, Michael FZ Wang, Zhengda Li, Kyle Awayan, Jonathan Z Long, Iwijn De Vlaminck, Sheng Wang, Jacques Epelbaum, Christin S Kuo, Jérémy Terrien, Mark A Krasnow, James E Ferrell Jr
      Nature Communications, 2024.

    6. Foreseer: Product aspect forecasting using temporal graph embedding.
      Zixuan Liu, Gaurush Hiranandani, Kun Qian, Edward W Huang, Yi Xu, Belinda Zeng, Karthik Subbian, Sheng Wang#
      CIKM, 2023.

    7. Sagittarius: Extrapolating Heterogeneous Time-Series Gene Expression Data.
      Addie Woicik, Mingxin Zhang, Janelle Chan, Jianzhu Ma, Sheng Wang#
      Nature Machine Intelligence, 2023.
      Link to view-only full text.

    8. Gemini: Memory-efficient integration of hundreds of gene networks with high-order pooling.
      Addie Woicik, Mingxin Zhang, Hanwen Xu, Sara Mostafavi, Sheng Wang#
      ISMB, 2023

    9. Supervised biological network alignment with graph neural networks.
      Kerr Ding, Sheng Wang, Yunan Luo
      ISMB, 2023

    10. Multilingual translation for zero-shot biomedical classification using BioTranslator.
      Hanwen Xu, Addie Woicik, Hoifung Poon, Russ Altman, Sheng Wang#
      Nature Communications, 2023

    11. Pisces: A combo-wise contrastive learning approach to synergistic drug combination prediction.
      Jiacheng Lin*, Hanwen Xu*, Addie Woicik, Jianzhu Ma, Sheng Wang#
      RECOMB, 2023

    12. GraphPrompt: Graph-based Prompt Templates For Biomedical Synonym Prediction.
      Hanwen Xu*, Jiayou Zhang*, Zhirui Wang*, Shizhuo Zhang, Megh Bhalerao, Yucong Liu, Dawei Zhu, Sheng Wang#
      AAAI, 2023

    13. POPDx: An Automated Framework for Patient Phenotyping across 392,246 Individuals in the UK Biobank Study.
      Lu Yang, Sheng Wang, Russ Altman
      Journal of the American Medical Informatics Association (JAMIA), 2023

    14. MetaFill: Text Infilling for Meta-Path Generation on Heterogeneous Information Networks.
      Zequn Liu, Kefei Duan, Junwei Yang, Hanwen Xu, Ming Zhang#, Sheng Wang#
      EMNLP, 2022

    15. Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models.
      Shitong Luo, Yufeng Su, Xingang Peng, Sheng Wang, Jian Peng, Jianzhu Ma
      NeurIPS, 2022

    16. The Tabula Sapiens: A multiple-organ, single-cell transcriptomic atlas of humans.
      Tabula Sapiens Consortium
      Science, 2022

    17. Cell types of origin of the cell-free transcriptome.
      Vorperian, Sevahn K., Mira N. Moufarrej, Tabula Sapiens Consortium, and Stephen R Quake.
      Nature Biotechnology, 2022

    18. RNA splicing programs define tissue compartments and cell types at single-cell resolution.
      Julia Eve Olivieri, Roozbeh Dehghannasiri, Peter L Wang, SoRi Jang, Antoine de Morree, Serena Y Tan, Jingsi Ming, Angela Ruohao Wu, Tabula Sapiens Consortium
      eLife, 2022

    19. Brain-Aware Replacements for Supervised Contrastive Learning in Detection of Alzheimer's Disease
      Mehmet Saygın Seyfioğlu, Zixuan Liu, Pranav Kamath, Sadjyot Gangolli, Sheng Wang, Thomas Grabowski, Linda Shapiro
      MICCAI, 2022

    20. Graph-in-Graph Network for Automatic Gene Ontology Description Generation
      Fenglin Liu, Bang Yang, Xian Wu, Shen Ge, Addie Woicik, Sheng Wang#
      KDD, 2022

    21. Pathway2Text: Dataset and Method for Biomedical Pathway Description Generation
      Junwei Yang, Zequn Liu, Ming Zhang#, Sheng Wang#
      NAACL (Findings), 2022

    22. Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds
      Yu Zhang, Yu Meng, Xuan Wang, Sheng Wang, Jiawei Han
      NAACL, 2022

    23. Textomics: A Dataset for Genomics Data Summary Generation
      Mu-Chun Wang*, Zixuan Liu*, Sheng Wang#
      ACL, 2022

    24. Deep Graph Mutual Learning for Cross-domain Recommendation
      Yifan Wang, Yongkang Li, Shuai Li, Weiping Song, Jiangke Fan, Shan Gao, Huan Lou, Bing Cheng, Xunliang Cai, Sheng Wang, and Ming Zhang.
      DASFAA, 2022

    25. scPretrain: Multi-task self-supervised learning for cell type classification
      Ruiyi Zhang, Yunan Luo, Jianzhu Ma, Ming Zhang#, Sheng Wang#
      Bioinformatics, 2022 Code

    26. ProTranslator: zero-shot protein function prediction using textual description
      Hanwen Xu, Sheng Wang#
      RECOMB, 2022

    27. DisenCite: Graph-based Disentangled Representation Learning for Context-specific Citation Generation
      Yifan Wang, Yiping Song, Shuai Li, Chaoran Cheng, Wei Ju, Ming Zhang, Sheng Wang
      AAAI, 2022

    28. Auto-Encoding Knowledge Graph for Unsupervised Medical Report Generation
      Fenglin Liu, Chenyu You, Xian Wu, Shen Ge, Sheng Wang#, Xu Sun#
      NeurIPS, 2021

    29. Graphine: A Dataset for Graph-aware Terminology Definition Generation
      Zequn Liu, Shukai Wang, Yiyang Gu, Ruiyi Zhang, Ming Zhang#, Sheng Wang#
      EMNLP, 2021 Dataset Code

    30. HYPON: embedding biomedical ontology with entity sets
      Zhuoyan Li, Sheng Wang#
      ACM BCB, 2021

    31. DrugOrchestra: Jointly predicting drug response, targets, and side effects via deep multi-task learning
      Yuepeng Jiang, Stefano Rensi, Sheng Wang#, Russ B. Altman#
      RECOMB, 2021 bioRxiv Code

    32. Disease Gene Prediction with Privileged Information and Heteroscedastic Dropout
      Juan Shu, Yu Li, Sheng Wang, Bowei Xi, Jianzhu Ma.
      International Conference on Intelligent Systems for Molecular Biology (ISMB), 2021

    Publications (before tenure-track at UW)

    1. Leveraging the Cell Ontology to classify unseen cell types
      Sheng Wang*, Angela Oliveira Pisco*, Aaron McGeever, Maria Brbic, Marinka Zitnik, Spyros Darmanis, Jure Leskovec, Jim Karkanias, Russ B. Altman
      Nature Communications Code Project website

    2. Opportunities and challenges for the computational interpretation of rare variation in clinically important genes
      Gregory McInnes, Andrew G Sharo, Megan L Koleske, Julia EH Brown, Matthew Norstad, Aashish N Adhikari, Sheng Wang, Steven E Brenner, Jodi Halpern, Barbara A Koenig, David C Magnus, Renata C Gallagher, Kathleen M Giacomini, Russ B Altman
      The American Journal of Human Genetics

    3. Set2Gaussian: Embedding Gene Sets as Gaussian Distributions for Large-scale Gene Set Analysis
      Sheng Wang, Emily Flynn, Russ B. Altman
      Nature Machine Intelligence
      News & Views in Nature Machine Intelligence

    4. DisenHAN: Disentangled Heterogeneous Graph Attention Network for Recommendation
      Yifan Wang, Suyao Tang, Yuntong Lei, Weiping Song, Sheng Wang and Ming Zhang
      The Conference on Information and Knowledge Management (CIKM), 2020

    5. GRep: Gene Set Representation via Gaussian Embedding
      Sheng Wang, Emily Flynn, Russ B. Altman
      RECOMB, 2019

    6. Identification of Pathways Associated with Chemosensitivity through Network Embedding
      Sheng Wang, Edward Huang, Junmei Cairns, Jian Peng, Liewei Wang, Saurabh Sinha
      PLOS Computational Biology, 2019 (selected in AMIA Year-In-Review 2020)

    7. Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen
      Michael Menden et al.
      Nature Communications, 2019 (Our team ranked 5th of 160 teams in the DREAM Drug Combo Challenge)

    8. Typing tumors using pathways selected by somatic evolution
      Sheng Wang*, Jianzhu Ma*, Wei Zhang, John Paul Shen, Justin Huang, Jian Peng, Trey Ideker
      Nature Communications, 2018

    9. Annotating gene sets by mining large literature collections with protein networks
      Sheng Wang*, Jianzhu Ma*, Michael Ku Yu, Fan Zheng, Edward W Huang, Jiawei Han, Jian Peng, Trey Ideker
      Pacific Symposium on Biocomputing (PSB), 2018

    10. VisAGE: Integrating External Knowledge into Electronic Medical Record Visualization
      Edward Huang, Sheng Wang, ChengXiang Zhai
      Pacific Symposium on Biocomputing (PSB), 2018

    11. Large-Scale Integration of Heterogeneous Pharmacogenomic Data for Identifying Drug Mechanism of Action
      Yunan Luo, Sheng Wang, Jinfeng Xiao, Jian Peng.
      Pacific Symposium on Biocomputing (PSB), 2018

    12. HEMnet: Integration of Electronic Medical Records with Molecular Interaction Networks and Domain Knowledge for Survival Analysis
      Edward W Huang, Sheng Wang, Bingxue Li, Ran Zhang, Baoyan Liu, Runshun Zhang, Jie Liu, Xuezhong Zhou, Hongsheng Lin, ChengXiang Zhai
      ACM BCB, 2018

    13. Network-assisted target identification for haploinsufficiency and homozygous profiling screens
      Sheng Wang, Jian Peng
      PLOS Computational Biology, 2017

    14. ProSNet: integrating homology with molecular networks for protein function prediction
      Sheng Wang, Meng Qu, Jian Peng
      Pacific Symposium on Biocomputing (PSB), 2017

    15. PaReCat: Patient Record Subcategorization for Precision Traditional Chinese Medicine
      Edward Huang, Sheng Wang, Runshun Zhang, Baoyan Liu, Xuezhong Zhou, ChengXiang Zhai.
      ACM BCB, 2017

    16. Prediction of overall survival for patients with metastatic castration-resistant prostate cancer: development of a prognostic model through a crowdsourced challenge with open clinical trial data
      Justin Guinney et al.
      The Lancet Oncology, 2017 (Our team was a co-winner of the DREAM Prostate Cancer Challenge)

    17. Framing Electronic Medical Records as Polylingual Documents in Query Expansion
      Edward W Huang, Sheng Wang, Doris Jung-lin Lee, Runshun Zhang, Baoyan Liu, Xuezhong Zhou, ChengXiang Zhai
      AMIA, 2017

    18. proTCM: An Asymmetric Probabilistic Model for the Joint Analysis of Symptoms, Diseases, and Herbs in Traditional Chinese Medicine Clinical Data
      Sheng Wang, Edward Huang, Bingxue Li, Ran Zhang, Xiaoping Zhang, Baoyan Liu, Jie Liu, Runshun Zhang, Xuezhong Zhou, ChengXiang Zhai
      BIBM, 2017.

    19. Exploiting Ontology Graph for Predicting Sparsely Annotated Gene Function
      Sheng Wang*, Hyunghoon Cho*, ChengXiang Zhai, Bonnie Berger, Jian Peng
      ISMB/ECCB, 2015. Bioinformatics, 2015. (Candidate for Outstanding Student Paper Award)

    20. Early Identification of Adverse Drug Reactions from Search Log Data
      Ryen W. White, Sheng Wang, Apurv Pant, Rave Harpaz, Pushpraj Shukla, Walter Sun, William DuMouchel, Eric Horvitz.
      Journal of Biomedical Informatics, 2016. (Editors' Choice Award)

    21. DWCox: A Density-Weighted Cox Model for Robust Prediction of Prostate Cancer Survival
      Jinfeng Xiao, Sheng Wang, Jingbo Shang, Henry Lin, Doris Xin, Xiang Ren, Jiawei Han, Jian Peng
      F1000Research, 2016

    22. SideEffectPTM: An Unsupervised Topic Model to Mine Adverse Drug Reactions from Health Forums
      Sheng Wang, Yanen Li, Duncan Ferguson, ChengXiang Zhai
      ACM BCB, 2014

    23. SUIT: A Supervised User-Item based Topic model for Sentiment Analysis
      Fangtao Li, Sheng Wang, Shenghua Liu, Ming Zhang
      AAAI, 2014

    24. Supervised Topic Model with Consideration of User and Item
      Sheng Wang, Fangtao Li, Ming Zhang.
      AAAI Late breaking paper, 2014

    25. Please Spread: Recommending Tweets for Retweeting with Implicit Feedback
      Sheng Wang, Xiaobo Zhou, Ziqi Wang, Ming Zhang
      CIKM workshop, 2012

    Last updated date: Aug, 2020