Kuan-Hao Chao 



Sr. Deep Learning Scientist




About me

I am a Senior Deep Learning Scientist at the Illumina AI Lab. I earned my Ph.D. in Computer Science from the Center for Computational Biology, Johns Hopkins University (August 2025), advised by Dr. Steven Salzberg and Dr. Mihaela Pertea. My research focuses on AI for genomics—including sequence-to-function modeling, genome annotation, and DNA language models. I hold a B.S. in Electrical Engineering from National Taiwan University and completed a final-year exchange at the College of Engineering & Computer Science, Australian National University.

🧬 My research lies at the intersection of deep learning, genomics, and transcriptomics:

  • In RNA-seq gene expression prediction, I developed Shorkie, a masked-DNA language model pretrained on 165 fungal genomes and fine-tuned on high-resolution yeast TF induction time-course RNA-seq data (See what my model has learned; Learn more; Talk).
  • In splice site prediction, I built a dilated residual convolutional neural network to decode the complexities of RNA splicing, alternative splicing, and the impact of genetic variants on cryptic splicing (Paper I; Paper II; News; Talk).
  • In genome annotation, I used graph-based methods to stitch together fragmented DNA and protein alignments, assembling them into more accurate annotations (Paper; Talk).
  • In genome assembly, I assembled and annotated the first gapless Southern Chinese Han genome, Han1, using PacBio HiFi and Oxford Nanopore long reads, with T2T-CHM13 as a guide (Paper; Genome).
  • For pangenome indexing, I applied new renaming heuristics and an SMT solver to make the Wheeler graph recognition problem computationally feasible (Paper; News; Talk).

💻 I am an advocate for open-source software, embracing the philosophy of “build what you need, use what you build”. I invite you to explore my NEWS page for the latest updates on my projects.

💬 Feel free to reach out for collaborations, discussions, a coffee chat, or just to say hi! ☕️



  Selected Publications



  Selected Presentations

  • Doctoral Dissertation Defense Seminar: "Decoding the Language of Genomes: Bridging Sequences and Function through Deep Learning", Baltimore, MD, August 2025, Slides, Website
  • Robbins Prize Awardee Talk, JHU Symposium on High-Performance Computing (HPC 2025), Baltimore, MD, April 2025, Photo, Slides
  • JHU Joint Biostats-Genomics Lab Meeting Talk, Baltimore, MD, January 2025, Video, Slides
  • Calico Internship 1-hour Talk, Calico, South San Francisco, CA, August 2024, Photo, Slides
  • Invited Google Deep Dive 1-hour Talk, Google Health, Virtual & Mountain View, CA, August 2024, Video [Google internal only], Slides
  • ISMB General Computational COSI Talk, International Conference on Intelligent Systems for Molecular Biology, Montréal, Canada, July 2024, Video, Slides
  • JHU Joint Biostats-Genomics Lab Meeting Talk, Baltimore, MD, May 2024, Video, Slides
  • RECOMB-seq Talk, Research in Computational Molecular Biology on Biological Sequence Analysis, Cambridge, MA, USA, April 2024, Slides
  • RECOMB-seq Proceedings Talk, Research in Computational Molecular Biology on Biological Sequence Analysis, Istanbul, Türkiye, April 2023, Video, Slides
  • ISMB/ECCB Poster, Intelligent Systems for Molecular Biology / European Conference on Computational Biology 2023, Lyon, France, July 2023, Link



  Education


  Experience


  Honors & Fellowships


  Selected open-source software



  Side Projects



  Teaching

  • Johns Hopkins University
    • EN.580.458 / 658 Computing the Transcriptome, Teaching assistant, Spring 2023
  • National Taiwan University
    • CSX 4001 Data Science Programming, Teaching assistant, Spring 2019
    • EE 1006 Cornerstone EECS Design and Implementation, Teaching assistant, Fall 2018



  Service

  • Co-founder and Organizer
    • Johns Hopkins Deep Learning + Genomics Study Group, 2024 - Present, Slides, Repository
  • Reviewer
    • Human Genetics and Genomics Advances: 2025
    • BMC Genomics: 2025
    • Scientific Reports: 2025
    • Genome Research: 2024
    • G3: Genes, Genomes, Genetics: 2024
    • BMC Bioinformatics: 2024
    • International Society for Computational Biology (ISCB): 2024
    • Chromatographia: 2023
  • Sub-reviewer
    • Genome Research: 2024
    • Nature Machine Intelligence: 2023
    • G3: Genes, Genomes, Genetics: 2022