🧑🏻💻 About me
🎓 I’m a fourth-year Ph.D. candidate in Computer Science at the Center for Computational Biology, Johns Hopkins University, working with Steven Salzberg and Mihaela Pertea. My academic journey started in Electrical Engineering at National Taiwan University (NTU) and shifted toward computer science in my final year, at the College of Engineering & Computer Science at the Australian National University (ANU).
🧬 My research lies at the intersection of deep learning, genomics, and transcriptomics:
- In transcriptional regulatory networks, my work uses sequence models to decode DNA patterns and uncover how cis-regulatory DNA sequences and trans-regulators interact. I am building a yeast large language model (LLM) from hundreds of fungal genomes to better understand the mechanisms of yeast gene expression regulation.
- In splice site prediction, I built a deep dilated residual convolutional neural network to decode the complexities of RNA splicing, alternative splicing, and the impact of genetic variants on cryptic splicing (Learn more).
- In genome assembly, I assembled and annotated the first gapless Southern Chinese Han genome, Han1, using PacBio HiFi and Oxford Nanopore long reads, with T2T-CHM13 as a guide (Learn more).
- For pangenome indexing, I applied new renaming heuristics and an SMT solver to make the Wheeler graph recognition problem computationally feasible (Learn more).
- In genome annotation, I used graph-based methods to stitch together fragmented DNA and protein alignments, assembling them into more accurate annotations (Learn more).
💻 I am an advocate for open-source software, embracing the philosophy of “build what you need, use what you build”. I invite you to explore my NEWS page for the latest updates on my projects.
💬 Feel free to reach out to me for collaborations, discussions, or just to say hi! Coffee chat! ☕️
Selected Publications
- Kuan-Hao Chao*, Alan Mao, Steven L. Salzberg, Mihaela Pertea* (2024). Splam: a deep-learning-based splice site predictor that improves spliced alignments, Genome Biology
- Kuan-Hao Chao*, Jakob M. Heinz, Celine Hoh, Alan Mao, Mihaela Pertea, Steven L. Salzberg* (2024). Combining DNA and protein alignments to improve genome annotation with LiftOn, bioRxiv
- Kuan-Hao Chao*, K. Barton, S. Palmer, and R. Lanfear* (2021). sangeranalyseR: simple and interactive processing of Sanger sequencing data in R, Genome Biology and Evolution
- Kuan-Hao Chao*, A.V. Zimin, M. Pertea, S.L. Salzberg* (2023). The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual, G3: Genes, Genomes, Genetics
- Kuan-Hao Chao*†, Pei-Wei Chen†, Sanjit A. Seshia, Ben Langmead* (2023). WGT: Tools and algorithms for recognizing, visualizing and generating Wheeler graphs, iScience
Selected Presentations
- Calico Internship 1-hour Talk, Calico, South San Francisco, CA, August 2024, Slides, Photo
- Invited Google Deep Dive 1-hour Talk, Google Health, Virtual & Mountain View, CA, August 2024, Slides, Video [Google internal only]
- ISMB General Computational COSI Talk, International Conference on Intelligent Systems for Molecular Biology, Montréal, Canada, July 2024, Slides, Video coming soon
- JHU Joint Biostats-Genomics Lab Meeting Talk, Baltimore, MD, May 2024, Video, Slides
- RECOMB-seq Talk, Research in Computational Molecular Biology on Biological Sequence Analysis, Cambridge, USA, April 2024, Slides
- RECOMB-seq Proceedings Talk, Research in Computational Molecular Biology on Biological Sequence Analysis, Istanbul, Türkiye, April 2023, Video, Slides
- ISMB/ECCB Poster, Intelligent Systems for Molecular Biology / European Conference on Computational Biology 2023, Lyon, France, July 2023, Link
Education
- Ph.D. Candidate in Computer Science, Johns Hopkins University, Sep/2021 - Present
- M.S.E. in Computer Science, Johns Hopkins University, Sep/2021 - May/2023
- B.S. in Electrical Engineering, National Taiwan University, Sep/2016 - Jan/2021
Experience
- Genomic Machine Learning Research Intern, Kelley Lab, Calico, May/2024 - Aug/2024
- Research Assistant, Institute of Information Science, Academia Sinica, Jul/2020 - Jan/2021
- Research Student, Research School of Biology, The Australian National University, Jul/2019 - Jun/2020
- Research Student, Centers of Genomic and Precision Medicine, National Taiwan University, Aug/2018 - Jul/2019
Honors & Fellowships
- Mark O. Robbins Prize awarded by Advanced Research Computing at Hopkins (ARCH), 2024
- Research highlight by JHU Whiting School of Engineering and CS Department, 2024
- Best Poster Award, Bioconductor Conference (Bioc2021), 2021
- College Student Research Fellowship awarded by Taiwan Ministry of Science and Technology, 2019
Selected open-source software
- Splam, splice site predictor: Code, Documentation, Poster, Paper, Cite
- LiftOn, annotation lift-over tool: Code, Documentation, Paper
- sangeranalyseR, R package for analyzing Sanger sequencing data: Code, Documentation, Poster, Paper, Cite
- Wheeler Graph Toolkit: Code, Poster, Paper, Cite
Side Projects
- Biobaby, Unity WebGL game, Play it now!
- Flappy penguin, Unity WebGL game, Play it now!
- Tank fire, Unity WebGL game, Play it now!
Teaching
- Johns Hopkins University
  - EN.580.458 / 658 Computing the Transcriptome, Teaching assistant, Spring 2023
- National Taiwan University
  - CSX 4001 Data Science Programming, Teaching assistant, Spring 2019
  - EE 1006 Cornerstone EECS Design and Implementation, Teaching assistant, Fall 2018
Service
- Reviewer
  - Genome Research: 2024
  - G3: Genes, Genomes, Genetics: 2024
  - BMC Bioinformatics: 2024
  - International Society for Computational Biology (ISCB): 2024
  - Chromatographia: 2023
- Sub-reviewer
  - Genome Research: 2024
  - Nature Machine Intelligence: 2023
  - G3: Genes, Genomes, Genetics: 2022