DNA language models
Preprint
Shorkie: reading yeast regulatory code with fungal DNA models
A compact fungal DNA language model for predicting budding-yeast RNA-seq coverage and regulatory variant effects.
Connected work
A project-level map of the papers, software, technical posts, talks, and reports behind my research.
Each project below collects the related outputs in one place: paper, code, docs, research summary, and technical notes where available.
DNA language models
Preprint
A compact fungal DNA language model for predicting budding-yeast RNA-seq coverage and regulatory variant effects.
Splice prediction
Published
A modular PyTorch implementation of SpliceAI that can be retrained across species and used for genome-wide variant rescoring.
RNA splicing
Published
A splice-junction recognizer designed to filter spurious RNA-seq junctions and improve downstream transcriptome assembly.
Genome annotation
Published
A genome-annotation lift-over tool that combines DNA alignments with protein evidence to preserve accurate coding models.
Genome assembly
Published
A complete, reference-quality, fully annotated genome assembled from a Southern Han Chinese individual.
Pangenomics
Published
A toolkit for recognizing, visualizing, generating, and repairing Wheeler graphs used in pangenome indexing.
Sequencing software
Published
An R/Bioconductor workflow and Shiny interface for trimming, assembling, inspecting, and reporting Sanger sequencing data.