Why we built Shorkie: reading the regulatory code of yeast
Research summary · 9 min read
The story behind Shorkie — a fungal DNA language model that learns the regulatory grammar of budding yeast and predicts how its genes turn on and off over time.
Writing
Research summaries and opinions — notes on the work I do and the ideas behind it.
Research summary · 7 min read
The story behind OpenSpliceAI — a faithful, fast PyTorch reimplementation of SpliceAI that anyone can retrain on any species, and use to read the splicing impact of genetic variants.
Research summary · 10 min read
The story behind LiftOn — a genome-annotation tool that fuses DNA alignment (Liftoff) and protein alignment (miniprot) so you can carry accurate gene annotations onto a new assembly, even across species.
Research summary · 6 min read
The story behind Splam: a compact deep-learning model that scores a splice junction from its local sequence, more accurately than SpliceAI, and uses those scores to strip spurious alignments out of RNA-seq data.
Research summary · 8 min read
The story behind the Wheeler Graph Toolkit — software to generate, recognize, and visualize Wheeler graphs, the elegant class of graphs that can be indexed and searched like a BWT, using a fast heuristic and an SMT solver.
Research summary · 6 min read
The story behind Han1 — the first gapless, reference-quality, fully annotated human genome from a Han Chinese individual, and what comparing two finished human genomes revealed.
Research summary · 6 min read
The story behind sangeranalyseR — a free, open-source R/Bioconductor package that takes you from raw .ab1 chromatograms to aligned consensus sequences in about four lines of code, with an interactive Shiny GUI.