Dragon
Cerdeira, L (2026). Dragon. [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.19478346
Resource-efficient sequence alignment against millions of prokaryotic genomes using graph-based compressed indexing. Dragon aligns query sequences (genes, plasmids, long reads) against millions of prokaryotic genomes while using dramatically less disk space and RAM than existing tools. It does so by exploiting the massive sequence redundancy among related genomes through three key innovations: (1) a coloured compacted de Bruijn graph, in which sequence shared across genomes is stored only once; (2) a run-length FM-index, a compressed seed index whose size is proportional to the number of BWT runs rather than to the text length; and (3) graph-aware colinear chaining, a seed-chaining step that respects the structure of the genome graph.
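The claim that a run-length index scales with BWT runs rather than text length can be illustrated with a small sketch. This is not Dragon's implementation: it is a naive Burrows-Wheeler transform built from sorted rotations, and the function names (`bwt`, `count_runs`, `mutate`), the toy genome, and the mutation positions are all assumptions made for illustration. It builds a collection of eight near-identical sequences and compares the number of BWT runs with that of a single sequence.

```python
import random


def bwt(text: str) -> str:
    """Naive Burrows-Wheeler transform via sorted rotations (illustrative only)."""
    assert "$" not in text, "'$' is reserved as the end-of-text sentinel"
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Count maximal runs of equal adjacent characters."""
    return sum(1 for i, ch in enumerate(s) if i == 0 or ch != s[i - 1])


def mutate(genome: str, pos: int) -> str:
    """Return a copy of genome with a single substitution at pos."""
    bases = "ACGT"
    new_base = bases[(bases.index(genome[pos]) + 1) % 4]
    return genome[:pos] + new_base + genome[pos + 1:]


# Hypothetical toy data: one random 200-base "genome" and seven relatives,
# each differing from it by a single substitution.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(200))
collection = genome + "".join(mutate(genome, 25 * c) for c in range(1, 8))

runs_one = count_runs(bwt(genome))
runs_many = count_runs(bwt(collection))

print(f"single genome: length {len(genome)}, BWT runs {runs_one}")
print(f"8 genomes:     length {len(collection)}, BWT runs {runs_many}")
```

In this toy setting the text grows eightfold while the run count should grow far more slowly, which is the redundancy a run-length FM-index is designed to exploit. Production tools build the BWT with suffix-array or prefix-free-parsing methods rather than by sorting rotations.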
Keywords
Machine Learning

| Item Type | Dataset |
|---|---|
| Resource Type | Software (Resource Description: Rust) |
| Capture method | Other |
| Date | 9 April 2026 |
| Language(s) of written materials | English |
| Creator(s) | Cerdeira, L |
| LSHTM Faculty/Department | Faculty of Infectious and Tropical Diseases > Department of Infection Biology |
| Participating Institutions | London School of Hygiene & Tropical Medicine, London, United Kingdom |
| Date Deposited | 09 Apr 2026 08:46 |
| Last Modified | 09 Apr 2026 08:47 |
| Publisher | Zenodo |
Explore Further
ORCID: https://orcid.org/0000-0002-4495-2615