Engineers at UC San Diego have published PanMAN, a compressive pangenome data structure and file format that encodes mutation-annotated trees and networks, enabling pangenomic analysis at scales that were previously infeasible. PanMAN achieves large compression ratios relative to existing formats while preserving the phylogenetic and mutational context that analyses need. Computations can run directly on the compressed data, and the format represents evolutionary histories alongside variant annotations, lowering the storage and compute barriers to population-scale comparative genomics. The team reported performance improvements that grow with dataset size and published the work in Nature Genetics. PanMAN targets a growing bottleneck in large-scale genomics, where data volume and representational limits constrain pangenomic approaches and downstream AI or population-scale analyses.
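
To give a rough sense of why a mutation-annotated tree can be compact, here is a toy sketch in Python. It is not PanMAN's file format or API: the `Node` class, the `reconstruct` and `mutations_from_root` helpers, and the example sequences are all hypothetical, and the sketch handles only single-base substitutions. The assumed idea is that one root sequence is stored in full, every other node stores only its differences from its parent, and some questions can be answered from those stored differences without rebuilding any genome.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node: stores only substitutions relative to its parent."""
    name: str
    # Each mutation is (0-based position, new base).
    mutations: List[Tuple[int, str]] = field(default_factory=list)
    parent: Optional["Node"] = None


def path_from_root(node: Node) -> List[Node]:
    """Return the nodes on the path root -> node, in that order."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current)
        current = current.parent
    path.reverse()
    return path


def reconstruct(node: Node, root_sequence: str) -> str:
    """Rebuild a node's full sequence by replaying mutations from the root down."""
    seq = list(root_sequence)
    for step in path_from_root(node):
        for pos, base in step.mutations:
            seq[pos] = base
    return "".join(seq)


def mutations_from_root(node: Node) -> int:
    """Count mutations on the root -> node path without rebuilding the sequence."""
    return sum(len(step.mutations) for step in path_from_root(node))


# Illustrative data only: one stored root sequence and three derived genomes.
ROOT_SEQUENCE = "ACGTACGT"
root = Node("root")
a = Node("a", mutations=[(2, "T")], parent=root)
b = Node("b", mutations=[(5, "A")], parent=a)
c = Node("c", mutations=[(0, "G")], parent=root)

print(reconstruct(b, ROOT_SEQUENCE))   # ACTTAAGT
print(reconstruct(c, ROOT_SEQUENCE))   # GCGTACGT
print(mutations_from_root(b))          # 2
```

In this toy version, three derived genomes cost three stored substitutions instead of three full copies of the sequence, and `mutations_from_root` reads its answer straight from the stored differences.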