Welcome to ProPhyle
ProPhyle brings metagenomic classification from clusters to laptops. This is possible thanks to a novel indexing strategy based on bottom-up propagation of k-mers in the phylogenetic/taxonomic tree, assembly of contigs at each node, and matching with a full-text search.
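The bottom-up propagation step can be illustrated with a toy sketch. This is not ProPhyle's implementation: the `Node` class and the functions `kmers_of`, `propagate`, and `genomes_with` are hypothetical names introduced here, and the sketch assumes that k-mers shared by all children of a node are moved up to that node. It works on plain Python sets and omits contig assembly and the full-text index entirely.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; leaves represent reference genomes."""
    name: str
    children: list[Node] = field(default_factory=list)
    kmers: set[str] = field(default_factory=set)


def kmers_of(seq: str, k: int) -> set[str]:
    """Return the set of all k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def propagate(node: Node) -> None:
    """Post-order traversal: move k-mers shared by all children to the parent."""
    if not node.children:
        return
    for child in node.children:
        propagate(child)
    shared = set.intersection(*(c.kmers for c in node.children))
    node.kmers |= shared
    for child in node.children:
        child.kmers -= shared  # each k-mer is stored only once per root-to-leaf path


def genomes_with(node: Node, kmer: str, inherited: bool = False) -> list[str]:
    """Recover the leaves (genomes) containing a k-mer from the reduced sets."""
    present = inherited or kmer in node.kmers
    if not node.children:
        return [node.name] if present else []
    found = []
    for child in node.children:
        found += genomes_with(child, kmer, present)
    return found


# Toy tree with two leaf genomes and k = 3 (assumed values for illustration).
a = Node("a", kmers=kmers_of("ACGTAC", 3))
b = Node("b", kmers=kmers_of("ACGTTT", 3))
root = Node("root", children=[a, b])
propagate(root)
print(sorted(root.kmers))
print(genomes_with(root, "GTA"))
```

Under these assumptions, a k-mer held at an internal node stands for every genome below that node, so the list of genomes containing a k-mer can still be recovered after propagation.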
Compared to other state-of-the-art classifiers, ProPhyle provides several unique features:
- Low memory requirements. Compared to Kraken, ProPhyle has a 7x smaller memory footprint for index construction and a 5x smaller footprint for querying, while providing a more expressive index.
- Flexibility. ProPhyle is easy to use with any user-provided phylogenetic trees and reference genomic sequences (e.g., reads or assemblies). It can classify short reads, long reads, or even assembled contigs.
- Standard bioinformatics formats. Newick/NHX is used for representing phylogenetic trees and SAM/BAM for reporting assignments.
- Lossless k-mer indexing. ProPhyle stores a list of all genomes containing a k-mer. The classification is therefore accurate even with trees containing similar genomes (e.g., phylogenetic trees for a single species).
- Reproducibility. ProPhyle is fully deterministic, with a mathematically well-defined behavior. Databases are versioned and distributed via Zenodo.
[1] Břinda K, Lima L, Pignotti S, Quinones-Olvera N, Salikhov K, Chikhi R, Kucherov G, Iqbal Z, Baym M. Efficient and robust search of microbial genomes via phylogenetic compression. bioRxiv 2023.04.15.536996, 2023. https://doi.org/10.1101/2023.04.15.536996
[2] Břinda K, Salikhov K, Pignotti S, Kucherov G. ProPhyle 0.3.1.0. Zenodo, 2017. https://doi.org/10.5281/zenodo.1045429
[3] Břinda K, Salikhov K, Pignotti S, Kucherov G. ProPhyle: a phylogeny-based metagenomic classifier using the Burrows-Wheeler Transform. Poster at HiTSeq 2017. https://doi.org/10.5281/zenodo.1045427
[4] Břinda K. Novel computational techniques for mapping and classifying Next-Generation Sequencing data. PhD Thesis, Université Paris-Est, 2016. https://doi.org/10.5281/zenodo.1045317
[5] Salikhov K. Efficient algorithms and data structures for indexing DNA sequence data. PhD Thesis, Université Paris-Est, 2017.
[1] introduces phylogenetic compression, which is the fundamental concept behind ProPhyle; [2] is the main reference for the entire ProPhyle package; [3] contains a summary of the ProPhyle algorithm; [4] provides a thorough description (see Chapter 12); and [5] explains the details of the BWT-indexing technique.