PetCaseFinder

Peer-reviewed veterinary case report

Performance of a targeted enriched metagenomics approach to infer strains in milk.

Journal:
Frontiers in Veterinary Science
Year:
2026
Authors:
Biesheuvel, Marit M et al.
Affiliation:
Faculty of Veterinary Medicine · Canada

Abstract

Strain variation plays a key role in the microbial epidemiology of, yet its true diversity remains incompletely characterized, partly due to limitations of culture-based methods. This study evaluated the suitability of a targeted enrichment (TE) shotgun sequencing approach to detect and classify strains in milk metagenomic samples. As a proof of concept, the accuracy of this approach was assessed using milk-derived strains. A total of 620 whole-genome sequences were downloaded from NCBI, of which 162 (26.1%) originated from milk samples. Genomes were grouped into Genomically Clustered Sequence Variants (GSVs) using MashTree and TreeCluster to enable strain-level classification. To simulate TE sequencing data, genomes from different milk-associated GSVs were randomly selected and fragmented into 150-bp reads. Mock milk samples were generated by sampling reads with replacement from these genomes. Sequencing depth was modeled using a Poisson distribution, while mixed-strain DNA samples were simulated by including 1, 3, 6, or 9 GSVs per sample. Enrichment proportions were set at 0.3, 0.5, 0.7, and 0.9. Two classification tools, Kraken2 and Themisto/mSWEEP, were evaluated for their ability to detect and classify the simulated TE reads. Themisto/mSWEEP consistently outperformed Kraken2, achieving an average read classification accuracy of 84.9% compared with 1.4% for Kraken2. Sensitivity for Themisto/mSWEEP was 100% with a single spiked GSV and declined slightly to 97.0% with nine GSVs, whereas Kraken2 achieved sensitivities of only 17.3% and 4.7%, respectively. Positive predictive value (PPV) showed a similar pattern: 98% for Themisto/mSWEEP vs. 4.7% for Kraken2 with a single GSV, and 65.5% vs. 10% with nine GSVs. While Kraken2's PPV increased slightly with additional GSVs, Themisto/mSWEEP's PPV decreased. Both methods maintained high specificity and negative predictive value (>91%) across all scenarios. Enrichment proportion had no measurable effect on performance.
Overall, Themisto/mSWEEP demonstrated superior accuracy for GSV-level identification of strains. Enrichment to at least 30% of total reads was sufficient to recover strain-level data. Further work is needed to assess the biological relevance and practical applications of these genomic clusters.
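
The simulation and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names (`poisson`, `simulate_mock_sample`, `detection_metrics`) are hypothetical, and several details are assumptions made only for illustration: the Poisson draw is taken to set the total read count of a sample, the enrichment proportion is taken to be the fraction of those reads that come from the target genomes, and the spiked GSVs are mixed in equal proportions. The metric definitions themselves are the standard ones for sensitivity, PPV, specificity, and NPV.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson variate (Knuth's method; adequate for modest means)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_mock_sample(genomes, n_gsvs, enrichment, mean_depth,
                         read_len=150, rng=None):
    """Simulate one mock sample (illustrative assumptions, see lead-in).

    genomes: dict mapping a GSV label to one genome sequence (str).
    Returns (spiked GSV labels, list of (label, read), background read count).
    """
    rng = rng or random.Random()
    total_reads = poisson(mean_depth, rng)          # assumed: depth = read count
    n_target = round(total_reads * enrichment)      # assumed: enriched fraction
    spiked = rng.sample(sorted(genomes), n_gsvs)    # GSVs mixed into the sample
    reads = []
    for _ in range(n_target):
        label = rng.choice(spiked)                  # sampling with replacement
        seq = genomes[label]
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append((label, seq[start:start + read_len]))
    return spiked, reads, total_reads - n_target


def detection_metrics(spiked, detected, all_gsvs):
    """Per-sample GSV detection metrics from sets of GSV labels."""
    tp = len(spiked & detected)
    fp = len(detected - spiked)
    fn = len(spiked - detected)
    tn = len(all_gsvs - spiked - detected)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "ppv": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),
    }
```

In this sketch, a classifier's output for a sample would be reduced to the set of GSV labels it reports and compared against the spiked set with `detection_metrics`. Read-level classification accuracy, as reported in the abstract, would additionally require comparing each read's assigned label with its true label.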

Original publication: https://pubmed.ncbi.nlm.nih.gov/41929272/