An overview of next generation sequencing strategies and genomics tools used for tuberculosis research

Table 1.

Tools and pipelines for identifying drug-resistant M.tb strains and analysing TB infections.

Name	Category	Application	Access link	References
k-mer-based method (Bugwas)	Software/Database	Uses a linear mixed model to identify genetic variants causing drug resistance at the lineage level, focusing on differences in genomic regions in bacterial pathogens causing TB infection.	https://github.com/sgearle/bugwas	Earle et al. (2016), Jaillard et al. (2018)
Mykrobe predictor	Software/Database	The Mykrobe predictor software package efficiently analyses raw read sequence data to produce user-friendly reports on drug-resistant M.tb strains	https://github.com/Mykrobe-tools/mykrobe	Hunt et al. (2019)
TnSeq pipeline	Software/Database	TnSeq data analysis maps reads from transposon-junction to the mutant strain’s genome, allowing for strain-specific traits investigation	https://gitlab.com/tbgenomicsunit/tnseq-pipeline	Carey et al. (2018)
TB-DROP	Software/Database	A tailored deep learning model to predict MTB drug resistance using genome mutations	https://github.com/nottwy/TB-DROP	Wang et al. (2024)
Protein druggability database (TuberQ)	Pipeline/Database	It uses 3982 Open Reading Frames from the H37Rv strain for HMMer analysis, microarray expression data, structural homology-modeling, and drug pockets prediction to predict druggable M.tb proteins	http://tuberq.proteinq.com.ar/	Radusky et al. (2014)
SpolLineages	Pipeline/Database	A Java-based program that mainly relies on components from RuleTB, the SITVIT2 database, decision trees, and evolutionary computations. It is used to identify members of the M.tb complex through various typing patterns	https://github.com/dcouvin/SpolLineages	Couvin et al. (2020)
CHOPIN	Database	Predicts the structural effect from mutations conferring drug resistance to the M.tb complex	http://structure.bioc.cam.ac.uk/chopin	Ochoa-Montaño et al. (2015)
Tbvar	Database	Annotates and identifies novel variants using the WGS technique	http://genome.igib.res.in/tbvar/	Joshi et al. (2014)
SInCre	Database	Analyses the M.tb proteome, enabling functional domain, homology, binding pocket, and structural annotation	http://proline.biochem.iisc.ernet.in	Metri et al. (2015)
TIBLE	Pipeline/Database	TIBLE is a user-friendly online resource that offers convenient access to information on the minimal inhibitory concentrations of small molecules against various mycobacterial species. Additionally, it provides predictions on target binding and off-target effects for M.tb	http://www-cryst.bioc.cam.ac.uk/tible/	Malhotra et al. (2017)
AntiTbPdb	Database	The AntiTbPdb serves as a repository for experimentally validated peptides with anti-tubercular or anti-mycobacterial properties. It furnishes comprehensive details for each peptide, including sequence, modifications, origin, strain-specific mycobacterium species, inhibition concentration, specific immune response, and more. Additionally, the database incorporates predicted structures for these anti-tubercular peptides	http://webs.iiitd.edu.in/raghava/antitbpdb/	Usmani et al. (2018)
HGV&TB Database	Database	Contains information on 307 variants in 98 TB-associated genes	genome.igig.res.in/hgvtb/index.html	Sahajpal et al. (2014)
SpolPred	Software	Identifies the spoligotype in M.tb from NGS raw read sequences	www.pathogenseq.org/spolpred	Coll et al. (2012)
MycPermCheck	Online prediction tool	A web tool for analysing small molecule permeability in M.tb cells, predicting based on logistic regression, and target molecule physico-chemical features	http://www.mycpermcheck.aksotriffer.pharmazie.uni-wuerzburg.de	Merget et al. (2013)
DeepAMR	Online prediction tool	This tool uses genome sequence data to classify drug-resistance labels with reduced dimensionality, achieving high sensitivity and specificity	http://www.robots.ox.ac.uk/~davidc/code.php	Yang et al. (2019)
MtbRegList	Database	A database dedicated to the analysis of transcriptional regulation in M.tb	http://www.USherbrooke.ca/vers/MtbRegList	Jacques et al. (2005)
TubercuList	Database	This database utilizes up-to-date curated genomes and protein 3D structures information to reannotate previously published TB genomes, enabling accurate prediction of genes and their respective functions	http://genolist.pasteur.fr/TubercuList/	Camus et al. (2002)
SAM-TB	Pipeline/Database	SAM-TB integrates variant detection, genomic cluster inference, detection of mixed NTM and MTB samples, and NTM species identification. SAM-TB also offers confidence levels for resistance predictions and supports batch export of analysis results	http://samtb.szmbzx.com	Yang et al. (2022b)
TB-Profiler	Online profiling tool	Bioinformatics web server for trimming NGS reads, aligning them to a reference genome, and calling variants	https://tbdr.lshtm.ac.uk/	Phelan et al. (2019)

Table 2.

Drug resistance mutation and public databases designed for TB research.

Name	Category	Application	Access link	References
TB-Lineage	Pipeline/Online prediction tool	An online tool for classification and analysis of strains of M.tb complex	https://tbinsight.cs.rpi.edu/run_tb_lineage.html	Shabbeer et al. (2012)
The TB Portals	Database	An open-access, web-based platform for global drug-resistant-tuberculosis data sharing and analysis	https://tbportals.niaid.nih.gov/	Rosenthal et al. (2017)
COMBAT-TB-NeoDB	Pipeline/Database	Fostering TB research through integrative analysis using graph database technologies	https://github.com/COMBAT-TB/combat-tb-neodb	Lose et al. (2020)
TB DEPOT	Database	A novel public analytics platform integrating TB clinical, genomic, and radiological data for visual and statistical exploration	https://depot.tbportals.niaid.nih.gov/#/home	Gabrielian et al. (2019)
getTBinR	Software	An R package for accessing and summarizing the World Health Organization Tuberculosis data	https://github.com/seabbs/getTBinR	Abbott (2019)
TBDBT	Database	A TB DataBase template for collection of harmonized TB clinical research data in REDCap, facilitating data standardization for inter-study comparison and meta-analyses	https://github.com/CIDRI-Africa/TBDBT/	Allie et al. (2021)
TBNet	Pipeline/Database	A context-aware graph network for TB diagnosis	https://www.tbnet.eu/	Giehl et al. (2012)

Evaluating minor genetic variation in M.tb population

Progress in molecular techniques has revealed the capacity of M.tb to cause polyclonal infections (Moreno-Molina et al. 2021). Mixed infections may give rise to multiple unrelated clones within a patient, or microevolution may lead to the emergence of closely related clones from a previously clonal M.tb population. Accurately identifying minor variants in an M.tb population is crucial for improving our understanding of heteroresistance within dynamic M.tb populations. Identifying minor variants in WGS data has long been challenging owing to the limited ability of trimming, filtering, and standard methods to differentiate low-frequency variants from sequence ambiguities (Said Mohammed et al. 2018). The bioinformatics tool BinoSNP has simplified the identification of minor variants by assessing a customized collection of genomic positions with a binomial test (Dreyer et al. 2020). Nevertheless, its capability is confined to SNPs in resistance-conferring genes, so tools like BinoSNP are unsuitable for de novo detection of unspecified variants, such as non-resistance-associated variation in minor subpopulations. Other studies have reported that the LoFreq variant calling tool identifies minor variants, including both SNPs and indels, within predetermined resistance-associated loci as well as in previously unexplored genomic regions (Wilm et al. 2012). Goossens et al. (2022) assessed LoFreq’s performance in detecting de novo and drug resistance-associated minor variants in both simulated and clinical M.tb NGS data. The results show LoFreq to be a precise variant caller with high sensitivity, especially for indels. It was precise and accurate across the entire range of coverage depths assessed, regardless of the minor variant type or frequency, and reliably detected variants down to a frequency limit of detection of 0.5% for indels and 3% for SNPs. In clinical data, LoFreq successfully identified minor M.tb variants, even at low allele frequencies, which suggests its potential to reduce false positives due to sequencing errors. These findings aid in determining detection limits and guiding future M.tb variant studies. A limitation of the study was the small clinical sample size, which precluded statistical validation of LoFreq’s performance metrics and underscored the need for validation tests on a larger set of clinical samples covering both SNP and indel mutations. These observations collectively emphasize the ongoing need for benchmarking whole-genome variant calling tools capable of detecting minor M.tb variants at various depths of population coverage.
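The idea of separating a low-frequency variant from sequencing error with a binomial test can be illustrated with a minimal Python sketch. This is a generic one-sided binomial test, not the BinoSNP or LoFreq implementation; the function names, the per-base error rate of 0.001, and the significance level of 0.01 are all assumptions chosen for illustration.

```python
from math import exp, lgamma, log, log1p


def binomial_tail(alt_reads: int, depth: int, error_rate: float) -> float:
    """Return P(X >= alt_reads) for X ~ Binomial(depth, error_rate).

    Computed in log space so that deep coverage does not overflow.
    Requires 0 < error_rate < 1.
    """
    total = 0.0
    for k in range(alt_reads, depth + 1):
        log_pmf = (
            lgamma(depth + 1) - lgamma(k + 1) - lgamma(depth - k + 1)
            + k * log(error_rate)
            + (depth - k) * log1p(-error_rate)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


def is_minor_variant(alt_reads: int, depth: int,
                     error_rate: float = 0.001,   # assumed error rate
                     alpha: float = 0.01) -> bool:  # assumed significance level
    """Flag a position when its alternate-allele reads are unlikely to be errors."""
    if depth == 0:
        return False
    return binomial_tail(alt_reads, depth, error_rate) < alpha


# Five alternate reads at 200x depth are hard to explain by error alone;
# a single alternate read at the same depth is not.
print(is_minor_variant(5, 200))
print(is_minor_variant(1, 200))
```

In practice the error rate would be estimated from base qualities or control data, and testing many positions would call for multiple-testing correction; both are omitted here.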

Mycobacterium tuberculosis pangenome analysis

The pan-genome encompasses the entire genetic repertoire of a microbial population, comprising core orthologous genes, unique strain-specific genes, and accessory genes. Open pan-genomes continue to gain novel gene families as more genomes are added, whereas closed pan-genomes show no further expansion (Bosi et al. 2015). The pan-genome approach offers insights into the distribution of virulence genes within pathogenic microbial populations, particularly in monitoring the emergence of drug-resistance genes from novel genetic variants (Muzzi et al. 2007). Pan-genome analysis has recently gained popularity for investigating genetic signatures related to antibiotic resistance (Kavvas et al. 2018) and adaptive evolution (Yang et al. 2018), and for assessing genomic distance among M.tb lineages (Jandrasits et al. 2019). PANPASCO, a computational method for pan-genome mapping, uses pairwise distance calculations, is highly sensitive to variation between cases, and leverages WGS for effective transmission surveillance (Jandrasits et al. 2019). Additional research on the Mycobacterium pan-genome has revealed its significance in identifying potential drug targets and understanding the diversification of the Type VII secretion system, which in turn influences the pathogenicity of M.tb strains (Dumas et al. 2016, Dar et al. 2020). However, as indicated by Kim et al. (2020), M.tb is not ideally suited to pan-genome studies, possibly because of its high genomic homogeneity and strict clonality.
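The split of a pan-genome into core, accessory, and unique genes can be made concrete with a small Python sketch. It assumes gene presence has already been determined per strain (for example from an orthologue clustering step, which is not shown); the strain and gene labels below are hypothetical placeholders, not real M.tb annotations.

```python
from collections import Counter


def partition_pangenome(gene_presence: dict[str, set[str]]) -> dict[str, set[str]]:
    """Partition genes by how many strains carry them.

    core      = present in every strain
    unique    = present in exactly one strain
    accessory = everything in between
    """
    n_strains = len(gene_presence)
    counts = Counter(g for genes in gene_presence.values() for g in genes)
    core = {g for g, c in counts.items() if c == n_strains}
    unique = {g for g, c in counts.items() if c == 1 and n_strains > 1}
    accessory = set(counts) - core - unique
    return {"core": core, "accessory": accessory, "unique": unique}


# Hypothetical toy input: three strains and four gene families.
toy = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneD"},
}
for category, genes in partition_pangenome(toy).items():
    print(category, sorted(genes))
```

Real analyses often relax the core definition (for instance, genes present in at least 99% of strains) to tolerate assembly gaps; the strict definition is used here only to keep the sketch short.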

WGS for identification of M.tb mixed infections

Mixed infections arise from either concurrent infection by distinct strains in a patient or strain evolution within the host, resulting in two co-existing populations. Mixed M.tb infections and heteroresistance present challenges for the prognosis and treatment of TB disease. Their detection has predominantly been limited to conventional genotyping techniques, which often lack the required sensitivity and yield inaccurate estimates of population diversity in TB infections (Richardson et al. 2002, van Rie et al. 2005, Zetola et al. 2014, Zong et al. 2018, Liang et al. 2020). Results from the GeneXpert assay indicate that current diagnostic methods are not very effective at identifying mixed infections involving both M.tb and nontuberculous mycobacteria (NTM). This highlights the urgent need for targeted molecular analyses that capture multiple loci of mycobacterial species directly from specimens. Inadequate exploration of within-host M.tb diversity also makes it difficult to distinguish between relapse and reinfection (Zong et al. 2018). Although WGS provides a comprehensive view of the genetic makeup of an individual strain, interpreting and analysing the data to identify the components of a mixed infection remains challenging, and few established methods exist for identifying mixed TB infections from WGS data. New approaches, such as Bayesian framework analysis and heterozygous allele identification, have emerged to distinguish strains within an M.tb population (Yang et al. 2023).

Deep WGS has proven effective in discerning M.tb strains within mixed infections through exploration of phylogenomic databases derived from single nucleotide variant (SNV) analysis (Gan et al. 2016). Lozano et al. (2021) proposed a novel strategy for capturing minority variants and identifying mixed infections using WGS data. The researchers designed a platform named MycoCAP, comprising M.tb DNA capture probes, which enables the targeted enrichment of M.tb DNA in samples. They then performed WGS on the captured M.tb DNA to enhance the detection of minority variants and mixed infections. To date, two bioinformatics tools designed specifically for classifying strains in mixed infections from WGS data have been reported. The first, QuantTB, quantifies individual M.tb strains by comparing TB genomes with reference SNPs of each lineage (Anyansi et al. 2020). The second, SplitStrains, employs a rigorous statistical method and the Expectation-Maximization algorithm to accurately separate the constituent strains in a mixed infection (Gabbassov et al. 2021).
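How an Expectation-Maximization step can estimate strain proportions from heterozygous sites is sketched below in Python. This is a deliberately simplified two-strain binomial mixture written for illustration, not the model used by SplitStrains or QuantTB; the function name, starting value, and read counts are assumptions. Each site is treated as carrying its alternate allele in either the minor or the major strain, with equal prior probability.

```python
from math import exp, log


def estimate_minor_fraction(sites, p_init=0.3, n_iter=100, tol=1e-9):
    """Estimate the minor strain's proportion with a simple EM loop.

    sites: list of (alt_reads, depth) tuples at positions where the
           two strains are assumed to differ.
    """
    total_depth = sum(depth for _, depth in sites)
    if total_depth == 0:
        raise ValueError("no reads at the supplied sites")
    p = p_init
    for _ in range(n_iter):
        minor_reads = 0.0
        for alt, depth in sites:
            ref = depth - alt
            # E-step: log-likelihood of the counts if the alternate allele
            # belongs to the minor strain versus the major strain.
            ll_minor = alt * log(p) + ref * log(1 - p)
            ll_major = alt * log(1 - p) + ref * log(p)
            top = max(ll_minor, ll_major)
            w_minor = exp(ll_minor - top)
            w_major = exp(ll_major - top)
            r = w_minor / (w_minor + w_major)
            # Expected number of reads drawn from the minor strain.
            minor_reads += r * alt + (1 - r) * ref
        # M-step: updated proportion, kept strictly inside (0, 1).
        new_p = min(max(minor_reads / total_depth, 1e-6), 1 - 1e-6)
        converged = abs(new_p - p) < tol
        p = new_p
        if converged:
            break
    return min(p, 1 - p)


# Hypothetical read counts at five heterozygous sites, each at 100x depth.
counts = [(20, 100), (80, 100), (18, 100), (83, 100), (22, 100)]
print(round(estimate_minor_fraction(counts), 3))
```

The sketch ignores sequencing error and assumes exactly two strains; a production tool would model both and would first have to decide which sites are truly heterozygous.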

Transmission network analysis of TB infection

Genomic sequences of M.tb strains obtained at various time points are increasingly used to infer the onset of specific outbreaks, the emergence and spread of drug-resistant clones, or the introduction of a strain into a particular geographic region (Saavedra Cervera et al. 2022, Yang et al. 2022a). For epidemiological insight, a temporally calibrated phylogeny is particularly valuable for reconstructing infectious disease transmission patterns from genomic data (Didelot et al. 2021). More recently, deep NGS has shown considerable promise as a strategy for genome-based surveillance of pathogens and for establishing transmission links among sequenced bacterial pathogens (Sobkowiak et al. 2023). A systematic comparison of publicly available transmission reconstruction models has evaluated their accuracy in predicting transmission events in both simulated and real-world outbreaks of M.tb (Sobkowiak et al. 2023). Using SNP thresholds, WGS improves precision in determining the direction and timing of individual transmission events in M.tb infection (Walker et al. 2013, Stimson et al. 2019). A more advanced approach to transmission reconstruction, known as phylodynamics, uses time-scaled phylogenetic trees (Didelot et al. 2014). However, within-host evolution, latency periods, and low genomic heterogeneity complicate the application of phylodynamics to TB transmission analysis (Ypma et al. 2013, Romero-Severson et al. 2014). Various computational tools integrate genomic variation and epidemiological data to estimate the likelihood of individual-level transmission events (Table 3). These tools predominantly employ a Bayesian Markov chain Monte Carlo framework and robust statistical approaches, with rigorous computational validation of epidemiological parameters. A recent study described online tools for visualizing transmission networks and evaluated their feasibility for real-time analysis of pathogen sequence data (Neher and Bedford 2018).
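A SNP-threshold approach to flagging putative transmission links can be sketched in a few lines of Python. This is a simplified illustration rather than the method of any tool listed in Table 3: it assumes equal-length aligned SNP sequences, ignores ambiguous or gapped positions, and groups isolates by single-linkage clustering. The threshold is a user-supplied parameter; the default of 5 SNPs and the function names are assumptions made for the example.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count differing positions, skipping ambiguous or gapped sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )


def cluster_by_threshold(seqs, threshold=5):
    """Group isolates whose pairwise SNP distance is <= threshold.

    seqs: dict mapping isolate name to aligned SNP sequence.
    Returns a sorted list of clusters, each a sorted list of names.
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

Clusters produced this way are undirected: they indicate which isolates are genetically close enough to be considered linked, but not who infected whom, which is the gap the model-based tools aim to fill.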

Table 3.

List of available software and tools used for infection transmission analysis.

Tools	Language/platform	Input data	Access link	Source
TransPhylo	R and MATLAB	Time-stamped phylogenetic tree	https://github.com/xavierdidelot/TransPhylo	Didelot et al. (2014)
SCOTTI	Python	Time-stamped phylogenetic tree	https://bitbucket.org/nicofmay/scotti/src/master/	De Maio et al. (2016)
outbreaker2	R and C++	Time-stamped phylogenetic tree	http://www.repidemicsconsortium.org/outbreaker2/	Campbell et al. (2018)
TransFlow	R and Python	Raw reads and sample metadata	https://github.com/cvn001/transflow	Pan et al. (2023a)
Phybreak	R	Time-stamped phylogenetic tree	https://github.com/donkeyshot/phybreak	Klinkenberg et al. (2017)
QUENTIN	MATLAB	Aligned FASTA sequences	https://github.com/skumsp/QUENTIN	Skums et al. (2018)
PHYLOSCANNER	Python and R	BAM files	https://github.com/BDI-pathogens/phyloscanner	Wymant et al. (2018)
nosoi	R	User-defined host parameters	https://slequime.github.io/nosoi/index.html	Lequime et al. (2020)
TNet	Python	Pathogen phylogeny	https://github.com/sauravdhr/tnet_python	Dhar et al. (2022)
LITT	R	SNP matrix and epidemiological data	https://github.com/CDCgov/TB_molecular_epidemiology/tree/1.0	Winglee et al. (2021)
o2geosocial	R	Epidemiological data (genetic sequences not included)	https://github.com/alxsrobert/o2geosocial	Robert et al. (2021)
GraphSNP	R and Java	SNP distance	https://github.com/nalarbp/graphsnp	Permana et al. (2023)
SOPHIE	Python and MATLAB	Phylogenetic tree and sample metadata	https://github.com/compbel/SOPHIE/	Skums et al. (2022)
P-DOR	Python	Assembled genome and SNP phylogeny	https://github.com/SteMIDIfactory/P-DOR	Batisti Biffignandi et al. (2023)
StrainHub	R	Phylogenetic tree and associated metadata	https://github.com/abschneider/StrainHub	de Bernardi Schneider et al. (2020)
Time-scaled haplotypic density (THD)	R	Genetic distances and user-defined parameters	https://github.com/rasigadelab/thd	Wirth et al. (2020)
Visualization of transmission network
Nextstrain	Web application	Nextstrain employs TreeTime to infer time-scaled phylogenies and conduct ancestral sequence inference to determine the likely geographic origins of ancestral nodes	https://nextstrain.org/	Hadfield et al. (2018)
Microreact	Web application	Microreact facilitates the exploration of phylogenetic trees as well as spatial and temporal data of samples. Custom datasets can be imported into the application using a Newick tree and sample metadata in tabular format	https://microreact.org/	Argimón et al. (2016)
Graphia	Open-source platform	Graphia is a visual analytics platform designed for the network-based analysis of large and complex datasets, such as those generated in vast quantities by modern biological analyses	https://graphia.app/	Freeman et al. (2022)

All software and tools are freely available for public use under the General Public License version 3.

Tools such as TransPhylo, QUENTIN, and PHYLOSCANNER can incorporate the within-host genomic diversity of strains to infer transmission routes, but each has limitations. For example, TransPhylo uses a time-calibrated phylogeny that takes into account multiple consensus genomes from a single host. However, in a new outbreak the short timescale involved may make it difficult to obtain a clear temporal signal. QUENTIN and PHYLOSCANNER use different methods to establish transmission links between hosts. QUENTIN employs graph and network theory to reconstruct within-host phylogenies, whereas PHYLOSCANNER works from mapped reads (BAM files), which allows it to identify sub-populations within the host. These within-host network reconstruction processes may introduce biases that affect the inferred distribution of bacterial sub-populations. A comparative analysis of tools used in genome-based analysis of TB transmission found that Phybreak, outbreaker2, and TransPhylo were the most effective at identifying accurate links in transmission networks derived from TB infection data (Sobkowiak et al. 2023). These tools recovered the largest number of correct links in the TB transmission network, which suggests that they could be useful for developing effective TB control strategies. The accuracy of transmission history inference depends on the rate and extent of genomic heterogeneity (Campbell et al. 2018). Inferring transmission trees from genetic data becomes challenging in pathogens with a high evolutionary rate, primarily due to substantial within-host diversity and genomic dissimilarity among sequenced strains (Morelli et al. 2012). As a result, methods have been developed to integrate both genomic and epidemiological data to infer potential transmission trees (Jombart et al. 2014, Goldstein et al. 2022). A recent study emphasized that potential biases must be investigated comprehensively when using a statistical method that combines phylogeny and epidemiological data to analyse TB transmission events (Pan et al. 2023a).
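The temporal signal mentioned above in connection with time-calibrated phylogenies is commonly checked with a root-to-tip regression before a dated tree is inferred. The following Python sketch shows the calculation under stated assumptions: sampling dates are given in decimal years, root-to-tip distances have already been extracted from a rooted phylogeny, and an ordinary least-squares fit is adequate. It is not part of TransPhylo or any other tool in Table 3, and the function name is hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    dates: sampling dates in decimal years.
    distances: root-to-tip distances for the same tips, in the same order.
    Returns (slope, intercept, r_squared). The slope is a crude estimate of
    the substitution rate and r_squared indicates the strength of the
    temporal signal.
    """
    n = len(dates)
    if n < 3 or n != len(distances):
        raise ValueError("need at least three tips with matching inputs")
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all tips share the same sampling date")
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared
```

When samples span only a short period, the spread of dates is small relative to the noise in the distances, so the fitted slope is unstable and the coefficient of determination tends to be low. This is one way to see why a recent outbreak may not yield a clear temporal signal.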

Conclusion

The sequencing of M.tb genomes has revolutionized TB research and has positively impacted both research and practice. WGS has played a pivotal role in enhancing epidemiological surveillance, facilitating the monitoring of transmission within communities, and tracing the lineage of M.tb across broader geographical and temporal landscapes. The insights derived from M.tb genome sequencing have the potential to accelerate further advances in the years ahead, as researchers pursue patient-level investigations and employ innovative whole-genome bacteriological methodologies for translational purposes. Emerging NGS strategies facilitate the simultaneous tracing of gene expression patterns while providing comprehensive coverage for the efficient study of immune responses to M.tb infection. Furthermore, the expanding pool of WGS data obtained from phenotypically diverse M.tb strains, coupled with advances in genome-wide association study algorithms, is enabling the discovery of previously elusive determinants of drug resistance. In TB research, careful selection of genomic pipelines and algorithms has become critical for ensuring standardized WGS results when examining features of M.tb isolate genomes to investigate genetic heterogeneity, microevolution, and disease transmission events. This review provides systematic information on available software and bioinformatics pipelines tailored for Mycobacterium genome analysis, with a focus on emerging NGS approaches and their optimal integration into TB research efforts.

Conflict of interest

The authors declare no conflicts of interest.

Funding

The authors received no specific grant from any funding agency.

Author contributions

Sushanta Deb (Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft), Jhinuk Basu (Investigation, Methodology, Writing – original draft), and Megha Choudhary (Formal analysis, Investigation, Writing – original draft).

References

Abbott S. getTBinR: an R package for accessing and summarising the World Health Organisation Tuberculosis data. J Open Source Softw. 2019;4:1260. https://doi.org/10.21105/joss.01260

Advani J, Verma R, Chatterjee O et al. Whole genome sequencing of Mycobacterium tuberculosis clinical isolates from India reveals genetic heterogeneity and region-specific variations that might affect drug susceptibility. Front Microbiol. 2019;10:309. https://doi.org/10.3389/fmicb.2019.00309

Akter S, Khader SA. A protocol to analyze single-cell RNA-seq data from Mycobacterium tuberculosis-infected mice lung. STAR Protoc. 2023;4:102544. https://doi.org/10.1016/j.xpro.2023.102544

Allie T, Jackson A, Ambler J et al. TBDBT: a TB DataBase template for collection of harmonized TB clinical research data in REDCap, facilitating data standardisation for inter-study comparison and meta-analyses. PLoS One. 2021;16:e0249165. https://doi.org/10.1371/journal.pone.0249165

Anyansi C, Keo A, Walker BJ et al. QuantTB – a method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data. BMC Genomics. 2020;21:80. https://doi.org/10.1186/s12864-020-6486-3

Argimón S, Abudahab K, Goater RJE, Fedosejev A, Bhai J, Glasner C, Feil EJ, Holden MTG, Yeats CA, Grundmann H, Spratt BG, Aanensen DM. Microreact: visualizing and sharing data for genomic epidemiology and phylogeography. Microb Genom. 2016;2:e000093. https://doi.org/10.1099/mgen.0.000093

Batisti Biffignandi G, Bellinzona G, Petazzoni G et al. P-DOR, an easy-to-use pipeline to reconstruct bacterial outbreaks using genomics. Bioinformatics. 2023;39:btad571. https://doi.org/10.1093/bioinformatics/btad571

Björkman J, Nagaev I, Berg OG et al. Effects of environment on compensatory mutations to ameliorate costs of antibiotic resistance. Science. 2000;287:1479–82. https://doi.org/10.1126/science.287.5457.1479

Bosi E, Fani R, Fondi M. Defining orthologs and pangenome size metrics. Methods Mol Biol. 2015;1231:191–202. https://doi.org/10.1007/978-1-4939-1720-4_13

Brosch R, Gordon SV, Marmiesse M et al. A new evolutionary scenario for the Mycobacterium tuberculosis complex. Proc Natl Acad Sci USA. 2002;99:3684–9. https://doi.org/10.1073/pnas.052548299

Cai Y, Wang Y, Shi C et al. Single-cell immune profiling reveals functional diversity of T cells in tuberculous pleural effusion. J Exp Med. 2022;219:e20211777. https://doi.org/10.1084/jem.20211777

Caminero JA. Multidrug-resistant tuberculosis: epidemiology, risk factors and case finding. Int J Tuberc Lung Dis. 2010;14:382–90.

Campbell F, Strang C, Ferguson N et al. When are pathogen genome sequences informative of transmission events? PLoS Pathog. 2018;14:e1006885. https://doi.org/10.1371/journal.ppat.1006885

Camus J-C, Pryor MJ, Médigue C et al. Re-annotation of the genome sequence of Mycobacterium tuberculosis H37Rv. Microbiology (Reading). 2002;148:2967–73. https://doi.org/10.1099/00221287-148-10-2967

Carey AF, Rock JM, Krieger IV et al. TnSeq of Mycobacterium tuberculosis clinical isolates reveals strain-specific antibiotic liabilities. PLoS Pathog. 2018;14:e1006939. https://doi.org/10.1371/journal.ppat.1006939

Castro RAD, Ross A, Kamwela L et al. The genetic background modulates the evolution of fluoroquinolone-resistance in Mycobacterium tuberculosis. Mol Biol Evol. 2020;37:195–207. https://doi.org/10.1093/molbev/msz214

Coll F, Mallard K, Preston MD et al. SpolPred: rapid and accurate prediction of Mycobacterium tuberculosis spoligotypes from short genomic sequences. Bioinformatics. 2012;28:2991–3. https://doi.org/10.1093/bioinformatics/bts544

Cornejo-Granados F, López-Leal G, Mata-Espinosa DA et al. Targeted RNA-seq reveals the M. tuberculosis transcriptome from an in vivo infection model. Biology (Basel). 2021;10:848.

Couvin D, Segretier W, Stattner E et al. Novel methods included in SpolLineages tool for fast and precise prediction of Mycobacterium tuberculosis complex spoligotype families. Database (Oxford). 2020;2020:baaa108. https://doi.org/10.1093/database/baaa108

Dar HA, Zaheer T, Ullah N et al. Pangenome analysis of Mycobacterium tuberculosis reveals core-drug targets and screening of promising lead compounds for drug discovery. Antibiotics (Basel). 2020;9:819.

de Bernardi Schneider A, Ford CT, Hostager R et al. StrainHub: a phylogenetic tool to construct pathogen transmission networks. Bioinformatics. 2020;36:945–7. https://doi.org/10.1093/bioinformatics/btz646

De Maio N, Wu C-H, Wilson DJ. SCOTTI: efficient reconstruction of transmission within outbreaks with the structured coalescent. PLoS Comput Biol. 2016;12:e1005130. https://doi.org/10.1371/journal.pcbi.1005130

Dhar S, Zhang C, Mandoiu II et al. TNet: transmission network inference using within-host strain diversity and its application to geographical tracking of COVID-19 spread. IEEE/ACM Trans Comput Biol Bioinform. 2022;19:230–42. https://doi.org/10.1109/TCBB.2021.3096455

Didelot X, Gardy J, Colijn C. Bayesian inference of infectious disease transmission from whole-genome sequence data. Mol Biol Evol. 2014;31:1869–79. https://doi.org/10.1093/molbev/msu121

Didelot X, Kendall M, Xu Y et al. Genomic epidemiology analysis of infectious disease outbreaks using TransPhylo. Curr Protoc. 2021;1:e60. https://doi.org/10.1002/cpz1.60

Dreyer V, Utpatel C, Kohl TA et al. Detection of low-frequency resistance-mediating SNPs in next-generation sequencing data of Mycobacterium tuberculosis complex strains with binoSNP. Sci Rep. 2020;10:7874. https://doi.org/10.1038/s41598-020-64708-8

Dumas E, Christina Boritsch E, Vandenbogaert M et al. Mycobacterial pan-genome analysis suggests important role of plasmids in the radiation of type VII secretion systems. Genome Biol Evol. 2016;8:387–402. https://doi.org/10.1093/gbe/evw001

Earle SG, Wu C-H, Charlesworth J et al. Identifying lineage effects when controlling for population structure improves power in bacterial association studies. Nat Microbiol. 2016;1:16041. https://doi.org/10.1038/nmicrobiol.2016.41

Estévez O, Anibarro L, Garet E et al. An RNA-seq based machine learning approach identifies latent tuberculosis patients with an active tuberculosis profile. Front Immunol. 2020;11:1470. https://doi.org/10.3389/fimmu.2020.01470

Fenner L, Egger M, Bodmer T et al. Effect of mutation and genetic background on drug resistance in Mycobacterium tuberculosis. Antimicrob Agents Chemother. 2012;56:3047–53. https://doi.org/10.1128/AAC.06460-11

Freeman TC, Horsewell S, Patir A, Harling-Lee J, Regan T, Shih BB, Prendergast J, Hume DA, Angus T. Graphia: A platform for the graph-based visualisation and analysis of high dimensional data. PLoS Comput Biol. 2022;18:e1010310. https://doi.org/10.1371/journal.pcbi.1010310

Freschi L, Vargas R, Husain A et al. Population structure, biogeography and transmissibility of Mycobacterium tuberculosis. Nat Commun. 2021;12:6099. https://doi.org/10.1038/s41467-021-26248-1

Gabbassov E, Moreno-Molina M, Comas I et al. SplitStrains, a tool to identify and separate mixed Mycobacterium tuberculosis infections from WGS data. Microb Genom. 2021;7:000607.

Gabrielian A, Engle E, Harris M et al. TB DEPOT (data exploration portal): a multi-domain tuberculosis data analysis resource. PLoS One. 2019;14:e0217410. https://doi.org/10.1371/journal.pone.0217410

Gagneux S, Long CD, Small PM et al. The competitive cost of antibiotic resistance in Mycobacterium tuberculosis. Science. 2006;312:1944–6. https://doi.org/10.1126/science.1124410

Galagan JE. Genomic insights into tuberculosis. Nat Rev Genet. 2014;15:307–20. https://doi.org/10.1038/nrg3664

Gan M, Liu Q, Yang C et al. Deep whole-genome sequencing to detect mixed infection of Mycobacterium tuberculosis. PLoS One. 2016;11:e0159029. https://doi.org/10.1371/journal.pone.0159029

Genestet C, Hodille E, Berland J-L et al. Whole-genome sequencing in drug susceptibility testing of Mycobacterium tuberculosis in routine practice in Lyon, France. Int J Antimicrob Agents. 2020;55:105912. https://doi.org/10.1016/j.ijantimicag.2020.105912

Giehl C, Lange C, Duarte R et al. TBNET – collaborative research on tuberculosis in Europe. Eur J Microbiol Immunol (Bp). 2012;2:264–74. https://doi.org/10.1556/EuJMI.2.2012.4.4

Goig GA, Cancino-Muñoz I, Torres-Puente M et al. Whole-genome sequencing of Mycobacterium tuberculosis directly from clinical samples for high-resolution genomic epidemiology and drug resistance surveillance: an observational study. Lancet Microbe. 2020;1:e175–83. https://doi.org/10.1016/S2666-5247(20)30060-4

Goldstein IH, Bayer D, Barilar I et al. Using genetic data to identify transmission risk factors: statistical assessment and application to tuberculosis transmission. PLoS Comput Biol. 2022;18:e1010696. https://doi.org/10.1371/journal.pcbi.1010696

Gómez-González PJ, Campino S, Phelan JE et al. Portable sequencing of Mycobacterium tuberculosis for clinical and epidemiological applications. Brief Bioinform. 2022;23:bbac256. https://doi.org/10.1093/bib/bbac256

Gómez-González PJ, Perdigao J, Gomes P et al. Genetic diversity of candidate loci linked to Mycobacterium tuberculosis resistance to bedaquiline, delamanid and pretomanid. Sci Rep. 2021;11:19431. https://doi.org/10.1038/s41598-021-98862-4

Goossens SN, Heupink TH, De Vos E et al. Detection of minor variants in Mycobacterium tuberculosis whole genome sequencing data. Brief Bioinform. 2022;23:bbab541. https://doi.org/10.1093/bib/bbab541

Gygli SM, Borrell S, Trauner A et al. Antimicrobial resistance in Mycobacterium tuberculosis: mechanistic and evolutionary perspectives. FEMS Microbiol Rev. 2017;41:354–73. https://doi.org/10.1093/femsre/fux011

Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, Sagulenko P, Bedford T, Neher RA. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34:4121–3. https://doi.org/10.1093/bioinformatics/bty407

Hall MB, Rabodoarivelo MS, Koch A et al. Evaluation of nanopore sequencing for Mycobacterium tuberculosis drug susceptibility testing and outbreak investigation: a genomic analysis. Lancet Microbe. 2023;4:e84–92. https://doi.org/10.1016/S2666-5247(22)00301-9

Helmy M, Awad M, Mosa KA. Limited resources of genome sequencing in developing countries: challenges and solutions. Appl Transl Genom. 2016;9:15–9.

Hunt M, Bradley P, Lapierre SG et al. Antibiotic resistance prediction for Mycobacterium tuberculosis from genome sequence data with Mykrobe. Wellcome Open Res. 2019;4:191. https://doi.org/10.12688/wellcomeopenres.15603.1

Jacques P-E, Gervais AL, Cantin M et al. MtbRegList, a database dedicated to the analysis of transcriptional regulation in Mycobacterium tuberculosis. Bioinformatics. 2005;21:2563–5. https://doi.org/10.1093/bioinformatics/bti321

Jaillard

M

,

Lima

L

,

Tournoud

M

et al.

A fast and agnostic method for bacterial genome-wide association studies: bridging the gap between k-mers and genetic events

.

PLoS Genet

.

2018

;

14

:

e1007758

.

https://doi.org/10.1371/journal.pgen.1007758

Jandrasits

C

,

Kröger

S

,

Haas

W

et al.

Computational pan-genome mapping and pairwise SNP-distance improve detection of Mycobacterium tuberculosis transmission clusters

.

PLoS Comput Biol

.

2019

;

15

:

e1007527

.

https://doi.org/10.1371/journal.pcbi.1007527

Jombart

T

,

Cori

A

,

Didelot

X

et al.

Bayesian reconstruction of disease outbreaks by combining epidemiologic and genomic data

.

PLoS Comput Biol

.

2014

;

10

:

e1003457

.

https://doi.org/10.1371/journal.pcbi.1003457

Joshi

KR

,

Dhiman

H

,

Scaria

V

.

tbvar: a comprehensive genome variation resource for Mycobacterium tuberculosis

.

Database (Oxford)

.

2014

;

2014

:

bat083

.

https://doi.org/10.1093/database/bat083

Kamerbeek

J

,

Schouls

L

,

Kolk

A

et al.

Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology

.

J Clin Microbiol

.

1997

;

35

:

907

–

14

.

https://doi.org/10.1128/jcm.35.4.907-914.1997

Karikari

TK

,

Quansah

E

,

Mohamed

WMY

.

Widening participation would be key in enhancing bioinformatics and genomics research in Africa

.

Appl Transl Genom

.

2015

;

6

:

35

–

41

.

https://doi.org/10.1172/JCI11426

Kato-Maeda

M

,

Bifani

PJ

,

Kreiswirth

BN

et al.

The nature and consequence of genetic variability within Mycobacterium tuberculosis

.

J Clin Invest

.

2001

;

107

:

533

–

7

.

Kavvas ES, Catoiu E, Mih N et al. Machine learning and structural analysis of Mycobacterium tuberculosis pan-genome identifies genetic signatures of antibiotic resistance. Nat Commun. 2018;9:4306. https://doi.org/10.1038/s41467-018-06634-y

Kim Y, Gu C, Kim HU et al. Current status of pan-genome analysis for pathogenic bacteria. Curr Opin Biotechnol. 2020;63:54–62. https://doi.org/10.1016/j.copbio.2019.12.001

Klinkenberg D, Backer JA, Didelot X et al. Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks. PLoS Comput Biol. 2017;13:e1005495. https://doi.org/10.1371/journal.pcbi.1005495

Lai RPJ, Cortes T, Marais S et al. Transcriptomic characterization of tuberculous sputum reveals a host Warburg effect and microbial cholesterol catabolism. mBio. 2021;12:e0176621. https://doi.org/10.1128/mBio.01766-21

Lequime S, Bastide P, Dellicour S et al. nosoi: a stochastic agent-based transmission chain simulation framework in R. Methods Ecol Evol. 2020;11:1002–7. https://doi.org/10.1111/2041-210X.13422

Liang Q, Shang Y, Huo F et al. Assessment of current diagnostic algorithm for detection of mixed infection with Mycobacterium tuberculosis and nontuberculous mycobacteria. J Infect Public Health. 2020;13:1967–71. https://doi.org/10.1016/j.jiph.2020.03.017

López-Agudelo VA, Baena A, Barrera V et al. Dual RNA sequencing of Mycobacterium tuberculosis-infected human splenic macrophages reveals a strain-dependent host-pathogen response to infection. Int J Mol Sci. 2022;23:1803. https://doi.org/10.3390/ijms23031803

Lose T, van Heusden P, Christoffels A. COMBAT-TB-NeoDB: fostering tuberculosis research through integrative analysis using graph database technologies. Bioinformatics. 2020;36:982–3. https://doi.org/10.1093/bioinformatics/btz658

Lozano N, Lanza VF, Suárez-González J et al. Detection of minority variants and mixed infections in Mycobacterium tuberculosis by direct whole-genome sequencing on noncultured specimens using a specific-DNA capture strategy. mSphere. 2021;6:e0074421. https://doi.org/10.1128/mSphere.00744-21

Malhotra S, Mugumbate G, Blundell TL et al. TIBLE: a web-based, freely accessible resource for small-molecule binding data for mycobacterial species. Database (Oxford). 2017;2017:bax041. https://doi.org/10.1093/database/bax041

Meehan CJ, Goig GA, Kohl TA et al. Whole genome sequencing of Mycobacterium tuberculosis: current standards and open issues. Nat Rev Microbiol. 2019;17:533–45. https://doi.org/10.1038/s41579-019-0214-5

Merget B, Zilian D, Müller T et al. MycPermCheck: the Mycobacterium tuberculosis permeability prediction tool for small molecules. Bioinformatics. 2013;29:62–8. https://doi.org/10.1093/bioinformatics/bts641

Metri R, Hariharaputran S, Ramakrishnan G et al. SInCRe-structural interactome computational resource for Mycobacterium tuberculosis. Database (Oxford). 2015;2015:bav060. https://doi.org/10.1093/database/bav060

Mikheecheva NE, Zaychikova MV, Melerzanov AV et al. A nonsynonymous SNP catalog of Mycobacterium tuberculosis virulence genes and its use for detecting new potentially virulent sublineages. Genome Biol Evol. 2017;9:887–99. https://doi.org/10.1093/gbe/evx053

Modlin SJ, Robinhold C, Morrissey C et al. Exact mapping of Illumina blind spots in the Mycobacterium tuberculosis genome reveals platform-wide and workflow-specific biases. Microb Genom. 2021;7:mgen000465.

Morelli MJ, Thébaud G, Chadœuf J et al. A Bayesian inference framework to reconstruct transmission trees using epidemiological and genetic data. PLoS Comput Biol. 2012;8:e1002768. https://doi.org/10.1371/journal.pcbi.1002768

Moreno-Molina M, Shubladze N, Khurtsilava I et al. Genomic analyses of Mycobacterium tuberculosis from human lung resections reveal a high frequency of polyclonal infections. Nat Commun. 2021;12:2716. https://doi.org/10.1038/s41467-021-22705-z

Muzzi A, Masignani V, Rappuoli R. The pan-genome: towards a knowledge-based discovery of novel targets for vaccines and antibacterials. Drug Discov Today. 2007;12:429–39. https://doi.org/10.1016/j.drudis.2007.04.008

Neher RA, Bedford T. Real-time analysis and visualization of pathogen sequence data. J Clin Microbiol. 2018;56:10. https://doi.org/10.1128/jcm.00480-18

Ochoa-Montaño B, Mohan N, Blundell TL. CHOPIN: a web resource for the structural and functional proteome of Mycobacterium tuberculosis. Database (Oxford). 2015;2015:bav026. https://doi.org/10.1093/database/bav026

Pan J, Li X, Zhang M et al. TransFlow: a Snakemake workflow for transmission analysis of Mycobacterium tuberculosis whole-genome sequencing data. Bioinformatics. 2023a;39:btac785. https://doi.org/10.1093/bioinformatics/btac785

Pan J, Zhang X, Xu J et al. Landscape of exhausted T cells in tuberculosis revealed by single-cell sequencing. Microbiol Spectr. 2023b;11:e0283922. https://doi.org/10.1128/spectrum.02839-22

Peker N, Schuele L, Kok N et al. Evaluation of whole-genome sequence data analysis approaches for short- and long-read sequencing of Mycobacterium tuberculosis. Microb Genom. 2021;7:000695.

Permana B, Beatson SA, Forde BM. GraphSNP: an interactive distance viewer for investigating outbreaks and transmission networks using a graph approach. BMC Bioinf. 2023;24:209. https://doi.org/10.1186/s12859-023-05332-x

Phelan J, O'Sullivan DM, Machado D et al. The variability and reproducibility of whole genome sequencing technology for detecting resistance to anti-tuberculous drugs. Genome Med. 2016;8:132. https://doi.org/10.1186/s13073-016-0385-x

Phelan JE, O'Sullivan DM, Machado D et al. Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs. Genome Med. 2019;11:41. https://doi.org/10.1186/s13073-019-0650-x

Pisu D, Huang L, Grenier JK et al. Dual RNA-seq of Mtb-infected macrophages in vivo reveals ontologically distinct host–pathogen interactions. Cell Rep. 2020a;30:335–50.e4. https://doi.org/10.1016/j.celrep.2019.12.033

Pisu D, Huang L, Narang V et al. Single cell analysis of M. tuberculosis phenotype and macrophage lineages in the infected lung. J Exp Med. 2021;218:e20210615. https://doi.org/10.1084/jem.20210615

Pisu D, Huang L, Rin Lee BN et al. Dual RNA-sequencing of Mycobacterium tuberculosis-infected cells from a murine infection model. STAR Protoc. 2020b;1:100123. https://doi.org/10.1016/j.xpro.2020.100123

Pisu D, Russell DG. Protocol for multi-modal single-cell RNA sequencing on M. tuberculosis-infected mouse lungs. STAR Protoc. 2023;4:102102. https://doi.org/10.1016/j.xpro.2023.102102

Quan TP, Bawa Z, Foster D et al. Evaluation of whole-genome sequencing for mycobacterial species identification and drug susceptibility testing in a clinical setting: a large-scale prospective assessment of performance against line probe assays and phenotyping. J Clin Microbiol. 2018;56:e01480-17. https://doi.org/10.1128/JCM.01480-17

Radusky L, Defelipe LA, Lanzarotti E et al. TuberQ: a Mycobacterium tuberculosis protein druggability database. Database (Oxford). 2014;2014:bau035. https://doi.org/10.1093/database/bau035

Repasy T, Lee J, Marino S et al. Intracellular bacillary burden reflects a burst size for Mycobacterium tuberculosis in vivo. PLoS Pathog. 2013;9:e1003190. https://doi.org/10.1371/journal.ppat.1003190

Richardson M, Carroll NM, Engelke E et al. Multiple Mycobacterium tuberculosis strains in early cultures from patients in a high-incidence community setting. J Clin Microbiol. 2002;40:2750–4. https://doi.org/10.1128/JCM.40.8.2750-2754.2002

Rienksma RA, Suarez-Diez M, Mollenkopf H-J et al. Comprehensive insights into transcriptional adaptation of intracellular mycobacteria by microbe-enriched dual RNA sequencing. BMC Genomics. 2015;16:34. https://doi.org/10.1186/s12864-014-1197-2

Rivière E, Heupink TH, Ismail N et al. Capacity building for whole genome sequencing of Mycobacterium tuberculosis and bioinformatics in high TB burden countries. Brief Bioinform. 2021;22:bbaa246. https://doi.org/10.1093/bib/bbaa246

Robert A, Funk S, Kucharski AJ. o2geosocial: reconstructing who-infected-whom from routinely collected surveillance data. F1000Res. 2021;10:31. https://doi.org/10.12688/f1000research.28073.2

Romero-Severson E, Skar H, Bulla I et al. Timing and order of transmission events is not directly reflected in a pathogen phylogeny. Mol Biol Evol. 2014;31:2472–82. https://doi.org/10.1093/molbev/msu179

Rosenthal A, Gabrielian A, Engle E et al. The TB portals: an open-access, web-based platform for global drug-resistant-tuberculosis data sharing and analysis. J Clin Microbiol. 2017;55:3267–82. https://doi.org/10.1128/JCM.01013-17

Ruesen C, Riza AL, Florescu A et al. Linking minimum inhibitory concentrations to whole genome sequence-predicted drug resistance in Mycobacterium tuberculosis strains from Romania. Sci Rep. 2018;8:9676. https://doi.org/10.1038/s41598-018-27962-5

Saavedra Cervera B, López MG, Chiner-Oms Á et al. Fine-grain population structure and transmission patterns of Mycobacterium tuberculosis in southern Mozambique, a high TB/HIV burden area. Microb Genom. 2022;8:mgen000844.

Sahajpal R, Kandoi G, Dhiman H et al. HGV&TB: a comprehensive online resource on human genes and genetic variants associated with tuberculosis. Database (Oxford). 2014;2014:bau112.

Said Mohammed K, Kibinge N, Prins P et al. Evaluating the performance of tools used to call minority variants from whole genome short-read data. Wellcome Open Res. 2018;3:21. https://doi.org/10.12688/wellcomeopenres.13538.1

Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 1977;74:5463–7. https://doi.org/10.1073/pnas.74.12.5463

Shabbeer A, Cowan LS, Ozcaglar C et al. TB-lineage: an online tool for classification and analysis of strains of Mycobacterium tuberculosis complex. Infect Genet Evol. 2012;12:789–97. https://doi.org/10.1016/j.meegid.2012.02.010

Skums P, Mohebbi F, Tsyvina V et al. SOPHIE: viral outbreak investigation and transmission history reconstruction in a joint phylogenetic and network theory framework. Cell Syst. 2022;13:844–56.e4. https://doi.org/10.1016/j.cels.2022.07.005

Skums P, Zelikovsky A, Singh R et al. QUENTIN: reconstruction of disease transmissions from viral quasispecies genomic data. Bioinformatics. 2018;34:163–70. https://doi.org/10.1093/bioinformatics/btx402

Sobkowiak B, Romanowski K, Sekirov I et al. Comparing Mycobacterium tuberculosis transmission reconstruction models from whole genome sequence data. Epidemiol Infect. 2023;151:e105. https://doi.org/10.1017/S0950268823000900

Stimson J, Gardy J, Mathema B et al. Beyond the SNP threshold: identifying outbreak clusters using inferred transmissions. Mol Biol Evol. 2019;36:587–603. https://doi.org/10.1093/molbev/msy242

Streicher EM, Bergval I, Dheda K et al. Mycobacterium tuberculosis population structure determines the outcome of genetics-based second-line drug resistance testing. Antimicrob Agents Chemother. 2012;56:2420–7. https://doi.org/10.1128/AAC.05905-11

Supply P, Magdalena J, Himpens S et al. Identification of novel intergenic repetitive units in a mycobacterial two-component system operon. Mol Microbiol. 1997;26:991–1003. https://doi.org/10.1046/j.1365-2958.1997.6361999.x

Supply P, Mazars E, Lesjean S et al. Variable human minisatellite-like regions in the Mycobacterium tuberculosis genome. Mol Microbiol. 2000;36:762–71. https://doi.org/10.1046/j.1365-2958.2000.01905.x

Tsolaki AG, Hirsh AE, DeRiemer K et al. Functional and evolutionary genomics of Mycobacterium tuberculosis: insights from genomic deletions in 100 strains. Proc Natl Acad Sci USA. 2004;101:4865–70. https://doi.org/10.1073/pnas.0305634101

Tyler AD, Christianson S, Knox NC et al. Comparison of sample preparation methods used for the next-generation sequencing of Mycobacterium tuberculosis. PLoS One. 2016;11:e0148676. https://doi.org/10.1371/journal.pone.0148676

Usmani SS, Kumar R, Kumar V et al. AntiTbPdb: a knowledgebase of anti-tubercular peptides. Database (Oxford). 2018;2018:bay025. https://doi.org/10.1093/database/bay025

van Beek J, Haanperä M, Smit PW et al. Evaluation of whole genome sequencing and software tools for drug susceptibility testing of Mycobacterium tuberculosis. Clin Microbiol Infect. 2019;25:82–86. https://doi.org/10.1016/j.cmi.2018.03.041

van Embden JD, Cave MD, Crawford JT et al. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J Clin Microbiol. 1993;31:406–9. https://doi.org/10.1128/jcm.31.2.406-409.1993

van Rie A, Victor TC, Richardson M et al. Reinfection and mixed infection cause changing Mycobacterium tuberculosis drug-resistance patterns. Am J Respir Crit Care Med. 2005;172:636–42. https://doi.org/10.1164/rccm.200503-449OC

Walker TM, Ip CLC, Harrell RH et al. Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: a retrospective observational study. Lancet Infect Dis. 2013;13:137–46. https://doi.org/10.1016/S1473-3099(12)70277-3

Wang L, Ma H, Wen Z et al. Single-cell RNA-sequencing reveals heterogeneity and intercellular crosstalk in human tuberculosis lung. J Infect. 2023;87:373–84. https://doi.org/10.1016/j.jinf.2023.09.004

Wang L, Yang J, Chen L et al. Whole-genome sequencing of Mycobacterium tuberculosis for prediction of drug resistance. Epidemiol Infect. 2022;150:e22. https://doi.org/10.1017/S095026882100279X

Wang Y, Jiang Z, Liang P et al. TB-DROP: deep learning-based drug resistance prediction of Mycobacterium tuberculosis utilizing whole genome mutations. BMC Genomics. 2024;25:167. https://doi.org/10.1186/s12864-024-10066-y