Descriptive figures of the pangenome of 253 complete Staphylococcus aureus genomes inferred using PIRATE. PIRATE was run with default parameters over a range of amino acid identity values (45–98%). (A) The proportion of genomes in which gene families are found, indicating stable gene families (green) with a single allele at 98% amino acid identity, and diverged gene families with >1 allele (yellow). (B) The minimum amino acid percentage identity cut-off at which all loci were present per gene family (core = blue, accessory = red). (C) The number of unique alleles at each amino acid percentage identity threshold. A unique allele is characterized as the highest percentage identity threshold at which a unique sub-cluster of isolates from a single gene family was identified by MCL. (D) Comparison of core and accessory gene/allele estimates for PIRATE (red), PanX (orange), Roary (blue), and Roary with paralog splitting switched off (green). The estimates represent “allelic” variation reported by PIRATE in contrast to “gene content” variation reported by the other tools. PanX provided a single estimate of core and accessory genome content because it has no analogous command to -s in PIRATE or -i in Roary to allow comparison. Core gene families are characterized as being present in >95% of genomes. All tools were run with default parameters. Roary was run over a range of thresholds matching those used for PIRATE, both with and without paralog splitting (-s).
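The core/accessory rule stated in the caption (core = present in >95% of genomes) can be illustrated with a minimal sketch. This is not PIRATE's implementation. The function name `classify_families`, the family labels, and the toy presence data are hypothetical and chosen only to show where the cut-off falls for a collection of 253 genomes.

```python
def classify_families(presence, n_genomes, core_fraction=0.95):
    """Label each gene family 'core' or 'accessory'.

    presence: dict mapping family name -> iterable of genome IDs that carry it.
    A family is 'core' when it occurs in strictly more than `core_fraction`
    of the genomes, following the >95% rule in the caption.
    """
    labels = {}
    for family, genomes in presence.items():
        fraction = len(set(genomes)) / n_genomes
        labels[family] = "core" if fraction > core_fraction else "accessory"
    return labels


# Hypothetical toy data for 253 genomes (not taken from the study).
presence = {
    "fam_A": [f"g{i}" for i in range(253)],  # present in every genome
    "fam_B": [f"g{i}" for i in range(241)],  # 241/253, just above the cut-off
    "fam_C": [f"g{i}" for i in range(240)],  # 240/253, just below the cut-off
}
labels = classify_families(presence, n_genomes=253)
print(labels)
```

With 253 genomes, the strict >95% rule means a family must be present in at least 241 genomes to count as core under this sketch.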