Background: Historically, M- and T-serotypes have provided a framework for defining ‘strains’ of group A streptococci (GAS). M- and T-type encoding genes are present in all GAS genomes and display the greatest genetic diversity, indicative of strong host immune pressures.
Methods: Unique combinations of emm type and distant core genes were identified among >2,500 whole genome sequences of GAS, from >30 countries over ~100 years. Selected genomes were analyzed for adhesin and backbone pilin gene sequences, and phylogenetic clusters were assigned; pilin genotyping will be available at www.PubMLST.org. T-typing reference strains from the CDC were also analyzed.
Results: Unique combinations of emm type (N=183) and pilin adhesin and backbone gene cluster combinations (N=115) yield ~400 distinct GAS ‘strains.’ Phylogenetic analysis indicates at least 6 discrete ancestral forms of adhesin and backbone loci pairs within this species. Several emm types (N=14) were recovered in association with 3 or more FCT-region forms. There is extensive recombination between adhesin and pilin genes both within and between the FCT-3 and FCT-4 regions. Most T-typing strains correspond to pilin backbone genes, but there are notable exceptions, supporting previous findings that adhesin pilins can contribute to T-type. emm pattern genotypes for throat specialists, skin specialists, and generalists exhibit nonrandom associations with FCT-region forms; correlations between adhesin gene clusters and emm pattern support the idea that pilin may contribute to host tissue tropisms.
Conclusions: The history of extensive horizontal gene transfer involving both emm and pilin genes underscores a mechanism for the emergence of new GAS strains and the introduction of antigenic novelty into the human host population.
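
Illustration: The strain definition used in the Results (a unique combination of emm type with pilin adhesin and backbone gene clusters) can be sketched as a simple tally. The following is a minimal sketch only, not the pipeline used in this study. The records, genome IDs, cluster labels (A1, B1, ...), FCT-form assignments, and function names are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy records, NOT data from the study:
# (genome_id, emm_type, adhesin_cluster, backbone_cluster, fct_form)
records = [
    ("g1", "emm1", "A1", "B1", "FCT-2"),
    ("g2", "emm1", "A1", "B1", "FCT-2"),
    ("g3", "emm12", "A2", "B2", "FCT-3"),
    ("g4", "emm12", "A3", "B2", "FCT-4"),
    ("g5", "emm12", "A4", "B3", "FCT-1"),
    ("g6", "emm28", "A2", "B2", "FCT-4"),
]


def count_strains(records):
    """Count distinct (emm type, adhesin cluster, backbone cluster) triples."""
    return len({(emm, adhesin, backbone) for _, emm, adhesin, backbone, _ in records})


def emm_types_with_many_fct_forms(records, min_forms=3):
    """Return emm types observed with at least `min_forms` distinct FCT-region forms."""
    forms_by_emm = defaultdict(set)
    for _, emm, _, _, fct_form in records:
        forms_by_emm[emm].add(fct_form)
    return sorted(emm for emm, forms in forms_by_emm.items() if len(forms) >= min_forms)


if __name__ == "__main__":
    print(count_strains(records))
    print(emm_types_with_many_fct_forms(records))
```

Genomes g1 and g2 share the same triple and so count as one strain, whereas the three emm12 genomes carry different pilin gene clusters and count as three.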