Background: Gene-by-gene approaches allow genome-wide, high-resolution typing of Streptococcus pyogenes (Group A Streptococcus, GAS) without requiring a reference genome or the removal of recombination regions. We aimed to evaluate the ability of a novel whole-genome multilocus sequence typing (wgMLST) schema to discriminate two recently emerged GAS lineages and to identify outbreak-related isolates.
Methods: Allele calling with the wgMLST schema was performed with chewBBACA on public assemblies from: (i) 135 non-invasive emm1 isolates from the UK; (ii) 201 emm89 isolates recovered worldwide; and (iii) 289 emm-matched outbreak and sporadic isolates from the UK. PHYLOViZ was used to construct minimum spanning trees based on the core loci of each dataset (core-genome MLST, cgMLST).
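As an illustration of the cgMLST step, the following minimal sketch derives the core loci from a set of allele profiles, counts pairwise allelic differences, and builds a minimum spanning tree. It assumes the allele calls have already been parsed into Python dictionaries, with None marking an uncalled locus. The isolate names, locus names, and function names are hypothetical. The tree is built with a plain Prim's algorithm, which is not the implementation used by PHYLOViZ and does not apply its tie-break rules.

```python
from itertools import combinations


def core_loci(profiles):
    """Return the loci that have an allele call (not None) in every isolate."""
    shared = set.intersection(*(set(p) for p in profiles.values()))
    return sorted(
        locus for locus in shared
        if all(p[locus] is not None for p in profiles.values())
    )


def allelic_distance(a, b, loci):
    """Count the loci at which two profiles carry different alleles."""
    return sum(1 for locus in loci if a[locus] != b[locus])


def distance_matrix(profiles):
    """Pairwise allelic distances over the core loci of the dataset."""
    loci = core_loci(profiles)
    return {
        frozenset((i, j)): allelic_distance(profiles[i], profiles[j], loci)
        for i, j in combinations(sorted(profiles), 2)
    }


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns a list of (isolate_a, isolate_b, distance)."""
    names = sorted(profiles)
    dist = distance_matrix(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the shortest edge leaving the tree; ties are broken by name.
        a, b = min(
            ((i, j) for i in in_tree for j in names if j not in in_tree),
            key=lambda e: (dist[frozenset(e)], e),
        )
        edges.append((a, b, dist[frozenset((a, b))]))
        in_tree.add(b)
    return edges


# Hypothetical toy profiles: locus4 is uncalled in iso_B, so it is not core.
profiles = {
    "iso_A": {"locus1": 1, "locus2": 1, "locus3": 1, "locus4": 2},
    "iso_B": {"locus1": 1, "locus2": 1, "locus3": 2, "locus4": None},
    "iso_C": {"locus1": 1, "locus2": 3, "locus3": 2, "locus4": 2},
}

print(core_loci(profiles))
print(minimum_spanning_tree(profiles))
```

Restricting the comparison to loci called in every isolate of a dataset means that missing calls never contribute to the distance, which is why the core is recomputed for each dataset rather than fixed in advance.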
Results: Isolates from the recently emerged lineages within emm1 and emm89 [M1UK, carrying 27 characteristic single nucleotide polymorphisms (SNPs), and emm89 clade 3, carrying variant 3 of the nga-ifs-slo promoter and lacking the capsule locus] clustered together and apart from the remaining isolates of the same emm type (Figures 1 and 2). Based on cgMLST analysis, 10 isolates were excluded from their respective outbreaks, matching the results previously obtained by SNP analysis. The mean distance among outbreak isolates (range 0–8 allelic differences) was lower than that among sporadic isolates of the same emm type (Figure 3).
Conclusions: Analyses of previously published datasets using the novel wgMLST schema showed a performance comparable to that of SNP-based methods in distinguishing recently emerged intra-emm type sublineages, as well as in clarifying the genetic relatedness among isolates recovered in outbreak contexts.