TY - JOUR
T1 - The Salmonella enterica Pan-genome
AU - Jacobsen, Annika
AU - Hendriksen, Rene S.
AU - Aarestrup, Frank Møller
AU - Ussery, David
AU - Friis, Carsten
PY - 2011
Y1 - 2011
AB - Salmonella enterica is divided into four subspecies
containing a large number of different serovars, several
of which are important zoonotic pathogens and some of which show
a high degree of host specificity or host preference. We
compare 45 sequenced S. enterica genomes that are
publicly available (22 complete and 23 draft genome
sequences). Of these, 35 were found to be of sufficiently
good quality to allow a detailed analysis, along with two
Escherichia coli strains (K-12 substr. DH10B and the avian
pathogenic E. coli (APEC O1) strain). All genomes
were subjected to standardized gene finding, and the core
and pan-genome of Salmonella were estimated to be
around 2,800 and 10,000 gene families, respectively. The
constructed pan-genomic dendrograms suggest that gene
content is often, but not uniformly, correlated with serotype.
Any given Salmonella strain has a large stable core, whilst
there is an abundance of accessory genes, including the
Salmonella pathogenicity islands (SPIs), transposable
elements, phages, and plasmid DNA. We visualize
conservation in the genomes in relation to chromosomal
location and DNA structural features and find that
variation in gene content is localized in a selection of
variable genomic regions or islands. These include the
SPIs but also encompass phage insertion sites and
transposable elements. The islands were typically well
conserved in several, but not all, isolates, a difference
that may have implications for, e.g., host specificity.
U2 - 10.1007/s00248-011-9880-1
DO - 10.1007/s00248-011-9880-1
M3 - Journal article
SN - 0095-3628
VL - 62
SP - 487
EP - 504
JO - Microbial Ecology
JF - Microbial Ecology
IS - 3
ER -