pnas.202411894
Contributed by David J. Lipman; received June 14, 2024; accepted September 16, 2024; reviewed by Edward G. Dudley, Lance B. Price, and Abigail Snyder
may be contaminated at levels sufficient to cause illness in most consumers. But other affected servings may have relatively small levels that have a low probability of causing symptoms in healthy adults. However, in more susceptible individuals (e.g., young children and the elderly), these levels may have a higher probability of causing disease. In countries with modern food production systems and active safety programs, such as the US, one might expect most contamination episodes to be small, impacting only a small number of servings and with infectious doses that have only a low probability of causing illness in most individuals. These smaller contamination episodes are called sporadic food poisoning and are known to be responsible for the vast majority of foodborne illness (3).
Sporadic food poisoning, because it does not trigger an epidemiological investigation, has largely been studied by case–control studies. Because the genome sequences now allow us to identify clusters of cases associated with a contamination episode, we can analyze these contamination episodes more directly. In other words, the WGS data help us to connect the genetic clusters of clinical cases to the underlying contamination episodes. Note that the approach described here does not fully account for contamination episodes that are polyclonal since each of the strains in polyclonal contamination episodes will appear as a single cluster (12, 13).
By examining the composition of these clusters, we can obtain estimates of the fraction of contamination episodes occurring upstream of the distribution from a central source of production because clinical cases from the same cluster are found in multiple states. Likewise, by using the isolation dates of the clinical cases in a cluster, we can observe the persistence of the contamination episode. Finally, by using the ages of the members of a contamination cluster, we can get important clues as to whether the serving of food was contaminated prior to entering the household or whether there was cross-contamination of the serving from other foods or environmental sources in the household.

Results

The Materials and Methods section describes how clusters of clinical food poisoning cases are generated. Briefly, we use a threshold genetic distance for pairs of genomes that is low enough to infer a high likelihood that the associated pathogen isolates are derived from the same source of contamination, e.g., four SNPs out of ~2 to 5 million aligned bases. We continue to add cases to the cluster as long as they are within this threshold distance from at least one case already in the cluster, i.e., we perform single linkage clustering. The results presented here are based on a four SNP threshold but using smaller or larger thresholds did not qualitatively change the results (Materials and Methods and SI Appendix).
Timelines for a few Salmonella clusters are depicted in Fig. 1. Each case/isolate is represented by a short vertical line at a position corresponding to its isolation date. Within each cluster, cases from the same state are given the same color. Though it contains only five isolates, the Newport cluster is spread over approximately 4 y and each case is from a different state. The 82 Typhi cluster isolates are spread over 8 y and 22 states. Below we analyze these and other characteristics of the clusters across the entire dataset.

[Figure 2: x-axis: Cluster Size (3 to 30); y-axis: Cumulative Frequency (%), 0 to 100; series: Salmonella (Ent.), Salmonella (non-Ent.), Escherichia coli, Listeria, Campylobacter.]
Fig. 2. Cumulative frequency of cases by cluster size. Cumulative frequency plot showing the fraction of all cases in clusters of size 1 up to size 30 for Salmonella, E. coli, Campylobacter, and Listeria. Enteritidis isolates are separated from other Salmonella serovars due to their distinct clustering behavior.

Fig. 2 is a cumulative frequency plot of the fraction of all cases in clusters of size 1 on up to size 30 for Salmonella, E. coli, Campylobacter, and Listeria. For Salmonella, we separate out Enteritidis isolates because the assumption that clinical isolates differing by a small number of SNPs generally share the same proximal source of contamination may be less valid for Enteritidis than the other foodborne pathogens (Discussion). Other than Enteritidis, the four pathogen species have over 50% of all cases in singleton clusters, i.e., these cases are more than four SNPs
2 of 8 https://doi.org/10.1073/pnas.2411894121 pnas.org
[Figure 3: four panels (panel titles recovered: Salmonella, Campylobacter); x-axis: Age (0 to 90); y-axis: Adjusted % of Total Counts.]
Fig. 3. Age distribution for all cases versus those in large clusters (>10). Age distribution of all cases versus large clusters (cluster size >10, red lines) across all
four pathogens. The case counts have been normalized based on the US population for different age groups. For Salmonella, the vertical bars and the red solid
line are excluding Enteritidis and the dashed black line is for Enteritidis (all cluster sizes). The curves for large clusters (red line) and for Enteritidis (black dashed)
were smoothed for legibility with a Savitzky–Golay filter (a window of seven for red and five for black and a third-order polynomial).
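
As an illustration of the smoothing mentioned in the Fig. 3 legend, the following is a minimal sketch of a third-order Savitzky–Golay filter with windows of five and seven points. It is not the code used to produce the figure: it relies on the standard tabulated convolution weights for a cubic fit, the function name `savgol_cubic` and the input values are hypothetical, and the handling of the end points (left unsmoothed here) is an assumption, since the legend does not specify it.

```python
# Tabulated Savitzky-Golay smoothing weights for a cubic (third-order) fit.
# Key: window length. Value: (integer weights, normalizing divisor).
SG_CUBIC_WEIGHTS = {
    5: ([-3, 12, 17, 12, -3], 35),
    7: ([-2, 3, 6, 7, 6, 3, -2], 21),
}


def savgol_cubic(values, window):
    """Smooth a sequence with a cubic Savitzky-Golay filter.

    Edge handling is an assumption of this sketch: the first and last
    window // 2 points are returned unchanged.
    """
    weights, divisor = SG_CUBIC_WEIGHTS[window]
    half = window // 2
    smoothed = [float(v) for v in values]
    for i in range(half, len(values) - half):
        # Weighted sum over the window centered on position i.
        total = sum(w * values[i - half + k] for k, w in enumerate(weights))
        smoothed[i] = total / divisor
    return smoothed


# Hypothetical normalized case percentages by age bin (not data from the paper).
example = [3.1, 2.2, 2.9, 1.8, 2.4, 1.5, 1.9, 1.2, 1.6, 1.1]
large_cluster_curve = savgol_cubic(example, window=7)
```

A filter of this kind reproduces any cubic trend exactly at interior points while damping bin-to-bin noise, which is why it is a common choice for smoothing curves for legibility.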
away from all other cases, and over 75% of the cases are in clusters of size 10 or less. Among the major Salmonella serovars, Enteritidis stands out in having a far higher fraction of its cases in large clusters (Fig. 3, black dashed line) and accounts for 40% of the Salmonella cases in clusters larger than size 10.
We will examine some of the properties of clusters of different sizes to determine whether there are epidemiological signals associated with the computed clusters. Fig. 3 shows the age distribution of the cases for all four pathogens in all cluster sizes (blue bars) compared with larger clusters (size >10, red line). For Salmonella, we separate Enteritidis in all cluster sizes (dashed black line) from the rest of the serovars. The case counts have been normalized for the age population structure (14). E. coli and Campylobacter show an increasing number of cases from approximately age 10 down to age 1 before a drop for infants under 1 y old. Salmonella (excluding Enteritidis) is similar, though case counts appear to be increasing through age 0. However, nonmonotonicity becomes apparent when ages less than 1 y are binned more finely (SI Appendix, Fig. S1). Thus, the incidence of Salmonella also decreases as age approaches birth, but the downturn occurs at 4 mo of age and is not apparent in Fig. 3.
For Salmonella there is a subtle increase in older adult cases while for E. coli, there is a small increase in the 18 to 30 y age interval followed by a decrease to a plateau from age 40 upward. For Salmonella, E. coli, and Campylobacter we see a trough in case counts between the infant/young child peak and the teenage years. Listeria looks quite different, with case counts increasing with age along with a narrow peak associated with newborn listeriosis (15).
The age distributions for Salmonella in Fig. 3 are consistent with those reported in published surveillance studies though the latter do not break them down into, e.g., large and small cluster sizes (16). For Salmonella and E. coli, and to a lesser extent, Campylobacter, the elevation of case rates among the younger age groups is notably less pronounced for the larger cluster sizes (red line) than for total cases, i.e., there is a substantially higher fraction of young individuals in the smallest clusters which dominate the overall counts (Fig. 2).
Is a different composition of pathogen strains in the smaller clusters responsible for the greater proportion of young individuals in the small clusters? The three stacked bar charts in Fig. 4 show the serovar composition of all Salmonella cases, for individuals of age ≦1, and for cases in clusters of size 1 (all age groups). The serovars have been grouped into four abundance categories based on total counts:

1. Enteritidis (17.4% of cases),
2. the next five most common serovars (36% of cases with Newport: 11.55%, Typhimurium: 9.84%, Javiana: 5.84%, I 4:I:-: 4.74%, and Infantis: 4.04%),
3. the next 22 most common (29.6% of cases), and
4. the next 838 serovars (17.2% of cases).

Both the infant set and the cluster size =1 set have a higher proportion of rare serovars and a lower proportion of Enteritidis than the overall Salmonella set (Fig. 4). And the overall serovar compositions of these sets are extremely similar as shown in the scatter
[Figure 4: stacked bar charts for All Cases, Age 1, and Cluster Size 1; y-axis ticks 0 to 60; segment labels in extraction order: 17%, 6%, 5%; 33%, 35%, 30%; 36%, 38%, 37%.]
Fig. 4. Salmonella serovar composition for all cases, age ≦1 y, and cluster size =1 (all ages). Salmonella serovars were grouped into four abundance categories with respect to all cases: Enteritidis, the next five most common serovars (Newport: 11.6%, Typhimurium: 9.8%, Javiana: 5.8%, I 4:I:-: 4.7%, and Infantis: 4.0%), the next 22 most common, and the next 838 serovars.

plot in SI Appendix, Fig. S2. The inverse relationship between serovar diversity and age is consistent with previously published results (17) and we provide an analysis of diversity by cluster size in SI Appendix and Fig. S3. Thus, a possible explanation for the age distribution results seen in Fig. 3 is that the younger age groups are more susceptible to and/or more frequently exposed to the serovars seen primarily in the smaller clusters, e.g., perhaps from environmental sources.
Fig. 5 compares the fraction of cluster size 1 cases by age range for the same categories of serovars. The age ranges were chosen to include approximately the same numbers of cases and the fractions were normalized using the mean for each serovar category. We see the same pattern for all four categories of serovars, i.e., there is a higher fraction of younger cases and of older cases in the smaller clusters. Thus, the higher frequencies of younger cases in small clusters seen in Fig. 3 (and to a lesser extent, older cases) seem to be an inherent property of small clusters and do not primarily depend on the mix of serovars.
Because we know the dates of collection of the clinical isolates, we can examine the persistence of the clusters greater than size 1, i.e., the number of days between the first isolate collected and the most recent isolate in the cluster. Fig. 6 shows the cumulative distribution of cluster persistence time among cases. Median persistence times are highest for Salmonella and lowest for E. coli. Note that we are underestimating the persistence of clusters beginning or ending outside of the dates of our sample collection. An even greater number of cases are likely to be missed because, as mentioned in the Introduction, they have escaped detection by PulseNet surveillance.
We can also examine nonsingleton clusters to see whether multiple US states are represented within them. The majority of cases were found in multistate clusters, with all four pathogens showing a high proportion of cases occurring in multiple states:

• Salmonella (78%)
• Listeria (70%)
• E. coli (65%)
• Campylobacter (63%)

While some contamination episodes are likely from a source in the local environment, a majority of the contamination episodes appear to be geographically dispersed. That is, the contaminated servings of food are likely to have been distributed from a central point source (e.g., a food production or packaging facility) because

The blue line in Fig. 7 shows how the fraction of clusters with cases from at least two different states changes with increasing cluster size. Larger clusters are more likely to be multistate for the trivial reason that, with more cases, there is a greater chance of including a case from a different state. We can control for this trivial explanation by examining all pairs of cases within clusters of a given size, to determine whether the fraction of multistate pairs varies systematically with cluster size (red line in Fig. 7). The red line shows that the contamination episodes underlying the larger clusters are inherently more likely to be distributed from a central site. While larger clusters are more likely to be geographically dispersed than smaller clusters, for Salmonella, over 30% of the clusters of size 2 are multistate and by cluster size 3, over 50% are multistate (Fig. 7, blue line). Similar results were obtained for Listeria, Campylobacter, and E. coli (SI Appendix, Fig. S4).
Fig. 8 compares the persistence of single-state versus multistate clusters for all four pathogens. We restrict the analysis to clusters of size 2 or 3 since the fraction of single-state clusters is substantially lower with increasing cluster size (Fig. 7). As noted above, while the contamination for some of the single-state clusters may be distributed from, e.g., a central site, most of the multistate clusters, however, are likely to have been distributed from central sources. The degree of shift to higher persistence of the multistate clusters compared to single-state appears roughly similar for all four pathogens and is highly significant (all P-values < 10^-4, Mann–Whitney U test).
We can also examine how the geographical dispersion of cases within a cluster varies among age groups (SI Appendix, Fig. S5). The youngest and oldest age groups have the highest fraction of cases in single-state clusters, which is consistent with their skew toward smaller clusters (Figs. 3 and 5) and the relationship between cluster size and geographical dispersion (Fig. 7). The fraction of cases in clusters where all cases in the cluster are within the same age range correlates very well with the geographical dispersion (Pearson correlation 0.84, P-value 0.0024) and is only slightly higher than the fraction of cases in clusters where all cases are in the same state and same age range. Thus, although overall only 22% of Salmonella isolates are in single-state clusters, contamination episodes resulting in same-age clusters are virtually all single-state (SI Appendix, Fig. S5). This would be consistent with these contamination

[Figure 5: legend: Enteritidis, Most Common, Common, Rare; x-axis: Age Group; y-axis: Normalized % (1 = Avg.), 0.8 to 1.2.]
Fig. 5. Normalized fraction of cluster size 1 cases by age range and serovar category. The number of serovars and the percent of cases they account for are listed in the legend. Fractions for each serovar were normalized by the mean for each serovar category.
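
The two multistate measures described for Fig. 7 (the fraction of clusters with cases from at least two states, and the pairwise control) can be sketched as below. This is an illustrative sketch only, not the authors' analysis code: the function name `multistate_fractions`, the representation of each cluster as a list of state codes, and the example clusters are all assumptions made here.

```python
from collections import defaultdict
from itertools import combinations


def multistate_fractions(clusters):
    """Return {cluster_size: (fraction_multistate_clusters, fraction_multistate_pairs)}.

    clusters: iterable of lists of state codes, one list per cluster
    (hypothetical input format). Singletons are skipped.
    """
    stats = defaultdict(lambda: {"clusters": 0, "multi": 0, "pairs": 0, "multi_pairs": 0})
    for states in clusters:
        size = len(states)
        if size < 2:
            continue
        s = stats[size]
        s["clusters"] += 1
        # Cluster-level measure: does the cluster span at least two states?
        s["multi"] += int(len(set(states)) > 1)
        # Pair-level control: compare every pair of cases within the cluster.
        for a, b in combinations(states, 2):
            s["pairs"] += 1
            s["multi_pairs"] += int(a != b)
    return {
        size: (s["multi"] / s["clusters"], s["multi_pairs"] / s["pairs"])
        for size, s in sorted(stats.items())
    }


# Hypothetical clusters, not data from the paper.
example_clusters = [["NY", "NY"], ["NY", "CA"], ["TX", "TX", "OK"], ["WA", "WA", "WA"], ["FL"]]
fractions = multistate_fractions(example_clusters)
```

The pairwise fraction does not grow simply because a cluster has more members, so a rise in that fraction with cluster size points to a real difference in how the underlying contamination was distributed.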
[Figure 6: x-axis: Persistence (days), 0 to 4,000; y-axis: cumulative % of cases, 0 to 100; series: Salmonella, Campylobacter, Escherichia coli, Listeria; dashed line at 50% of cases; annotated values: 1071 days, 841 days, 385 days.]
Fig. 6. Persistence of clusters. Cumulative distribution of cluster persistence: The cumulative fraction of cases within nonsingleton clusters, showing the range
of cluster persistence measured by the number of days from the first to the last isolate collected within each cluster. The graph highlights the distribution of
persistence times across all observed clusters, with a dashed horizontal line indicating the median persistence level, where 50% of cases are found in clusters
with a duration exceeding this value.
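
The persistence measure plotted in Fig. 6 can be sketched as follows, under the reading that each case contributes the persistence of its own cluster, so that the median is taken over cases rather than over clusters. The function names and example dates are hypothetical and this is not the authors' code.

```python
import statistics
from datetime import date


def cluster_persistence_days(collection_dates):
    """Days between the first and the most recent isolate in one cluster."""
    return (max(collection_dates) - min(collection_dates)).days


def per_case_persistence(clusters):
    """List one persistence value per case, skipping singleton clusters.

    clusters: iterable of lists of collection dates (hypothetical format).
    """
    values = []
    for collection_dates in clusters:
        if len(collection_dates) < 2:
            continue
        days = cluster_persistence_days(collection_dates)
        # Every case in the cluster carries the cluster's persistence.
        values.extend([days] * len(collection_dates))
    return values


# Hypothetical clusters, not data from the paper.
example_clusters = [
    [date(2020, 1, 1), date(2020, 1, 11)],
    [date(2019, 1, 1), date(2019, 6, 1), date(2021, 1, 1)],
    [date(2022, 5, 5)],
]
median_days = statistics.median(per_case_persistence(example_clusters))
```

Weighting by cases means that large, long-lived clusters dominate the median, which matches the statement that 50% of cases are found in clusters whose duration exceeds the marked value.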
episodes occurring in settings such as elder care facilities, daycare facilities, schools, and direct environmental exposure.
The diet of the age group less than or equal to 3 mo old is distinct from all other age groups in that it consists primarily of breast milk and/or infant formula: Only approximately 15.6% of infants receive complementary foods prior to 4 mo of age (18). Infant formula is distributed nationally from a small number of production facilities. If infant formula were a significant source of Salmonella contamination then one might expect to see a substantial fraction of multistate clusters solely composed of infants, contrary to the results described above (SI Appendix, Fig. S5). However, of the 7,994 infant cases in clusters ≥size 2, the largest infant-only cluster has eight cases and all clusters ≥size 4 are single-state clusters. Note that because we are focusing on the possibility of infant formula as a source of contamination, we are including those individuals less than 1 y old rather than ≤3 mo old since this is the age in which most children transition from infant formula to cow’s milk.
Both of these points provide strong support for the conclusion that Salmonella contamination of infant formula at the production site could only be responsible for a very small fraction of cases, if any, and any possible contamination episodes from this source would be exceedingly small.
If infant formula contaminated at the site of production is not a major source of salmonellosis then what is the source of contamination for cases at age ≤3 mo? The diets of infants under 4 mo of age are largely restricted to infant formula and breast milk, neither of which is commonly consumed by individuals of age 10 or older. Co-occurrence of these different age groups in a cluster therefore suggests cross-contamination between noninfant food and infant formula or breast milk, early complementary feeding, or exposure to a common nonfood source of contamination. Over 80% of the infants ≦3 mo of age are cluster members with individuals older than 10 y old (SI Appendix, Fig. S6). Furthermore, the fraction of cluster membership with older individuals only increases slightly with young children whose dietary intake is more similar to that of older children and adults, e.g., 2 to 5 y of age (SI Appendix, Fig. S6). This is consistent with data presented above and in SI Appendix, Fig. S5. Similar patterns are evident in Campylobacter and E. coli.

foodborne outbreaks that are investigated by the CDC, FDA, USDA, as well as state and local health authorities (10). The vast majority of food poisoning cases are classified as sporadic cases, which have primarily been studied through case–control methods and routine surveillance systems (3, 19–21). By analyzing the clusters formed from closely related pathogen genomes of clinical isolates within CDC’s PulseNet system, we can derive a more detailed picture of the contamination episodes underlying foodborne illness in the United States.
The clusters of a given size shown in Fig. 2 are a mixture of contamination episodes with a range of different sizes since, as noted in the Introduction, only a variable fraction of the cases would be reported within the PulseNet system. In addition, the SNP threshold used to generate the clusters may be too stringent in some instances and thus split cases from the same contamination episode into different clusters, while in other instances it may be too high and thus merge cases from different episodes. This is particularly an issue for Enteritidis. Eggs and poultry are the most common source of Enteritidis cases in the United States (22), and because the same breeder site may supply multiple poultry production facilities, identical or nearly identical strains may be found at these separate production facilities (23–25). Thus, Enteritidis clusters formed using low SNP thresholds may lump together multiple independent proximal contamination episodes and the interpretation of the Enteritidis results must account for that possibility (26, 27). Note also that some US states began the use of WGS before others so the coverage is more comprehensive later in the time interval of the sample set (Materials and Methods). Furthermore, since we are likely to be missing cases from contamination episodes toward the beginning and ends of the time interval corresponding to our sample collection, the clusters associated with these episodes will be incomplete. Despite these caveats, the results presented here demonstrate substantial differences in the epidemiological properties of the clusters of different sizes, i.e., there are strong, epidemiologically relevant signals associated with the computed clusters.
We see a consistent pattern for Campylobacter, E. coli, and Salmonella:
assumptions: 1000
home, reduced environmental persistence could lead to incidence in younger age groups that is closer to adult incidence. Possibly relevant here is the observation that seasonal variability of Enteritidis is also markedly reduced compared to the other major serovars (e.g., ref. 33).
Although the smallest contamination episodes account for the majority of clinical cases of food poisoning, these do not appear to be primarily local, like some sort of direct environmental exposure or associated with exposure from a restaurant outbreak. Rather, a majority of these cases are in geographically dispersed clusters. For example, 78% of Salmonella cases (cluster size ≧2) are in multistate clusters. This implies that the servings that cause these cases are being distributed from central sites where the contamination is occurring, which would primarily be commercially distributed foodborne transmission, but could also include contact with commercially distributed animals, pet food, and returning travelers. Furthermore, a high fraction of cases are in clusters that are persistent, e.g., half the Salmonella cases are in clusters that persist for over 1,071 d. As discussed in the Materials and Methods section, though the results on the fraction of multistate clusters and on persistence of clusters are dependent on the choice of SNP thresholds for the generation of the clusters, the qualitative picture remains the same: over a wide range of thresholds, the majority of Salmonella cases are the result of contamination distributed from central sites and persisting for extended periods of time. As noted above, because we are missing isolates that could be members of clusters prior to the beginning and after the end of our sample window as well as the larger number of cases that are not detected, e.g., by PulseNet, these are likely to be conservative estimates.
If infant formula contaminated at the production facility were a significant cause of salmonellosis, we would expect to see a substantial fraction of infant-only clusters (size ≥2). However, they represent only 17.6% of the cases. Furthermore, almost all infant-only clusters are single-state: there are only two infant-only clusters at the maximum size of eight cases, and both are single-state clusters (SI Appendix, Fig. S5 and above). Rather, as seen in SI Appendix, Fig. S6, it appears that most cases of salmonellosis in infants (age ≤3 mo), with the caveat that we can only analyze nonsingleton clusters, are due to contamination sources shared with individuals too old to consume infant formula: cross-contamination from noninfant food, pet food, complementary feeding, and environmental sources.
Our results imply that most cases of foodborne illness are due to many very small contamination episodes distributed from central sites over quite an extended period, similar to the Newport cluster shown in Fig. 1. These characteristics greatly increase the challenge of detecting and eliminating the source of contamination, e.g., in a production facility or the environment. This may help explain the slow progress in reducing the overall burden of foodborne illness in the United States (34). Given the high case rates of the youngest age groups, their increased susceptibility, and the likelihood that a high fraction of these cases are due to sources within the household, a greater emphasis on improving food safety in the consumer household should be considered. While improving food safety practices in the household would be quite challenging (35), identifying the minimal changes needed to reduce cross-contamination of infant formula may be more feasible. Currently, state laws require infant car seats and hospitals provide training on installing and using a car seat for parents bringing their newborn home from the hospital. While breastfeeding is encouraged, much less attention is given to educating new parents and caregivers on the safe preparation of infant formula and the use of feeding bottles (36).
Salmonella Enteritidis seems to be a more feasible target for improving food safety outside the household than the other major pathogens because the contamination episodes causing most cases are larger and a significantly higher fraction of these are likely to have been distributed widely from central sites. Moreover, we know the primary source of Salmonella Enteritidis: poultry and eggs (37). Quantitative risk assessment models have been created based on the pathogen survival rates in the cooking process that have fairly good correlations between the prevalence of Salmonella on, e.g., poultry and the fraction of cases in the US (38, 39). And while Enteritidis may, in principle, be a more feasible target for risk reduction at central sites, improved food safety practices in the household would be helpful here as well.
While WGS has already proven to be a useful tool for identifying and investigating foodborne outbreaks, we have demonstrated that the increasingly comprehensive set of pathogen genomes can also reveal important aspects of sporadic foodborne illness which accounts for most cases of food poisoning.

Materials and Methods

Datasets. The pathogen isolates used in this project were collected and sequenced by CDC’s PulseNet national laboratory network for foodborne outbreak detection. The pathogen genome data and SNP distances were downloaded from the National Center for Biotechnology Information (NCBI) Pathogen Detection site (40). The identifiers for the pathogen isolates used for these analyses are available in spreadsheets listed in Supplementary files (PDT*) along with identifiers for the clusters and the size of each cluster based on thresholds of two, four, and eight SNPs. For Salmonella, the serovars as identified in the NCBI pipeline are listed as well. The Pathogen Detection releases used for each pathogen are

• Campylobacter PDG000000003.2084 (11/2023)
• Ecoli_Shigella PDG000000004.4162 (11/2023)
• Listeria PDG000000001.3486 (11/2023)
• Salmonella PDG000000002.2848 (11/2023)

PulseNet clinical isolates in study set:

• Salmonella: 1961-08-23 to 2023-10-27, 265,449 isolates
  • >98% of all isolates are from 2015-01-01
• E. coli: 1981-01-02 to 2023-11-04, 68,527 isolates
  • >96% of all isolates are from 2015-01-01
• Campylobacter: 1983-01-01 to 2023-11-07, 23,577 isolates
  • >95% of all isolates are from 2015-01-01
• Listeria: 1965-11-20 to 2023-12-15, 8,205 isolates
  • >90% of all isolates are from 2013-01-01

Metadata. The PulseNet metadata used for these analyses was accessed from CDC’s SEDRIC database under their data-use agreement (41). The fields used were

• the age of the affected individual,
• the date of collection, and
• the state where the isolate was collected.

Generating Clusters.
SNP distances. The SNP distances for these analyses were the patristic (i.e., tree-based) distances computed by the NCBI Pathogen Detection resource (40). This resource has been available since 2016 and has been used intensively by the FDA, CDC, USDA, and US state public health laboratories for outbreak detection and source tracking. A summary of the pipeline is available (https://ftp.ncbi.nlm.nih.gov/pathogen/Methods.txt).
SNP clustering. There are many ways to generate clusters using distance measures such as SNP distances. We start with the NCBI Pathogen Detection SNP clusters that are updated regularly. These SNP clusters include all genomes that are within 50 SNPs of each other (NCBI pipeline summary https://ftp.ncbi.nlm.nih.gov/pathogen/Methods.txt). We determine the case cluster for the genomes of each PulseNet clinical isolate within an NCBI SNP cluster as follows: Using single linkage clustering based on a threshold SNP distance, isolates are added to a case cluster as long as they are less than or equal to the threshold SNP distance
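
The single linkage step described in this section can be illustrated with a short sketch. It is not the pipeline used for the study: it assumes pairwise SNP distances are already available as a dictionary keyed by isolate pairs, uses a union-find structure as one convenient way to implement single linkage, and the function name, isolate labels, and distances are hypothetical.

```python
def single_linkage_clusters(isolates, distances, threshold=4):
    """Group isolates so that each is within `threshold` SNPs of at least
    one other member of its cluster (single linkage).

    isolates: list of isolate identifiers.
    distances: dict mapping (isolate_a, isolate_b) -> SNP distance
               (hypothetical input format; missing pairs are treated as unlinked).
    """
    parent = {iso: iso for iso in isolates}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), snps in distances.items():
        if snps <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for iso in isolates:
        clusters.setdefault(find(iso), []).append(iso)
    # Largest clusters first, then alphabetical, for a stable ordering.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


# Hypothetical isolates and SNP distances, not data from the paper.
example_isolates = ["A", "B", "C", "D", "E"]
example_distances = {("A", "B"): 2, ("B", "C"): 4, ("A", "C"): 6, ("C", "D"): 9, ("D", "E"): 5}
clusters_at_4 = single_linkage_clusters(example_isolates, example_distances, threshold=4)
clusters_at_8 = single_linkage_clusters(example_isolates, example_distances, threshold=8)
```

In the example, isolates A and C are six SNPs apart yet fall in the same cluster at a threshold of four because both are linked through B, which is the defining property of single linkage.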
1. B. R. Jackson et al., Implementation of nationwide real-time whole-genome sequencing to enhance listeriosis outbreak detection and investigation. Clin. Infect. Dis. 63, 380–386 (2016).
2. B. Brown, M. Allard, M. C. Bazako, J. Blankenship, T. Minor, An economic evaluation of the Whole Genome Sequencing source tracking program in the U.S. PLoS ONE 16, e0258262 (2021). https://doi.org/10.1371/journal.pone.0258262.
3. E. D. Ebel et al., Comparing characteristics of sporadic and outbreak-associated foodborne illnesses, United States, 2004–2011. Emerg. Infect. Dis. 22, 1193–1200 (2016).
4. E. Brown, U. Dessai, S. McGarry, P. Gerner-Smidt, Use of whole-genome sequencing for food safety and public health in the United States. Foodborne Pathog. Dis. 16, 441–450 (2019).
5. B. Jagadeesan et al., The use of next generation sequencing for improving food safety: Translation into practice. Food Microbiol. 79, 96–115 (2019).
6. J. Ronholm, N. Nasheri, N. Petronella, F. Pagotto, Navigating microbiological food safety in the era of whole-genome sequencing. Clin. Microbiol. Rev. 29, 837–857 (2016).
7. E. L. Stevens et al., Use of whole genome sequencing by the federal interagency collaboration for genomics for food and feed safety in the United States. J. Food Protoc. 85, 755–772 (2022).
8. A. W. Pightling et al., Interpreting whole-genome sequence analyses of foodborne bacteria for regulatory applications and outbreak investigations. Front. Microbiol. 9, 1482 (2018).
9. B. Tolar et al., An overview of PulseNet USA databases. Foodborne Pathog. Dis. 16, 457–462 (2019).
10. E. Scallan et al., Foodborne illness acquired in the United States–major pathogens. Emerg. Infect. Dis. 17, 7–15 (2011).
11. M. K. Thomas et al., Estimates of foodborne illness-related hospitalizations and deaths in Canada for 30 specified pathogens and unspecified agents. Foodborne Pathog. Dis. 12, 820–827 (2015).
12. E. Sarno, D. Pezzutto, M. Rossi, E. Liebana, V. Rizzi, A review of significant european foodborne outbreaks in the last decade. J. Food Protoc. 84, 2059–2070 (2021).
13. P. Gerner-Smidt et al., Whole genome sequencing: Bridging one-health surveillance of foodborne diseases. Front. Public Health 7, 172 (2019).
14. United States Population by Age and Sex, https://www.census.gov/popclock/data_tables.php?component=pyramid. Accessed 24 October 2023.
15. C. Charlier, O. Disson, M. Lecuit, Maternal-neonatal listeriosis. Virulence 11, 391–397 (2020).
16. A. L. Boore et al., Salmonella enterica infections in the United States and assessment of coefficients of variation: A novel approach to identify epidemiologic characteristics of individual serotypes, 1996–2011. PLoS ONE 10, e0145416 (2015).
17. M. C. Judd, R. M. Hoekstra, B. E. Mahon, P. I. Fields, K. K. Wong, Epidemiologic patterns of human Salmonella serotype diversity in the USA, 1996–2016. Epidemiol. Infect. 147, e187 (2019).
18. K. V. Chiang, H. C. Hamner, R. Li, C. G. Perrine, Timing of introduction of complementary foods - United States, 2016–2018. MMWR Morb. Mortal. Wkly. Rep. 69, 1969–1973 (2023).
19. B. Devleesschauwer et al., Associating sporadic, foodborne illness caused by Shiga toxin-producing Escherichia coli with specific foods: A systematic review and meta-analysis of case-control studies. Epidemiol. Infect. 147, e235 (2019).
20. K. E. Fullerton et al., Case-control studies of sporadic enteric infections: A review and discussion of studies conducted internationally from 1990 to 2009. Foodborne Pathog. Dis. 9, 281–292 (2012).
21. A. R. Domingues, S. M. Pires, T. Halasa, T. Hald, Source attribution of human salmonellosis using a meta-analysis of case-control studies of sporadic infections. Epidemiol. Infect. 140, 959–969 (2012).
22. B. R. Jackson, P. M. Griffin, D. Cole, K. A. Walsh, S. J. Chai, Outbreak-associated Salmonella enterica serotypes and food commodities, United States, 1998–2008. Emerg. Infect. Dis. 19, 1239–1244 (2013).
23. J. Zhang et al., High genetic similarity of Salmonella Enteritidis as a predominant serovar by an independent survey in 3 large-scale chicken farms in China. Poult. Sci. 100, 100941 (2021).
24. T.-M. La et al., Whole-genome analysis of multidrug-resistant Salmonella Enteritidis strains isolated from poultry sources in Korea. Pathogens 10 (2021).
25. C.-W. Lei et al., Vertical transmission of Salmonella Enteritidis with heterogeneous antimicrobial resistance from breeding chickens to commercial chickens in China. Vet. Microbiol. 240, 108538 (2020).
26. D. J. Baker et al., Challenges associated with investigating Salmonella Enteritidis with Low genomic diversity in New York State: The impact of adjusting analytical methods and correlation with epidemiological data. Foodborne Pathog. Dis. 20, 230–236 (2023).
27. T. Dallman et al., Phylogenetic structure of European Salmonella Enteritidis outbreak correlates with national and international egg distribution network. Microb. Genom. 2, e000070 (2016).
28. A. K. Simon, G. A. Hollander, A. McMichael, Evolution of the immune system in humans from infancy to old age. Proc. Biol. Sci. 282, 20143085 (2015).
29. A. Georgountzou, N. G. Papadopoulos, Postnatal innate immune development: From birth to adulthood. Front. Immunol. 8, 957 (2017).
30. R. de Alwis et al., The role of maternally acquired antibody in providing protective immunity against nontyphoidal Salmonella in urban vietnamese infants: A birth cohort study. J. Infect. Dis. 219, 295–304 (2019).
31. S. Basha, N. Surendran, M. Pichichero, Immune responses in neonates. Expert Rev. Clin. Immunol. 10, 1171–1184 (2014).
32. CDC, Breastfeeding Report Card, Centers for Disease Control and Prevention. (2023). https://www.cdc.gov/breastfeeding/data/reportcard.htm. Accessed 28 July 2023.
33. Salmonella Atlas, (2020). https://www.cdc.gov/salmonella/reportspubs/salmonella-atlas/index.html. Accessed 19 October 2023.
34. H. J. Shah et al., Reported incidence of infections caused by pathogens transmitted commonly through food: Impact of increased use of culture-independent diagnostic tests - foodborne diseases active surveillance network, 1996–2023. MMWR Morb. Mortal. Wkly. Rep. 73, 584–593 (2024).
35. Kitchen Life 2, https://www.food.gov.uk/research/behaviour-and-perception/kitchen-life-2. Accessed 30 November 2023.
36. E. C. Redmond, C. J. Griffith, The importance of hygiene in the domestic kitchen: Implications for preparation and storage of food and infant formula. Perspect. Public Health 129, 69–76 (2009).
37. L. H. Gould et al., Surveillance for foodborne disease outbreaks - United States, 1998–2008. MMWR Surveill. Summ. 62, 1–34 (2013).
38. T. P. Oscar, A quantitative risk assessment model for Salmonella and whole chickens. Int. J. Food Microbiol. 93, 231–247 (2004).
39. K. Rajan, Z. Shi, S. C. Ricke, Current aspects of Salmonella contamination in the US poultry production chain and the potential application of risk strategies in understanding emerging hazards. Crit. Rev. Microbiol. 43, 370–392 (2017).
40. Home–Pathogen Detection–NCBI, https://www.ncbi.nlm.nih.gov/pathogens/. Accessed 16 August 2023.
41. SEDRIC: System for Enteric Disease Response, Investigation, and Coordination, (2022). https://www.cdc.gov/foodsafety/outbreaks/tools/sedric.html. Accessed 6 January 2024.
42. M. A. Chattaway, A. Painset, G. Godbole, S. Gharbia, C. Jenkins, Evaluation of genomic typing methods in the Salmonella reference laboratory in public health, England, 2012–2020. Pathogens 12 (2023).