Each year in the United States, 48 million people get sick from a foodborne illness, 128,000 are hospitalized, and 3,000 die. A recent paper published in the Journal of the American Medical Informatics Association (JAMIA) illustrates the success of an improved system, developed by the Columbia University Department of Computer Science, that tracks foodborne illness via online Yelp restaurant reviews.
Since 2012 this system has been used by the New York City (NYC) Department of Health and Mental Hygiene (DOHMH) to identify instances of foodborne illness in NYC restaurants.
The system identifies Yelp reviews that indicate a foodborne illness, as well as reviews that suggest multiple people became ill. To achieve this, it looks for key words within reviews, such as "sick" and "multiple." Since the system was introduced at DOHMH, epidemiologists have identified 8,523 reviews consistent with food poisoning, which helped to identify 10 outbreaks of foodborne illness associated with NYC restaurants. The paper describes the evaluation of methods to increase the system's sensitivity and specificity and so improve its performance.
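As a rough illustration of the idea described above, the following Python sketch flags reviews by key word and computes sensitivity and specificity against hand-labeled reviews. It is a simplified assumption, not the DOHMH or Columbia implementation: the names `ILLNESS_TERMS`, `MULTIPLE_TERMS`, `flag_review`, and `sensitivity_specificity` are hypothetical, and the term lists contain only the two words the article mentions.

```python
import re

# Hypothetical key-word lists; the article names only "sick" and "multiple".
ILLNESS_TERMS = {"sick"}
MULTIPLE_TERMS = {"multiple"}


def flag_review(text):
    """Return which of the two criteria a review's wording suggests."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {
        "illness": bool(words & ILLNESS_TERMS),
        "multiple": bool(words & MULTIPLE_TERMS),
    }


def sensitivity_specificity(predicted, actual):
    """Compare predicted flags with hand-labeled truth values.

    Sensitivity is the share of true illness reviews that were flagged;
    specificity is the share of unrelated reviews that were not flagged.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

A simple key-word match like this will miss reviews that describe illness in other words and will flag some unrelated ones, which is the kind of trade-off that sensitivity and specificity measure.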
Foodborne illness is a major health problem in the United States. The Centers for Disease Control and Prevention estimates that there are 48 million illnesses and over 3,000 deaths caused by contaminated food in the United States each year. When these instances are related to restaurants, they have traditionally been identified via health department complaint registration systems. However, social media now provides a public platform to disclose incidents that may not have been reported through established complaint systems.
Younger people are less likely to report foodborne illness via traditional channels. Online reviews are popular, however, and the incorporation of social media data into public health surveillance systems is becoming more common. Data from internet search engines and social media have been used to monitor outbreaks of various infectious diseases, such as influenza. An evaluation comparing social media and internet data against traditional methods of detecting infectious disease outbreaks found that these new methods were the first to report outbreaks in 70% of cases.
In line with this, the initial pilot study of the DOHMH system, which ran from July 1, 2012, to March 31, 2013, found that only 3% of illness incidents discovered via online reviews had been reported via NYC's established complaint system. Due to the success of the pilot study, DOHMH has integrated Yelp reviews into its foodborne illness complaint surveillance system and continues to mine Yelp reviews and investigate those pertaining to foodborne illness. DOHMH looks forward to implementing the improved system and continues to work with Columbia University to integrate new data sources, including Twitter, into the foodborne illness complaint system.
"We find that the application of machine learning, specifically in the form of document classification techniques, can contribute greatly to public health surveillance in social media, said lead researcher Thomas Effland. "Our future work will improve upon these techniques and target their application to foodborne illness surveillance in other forms of social media, such as Twitter, as well as other key indicators in public health."