What if scientists were able to forecast the spread of flu the way meteorologists forecast the weather? What if they could track the spread of the virus and predict whether it has a 75 or 80 percent chance of striking ... you?
“The flu happens every year, but we still don't have a good idea of [important] factors, like who's going to be affected first, where that will happen and when exactly the peak week might be,” says Rumi Chunara, an assistant professor at the College of Global Public Health and the Department of Computer Science and Engineering at New York University.
In 2008, Google launched Flu Trends, an attempt to track the spread of influenza based on “aggregated search queries.” But the effort has had its problems, and Google recently announced that Flu Trends would no longer continue on its own website. Instead, Google will provide data directly to infectious disease researchers at institutions like Columbia University’s Mailman School of Public Health.
One of those Columbia researchers is Jeffrey Shaman, an associate professor of Environmental Health Sciences. He led a team that won the Centers for Disease Control and Prevention’s ‘Predict the Influenza Season Challenge’ last year by creating a ‘flu forecast.’ Much like a weather forecast that predicts a 70 percent chance of rain, Shaman’s models sought to predict a 70 percent chance of influenza.
Shaman says making a flu prediction requires three ingredients, just as in weather predictions.
- A ‘dynamical model’ that describes a system. For weather, this is a model that describes the dynamics and thermodynamics of the atmosphere. For flu, it’s one that describes the propagation of the pathogen through a local population.
- Specific observations. Weather forecasters use satellites, ground-based stations, balloons and other techniques to gather data. Shaman uses estimates of influenza incidence from the CDC, hospital networks, Google Flu Trends and other sources.
- Statistical methods that combine the data from the first two ingredients. “The idea is to try to optimize the model before you make the forecast,” Shaman explains. “Because if you make a model and use it to make a forecast without informing it of what's been going on in the local populations, it does a really bad job.”
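To make the three ingredients concrete, here is a minimal Python sketch. It is not Shaman's actual system: the dynamical model is a textbook SIR (susceptible, infected, recovered) model swapped in for illustration, the "observations" are whatever weekly case counts you supply, and the statistical step is simple likelihood weighting of an ensemble of model runs. Every function name, parameter range and noise level below is an assumption made for the example.

```python
import math
import random


def simulate_sir(beta, gamma, population, initial_infected, weeks):
    """Ingredient 1: a toy dynamical model.

    Returns weekly new-infection counts from a discrete-time SIR model.
    beta is the weekly transmission rate, gamma the weekly recovery
    fraction (keep gamma <= 1 so the infected count stays non-negative).
    """
    s = population - initial_infected
    i = float(initial_infected)
    incidence = []
    for _ in range(weeks):
        new_infections = min(s, beta * s * i / population)
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        incidence.append(new_infections)
    return incidence


def log_likelihood(simulated, observed, noise_sd):
    """Gaussian log-likelihood of the observed weeks given one model run.

    zip() stops at the shorter sequence, so only the weeks that have
    actually been observed are compared.
    """
    return sum(-0.5 * ((o - m) / noise_sd) ** 2
               for m, o in zip(simulated, observed))


def weighted_ensemble(observed, population, weeks,
                      noise_sd=50.0, n_members=500, seed=0):
    """Ingredient 3: combine the model with the observations.

    Draws hypothetical parameter values, runs the model for each draw,
    and weights every run by how well it matches the observed weeks
    (ingredient 2). Returns (beta, gamma, run, weight) tuples whose
    weights sum to 1.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        beta = rng.uniform(0.8, 3.0)   # assumed prior range
        gamma = rng.uniform(0.5, 1.0)  # assumed prior range
        run = simulate_sir(beta, gamma, population, 10, weeks)
        members.append((beta, gamma, run,
                        log_likelihood(run, observed, noise_sd)))
    # Subtract the best log-likelihood before exponentiating to avoid underflow.
    best = max(m[3] for m in members)
    raw = [math.exp(m[3] - best) for m in members]
    total = sum(raw)
    return [(b, g, run, w / total)
            for (b, g, run, _), w in zip(members, raw)]
```

Runs that disagree with the weeks already observed end up with weights near zero, which is one simple way of "informing" the model of what has been going on before it is used to forecast the weeks that have not happened yet.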
Chunara is working on the same problem, but at a more granular level. For her Go Viral study, she has developed a do-it-yourself test kit that people can use to submit samples for her research.
“As part of our kit, which anyone can sign up for on the website, we have two collection methods,” Chunara explains. “One is for saliva collection and one is the nasal swab. The idea is that when you get sick you can follow our instructions and in a couple minutes generate these specimens and send them to us and we can actually figure out what was causing the illness.”
Like Shaman’s work, Chunara’s is designed to track when and where illness and diseases are spreading through a community. “There are a lot of data sources that we can now get at scale, [but] we’re hoping to add a bit more specificity,” Chunara says. “The number of specimens we can process is smaller than the number of people searching on Google, but we're trying to combine these two [types of] data sources to look at exactly where flu is around the country and when it's happening.”
“The way we surveil infectious diseases is very passive,” Shaman adds. “We can have an active surveillance system where we're going out and seeing what people have and [using] that data to improve our forecast model. ... We don't want to just say flu is going to peak in New York City or in Seattle in five weeks; we want to say there's an eighty percent chance or twenty percent chance, the same way they do it in weather prediction.”
Public health officials are more likely to take action if they know there's an 80 percent chance the flu is going to peak in their area in three weeks than if they know there's a 20 percent chance, Shaman points out.
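One common way to arrive at a statement like "an 80 percent chance of peaking in three weeks" is to run many plausible simulations and count how often each week comes out as the peak. The short Python sketch below illustrates that idea only; the function name and the toy numbers are hypothetical, and the article does not describe how Shaman's team computes its probabilities.

```python
from collections import defaultdict


def peak_week_probabilities(trajectories, weights=None):
    """Return {week: probability} for the week in which cases peak.

    trajectories is a list of weekly case-count lists, one per simulated
    run. weights are optional per-run weights; equal weights are assumed
    if none are given. Weeks are numbered from 1.
    """
    if weights is None:
        weights = [1.0] * len(trajectories)
    total = sum(weights)
    probs = defaultdict(float)
    for run, w in zip(trajectories, weights):
        peak_week = max(range(len(run)), key=lambda k: run[k]) + 1
        probs[peak_week] += w / total
    return dict(probs)


# Toy example: four of five made-up runs peak in week 3, one in week 4.
toy_runs = [
    [10, 40, 90, 60, 20],
    [12, 35, 80, 70, 25],
    [8, 30, 85, 50, 15],
    [15, 50, 95, 40, 10],
    [5, 20, 60, 75, 30],
]
forecast = peak_week_probabilities(toy_runs)
```

In this toy case the forecast assigns roughly an 80 percent chance to a week-3 peak and a 20 percent chance to week 4, the kind of statement a health official could act on.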
Ideally, he says, people should receive a flu forecast every night on the news. “You already get a pollen count, you already get pollution levels — why not have a local pathogen forecast?”