Abundance of Data Has Created New Ways to Gather Insights
All eyes are on Rio, as much for the Olympic highlights as for the unfolding Zika health crisis. But with 42 countries and territories reporting confirmed local, vector-borne transmission of the virus, the risk may be closer to home.
Infectious diseases are now spreading geographically much faster than at any time in history and appear to be emerging more quickly than ever before, says the World Health Organization. With an expected 3.6 billion airline passengers in 2016, an outbreak in any part of the world may be just hours away from a gate near you.
Disease surveillance continues to be crucial to global health. The latest weapon, big data, is providing public health officials with new opportunities to predict and prevent disease outbreaks.
To that end, the Oswaldo Cruz Foundation (Fiocruz), a prominent research organization based in Rio, will employ data analytics to help track the spread of Zika using the Spatiotemporal Epidemiological Modeler (STEM) project. STEM has been used to study the spread of other infectious diseases, such as Ebola and the mosquito-borne dengue. Fiocruz, in collaboration with IBM, will also analyze social media to help identify the location and severity of outbreaks of Zika, dengue, and chikungunya. For these vector-borne diseases, with no vaccines currently available, prevention and preparedness are the top strategies.
Tools like HealthMap aggregate diverse data sources, including crowdsourcing, online news, and traditional health reporting systems, and provide real-time surveillance of public health threats. HealthMap gets more than a million visitors per year and collaborates with the CDC and other public health offices.
Fighting Dengue with Data
While the enormous volume of data presents new opportunities for researchers, each data source has its own pluses and minuses, says Dr. Lakshminarayanan Subramanian, an associate professor in the Courant Institute of Mathematical Sciences at NYU. “You have to be really careful how you extract the right signal, because there is a lot of noise in the data,” he says.
Dr. Subramanian, Dr. Umar Saif, chairman of the Punjab Information Technology Board, and a team of computer researchers were able to find the signal in the noise from more than 300,000 phone calls about dengue, a virus that is transmitted in much the same way as Zika.
Those calls were made to a health hotline set up to improve surveillance and response times following a 2011 dengue epidemic in the densely populated province of Punjab in Pakistan. In that outbreak, 21,000 cases were reported and more than 350 lives were lost.
The team analyzed many data sources, some at a fine-grained town level, including call volume, hospital records, and weather conditions that affect the vector lifecycle. What emerged were patterns in case distribution that led to a forecasting idea, says Dr. Subramanian. The team created a working dashboard that not only flags outbreaks but also provides an accurate forecast by location two to three weeks ahead of time.
“If you give a signal to a hospital two weeks before an outbreak, it’s a big deal,” Dr. Subramanian says. Their results were published in Science Advances.
Hospitals in that region use the lead time to prepare special wards and prevent patient-to-patient spread of the infection, and public health officials are able to implement targeted containment activities. The researchers plan to publish follow-up data on these efforts. “So far the results have been promising,” he says.
Another study on dengue transmission in Pakistan, published in the Proceedings of the National Academy of Sciences, found predictive clues in much bigger data: more than 40 million mobile phone records. Dengue has been prevalent for decades in the southern port city of Karachi, but newly emerging outbreaks in the northeast gave researchers the opportunity to analyze how and when the disease spreads in a wired world.
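The idea of treating hotline call volume as a leading indicator of cases can be illustrated with a small sketch. This is not the team's published model: it assumes a simple linear relationship between calls and the cases reported a fixed number of weeks later, and the function names, the two-week lag, the alert threshold, and the weekly figures are all hypothetical.

```python
from statistics import mean


def fit_lagged_line(calls, cases, lag):
    """Least-squares fit of cases[t] = intercept + slope * calls[t - lag].

    `calls` and `cases` are equal-length weekly series; `lag` is >= 1.
    """
    x = calls[:-lag]   # call volume `lag` weeks before each case count
    y = cases[lag:]    # case counts aligned with those earlier calls
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def forecast(calls, intercept, slope, lag):
    """Predict cases for the next `lag` weeks from the latest calls."""
    return [intercept + slope * c for c in calls[-lag:]]


def flag_outbreak(predicted, threshold):
    """Mark each forecast week whose predicted cases reach the threshold."""
    return [p >= threshold for p in predicted]


# Hypothetical weekly data for one town (not from the study).
LAG_WEEKS = 2
weekly_calls = [10, 12, 15, 30, 60, 90, 70, 40]
weekly_cases = [5, 6, 7.0, 8.0, 9.5, 17.0, 32.0, 47.0]

intercept, slope = fit_lagged_line(weekly_calls, weekly_cases, LAG_WEEKS)
predicted = forecast(weekly_calls, intercept, slope, LAG_WEEKS)
alerts = flag_outbreak(predicted, threshold=30)
```

A real system would need to handle noise in the call data, combine other signals such as weather and hospital records, and validate forecasts against held-out weeks; the sketch only shows how a lag turns today's calls into an estimate for the weeks ahead.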
The researchers were able to generate fine-scale dynamic risk maps with direct application to dengue containment and epidemic preparedness.
“Mobile phone data provide dynamic population mobility estimates that can be combined with infectious disease surveillance data and seasonally varying environmental data to map these changing patterns of vulnerability,” says author Dr. Caroline Buckee, assistant professor at the Harvard T.H. Chan School of Public Health, in the study.
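One way to picture how those three inputs might be combined is the toy calculation below. It is an illustrative assumption, not the method used in the study: each destination's risk is taken to be its environmental suitability multiplied by the expected number of infected arrivals, where arrivals come from mobility estimates and infection levels from surveillance data. The region names, trip counts, prevalence values, and suitability scores are all invented.

```python
def importation_risk(trips, prevalence, suitability):
    """Score each destination's risk of imported transmission.

    trips:       {(source, destination): estimated travellers per week}
    prevalence:  {region: fraction of residents currently infected}
    suitability: {region: 0..1 score for local mosquito transmission}
    """
    risk = {}
    for dest, suit in suitability.items():
        # Expected infected travellers arriving from every other region.
        infected_arrivals = sum(
            count * prevalence.get(src, 0.0)
            for (src, dst), count in trips.items()
            if dst == dest and src != dest
        )
        risk[dest] = suit * infected_arrivals
    return risk


# Hypothetical inputs for three regions A, B and C.
trips = {("A", "B"): 1000, ("A", "C"): 200, ("B", "C"): 500, ("B", "A"): 800}
prevalence = {"A": 0.01, "B": 0.001, "C": 0.0}
suitability = {"A": 0.9, "B": 0.5, "C": 0.1}

risk = importation_risk(trips, prevalence, suitability)
ranked = sorted(risk, key=risk.get, reverse=True)
```

Recomputing the scores as travel patterns and seasonal suitability change is what would make such a map dynamic rather than a one-off snapshot.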
Digital Divide
For Fiocruz and others mining data for patterns on Zika, progress may be slow. For one, Zika virus infection is often silent, with up to 80% of infected individuals experiencing no symptoms and likely creating no online chatter.
In the case of Brazil, while the country has experienced rapid growth in Internet usage, a large part of the population has never been online, says the Internet Society. Pew Research Center puts those numbers at 60% of the population using the Internet and only 41% having smartphones and access to the mobile Internet.
“The digital divide affects user demographics and what type of input you get,” says Dr. Subramanian. There is some risk that the data capture affluent populations more than less affluent ones, he says. With the growth of smartphones, this will continue to change, he adds.
For now, the high flux of data has created new ways to gather insights at both the community and the individual level. For public health experts mining data, the opportunity is huge right now, says Dr. Subramanian.