What Twitter Knows About Flu
Twitter may turn out to be a great tool for tracking epidemics and how people deal with them.
Some scientists tracked tweets about swine flu back in 2009 and 2010, then looked at how the tweets lined up with vaccination rates.
By comparing the Twitter data with vaccination estimates from the Centers for Disease Control and Prevention, the group saw patterns between what people were saying about flu shots and whether or not they were getting vaccinated.
For example, New England, the region with the highest vaccination rate, also had the highest percentage of Twitter users who posted positive messages about the vaccine. The study findings were published in PLoS Computational Biology.
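One way to picture that kind of comparison is as a correlation between each region's share of positive tweets and its vaccination rate. The sketch below is only an illustration, not the study's analysis: the function name `pearson` is an assumption, and the regional figures are invented placeholders, not numbers from the paper or the CDC.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, one entry per region (NOT real data).
positive_tweet_share = [0.10, 0.20, 0.30, 0.40]
vaccination_rate = [0.15, 0.20, 0.30, 0.35]

r = pearson(positive_tweet_share, vaccination_rate)
print(f"correlation between sentiment and vaccination: {r:.2f}")
```

A value near 1 would mean regions with more positive tweets also tend to have higher vaccination rates. A correlation alone does not show that one causes the other.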
Health care workers could use the data to target education, predict disease outbreaks, and, since tweets happen in real time, respond to epidemics more quickly.
Of course, social media networks are also an excellent way to spread misinformation, so they're a source of plenty of background noise and false alarms, too, as groups like the World Health Organization are learning.
But as an increasing number of scientific papers based on tweeted data are published in peer-reviewed journals, any debate seems to be over how best to use Twitter and what its limitations are, rather than whether or not it's a legitimate source of information.
"You can observe how people feel about certain things in a real-world context," Penn State biologist Marcel Salathe, lead author on the flu study, tells Shots. "On Twitter, people talk about things that matter to them in a completely normal context."
Salathe concedes that data from tweets probably show some biases based on the demographics of Twitter users, but he feels those biases are no worse than the ones that come with a phone survey. Survey bias is a growing problem, given the declining number of households with land lines and the number of people who decline to participate in phone surveys. So Twitter might actually provide a more accurate picture, he says.
One of the challenges researchers face with Twitter is the sheer amount of data available. With hundreds of thousands of vaccine-related messages to sort through, the scientists turned to a technique called machine learning for help.
If you've ever gotten book recommendations from Amazon or let Netflix choose a movie for you, then you're already familiar with machine learning. Based on the customers' input, the computer learns which things they're more likely to buy and finds similar items in the database.
The technique Salathe developed is very similar. A group of students rated 70,000 tweets by hand, and those were used to teach a computer to sort the rest.
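The idea of teaching a computer from hand-rated examples can be sketched in a few lines. The article does not say which algorithm the researchers used, so the example below swaps in a simple, well-known technique (a naive Bayes word-count classifier) purely for illustration. The function names `tokenize`, `train`, and `classify` and the six sample tweets are made up for this sketch and do not come from the study.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_tweets):
    """Count how often each word appears under each hand-assigned label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in labeled_tweets:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log prior: how common this label was in training.
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing so unseen word/label pairs are not zero.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented stand-ins for the hand-rated tweets (NOT data from the study).
hand_labeled = [
    ("got my flu shot today, feeling good", "positive"),
    ("everyone should get the vaccine", "positive"),
    ("not getting the vaccine, it is not safe", "negative"),
    ("the flu shot made me sick, never again", "negative"),
    ("clinic opens at nine for flu shots", "neutral"),
    ("news report on vaccine supply", "neutral"),
]

model = train(hand_labeled)
print(classify("the vaccine is not safe", model))
```

With only six examples the classifier is a toy. The point is the workflow: people label a sample by hand, the program learns word patterns from that sample, and it then sorts the remaining messages on its own.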
The algorithm that researchers developed to analyze whether vaccine-related tweets were positive, negative, or neutral can be applied to other infectious diseases, but Salathe has even bigger plans for it.
While the trails of viruses and bacteria are worth monitoring, the most prevalent health problems in the U.S. today are linked to behavior, such as smoking or unhealthy eating habits, rather than to infectious agents. "These behaviors can be, in a way, infectious," Salathe said. Many of these issues, most notably obesity, have been found to spread through social groups.
Biological contagion is easier to track than behavioral contagion. People who post negative messages about a flu vaccination are less likely to get that vaccination, and therefore more likely to get sick. For tracking the flu, it is irrelevant whether two like-minded people became friends as a result of a shared idea about vaccines or whether one person influenced the other's views. That's not true when you're tracking something like obesity, where proving a direct influence of one individual's behavior on another will be tricky.
But it's a challenge Salathe feels ready to tackle. "Understanding behavior contagion is definitely a big issue for us," he said, and he feels Twitter might hold the answer. "These kinds of data that weren't around five years ago are going to revolutionize the way we think about health."