Where is the vaccine hesitancy? USC AI researchers can predict it at the ZIP code level, in real time – USC Viterbi

Photo credit: Thirdman/Pexels

The term “vaccine hesitant” has come up in nearly every conversation surrounding COVID-19, its latest variants, and evolving news about vaccine developments. It’s almost hard to imagine a day when such topics don’t creep into casual lunchtime conversations and our waking thoughts.

With new variants constantly emerging, the phenomenon of vaccine hesitancy is only becoming more powerful and present in communities across the United States. But before public health officials can understand and engage these communities, they must first solve an initial problem: how to effectively identify where large communities of vaccine-hesitant people reside in the first place.

In a new article published in PLOS Digital Health, researchers from the USC Viterbi School of Engineering have proposed natural language processing (NLP) software that learns, in real time, where vaccine skepticism lies.

Mayank Kejriwal, research assistant professor of industrial and systems engineering and research team leader at USC’s Information Sciences Institute (ISI), was inspired by the current shortcomings in predicting vaccine hesitancy. The software builds on existing NLP techniques, including word embedding algorithms that detect vaccine-related keywords. These advances make collecting vaccine hesitancy data at the ZIP code level considerably simpler, faster and more accurate.

By using publicly available Twitter data and existing machine learning algorithms to process it, the study’s system outperforms local and national survey data at reflecting public opinion on the COVID-19 vaccine.

Not all data is equal

Sara Melotte, a master’s student in computer science at USC Viterbi School of Engineering and a research assistant at ISI, commented on the study’s measures to acquire such data and how they contribute to the goal of making such predictions at the community level.

“We show that the tweet text and hashtags alone are sufficient to predict vaccine hesitancy at the ZIP code level with reasonable accuracy, even though not all tweets are related to the COVID-19 pandemic,” said Melotte.
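
To make the idea concrete, here is a minimal sketch of predicting a ZIP-level hesitancy rate from tweet text and hashtags alone. It is not the study's model: the article says the researchers use word embeddings, whereas this illustration swaps in plain token frequencies and a similarity-weighted average. The ZIP labels, tweets, hesitancy rates and function names (`tokenize`, `zip_profiles`, `predict_hesitancy`) are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"#?\w+")  # matches plain words and #hashtags


def tokenize(text):
    """Lowercase a tweet and split it into words and hashtags."""
    return TOKEN_RE.findall(text.lower())


def zip_profiles(tweets):
    """Pool the tweets of each ZIP code into one normalized token-frequency vector."""
    counts = defaultdict(Counter)
    for zip_code, text in tweets:
        counts[zip_code].update(tokenize(text))
    profiles = {}
    for zip_code, counter in counts.items():
        total = sum(counter.values())
        profiles[zip_code] = {tok: n / total for tok, n in counter.items()}
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(tok, 0.0) for tok, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_hesitancy(profile, labeled):
    """Estimate a rate as the similarity-weighted average of known rates.

    `labeled` is a list of (profile, hesitancy_rate) pairs, for example
    ZIP codes where survey figures are already available.
    """
    weighted = [(cosine(profile, other), rate) for other, rate in labeled]
    total = sum(weight for weight, _ in weighted)
    if total == 0:
        # No lexical overlap with any labeled ZIP: fall back to the plain mean.
        return sum(rate for _, rate in labeled) / len(labeled)
    return sum(weight * rate for weight, rate in weighted) / total


# Toy data, invented for illustration only. Note that one tweet is off-topic.
tweets = [
    ("ZIP_A", "Not getting the shot #novaccine"),
    ("ZIP_A", "Traffic is bad today"),
    ("ZIP_B", "Got my booster #vaccinated"),
    ("ZIP_C", "no shot for me #novaccine"),
]
profiles = zip_profiles(tweets)
known = [(profiles["ZIP_A"], 0.40), (profiles["ZIP_B"], 0.10)]  # made-up rates
estimate = predict_hesitancy(profiles["ZIP_C"], known)
print(round(estimate, 2))
```

In this toy setup the unlabeled ZIP shares vocabulary only with ZIP_A, so its estimate is pulled toward ZIP_A's rate. A real system would need far more tweets per ZIP code and a learned model rather than raw token overlap.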

The approach also avoids the bias inherent in surveys, an inevitable consequence of individuals knowing that their personal information is being collected. The algorithm retrieves hashtags without requiring any additional personal information and without deterring people from expressing their unfiltered opinions.

“Historically, a lot of things depend on surveys. When you see the poll numbers, those are collected by polls, which are expensive,” Kejriwal said. Cost is not the only limitation: the need for speed and ever-changing opinions further complicate the task of acquiring accurate, up-to-date data.

“What usually ends up happening is that we have to wait for the survey to come out, and by then you’d already be too late,” Kejriwal said. “But we’ve shown that you can take publicly available Twitter data and retrieve it programmatically” and get real-time results.

Guided by real-world intuitions, the model also draws on external data sources, such as the number of hospitals or scientific establishments in a district. “We are investigating how the use of these independent sets of features helps improve the model,” Kejriwal said.

However, one caveat of collecting such data is that varying state and municipal regulations limit the availability of public information. Even so, the study provides reliable methods and data for predicting vaccine hesitancy in major metropolitan areas, where Twitter traffic is high, and its results can be replicated and confirmed using independent survey data.

A tool for decision makers

The study provides local communities, public health experts and policy makers with an additional source for detecting and addressing vaccine reservations. It is a tool for enacting policies that benefit the communities that need them most, before it’s too late.

“We provide an early warning system,” said Melotte.

Historically, federal policies have often overlooked the nuances of each community’s composition and background. This has led to distrust of federal institutions and the policies that flow from them. Kejriwal stressed the importance of using the study’s methods to help rebuild such trust in a bottom-up, community-driven way.

“We can help communities design local policies and make their own decisions that will build trust,” Kejriwal said. Vaccine hesitancy highlights the need to rethink current, generalized approaches to vaccine policy. Approaching the situation from a fresh perspective supports the creation of more organic solutions that meet the needs of each community.

As vaccine hesitancy fluctuates in intensity and from one ZIP code to another, policy makers can reassess and adjust their approaches to vaccine administration and communication, and redirect resources accordingly.

“For any public health crisis, there will always be social media signals,” Kejriwal said. “This [study] is an opportunity because it is a living record and can provide us with a blueprint for getting signals in any public health crisis.”

Posted on May 16, 2022

Last updated May 16, 2022

James G. Williams