A group of scientists from the Department of Computer Science and Artificial Intelligence at the University of Granada (UGR) has created a new system, based on artificial intelligence (AI) methods, that helps predict election results by analyzing the opinions posted on Twitter.
The UGR team describes its descriptive Big Data system in a study published in the international journal IEEE Access. The system is designed to handle large amounts of unstructured data (in the form of a “data lake”) taken from Twitter.
Using this method, the researchers developed a political forecasting system and validated it against the real-life 2016 U.S. elections, in which Hillary Clinton lost to Donald Trump.
Political talk is perhaps more common than ever before. Social networks bear this out, with a staggering volume of posts and threads dedicated to political topics every day.
Twitter is one of the most extensively used social networks for this purpose. On this network, the opinions of activists, leaders, and parties mix with those of people who are simply interested in politics. Successfully processing this information and transforming it into knowledge is an arduous task, but one that offers advantages for many fields, ranging from business to academia and journalism.
The UGR project is the outcome of an effort to “summarize” a huge amount of information and reduce it to clear, crisp data that can add value to a research inquiry. José Ángel Díaz García, María Dolores Ruiz, and María José Martín-Bautista from the UGR’s Department of Computer Science and Artificial Intelligence developed the system in question.
The new system was tested on a real-life comparison of two politicians and their respective policies: Hillary Clinton and Donald Trump, in their head-to-head contest in the U.S. general elections held in November 2016.
Analysis of Sentiments and Emotions
The system developed by the UGR researchers offers a set of links between debates and concepts on Twitter concerning the two politicians, in a format that can be easily understood and explained, along with the emotions and sentiments generated by those discussions.
At the heart of the system are unsupervised artificial intelligence techniques, that is, techniques that do not rely on pre-labeled databases in order to be trained and used.
Among these methods, “association rules” are particularly important because they help perform sentiment analysis in combination with dictionaries and sentiment lexicons.
Today, these techniques are of enormous value because they provide readily interpretable solutions. They enable straightforward data traceability and produce easily explained results that can be used by people with no technical knowledge, thus democratizing access to artificial intelligence.
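To make the idea of association rules and lexicon-based sentiment more concrete, here is a minimal Python sketch. It is not the UGR team's implementation, which is described in the IEEE Access paper. The toy tweets, the tiny lexicon, the thresholds, and the function names (mine_rules, lexicon_sentiment) are all hypothetical and chosen only for illustration. A rule "A -> B" is reported with its support (the share of tweets containing both words) and its confidence (the share of tweets containing A that also contain B).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each "tweet" is already cleaned and tokenized.
TWEETS = [
    {"clinton", "email", "scandal"},
    {"trump", "wall", "great"},
    {"clinton", "email", "bad"},
    {"trump", "wall", "border"},
    {"clinton", "debate", "great"},
]

# Hypothetical miniature sentiment lexicon (word -> polarity).
LEXICON = {"great": 1, "bad": -1, "scandal": -1}


def mine_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples
    for word pairs that pass both thresholds. No labeled data is needed."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for words in transactions:
        item_counts.update(words)
        pair_counts.update(combinations(sorted(words), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


def lexicon_sentiment(words, lexicon=LEXICON):
    """Label a tweet by summing the polarities of its known words."""
    score = sum(lexicon.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


RULES = mine_rules(TWEETS)
LABELS = [lexicon_sentiment(words) for words in TWEETS]
```

Each rule in RULES can be read directly, for example "tweets that mention email also mention clinton", which is the kind of traceable, easily explained output described above. A real system would mine larger itemsets over far more data and use an established sentiment lexicon rather than this three-word dictionary.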
This descriptive technique differs from the conventional “machine learning” models designed for predictive sentiment analysis. Those models need huge pre-labeled databases (extremely difficult to build for social networks because of the volatility of the topics concerned) and usually provide solutions that are very hard to understand because of the extremely complex mathematics involved.
Analysis of the results obtained by the new system confirms its ability to produce sentiment patterns and association rules with considerable descriptive value when applied to the U.S. elections. Experts can therefore draw parallels between those patterns and real-life events.
Diaz-Garcia, J. A., et al. (2020) Non-Query-Based Pattern Mining and Sentiment Analysis for Massive Microblogging Online Texts. IEEE Access. doi.org/10.1109/ACCESS.2020.2990461.