Innovative, Machine Learning-Based Tool for Identifying and Combating Fake News

Feb 4 2019

Fake news, such as distorted facts and invented stories, is spreading like wildfire over the Internet and is often shared without a second thought, especially on social media.

To identify fake news, Fraunhofer FKIE’s new machine learning tool analyzes both text and metadata. (Image credit: Fraunhofer FKIE)

In response, Fraunhofer scientists have created a system that automatically verifies social media posts and specifically filters out disinformation and fake news. The tool does this by analyzing both content and metadata, classifying it with machine learning techniques, and relying on user interaction to optimize the results on the go.

Fake news is intended to provoke a particular response or stir up agitation against an individual or a group of people. Its objective is to influence and manipulate public opinion on targeted topics of the day. Such fake news can spread like wildfire on the Internet, especially on social media such as Twitter or Facebook, and it can be difficult to identify. This is precisely where a classification tool created by the Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE comes in: it automatically verifies social media posts and processes enormous amounts of data.

Apart from processing text, the tool also takes metadata into account for its analysis and delivers its conclusions in visual form.

Our software focuses on Twitter and other websites. Tweets are where you find the links pointing to the web pages that contain the actual fake news. In other words, social media acts as a trigger, if you like. Fake news items are often hosted on websites designed to mimic the web presence of news agencies and can be difficult to distinguish from the genuine sites. In many cases, they will be based on official news items, but in which the wording has been altered.

Ulrich Schade, Professor, Fraunhofer FKIE.

The tool was developed by Schade’s research team.

Schade and his colleagues start the process by building libraries of serious news pieces as well as texts that users have identified as fake news. These libraries form the learning sets used to train the system. The researchers use machine learning methods to filter out fake news, which involves automatically searching for particular markers in texts and metadata. In a political context, for example, these could be formulations or combinations of words that rarely occur in journalistic reporting or in day-to-day language, such as “the current chancellor of Germany.” Linguistic errors are also a red flag. They are especially common when the author of the fake news is writing in a language other than their native tongue. In such instances, incorrect spelling, punctuation, sentence structure, or verb forms are all signs of a potential fake news item. Other indicators might include cumbersome formulations or out-of-place expressions.
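To make the idea of text markers more concrete, the following is a minimal sketch in Python of how a text could be turned into a few numeric marker values. It is not the Fraunhofer FKIE implementation: the function name `extract_text_markers`, the phrase list, the tiny vocabulary, and the three markers chosen are all assumptions made for illustration.

```python
import re

# Hypothetical marker resources. A real system would derive these from its
# libraries of serious news and known fake news, not from hand-written lists.
RARE_FORMULATIONS = ["the current chancellor of germany"]
KNOWN_WORDS = {
    "the", "current", "chancellor", "of", "germany",
    "has", "announced", "a", "new", "policy", "today",
}

def extract_text_markers(text: str) -> dict:
    """Turn a text into a small dictionary of numeric marker values."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    # Words missing from the vocabulary stand in for possible spelling errors.
    unknown = [w for w in words if w not in KNOWN_WORDS]
    return {
        # How often an unusual formulation appears in the text.
        "rare_formulations": sum(lowered.count(p) for p in RARE_FORMULATIONS),
        # Share of words not found in the vocabulary.
        "unknown_word_ratio": len(unknown) / len(words) if words else 0.0,
        # Runs of two or more exclamation or question marks.
        "repeated_punctuation": len(re.findall(r"[!?]{2,}", text)),
    }

markers = extract_text_markers(
    "The current chancellor of Germany has anounced a new policy today!!!"
)
print(markers)
```

Marker values like these would then serve as the input features for a classifier trained on the learning sets.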

When we supply the system with an array of markers, the tool will teach itself to select the markers that work. Another decisive factor is choosing the machine learning approach that will deliver the best results. It’s a very time-consuming process, because you have to run the various algorithms with different combinations of markers.

Ulrich Schade, Professor, Fraunhofer FKIE.
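The search over marker combinations that Schade describes can be sketched as follows. This is an illustrative example under stated assumptions, not the team's method: a simple nearest-centroid classifier stands in for whatever algorithms the researchers actually compare, the scoring uses leave-one-out accuracy, and the marker names and sample values are invented.

```python
from itertools import combinations

def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_centroid_predict(train, x):
    """train: list of (features, label). Returns the label whose class
    centroid is closest to x (squared Euclidean distance)."""
    by_label = {}
    for feats, label in train:
        by_label.setdefault(label, []).append(feats)
    best_label, best_dist = None, float("inf")
    for label, rows in by_label.items():
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid(rows)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(samples, marker_names):
    """samples: list of (marker_dict, label). Each sample is held out once
    and predicted from the others, using only the given markers."""
    correct = 0
    for i, (markers, label) in enumerate(samples):
        train = [([m[n] for n in marker_names], l)
                 for j, (m, l) in enumerate(samples) if j != i]
        x = [markers[n] for n in marker_names]
        correct += nearest_centroid_predict(train, x) == label
    return correct / len(samples)

def best_marker_subset(samples, all_markers):
    """Try every non-empty combination of markers and keep the first one
    that reaches the highest accuracy (smaller subsets are tried first)."""
    best = (0.0, ())
    for k in range(1, len(all_markers) + 1):
        for subset in combinations(all_markers, k):
            acc = leave_one_out_accuracy(samples, subset)
            if acc > best[0]:
                best = (acc, subset)
    return best

# Invented toy data: one informative marker, one uninformative marker.
samples = [
    ({"spelling_errors": 0, "text_length": 5}, "real"),
    ({"spelling_errors": 1, "text_length": 9}, "real"),
    ({"spelling_errors": 0, "text_length": 2}, "real"),
    ({"spelling_errors": 4, "text_length": 4}, "fake"),
    ({"spelling_errors": 5, "text_length": 8}, "fake"),
    ({"spelling_errors": 3, "text_length": 3}, "fake"),
]
print(best_marker_subset(samples, ["spelling_errors", "text_length"]))
```

The number of combinations doubles with every added marker, which gives a sense of why running several algorithms over many marker sets is so time-consuming.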

Metadata Yields Vital Clues

Metadata can also be used as a marker. In fact, it plays an important role in distinguishing between authentic sources of information and fake news: For example, how often does an account post, and at what times are its tweets sent? A post’s timing can reveal a great deal. For example, it can point to the country and time zone of the news originator. A high send frequency suggests bots, which increases the probability that a post is fake news. Social bots send their links to a large number of users, for example, to spread uncertainty among the public. The connections and followers of an account can also prove to be fertile ground for analysts.

This is because they enable researchers to construct heat maps and graphs of send frequency, send data, and follower networks. These network structures and their individual nodes can be used to calculate which node in the network initiated a fake news campaign or circulated a fake news item.
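Two of the metadata signals described above, send frequency and the node that started a chain of shares, can be illustrated with a short Python sketch. The data layout, the function names `posts_per_hour` and `trace_origin`, and the account names are assumptions made for illustration. The article does not describe how the FKIE tool represents or computes these values.

```python
from datetime import datetime

def posts_per_hour(timestamps):
    """Average posting rate over the observed span of an account's posts."""
    if len(timestamps) < 2:
        return 0.0
    span_hours = (max(timestamps) - min(timestamps)).total_seconds() / 3600
    return len(timestamps) / span_hours if span_hours > 0 else float("inf")

def trace_origin(shares, start_user):
    """shares: list of (user, shared_from) pairs, where shared_from is None
    for an original post. Follows the links back from start_user until an
    account with no recorded source is reached."""
    source_of = dict(shares)
    user, seen = start_user, set()
    # The 'seen' set guards against cycles in inconsistent data.
    while source_of.get(user) is not None and user not in seen:
        seen.add(user)
        user = source_of[user]
    return user

# Invented example data.
timestamps = [
    datetime(2019, 2, 4, 12, 0),
    datetime(2019, 2, 4, 12, 30),
    datetime(2019, 2, 4, 13, 0),
    datetime(2019, 2, 4, 13, 30),
    datetime(2019, 2, 4, 14, 0),
]
shares = [("carol", "bob"), ("bob", "alice"), ("dave", "bob"), ("alice", None)]

print(posts_per_hour(timestamps))     # 5 posts over 2 hours
print(trace_origin(shares, "carol"))  # follows carol -> bob -> alice
```

A real follower network is far larger and less tidy than this single chain, so this sketch only shows the basic idea of walking a share graph back to its starting node.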

Another feature of the automated tool is its ability to detect hate speech. Posts that pose as news but also contain hate speech often link to fake news. “The important thing is to develop a marker capable of identifying clear cases of hate speech,” says the linguist and mathematician, citing expressions such as “political scum” and racial slurs as examples.

The team can adapt the system to classify different types of text. The tool can be used by public bodies as well as businesses to identify and combat fake news. “Our software can be personalized and trained to suit the needs of any customer. For public bodies, it can be a useful early warning system,” concludes Schade.