Using machine learning and robotic process automation for misinformation detection on Twitter: Analysing the tweets on Covid-19 pandemic
Pragya Lahoti, Suneel Prasad
- Publication year
- 2022
- Citation count
- 2
- Access
- Open access
Abstract
Social media is a place where vast amounts of data are continually generated. These days, news first surfaces on microblogs before it passes to major media outlets. Microblogging websites are rich sources of information, and Twitter is one such platform. Twitter is also widely used to share information with other social network users. Event detection systems based on Twitter can draw information from the huge number of tweets posted by users. This is among the main reasons Twitter is considered an effective data source: it can provide substantial near real-time data for identifying incidents. However, it also creates a low perception problem: systems fail to identify events correctly if too much false information, called rumor, is included. A rumor can be characterized as a statement whose truth value is unverifiable or that is intentionally false. Rumor detection has recently been studied to allow for accurate event detection. Rumors can propagate to millions of users quickly without fact-checking, and they may cause significant harm. More precisely, the current technique detects rumors by identifying and analyzing the tweet's content, retweet count, sentiment, follower count, and other features from the Twitter metadata, which can be useful for classifying a tweet as rumor or non-rumor. A systematic literature review of existing research on machine learning techniques for misinformation detection was carried out to arrive at the approach taken in this paper. In this paper, tweets posted during the Covid-19 pandemic are considered for misinformation detection. A two-way approach is taken to classify Twitter messages (tweets) as rumor or non-rumor. The first approach is text-based analysis, while the other is media-based analysis. For the first approach, different machine learning classifiers were applied and evaluated based on the F1-score.
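To make the evaluation step concrete, the following is a minimal sketch of how an F1-score could be computed for rumor versus non-rumor predictions and used to compare classifiers. The labels, predictions, and classifier names are hypothetical illustrations, not data or models from the paper, and treating "rumor" as the positive class is an assumption.

```python
def f1_score(y_true, y_pred, positive="rumor"):
    """F1-score for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined)
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and classifier outputs, for illustration only
y_true = ["rumor", "rumor", "non-rumor", "non-rumor", "rumor", "non-rumor"]
predictions = {
    "classifier_a": ["rumor", "non-rumor", "non-rumor", "non-rumor", "rumor", "rumor"],
    "classifier_b": ["rumor", "rumor", "non-rumor", "rumor", "rumor", "non-rumor"],
}

# Score every classifier and pick the one with the highest F1
scores = {name: f1_score(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
print("best:", best)
```

F1 is a reasonable comparison metric when rumor and non-rumor tweets are imbalanced, because it penalizes both missed rumors and false alarms rather than rewarding a classifier for predicting the majority class.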
In the second approach, tweets containing images are extracted for web detection using Robotic Process Automation. Here, Google Cloud Vision is used to match the specified images against images on the web to find their original or multiple sources and thereby assess their authenticity. In this way, text-based and media-based messages containing falsified details can be detected.
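The media-based step can be sketched as a small post-processing function over a web detection result. The example below assumes the JSON shape returned by the Cloud Vision REST API for the web detection feature (`webDetection` with `fullMatchingImages`, `partialMatchingImages`, and `pagesWithMatchingImages`). The function name `summarize_web_detection`, the sample URLs, and the `found_on_web` flag are hypothetical; the abstract does not state how the authors interpret the matches.

```python
from urllib.parse import urlparse


def summarize_web_detection(response):
    """Summarize where an image appears on the web from a web detection result."""
    web = response.get("webDetection", {})
    full = [img["url"] for img in web.get("fullMatchingImages", []) if "url" in img]
    partial = [img["url"] for img in web.get("partialMatchingImages", []) if "url" in img]
    pages = [pg["url"] for pg in web.get("pagesWithMatchingImages", []) if "url" in pg]
    # Distinct hosting domains give a rough view of the image's sources
    domains = sorted({urlparse(u).netloc for u in pages if urlparse(u).netloc})
    return {
        "full_matches": len(full),
        "partial_matches": len(partial),
        "source_domains": domains,
        # Assumption: any match means the image has at least one traceable source
        "found_on_web": bool(full or partial or pages),
    }


# Hypothetical response for one tweet image, for illustration only
sample_response = {
    "webDetection": {
        "fullMatchingImages": [{"url": "https://example.org/images/photo.jpg"}],
        "pagesWithMatchingImages": [
            {"url": "https://example.org/news/story"},
            {"url": "https://example.com/blog/post"},
        ],
    }
}

summary = summarize_web_detection(sample_response)
print(summary)
```

In a full pipeline, the response would come from a Cloud Vision web detection request for each extracted tweet image. The summary only reports where the image appears, so judging authenticity from those sources remains a separate decision.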