A Sentiment Analysis Service Platform For Streamed Multilingual Tweets

Ioanna Karageorgou, Panagiotis Liakos, Alex Delis

2020 IEEE International Conference on Big Data (Big Data), 2020

Abstract
Micro-blogging and social-media platforms are now prominent forums for disseminating information, opinions, and commentary. Among these, Twitter has a base of more than 330M users who continually produce and consume information snippets. Collectively, these users create a voluminous, multilingual corpus on a very broad range of topics every day. The discourse generated in the blogosphere is often of prime interest and importance to individuals, organizations, and companies. These actors would like to periodically receive an overall assessment of the "sentiments" expressed on specific issues, obtained by automatically classifying tweets written in different languages in conjunction with big-data analytics. In this paper, we propose a scalable service platform that employs multilingual sentiment analysis to classify streamed tweets and yields analytics for selected topics in real time. We discuss the main components of our Spark-enabled platform as we seek to offer an effective big-data service that can: 1) dynamically handle voluminous as well as high-rate tweet traffic through a multi-component application exploiting the latest software developments, 2) accurately identify messages originating from non-genuine user accounts, and 3) utilize the Spark machine-learning library (MLlib) to classify streamed multilingual messages in real time, using multiple, potentially distributed executors. To empower our service platform, we have adopted training sets and developed sentiment analysis (SA) models for English, French, and Greek that help classify streamed tweets with high accuracy. In experiments with our distributed analytical platform, we establish both accurate and real-time classification for tweets expressed in these three European languages.
Keywords
Service Platform for Language Analytics, Real-time Big-data Analysis for Streamed tweets, Spark-enabled Big-data Architecture, Classification of Multilingual tweets