Unsupervised Event Detection Using Self-learning-based Max-margin Clustering: Analysis on Streaming Tweets

Swati Gupta & Biplab Banerjee

ABSTRACT

In this paper, we propose an unsupervised approach for clustering tweets from a large-scale Twitter repository. The volume of data acquired from streaming media such as Twitter is vast, and it contains readily available information about important events taking place during the collection period. At this scale, it is difficult to deploy supervised learning strategies to analyze the tweets and extract meaningful information. In addition, tweets are unstructured, reflecting the diversity of the users who post them. An unsupervised tweet-processing technique can therefore be of great help for inference tasks such as event extraction and sentiment analysis. To address these bottlenecks, which affect the majority of existing techniques, we propose a novel unsupervised strategy for detecting events from streaming tweets, based on a self-learning max-margin clustering method that applies the notion of the SVM in an unsupervised setup. We evaluate the proposed system and compare it with popular techniques from the literature using 6.5 million streaming tweets collected in June 2017. In our experiments, self-learning-based max-margin clustering outperforms these techniques in terms of precision, Silhouette score, and Calinski–Harabasz score.
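The two internal cluster-quality measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the Euclidean distance, and the toy two-dimensional points are assumptions made for illustration, and real tweets would first need to be mapped to feature vectors. Both functions assume at least two clusters.

```python
import math


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _group(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette coefficient; values near 1 mean compact, well-separated clusters."""
    clusters = _group(points, labels)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(_dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(_dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def calinski_harabasz_score(points, labels):
    """Ratio of between-cluster to within-cluster dispersion; higher is better."""
    clusters = _group(points, labels)
    n, k, dim = len(points), len(clusters), len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for members in clusters.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * _dist(centroid, overall) ** 2
        within += sum(_dist(p, centroid) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Toy data (hypothetical): two tight groups that are far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]   # labels that follow the two groups
bad = [0, 1, 0, 1, 0, 1]    # labels that mix the two groups

print(silhouette_score(points, good), calinski_harabasz_score(points, good))
print(silhouette_score(points, bad), calinski_harabasz_score(points, bad))
```

A labeling that follows the true groups scores higher on both measures than one that mixes them, which is the sense in which a clustering method can be said to outperform another on these scores.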

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Swati Gupta

Swati Gupta is a research scholar in the Department of Computer Science and Engineering at the Indian Institute of Technology Roorkee, India. She received her MTech degree in information technology from the Indian Institute of Information Technology, Allahabad, in 2014, and her BTech degree in information technology from GBTU in 2011. Her research interests include text mining, machine learning and image processing.

Biplab Banerjee

Biplab Banerjee is currently an assistant professor in the Department of Computer Science, IIT Roorkee. He obtained his PhD from IIT Mumbai in 2015, and his thesis received the "Excellence in PhD" award from IIT Mumbai. He subsequently worked as a research assistant in the Image group, University of Caen, France, and in the Vision group of the Istituto Italiano di Tecnologia, Genova, Italy. He is the recipient of the prestigious "Early Career Research Award" from the Science & Engineering Board, DST, GoI. His research interests include computer vision, machine learning and deep learning. He is a member of the IEEE Signal Processing Society. His research findings have been published in leading journals from IEEE and Springer.
