
Unsupervised Event Detection Using Self-learning-based Max-margin Clustering: Analysis on Streaming Tweets

Swati Gupta & Biplab Banerjee
Pages 569-578 | Published online: 29 Aug 2018
 

ABSTRACT

In this paper, we propose an unsupervised approach for clustering tweets from a large-scale Twitter repository. The volume of data acquired from streaming media such as Twitter is vast, and it contains readily available information about important events taking place during the collection period. Given this volume, it is difficult to deploy supervised learning strategies, which require labeled data, to extract meaningful information from the tweets. Moreover, tweets are unstructured, reflecting the diversity of the end users who post them. An unsupervised tweet-processing technique can therefore be of great help for inference tasks such as event extraction and sentiment analysis. To address these bottlenecks, which affect the majority of existing techniques, we propose a novel unsupervised strategy for detecting events in streaming tweets. Specifically, we propose a self-learning max-margin clustering method that deploys the notion of the support vector machine (SVM) in an unsupervised setup. We evaluate the proposed system and compare it with popular techniques from the literature using 6.5 million streaming tweets collected in June 2017. In our experiments, self-learning-based max-margin clustering outperforms these techniques in terms of precision, Silhouette score, and Calinski–Harabasz score.
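
For readers unfamiliar with the two internal cluster-validity measures named above, the following is a minimal sketch of how the Silhouette score and the Calinski–Harabasz score are commonly defined. It is not the authors' implementation and does not reproduce the proposed clustering method. The function names, the use of Euclidean distance, and the toy two-dimensional points are assumptions made purely for illustration; real tweet data would first need to be converted to numeric feature vectors.

```python
import math
from collections import defaultdict


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _group(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def _centroid(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(coord) / len(members) for coord in zip(*members)]


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (between -1 and 1)."""
    clusters = _group(points, labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(_dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(_dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion (higher is better)."""
    clusters = _group(points, labels)
    n, k = len(points), len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")
    overall = _centroid(points)
    between = sum(
        len(m) * _dist(_centroid(m), overall) ** 2 for m in clusters.values()
    )
    within = sum(
        _dist(p, _centroid(m)) ** 2 for m in clusters.values() for p in m
    )
    if within == 0:
        return math.inf  # every point coincides with its cluster centroid
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]   # matches the visible grouping
mixed_labels = [0, 1, 0, 1, 0, 1]  # deliberately scrambled assignment

for name, labels in [("good", good_labels), ("mixed", mixed_labels)]:
    print(name, silhouette(toy_points, labels), calinski_harabasz(toy_points, labels))
```

On this toy data, the assignment that matches the visible grouping scores higher on both measures than the scrambled one, which is the sense in which a higher Silhouette or Calinski–Harabasz score indicates more compact, better-separated clusters.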

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Swati Gupta

Swati Gupta is a research scholar in the Department of Computer Science and Engineering at the Indian Institute of Technology Roorkee, India. She received her MTech degree in information technology from the Indian Institute of Information Technology, Allahabad, in 2014, and her BTech degree in information technology from GBTU in 2011. Her research interests include text mining, machine learning, and image processing.

Biplab Banerjee

Biplab Banerjee is currently an assistant professor in the Department of Computer Science, IIT Roorkee. He obtained his PhD from IIT Mumbai in 2015, and his thesis received the “Excellence in PhD” award from IIT Mumbai. He subsequently worked as a research assistant in the Image group, University of Caen, France, and in the Vision group of the Istituto Italiano di Tecnologia, Genova, Italy. He is the recipient of the prestigious ‘Early Career Research Award’ from the Science & Engineering Board, DST, GoI. His research interests include computer vision, machine learning, and deep learning. He is a member of the IEEE Signal Processing Society. His research findings have been published in leading journals from IEEE and Springer.
