Cloud Computing for Big Data Processing

Chinese WeChat and Blog Hot Words Detection Method Based on Chinese Semantic Clustering


Abstract

This paper proposes a hot topic detection method based on Chinese semantic clustering. The method targets the high dimensionality and information fragmentation of Chinese WeChat data. To handle the sparse and fragmented content of Chinese WeChat and blog data, we combine three strategies, namely repeated string computation, context adjacency analysis and linguistic rule filtering, to extract meaningful sentences that express independent and complete semantics. We then model the Chinese WeChat data in a relatively small and meaningful string space, generate candidate topics via feature clustering, and select the hot topics by ranking the candidates by heat. Experimental results on WeChat and blog data show that the method reduces the dimensionality of the blog's high-dimensional sparse space to some extent and is effective and feasible for WeChat hot topic detection.
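The first two strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that "repeated string computation" can be approximated by counting character n-grams, that "context adjacency analysis" can be approximated by the entropy of the characters on either side of a candidate, and that raw frequency can stand in for heat. The function names, thresholds and sample posts are all hypothetical, and linguistic rule filtering and feature clustering are left out.

```python
import math
from collections import Counter


def repeated_strings(docs, min_len=2, max_len=4, min_count=2):
    """Count character n-grams and keep those seen at least min_count times."""
    counts = Counter()
    for doc in docs:
        for n in range(min_len, max_len + 1):
            for i in range(len(doc) - n + 1):
                counts[doc[i:i + n]] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def _entropy(counter):
    """Shannon entropy (in bits) of a Counter of neighbour characters."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def adjacency_entropy(docs, candidate):
    """Return (left, right) entropy of the characters adjacent to candidate."""
    left, right = Counter(), Counter()
    n = len(candidate)
    for doc in docs:
        start = doc.find(candidate)
        while start != -1:
            # Text boundaries are treated as a single pseudo-neighbour (an assumption).
            left[doc[start - 1] if start > 0 else "<s>"] += 1
            end = start + n
            right[doc[end] if end < len(doc) else "</s>"] += 1
            start = doc.find(candidate, start + 1)
    return _entropy(left), _entropy(right)


def hot_candidates(docs, min_entropy=1.0):
    """Keep repeated strings with varied context on both sides, most frequent first."""
    kept = []
    for s, count in repeated_strings(docs).items():
        left_h, right_h = adjacency_entropy(docs, s)
        if left_h >= min_entropy and right_h >= min_entropy:
            kept.append((s, count))
    # Frequency is used here as a stand-in for the paper's heat score.
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical sample posts, invented for illustration only.
posts = ["我爱世界杯决赛", "看世界杯直播", "聊世界杯吧", "买世界杯球衣"]
print(hot_candidates(posts))  # → [('世界杯', 4)]
```

In this toy input the fragments "世界" and "界杯" repeat just as often as "世界杯", but each is always followed or preceded by the same character, so its adjacency entropy on that side is zero and it is discarded. This is how adjacency analysis can shrink the candidate string space before any clustering takes place.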
