ABSTRACT
This thesis presents a new keyword extraction algorithm, CFCKE_SE, which draws on the large body of semantic knowledge in the Word Sense Code and combines lexical chains with the semantic expansion degree. Keywords are disambiguated, and semantic relatedness and similarity are calculated, so that the characteristics of the chain to which each word belongs can be fully analyzed; the weight calculation used for extraction is then optimized. We find that the co-occurrence rate model largely reduces ambiguity and prevents synonyms from being expressed redundantly when they are combined. Applied to text containing many synonyms, the new method yields better results: the extracted keywords cover multiple topics comprehensively and accurately, and the aggregate F-measure, accuracy rate, and recall rate all improve over common algorithms.