Abstract
Web forums have become a means of online communication and an information-sharing source for learning about health care and related treatments. Using web crawlers and natural language processing techniques, we propose an approach that automatically identifies concerned HIV-related messages, to help health authorities and social support groups provide instant counseling. The proposed supervised GA/k-means classification approach helps construct an effective identification and classification model with acceptable classification performance, and it is fully flexible in that different fitness functions can be developed to suit different requirements. Furthermore, correspondence analysis identifies the terms used most frequently in concerned HIV-related messages, which focus on risky sexual behavior, whereas unconcerned messages are those of the worried well.
Acknowledgements
We would like to thank the National Science Council of the Republic of China (Taiwan) for financial support under contract number NSC94-2416-H-155-014.