Journal of Computer Information Systems Volume 51, 2010 - Issue 1

Original Articles

An Empirical Comparison of Four Text Mining Methods

Sangno Lee, Texas Tech University, Lubbock, Texas 79409

Jaeki Song, Texas Tech University, Lubbock, TX 79409; Sogang University, Seoul, Korea

Yongjin Kim, Sogang University, Seoul, Korea

Pages 1-10 | Received 16 Nov 2009, Accepted 01 Mar 2010, Published online: 11 Dec 2015

References

Ahrendt, P., Goutte, C., and Larsen, J., “Co-occurrence models in music genre classification”, IEEE International Workshop on Machine Learning for Signal Processing, 2005, 247–252.
Aldous, D., “Exchangeability and related topics”, Ecole d'Ete de Probabilites de Saint-Flour XII, Springer Lecture Notes in Mathematics, 1117, 1985, 1–198.
Androutsopoulos, I., Koutsias, J., Chandrinos, K., and Spyropoulos, C., “An experimental comparison of naive Bayesian and keyword-based anti-spam filtering with personal e-mail messages”, ACM New York, NY, USA, 2000, 160–167.
Bao, S., Xu, S., Zhang, L., Yan, R., Su, Z., Han, D., and Yu, Y., “Joint Emotion-Topic Modeling for Social Affective Text Mining”, Data Mining, 2009. ICDM '09. Ninth IEEE International Conference, 2009, 699–704.
Bellegarda, J., Naik, D., and Silverman, K., “Automatic junk e-mail filtering based on latent content”, 2003, 465–470.
Bergholz, A., Chang, J., Paaß, G., Reichartz, F., and Strobel, S., “Improved phishing detection using model-based features”, 2008.
Bíró, I., Szabó, J., and Benczúr, A., “Latent dirichlet allocation in web spam filtering”, ACM New York, NY, USA, 2008, 29–32.
Blei, D.M., and Lafferty, J.D., “A Correlated Topic Model of Science”, The Annals of Applied Statistics, 1 (1), 2007, 17–35.
Blei, D.M., and Lafferty, J.D., “Topic Models”, Ashok N. Srivastava and Mehran Sahami, eds., CRC Press, 2009.
Blei, D.M., Ng, A.Y., and Jordan, M.I., “Latent Dirichlet Allocation”, Journal of Machine Learning Research, 3, 2003, 993–1022.
Bosch, A., Zisserman, A., and Munoz, X., “Scene classification via pLSA”, Lecture Notes in Computer Science, 3954, 2006, 517–530.
Boyd-Graber, J., Blei, D., and Zhu, X., “A topic model for word sense disambiguation”, 2007, 1024–1033.
Chang, J., Boyd-Graber, J., Gerrish, S., Wang, C., and Blei, D., “Reading Tea Leaves: How Humans Interpret Topic Models”, Neural Information Processing Systems, 2009, 1–9.
Chen, Q., Tai, X., Jiang, B., Li, G., and Zhao, J., “Medical Image Retrieval Based on Latent Semantic Indexing”, Proceedings of the 2008 International Conference on Computer Science and Software Engineering, IEEE Computer Society, 2008, 561–564.
Cheung, K., Kwok, J.T., Law, M.H., and Tsui, K., “Mining customer product ratings for personalized marketing”, Decision Support Systems, 35, 2003, 231–243.
Chou, T.-C., and Chen, M.C., “Using Incremental PLSI for Threshold-Resilient Online Event Analysis”, Knowledge and Data Engineering, IEEE Transactions on, 20 (3), 2008, 289–299.
Das, S.R., and Chen, M.Y., “Yahoo! for Amazon: Sentiment extraction from small talk on the web”, Management Science, 53 (9), 2007, 1375–1388.
Deerwester, S., Dumais, S., Furnas, G., Landauer, T., and Harshman, R., “Indexing by Latent Semantic Analysis”, Journal of the American Society for Information Science, 41 (6), 1990, 391–407.
Ding, C.H.Q., “A probabilistic model for Latent Semantic Indexing: Research Articles”, Journal of the American Society for Information Science and Technology, 56 (6), 2005, 597–608.
Fuhr, N., “Probabilistic models in information retrieval”, The Computer Journal, 35 (3), 1992, 243–255.
Gansterer, W., Janecek, A., and Neumayer, R., “Spam filtering based on latent semantic indexing”, Survey of Text Mining II: Clustering, Classification, and Retrieval, 2008, 165–183.
Girolami, M., and Kaban, A., “On an equivalence between PLSI and LDA”, ACM New York, NY, USA, 2003, 433–434.
Greif, T., Horster, E., and Lienhart, R., “Correlated topic models for image retrieval”, Technical Report TR2008-09, University of Augsburg, 2008.
Herdiyeni, Y., Nurdiati, S., and Daud, I.A., “Image Semantic Extraction Using Latent Semantic Indexing on Image Retrieval Automatic-Annotation”, Proceedings of the 2009 International Conference of Soft Computing and Pattern Recognition, IEEE Computer Society, 2009, 283–288.
Hofmann, T., “Probabilistic latent semantic indexing”, SIGIR-99, ACM New York, NY, USA, 1999, 50–57.
Hofmann, T., “Unsupervised learning by probabilistic latent semantic analysis”, Machine Learning, 42 (1), 2001, 177–196.
Hofmann, T., Puzicha, J., and Jordan, M., “Unsupervised learning from dyadic data”, Advances in Neural Information Processing Systems, 11, 1999.
Ide, N., and Veronis, J., “Introduction to the special issue on word sense disambiguation: the state of the art”, Comput. Linguist., 24 (1), 1998, 2–40.
Kakkonen, T., Myller, N., and Sutinen, E., “Applying latent Dirichlet allocation to automatic essay grading”, Lecture Notes in Computer Science, 4139, 2006, 110–120.
Kakkonen, T., Myller, N., Sutinen, E., and Timonen, J., “Comparison of Dimension Reduction Methods for Automated Essay Grading”, Educational Technology & Society, 11 (3), 2008, 275–288.
Koller, D., and Friedman, N., “Probabilistic Graphical Models: Principles and Techniques”, The MIT Press, 2009.
Kongthon, A., Haruechaiyasak, C., and Thaiprayoon, S., “Expert Identification for Multidisciplinary R&D Project Collaboration”, PICMET 2009 Proceedings, 2009.
Kontostathis, A., and Pottenger, W., “A framework for understanding Latent Semantic Indexing (LSI) performance”, Information Processing and Management, 42 (1), 2006, 56–73.
Landauer, T.K., Foltz, P.W., and Laham, D., “An introduction to latent semantic analysis”, Discourse processes, 25, 1998, 259–284.
Larsen, K.R., Monarchi, D.E., Hovorka, D.S., and Bailey, C.N., “Analyzing unstructured text data: using latent categorization to identify intellectual communities in information systems”, Decision Support Systems, 45, 2008, 884–896.
Magatti, D., Calegari, S., Ciucci, D., and Stella, F., “Automatic Labeling Of Topics”, Ninth International Conference on Intelligent Systems Design and Applications, 2009, 1227–1232.
McCallum, A., Wang, X., and Corrada-Emmanuel, A., “Topic and role discovery in social networks with experiments on enron and academic email”, Journal of Artificial Intelligence Research, 30 (1), 2007, 249–272.
Mølgaard, L., Larsen, J., and Goutte, C., “Temporal analysis of text data using latent variable models”, IEEE International Workshop on Machine Learning for Signal Processing, 2009.
Papadimitriou, C.H., Raghavan, P., Tamaki, H., and Vempala, S., “Latent semantic indexing: A probabilistic analysis”, Journal of Computer and System Sciences, 61, 2000, 217–235.
Pons-Porrata, A., Berlanga-Llavori, R., and Ruiz-Shulcloper, J., “Topic discovery based on text mining techniques”, Information Processing and Management, 43 (3), 2007, 752–768.
Rodriguez, M., Ali, S., and Kanade, T., “Tracking in Unstructured Crowded Scenes”, The 12th IEEE International Conference on Computer Vision, 2009.
Romberg, S., Horster, E., and Lienhart, R., “Multimodal pLSA on visual features and tags”, The Institute of Electrical and Electronics Engineers Inc., 2009, 414–417.
Rosenfeld, R., “Two decades of statistical language modeling: where do we go from here?”, Proceedings of the IEEE, 88 (8), 2000, 1270–1278.
Salton, G., “Automatic text processing: the transformation, analysis, and retrieval of information by computer”, Addison-Wesley Longman Publishing Co., Inc. Boston, MA, USA, 1989.
Santhiappan, S., Gopalan, V.P., and Valarmathi, B., “Topic models based personalized spam filter”, Proceedings of ISCF, 2006, 199–203.
Sanz, E.P., Hidalgo, J.M.G., and Perez, J.C.C., “Email Spam Filtering”, Advances in Computers, 74, 2008, 45–109.
Sidorova, A., Evangelopoulos, N., Valacich, J., and Ramakrishnan, T., “Uncovering the intellectual core of the information systems discipline”, MIS Quarterly, 32 (3), 2008, 467–482.
Sivic, J., Russell, B.C., Efros, A.A., Zisserman, A., and Freeman, W.T., “Discovering objects and their location in images”, International Conference on Computer Vision (ICCV 2005), 2005.
Strunk Jr, W., “The elements of style”, Filiquarian Publishing, LLC., 2007.
Sun, J., Zhang, Q., Yuan, Z., Huang, W., Yan, X., and Dong, J., “Research of Spam Filtering System Based on LSA and SHA”, Springer, 2008, 340.
Tetlock, P.C., Saar-Tsechansky, M., and Macskassy, S., “More than words: Quantifying language to measure firms' fundamentals”, Journal of Finance, 63 (3), 2008, 1437–1467.
Titov, I., and McDonald, R., “A joint model of text and aspect ratings for sentiment summarization”, Urbana, 51, 2008, 308–316.
Wu, H., Wang, Y., and Cheng, X., “Incremental probabilistic latent semantic analysis for automatic question recommendation”, ACM New York, NY, USA, 2008, 99–106.
Xu, W., Liu, D., Guo, J., Cai, Y., and Hu, R., “Supervised Dual-PLSA for Personalized SMS Filtering”, Springer, 2009, 254–264.
Yang, W., and Dia, J., “Discovering cohesive subgroups from social networks for targeted advertising”, Expert Systems with Applications, 34, 2008, 2029–2038.
Zhai, H., Guo, J., Wu, Q., Cheng, X., Sheng, H., and Zhang, J., “Query Classification Based on Regularized Correlated Topic Model”, 2009.