Research Article

A study on the evaluation of tokenizer performance in natural language processing

Article: 2175112 | Received 17 Jun 2022, Accepted 27 Jan 2023, Published online: 09 Feb 2023
