Applied Artificial Intelligence

An International Journal

Volume 37, 2023 - Issue 1

Open access

Research Article

A study on the evaluation of tokenizer performance in natural language processing

Sanghyun Choo, Edward P. Fitts Department of Industrial and Systems Engineering, North Carolina State University, Raleigh, NC, USA
ORCID: https://orcid.org/0000-0002-8884-3437

Wonjoon Kim (corresponding author), Division of Future Convergence (HCI Science Major), Dongduk Women’s University, Seoul, South Korea
ORCID: https://orcid.org/0000-0001-5177-8072

Article: 2175112 | Received 17 Jun 2022, Accepted 27 Jan 2023, Published online: 09 Feb 2023

https://doi.org/10.1080/08839514.2023.2175112

Figures & data

Figure 1. Illustration of 10-fold cross-validation in this study.

Figure 2. Illustration of the overall research flow.

Table 1. Accuracy of SentencePiece and Mecab-Ko for each classification algorithm.

Table 2. Precision of SentencePiece and Mecab-Ko for each classification algorithm.

Table 3. Recall of SentencePiece and Mecab-Ko for each classification algorithm.

Table 4. F1-score of SentencePiece and Mecab-Ko for each classification algorithm.

Figure 3. Accuracy of each learning algorithm as a function of the number of SentencePiece tokens.

Figure 4. F1-score of each learning algorithm as a function of the number of SentencePiece tokens.
