Cogent Engineering Volume 10, 2023 - Issue 1

Open access

Computer Science

A comprehensive comparison and analysis of machine learning algorithms including evaluation optimized for geographic location prediction based on Twitter tweets datasets

Hasti Samadi, School of Computing and Information Systems, The University of Melbourne, Parkville, Australia

https://orcid.org/0000-0002-3689-6668

Mohammed Ahsan Kollathodi, School of Computing and Information Systems, The University of Melbourne, Parkville, Australia. Correspondence: [email protected]

https://orcid.org/0000-0002-4229-0603

Article: 2232602 | Received 17 Mar 2022, Accepted 25 Jun 2023, Published online: 04 Aug 2023

https://doi.org/10.1080/23311916.2023.2232602

Figures & data

Figure 1. The frequency distribution for classes of duplicated tweets.

Figure 3. A comparison between training set performance and the development set performance.

Figure 2. Histogram of the target labels.

Figure 4. Training and development set confusion matrix.

Figure 5. Development set performance.

Figure 6. Training set performance after upsampling.

Figure 7. Training and development set confusion matrix.

Figure 9. Class distribution after upsampling and downsampling.

Figure 8. Class distribution after upsampling and downsampling.

Figure 12. Development set performance after upsampling and downsampling.

Figure 13. Confusion matrix after another round of upsampling and downsampling.

Figure 10. Confusion matrix after upsampling and downsampling.

Figure 11. Training and development set performance after upsampling and downsampling.

Figure 14. Development set performance on different classifiers (precision, recall and F1-score).

Figure 15. The development set performance for k equal to 3.

Figure 16. The development set performance for k equal to 5.

Figure 17. Development set performance for k equal to 10.

Figure 18. Training and development set performance for rfc_1.

Figure 19. Training and development set performance of rfc_2.

Figure 20. Training and development set performance and confusion matrix for rfc_3 (after upsampling and downsampling have been applied).

Figure 21. Training and development set performance for rfc_4.

Figure 22. Training and development set performance evaluation for rfc_5.

Figure 23. Class distribution with upsampling.

Figure 24. Confusion matrix after upsampling and downsampling.

Figure 25. Training and development set performance for rfc_7 random forest classifier.

Figure 26. Training and development set performance for rfc_8.

Figure 27. Confusion matrix after two rounds of upsampling and downsampling.

Figure 28. Confusion matrix for training and development set (after upsampling and downsampling).

Figure 29. Training and development set performance for rfc_9.

Figure 30. Training and development set performance after upsampling and downsampling.

Figure 31. Confusion matrix for the training set and development set, after upsampling and downsampling.

Figure 32. Training and development set performance for rfc_11.

Figure 33. Confusion matrix for training and development set (after upsampling and downsampling).

Table 1. Optimal training set performance

Table 2. Optimal development set performance

Figure 34. The concept of precision and recall.

Figure 35. Final comparative results of performance among different classifiers.
