Research Article

Parsing AUC Result-Figures in Machine Learning Specific Scholarly Documents for Semantically-enriched Summarization

Article: 2004347 | Received 19 Mar 2021, Accepted 04 Nov 2021, Published online: 14 Nov 2021

References

  • Al-Zaidy, R. A., and C. L. Giles. 2015. Automatic extraction of data from bar charts. In Proceedings of the 8th International Conference on Knowledge Capture, 323. ACM, Palisades, NY, USA. October.
  • Al-Zaidy, R. A., and C. L. Giles (2017, February). A machine learning approach for semantic structuring of scientific charts in scholarly documents. In Twenty-Ninth IAAI Conference, San Francisco, California, USA.
  • Barros, C., E. Lloret, E. Saquete, and B. Navarro-Colorado. 2019. NATSUM: Narrative abstractive summarization through cross-document timeline generation. Information Processing & Management 56 (5):1775–349. doi:10.1016/j.ipm.2019.02.010.
  • Beaulieu, M., M. Gatford, X. Huang, S. Robertson, S. Walker, and P. Williams. 1997. Okapi at TREC-5. In NIST SPECIAL PUBLICATION SP, Netherlands, 143–66.
  • Bhatia, S., and P. Mitra. 2012. Summarizing figures, tables, and algorithms in scientific publications to augment search results. ACM Transactions on Information Systems (TOIS) 30:3.
  • Chen, C., R. Zhang, E. Koh, S. Kim, S. Cohen, T. Yu, R. Rossi, and R. Bunescu (2019). Neural caption generation over figures. In Adjunct Proceedings of the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing and the 2019 International Symposium on Wearable Computers (UbiComp/ISWC ’19 Adjunct), September 9–13, 2019, London, United Kingdom. ACM, New York, NY, USA.
  • Chen, K., M. Seuret, M. Liwicki, J. Hennebert, and R. Ingold (2015, August). Page segmentation of historical document images with convolutional autoencoders. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR) (pp. 1011–15). IEEE, Nancy, France.
  • Choudhury, P. S., S. Wang, and L. Giles (2015). Automated data extraction from scholarly line graphs. In GREC, Nancy, France.
  • Choudhury, S. R., S. Tuarob, P. Mitra, L. Rokach, A. Kirk, S. Szep, …, and C. L. Giles (2013, July). A figure search engine architecture for a chemistry digital library. In Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries (pp. 369–70). ACM, Indianapolis, IN, USA.
  • Choudhury, S. R., S. Wang, and C. L. Giles (2016, June). Curve separation for line graphs in scholarly documents. In 2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL) (pp. 277–78). IEEE, Newark, NJ, USA.
  • Clark, C., and S. Divvala, 2016. PDFFigures 2.0: Mining figures from research papers, in: Digital Libraries (JCDL), 2016 IEEE/ACM Joint Conference On. IEEE, pp. 143–52, Newark, NJ, USA.
  • Cliche, M., D. Rosenberg, D. Madeka, and C. Yee (2017, September). Scatteract: Automated extraction of data from scatter plots. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 135–50). Springer, Cham, Skopje, Macedonia.
  • Deng, J., W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, 2009. ImageNet: A large-scale hierarchical image database, in: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference On. IEEE, pp. 248–55, Miami, Florida, USA.
  • Goldberg, A. B., N. Fillmore, D. Andrzejewski, Z. Xu, B. Gibson, and X. Zhu (2009, May). May all your wishes come true: A study of wishes and how to recognize them. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 263–71). Association for Computational Linguistics, Boulder, Colorado, USA.
  • Hassan, S. U., A. Akram, and P. Haddawy (2017, June). Identifying important citations using contextual information from full text. In 2017 ACM/IEEE Joint Conference on Digital Libraries (JCDL) (pp. 1–8). IEEE, Toronto, ON, Canada.
  • Hassan, S. U., I. Safder, A. Akram, and F. Kamiran. 2018b. A novel machine-learning approach to measuring scientific knowledge flows using citation context analysis. Scientometrics 116 (2):973–96. doi:10.1007/s11192-018-2767-x.
  • Hassan, S. U., M. Imran, S. Iqbal, N. R. Aljohani, and R. Nawaz. 2018a. Deep context of citations using machine-learning models in scholarly full-text articles. Scientometrics 117 (3):1645–62. doi:10.1007/s11192-018-2944-y.
  • Hassan, S. U., M. Imran, T. Iftikhar, I. Safder, and M. Shabbir (2017, November). Deep stylometry and lexical & syntactic features based author attribution on PLoS digital repository. In International conference on Asian digital libraries (pp. 119–27). Springer, Cham, Bangkok, Thailand.
  • Hassan, S. U., P. Haddawy, P. Kuinkel, A. Degelsegger, and C. Blasy. 2012. A bibliometric study of research activity in ASEAN related to the EU in FP7 priority areas. Scientometrics 91 (3):1035–51. doi:10.1007/s11192-012-0665-1.
  • He, D., S. Cohen, B. Price, D. Kifer, and C. L. Giles (2017, November). Multi-scale multi-task FCN for semantic page segmentation and table detection. In 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (Vol. 1, pp. 254–61). IEEE, Kyoto, Japan.
  • Iqbal, S., S. U. Hassan, N. R. Aljohani, S. Alelyani, R. Nawaz, and L. Bornmann. 2021. A decade of in-text citation analysis based on natural language processing and machine learning techniques: An overview of empirical studies. Scientometrics 126 (8):6551–99. doi:10.1007/s11192-021-04055-1.
  • Iqbal, W., J. Qadir, G. Tyson, A. N. Mian, S. U. Hassan, and J. Crowcroft. 2019. A bibliometric analysis of publications in computer networking research. Scientometrics 119 (2):1121–55. doi:10.1007/s11192-019-03086-z.
  • Jung, D., W. Kim, H. Song, J. I. Hwang, B. Lee, B. Kim, and J. Seo (2017, May). ChartSense: Interactive data extraction from chart images. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 6706–17). ACM, Denver, CO, USA.
  • Kahou, S. E., V. Michalski, A. Atkinson, Á. Kádár, A. Trischler, and Y. Bengio. 2017. FigureQA: An annotated figure dataset for visual reasoning. ICLR, Vancouver, BC, Canada.
  • Khabsa, M., P. Treeratpituk, and C. L. Giles (2012, June). AckSeer: A repository and search engine for automatically extracted acknowledgments from digital libraries. In Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries (pp. 185–94). ACM, Washington, DC, USA.
  • Lee, P. S., J. D. West, and B. Howe. 2017. Viziometrics: Analyzing visual information in the scientific literature. IEEE Transactions on Big Data 4 (1):117–29. doi:10.1109/TBDATA.2017.2689038.
  • Li, Z., M. Stagitis, S. Carberry, and K. F. McCoy (2013, July). Towards retrieving relevant information graphics. In Proceedings of the 36th international ACM SIGIR conference on research and development in information retrieval (pp. 789–92). ACM, Dublin, Ireland.
  • Lin, C. Y. 2004. ROUGE: A package for automatic evaluation of summaries. In Text summarization branches out, 74–81, Spain.
  • Liu, X., K. H. Ghazali, F. Han, and I. I. Mohamed. 2021. Automatic detection of oil palm tree from UAV images based on the deep learning method. Applied Artificial Intelligence 35 (1):13–24. doi:10.1080/08839514.2020.1831226.
  • Liu, Y., K. Bai, P. Mitra, and C. L. Giles (2007, June). TableSeer: Automatic table metadata extraction and searching in digital libraries. In Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries (pp. 91–100). ACM, Vancouver, BC, Canada.
  • Mohamed, M., and M. Oussalah. 2019. SRL-ESA-TextSum: A text summarization approach based on semantic role labeling and explicit semantic analysis. Information Processing & Management 56 (4):1356–72. doi:10.1016/j.ipm.2019.04.003.
  • Moraes, P., G. Sina, K. McCoy, and S. Carberry, 2014. Generating summaries of line graphs, in: Proceedings of the 8th International Natural Language Generation Conference (INLG). pp. 95–98, Philadelphia, Pennsylvania, USA.
  • Mutlu, B., E. A. Sezer, and M. A. Akcayol. 2019. Multi-document extractive text summarization: A comparative assessment on features. Knowledge-Based Systems, Amsterdam.
  • Pavičić, J., Ž. Andreić, T. Malvić, R. Rajić, and J. Velić, 2018. Application of Simpson’s and trapezoidal formulas for volume calculation of subsurface structures-recommendations, in: 2nd Croatian Scientific Congress from Geomathematics and Terminology in Geology, Croatia.
  • Qian, X., M. Li, Y. Ren, and S. Jiang. 2019. Social media based event summarization by user–text–image co-clustering. Knowledge-Based Systems 164:107–21. doi:10.1016/j.knosys.2018.10.028.
  • Rahi, S., I. Safder, S. Iqbal, S. U. Hassan, and R. Nawaz (2019). Citation classification using natural language processing and machine learning models. In Proceedings of the Conference on Smart Information & Communication Technologies (SmartICT’19), Oujda, Morocco.
  • Ray Choudhury, S., P. Mitra, and C. L. Giles (2015, September). Automatic extraction of figures from scholarly documents. In Proceedings of the 2015 ACM Symposium on Document Engineering (pp. 47–50). ACM.
  • Saba, T., A. Rehman, A. Al-Dhelaan, and M. Al-Rodhaan. 2014. Evaluation of current documents image denoising techniques: A comparative study. Applied Artificial Intelligence 28 (9):879–87. doi:10.1080/08839514.2014.954344.
  • Safder, I., J. Sarfraz, S. U. Hassan, M. Ali, and S. Tuarob (2017, November). Detecting target text related to algorithmic efficiency in scholarly big data using recurrent convolutional neural network model. In International conference on Asian digital libraries (pp. 30–40). Springer, Cham, Bangkok, Thailand.
  • Safder, I., and S. U. Hassan (2018, November). DS4A: Deep search system for algorithms from full-text scholarly big data. In 2018 IEEE International Conference on Data Mining Workshops (ICDMW) (pp. 1308–15). IEEE, Singapore.
  • Safder, I., and S. U. Hassan. 2019. Bibliometric-enhanced information retrieval: A novel deep feature engineering approach for algorithm searching from full-text publications. Scientometrics 119 (1):257–77. doi:10.1007/s11192-019-03025-y.
  • Safder, I., S. U. Hassan, A. Visvizi, T. Noraset, R. Nawaz, and S. Tuarob. 2020. Deep learning-based extraction of algorithmic metadata in full-text scholarly documents. Information Processing & Management 57 (6):102269. doi:10.1016/j.ipm.2020.102269.
  • Safder, I., S. U. Hassan, and N. R. Aljohani (2018, April). AI cognition in searching for relevant knowledge from scholarly big data, using a multi-layer perceptron and recurrent convolutional neural network model. In Companion Proceedings of the The Web Conference 2018 (pp. 251–58). International World Wide Web Conferences Steering Committee, Lyon, France.
  • Said, A., T. D. Bowman, R. A. Abbasi, N. R. Aljohani, S. U. Hassan, and R. Nawaz. 2019. Mining network-level properties of Twitter altmetrics data. Scientometrics 120 (1):217–35. doi:10.1007/s11192-019-03112-0.
  • Sermanet, P., D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun (2013). OverFeat: Integrated recognition, localization and detection using convolutional networks. In International Conference on Learning Representations (ICLR), arXiv:1312.6229, Banff, AB, Canada.
  • Siegel, N., N. Lourie, R. Power, and W. Ammar (2018, May). Extracting scientific figures with distantly supervised neural networks. In Proceedings of the 18th ACM/IEEE on joint conference on digital libraries (pp. 223–32). ACM, Fort Worth, TX, USA.
  • Siegel, N., Z. Horvitz, R. Levin, S. Divvala, and A. Farhadi, 2016. FigureSeer: Parsing result-figures in research papers, in: European Conference on Computer Vision. Springer, pp. 664–80, Amsterdam, The Netherlands.
  • Sinoara, R. A., J. Camacho-Collados, R. G. Rossi, R. Navigli, and S. O. Rezende. 2019. Knowledge-enhanced document embeddings for text classification. Knowledge-Based Systems 163:955–71. doi:10.1016/j.knosys.2018.10.026.
  • Takimoto, H., F. Omori, and A. Kanagawa. 2021. Image aesthetics assessment based on multi-stream CNN architecture and saliency features. Applied Artificial Intelligence 35 (1):25–40. doi:10.1080/08839514.2020.1839197.
  • Tallarida, R. J., and R. B. Murray (1987). Area under a curve: Trapezoidal and Simpson’s rules. In Manual of Pharmacologic Calculations (pp. 77–81). Springer, New York, NY.
  • Thepade, S. D., and P. R. Chaudhari. 2021. Land usage identification with fusion of Thepade SBTC and Sauvola thresholding features of aerial images using ensemble of machine learning algorithms. Applied Artificial Intelligence 35 (2):154–70. doi:10.1080/08839514.2020.1842627.
  • Tsutsui, S., and D. J. Crandall (2017, November). A data driven approach for compound figure separation using convolutional neural networks. In 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (Vol. 1, pp. 533–40). IEEE, Kyoto, Japan.
  • Tuarob, S., S. Bhatia, P. Mitra, and C. L. Giles. 2016. AlgorithmSeer: A system for extracting and searching for algorithms in scholarly big data. IEEE Transactions on Big Data 2 (1):3–17. doi:10.1109/TBDATA.2016.2546302.
  • Unar, S., X. Wang, C. Wang, and Y. Wang. 2019. A decisive content based image retrieval approach for feature fusion in visual and textual images. Knowledge-Based Systems 179:8–20. doi:10.1016/j.knosys.2019.05.001.
  • Xu, J., F. Huang, X. Zhang, S. Wang, C. Li, Z. Li, and Y. He. 2019. Visual-textual sentiment classification with bi-directional multi-level attention networks. Knowledge-Based Systems 178:61–73. doi:10.1016/j.knosys.2019.04.018.
  • Zha, H., W. Chen, K. Li, and X. Yan (2019, July). Mining algorithm roadmap in scientific publications. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 1083–92), Anchorage, AK, USA.
  • Zhao, Z., H. Zhu, Z. Xue, Z. Liu, J. Tian, M. C. H. Chua, and M. Liu. 2019. An image-text consistency driven multimodal sentiment analysis approach for social media. Information Processing & Management 56 (6):102097. doi:10.1016/j.ipm.2019.102097.
  • Zhu, J., Y. Yang, Q. Xie, L. Wang, and S. U. Hassan. 2014. Robust hybrid name disambiguation framework for large databases. Scientometrics 98 (3):2255–74. doi:10.1007/s11192-013-1151-0.