Predicting research projects’ output using machine learning for tailored projects management


ABSTRACT

With the increasing interest and investment in research and development (R&D), the need for more efficient research project management has grown. Accordingly, we built prediction models to classify research projects that were expected to show excellent research output. Specifically, we applied five machine learning techniques to build prediction models. In an empirical analysis of data on research projects funded by South Korea over the last five years (2014–2018), we found that the automated machine learning model (autoML), in which the machine builds the most suitable learning model, shows relatively greater and more robust performance than models based on other techniques. We also established that research funding and project type played the most important roles in predicting excellent research projects. This study is significant because it shows the need for a paradigm shift in building an evidence-based project management system by verifying the utility and applicability of a data-driven approach in R&D project management.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 The South Korean government’s R&D investment has increased continuously since 1964 and surpassed KRW 20 trillion (≈ USD 17.1 billion) for the first time in 2019. The R&D budget for 2020 is KRW 24 trillion (≈ USD 20.5 billion), a remarkable increase of 17.3% compared to the previous year.

2 The number of government-funded research projects conducted in 2019 in South Korea was approximately 70,000, showing a 22.6% growth compared to 2015 (Lee & Yoo, 2020).

3 In a preliminary study, we compared the prediction performance between classical and AI-based approaches. The results unequivocally demonstrate that AI-based approaches exhibit a significant superiority over classical approaches. This substantiates the importance of incorporating advanced quantitative methods like AI to effectively address our research problem. For comprehensive experimental findings, please refer to Supplemental S1.

4 AI techniques have recently shown remarkable gains in performance, already exceeding human judgment or prediction in various fields. These advances are being applied across the public sector, from image and voice recognition to security and healthcare, contributing to greater social value.

5 NTIS operates and discloses the National R&D Information Standard Database. As of 2017, a total of 422 organizations collect information, including representative specialized agencies (17 agencies) and project management agencies (125 agencies) that manage R&D projects in each government ministry.

6 For simplicity, only the values of the top three codes of each categorical variable were reported.

7 Naïve Bayes, Support Vector Machine, Random Forest, TabNet, and autoML.

8 There are a total of seven algorithms included in autoML: Distributed random forest, Generalized linear model, XGBoost Gradient boosting algorithm, H2O Gradient boosting algorithm, Deep learning, and Stacked ensemble.

Additional information

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government [grant number 2019R1F1A1063365].

Notes on contributors

Huijae Kim

Huijae Kim is a Ph.D. student in the Department of Industrial and Systems Engineering at KAIST, Korea. Her research interests primarily focus on data analytics and optimisation. Kim received her MS degree from the Department of Industrial and Systems Engineering at KAIST.

Hoon Jang

Hoon Jang is an associate professor in the College of Global Business at Korea University, Korea. His research interests are primarily in the areas of complex system design, data-driven modelling, and applied operations management problems. Dr. Jang obtained his MS and PhD degrees from the Department of Industrial and Systems Engineering at KAIST.
