ABSTRACT
We explore the interplay between domain-informed feature engineering, model performance, and model interpretability. This hybrid modelling and simulation study combines discrete-event simulation with alternative metamodelling techniques for modelling patient flow in health care. We consider two cases: a tandem queueing system of obstetric hospital units and a transient analysis of an outpatient clinic in which a finite number of scheduled patients arrive for care. We use several metamodels, including various types of linear models, random forests, and neural networks. We evaluate how much metamodel estimation improves when supplemented with queueing theory knowledge. We consider three knowledge levels: no knowledge (no queueing-inspired features), basic (simple queueing features), and advanced (sophisticated queueing approximations). Our results show that queueing-related inputs improve the accuracy of the metamodels, regardless of model type. Moreover, queueing-related inputs improve model explainability and can lead to more parsimonious models. This has positive practical implications for implementing these types of models in actual health-care analytics projects.
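To make the three knowledge levels concrete, the following minimal Python sketch builds a metamodel input vector for a single unit at each level. It is an illustration only, not the code from the study: the function names, the dictionary keys (`arrival_rate`, `mean_los`, `capacity`), and the choice of the Erlang C waiting probability as the "advanced" feature are assumptions made for this example. The study's advanced features are more sophisticated queueing approximations.

```python
def offered_load(arrival_rate, mean_los):
    """Basic feature: mean number of occupied beds, a = lambda * E[S] (Little's law)."""
    return arrival_rate * mean_los


def erlang_c(load, servers):
    """Probability that an arrival must wait in an M/M/c queue (Erlang C).

    Computed from the Erlang B recursion for numerical stability.
    Returns 1.0 when the queue is unstable (load >= servers).
    """
    if load >= servers:
        return 1.0
    b = 1.0  # Erlang B with zero servers
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)
    rho = load / servers
    return b / (1.0 - rho * (1.0 - b))


def build_features(scenario, level="none"):
    """Return metamodel inputs for one scenario at a given knowledge level.

    scenario: dict with hypothetical keys 'arrival_rate' (per hour),
              'mean_los' (hours), and 'capacity' (beds).
    level:    'none', 'basic', or 'advanced'.
    """
    if level not in ("none", "basic", "advanced"):
        raise ValueError(f"unknown knowledge level: {level}")

    features = dict(scenario)  # raw simulation inputs are always included
    load = offered_load(scenario["arrival_rate"], scenario["mean_los"])

    if level in ("basic", "advanced"):
        features["load"] = load
        features["rho"] = load / scenario["capacity"]
    if level == "advanced":
        # Assumed stand-in for a more sophisticated queueing approximation.
        features["prob_wait"] = erlang_c(load, scenario["capacity"])
    return features
```

For example, `build_features({"arrival_rate": 0.5, "mean_los": 4.0, "capacity": 3}, level="basic")` adds the offered load and utilization to the three raw inputs, and `level="advanced"` additionally adds the waiting probability. The resulting dictionaries could then be assembled into a table and passed to any of the metamodel types.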
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data/Material
The simulation model, along with supporting software for input and output processing and documentation, is available as part of a free and open-source project called obflowsim-mm. This project is available on GitHub at https://github.com/misken/obflowsim-mm. It includes all of the source code used to create the simulation inputs, the simulation model itself, and the code for simulation output processing. The project also includes code for metamodel data preparation, fitting, and output processing. All of the code for Case 1 is written in Python and uses well-known libraries such as SimPy, pandas, and scikit-learn. The project includes a Jupyter notebook that explains how to run the various stages of the code pipeline. The code for the Case 2 analysis was written in R and can be found in a separate GitHub repository at https://github.com/misken/op_clinic_mm.