Abstract
A simulation study was conducted to compare the accuracy of neural networks and logistic regression in identifying populations at high risk for occupational back injury. In contrast to most standard regression techniques, neural networks do not assume linearity or require the form of the association to be specified explicitly. Because the underlying relationships between work exposures, personal risk factors, and injury are often not well defined, neural networks may prove useful for injury risk assessment. Accuracy was assessed by comparing each worker's injury status with that worker's predicted level of risk. In simulations of a non-linear association, workers in the training data were correctly classified 85% of the time with neural networks, 74% of the time with the main effects logistic model, and 79% of the time with the fully specified logistic model. In the test data, however, workers were correctly classified 67% of the time with neural networks, and 71% and 69% of the time with the main effects and fully specified logistic models, respectively. Simulations of a null association indicated that neural networks may be more likely than logistic regression to overfit random associations. These findings offer guidance on the choice of statistical methodology for identifying high-risk worker populations: the higher training accuracy of neural networks did not carry over to the test data.
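The accuracy measure described above, the percentage of workers whose injury status matches their predicted risk class, can be illustrated with a minimal Python sketch. The 0.5 risk cutoff, the function names, and the toy data are assumptions made for illustration only; the abstract does not state the cutoff used, and the numbers below are not the study's data.

```python
def classify(predicted_risk, threshold=0.5):
    """Label a worker high risk (1) when predicted risk meets the cutoff.

    The 0.5 default is an assumed cutoff, not one taken from the study.
    """
    return [1 if p >= threshold else 0 for p in predicted_risk]


def percent_correct(injury_status, predicted_risk, threshold=0.5):
    """Percentage of workers whose predicted class equals observed status."""
    if len(injury_status) != len(predicted_risk):
        raise ValueError("inputs must have the same length")
    if not injury_status:
        raise ValueError("inputs must not be empty")
    predicted = classify(predicted_risk, threshold)
    hits = sum(1 for y, y_hat in zip(injury_status, predicted) if y == y_hat)
    return 100.0 * hits / len(injury_status)


# Hypothetical observed injury status (1 = injured) and model-predicted risks.
train_status = [1, 0, 1, 0, 1, 0, 0, 1]
train_risk = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.3, 0.8]
test_status = [1, 0, 1, 0, 1, 0, 0, 1]
test_risk = [0.6, 0.7, 0.4, 0.2, 0.8, 0.3, 0.55, 0.9]

train_accuracy = percent_correct(train_status, train_risk)
test_accuracy = percent_correct(test_status, test_risk)
print(f"training accuracy: {train_accuracy:.1f}%")
print(f"test accuracy: {test_accuracy:.1f}%")
```

In this toy example the same hypothetical model classifies the training workers better than the test workers, which is the pattern that signals overfitting when training and test accuracy are compared.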