
Using neural networks to predict high-risk flight environments from accident and incident data


ABSTRACT

Flight risk assessment tools (FRATs) aid pilots in evaluating risk arising from the flight environment. Current FRATs are subjective, based on linear analyses and subject-matter expert interpretation of flight factor/risk relationships. However, a ‘flight system’ is complex with non-linear relationships between variables and emergent outcomes. A neural network was trained to categorize high and low-risk flight environments from factors such as the weather and pilot experience using data extracted from accident and incident reports. Negative outcomes were used as markers of risk level, with low severity outcomes representing low-risk environments and high severity outcomes representing high-risk environments. Eighteen models with varied architectures were created and evaluated for convergence, generalization and stability. Classification results of the highest performing model indicated that neural networks have the ability to learn and generalize to unseen accident and incident data, suggesting that they have the potential to offer an alternative to current risk analysis methods.

1. Introduction

In the aviation industry, risk management has become increasingly formalized and regulated. Many US Part 91 operators (covered by general operating and flight rules) and Part 135 operators (companies operating commuter and on-demand services) have voluntarily established risk assessment procedures [Citation1]. In 2012, the International Civil Aviation Organization (ICAO) implemented Annex 19, requiring member states to mandate the establishment of data-driven risk management processes for international operators of large and turbojet aircraft [Citation2]. In 2014, the Federal Aviation Administration (FAA) implemented 14 Code of Federal Regulations (CFR) part 135.617, requiring helicopter air ambulance pilots to conduct formal pre-flight risk assessments [Citation3]. In 2015, the FAA added part 5 to 14 CFR, applicable to US part 121 (airline) and international General Aviation (GA) operators [Citation4].

Current flight risk assessment tools (FRATs), such as that developed by the FAA [Citation5], are typically based upon a subject-matter expert (SME) risk and hazard analysis of accident data and compute a ‘risk number’ from a linear combination of ‘risk values’ assigned to flight environment factors related to the aircraft, aircrew, resource and atmospheric states expected during the flight. However, aviation operations are complex, with emergent outcomes arising from inter-related variables with relationships that are likely to be non-linear, and neither completely deterministic nor completely random [Citation6]. As a result, the use of linear methods may not be the most appropriate approach.

In a comparative study of maintenance operations, linear and non-linear analysis methods were applied to the prediction of safety outcomes collected over 6.2 years [Citation7]. While Poisson regression (linear) completely failed to predict safety outcomes, neural networks (NNs) (non-linear) were significantly more accurate. This suggests that the analysis method should reflect the properties of the system being analysed and that linear methods ‘may be totally inappropriate if the underlying mechanism is nonlinear’ [Citation8, p. 36]. Therefore, for the prediction of flight risk, an emergent property of a complex system, it is expected that a non-linear analysis method would be more successful.

Aviation risk-analysis research is dominated by methods dependent on SME knowledge, often incorporated into Bayesian models. However, expert knowledge of highly complex systems is limited since threat–consequence relationships can be ambiguous [Citation9–11] and unknown or unexpected relationships cannot be accounted for [Citation11]. Furthermore, while Bayesian methods may result in high-performing models if the prior probability is based on population statistics, in practice it is typically calculated from finite samples and/or SME knowledge. Therefore, the applicability of Bayesian networks is limited [Citation12]. Fuzzy logic has also been applied to the analysis of flight environment factors to assess risk (e.g., [Citation13,Citation14]). However, in concluding his discussion of fuzzy logic methods, Hadjimichael [Citation13, p.6516] stated that ‘a more robust method of determining the “most causal” risk factors is necessary. This is a complex issue, as finding a meaningful and useful definition of “most causal” is a significant research challenge’.

In contrast, NNs do not require prior assumptions or expert knowledge of a system. A NN performs a holistic analysis of historical data to determine an overall relationship between multiple inputs and multiple outputs. NNs are based upon a simplified model of biological neural networks, containing nodes, which represent input and output values (derived from a learning data set), and ‘hidden’ layers that allow complex, non-linear relationships to be modelled. The essential feature of NNs is that they learn the relationship(s) between variables through an iterative building process, self-correcting as they go. They are trained by exposure to historical data presented in a supervised learning set, with known inputs and outputs. Once data from the supervised learning set have passed through the model, the error between actual and expected output is calculated and the weights in the model are revised iteratively through a back-propagation process until the solution converges and the overall error rate falls below a pre-defined criterion. To validate the NN, the derived model is then supplied with input data from an unseen (hold-out) data set and the output predictions from the network are compared to the known actual outputs [Citation15,Citation16].
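This train-then-validate cycle can be sketched in a few lines of scikit-learn. The sketch below is an illustrative analogue only (the present study used the SPSS tools described in Section 2.3); the synthetic data, partition size and layer width are assumptions made for demonstration.

```python
# Minimal sketch of supervised NN training with hold-out validation,
# assuming synthetic data in place of coded accident/incident reports.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in for 655 coded reports with 29 flight-environment factors.
X, y = make_classification(n_samples=655, n_features=29,
                           n_informative=10, random_state=0)

# Keep unseen data back for validation of the trained network.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.2, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(29,), activation='tanh',
                    solver='sgd', max_iter=2000, random_state=0)
net.fit(X_train, y_train)  # weights revised iteratively via back-propagation

print(f'Hold-out accuracy: {net.score(X_hold, y_hold):.2f}')
```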

In recent years, NNs have emerged as a data mining technique for exploring complex relationships in large data sets (e.g., [Citation16,Citation17]). NNs allow the simultaneous prediction of multiple outcomes from multiple inputs, and hence are ideal for predicting flight risk from flight environment factors. They are particularly well suited to applications with noisy, missing, overlapping, non-linear and non-continuous data [Citation18] and can also handle highly unstructured data typical of that derived from accident and incident reports. They provide a means of building an empirically describable and verifiable model, which allows for the incorporation of contextual information [Citation19].

NNs have been applied to a diverse range of classification problems such as the stock market [Citation20] and cancer survival [Citation21]. In the transportation field, they have been successfully applied to specific complex prediction problems. In a study of a decision-making tool for aircraft safety inspectors [Citation22], a hybrid two-stage NN was evaluated for its ability to analyse the relationships between aircraft operation and maintenance data and service difficulty reporting (SDR) profiles; when compared with actual SDR numbers, 13 out of 19 NN models developed had R² values above 0.80. Classifications of marine accidents based on the river stage, traffic level, utilizations, location, weather and time were performed with 80% accuracy [Citation23]. Predictions of the landing speed of McDonnell-Douglas MD-80 aircraft in no/low-gust and high-gust conditions based on airport topology, the environment and flight and aircraft parameters were 95% correct [Citation24]. Predictions of pilot decision-making in the resolution of a disruptive passenger incident [Citation16] were 100% correct. Liu et al. [Citation25] developed a NN to predict fatal accidents in GA operations from an analysis of flight environment-type factors and achieved a classification accuracy of over 78%. Harris and Li [Citation26] developed a NN model based on the theoretical model of error causation underpinning the human factors analysis and classification system (HFACS) [Citation27,Citation28]. They found that 74% of unsafe acts (errors) implicated in 523 military aviation accidents could be correctly predicted from their preconditions.

The aim of this study was to investigate the ability of a NN to classify flight environments according to the level of risk. Since risk itself is ‘multidimensional and nuanced’ [Citation29, p.1647], and therefore difficult to define, accidents and incidents – classified as having high and low severity outcomes – were used as markers of risk level. The NN was trained to predict outcome severity using data extracted from historic accident and incident reports. Model development and training was based on the four-stage process described by Diallo [Citation24]: selecting and defining the input and output nodes; data source selection and coding; building the model; and model evaluation.

2. Method

2.1. Stage 1: selecting and defining the input and output nodes

An initial set of 37 input variables was derived from the FAA’s FRAT [Citation5], the Aviation Safety Reporting System (ASRS) coding taxonomy [Citation30], the ICAO (1993) human factors checklist [Citation31] and the software–hardware–environment–liveware (SHEL[L]) model [Citation31,Citation32]. After a review of the National Transportation Safety Board (NTSB) and ASRS reports, two further variables were included – ‘high altitude’ and ‘busy/complex airspace’ – both of which were frequently present at the time of the accident or incident. After initial coding, the number of variables was reduced as a result of missing data and infrequently occurring factors, resulting in 29 input variables (see Table 1).

Table 1. Definition of input variables.

Commonly used FRATs typically categorize flight risk at two levels: high risk, which requires threat mitigation and management consultation; and low risk, which does not. Therefore, the outputs of the NN were chosen to similarly categorize flight outcomes at two levels. High severity outcomes (moderate, major and catastrophic events) were considered markers of high-risk environments; low severity outcomes (negligible and minor events) were attributed to low-risk environments. These outputs were defined by a combination of the severity-level definitions found in the ICAO [Citation2] safety risk severity table and example severity table (see Table 2).

Table 2. Output variables and definitions taken from the example severity table (ICAO, App 2–3) and the safety risk severity table (ICAO, p. 2–29) [Citation2].

2.2. Stage 2: data source selection and coding

The performance of a NN depends on the input and output combinations contained in the training data [Citation33]. Therefore, to predict flight-outcome severity, the data needed to encompass a range of events from insignificant incidents to catastrophic accidents. NTSB investigations involve accidents having substantial consequences for participants and equipment (aircraft and property damage, casualties and fatalities). Events reported to the ASRS programme typically involve minor incidents (e.g., altitude and airspace deviations) that, although undesirable, generally result in little or no consequence. Therefore, an appropriate range of data was obtained from a combination of NTSB and ASRS reports extracted from their online databases.

This study focused on US 14 CFR Part 135 (scheduled and non-scheduled) and Part 91 (general) operations involving two-engine turbofan, turbojet and turboprop airplanes certified under US 14 CFR Part 25 (light and medium transport only) and Part 23 (normal category airplanes) [Citation4]. Narrowing the focus to similar types of operation had the advantage of making certain factors approximately constant (e.g., level of training), thus reducing the number of input variables. Report selection was limited to these categories. Reports were rejected if they involved the following:

  • illegal activity, e.g., use of illicit drugs;

  • system failures whose cause or handling was not affected by factors encountered between engine start and engine shutdown;

  • military, government, skydiving and low-altitude operations;

  • experimental aircraft;

  • a captain who did not hold a commercial pilot or air transport pilot certificate.

The data extraction and coding process was undertaken in two steps. In step one, all 39 initial input factors were coded. Categorical factors, e.g., single pilot, were recorded using a binary code, with ‘1’ indicating presence and ‘0’ indicating absence. For scale data, e.g., runway length, the value was recorded. Outcome severity was coded in increments of 0.2, with 0.2 indicating an insignificant outcome and 1.0 indicating a catastrophic outcome (see Table 2). Depending on the nature of the event, the most appropriate of column one or column two of Table 2 was used to determine outcome severity. Where necessary, the NTSB and ASRS reports were supplemented with airport, terrain, navigation and weather data from SkyVector [Citation34], AirNav [Citation35] and Ogimet [Citation36].
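As a hypothetical illustration of this coding scheme (the field names and severity labels below are assumptions for demonstration, not the authors’ actual coding template), a single report might be coded as follows:

```python
# Step-one coding sketch: categorical factors as presence/absence flags,
# scale factors recorded as raw values, outcome severity in 0.2 increments.
SEVERITY = {'insignificant': 0.2, 'minor': 0.4, 'moderate': 0.6,
            'major': 0.8, 'catastrophic': 1.0}

def code_report(report: dict) -> dict:
    """Convert one accident/incident report into a coded record."""
    return {
        'single_pilot': 1 if report.get('single_pilot') else 0,  # categorical: 1 = present, 0 = absent
        'runway_length_ft': report.get('runway_length_ft'),      # scale factor: value recorded as-is
        'severity': SEVERITY[report['outcome']],                 # 0.2 = insignificant ... 1.0 = catastrophic
    }

print(code_report({'single_pilot': True, 'runway_length_ft': 4200,
                   'outcome': 'minor'}))
```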

While NTSB reports include information describing the entire flight, ASRS reports are de-identified and typically describe only the circumstances surrounding the incident itself. Therefore, they frequently lack data such as departure and destination airport information and pilot flight time. Additionally, both NTSB and ASRS reports yielded low numbers of personal aircrew factors, such as fatigue and stress. The following steps were performed to address these issues:

  • Departure and destination airport variables selected in stage 1 were combined and re-defined as a single ‘airport’ variable (e.g., uncontrolled airport) that was applied to either airport, provided the missing airport data were not directly associated with the accident or incident. With the exception of reports concerning en-route events, if both the departure and destination airport information was missing, the report was rejected.

  • For reports concerning en-route incidents containing neither departure nor destination information and for which the airports were not directly associated with the incident, all airport factors were coded as ‘absent’. This allowed the inclusion of high altitude and en-route incidents.

  • ‘First Officer (FO) less than 200 h in type’ and ‘FO less than 100 h in 90 days’ were combined into a single variable of ‘Inexperienced FO’. If FO hours were not reported, this factor was coded as absent unless the narrative indicated that the FO was inexperienced.

  • Flight time accrued during the duty period, sleep deprivation, personal stress and intra-crew stress were combined into a single ‘Personal factors’ input.

Scalar values were then re-coded as binary categorical factors according to the discriminators indicated by their definitions (see Table 1). The output factors were also re-coded as binary categorical factors, with 0.5 discriminating between low and high severity outcomes.
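A minimal sketch of this re-coding step is given below; the 5000 ft runway cut-off is a hypothetical discriminator used only for illustration, while the 0.5 severity split is as described above.

```python
# Step-two re-coding sketch: scale values become binary flags via their
# discriminators; the 0.2-1.0 severity scale is split at 0.5.
import pandas as pd

coded = pd.DataFrame({'runway_length_ft': [3500, 6000, 4800],
                      'severity': [0.2, 0.8, 0.6]})

coded['short_runway'] = (coded['runway_length_ft'] < 5000).astype(int)  # hypothetical cut-off
coded['high_severity'] = (coded['severity'] > 0.5).astype(int)          # 0.5 discriminates low/high
print(coded[['short_runway', 'high_severity']])
```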

A total of 467 reports meeting the operations and airplane criteria were retrieved from the NTSB database, involving events that occurred between January 2005 and December 2015. Of these, 206 reports were rejected. A further 1614 reports meeting the operations criteria were retrieved from the ASRS database, involving events that occurred between January 2010 and December 2015. Of these, 1220 reports were rejected. As a result, a total of 655 reports were coded and analysed.

2.3. Stage 3: building the model

The NN analysis tools in SPSS version 24.0 were used to build and evaluate multiple models.

Severity of outcome was entered as the dependent variable. The data set was split into three partitions: ‘training’, ‘testing’ and ‘hold-out’. Accident/incident records were randomly assigned to the partitions, with the same ratio of NTSB to ASRS reports maintained in each partition (see Table 3).

Table 3. Assignment of data sets.
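The partitioning can be illustrated with a stratified split that preserves the NTSB:ASRS ratio in each partition. The 60/20/20 proportions below are an assumption for demonstration; the actual assignment is given in Table 3.

```python
# Stratified three-way partition keeping the NTSB:ASRS ratio constant.
import pandas as pd
from sklearn.model_selection import train_test_split

# 655 coded reports: 261 from the NTSB and 394 from the ASRS (Section 2.2).
reports = pd.DataFrame({'source': ['NTSB'] * 261 + ['ASRS'] * 394})

train, rest = train_test_split(reports, test_size=0.4,
                               stratify=reports['source'], random_state=0)
test, holdout = train_test_split(rest, test_size=0.5,
                                 stratify=rest['source'], random_state=0)
print(train['source'].value_counts(normalize=True))  # matches the overall ratio
```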

Selecting the initial parameters for a NN architecture (such as the number of hidden layers and/or the number of nodes in each hidden layer) has been described as both ‘an empirical procedure with guidelines’ [Citation7, p.5] and ‘more of an art than a science’ [Citation8, p.42]. While rules of thumb guide NN design, the optimum combination of layers, nodes, activation functions and training settings for a particular problem is generally determined through trial and error. Therefore, multiple models were developed using either the automatic architecture option or the custom selections. Models designed using the custom options varied in choice of activation function of the output layer (identity, softmax, hyperbolic tangent or sigmoid) and the number of nodes in the hidden layer (5, 14 or 29). The hyperbolic tangent function was selected as the activation function for the hidden nodes for all models.

The gradient-descent training algorithm was used for all models. The maximum number of steps without a decrease in error was set at three, the minimum relative change in training error to 0.0001 and the minimum relative change in the training error ratio to 0.001.
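As an illustrative analogue of this model-building stage (the study used the SPSS version 24.0 tools, and scikit-learn fixes the output-layer activation, so only the hidden-layer variations and stopping criteria are mirrored here), the candidate architectures might be expressed as:

```python
# Candidate networks: tanh hidden units, gradient-descent training and
# stopping criteria mirroring those quoted above.
from sklearn.neural_network import MLPClassifier

candidates = {
    n_hidden: MLPClassifier(hidden_layer_sizes=(n_hidden,),
                            activation='tanh',   # hyperbolic tangent hidden nodes
                            solver='sgd',        # gradient-descent training
                            tol=1e-4,            # min relative change in training error
                            n_iter_no_change=3,  # steps without improvement before stopping
                            max_iter=2000,
                            random_state=0)
    for n_hidden in (5, 14, 29)                  # hidden-node counts evaluated in the study
}
```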

Each NN model analysed the data 10 times: five times with the inputs ordered as presented in Table 1 and five times with the input order reversed. The mean, median and standard deviation of the percentage of correct predictions were calculated from the 10 runs of each model. Mean importance values for each variable were calculated from the values reported for each run.

The best performing model was selected based on a combination of the highest mean accuracies for the hold-out sample, the lowest standard deviations and the greatest area under the receiver operating characteristic (ROC) curve.
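A sketch of this evaluation scheme is given below, assuming a hypothetical `make_model` factory that returns an unfitted classifier for a given seed; it is an analogue of the procedure, not the SPSS implementation used in the study.

```python
# Ten fits per model (five forward, five with reversed input order),
# summarized by mean/median/SD of hold-out accuracy and mean ROC AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate(make_model, X_train, y_train, X_hold, y_hold):
    accs, aucs = [], []
    for order in (slice(None), slice(None, None, -1)):  # forward, reversed columns
        for seed in range(5):
            m = make_model(seed).fit(X_train[:, order], y_train)
            accs.append(m.score(X_hold[:, order], y_hold))
            aucs.append(roc_auc_score(
                y_hold, m.predict_proba(X_hold[:, order])[:, 1]))
    return {'mean': np.mean(accs), 'median': np.median(accs),
            'sd': np.std(accs), 'auc': np.mean(aucs)}
```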

3. Results

Twenty NN models were created. When tested on the hold-out partition, 15 models had average classification accuracies of over 60% for both low and high severity outcomes. Three models had accuracies of over 65%. The highest performing model predicted the outcomes of unseen data with an average accuracy of 69.8% (SD 2.8%) for high severity outcomes and 65.5% (SD 8.4%) for low severity outcomes, with the highest performing run having a prediction accuracy of 71.1% for high severity outcomes and 77.3% for low severity outcomes. The mean area under the ROC curves was 0.843 for both low and high severity outcomes.

The average classification results of this model and the results of the highest performing run are presented in Tables 4 and 5. The architecture of the highest performing model had 58 nodes in the input layer, 29 nodes in the hidden layer and two nodes in the output layer. Mean variable rankings, importance values and normalized mean importance values are presented in Table 6.

Table 4. Average classification accuracy for the highest performing model.

Table 5. Classification accuracy of the highest performing run of the highest performing model.

Table 6. Variable importance values.

4. Discussion

Adya and Collopy [Citation37] suggest that a ‘good’ prediction model should demonstrate good training sample performance, be able to accurately predict outcomes from unseen data (good generalization) and be stable (produce consistent results). When measured against these standards, the results indicate that NNs have the potential to become ‘good’ models for categorizing flight environments. The training and test results of the highest performing model show that the model was able to learn, and the hold-out accuracy of the highest performing run suggests that the model has the potential to generalize well to unseen data (see Tables 4 and 5). The areas under the ROC curves indicate that the NN has a better than 0.8 probability of correctly categorizing a random event and is, therefore, a ‘good’ discriminator of low and high-risk flight environments. However, the standard deviations and hold-out results suggest that the model may be sensitive to the data set and variable input order, particularly for low severity outcomes but less so for the more critical high severity outcomes.

Collopy et al. [Citation38] suggest that a NN should be validated by a comparison of out-of-sample performance with other well-accepted models. However, in this case, selecting such models against which to make an informative comparison proved problematic. Typical pre-flight risk analyses (e.g., those similar to the FAA’s FRAT) are generally tailored to the specific needs of an organization. They are highly subjective, depending on the choice of inputs, the scores assigned and the operators’ tolerance of risk. Additionally, there is no universally accepted measure of high and low risk.

However, a comparison with the GA fatalities NN, although limited by fundamental differences such as type of aviation operation and inputs analysed, was possible to some degree. While the average training and test results for the highest performing model were similar to the accuracy achieved by Liu et al. [Citation25], the average hold-out accuracy was a little less favourable. This may be explained, however, by the broader range of causal factors analysed by the GA network, which, while perhaps being good indicators of risk (e.g., phase of flight), would not be suitable for pre-flight risk analysis.

With no well-accepted risk assessment model against which to validate the NN, the results were benchmarked against the prediction accuracies achieved for two other complex systems, the weather and the stock market. A NN developed to predict the height of the 500-hPa pressure level in the northern hemisphere achieved an average monthly prediction accuracy of between 0.70 and 0.95 [Citation39]. In contrast, the stock market, which is highly reactive to human emotions and decisions, has an average prediction accuracy of between 50.8% and 65.79% [Citation20]. These studies indicate that, first, the ability to predict the outcomes of any complex system is limited and, second, the more it involves human agents, the less predictable it becomes. However, research also indicates that this effect may be moderated by agent training and the resolution of the analysis. A NN applied to a single aviation decision made by highly trained airline pilots predicted outcomes with 100% accuracy [Citation16]. In contrast, the GA fatalities NN [Citation25], applied to a more macroscopic element of the aviation system involving pilots who were, in general, likely to be less well trained, achieved only 78.9% accuracy. These results imply that, for systems involving human agents, higher resolution (more specific problems), better training and less freedom of choice increase predictability.

The flight environment involves a mix of the human, the machine, resources and the atmosphere. The predictability of these components runs on a continuum from highly predictable (the machine), through somewhat predictable (the atmosphere and resources), to less predictable (the human). However, the humans in the professional aviation system (pilots, air traffic controllers, maintainers, etc.) are well trained and their range of choices is constrained by regulation. Against this context, the results obtained in this study – better than stock market prediction but inferior to the prediction of weather events – are considered reasonable for an initial study of a NN flight system analysis. However, it is acknowledged that, before a NN can be applied to real-world pre-flight risk assessment tasks, where high-risk ‘misses’ could have significant negative consequences, further refinements are required.

5. Recommendations and conclusion

Given the ambiguous nature of risk and the complexity of the flight system, the development and validation of accurate and meaningful pre-flight risk analysis tools is challenging. However, the findings of this exploratory study indicate that NNs are both suited to and have the potential for such a task. The classification accuracies, although limited by system complexity, are commensurate with those of other complex systems and similar research.

In addition to categorizing flight environments, NNs potentially offer a means of identifying the most significant predictors of flight risk through a sensitivity analysis of the variables (see Table 6). The trends, similarities and differences in the variable rankings found in this study and between other studies may provide insight into the underlying features of the aviation system.
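SPSS reports its own importance measure; one commonly used analogue is permutation importance, sketched below under the assumption that `model`, `X_hold` and `y_hold` are the fitted network and hold-out arrays from the earlier sketches. This is an illustrative alternative, not the measure used in the study.

```python
# Permutation importance: rank each input by the drop in hold-out
# accuracy when that column is randomly shuffled.
from sklearn.inspection import permutation_importance

result = permutation_importance(model, X_hold, y_hold,
                                n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:  # five most important inputs
    print(f'feature {i}: {result.importances_mean[i]:.3f}')
```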

It is recommended that future research be directed down three paths. The first aims to improve the convergence and stability of the NN by addressing the variable, missing data, data source and coding problems. The current research can only be considered a proof of concept – the current data set is quite small. As a result, the second recommendation is to expand the data set to include accident and incident reports from other countries, which can be used in the training, validation and testing of the NN. Building on this, the third recommendation focuses on improving accuracy and generalization through NN design, allowing a more inclusive analysis of the flight system encompassing supervision and organizational elements. Suggested methods include a ‘zooming out’ approach involving the analysis of individual system elements through multiple and multi-stage NNs, and the output of risk indicators that feed forward into a flight-risk analysis (combining the methodologies applied by Luxhøj and Williams [Citation22] and Hsiao et al. [Citation7]). It is expected that progress made through such research could lead towards the development of functional, objective and informative flight-risk analysis tools to aid pilot decision-making.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • Cory C. Building a viable flight risk assessment process in business jet operations: selecting a risk assessment tool, setting baselines, trigger and mitigation points. Collegiate Aviation Review. 2013;13(2):39–57.
  • International Civil Aviation Organization. Safety Management Manual Doc 9859. 3rd ed. Montreal (CA): International Civil Aviation Organization; 2012 [cited 2021 Jan 2]. Available from: https://www.icao.int/sam/documents/rst-smsssp-13/smm_3rd_ed_advance.pdf
  • Office of the Federal Register. Federal Register, Volume 79, No. 25: helicopter air ambulance, commercial helicopter, and part 91 helicopter operations; final rule. Washington (DC): Federal Aviation Administration; 2014 [cited 2021 Jan 2]. Available from: https://www.gpo.gov/fdsys/pkg/FR-2014-02-21/pdf/2014-03689.pdf
  • Office of the Federal Register. Electronic Code of Federal Regulations, Title 14, Chapter I, Subchapter A, Part 5. Washington (DC): Federal Aviation Administration; 2017 [cited 2021 Jan 2]. Available from: http://www.ecfr.gov/cgi-bin/text-idx?SID=31a1ff34c5fb23bfab62b105fa038ceb&mc=true&node=pt14.1.5&rgn=div5
  • Federal Aviation Administration. Flight risk assessment tool (InFO 07015). Federal Aviation Administration; 2007 [cited 2021 Jan 2]. Available from: https://www.faa.gov/other_visit/aviation_industry/airline_operators/airline_safety/info/all_infos/media/2007/info07015.pdf
  • Sayama H. Introduction to the modeling and analysis of complex systems. Geneseo (NY): Open SUNY Textbooks; 2015.
  • Hsiao Y, Drury C, Wu C, et al. Predictive models of safety based on audit findings: part 1: model development and reliability. Appl Ergon. 2013;44(2):261–273. doi: 10.1016/j.apergo.2012.07.010
  • Zhang G, Patuwo E, Hu MY. Forecasting with artificial neural networks: the state of the art. Int J Forecast. 1998;14(1):35–62. doi: 10.1016/S0169-2070(97)00044-7
  • Macrae C. Making risks visible: identifying and interpreting threats to airline flight safety. J Occup Organ Psychol. 2009;82(2):273–293. doi: 10.1348/096317908X314045
  • Brooker P. Experts, Bayesian belief networks, rare events and aviation risk estimates. Saf Sci. 2011;49(8–9):1142–1155. doi: 10.1016/j.ssci.2011.03.006
  • Sackman H. Delphi assessment: expert opinion, forecasting and group process. Santa Monica (CA): Rand Corporation; 1974. (Report no. AD-786 878).
  • Weiss SM, Kulikowski CA. Computer systems that learn. San Mateo (CA): Morgan Kaufmann; 1991.
  • Hadjimichael M. A fuzzy expert system for aviation risk assessment. Expert Syst Appl. 2009;36(3/2):6512–6519. doi: 10.1016/j.eswa.2008.07.081
  • Cheng CB, Shyur HJ, Kuo YS. Implementation of a flight operations risk assessment system and identification of critical risk factors. Sci Iran. 2014;21(6):2387–2398.
  • Garson DG. Neural networks: an introductory guide for social scientists. London: Sage; 1998.
  • Duggan SJ, Harris D. Modelling naturalistic decision making using an artificial neural network: pilot’s responses to a disruptive passenger incident. Hum Fac Aero Saf. 2001;1(2):145–166.
  • Hair JF, Anderson RE, Tatham RL, et al. Multivariate data analysis. 5th ed. Upper Saddle River (NJ): Prentice-Hall; 1998.
  • Moore B. ART1 and pattern clustering. In: Touretzky D, Hinton G, Sejnowski T, editors. Proceedings of the 1988 connectionist models summer school. San Mateo (CA): Morgan Kaufmann; 1988. p. 174–185.
  • Haykin S. Neural networks: a comprehensive foundation. 2nd ed. Upper Saddle River (NJ): Prentice Hall; 1999.
  • Kim K, Han I. Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Syst Appl. 2000;19(2):125–132. doi: 10.1016/S0957-4174(00)00027-0
  • Burke HB, Goodman PH, Rosen DB, et al. Artificial neural networks improve the accuracy of cancer survival prediction. Cancer. 1997;79(4):857–862. doi: 10.1002/(SICI)1097-0142(19970215)79:4<857::AID-CNCR24>3.0.CO;2-Y
  • Luxhøj JT, Williams TP. Integrated decision support for aviation safety inspectors. Finite Elem Anal Des. 1996;23(2–4):381–403. doi: 10.1016/S0168-874X(96)80018-7
  • Hashemi RR, Le Blanc LA, Rucks CT, et al. A neural network for transportation safety modeling. Expert Syst Appl. 1995;9(3):247–256. doi: 10.1016/0957-4174(95)00002-Q
  • Diallo ON. A predictive aircraft landing speed model using neural network. Paper presented at: 31st Digital Avionics Systems Conference; 2012 Oct 14–18; Moffett Field, CA.
  • Liu D, Nickens T, Hardy L, et al. Effect of HFACS and non-HFACS-related factors on fatalities in general aviation accidents using neural networks. Int J Aviat Psychol. 2013;23(2):153–168. doi: 10.1080/10508414.2013.772831
  • Harris D, Li W-C. Using neural networks to predict HFACS unsafe acts from their pre-conditions. Ergonomics. 2019;62(2):181–191. doi: 10.1080/00140139.2017.1407441
  • Shappell SA, Wiegmann DA. The human factors analysis and classification system – HFACS (DOT/FAA/AM-00/7). Washington (DC): FAA, Office of Aviation Medicine; 2000.
  • Wiegmann DA, Shappell SA. A human error approach to aviation accident analysis: the human factors analysis and classification system. Aldershot: Ashgate; 2003.
  • Haimes YY. On the complex definition of risk: a systems-based approach. Risk Anal. 2009;29(12):1647–1654. doi: 10.1111/j.1539-6924.2009.01310.x
  • National Aeronautics and Space Administration. ASRS coding taxonomy. Moffett Field (CA): NASA Ames Research Center; n.d. [cited 2021 Jan 6]. Available from: https://asrs.arc.nasa.gov/search/dbol/databasecoding.html
  • International Civil Aviation Organization. Human Factors Digest No. 7, Circular 240-AN/144. Montreal (CA): International Civil Aviation Organization; 1993 [cited 2021 Jan 6]. Available from: http://www.skybrary.aero/bookshelf/books/2037.pdf
  • Edwards E. Man and machine: systems for safety. In: Proceedings of British Airline Pilots Association Technical Symposium. London (UK): British Airline Pilots Association; 1972. p. 21–36.
  • Ung ST, Williams V, Bonsall S, et al. Test case based risk predictions using artificial neural network. J Saf Res. 2006;37(3):245–260. doi: 10.1016/j.jsr.2006.02.002
  • SkyVector aeronautical charts. SkyVector; 2017 [cited 2021 Jan 6]. Available from: www.skyvector.com
  • AirNav.com: the pilot's window into a world of aviation information. AirNav; n.d. [cited 2021 Jan 6]. Available from: http://www.airnav.com
  • Ogimet: professional information about meteorological conditions in the world; 2017 [cited 2021 Jan 6]. Available from: http://www.ogimet.com/metars.phtml.en
  • Adya M, Collopy F. How effective are neural networks at forecasting and prediction? A review and evaluation. J Forecast. 1998;17(5–6):481–495. doi: 10.1002/(SICI)1099-131X(1998090)17:5/6<481::AID-FOR709>3.0.CO;2-Q
  • Collopy F, Adya M, Armstrong JS. Research report – principles for examining predictive validity: the case of information systems spending forecasts. Inf Syst Res. 1994;5(2):170–179. doi: 10.1287/isre.5.2.170
  • NCEP/EMC global model experimental forecast performance statistics; 2017 [cited 2021 Jan 6]. Available from: http://www.emc.ncep.noaa.gov/gmb/STATS_vsdb