Original Research

Outcome predictors in autism spectrum disorders preschoolers undergoing treatment as usual: insights from an observational study using artificial neural networks

Pages 1587-1599 | Published online: 30 Jun 2015

Abstract

Background

Treatment as usual (TAU) for autism spectrum disorders (ASDs) includes eclectic treatments usually available in the community and school inclusion with an individual support teacher. Artificial neural networks (ANNs) have never been used to study the effects of treatment in ASDs. The Auto Contractive Map (Auto-CM) is a kind of ANN able to discover trends and associations among variables, creating a semantic connectivity map. The matrix of connections, visualized through a minimum spanning tree filter, takes into account nonlinear associations among variables and captures connection schemes among clusters. Our aim was to use Auto-CM to identify the variables that discriminate responders from nonresponders to TAU.

Methods

A total of 56 preschoolers with ASDs were recruited at different sites in Italy. They were evaluated at T0 and after 6 months of treatment (T1). The children were referred to community providers for usual treatments.

Results

At T1, the severity of autism measured through the Autism Diagnostic Observation Schedule decreased in 62.5% of the children (Response), whereas it was the same or worse in 37.5% of the children (No Response). The Semeion ANNs exceeded 85% global accuracy (Sine Net almost reaching 90%). Consequently, some of the tested algorithms were able to find a good correlation between some variables and TAU outcome. The semantic connectivity map obtained with the Auto-CM system showed that “Response” cases can be visually separated from “No Response” cases. It was possible to visualize a Response area characterized by “Parents Involvement high”, whereas the No Response area was strongly connected with “Parents Involvement low”.

Conclusion

The ANN model used in this study seems to be a promising tool for the identification of the variables involved in the positive response to TAU in autism.

Introduction

Autism spectrum disorders (ASDs) encompass a broad spectrum of heterogeneous neurodevelopmental disorders characterized by social communication impairments and restricted repetitive patterns of behavior.Citation1

Recent studies have compared specific early manualized interventions with the treatment as usual (TAU) that is usually available in the community.Citation2Citation8 These studies were randomized controlled trials (RCTs), which are considered the “gold standard” of evidence-based research.Citation9 Nevertheless, the role of RCTs remains much debated.

One of the biggest problems associated with RCTs is their distance from the real-world environment. This raises the question of whether randomized trials or observational studies should be used to assess outcome in a condition such as autism. The question is fundamental because an observational study might constitute the ideal medium for the application of artificial adaptive systems (AASs).

The central strength of an RCT is that the groups of patients allocated to each treatment tend to be comparable. In addition, randomization leads to robust methods of hypothesis testing that require few statistical assumptions. For these reasons, the RCT is often regarded as the “gold standard” of therapeutic and diagnostic research.

However, in real life, patients are not randomly assigned to receive manualized treatment given in a rigid, standardized way, as is the case in most RCTs.

Since the traditional drawback of observational studies is poor internal validity, efforts have been made in recent years to develop improved methods to evaluate therapeutic effectiveness in the framework of observational studies.Citation10Citation12

AASs can analyze real-world data very efficiently. The internal validity of their assessment is provided by uniquely severe validation protocols, seldom used in classical statistics.Citation13Citation15

In the last 20 years, artificial neural networks (ANNs) have been used in the field of autism to investigate the mechanisms of developmental regression,Citation16 to identify peculiar features in reach-and-throw movements,Citation17 to predict the diagnosis,Citation18 to study attention shift,Citation19 and to discriminate children with autism from children with mental retardation.Citation20

We performed the present study to investigate whether this mathematical approach can increase our knowledge of the connections among the variables in subjects who respond positively to TAU and hence identify the key variables that discriminate responders from nonresponders.

To accomplish this, we applied ANNs and other machine learning systems to assess their predictive capacity in distinguishing consistently the two outcomes of interest (Response vs No Response) of TAU and to identify the variables expressing the maximal amount of relevant information for this distinction.

ANNs allow forecasting together with an understanding of the relationships among variables, in particular nonlinear relationships.Citation11Citation22 ANNs first learn from a set of data for which the solution is known (training); the networks, inspired by the analytical processes of the human brain, are then able to reconstruct the imprecise rules that may underlie a complex set of data (testing).

Moreover, we used the Auto Contractive Map (Auto-CM), a special kind of ANN able to define the strength of the associations of each variable with all the others and to visually show the map of the main connections of the variables and the basic semantic of their ensemble.

Materials and methods

Population

In this work, we have explored in a new way some of the data from our previous study on the short-term outcome of children with autism under early treatment.Citation23 For this exploration, the sample consisted of 56 children (47 males, 9 females; mean age: 36.01±0.79 months; age range: 18–60 months) with a DSM-IV-TR diagnosis of autistic disorder (n=46) or pervasive developmental disorder not otherwise specified (n=10). A total of 51 children received an Autism Diagnostic Observation Schedule (ADOS) classification of autism and 5 received an autism spectrum diagnosis. The mean nonverbal development quotient was 73.8±18.3 (range: 50–125), and the mean general quotient was 59.1±11.8 (range: 34–85). All the children were re-evaluated after 6 months and divided into two groups: responders and nonresponders.

Measurements

The assessment protocol was composed of gold standard measures: ADOS-Generic (the first author, AN, was certified to administer ADOS in clinical and research settings at the University of Michigan Autism Communication Centre; all the clinicians involved in this study were trained to administer ADOS in clinical and research settings), Griffiths Mental Developmental Scales, and Vineland Adaptive Behavior Scales-II. We also used parent reports: MacArthur Communicative Development Inventories, Child Behavior Checklist (CBCL) 1½–5, and Parenting Stress Index (PSI). A detailed description of this assessment protocol was reported in our original study.Citation23

Procedure

The children were evaluated at T0 and after 6 months of treatment (T1). At T0, child clinical measures were well balanced across treatment sites.

Intervention

All the children received TAU, which includes eclectic treatments usually available in the community and school inclusion with an individual support teacher. TAU included speech therapy and/or psycho-educative therapy. Each child’s program comprised individual objectives but was mainly based on therapist expertise rather than on manualized treatment protocols or uniform training. Treatments can be placed on a continuum ranging from highly structured behavioral approaches to approaches that follow the interests of the child in a naturalistic setting and are based on a developmental curriculum in a relational-based context (a detailed explanation of TAU is also reported in our original paper).Citation23

Outcome

The primary outcome was the ADOS calibrated severity score (ADOS-CSS) in order to distinguish children who positively respond to treatment (hereinafter Response) versus nonresponders (hereinafter No Response). ADOS-CSS is a measurement of the severity of the autism symptoms. The ADOS-CSS scores had more uniform distributions across developmental groups and were less influenced by participant demographics than raw totals. This metric is useful in comparing assessments across time and identifying trajectories of autism severity for clinical research.Citation24

Mathematical methods

To evaluate whether the treatment outcome (Response vs No Response) could be predicted using all 25 variables under study (Table 1) as input data, we trained different machine learning systems available in the WEKA data mining software (University of Waikato, Hamilton, New Zealand)Citation25Citation27 and in the Semeion Research Centre repository (Rome, Italy) as classification tools, using the Training and Testing validation protocol. This protocol has been described in detail elsewhere.Citation14,Citation15

Table 1 Variables on study

The machine learning algorithms developed at the University of Waikato, New Zealand, and available in the WEKA data mining software are listed in Table 2,Citation28Citation34 whereas two ANNs (Self Momentum Back Propagation and Sine Net)Citation35,Citation36 were implemented in “Supervised ANNs Software”, developed at the Semeion Research Center (Buscema M; Supervised ANNs. Semeion software #12, version 16.0).

Table 2 Learning machines in the WEKA software package

However, since noisy input attributes can sometimes hide the small amount of meaningful information embedded in other attributes, a pruning procedure was used as a preprocessing tool to eliminate noisy variables before the outcome prediction of the main test. For that procedure, a recently published input selection algorithm named Training With Input Selection and Testing (TWIST) was applied,Citation37Citation44 as implemented in research software developed at the Semeion Research Center (Buscema M [2006–2012] TWIST Input Search, Semeion software #39, version 3.2).

TWIST algorithm

As described in the work by Coppedè,Citation21 TWIST is a complex algorithm that searches for the best division of the global dataset into two optimally balanced subsets containing a minimum number of input features useful for optimal pattern recognition. TWIST is an evolutionary algorithm based on a seminal paper about genetic doping systems, already applied to medical data with very promising results.Citation11,Citation22,Citation26,Citation38Citation44 TWIST selected 9 of the original attributes (Table 3) and generated a global dataset of 25 attributes, and 2 optimal subsets for training and testing. We then applied the K-Fold protocol to the global dataset to verify whether the nine attributes selected by TWIST improved the performance of the learning machines already applied to the original dataset. Moreover, as a second step, we applied the same learning machines to the two subsets generated directly by TWIST.
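The K-Fold protocol mentioned above can be illustrated with a minimal sketch. The number of folds, the shuffling seed, and the function name `k_fold_indices` are assumptions made for illustration; the source does not state the value of K or how cases were assigned to folds.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: every case is used for testing exactly once
    and for training in the remaining k - 1 rounds.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 56 children as in the study; k=5 is an assumed value, not taken from the source.
splits = list(k_fold_indices(56, k=5))
```

Each classifier would be trained on `train` and scored on `test` in every round, and the scores averaged across rounds.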

Table 3 Variables selected by the TWIST system

Semantic connectivity map

An existing mapping methodCitation45,Citation46 was used to highlight through a graph the most important links among variables, using a mathematical approach called Auto-CM. Auto-CM is a special kind of ANN able to find the consistent patterns and/or systematic relationships among variables.Citation45,Citation46 Auto-CM ANN was designed by Buscema M at the Semeion Research Center, and developed in specific research softwares (AutoCM – Auto Contractive Map, Semeion software #46, version 6.0; Modular Auto-Associative ANN, Semeion software #51, version 18.1).

Auto-CM can also recognize patterns under hard conditions, that is, when the connections of the main diagonal of the second connections matrix are removed. When the learning process is organized in this way, Auto-CM seems to find specific relationships between each variable and every other. Consequently, from an experimental point of view, the ranking of its connections matrix appears to equal the ranking of the joint probability between each variable and the others. For the Auto-CM analysis, the same 25 variables used for the predictive analysis were employed, except for sex and treatment center location. We transformed the 23 input variables into 46 input variables by constructing, for each variable scaled from 0 to 1, its complement, as explained in a previous paper.Citation47

In the complement, by subtracting the scaled value from 1, the system was allowed to project and point out the fuzzy position of each variable according to its low values. This is important because in nonlinear systems, the position of high and low values of a given variable is not necessarily symmetric.

In this way, the projection of the original variables tended to show high values, whereas the complement transformation tended to show low values of the original variables. In the map, we have named these two different forms as high and low. This preprocessing scaling is necessary to make possible a proportional comparison among all the variables and to understand the existing links of each variable when the values tend to be high or low.
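The preprocessing described above (scaling each variable from 0 to 1 and adding its complement) can be sketched as follows. Min-max scaling is an assumption, since the source only says the variables were "scaled from 0 to 1"; the function name and the example ages are hypothetical.

```python
def scale_with_complement(values):
    """Return the (high, low) forms of one variable.

    high: values min-max scaled to [0, 1] (assumed scaling method).
    low:  the complement, 1 - high, which is large when the raw value is small.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("a constant variable cannot be min-max scaled")
    high = [(v - lo) / span for v in values]
    low = [1.0 - h for h in high]
    return high, low


# Hypothetical ages in months, chosen inside the 18-60 month range of the sample.
ages_months = [18, 30, 36, 48, 60]
age_high, age_low = scale_with_complement(ages_months)
```

Applying this to each of the 23 variables yields the 46 inputs ("high" and "low" forms) that appear as separate nodes in the map.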

Results

Response vs No Response

At T1, ADOS-CSS improved in 35 (62.5%) of the 56 children (Response), whereas it was the same or worse in 21 (37.5%) of the 56 children (No Response).
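The grouping rule and the reported percentages can be checked with a short sketch. It assumes that "improved" means a lower ADOS-CSS at T1 than at T0; the function names are hypothetical, and only the group counts (35 and 21 of 56) come from the text.

```python
def classify_response(css_t0, css_t1):
    """Label a child from two ADOS-CSS scores (assumed rule: lower at T1 = improved)."""
    return "Response" if css_t1 < css_t0 else "No Response"


def percentage(count, total):
    """Share of the sample, in percent."""
    return 100.0 * count / total


n_children = 56
n_response = 35       # reported in the text
n_no_response = 21    # reported in the text

pct_response = percentage(n_response, n_children)
pct_no_response = percentage(n_no_response, n_children)
```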

Table 4 shows the results of the independent t-test and Cohen’s d effect size for the comparison between the Response and No Response groups at the T0 assessment. There were significant differences in CBCL (Internalizing Problems) and PSI (Total and Child Domains) scores.

Table 4 Comparison at T0 among Response vs No Response groups
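For readers unfamiliar with the two statistics used in this comparison, the following sketch computes Cohen's d and the independent-samples t statistic. It assumes the pooled-variance (Student) form of both; the source does not say whether a pooled or Welch test was used, and the scores below are invented for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def cohens_d(a, b):
    """Standardized mean difference between groups a and b."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


def student_t(a, b):
    """Independent-samples t statistic with pooled variance (assumed variant)."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / (pooled_sd(a, b) * sqrt(1 / na + 1 / nb))


# Invented scores, not data from the study.
group_a = [2, 4, 6]
group_b = [1, 3, 5]
d = cohens_d(group_a, group_b)
t = student_t(group_a, group_b)
```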

Prediction of the outcome with machine learning algorithms

Tables 5 and 6 show the results of the two prediction strategies (without and with variable selection, respectively).

Table 5 Predictive results without variable selection

Table 6 Predictive results with variable selection

Using all 25 variables in the dataset as input vectors, the classification capabilities of all the algorithms were rather low, except for Sine Net and Back Propagation (77.35% and 77.99% global accuracy, respectively). The conclusion from Table 5 could be that there is moderate evidence of correlation between these variables and TAU outcome. However, applying the TWIST algorithm to eliminate noisy variables before the main pattern recognition test allowed the selection of nine attributes (listed in Table 3). With these, most of the learning machines improved their performance dramatically (to 80% global accuracy and more), and both Semeion ANNs exceeded 85% global accuracy (Sine Net almost reaching 90%) (Table 6). Consequently, some of the tested algorithms found a good correlation between some variables and TAU outcome once the noisy attributes were removed (see Supplementary materials for an explanation of the different machine learning algorithms).
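As a reading aid for the accuracy figures, the sketch below derives sensitivity, specificity, and global accuracy from a 2×2 confusion matrix. It assumes that global accuracy means the proportion of correctly classified cases; the source does not define the term, and the counts are invented.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, global_accuracy).

    tp/fn: Response cases classified correctly/incorrectly.
    tn/fp: No Response cases classified correctly/incorrectly.
    Global accuracy is assumed to be the share of all cases classified correctly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    global_accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, global_accuracy


# Invented counts for illustration, not results from the study.
sens, spec, acc = classification_metrics(tp=8, fn=2, tn=6, fp=4)
```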

Semantic connectivity map

Figure 1 reports the semantic connectivity map. As described by Coppedè,Citation21 in order to better understand the meaning of the connections, a numerical value is applied to each edge of the graph. This value, derived from the original weight developed by Auto-CM during the training phase and scaled from 0 to 1, is proportional to the strength of the connection between two variables. Moreover, by means of Auto-CM, it is possible to obtain not only the direction of the association, as provided by standard statistical analyses, but also the strength of this association (link strength [LS]).
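The minimum spanning tree filter mentioned in the Abstract can be sketched with Kruskal's algorithm. This is a generic illustration, not the Semeion implementation: it assumes that distance decreases as link strength increases, so that keeping the strongest links first yields the filtered map. Node names and strengths are invented.

```python
def mst_filter(nodes, strengths):
    """Keep the strongest links that connect all nodes without forming a cycle.

    strengths: dict mapping (u, v) to a link strength in [0, 1].
    Treating a stronger link as a shorter distance is an assumption.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    kept = []
    # Strongest links first, which is the same as smallest distance first.
    for (u, v), s in sorted(strengths.items(), key=lambda kv: -kv[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            kept.append((u, v, s))
    return kept


# Invented nodes and strengths for illustration only.
nodes = ["A", "B", "C", "D"]
strengths = {
    ("A", "B"): 0.98,
    ("B", "C"): 0.90,
    ("A", "C"): 0.60,
    ("C", "D"): 0.75,
    ("B", "D"): 0.40,
}
tree = mst_filter(nodes, strengths)
```

The resulting tree has one fewer edge than there are nodes, which is why each variable in the map is shown with only its most informative connections.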

Figure 1 Semantic connectivity map obtained with the Auto-CM system.

Notes: The figures on the arcs of the graph refer to the strength of the association between two adjacent nodes. The range of this value is from 0 to 1. Red arrow points to the no response group; green arrow points to the response group.
Abbreviations: ADOS-CSS, Autism Diagnostic Observation Schedule-Calibrated Severity Score; CBCL, Child Behavior Checklist; int, internalizing; ext, externalizing; tot, total; Griffiths (locomotor, Locomotor development; personal, Personal–social development; speech, Hearing and speech; eye, Hand and eye coordination; general, General quotient); PSI, Parenting Stress Index; Vineland (Com, Communication; Daily Living, Daily Living Skills; Soc, Socialization).

It was possible to visualize a Response area characterized by “Parents Involvement high” (LS=0.98) and “MacArthur Expressive low” (LS=0.99).

This last condition was linked to “Age low” (LS=0.99), “Vineland Composite low” (LS=0.99), “MacArthur Comprehension low” (LS=0.99), and “Griffiths Locomotor low” (LS=1.00). Overall, all Griffiths scales linked to “Response” showed low scores: Personal, Speech, Eye, Performance, and General.

Conversely, the resultant No Response area was highly connected only with “Parents Involvement low” (LS=0.98). This condition was directly linked to “PSI total low” (LS=0.99), which was linked to low scores on CBCL scales.

In general, the “No Response” area was linked to low PSI scores (on both the Parent Domain and the Child Domain) and to high MacArthur scores (Expressive, Comprehension, and Gestures).

Discussion

The present study represents the first attempt to use ANNs in the arena of the research on ASD treatment. Our aim was to see whether ANNs were able to discriminate children who responded positively to TAU in terms of reduction of autism severity, using a set of variables describing behavioral, developmental and adaptive level profiles, and parental distress.

Despite the observational nature of the study, the capacity of ANNs made it possible to build a predictive model of treatment response, an objective that could not be reached in our previous research work.Citation23 In fact, through the TWIST system, we established a consistent possibility of predicting responder or nonresponder status on the basis of nine variables (selected out of 25), which allowed some of the learning machines to reach up to 89% global accuracy. These selected variables contain specific information to discriminate between the two response conditions. It was unexpected that cognitive and language levels were not among these predictors. Most studies have in fact indicated that children with lower IQ are less likely to show positive gains.Citation48,Citation49 However, other studies have clearly demonstrated that, even among children with equally impaired cognition and language, individual responses to the same treatment often differ markedly.Citation50 In line with this latter finding, this study suggests that other factors not unique to ASD, such as parent involvement and stress, may be better predictors of treatment outcomes.

The semantic connectivity map obtained by means of the Auto-CM system identified parent involvement as the main variable that influences the positive outcome of children under treatment; conversely, the absence of parent involvement is the main factor predicting negative outcomes. This finding, although partially expected,Citation50Citation57 underlines the importance of involving parents, who should no longer be “left out” of the treatment room. Interestingly, a recent comprehensive synthesis of existing meta-analyses of Early Intensive Behavioral Intervention for young children with ASD published from 2009 to 2011 reported parent inclusion as a crucial factor for enhancing treatment effectiveness.Citation55

First, parents must be viewed as important participants in the intervention, and therapist-delivered treatment programs must be accompanied by parent-training methods.Citation56 In fact, this tenet has continued as part of the most recent approaches to early intervention in autism.Citation57 Second, this result is in line with the findings of a recent meta-analysis that support the positive impact of psychosocial interventions delivered by nonspecialist providers as well as by the parents of children with ASD.Citation58 Finally, the positive effect of parent involvement during therapy makes it necessary in the future to assess parent–child interaction as a possible outcome measure.Citation59

In addition to the direct involvement of parents, the semantic connectivity map identified other predictors of better outcome in terms of reduction in the severity of autism after TAU.

First, a young age at the start of treatment predicted a better outcome. This finding confirms other research works that have underlined the importance of young age at the start of treatment as a factor promoting benefits in the social communication domain.Citation60Citation65 In line with these authors, it is widely hypothesized that the better outcome might be due to the higher brain plasticity at this early age.Citation66

Second, young children are more likely to show positive gains if, at the beginning, they have low language and cognitive performance. RogersCitation67 suggested some years ago that the evidence of direct links between pretreatment language abilities and treatment outcomes is contradictory. For example, FenskeCitation60 mentioned that the presence of language abilities does not always predict positive outcomes in young treated children. The reason for this counterintuitive finding needs more investigation. It could be hypothesized that, at this young age, language that develops later is less affected by the autistic process. It is possible that if language already has autistic features, further gains in social/pragmatic language become more difficult. These children could be more resistant to change than children who had low language performance when they started the treatment. On the contrary, if language develops during a sustained social-communicative program, it has a greater chance of having typical features, and it could have cascading effects on global development.

The semantic connectivity map shows that cognitive functioning cannot be considered a critical factor affecting outcomes in young children with ASD.Citation68 Although some studies showed that a higher IQ at intake is predictive of better social performance after treatment,Citation65 other studies found no relation between pretreatment IQ and outcomes.Citation62,Citation69 Thus, the role of initial IQ as a predictor of outcome needs further investigation in future studies.

Third, the total number of hours of treatment was not predictive of better outcome. The intensiveness of treatment is a longstanding conflicting discussion point in the arena of autism treatment. Although some studies have described best outcome when maximum hours per week of treatment is provided,Citation70 other studies, which specifically examined outcome effects of hours per week of treatment, have found no differences in benefits obtained.Citation71 In any case, this study suggests that the concept of intensiveness should be reformulated taking into account which type of support children have outside specific hours of treatment. For example, parent involvement means that some part of treatment is provided by parents during everyday life, thereby increasing the hours of treatment.

Again, the strongest variables associated with no response to treatment, in addition to low parental involvement during the treatment, are the low stress levels of parents and the low level of behavioral problems of the child.

Usually, a child with a diagnosis of autism can be a source of stress for the family,Citation72 and parental stress can reach higher levels when the child begins the treatment.Citation24 On the contrary, a low level of parental stress could be linked to a low awareness of the severity of the child’s diagnosis, so that these parents could be less active in being involved in, seeking, and planning treatment solutions for their children. The low stress could also be linked to the low level of the child’s behavioral problems, which often represent one of the most significant sources of stress for families.Citation73Citation75 It is worth noting that a recent studyCitation76 reported that behavioral problems that are not core symptoms of ASD were associated with high parental stress.

The low level of behavioral problems could indicate that a certain type of child is less sensitive to TAU. First, this behavioral pattern seems to describe the aloof type of autism spectrum according to Wing,Citation77 that is, subjects with a total disengagement from social interaction and a failure to engage in interpersonal reciprocity. Second, these patients seem to be free of the regulation disorder and/or anxious or oppositional comorbidity frequently reported in ASD.Citation78,Citation79 Our hypothesis is that the absence of these comorbid features could mean a more rigid and less treatable autism. These children could be more resistant to change than children with a dysregulatory comorbid pattern, or they may simply be less sensitive to TAU and need a different type of treatment.

Strengths and limitations

The observational approach combined with the use of ANNs represents the main strength of this study. Cases that spontaneously arrived at clinics represent a real population of preschoolers with autism who received treatments in their communities. This is a big advantage with respect to the translational needs of current clinical research. In this scenario, although the lack of an RCT could be considered a weakness from a methodological point of view, the use of ANNs allowed us to overcome the main problem of the observational design (ie, low internal validity).

Special validation protocols, including cross-validation and the splitting of the dataset into training and testing samples, are able to increase the internal validity of clinical studies such as ours. Originally developed for neural network approaches, these validation protocols are now frequently applied to traditional statistical analyses as well. In this way, the use of ANNs is a powerful booster for the more widespread use of observational designs in clinical research.

Moreover, ANN could be considered a more “naturalistic” approach than RCT in the field of autism research. In fact, in real life, patients are not randomly assigned to receive manualized treatment given in a standardized way, as is the case in most RCTs.

Patients with autism in the real world have comorbid conditions (ie, epilepsy; severe mental retardation) that normally would preclude them from entering an RCT, or they tend to be less compliant with the treatment and less subject to artificial expectations of recovery arising from enthusiastic feedback from highly motivated investigators (Hawthorne effect).

An RCT tries to keep a specific variable (the type of intervention) under control through randomization, presuming that all independent variables will be automatically balanced between treatment groups and, therefore, that any differences in outcome can be attributed to the treatment type. Unfortunately, the balance of independent variables at the group level may not hold at the level of the single individual, nor does it allow for the discovery of any complex interaction between independent and dependent variables.

Since translational research has to do with real life, one would be more interested in “effectiveness” than in “efficacy”.

Effectiveness answers the question of whether the intervention works in the real world. Although effectiveness is much more difficult to assess than efficacy, it is now recognized as the most important factor in deciding whether a particular agent is worth the resources that it consumes.

Since the traditional drawback of observational studies is poor internal validity, efforts have been made in recent years to develop improved methods to evaluate therapeutic effectiveness in the framework of observational studies.

AASs can analyze real-world data very efficiently, which is very important for the autism community. The internal validity of their assessment is provided by uniquely severe validation protocols, seldom used in classical statistics.

The main limitation of this study is the relatively small sample size. The clinical applicability of ANNs should be tested in large, multicenter, prospective clinical trials on treatment effectiveness.

Moreover, although this study found some interesting predictive factors, it did not include many other potential predictors (eg, the features of the parents and the family, some biomarkers of the disease). Including all these possible variables will be very important for a good prediction model. Hence, the current study is preliminary, a methodological exploration on the path to accurate prediction.

In conclusion, the ANN model used in this study appears to be a promising tool for the identification of the variables involved in the positive or negative response to TAU in autism. The identification of these variables represents a core step to respond to the key question “what works for whom” and thus to pave the way for treatment personalization.

Acknowledgments

The study was funded by the Italian Ministry of Health (IDIA project, Inquiry into Disruption of Intersubjective Equipment in Autism Spectrum Disorder in Childhood). SC was partly supported by the Italian Ministry of Health and by Tuscany Region with the grant (GR-2010-2317873). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Supplementary materials

The comparison algorithms

In this section, we have briefly described the classic learning machines we compared. We have implemented the following learning machines using the WEKA software package (Waikato Environment for Knowledge Analysis, version 3.6.8, 1999–2012, an open source software tool developed for machine learning at the University of Waikato in New Zealand) and Semeion Software Suites (Rome, Italy; Buscema M, Supervised ANNs and Organisms, Semeion Software #12, version 23.0, 1999–2014).

Bayesian algorithms

The Bayesian algorithms are based on Bayes’ theorem, which states that, given a set of events $E_1, \ldots, E_n$ that partition an event space, the observation of any event $A$ dependent on that space updates the knowledge of the initial events according to the equation:Citation1,Citation2

$$P(E_i \mid A) = \frac{P(A \mid E_i)\,P(E_i)}{\sum_{j=1}^{n} P(A \mid E_j)\,P(E_j)} \tag{1}$$

The classifiers based on Bayesian networks (Bayes Net) represent the variables described by the formula in the equationCitation1 without special restrictions, whereas the naïve Bayesian networks (Naïve Bayes) are based on Bayes’ formula with the assumption of stochastic independence between the variables. This drastic restriction of the domain of validity of the theorem makes this a high-performance classifier applicable to many practical problems.Citation3Citation5

The Naïve Bayes classifier used in this paper is according to the WEKA implementation.
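Bayes' theorem as used above can be illustrated with a small worked sketch. The priors are the observed group proportions (35/56 and 21/56); the likelihoods are invented for illustration, and the function name is hypothetical.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem over a partition E_1..E_n.

    priors:      P(E_j) for each class.
    likelihoods: P(A | E_j) for the observed evidence A.
    Returns P(E_j | A) for each class.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # denominator of Bayes' theorem
    return [j / evidence for j in joint]


priors = [35 / 56, 21 / 56]   # Response, No Response (group proportions)
likelihoods = [0.8, 0.4]      # invented P(A | class), for illustration only
post = posterior(priors, likelihoods)
```

A naive Bayes classifier multiplies one such likelihood per input variable, which is where the independence assumption enters.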

Regression algorithms: logistic regression and multilayer perceptron

Logistic regression is a particular case of generalized linear regression applied in cases where the dependent variable $y$ is dichotomous.Citation6,Citation7

The model is described by the function

$$\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k \tag{2}$$

where $x_1, \ldots, x_k$ are the independent variables and $p$ is the probability that event $y$ will occur.

The multilayer perceptron model is a generalization of the logistic regression model with a feed-forward flow and full interconnection.Citation8

The regression classifier and the multilayer perceptron classifier used in this paper follow the WEKA implementation.

Optimization algorithms: sequential minimal optimization and support vector machine

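A support vector machine is a binary classifier that finds the hyperplane separating two different classes by maximizing its distance from the closest training examples of each class.

Given a training set of pairs

$$\{(x_i, y_i) \mid x_i \in \mathbb{R}^p,\ y_i \in \{-1, 1\}\}_{i=1}^{n} \tag{3}$$

we seek a solution of the dual problem

$$\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} y_i y_j K(x_i, x_j)\,\alpha_i \alpha_j \tag{4}$$

subject to

$$0 \le \alpha_i \le C,\ i = 1, \ldots, n; \qquad \sum_{i=1}^{n} y_i \alpha_i = 0 \tag{5}$$

where $C$ is a constant, $K(x_i, x_j)$ is the kernel function, and the $\alpha_i$ are Lagrange multipliers.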
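To make the dual formulation of the support vector machine concrete, the following sketch evaluates the objective to be maximized and checks the two constraints on the Lagrange multipliers for a given candidate solution. It does not solve the optimization problem. The linear kernel, the function names, and the two-point toy dataset are assumptions made for illustration.

```python
def linear_kernel(a, b):
    """Assumed kernel for this sketch: plain dot product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def dual_objective(alphas, X, y, kernel=linear_kernel):
    """Value of sum(alpha_i) - 1/2 * sum_ij y_i y_j K(x_i, x_j) alpha_i alpha_j."""
    n = len(alphas)
    quad = sum(
        y[i] * y[j] * kernel(X[i], X[j]) * alphas[i] * alphas[j]
        for i in range(n)
        for j in range(n)
    )
    return sum(alphas) - 0.5 * quad


def is_feasible(alphas, y, C, tol=1e-9):
    """Check 0 <= alpha_i <= C and sum(y_i * alpha_i) = 0."""
    box = all(-tol <= a <= C + tol for a in alphas)
    balance = abs(sum(yi * a for yi, a in zip(y, alphas))) <= tol
    return box and balance


# Hypothetical toy problem: one point per class on the x-axis
X = [[1.0, 0.0], [-1.0, 0.0]]
y = [1, -1]
alphas = [0.5, 0.5]

value = dual_objective(alphas, X, y)
print(value, is_feasible(alphas, y, C=1.0))
```

An iterative solver would repeatedly adjust the multipliers while keeping `is_feasible` true and increasing `dual_objective`.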

Sequential minimal optimization is an iterative algorithm that solves the optimization problem described for the support vector machine by decomposing it into a series of subproblems small enough to be solved analytically.Citation9–Citation12

The sequential minimal optimization classifier used in this paper follows the WEKA implementation.

Tree algorithms

Tree algorithms, or decision trees, build a tree from the attributes of the instances (nodes) and the values those attributes can take (branches), ending in leaves that represent the class of the instance. The sequence of branch values from the root node to a leaf node determines the path that a particular instance follows to reach its class. Building the tree from the training dataset relies on equations that determine how many branches to generate from a single node. Such decision trees can be used as classifiers.

J48

J48, the WEKA implementation of the C4.5 algorithm, was used to generate a decision tree. C4.5 was developed by Ross Quinlan as an extension of the Iterative Dichotomiser 3 algorithm.Citation13 The decision tree is built from the training data using the concept of entropy of a discrete random variable $X = \{x_1, \ldots, x_n\}$:

$$H(X) = -\sum_{i=1}^{n} p(x_i)\log p(x_i) \tag{6}$$

where $p(x_i)$ is the probability of the $i$th event.
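The entropy used when building the tree can be computed directly from class frequencies. The sketch below is a minimal illustration that assumes a base-2 logarithm (the formula above leaves the base unspecified) and uses hypothetical label lists; it is not the WEKA code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits, assuming log base 2) of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    # p(x_i) is estimated as the relative frequency of each class
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical label sets (illustrative only)
balanced = ["Response", "Response", "No Response", "No Response"]
pure = ["Response"] * 4

print(entropy(balanced), entropy(pure))
```

A node whose instances all share one class has zero entropy, whereas an evenly split two-class node has the maximum of 1 bit; tree induction favors splits that reduce this quantity.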

Random trees, random forest

Random decision trees were introduced by Leo Breiman and Adele Cutler to handle both classification and regression problems. They are used as a collection of decision trees called a forest.Citation14 The classifier takes a feature vector as input, classifies it with every tree in the forest, and assigns the class that received the largest number of votes.

The J48 and random forest classifiers used in this paper follow the WEKA implementation.
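The voting step of a forest can be sketched as follows. Each "tree" here is a hypothetical one-rule function standing in for a trained decision tree, and the thresholds and feature vector are invented for illustration; only the majority-vote aggregation is the point of the example.

```python
from collections import Counter


def forest_predict(trees, x):
    """Classify x with every tree and return the class with the most votes."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


def make_stump(feature_index, threshold):
    """Hypothetical stand-in for a trained tree: a single threshold rule."""
    def stump(x):
        return "Response" if x[feature_index] > threshold else "No Response"
    return stump


# Three hypothetical trees, each looking at a different feature
trees = [make_stump(0, 0.5), make_stump(1, 0.5), make_stump(2, 0.5)]

x = [0.7, 0.2, 0.9]
prediction = forest_predict(trees, x)
print(prediction)
```

In an actual random forest, each tree would be trained on a bootstrap sample with random feature subsets, which this sketch does not attempt to reproduce.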

Rotation forest

Rotation forestCitation15 draws upon the random forest idea. The base classifiers are also independently built decision trees, but in rotation forest, each tree is trained on the whole dataset in a rotated feature space. As the tree learning algorithm builds the classification regions using hyperplanes parallel to the feature axes, a small rotation of the axes, using principal component analysis, may lead to a very different tree.

Enhanced Back Propagation

Enhanced Back Propagation is an enhanced version of the classic Back Propagation algorithm. The momentum is transformed into a self-momentum in order to adapt the learning process to the local error condition of each node of the network.Citation16

Sine Net

Sine Net is characterized by a specific double nonlinear relationship on the connections between nodes. This characteristic has far-reaching consequences for the properties of the network, both for the function it computes and for its behavior during the learning phase.Citation17–Citation19

Instance-based learning algorithms

The instance-based learning algorithm is a K-nearest neighbors classifier. It can select an appropriate value of K by cross-validation and can also weight neighbors by distance. The algorithm handles numeric, binary, date, and nominal classes, as well as missing class values, and the following types of attributes: date attributes, unary attributes, numeric attributes, nominal attributes, missing values, binary attributes, and empty nominal attributes.Citation20
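A K-nearest neighbors classifier with distance weighting can be sketched as follows. This is an illustrative implementation, not the WEKA one: it assumes Euclidean distance, inverse-distance weights (one of several possible weighting schemes), and a hypothetical toy training set.

```python
import math
from collections import defaultdict


def knn_predict(train, query, k=3):
    """Predict the label of query from its k nearest training instances.

    train is a list of (feature_vector, label) pairs.
    Each neighbor votes with weight 1 / distance (assumed weighting scheme).
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    weights = defaultdict(float)
    for features, label in neighbors:
        d = math.dist(features, query)
        # Small constant avoids division by zero for exact matches
        weights[label] += 1.0 / (d + 1e-9)
    return max(weights, key=weights.get)


# Hypothetical two-cluster training set (illustrative only)
train = [
    ([0.0, 0.0], "A"),
    ([0.0, 1.0], "A"),
    ([5.0, 5.0], "B"),
    ([5.0, 6.0], "B"),
]

print(knn_predict(train, [0.2, 0.1]), knn_predict(train, [5.0, 5.5]))
```

Choosing K by cross-validation would amount to running this predictor for several values of `k` on held-out folds and keeping the value with the lowest error.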

References

  • Nielsen S, Nielsen TD. Adapting Bayes network structures to non-stationary domains. Int J Approx Reason. 2008;49:379–397.
  • Friedman N, Geiger D, Goldszmidt M. Bayesian network classifiers. Mach Learn. 1997;29:131–163.
  • Zhang H, Ling CX. Learnability of augmented Naive Bayes in nominal domains. In: Brodley CE, Danyluk AP, editors. Proceedings of the Eighteenth International Conference on Machine Learning. Burlington: Morgan Kaufmann; 2001:617–623.
  • John GH, Langley P. Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the eleventh conference on uncertainty in artificial intelligence. San Mateo: Morgan Kaufmann Publishers; 2005.
  • Rish I. An empirical study of the naïve Bayes classifier. IBM research report, RC 22230 (W0111-014). New York; 2001.
  • Cessie S, van Houwelingen JC. Ridge estimators in logistic regression. Appl Stat. 1992;41:191–201.
  • Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York: Wiley; 2000.
  • Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL, editors. Parallel Distributed Processing. Vol 1. Boston: the MIT Press; 1986:318–362.
  • Platt J. Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf B, Burges CJC, Smola AJ, editors. Advances in Kernel Methods – Support Vector Learning. Cambridge (MA): MIT Press; 1998.
  • Keerthi SS, Shevade SK, Bhattacharyya C, Murthy K. Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Comput. 2001;13:637–649.
  • Keerthi SS, Gilbert EG. Convergence of a generalized SMO algorithm for SVM classifier design. Mach Learn. 2002;46:351–360.
  • Kecman V. Learning and Soft Computing – Support Vector Machines, Neural Networks, Fuzzy Logic Systems. Cambridge (MA): The MIT Press; 2001.
  • Quinlan JR. C4.5: Programs for Machine Learning. San Mateo: Morgan Kaufmann; 2004.
  • Breiman L. Random forest. Mach Learn. 2001;45:5–32.
  • Rodriguez JJ, Kuncheva LI, Alonso CJ. Rotation forest: a new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell. 2006;28(10):1619–1630. PMID 16986543.
  • Arisawa R, Watada J. Enhanced back-propagation learning and its application to business evaluation. Neural Networks. 1994;1:155–160.
  • Buscema M, Terzi S, Breda M. Using sinusoidal modulated weights improve feed-forward neural network performances in classification and functional approximation problems. WSEAS Trans Inform Sci Appl. 2006;5(3):885–893.
  • Buscema M. Sine Net: an artificial neural network. Applicant Semeion Research Centre. Inventor M. Buscema. European Patent (Application n. 03425582.8 deposited 09-09-2003); USA Patent No US7,788,196 B2, 8/31/2010; International Patent: Application PCT/EP2004/05189 deposited 08-28-2004.
  • Buscema M, Terzi S, Breda M. A Feed Forward Sine Based Neural Network for Functional Approximation of a Waste Incinerator Emissions. In: Proceedings of the 8th WSEAS Int Conference on Automatic Control, Modeling and Simulation; March 12–14, 2006; Prague, Czech Republic:276–280.
  • Aha D, Kibler D. Instance-based learning algorithms. Machine Learning. 1991;6:37–66.

Disclosure

The authors report no conflicts of interest in this work.

References

  • American Psychiatric AssociationDiagnostic and Statistical Manual of Mental Disorders5th edArlington (VA)American Psychiatric Publishing2013
  • VolkmarFSiegelMWoodbury-SmithMPractice parameter for the assessment and treatment of children and adolescents with autism spectrum disorderJ Am Acad Child Adolesc Psychiatry201453223725724472258
  • KasariCAre we there yet? The state of early prediction and intervention in autism spectrum disorderJ Am Acad Child Adolesc Psychiatry201353213313424472247
  • DawsonGBernierRA quarter century of progress on the early detection and treatment of autism spectrum disorderDev Psychopathol2013254 Pt 21455147224342850
  • OspinaMBKrebs SeidaJClarkBBehavioural and developmental interventions for autism spectrum disorder: a clinical systematic reviewPLoS One2008311e3755 Epub 2008 Nov 1819015734
  • NarzisiAColombiCBalottinUMuratoriFNon-pharmacological treatments in autism spectrum disorders: an overview on early interventions for pre-schoolersCurr Clin Pharmacol Epub 2013 Sep 20
  • DawsonGRogersSMunsonJRandomized, controlled trial of an intervention for toddlers with autism: the Early Start Denver ModelPediatrics20121251e17e2319948568
  • GreenJCharmanTMcConachieHParent-mediated communication-focused treatment in children with autism (PACT): a randomised controlled trialLancet2010375973221522160 Epub 2010 May 2020494434
  • GrossiETechnology Transfer from the science of medicine to the real world: the potential role played by artificial adaptive systemsSubst Use Misuse200742226730417558931
  • Rutten-van MölkenMPvan DoorslaerEKvan VlietRCStatistical analysis of cost outcomes in a randomized controlled clinical trialHealth Economics199433333457827649
  • GrossiEManciniABuscemaMInternational experience on the use of artificial neural networks in gastroenterologyDig Liver Dis20073927828517275425
  • HorwitzRIViscoliCMClemensJDSadockRTDeveloping improved observational methods for evaluating therapeutic effectivenessAm J Med19908956306381978566
  • VomwegTWBuscemaMKauczorHUImproved artificial neural networks in prediction of malignancy of lesions in contrast-enhanced MR-mammographyMed Phys20033092350235914528957
  • AndriulliAGrossiEBuscemaMContribution of artificial neural networks to the classification and treatment of patients with uninvestigated dyspepsiaDig Liver Dis20033522223112801032
  • MecocciPGrossiEBuscemaMUse of artificial networks in clinical trials: a pilot study to predict responsiveness to Donepezil in Alzheimer’s diseaseJ Am Geriatr Soc200250111857186012410907
  • ThomasMSKnowlandVCKarmiloff-SmithAMechanisms of developmental regression in autism and the broader phenotype: a neural network modeling approachPsychol Rev2011118463765421875243
  • PeregoPFortiSCrippaAValliAReniGReach and throw movement analysis with support vector machines in early diagnosis of autismConf Proc IEEE Eng Med Biol Soc20092555255819965210
  • ArthiKTamilarasiAPrediction of autistic disorder using neuro fuzzy system by applying ANN techniqueInt J Dev Neurosci2008267699704 Epub 2008 Jul 2618706991
  • GustafssonLPaplińskiAPSelf-organization of an artificial neural network subjected to attention shift impairments and familiarity preference, characteristics studied in autismJ Autism Dev Disord200434218919815162937
  • CohenILAn artificial neural network analogue of learning in autismBiol Psychiatry19943615208080903
  • CoppedèFGrossiEMigheliFMiglioreLPolymorphisms in folate-metabolizing genes, chromosome damage, and risk of Down syndrome in Italian women: identification of key factors using artificial neural networksBMC Med Genomics201034220868477
  • PencoSGrossiEChengSAssessment of the role of genetic polymorphism in venous thrombosis through artificial neural networksAnn Hum Genet20056969370616266408
  • MuratoriFNarzisiAIDIA ConsortiumExploratory study describing 6-months outcomes for young children with autism who receive treatment as usual (TAU) in ItalyNeuropsychiatr Dis Treat201481057758624748794
  • GothamKPicklesALordCStandardizing ADOS scores for a measure of severity in autism spectrum disordersJ Autism Dev Disord2009395693705 Epub 2008 Dec 1219082876
  • HallMFrankEHolmesGThe WEKA data mining software: an updateSIGKDD Explorations20091111018
  • BuscemaMGrossiEIntraligiMAn Optimized experimental protocol based on neuro-evolutionary algorithms. Application to the classification of dyspeptic patients and to the prediction of the effectiveness of their treatmentArtif Intell Med20053427930516023564
  • BuscemaMGenetic Doping Algorithm (GenD): theory and applicationExpert Syst2004216379
  • HosmerDWLemeshowSApplied Logistic Regression2nd edWiley2000
  • RossQC45: Programs for Machine LearningSan Mateo (CA)Morgan Kaufmann Publishers1993
  • CollobertRBengioSLinks between perceptrons, MLPs and SVMsProc Int’l Conf on Machine Learning (ICML)ACM Digital Library2004 Available at http://dl.acm.org/citation.cfm?id=1015415Accessed 2 June 2015
  • GeorgeHJLangleyPEstimating continuous distributions in Bayesian classifiersEleventh Conference on Uncertainty in Artificial Intelligence1995San Mateo, CA338345
  • LivingstonFImplementing Breiman’s random forest algorithm into WekaECE591Q Machine Learning Conference Papers11272005
  • RodriguezJJKunchevaLIAlonsoCJRotation forest: a new classifier ensemble methodIEEE Trans Pattern Anal Mach Intell200628101619163016986543
  • KeerthiSSGilbertEGConvergence of a generalized SMO algorithm for SVMMachine Learning2002351360
  • BuscemaMBack propagation neural networksSubst Use Misuse1998332332709516725
  • BuscemaMTerziSBredaMUsing sinusoidal modulated weights improve feed-forward neural networks performances in classification and functional approximation problemsWSEAS Trans Inf Sci Appl20063885893
  • BuscemaMBredaMLodwickWTraining With Input Selection and Testing (TWIST) algorithm: a significant advance in pattern recognition performance of machine learningJ Intell Learn Syst Appl201352938
  • DietterichTGApproximate statistical tests for comparing supervised classification learning algorithmsNeural Comput199810718851924
  • LahnerEIntraligiMBuscemaMArtificial neural networks in the recognition of the presence of thyroid disease in patients with atrophic body gastritisWorld J Gastroenterol20081456356818203288
  • BuriLHassanCBersaniGAppropriateness guidelines and predictive rules to select patients for upper endoscopy: a nationwide multicenter studyAm J Gastroenterol20101051327133720029414
  • StreetMEGrossiEVoltaCFaleschiniEBernasconiSPlacental determinants of fetal growth: identification of key factors in the insulin-like growth factor and cytokine systems using artificial neural networksBMC Pediatr200882418559101
  • BuscemaMGrossiECapriottiMBabiloniCRossiniPMThe I.F.A.S.T. model allows the prediction of conversion to Alzheimer disease in patients with mild cognitive impairment with high degree of accuracyCurr Alzheimer Res2010717318719860726
  • RotondanoGCipollettaLGrossiEArtificial neural networks accurately predict mortality in patients with non variceal upper GI bleedingGastrointest Endoscop201173218226
  • PaceFRieglerGde LeoneAIs it possible to clinically differentiate erosive from non erosive reflux disease patients? A study using an artificial neural networks-assisted algorithmEur J Gastroenterol Hepatol2010221163116820526203
  • BuscemaMGrossiEThe semantic connectivity map: an adapting self-organising knowledge discovery method in data bases. Experience in gastro-oesophageal reflux diseaseInt J Data Min Bioinform2008236240419216342
  • BuscemaMGrossiESnowdonDAntuonoPAuto-Contractive Maps: an artificial adaptive system for data mining. An application to Alzheimer diseaseCurr Alzheimer Res2008548149818855590
  • GironiMSaresellaMRovarisMA novel data mining system points out hidden relationships between immunological markers in multiple sclerosisImmun Ageing201310123305498
  • HowlinPMossPSavageSRutterMSocial outcomes in mid- to later adulthood among individuals diagnosed with autism and average nonverbal IQ as childrenJ Am Acad Child Adolesc Psychiatry2013526572581 Epub 2013 Apr 2423702446
  • TrembathDBalandinSTogherLStancliffeRJPeer-mediated teaching and augmentative and alternative communication for preschool-aged children with autismJ Intellect Dev Disabil200934217318619404838
  • VismaraLAColombiCRogersSJCan one hour per week of therapy lead to lasting changes in young children with autism?Autism20091319311519176579
  • AndersonSRRomanczykRGEarly intervention for young children with autism: continuum-based behavioral modelsJ Assoc Pers Sev Handicaps199924162173
  • DawsonGOsterlingJEarly intervention in autism: effectiveness and common elements of current approachesGuralnickMJThe Effectiveness of Early Intervention: Second Generation ResearchBaltimore (MD)Brookes1997307326
  • GreenGEvaluating claims about treatments for autismMauriceCGreenGLuceSCBehavioral Intervention for Young Children With Autism A Manual for Parents and ProfessionalsAustin (TX)PRO-ED19961527
  • SallowsGOGraupnerTDIntensive behavioral treatment for children with autism: four-year outcome and predictorsAm J Ment Retard200511041743816212446
  • StraussKManciniFFavaLSPC GroupParent inclusion in early intensive behavior interventions for young children with ASD: a synthesis of meta-analyses from 2009 to 2011Res Dev Disabil20133492967298523816632
  • BerkowitzPBGrazianoAMTraining parents as behaviour therapist: a reviewBehav Res Ther1972102973174564856
  • VismaraLARogersSJBehavioral treatments in autism spectrum disorder: what do we know?Annu Rev Clin Psychol2010644746820192785
  • ReichowBServiliCYasamyMTBarbuiCSaxenaSNon-specialist psychosocial interventions for children and adolescents with intellectual disability or lower-functioning autism spectrum disorders: a systematic reviewPLoS Med20131012e1001572 Epub 2013 Dec 1724358029
  • OonoIPHoneyEJMcConachieHParent-mediated early intervention for young children with autism spectrum disorders (ASD)Cochrane Database Syst Rev2013304CD00977423633377
  • FenskeECZalenskiSKrantzPJMcClannahanLEAge at intervention and treatment outcome for autistic children in a comprehensive intervention programAnal Intervention Dev Disabil198554958
  • AndersonSRCampbellSCannonBOThe may center for early childhood educationHarrisSLHandlemanJSPreschool Education Programs for Children With AutismAustin (TX)PRO-ED19941536
  • BirnbrauerJSLeachDJThe Murdoch early intervention program after 2 yearsBehav Change1993106374
  • LovaasOIBehavioral treatment and normal educational and intellectual functioning in young autistic childrenJ Consult Clin Psychol198755393571656
  • RogersSJVismaraLWagnerALMcCormickCYoungGOzonoffSAutism treatment in the first year of life: a pilot study of infant start, a parent-implemented intervention for symptomatic infantsJ Autism Dev Disord201444122981299525212413
  • HarrisSLHandlemanJSAge and IQ at intake as predictors of placement for young children with autism: a four- to six-year follow-upJ Autism Dev Disord200030213714210832778
  • VentolaPEOostingDAndersonLCPelphreyKABrain mechanisms of plasticity in response to treatments for core deficits in autismProg Brain Res201320725527224309258
  • RogersSJEmpirically supported comprehensive treatments for young children with autismJ Clin Child Psychol19982721681799648034
  • SchalockRLBorthwick-DuffySABradleyVJIntellectual Disability: Definition, Classification, and Systems of SupportsWashington, DCAmerican Association on Intellectual and Developmental Disorders2010
  • VivantiGBarbaroJHudryKDissanayakeCPriorMIntellectual development in autism spectrum disorders: new insights from longitudinal studiesFront Hum Neurosci20135735423847518
  • GabrielsRLHillDEPierceRARogersSJWehnerBPredictors of treatment outcome in young children with autism: a retrospective studyAutism20015440742911777257
  • SheinkopfSJSiegelBHome-based behavioral treatment of young children with autismJ Autism Dev Disord199828115239546298
  • EstesAOlsonESullivanKParenting-related stress and psychological distress in mothers of toddlers with autism spectrum disordersBrain Dev201335213313823146332
  • BebkoJMKonstantareasMMSpringerJParent and professional evaluations of family stress associated with characteristics of autismJ Autism Dev Disord19871745655763680156
  • KoegelRLKoegelLKSurrattAVLanguage intervention and disruptive behavior in preschool children with autismJ Autism Dev Disord1992221411531378049
  • HerringSGrayKTaffeJTongeBSweeneyDEinfeldSBehaviour and emotional problems in toddlers with pervasive developmental disorders and developmental delay: associations with parental mental health and family functioningJ Intellect Disabil Res20065087488217100948
  • DavisNOCarterASParenting stress in mothers and fathers of toddlers with autism spectrum disorders: associations with child characteristicsJ Autism Dev Disord20083871278129118240012
  • WingLThe autistic spectrumLancet1997350176117679413479
  • KohaneISMcMurryAWeberGThe co-morbidity burden of children and young adults with autism spectrum disordersPLoS One201274e33224 Epub 2012 Apr 1222511918
  • MunshiKRGonzalez-HeydrichJAugensteinTD’AngeloEJEvidence-based treatment approach to autism spectrum disordersPediatr Ann2011401156957422066509