Original Research

Novel ensemble method for the prediction of response to fluvoxamine treatment of obsessive–compulsive disorder

Pages 2027-2038 | Published online: 10 Aug 2018

Abstract

Objective

About 30% of obsessive–compulsive disorder (OCD) patients exhibit an inadequate response to pharmacotherapy. Identifying clinical variables associated with treatment response may help patients reach remission sooner, limit illness progression, and reduce socioeconomic costs.

Methods

In total, 330 subjects with an OCD diagnosis underwent 12-week pharmacotherapy with fluvoxamine (150–300 mg). Treatment response was defined as a ≥25% reduction in Yale-Brown Obsessive–Compulsive Scale (Y-BOCS) score. In total, 36 clinical attributes of the 151 subjects who completed their treatment course were analyzed. The data mining pipeline included missing value handling, feature selection, and a new analytical method based on ensemble classification. The results were compared with those of traditional classification algorithms such as decision tree, support vector machines, k-nearest neighbor, and random forest.

Results

Sexual and contamination obsessions, as well as a high Y-BOCS obsession score, were high-ranked predictors of resistance to fluvoxamine pharmacotherapy. The proposed analysis strategy distinguished responder and nonresponder patients on the basis of their clinical features with 86% accuracy, 79% sensitivity, and 89% specificity.

Conclusion

This study proposed an analytical approach that is an accurate and sensitive method for the analysis of high-dimensional medical data sets containing many missing values. The treatment of OCD could be improved by a better understanding of the predictors of response to pharmacotherapy, which may lead to more effective treatment of patients with OCD.

Introduction

Obsessive–compulsive disorder (OCD) is a neuropsychiatric disorder and affects 1%–3% of the population worldwideCitation1 and 1.8% (0.7% and 2.8% in men and women, respectively) in Iran.Citation2

Pharmacotherapy and cognitive–behavioral therapy are considered effective for the treatment of this disorder.Citation3 First-line drugs for OCD pharmacotherapy are serotonin reuptake inhibitors (SRIs),Citation4 but 40%–60% of patients do not respond adequately to a trial of these drugs.Citation5,Citation6 The Yale-Brown Obsessive–Compulsive Scale (Y-BOCS) is frequently used to quantify the severity of obsessive–compulsive (OC) symptoms. Responders are clinically defined as patients who show >25% or 35% decline in Y-BOCS rating, although they may experience significant impairment from their residual OCD symptoms.Citation4,Citation7 Approximately one third of nonresponders to initial SRI monotherapy respond to a second, different SRI, but others are refractory patients who do not respond adequately to SRI pharmacotherapy.Citation5

To find the most effective SRI, each medication has to be tried sequentially for at least 12 weeks.Citation8 Using appropriate response predictors may help patients reach remission sooner, limit illness progression, and reduce socioeconomic costs. There have been several attempts to detect predictors of treatment response to SRIs using demographic and clinical characteristics of OCD patients, their genotype, and the results of neuroimaging assessments.Citation6,Citation9–Citation11

Studies that investigated the association between OCD clinical characteristics, including symptom dimensions, and response to SRI treatment have reported different predictors of treatment response. Factors that have been associated with poor response to OCD treatment include hoarding dimension,Citation12–Citation17 somatic obsessions,Citation6 contamination and cleaning,Citation15,Citation18,Citation19 repeating rituals and counting compulsions,Citation13 obsessions of symmetry,Citation15,Citation17,Citation20 poor insight,Citation6,Citation21,Citation22 sexual/religious obsessions,Citation23,Citation24 severity of compulsions,Citation19,Citation25 early onset and chronic course of OC symptoms,Citation6,Citation19,Citation26–Citation30 psychiatric comorbidity,Citation31–Citation38 lack of sensory phenomena and greater symptom severity,Citation35,Citation39 SRI treatment at intake,Citation37 absence of family history,Citation6 family involvement in the OC symptoms,Citation40,Citation41 being male,Citation30 being older at intake,Citation30 and longer duration of illness.Citation36,Citation42

While greater OCD severity at intake was reported as a predictor of poor response to treatment,Citation10,Citation11,Citation30,Citation36,Citation38,Citation40,Citation43,Citation44 three studies found a better response in those with higher baseline severity of illness.Citation45–Citation47 Moreover, forbidden thoughts (sexual/religious/harm-related obsessions) with checking compulsions were associated with better acute medication response,Citation15,Citation20 but poor long-term outcome and treatment refractoriness were also reported.Citation20,Citation23,Citation48 The other predictors of good response to SRI treatment were having a partnerCitation35,Citation37,Citation40 and washing and obsessive thoughts.Citation49

The presence of numerous different kinds of variables that affect response to OCD treatment makes it difficult to detect appropriate predictors for treatment responder and nonresponder discrimination. Data mining provides an opportunity for the assessment of all potential predictors simultaneously. Machine learning methods not only consider the effect of each variable on the outcome of interest separately but also identify patterns of information that are useful to predict outcomes at the individual patient level.Citation50

Machine learning is commonly used in the social and applied sciences, but it has seen limited use in clinical research, especially in psychiatry. Nevertheless, some machine learning methods such as support vector machine (SVM), support vector regression, and random forest (RF) have been used in OCD clinical research. Hoexter et al used machine learning methods to discriminate patients from healthy controls through brain structural magnetic resonance imaging and to predict OCD severity in patients.Citation51 In another study, machine learning methods were used to predict remission in OCD.Citation25 To our knowledge, feature selection and ensemble classification have not yet been applied to predict treatment response in OCD.

In this study, we aimed at developing a classification algorithm based on an ensemble of classifiers to be used for the prediction of treatment response in order to help individualize clinical assignment of treatment. We assessed demographic and clinical variables of Iranian OCD patients to find the most important attributes and applied the proposed machine learning approach to predict OCD treatment response in fluvoxamine pharmacotherapy.

Methods

Subjects and treatment procedure

In total, 330 outpatients of Iranian origin who met the criteria for OCD of the Diagnostic and Statistical Manual of Mental Disorders, 4th edition, text revision (DSM-IV-TR) were recruited from Imam Hossain Hospital (Tehran, Iran) between 2014 and 2017. Psychiatrists established the diagnosis by administering the structured clinical interview for DSM-IV-TR, and the patients were assessed for disease symptoms and severity using the Persian version of the Y-BOCS checklist and severity scale.Citation52

Inclusion criteria were the following: 1) meeting DSM-IV-TR criteria for OCD; 2) aged between 18 and 65 years; 3) having OCD symptoms for >1 year; and 4) a drug-free period of at least 3 weeks.

Exclusion criteria were the following: 1) a history of mental retardation; 2) psychotic disorders; 3) severe neurological pathology; 4) other DSM-IV-TR Axis I disorders except depression, anxiety, or tic disorder; 5) a history of substance use; 6) a total Y-BOCS severity score <9; 7) current selective serotonin reuptake inhibitor (SSRI) or other antidepressant pharmacotherapy; 8) prominent suicidal ideation; 9) pregnancy or lactation; and 10) refusal to participate or to follow the treatment procedure.

All the subjects completed a semi-structured interview that recorded clinical and sociodemographic data, such as age at assessment, gender, marital status, educational level, occupation, chief complaint, age of onset, and the familial history of any psychiatric disorders, specifically OCD.

Pharmacotherapy was defined as 12-week treatment with fluvoxamine (150–300 mg). No concomitant therapy, either pharmacological or nonpharmacological, was allowed during the treatment period. The daily fluvoxamine dose was started at 25 mg/day, increased to 50 mg/day after 1 week, and raised up to 100 mg/day in the third week. It was then increased up to 150 mg/day for the next 3 weeks. After the sixth week, the patients were seen by the psychiatrist, and the fluvoxamine dose was adjusted according to the severity of their symptoms.

Because SRI pharmacotherapy provides at least a 25%–35% decrease in Y-BOCS severity scores in ~60% of patients,Citation7 one of these cut points has typically been used as the criterion for responsiveness. In an adequate trial of an SRI, a <25% decrease in the Y-BOCS score in patients with at least moderate symptom severity is usually considered nonresponse.Citation4 We likewise chose the 25% improvement cut point for fluvoxamine responsiveness.

The severity of OCD symptoms was scored by an experienced psychologist using the Y-BOCS at the first visit, before treatment initiation, and again after 12 weeks of treatment. According to the reduction in Y-BOCS score from baseline to the end of the 12-week fluvoxamine treatment, the patients were divided into two groups: group A (responders) comprised patients who exhibited a ≥25% reduction in Y-BOCS score, and group B (nonresponders) comprised patients who exhibited a <25% reduction.Citation5,Citation53 We included a third group (refractory patients) comprising patients who had undergone several SSRI trials during their illness but whose symptom severity had not changed or had even worsened.Citation4
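The response criterion above reduces to a one-line calculation, sketched below in Python for illustration (the study itself was analyzed in MATLAB). The function names are hypothetical, and the cut point is assumed to be inclusive (≥25%), as stated in the abstract. The refractory group is not covered, because it is defined by treatment history rather than by this formula.

```python
def percent_reduction(baseline, week12):
    """Percentage decrease in total Y-BOCS score from baseline to week 12."""
    if baseline <= 0:
        raise ValueError("baseline Y-BOCS score must be positive")
    return 100.0 * (baseline - week12) / baseline


def classify_response(baseline, week12, cutoff=25.0):
    """Label a completed trial; the cut point is assumed to be inclusive."""
    reduction = percent_reduction(baseline, week12)
    return "responder" if reduction >= cutoff else "nonresponder"


print(classify_response(28, 21))  # a 25% reduction
print(classify_response(28, 24))  # a reduction of about 14%
```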

Upon OCD diagnosis and meeting the inclusion criteria, 368 patients were invited to participate in the study; 19 refused to participate, and 19 were excluded because of a total Y-BOCS severity score <9 (N=6), a history of substance use (N=3), the presence of psychotic disorders or severe neurological disorders (N=4), or other reasons (N=6). Of the 330 patients who entered the study, 151 completed it and were eligible for the analysis of response predictors. The remaining 179 were excluded because they did not return for follow-up (N=110), refused to take medication (N=38), or for other reasons (N=31). The CONSORT diagram in Figure 1 summarizes the flow of participants through the stages of the trial.

Figure 1 Consort diagram of the study.

Abbreviation: Y-BOCS, Yale-Brown Obsessive–Compulsive Scale.

Written informed consent was obtained from all subjects after a full description of the study. This study was performed in accordance with the World Medical Association’s Declaration of Helsinki and approved by the Neuroscience Research Center Ethical Committee (Project No IR.SBMU.PHNS.REC.1396.2).

Data processing

All data processing and analyses were conducted using MATLAB Version 2014a (The MathWorks Inc., Natick, MA, USA).

Variables

Response to treatment (responder, nonresponder, and refractory) was considered as a dependent variable, and sociodemographic and clinical variables were considered as independent variables (predictors). Thirty-six variables were studied as predictors such as gender, marital status, employment, educational level, family history, initial Y-BOCS obsession score, initial Y-BOCS compulsion score, Y-BOCS obsession subtypes, Y-BOCS compulsion subtypes, insight, avoidance, depression score, age of onset, and illness duration. Moreover, exploratory factor analysis was applied to the 13 Y-BOCS obsession and compulsion categories.Citation54 Underlying factors were detected as factor 1: aggression, contamination, sexual, and religious obsessions as well as somatic, checking, and repeating compulsions; factor 2: symmetry obsession and cleaning, counting, and ordering compulsions; factor 3: contamination obsession and cleaning compulsion; and factor 4: hoarding obsession and compulsion. These factors were also considered as variables in our analysis.

Missing values

One of the important challenges in the analysis of medical data sets is the presence of missing values. Several methods have been proposed to treat missing data, such as case deletion or replacing a missing attribute with the mean of the known values of that attribute. Acuna and RodriguezCitation55 compared four methods for treating missing values (case deletion, mean imputation, median imputation, and k-nearest neighbor [KNN] imputation) in terms of the resulting misclassification error rate. Their results showed that KNN imputation performed best.Citation55 In KNN imputation, the missing values of an instance are imputed from a given number of instances that are most similar to the instance of interest. In the present study, we used the KNN imputation procedure described in Box 1. Briefly, we first divided the samples into two groups: samples with missing values and samples without missing values. Next, for each instance with missing values, we found its k nearest neighbors among the complete samples of the same treatment response class (responder, nonresponder, or refractory) and calculated the means of their continuous variables and the modes of their nominal or ordinal features. Finally, we substituted these values for the missing values of the instance of interest. Thirty-one patients (21%) had attributes with missing values, and the data of the other patients (79%) were complete. Overall, 4% of all data values were missing. The missing-data mechanism was missing at random, which means that the propensity for a data point to be missing is related to some of the observed data.

Box 1 Pseudocode for dealing with missing data
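The class-wise KNN imputation described above can be sketched as follows. This is an illustrative Python reading of the text, not the authors' MATLAB code. The distance measure (squared difference for continuous features and a 0/1 mismatch for nominal or ordinal ones, over the observed features only), the default `k`, and all names are assumptions, since the text does not specify them.

```python
import math
from collections import Counter


def knn_impute(rows, labels, nominal, k=3):
    """Fill None entries from the k nearest complete rows of the same class.

    rows    -- list of feature lists, with None marking a missing value
    labels  -- treatment response class of each row
    nominal -- set of column indices holding nominal or ordinal features
    """
    complete = [i for i, r in enumerate(rows) if None not in r]
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        observed = [j for j, v in enumerate(row) if v is not None]
        donors = [c for c in complete if labels[c] == labels[i]]

        def distance(c):
            # Assumed metric: mismatch for nominal, squared gap otherwise.
            return math.sqrt(sum(
                float(rows[c][j] != row[j]) if j in nominal
                else (rows[c][j] - row[j]) ** 2
                for j in observed))

        nearest = sorted(donors, key=distance)[:k]
        if not nearest:
            continue  # no complete case in this class; leave the gap
        for j in missing:
            values = [rows[n][j] for n in nearest]
            if j in nominal:
                imputed[i][j] = Counter(values).most_common(1)[0][0]  # mode
            else:
                imputed[i][j] = sum(values) / len(values)  # mean
    return imputed


# Toy data (hypothetical): column 1 is nominal, the last row is incomplete.
rows = [[10.0, 0, 20.0], [12.0, 0, 22.0], [11.0, 1, 24.0],
        [11.0, 1, 40.0], [11.0, None, None]]
labels = ["responder"] * 3 + ["nonresponder", "responder"]
filled = knn_impute(rows, labels, nominal={1}, k=3)
print(filled[4])
```

In this toy example the nonresponder row is the closest match on the observed feature, but it is never used as a donor because neighbors are restricted to the same response class.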

Feature selection

The large number of features that may affect treatment response in psychiatric disorders, combined with small sample sizes, is a common problem in the detection of treatment response predictors. Feature selection may improve the accuracy and efficiency of classifiers by finding the most appropriate features. Feature selection techniques can be classified into three groups: filter methods, wrapper methods, and embedded methods.Citation56–Citation58

In filter methods, features are selected based on the relevance of their intrinsic characteristics to the target classes using statistical tests such as the independent samples t-test or F-test.Citation59 In wrapper methods, feature selection is “wrapped” around a learning method, and the importance of a feature is judged directly by the estimated accuracy of the learning method. Wrapper methods typically require extensive computation to search for the best features, but the characteristics of the selected features match well with the learning method.Citation60 In embedded methods, feature selection is performed during training and is usually specific to a given learning machine. Methods of this kind have the advantage of including the interaction with the classification model in the feature selection procedure while being far less computationally intensive than wrapper methods.Citation58 Because feature selection problems pose different challenges, no single algorithm works best under all conditions.Citation58

In the present study, we used the maximum relevance minimum redundancy (MRMR, Equation 1) algorithm for feature selection, which is a classifier-independent method and needs less computational time.Citation61,Citation62 MRMR criteria have a good overall trade-off for accuracy/stability compared with other criteria.Citation63 One of the advantages of MRMR criteria is detecting correlated features (which cannot be ruled out in simple filter methods) to exclude redundant ones from further analysis.

$$\max_{x_j \in X - S_{m-1}} \left\{ I(x_j; c) - \frac{1}{m-1} \sum_{x_i \in S_{m-1}} I(x_j; x_i) \right\} \qquad (1)$$

The first part of the MRMR formula is maximum relevance, and the second part is called minimum redundancy. Equation 1 is an incremental method of MRMR implementation. According to this equation, we calculated the mutual information of each feature ($x_j$) with the class label ($c$). The feature with the maximum value was selected as the first feature. Next, the mean value of the mutual information between each remaining feature and the already selected features ($S_{m-1}$) was subtracted from the mutual information between that feature and the class label, and the feature that maximized this difference was selected as the next feature. This step was repeated until all the features were selected (in Equation 1, $X$ denotes the set of all features).

In the present study, we used the MRMR algorithm to weight variables and exclude redundant features. If the result of Equation 1 for a feature was <0, we excluded that feature from further analysis. In other words, a feature was removed if its mutual information with the class label was less than the mean value of its mutual information with the selected features. The weight of each feature was calculated as the result of Equation 1 divided by the total MRMR values of all features. To assess the relevance of features, we conducted the chi-squared test for nominal attributes and the independent samples t-test for continuous ones, and 1−(p value) was then used as the merit of a feature.
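The incremental MRMR procedure and the weighting rule can be illustrated with a short Python sketch. It assumes discrete features and uses empirical mutual information as the relevance term. The alternative 1−(p value) merit is not implemented, and normalizing the weights over the retained features only is one possible reading of "total MRMR values". Function and feature names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Empirical mutual information (bits) of two discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log2(c * n / (count_a[x] * count_b[y]))
               for (x, y), c in count_ab.items())


def mrmr_scores(features, target):
    """Return (name, score) pairs in the order the features are selected."""
    relevance = {name: mutual_information(values, target)
                 for name, values in features.items()}
    remaining, selected, result = list(features), [], []

    def score(name):
        if not selected:  # first pick uses relevance only
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining:
        best = max(remaining, key=score)
        result.append((best, score(best)))  # scored against S_{m-1}
        remaining.remove(best)
        selected.append(best)
    return result


def mrmr_weights(scored):
    """Drop features scoring below zero; normalise the rest to sum to 1."""
    kept = {name: s for name, s in scored if s >= 0}
    total = sum(kept.values())
    if total == 0:
        return {name: 0.0 for name in kept}
    return {name: s / total for name, s in kept.items()}


# Toy data (hypothetical): f1_copy duplicates f1 and is therefore redundant.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f1": [0, 0, 0, 1, 1, 1, 1, 1],
    "f2": [0, 0, 0, 0, 0, 1, 1, 1],
    "f1_copy": [0, 0, 0, 1, 1, 1, 1, 1],
}
scored = mrmr_scores(features, target)
weights = mrmr_weights(scored)
print(scored)
print(weights)
```

In this toy example the duplicated feature is selected last with a negative score, so it receives no weight.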

Data analysis

Proposed algorithm

We used ensemble learning for the classification of our data set. In ensemble methods, multiple learning algorithms are combined to achieve better predictive performance than any of the individual learning algorithms alone. Previous results have shown that an ensemble of classifiers can enhance class prediction even though the individual classifiers might be rather weak and error-prone in making decisions.Citation64,Citation65 Successful applications of ensemble methods can be found in bioinformaticsCitation66 and medicine.Citation67,Citation68

Bagging,Citation69 Boosting,Citation70 and RFCitation71 are well-known ensemble-based algorithms. In ensemble classification, a number of base classifiers are trained. At test time, test samples are given to all the base classifiers, and the class label of each sample is determined, typically through majority vote, from the outputs of all base classifiers.Citation72 The base classifiers must produce diverse outputs; otherwise, combining them does not improve accuracy over a single classifier. For example, multilayer perceptrons (MLPs) with different structures or different initialization weights can be used as base classifiers.

It is usually assumed that increasing diversity may decrease ensemble error.Citation73 Theoretical and empirical results suggest that one of the most effective methods of obtaining independent classifiers is attribute bagging (training the members of an ensemble on qualitatively different feature [sub]sets).Citation74,Citation75 It has been shown that the best voting accuracy is achieved for attribute subset sizes between one third and half of the total number of attributes.Citation74

The proposed method has two phases: 1) sorting the features using the MRMR algorithm and 2) assigning a weight to each feature based on its importance. After sorting the features by weight, we removed attributes whose weight was below a predefined threshold. Next, we generated several base classifiers (decision trees in our case), each trained on a subset of features selected by roulette wheel sampling.Citation76 With the roulette wheel algorithm, attributes are selected randomly with probability proportional to their weights. During construction of the ensemble, we pruned base classifiers whose accuracy was below a predefined threshold (75% in our study). We split the data set into training and testing parts; the first part was used for training the model, and the second part was used for its validation. At test time, we used majority vote to determine the class label of each sample. If the base classifiers could not determine the class label decisively (eg, 45% of base classifiers voted for class 1 and 55% for class 2), we assigned the test sample the label predicted by the base classifier with the best training accuracy. Box 2 describes the pseudocode for our proposed method.

Box 2 Pseudocode for our new proposed method
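The main steps of the proposed ensemble (roulette wheel feature sampling, pruning of weak base classifiers, and majority vote with a fallback to the best base classifier) can be sketched in Python as follows. Several details are assumptions, because the text leaves them open: the base learner here is a simple lookup table standing in for a decision tree, pruning uses training accuracy, and the `margin` that defines an indecisive vote is hypothetical.

```python
import random
from collections import Counter


def roulette_wheel_subset(weights, size, rng):
    """Draw `size` distinct feature indices with probability ~ weight."""
    pool = dict(enumerate(weights))
    chosen = []
    for _ in range(size):
        idx = rng.choices(list(pool), weights=list(pool.values()))[0]
        chosen.append(idx)
        del pool[idx]  # sample without replacement
    return sorted(chosen)


def train_lookup(X, y, feats):
    """Stand-in base learner (NOT a decision tree): majority class per
    observed combination of the chosen feature values."""
    table = {}
    for row, label in zip(X, y):
        key = tuple(row[j] for j in feats)
        table.setdefault(key, Counter())[label] += 1
    default = Counter(y).most_common(1)[0][0]

    def predict(row):
        votes = table.get(tuple(row[j] for j in feats))
        return votes.most_common(1)[0][0] if votes else default
    return predict


def train_ensemble(X, y, weights, n_models=25, subset_size=None,
                   min_accuracy=0.75, seed=0):
    """Train base learners on sampled feature subsets and prune weak ones."""
    rng = random.Random(seed)
    size = subset_size or max(1, len(weights) // 2)
    members = []
    for _ in range(n_models):
        feats = roulette_wheel_subset(weights, size, rng)
        model = train_lookup(X, y, feats)
        accuracy = sum(model(r) == t for r, t in zip(X, y)) / len(y)
        if accuracy >= min_accuracy:  # assumed: pruning on training accuracy
            members.append((accuracy, model))
    return members


def predict_ensemble(members, row, margin=0.2):
    """Majority vote; defer to the most accurate member on a close vote."""
    if not members:
        raise ValueError("no base classifier passed the accuracy threshold")
    ranked = Counter(model(row) for _, model in members).most_common(2)
    top = ranked[0][1] / len(members)
    second = ranked[1][1] / len(members) if len(ranked) > 1 else 0.0
    if top - second < margin:  # hypothetical definition of "indecisive"
        return max(members, key=lambda m: m[0])[1](row)
    return ranked[0][0]


# Toy data (hypothetical): only feature 0 carries information about y.
X = [[0, 0, 1], [0, 1, 0], [0, 0, 0], [0, 1, 1],
     [1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1]]
y = ["A", "A", "A", "A", "B", "B", "B", "B"]
members = train_ensemble(X, y, weights=[0.8, 0.1, 0.1])
print(predict_ensemble(members, [0, 1, 1]), predict_ensemble(members, [1, 0, 0]))
```

In this toy example, base learners built on either uninformative feature reach only 50% training accuracy and are pruned, so the vote is carried by learners trained on the informative feature.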

Traditional classification algorithms and performance measures

We compared the results of our proposed method with several well-known learning models that are commonly used for classification including MLP, KNN, SVM, decision tree, and RF.

MLP is a popular artificial neural network architecture with backpropagation (a supervised learning algorithm). It has been shown that given the right size and the structure, MLP is capable of learning arbitrarily complex nonlinear functions to arbitrary accuracy levels.Citation77

KNN is based on the principle that the cases within a data set will generally exist in close proximity to other cases that have similar properties.Citation78 The KNN algorithm locates the k-nearest instances to the query instance and determines its class by identifying the single most frequent class label.

SVM is a maximum margin classification algorithm, which exploits information about the inner products in some feature space.Citation79,Citation80 Studies showed that these algorithms have a good ability for classification in medical data sets.Citation81,Citation82

Decision trees, such as Quinlan’s ID3, C4.5, and C5, are powerful classification algorithms that are becoming increasingly popular with the growth of data mining in the field of medicine.Citation83

RF is an ensemble learner, a method that generates many classifiers and aggregates their results.Citation71 RF shows high predictive accuracy and is applicable in high-dimensional data sets with highly correlated features, such as medical data sets.Citation84,Citation85

For evaluating the proposed analytical model, we used three performance measures: accuracy (Equation 2), sensitivity (Equation 3), and specificity (Equation 4), which are usually defined for binary classification. As we dealt with a multiclass application, we calculated these measures based on the following formulas:Citation86

$$\text{Average accuracy} = \frac{\sum_{i=1}^{l} (TP_i + TN_i)}{\sum_{i=1}^{l} (TP_i + TN_i + FP_i + FN_i)} \qquad (2)$$

$$\text{Sensitivity} = \frac{\sum_{i=1}^{l} TP_i}{\sum_{i=1}^{l} (TP_i + FN_i)} \qquad (3)$$

$$\text{Specificity} = \frac{\sum_{i=1}^{l} TN_i}{\sum_{i=1}^{l} (TN_i + FP_i)} \qquad (4)$$

where $l$ is the number of classes and $TP_i$, $TN_i$, $FP_i$, and $FN_i$ denote true positives, true negatives, false positives, and false negatives for class $i$, respectively.

True positive (TP) is the number of positive samples that are correctly identified as positive. False positive (FP) is the number of negative samples that are incorrectly identified as positive. False negative (FN) denotes the number of positive samples that are incorrectly identified as negative. True negative (TN) is the number of negative samples that are correctly identified as negative.
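As an illustration of these definitions, the following Python sketch derives the one-vs-rest counts for each class from a confusion matrix and pools them as in Equations 2–4. The confusion matrix is hypothetical and does not reproduce the study's results; the function names are likewise illustrative.

```python
def one_vs_rest_counts(confusion):
    """confusion[i][j] = samples of true class i predicted as class j."""
    n_classes = len(confusion)
    total = sum(map(sum, confusion))
    counts = []
    for i in range(n_classes):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                              # rest of row i
        fp = sum(confusion[r][i] for r in range(n_classes)) - tp  # rest of column i
        tn = total - tp - fn - fp
        counts.append({"TP": tp, "FP": fp, "FN": fn, "TN": tn})
    return counts


def pooled_metrics(counts):
    """Accuracy, sensitivity, and specificity pooled over all classes."""
    tp = sum(c["TP"] for c in counts)
    fp = sum(c["FP"] for c in counts)
    fn = sum(c["FN"] for c in counts)
    tn = sum(c["TN"] for c in counts)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical three-class confusion matrix (30 samples).
confusion = [[8, 1, 1],
             [2, 6, 2],
             [1, 1, 8]]
counts = one_vs_rest_counts(confusion)
metrics = pooled_metrics(counts)
print(counts)
print(metrics)
```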

Results

Sociodemographic data of the studied sample

Table 1 summarizes the sociodemographic and clinical variables of the 151 patients who completed pharmacotherapy. The responder, nonresponder, and refractory groups consisted of 68%, 50%, and 68% women; 68%, 64%, and 63% married patients; and 64%, 41%, and 59% unemployed patients, respectively (Table 1). In the responder group, 15% of patients reported no history of mental illness in their family, and the others (85%) came from families with a history of psychiatric disorders. The percentages of positive family history in the nonresponder and refractory groups were 85% and 77%, respectively (Table 1).

Table 1 Clinical variables of patients in each treatment response class

Patients’ ages of onset (mean±SD) were 24.0±10.0 years in the responder group, 20.9±12.0 years in the nonresponder group, and 20.6±8.8 years in the refractory group. The initial Y-BOCS obsession scores (mean±SD) were 10.3±4.7 for responders, 12.4±4.9 for nonresponders, and 12.0±3.9 for refractory patients. One-way analysis of variance with Tukey’s post hoc test showed that obsession scores differed significantly between the three groups, with responders reporting less severe obsessions than nonresponder and refractory patients. However, compulsion and total Y-BOCS scores were not significantly different between the three groups (Table 1).

Analyses revealed that contamination and sexual obsessions were the most important predictors of response to fluvoxamine pharmacotherapy. Contamination obsessions were reported by 69% of responder patients but by 88% of nonresponder patients. Sexual obsessions were observed in 24% of responder patients but in 52% of nonresponder patients (Table 1).

Treatment predictors

Of the 36 initial features, 19 were selected for further analysis as the result of feature selection and others were removed. Attributes such as sexual and contamination obsessions, obsession severity, and illness duration were selected to predict treatment response of patients, and attributes such as marital status, hoarding, and religious obsessions were removed from further analysis.

Model evaluation

Table 2 summarizes the average performance of the different classification algorithms on the treatment response data set based on 20 replications of 10-fold cross-validation. We used 10-fold stratified cross-validation to evaluate the predictive models, as it has been reported to be the best estimation technique, especially for data sets with a small sample size.Citation87,Citation88 Note that our data set contains 151 samples that belong to three classes: 95 patients responded adequately to treatment (responder class), 34 patients exhibited an inadequate response to fluvoxamine (nonresponder class), and 22 patients belonged to the refractory class. The values in parentheses are the 95% confidence intervals of each measure. The results showed that MLP is not a good algorithm for this data set and that decision tree performs relatively better than the other conventional classification models. The results also showed that the new method proposed in this study is the best classifier for this data set. The accuracy, sensitivity, and specificity of the proposed method were 86%, 79%, and 89%, respectively.

Table 2 Accuracy, sensitivity, and specificity of different classification algorithms applied on the current OCD data set based on 20 repetitions of 10-fold cross-validation
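The evaluation protocol, 20 repetitions of stratified 10-fold cross-validation, can be sketched as follows in Python. The fold assignment rule (shuffling within each class and dealing indices round-robin across folds) and the per-repetition seeds are assumptions; the text does not state how the folds were built in MATLAB. The class sizes are those reported above.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for members in by_class.values():
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            # Continue dealing where the previous class stopped, so that
            # fold sizes stay balanced overall.
            folds[(offset + pos) % k].append(idx)
        offset += len(members)
    return folds


def repeated_cv_splits(labels, k=10, repeats=20):
    """Yield (train_indices, test_indices) for every fold of every repeat."""
    for rep in range(repeats):
        for test_idx in stratified_folds(labels, k, seed=rep):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            yield train_idx, test_idx


# Class sizes reported for the 151 completers.
labels = ["responder"] * 95 + ["nonresponder"] * 34 + ["refractory"] * 22
folds = stratified_folds(labels)
print([len(fold) for fold in folds])
splits = list(repeated_cv_splits(labels))
print(len(splits))
```

With these class sizes, each test fold holds 15 or 16 patients, including 9 or 10 responders, 3 or 4 nonresponders, and 2 or 3 refractory patients.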

Table 3 shows the values of TP, FP, FN, and TN for each treatment response class obtained by applying the conventional methods and the new analysis method to our data set. The proposed analytical algorithm correctly assigned 83 subjects to the responder class, 23 subjects to the nonresponder class, and 13 subjects to the refractory class. The table shows that some algorithms, such as RF, are very good at predicting the responder class but very poor on the two other classes.

Table 3 TP, FP, FN, and TN values for each treatment response class obtained with selected algorithms and the new method

Table 4 summarizes the accuracy, sensitivity, and specificity of the new method for each treatment response class. The accuracy values of the proposed method for the prediction of classes 1–3 (responder, nonresponder, and refractory) were 83%, 85%, and 89%, respectively. The sensitivity values were 87%, 68%, and 59%, respectively, for classes 1–3. The specificity values were 77%, 90%, and 94%, respectively, for classes 1–3.

Table 4 Accuracy, sensitivity, and specificity for each class obtained with the new method
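As a consistency check, the overall figures can be recomputed from the correctly classified counts quoted above (83, 23, and 13) and the class sizes (95, 34, and 22), assuming the pooled definitions of Equations 2–4 and a single predicted label per patient. Under that assumption every misclassified patient contributes exactly one false negative (for the true class) and one false positive (for the predicted class), so the summed FP equals the summed FN. The Python sketch below is illustrative only.

```python
true_positives = {"responder": 83, "nonresponder": 23, "refractory": 13}
class_sizes = {"responder": 95, "nonresponder": 34, "refractory": 22}

n_samples = sum(class_sizes.values())      # 151 patients
n_classes = len(class_sizes)               # 3 classes

sum_tp = sum(true_positives.values())      # correctly classified patients
sum_fn = n_samples - sum_tp                # each error is one FN ...
sum_fp = sum_fn                            # ... and one FP
sum_tn = (n_classes - 1) * n_samples - sum_fp

sensitivity = sum_tp / (sum_tp + sum_fn)
specificity = sum_tn / (sum_tn + sum_fp)
accuracy = (sum_tp + sum_tn) / (n_classes * n_samples)

# Per-class sensitivity is TP divided by the class size.
per_class_sensitivity = {
    name: true_positives[name] / class_sizes[name] for name in class_sizes
}

print(round(100 * accuracy), round(100 * sensitivity), round(100 * specificity))
print({name: round(100 * v) for name, v in per_class_sensitivity.items()})
```

The rounded values agree with the reported overall accuracy, sensitivity, and specificity (86%, 79%, and 89%) and with the per-class sensitivities (87%, 68%, and 59%).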

Discussion

In the current study, we aimed at proposing a model suitable to predict the final outcome of OCD pharmacotherapy using fluvoxamine. From the results of our analysis, it appears that sexual and contamination obsessions and higher Y-BOCS obsessive scores are high-ranked predictors of resistance to fluvoxamine pharmacotherapy. Moreover, our proposed strategies for data analysis including missing value handling, feature selection using MRMR algorithm, and new analytical method based on ensemble classification are best suited for dealing with the OCD treatment response data set and prediction of fluvoxamine pharmacotherapy result.

Data mining approaches can be used to process high-dimensional data sets such as medical data sets for the prediction of treatment response in order to help individualize clinical assignment of treatment. In the current study, we proposed a new ensemble classification that showed good accuracy, sensitivity, and specificity over some commonly used classification methods. This method appears to have a good potential for medical decision-making in the assignment of patients to treatment based on clinical characteristics.

Medical data sets, including those related to treatment response, comprise numerous demographic, clinical, and genetic variables for a relatively small number of patients, which presents challenges for analysis and data extraction. Few studies have investigated the prediction of treatment response in OCD patients, and almost all of them faced these challenges. In this study, we used several strategies to reduce the number of variables and increase the ratio of samples to variables. First, we used the MRMR algorithm to weight the attributes. Second, we eliminated features with a low association with the treatment response classes. Finally, we applied attribute bagging to improve the accuracy and stability of the classifier ensemble using random subsets of features.

The results of the current study showed that predominance of contamination and sexual symptoms, as well as high scores in Y-BOCS obsessive subscores, are predictive of poor response to fluvoxamine. These findings are consistent with the previous reports that contamination and cleaning may represent markers of poor prognosis.Citation15,Citation18,Citation19,Citation25,Citation38

Previous investigations revealed that different brain networks might be involved in OC symptoms such as checking, washing, symmetry, and hoarding.Citation89–Citation94 These findings suggest that poor treatment response of OCD patients with predominant contamination and sexual symptoms and higher obsession scores may be related to relatively distinct neural circuits correlated with these symptoms.

The current investigation has a number of important strengths as well as some key limitations. This study is the first attempt to investigate predictors of OCD treatment response in an Iranian sample. Moreover, we proposed a new method of ensemble classification for treatment response prediction, which enables a more comprehensive examination of potential predictors of remission. Missing value handling and feature selection improved the analysis approach and made more accurate prediction possible. Another advantage of the current study was the administration of just one drug, fluvoxamine, which might reduce confounding factors.

There are several shortcomings inherent in this study. Treatment response was evaluated after 12 weeks of pharmacotherapy, whereas assessing treatment outcome over a longer period may detect stronger predictors of remission. In addition, some patients who reported drug side effects were excluded from the study or did not complete their medication. Altogether, a large number of patients were lost at different stages of the study, resulting in a small sample size. Moreover, sampling was performed at a single medical center in Tehran, which led to a lack of diversity in our sample and may prevent our findings from being generalized to members of different cultural groups or other populations. Replication of the proposed method with independent and larger samples is suggested.

Conclusion

This study proposed an analytical approach that is an accurate and sensitive method for the analysis of high-dimensional medical data sets containing a large number of missing values. The treatment of OCD could be improved by a better understanding of the predictors of pharmacotherapy response, which may lead to more effective treatment of patients with OCD and more accurate prognostic information for them. Moreover, such understanding helps to estimate how many patients will need access to alternative treatments, which is important from the public health perspective.

Acknowledgments

This research was supported by grants from the Neuroscience Research Center, Shahid Beheshti University of Medical Sciences, Iran. We are also grateful to the staff of Imam Hossain Hospital for their cooperation with our research team.

Disclosure

The authors report no conflicts of interest in this work.

References

  • Hasler G, LaSalle-Ricci VH, Ronquillo JG. Obsessive–compulsive disorder symptom dimensions show specific relationships to psychiatric comorbidity. Psychiatry Res. 2005;135(2):121–132. PMID: 15893825
  • Mohammadi MR, Ghanizadeh A, Rahgozar M. Prevalence of obsessive-compulsive disorder in Iran. BMC Psychiatry. 2004;4:2. PMID: 15018627
  • Dougherty DD, Rauch SL, Jenike MA. Pharmacotherapy for obsessive-compulsive disorder. J Clin Psychol. 2004;60(11):1195–1202. PMID: 15389617
  • Pallanti S, Quercioli L. Treatment-refractory obsessive-compulsive disorder: methodological issues, operational definitions and therapeutic lines. Prog Neuropsychopharmacol Biol Psychiatry. 2006;30(3):400–412. PMID: 16503369
  • Bloch M, Landeros-Weisenberger A, Kelmendi B, Coric V, Bracken MB, Leckman JF. A systematic review: antipsychotic augmentation with treatment refractory obsessive-compulsive disorder. Molecular Psychiatry. 2006;11(7):622–632. PMID: 16585942
  • Erzegovesi S, Cavallini MC, Cavedini P, Diaferia G, Locatelli M, Bellodi L. Clinical predictors of drug response in obsessive-compulsive disorder. J Clin Psychopharmacol. 2001;21(5):488–492. PMID: 11593074
  • Goodman WK. Pharmacotherapy of obsessive-compulsive disorder. In: Hand I, Goodman WK, Evers U, editors. Zwangsstörungen/Obsessive-Compulsive Disorders. Berlin, Heidelberg: Springer; 1992:141–151.
  • Fineberg NA, Reghunandanan S, Simpson HB. Obsessive–compulsive disorder (OCD): practical strategies for pharmacological and somatic treatment in adults. Psychiatry Res. 2015;227(1):114–125. PMID: 25681005
  • Hazari N, Narayanaswamy JC, Arumugham SS. Predictors of response to serotonin reuptake inhibitors in obsessive-compulsive disorder. Expert Rev Neurother. 2016;16(10):1175–1191. PMID: 27282021
  • Storch EA, Larson MJ, Shapira NA. Clinical predictors of early fluoxetine treatment response in obsessive–compulsive disorder. Depression Anxiety. 2006;23(7):429–433. PMID: 16841343
  • Stein DJ, Montgomery SA, Kasper S, Tanghoj P. Predictors of response to pharmacotherapy with citalopram in obsessive-compulsive disorder. Int Clin Psychopharmacol. 2001;16(6):357–361. PMID: 11712625
  • Mataix-Cols D, Rauch SL, Manzo PA, Jenike MA, Baer L. Use of factor-analyzed symptom dimensions to predict outcome with serotonin reuptake inhibitors and placebo in the treatment of obsessive-compulsive disorder. Am J Psychiatry. 1999;156(9):1409–1416. PMID: 10484953
  • Salomoni G, Grassi M, Mosini P, Riva P, Cavedini P, Bellodi L. Artificial neural network model for the prediction of obsessive-compulsive disorder treatment response. J Clin Psychopharmacol. 2009;29(4):343–349. PMID: 19593173
  • Saxena S, Maidment KM. Treatment of compulsive hoarding. J Clin Psychology. 2004;60(11):1143–1154.
  • Stein DJ, Andersen EW, Overo KF. Response of symptom dimensions in obsessive-compulsive disorder to treatment with citalopram or placebo. Rev Bras Psiquiatr. 2007;29(4):303–307. PMID: 18200396
  • Samuels J, Bienvenu OJ 3rd, Riddle MA. Hoarding in obsessive compulsive disorder: results from a case-control study. Behav Res Ther. 2002;40(5):517–528. PMID: 12043707
  • Stein DJ, Carey PD, Lochner C, Seedat S, Fineberg N, Andersen EW. Escitalopram in obsessive-compulsive disorder: response of symptom dimensions to pharmacotherapy. CNS Spectr. 2008;13(6):492–498. PMID: 18567973
  • Alarcon RD, Libb JW, Spitler D. A predictive study of obsessive-compulsive disorder response to clomipramine. J Clin Psychopharmacol. 1993;13(3):210–213. PMID: 8354737
  • Ravizza L, Barzega G, Bellino S, Bogetto F, Maina G. Predictors of drug treatment response in obsessive-compulsive disorder. J Clin Psychiatry. 1995;56(8):368–373. PMID: 7635854
  • Landeros-Weisenberger A, Bloch MH, Kelmendi B. Dimensional predictors of response to SRI pharmacotherapy in obsessive–compulsive disorder. J Affect Disord. 2010;121(1–2):175–179. PMID: 19577308
  • Neziroglu F, Pinto A, Yaryura-Tobias JA, McKay D. Overvalued ideation as a predictor of fluvoxamine response in patients with obsessive–compulsive disorder. Psychiatry Res. 2004;125(1):53–60. PMID: 14967552
  • Ravi Kishore V, Samar R, Janardhan Reddy Y, Chandrasekhar CR, Thennarasu K. Clinical characteristics and treatment response in poor and good insight obsessive–compulsive disorder. Eur Psychiatry. 2004;19(4):202–208. PMID: 15196601
  • Alonso P, Menchon JM, Pifarre J. Long-term follow-up and predictors of clinical outcome in obsessive-compulsive patients treated with serotonin reuptake inhibitors and behavioral therapy. J Clin Psychiatry. 2001;62(7):535–540. PMID: 11488364
  • Mataix-Cols D, Marks IM, Greist JH, Kobak KA, Baer L. Obsessive-compulsive symptom dimensions as predictors of compliance with and response to behaviour therapy: results from a controlled trial. Psychother Psychosom. 2002;71(5):255–262. PMID: 12207105
  • Askland KD, Garnaat S, Sibrava NJ. Prediction of remission in obsessive compulsive disorder using a novel machine learning strategy. Int J Methods Psychiatr Res. 2015;24(2):156–169. PMID: 25994109
  • Ackerman DL, Greenland S, Bystritsky A, Morgenstern H, Katz RJ. Predictors of treatment response in obsessive-compulsive disorder: multivariate analyses from a multicenter trial of clomipramine. J Clin Psychopharmacol. 1994;14(4):247–254. PMID: 7962680
  • Skoog G, Skoog I. A 40-year follow-up of patients with obsessive-compulsive disorder. Arch Gen Psychiatry. 1999;56(2):121–127. PMID: 10025435
  • Rosario-Campos MC, Leckman JF, Mercadante MT. Adults with early-onset obsessive-compulsive disorder. Am J Psychiatry. 2001;158(11):1899–1903. PMID: 11691698
  • Fontenelle LF, Mendlowicz MV, Marques C, Versiani M. Early- and late-onset obsessive–compulsive disorder in adult patients: an exploratory clinical and therapeutic study. J Psychiatr Res. 2003;37(2):127–133. PMID: 12842166
  • Eisen JL, Pinto A, Mancebo MC, Dyck IR, Orlando ME, Rasmussen SA. A 2-year prospective follow-up study of the course of obsessive-compulsive disorder. J Clin Psychiatry. 2010;71(8):1033–1039. PMID: 20797381
  • Minichiello WE, Baer L, Jenike MA. Schizotypal personality disorder: a poor prognostic indicator for behavior therapy in the treatment of obsessive-compulsive disorder. J Anxiety Disord. 1987;1(3):273–276.
  • McDougle CJ, Goodman WK, Price LH. Neuroleptic addition in fluvoxamine-refractory obsessive-compulsive disorder. Am J Psychiatry. 1990;147(5):652–654. PMID: 1970224
  • Baer L. Factor analysis of symptom subtypes of obsessive compulsive disorder and their relation to personality and tic disorders. J Clin Psychiatry. 1994;55(Suppl):18–23.
  • Mundo E, Erzegovesi S, Bellodi L. Follow-up of obsessive-compulsive patients treated with proserotonergic agents. J Clin Psychopharmacol. 1995;15(4):288–289. PMID: 7593716
  • Shavitt RG, Belotto C, Curi M. Clinical features associated with treatment response in obsessive-compulsive disorder. Compr Psychiatry. 2006;47(4):276–281. PMID: 16769302
  • Fineberg NA, Hengartner MP, Bergbaum C, Gale T, Rössler W, Angst J. Remission of obsessive-compulsive disorders and syndromes; evidence from a prospective community cohort study over 30 years. Int J Psychiatry Clin Pract. 2013;17(3):179–187. PMID: 23428237
  • Marcks BA, Weisberg RB, Dyck I, Keller MB. Longitudinal course of obsessive-compulsive disorder in patients with anxiety disorders: a 15-year prospective follow-up study. Compr Psychiatry. 2011;52(6):670–677. PMID: 21349511
  • Catapano F, Perris F, Masella M. Obsessive–compulsive disorder: a 3-year prospective follow-up study of patients treated with serotonin reuptake inhibitors: OCD follow-up study. J Psychiatr Res. 2006;40(6):502–510. PMID: 16904424
  • Hollander E, Bienstock CA, Koran LM. Refractory obsessive-compulsive disorder: state-of-the-art treatment. J Clin Psychiatry. 2001;63(Suppl 6):20–29.
  • Steketee G, Eisen J, Dyck I, Warshaw M, Rasmussen S. Predictors of course in obsessive compulsive disorder. Psychiatry Res. 1999;89(3):229–238. PMID: 10708269
  • Steketee G, Van Noppen B. Family approaches to treatment for obsessive compulsive disorder. Rev Bras Psiquiatr. 2003;25(1):43–50. PMID: 12975679
  • Eisen JL, Sibrava NJ, Boisseau CL. Five-year course of obsessive-compulsive disorder: predictors of remission and relapse. J Clin Psychiatry. 2013;74(3):233–239. PMID: 23561228
  • Denys D, Burger H, van Megen H, de Geus F, Westenberg H. A score for predicting response to pharmacotherapy in obsessive–compulsive disorder. Int Clin Psychopharmacol. 2003;18(6):315–322. PMID: 14571151
  • Tükel R, Bozkurt O, Polat A, Genç A, Atli H. Clinical predictors of response to pharmacotherapy with selective serotonin reuptake inhibitors in obsessive–compulsive disorder. Psychiatry Clin Neurosci. 2006;60(4):404–409. PMID: 16884439
  • Nakatani E, Nakagawa A, Nakao T. A randomized controlled trial of Japanese patients with obsessive-compulsive disorder–effectiveness of behavior therapy and fluvoxamine. Psychother Psychosom. 2005;74(5):269–276. PMID: 16088264
  • Ackerman DL, Greenland S, Bystritsky A. Clinical characteristics of response to fluoxetine treatment of obsessive-compulsive disorder. J Clin Psychopharmacol. 1998;18(3):185–192. PMID: 9617976
  • D’alcante CC, Diniz JB, Fossaluza V. Neuropsychological predictors of response to randomized treatment in obsessive–compulsive disorder. Prog Neuropsychopharmacol Biol Psychiatry. 2012;39(2):310–317. PMID: 22789662
  • Ferrão YA, Shavitt RG, Bedin NR. Clinical features associated to refractory obsessive–compulsive disorder. J Affect Disord. 2006;94(1–3):199–209. PMID: 16764938
  • Farnam A, Goreishizadeh MA, Farhang S. Effectiveness of fluoxetine on various subtypes of obsessive-compulsive disorder. Arch Iran Med. 2008;11(5):522–525. PMID: 18759519
  • Cleophas TJ, Zwinderman AH, Cleophas-Allers HI. Machine Learning in Medicine. Berlin, Heidelberg: Springer; 2013.
  • Hoexter MQ, Miguel EC, Diniz JB, Shavitt RG, Busatto GF, Sato JR. Predicting obsessive–compulsive disorder severity combining neuroimaging and machine learning methods. J Affect Disord. 2013;150(3):1213–1216. PMID: 23769292
  • Rajezi Esfahani S, Motaghipour Y, Kamkari K, Zahiredin A, Janbozorgi M. Reliability and validity of the Persian version of the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS). Iran J Psychiatry Clin Psychol. 2012;17(4):297–303.
  • Corregiari FM, Bernik M, Cordeiro Q, Vallada H. Endophenotypes and serotonergic polymorphisms associated with treatment response in obsessive-compulsive disorder. Clinics (Sao Paulo). 2012;67(4):335–340. PMID: 22522758
  • Hasanpour H, Asadi S, Ghavamizadeh Meibodi R. A critical appraisal of heterogeneity in obsessive-compulsive disorder using symptom-based clustering analysis. Asian J Psychiatr. 2017;28:89–96. PMID: 28784407
  • Acuna E, Rodriguez C. The treatment of missing values and its effect on classifier accuracy. In: Banks D, McMorris FR, Arabie P, Gaul W, editors. Classification, Clustering, and Data Mining Applications. Berlin, Heidelberg: Springer; 2004:639–647.
  • Kohavi R, John GH. Wrappers for feature subset selection. Artif Intell. 1997;97(1–2):273–324.
  • Bolón-Canedo V, Sánchez-Maroño N, Alonso-Betanzos A. A review of feature selection methods on synthetic data. Knowl Inf Syst. 2013;34(3):483–519.
  • Saeys Y, Inza I, Larrañaga P. A review of feature selection techniques in bioinformatics. Bioinformatics. 2007;23(19):2507–2517. PMID: 17720704
  • Model F, Adorjan P, Olek A, Piepenbrock C. Feature selection for DNA methylation based cancer classification. Bioinformatics. 2001;17(Suppl 1):157–164.
  • Xiong M, Fang X, Zhao J. Biomarker identification by feature wrappers. Genome Research. 2001;11(11):1878–1887. PMID: 11691853
  • Peng H, Long F, Ding C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intel. 2005;27(8):1226–1238.
  • Ding C, Peng H. Minimum redundancy feature selection from microarray gene expression data. J Bioinform Comput Biol. 2005;3(2):185–205. PMID: 15852500
  • Brown G, Pocock A, Zhao MJ, Luján M. Conditional likelihood maximisation: a unifying framework for information theoretic feature selection. J Mach Learn Res. 2012;13(1):27–66.
  • Ahn H, Moon H, Fazzari MJ, Lim N, Chen JJ, Kodell RL. Classification by ensembles from random partitions of high-dimensional data. J Comput Stat Data Anal. 2007;51(12):6166–6179.
  • Rokach L. Pattern Classification Using Ensemble Methods. Vol 75. Singapore: World Scientific Publishing Company; 2010.
  • Tan AC, Gilbert D, Deville Y. Multi-class protein fold classification using a new ensemble machine learning approach. Genome Inform. 2003;14:206–217. PMID: 15706535
  • Mangiameli P, West D, Rampal R. Model selection for medical diagnosis decision support systems. Decis Support Syst. 2004;36(3):247–259.
  • Özçift A. Random forests ensemble classifier trained with data resampling strategy to improve cardiac arrhythmia diagnosis. Comput Biol Med. 2011;41(5):265–271. PMID: 21419401
  • Breiman L. Bagging predictors. Mach Learn. 1996;24(2):123–140.
  • Schapire RE. The strength of weak learnability. Mach Learn. 1990;5(2):197–227.
  • Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
  • Lam L, Suen S. Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Trans Syst Man Cybern A Syst Hum. 1997;27(5):553–568.
  • Zenobi G, Cunningham P. Using diversity in preparing ensembles of classifiers based on different feature subsets to minimize generalization error. In: European Conference on Machine Learning; 2001 Sep 5. Berlin, Heidelberg: Springer; 576–587.
  • Bryll R, Gutierrez-Osuna R, Quek F. Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets. Pattern Recogn. 2003;36(6):1291–1302.
  • Turner K, Oza NC. Decimated input ensembles for improved generalization. In: Neural Networks, 1999. IJCNN’99. International Joint Conference on. 1999;5:3069–3074.
  • De Jong D. An Analysis of the Behavior of a Class of Genetic Adaptive Systems [PhD thesis]. Ann Arbor: Department of Computer and Communication Sciences, University of Michigan; 1975.
  • Hornik K, Stinchcombe M, White H. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Net. 1990;3(5):551–560.
  • Cover T, Hart P. Nearest neighbor pattern classification. IEEE Trans Inf Theory. 1967;13(1):21–27.
  • Vapnik V. The Nature of Statistical Learning Theory. Berlin, Heidelberg: Springer Science & Business Media; 2013.
  • Vapnik VN, Vapnik V. Statistical Learning Theory. Vol 1. New York: Wiley; 1998.
  • Xing Y, Wang J, Zhao Z. Combination data mining methods with new medical data to predicting outcome of coronary heart disease. In: Convergence Information Technology, 2007. International Conference on. 2007 Nov 21:868–872.
  • Sartakhti JS, Zangooei MH, Mozafari K. Hepatitis disease diagnosis using a novel hybrid method based on support vector machine and simulated annealing (SVM-SA). Comput Methods Programs Biomed. 2012;108(2):570–579. PMID: 21968203
  • Quinlan JR. C4.5: Programs for Machine Learning. San Mateo: Morgan Kaufmann; 1993.
  • Khalilia M, Chakraborty S, Popescu M. Predicting disease risks from highly imbalanced data using random forest. BMC Med Inform Decis Mak. 2011;11(1):51. PMID: 21801360
  • Yang F, Wang HZ, Mi H, Lin CD, Cai WW. Using random forest for reliable classification and cost-sensitive learning for medical diagnosis. BMC Bioinformatics. 2009;10(Suppl 1):S22.
  • Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inform Proces Manage. 2009;45(4):427–437.
  • Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI. 1995;14(2):1137–1145.
  • Seni G, Elder JF. Ensemble methods in data mining: improving accuracy through combining predictions. Synth Lect Data Mining Knowl Discov. 2010;2(1):1–126.
  • Mataix-Cols D, Wooderson S, Lawrence N, Brammer MJ, Speckens A, Phillips ML. Distinct neural correlates of washing, checking, and hoarding symptom dimensions in obsessive-compulsive disorder. Arch Gen Psychiatry. 2004;61(6):564–576. PMID: 15184236
  • Gilbert AR, Mataix-Cols D, Almeida JR. Brain structure and symptom dimension relationships in obsessive–compulsive disorder: a voxel-based morphometry study. J Affect Disord. 2008;109(1):117–126. PMID: 18342953
  • Alvarenga PG, do Rosário MC, Batistuzzo MC. Obsessive-compulsive symptom dimensions correlate to specific gray matter volumes in treatment-naïve patients. J Psychiatr Res. 2012;46(12):1635–1642. PMID: 23040160
  • Shapira NA, Liu Y, He AG. Brain activation by disgust-inducing pictures in obsessive-compulsive disorder. Biol Psychiatry. 2003;54(7):751–756. PMID: 14512216
  • van den Heuvel OA, Remijnse PL, Mataix-Cols D. The major symptom dimensions of obsessive-compulsive disorder are mediated by partially distinct neural systems. Brain. 2008;132(Pt 4):853–868. PMID: 18952675
  • Murayama K, Nakao T, Sanematsu H. Differential neural network of checking versus washing symptoms in obsessive-compulsive disorder. Prog Neuropsychopharmacol Biol Psychiatry. 2013;40:160–166. PMID: 22996045