
ParkinsoNET: Estimation of UPDRS Score Using Hubness-Aware Feedforward Neural Networks


ABSTRACT

Parkinson’s disease is a frequent neurodegenerative disorder with increasing incidence worldwide. Speech disturbance appears during the progression of the disease. The Unified Parkinson’s Disease Rating Scale (UPDRS) is a gold-standard tool for diagnosis and follow-up of the disease. We aim to estimate the UPDRS score based on biomedical voice recordings. In this article, we study the hubness phenomenon in the context of UPDRS score estimation and propose a hubness-aware error correction for feedforward neural networks to increase the accuracy of estimation. We perform experiments on publicly available datasets derived from real-voice data and show that the proposed technique systematically increases the accuracy of various feedforward neural networks.

Introduction

Parkinson’s disease (PD) is one of the most important neurodegenerative disorders and is increasing in incidence. PD affects 7 to 10 million people worldwide. Clinically, PD is characterized by cardinal symptoms: initially unilateral, asymmetrical resting tremor (shaking of the hand), rigidity and bradykinesia (slow movement). In addition to these symptoms (motor disturbances), the disorder is associated with nonmotor, neuropsychiatric symptoms such as cognitive impairment, autonomic dysregulation, and sleep problems (Crosiers et al. Citation2011).

Total PD-related costs are estimated as $25 billion per year in the United States alone, and “medication costs for an individual person with PD average $2,500 a year, and therapeutic surgery can cost up to $100,000 dollars per patient.” As noted by De Rijk et al. (Citation1997) and De Lau and Breteler (Citation2006), the prevalence increases with age, and the increasing importance of PD is underlined by the fact that “most economically developed and many developing countries are experiencing marked demographic shifts, with progressively larger proportions of their populations entering old age” (Pringsheim et al. Citation2014).

The UPDRS is the most commonly used scale in the clinical study of PD (Ramaker et al. Citation2002). Roughly speaking, the UPDRS score of a particular patient describes the severity of the disease in the case of that patient (see the references in the following subsection for more details on UPDRS). The UPDRS score might change over time, indicating the success of treatment or the progression of the disease. Ideally, the UPDRS score would be measured regularly and relatively often in order to provide medical doctors, the patient, and his/her relatives with detailed information about the progression of the disease and to contribute to the patient’s awareness of the disease, which is one of the most relevant factors influencing the efficiency of the treatment. However, because the assessment of the UPDRS score requires notable effort and the available capacity of medical personnel is a bottleneck, under realistic conditions, the total UPDRS score is measured with a relatively low frequency, e.g., at the beginning of the treatment and after several months.

Little et al. (Citation2009), Tsanas et al. (Citation2010), and Sakar et al. (Citation2013) have shown that the UPDRS score is related to various characteristics of the voice; thus, at least in theory, it could be estimated based on a patient’s speech while he/she makes telephone or Skype calls using his/her smartphone or tablet. With our current study, we would like to take a step toward this visionary application which, in the long term, is expected to allow continuous monitoring of the patient’s UPDRS score and almost immediate identification of its substantial changes.

One of the major challenges associated with the aforementioned visionary application is the fact that the exact function describing how the UPDRS score depends on (the combination of) quantifiable characteristics of speech is unknown. Therefore, state-of-the-art solutions for the estimation of UPDRS scores from voice data are based on machine learning (Sakar et al. Citation2013; Tsanas et al. Citation2011). Following the machine learning paradigm, audio recordings and the corresponding UPDRS scores at the time of recording may be collected from a large set of patients. Such data allows machine learning approaches to “discover” the dependency between the characteristics of the voices and the UPDRS scores so that the UPDRS scores of “new” patients can be estimated based on their speech.

Because artificial neural networks (ANNs) are known to be universal approximators (Pang-Ning, Steinbach, and Kumar Citation2006), we base our solution on ANNs. In particular, after studying the hubness phenomenon and the presence of bad hubs in the context of the UPDRS score estimation in a later section, we propose hubness-aware error correction for ANNs in order to increase the accuracy of estimation. To the best of our knowledge, the current work is the first attempt to exploit hubness in the context of ANNs. We perform experiments on publicly available datasets derived from real-voice data and show that the proposed error correction technique systematically increases the accuracy of various feedforward neural networks.

Background

In order to ensure that this article is self-contained, we provide the most relevant background information about the UPDRS in the following subsection; we then review related works and introduce the basic definitions used throughout the article in the two subsequent subsections.

Unified Parkinson’s disease rating scale

The Unified Parkinson’s Disease Rating Scale (UPDRS) was developed to provide a comprehensive coverage of the symptoms, in order to allow for clinical examination and follow-up of the progression of the disease. Today it serves as the gold-standard reference scale.

The scale has four parts. Part I (previously titled Mentation) was designed to assess nonmotor experiences of daily living. Part II (previously called Activities of Daily Living) assesses motor experiences of daily living. Part III (a.k.a. the Motor part) contains the examination of the patient’s motor skills, and Part IV (titled Complications) considers motor complications.

The aforementioned parts of the UPDRS are measured at different frequencies; for example, according to Goetz, Poewe, and Rascol (Citation2003), Part III was used in 98% of the cases, whereas Part I was used in only 60% of the cases. For more details about the UPDRS score, the reader is referred to Goetz et al. (Citation2008).

Alteration of speech is a well-known symptom of PD: about 70% of PD patients exhibit speech impairment (Hartelius and Svensson Citation1994; Logemann et al. Citation1978). Speech disturbances are represented in Parts II and III of the UPDRS. Speech disturbances in PD are characterized by hypophonia, hypokinetic dysarthria, palilalia, and speech dysfluency. With the progression of the disease, speech is known to worsen due to the involvement of the speech organs. Moreover, the “positive effect of L-dopa treatment on speech disorders could be objectively confirmed” (Pawlukowska et al. Citation2015). In this work, we aim to estimate patients’ UPDRS scores from voice measures.

Related works

Machine learning techniques are widely applied for medical tasks (Cyganek and Wozniak Citation2015; Grana et al. Citation2011; Froelich, Wrobel, and Porwik Citation2015).

Because we formalize the task of automated UPDRS score estimation as a regression task, when reviewing related works we focus on regression, which is one of the most prominent fields of machine learning with various applications in medicine (Busra Celikkaya et al. Citation2013; Soyiri, Reidpath, and Sarran Citation2013). In recent decades, various regression techniques have been developed, ranging from simple linear and polynomial regression, through nearest neighbor regression, to more complex models, such as artificial neural networks (ANNs) and support vector regression (Devroye et al. Citation1994; Adamczak, Porollo, and Meller Citation2004; Basak, Pal, and Patranabis Citation2007).

One of the most interesting recent observations is the presence of hubs in various datasets. Informally, hubs are instances that are similar to a surprisingly high number of other instances. Unfortunately, some of the hubs are bad in the sense that they may mislead machine learning algorithms. The presence of hubs has been studied and surveyed primarily in the context of classification, clustering, and instance selection (Radovanović, Nanopoulos, and Mirjana Citation2010a, Citation2010b; Tomašev and Dunja Citation2013; Radovanović, Nanopoulos, and Ivanović Citation2009; Tomašev et al. Citation2011; Tomašev, Radovanović, et al. 2015; Buza, Nanopoulos, and Schmidt-Thieme Citation2011; Tomašev, Buza, et al. Citation2015).

To the best of our knowledge, Buza, Nanopoulos, and Nagy (Citation2015) were the first to study the presence of hubs in regression tasks. They focused on nearest neighbor regression and considered various applications, whereas in the subsequent sections of this article, we study the role of hubs in the estimation of the UPDRS score and propose a hubness-aware enhancement of ANNs.

Definitions and notations

A dataset Δ containing $n$ instances is given. In our case, each instance corresponds to an audio recording. Numeric features describing characteristic properties of the voice are extracted; therefore, each instance is a vector of such features. Instances are denoted by $x_1, \dots, x_n$. For each instance $x_i$, the value of the continuous target, i.e., the UPDRS score, is given and is denoted by $y(x_i)$. We say that $y(x_i)$ is the label of instance $x_i$ and that Δ is the training dataset. By regression, we mean the task of predicting (estimating) the label of an instance $x \notin \Delta$.

We use $d(x_i, x_j)$ to denote the distance between two instances $x_i$ and $x_j$. In order to study the hubness phenomenon, we will use the notion of the $k$-nearest neighbors of an instance $x$, denoted by $\mathcal{N}_k^{\Delta}(x)$, which is a subset of $\Delta \setminus \{x\}$ such that $|\mathcal{N}_k^{\Delta}(x)| = k$ and $d(x, x') \le d(x, x'')$ for every $x' \in \mathcal{N}_k^{\Delta}(x)$ and every $x'' \in \Delta \setminus \left( \mathcal{N}_k^{\Delta}(x) \cup \{x\} \right)$.

We may omit the upper index Δ whenever there is no ambiguity. We note that ties may be broken arbitrarily, i.e., in the case that several subsets fulfill the above condition, any of them may be used as the set of nearest neighbors.

Bad hubs in UPDRS score estimation

Informally, hubness in datasets refers to the phenomenon that some instances are similar to a surprisingly large number of other instances. In order to quantitatively study hubness in context of UPDRS score estimation from voice data, we use the notion of k-nearest neighbors.

Let us first note that the k-nearest neighbor relationship is asymmetric: although each instance has k nearest neighbors, an instance does not necessarily appear k times as one of the k-nearest neighbors of other instances. This is illustrated in Figure 1(a) for k = 1. In order to keep the example simple, we consider two-dimensional vector data; therefore, instances correspond to points of the plane. In the context of UPDRS score estimation from speech data, we may imagine a simple scenario in which two numeric features of the audio signals (such as shimmer and jitter) are extracted and we use only these two features to represent the data. Each of these features may correspond to either the horizontal or the vertical axis; thus, the audio recordings may be mapped to points in the plane.

Figure 1. (a) The nearest neighbor relationship is asymmetric. Some instances never appear as the first nearest neighbor of other instances, although there are some instances that appear frequently as the first nearest neighbor of other instances. (b) Example used to illustrate error correction.


In Figure 1(a), there is a directed edge from each instance (denoted by a circle) to its first nearest neighbor. Whereas each instance has exactly one first nearest neighbor, the number of times an instance appears as the first nearest neighbor of other instances (i.e., the number of incoming edges to an instance) is not necessarily one. As can be seen, some of the instances never appear as nearest neighbors of others, and there is an instance that appears as the first nearest neighbor of three other instances. In particular, the integer next to each instance shows how many times it appears as the first nearest neighbor of others.

Generally, we use $N_k(x)$ to denote how many times the instance $x$ appears as one of the $k$-nearest neighbors of other instances of Δ. It is easy to see that the expected value of $N_k(x)$ is $k$; however, the actual value of $N_k(x)$ varies from instance to instance. As shown in previous work (Radovanović et al. Citation2010a; Buza et al. Citation2011; Tomašev and Dunja Citation2013), in many cases the distribution of $N_k(x)$ is substantially skewed to the right, i.e., there are a few instances with extraordinarily high $N_k(x)$ values; furthermore, the skewness increases with increasing intrinsic dimensionality of the data. Usually, instances having surprisingly high $N_k(x)$ are called hubs, and instances with exceptionally low $N_k(x)$ are called antihubs. More precisely, we say that an instance $x$ is a hub if $N_k(x) > 2k$, whereas antihubs are instances that appear as nearest neighbors of other instances only rarely or never. The phenomenon that $N_k(x)$ is skewed is called hubness, and it is often quantified by the third standardized moment (skewness) of the distribution of $N_k(x)$.
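
To make the notion concrete, the following Python sketch (not taken from the original article) computes the $k$-occurrence counts $N_k(x)$ for a feature matrix under the Euclidean distance and measures the skewness of their distribution; the data used here is a random stand-in.

```python
import numpy as np
from scipy.stats import skew
from sklearn.neighbors import NearestNeighbors

def k_occurrence(X, k=10):
    """N_k(x) for every instance: how many times it occurs among the k nearest
    neighbors (Euclidean distance) of the other instances of the dataset."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)   # k + 1: each point is its own 0-th neighbor
    _, idx = nn.kneighbors(X)
    counts = np.zeros(len(X), dtype=int)
    for neighbors in idx:
        for j in neighbors[1:]:                        # skip the query point itself
            counts[j] += 1
    return counts

# Random stand-in data: the mean of N_k(x) always equals k, but the distribution can be skewed.
X = np.random.rand(500, 16)
N_k = k_occurrence(X, k=10)
print("mean N_k:", N_k.mean(), " skewness:", skew(N_k))
```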

In order to show that there are instances that might mislead machine learning models, we perform the following analysis on the telemonitoring and multiple sound recording datasets, both of which contain voice data for UPDRS estimation. (The datasets are described in a later section in more detail.) We considered both estimation tasks (total and motor UPDRS score) associated with the telemonitoring data separately. For each instance $x$, we calculate its error $e(x)$ as the average absolute difference between the label of $x$ (i.e., the UPDRS score associated with $x$) and the labels of those instances that have $x$ as one of their $k$-nearest neighbors. Formally, let

$\mathcal{R}_k(x) = \{\, x_i \in \Delta \mid x \in \mathcal{N}_k(x_i) \,\}$   (1)

Figure 2. The distribution of $N_{10}(x)$ in the case of motor UPDRS scores of the telemonitoring dataset for (a) low-error instances, (b) high-error instances, and (c) both histograms in the same plot. Similar observations can be made for the multiple sound recording dataset and total UPDRS scores of the telemonitoring dataset as well. Note that some of the high-error instances appear as nearest neighbors of many other instances, i.e., there are bad hubs in the data. Remarkably, the distribution of high-error instances is shifted to the right compared with the distribution of low-error instances. This indicates that there are more high-error hubs than low-error hubs.


then

$e(x) = \frac{1}{|\mathcal{R}_k(x)|} \sum_{x_i \in \mathcal{R}_k(x)} \left| y(x) - y(x_i) \right|$   (2)

After calculating the above error for each instance, we ordered the instances according to their errors and selected the 25% of the instances having the highest error and another 25% having the lowest error. We call these instances high-error instances and low-error instances, respectively. Throughout the analysis, we used k = 10 and the Euclidean distance over all biomedical voice features present in the datasets. Figure 2 shows the distributions of $N_{10}(x)$ for low- and high-error instances of the telemonitoring dataset in the case of motor UPDRS scores. In the figure, the horizontal axis corresponds to $N_{10}(x)$, while the height of each column shows how many instances have that particular value of $N_{10}(x)$.

As one can see, the distributions of $N_{10}(x)$ are notably skewed. Most importantly, some of the high-error instances appear as nearest neighbors of many other instances. In particular, as we defined hubs as instances that appear as nearest neighbors of more than 2k instances, we observe that there are hubs among the high-error instances. We use the term bad hubs to refer to hubs among high-error instances. Additionally, let us note that the distribution of high-error instances is shifted to the right compared with the distribution of low-error instances. This indicates that there are more bad hubs than low-error (or good) hubs.
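
The analysis described above can be reproduced along the following lines. This is a minimal Python sketch (not the authors' code) using random stand-ins for the voice features and UPDRS scores; its convention of assigning zero error to instances that never appear as nearest neighbors is an assumption rather than the article's exact definition.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def reverse_neighbors(X, k=10):
    """R_k(x): for each instance, the indices of the instances that have it as a k-NN."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    rev = [[] for _ in range(len(X))]
    for i, neighbors in enumerate(idx):
        for j in neighbors[1:]:                      # skip the instance itself
            rev[j].append(i)
    return rev

# Random stand-ins for the voice features and UPDRS scores.
X, y = np.random.rand(500, 16), 50 * np.random.rand(500)
rev = reverse_neighbors(X, k=10)
N_k = np.array([len(r) for r in rev])
# e(x): mean absolute label difference to the reverse neighbors; instances that never
# appear as nearest neighbors are left at 0 here (they cannot mislead their neighbors).
e = np.array([np.mean(np.abs(y[r] - y[i])) if r else 0.0 for i, r in enumerate(rev)])

# Compare N_10(x) between the 25% lowest-error and 25% highest-error instances.
low = N_k[e <= np.quantile(e, 0.25)]
high = N_k[e >= np.quantile(e, 0.75)]
print("mean N_10, low-error:", low.mean(), " high-error:", high.mean())
```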

Hubs tend to be located in dense regions of the data space; according to recent results, they might even serve as cluster centers (Tomašev et al. Citation2015b). Under the assumption that the model will be applied to instances originating from the same (or at least similar) distribution as the distribution from which the training data originates, it is essential for any regressor to perform well on instances being “close to” hubs, because many of the new/test instances are expected to be located exactly in these regions, i.e., in the proximity of hubs. Therefore, in the next section, we devise a mechanism that is able to compensate for the detrimental effect of high-error instances, including high-error instances located at “central” positions, i.e., bad hubs.

Hubness-aware artificial neural networks

Next, we describe error correction, a mechanism that can be used to improve the performance of ANNs. We define the corrected label of an instance x as

$y^{c}(x) = \frac{1}{|\mathcal{R}_k(x)|} \sum_{x_i \in \mathcal{R}_k(x)} y(x_i)$   (3)

where $\mathcal{R}_k(x)$ denotes the set of instances that have $x$ as one of their $k$-nearest neighbors; see Equation (1) for the formal definition of $\mathcal{R}_k(x)$. We propose to use the corrected labels instead of the original labels while training ANNs. Although our current work focuses on ANNs, we note that, in principle, the above error correction technique may be used with various other regressors as well.

Using the example in Figure 1(b), we illustrate how the corrected labels are calculated. In Figure 1(b), training instances are denoted by circles and identified by symbols; the numeric value next to each instance shows its label. In order to keep the example simple, we use k = 1 to calculate the corrected labels of the training instances. For training ANNs, the corrected labels of all the training instances need to be calculated; however, we present the calculations only for two of the instances because the procedure is the same in the case of the other instances as well. Concretely, the corrected labels of these two instances are obtained by applying Equation (3) to the labels shown in the figure.
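
A minimal Python sketch of this error-correction step is given below. It assumes that the corrected label is the mean label of the reverse neighbors $\mathcal{R}_k(x)$ and that the original label is kept when $\mathcal{R}_k(x)$ is empty; the helper names are illustrative and not taken from the article.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def corrected_labels(X, y, k=10):
    """Return corrected training labels: each label is replaced by the mean label of the
    instances that have it among their k nearest neighbors (Euclidean distance)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    rev = [[] for _ in range(len(X))]          # R_k(x) for every training instance
    for i, neighbors in enumerate(idx):
        for j in neighbors[1:]:                # skip the instance itself
            rev[j].append(i)
    y_c = np.asarray(y, dtype=float).copy()
    for i, r in enumerate(rev):
        if r:                                  # bad hubs get "smoothed" labels here
            y_c[i] = np.mean(np.asarray(y)[r])
    return y_c

# The ANN (or any other regressor) is then trained on (X, y_c) instead of (X, y),
# while evaluation still uses the original labels of the test instances.
```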

Experiments

In this section, we present the results of our experimental evaluation of the proposed approach on two real-world speech datasets associated with UPDRS scores as prediction targets.

Datasets

The Parkinson’s Telemonitoring dataset (Little et al. Citation2009; Tsanas et al. Citation2010) “is composed of a range of biomedical voice measurements from 42 people with early-stage Parkinson’s disease recruited to a six-month trial of a telemonitoring device for remote symptom-progression monitoring.” In total, the data contains 5875 instances. Both motor (i.e., part III) and total (i.e., all four parts) UPDRS scores as well as temporal information (i.e., on which day the measurements were taken) are available. We use motor to denote the experiments when the motor UPDRS score was used as target; analogously, total denotes the experiments when the total UPDRS score was used as target.

We performed experiments on the Parkinson Speech Dataset with Multiple Types of Sound Recordings as well, to which we refer as multi for simplicity (Sakar et al. Citation2013). This dataset contains 1040 instances.

Both datasets are available in the UCI Machine Learning repository (Bache and Lichman Citation2013). For both datasets, we used jitter and shimmer features.

Experimental protocol

We performed experiments according to the patient-based 10 × 10-fold cross-validation protocol, i.e., in each round of the 10 × 10-fold cross-validation, all the instances belonging to the same patient appear either in the training split or in the test split, but never in both. This simulates the medically relevant scenario in which historical data is used to train the model, which is then applied to the estimation of UPDRS scores of new patients.
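
A patient-based split of this kind can be obtained, for example, with scikit-learn's GroupKFold, as in the following sketch; the `patient_id` array and the toy data are assumptions for illustration, not part of the original experimental code.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def patient_based_folds(X, y, patient_id, n_splits=10):
    """Yield (train_idx, test_idx) pairs of one patient-based 10-fold round:
    all recordings of a patient end up in the same fold."""
    gkf = GroupKFold(n_splits=n_splits)
    yield from gkf.split(X, y, groups=patient_id)

# Toy demo with random stand-ins for the telemonitoring data.
X, y = np.random.rand(200, 16), 50 * np.random.rand(200)
patient_id = np.repeat(np.arange(20), 10)            # 20 "patients", 10 recordings each
for train_idx, test_idx in patient_based_folds(X, y, patient_id):
    # No patient ever appears in both the training and the test split.
    assert set(patient_id[train_idx]).isdisjoint(patient_id[test_idx])
# Repeating this with permuted patient assignments gives the 10 x 10-fold protocol.
```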

Because temporal information was available in the telemonitoring dataset, an additional estimation task that may be associated with the data is to estimate the change of UPDRS score relative to its initial value. Because it might be highly relevant to identify substantial changes in the UPDRS score of a patient as early as possible, we also evaluated the proposed technique in the UPDRS score change estimation context. These experiments are denoted as change motor and change total, respectively.

In all the aforementioned experiments, features were normalized by subtracting their mean and dividing by their standard deviation. Mean and standard deviation were calculated on the training subset of the data (therefore, the normalization was performed in each round of the 10 × 10-fold cross-validation).
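
A minimal sketch of this per-fold normalization is shown below; the function name is illustrative, and, as described above, the statistics come from the training split only and are then applied to both splits.

```python
import numpy as np

def standardize(X_train, X_test):
    """Standardize features using the mean and standard deviation of the training split."""
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    std = np.where(std == 0, 1.0, std)      # guard against constant features
    return (X_train - mean) / std, (X_test - mean) / std
```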

In our experiments, we used the implementation of ANNs from the Weka software package (Witten and Frank Citation2005). In particular, we used feedforward ANNs trained with backpropagation with a learning rate of 0.3 and a momentum of 0.2, and the number of training epochs was set to 500. In order to show that the proposed approach is indeed able to improve the performance of neural networks, we used neural networks of the same structure both with and without the error correction technique described previously. While performing error correction, we used the Euclidean distance and a fixed default $k$-value to calculate the nearest neighbor relationship (the effect of varying $k$ is examined at the end of this section). Error correction was performed only on the training set, i.e., in both cases (with and without error correction) we aimed to predict the same target.

We use a comma-separated list of integers to denote the structure of ANNs: each item of the list corresponds to one of the hidden layers and it denotes the number of neurons in that layer. For example, “Net-5,5” denotes an ANN with two hidden layers, each of them containing 5 neurons. We performed experiments with ANNs of six different structures: one hidden layer with (i) 5, (ii) 10, and (iii) 20 neurons and two hidden layers each of them having (iv) 5, (v) 10, and (vi) 20 neurons respectively.
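
For readers who wish to reproduce a comparable setup outside Weka, the following sketch configures a rough scikit-learn analogue of these networks; it is an approximation (for instance, MLPRegressor defaults to ReLU units, whereas Weka's multilayer perceptron uses sigmoid units), not the implementation used in the article.

```python
from sklearn.neural_network import MLPRegressor

def make_net(hidden_layers=(5,)):
    """Feedforward network trained with plain SGD: learning rate 0.3, momentum 0.2,
    500 training epochs, mirroring the settings reported above."""
    return MLPRegressor(hidden_layer_sizes=hidden_layers,
                        solver="sgd",
                        learning_rate_init=0.3,
                        momentum=0.2,
                        max_iter=500)

net_5 = make_net((5,))          # "Net-5": one hidden layer with 5 neurons
net_10_10 = make_net((10, 10))  # "Net-10,10": two hidden layers with 10 neurons each
```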

Performance metrics

We measured the performance of our approach and the baselines in terms of mean absolute error (MAE) and root mean square error (RMSE):

$\mathrm{MAE} = \frac{1}{|\mathcal{T}|} \sum_{x \in \mathcal{T}} \left| \hat{y}(x) - y(x) \right|, \qquad \mathrm{RMSE} = \sqrt{ \frac{1}{|\mathcal{T}|} \sum_{x \in \mathcal{T}} \left( \hat{y}(x) - y(x) \right)^{2} },$

where $\mathcal{T}$ and $|\mathcal{T}|$ denote the test set and its size, respectively, $\hat{y}(x)$ denotes the label predicted for instance $x$, while $y(x)$ denotes the true label of instance $x$. We used paired calibrated t-tests proposed by Bouckaert (Citation2003) at significance level of 0.05 to examine if the ANNs with error correction statistically significantly outperform ANNs without error correction. For simplicity, we report results only for MAE, but we note that we observed similar trends for RMSE as well.
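
For completeness, the two metrics can be computed as in the following short sketch.

```python
import numpy as np

def mae(y_true, y_hat):
    """Mean absolute error between true and predicted UPDRS scores."""
    return float(np.mean(np.abs(np.asarray(y_hat) - np.asarray(y_true))))

def rmse(y_true, y_hat):
    """Root mean square error between true and predicted UPDRS scores."""
    return float(np.sqrt(np.mean((np.asarray(y_hat) - np.asarray(y_true)) ** 2)))
```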

Results

In each of the five different contexts (i.e., motor, total, multi, change motor, and change total), we examined six types of neural networks, which gave 30 experiments in total. In each of these 30 experiments, we compared neural networks using error correction (+EC) and neural networks without error correction (–EC).

Table 1 summarizes the results. We report the MAE averaged over the 10 × 10 folds. As one can see, the proposed error-correction mechanism systematically improves the quality of UPDRS score estimation. In particular, in all of the aforementioned 30 experiments, the neural networks with error correction outperformed the neural networks without error correction. The improvement is statistically significant in 24 experiments.

Table 1. Mean absolute error averaged over 10 × 10 folds with (+EC) and without error correction (–EC). The MAE-value is underlined if the difference is statistically significant.

As noted previously, error correction may be used with various models. Therefore, we examined the effect of error correction on other regression techniques as well. Figure 3(a) shows the performance of various models in the case of estimating the motor UPDRS score with and without error correction. In particular, we consider: (1) Net-5, a feedforward artificial neural network with one hidden layer containing five neurons, (2) k-NN, k-nearest neighbor regression with k = 10, (3) M5P, regression trees from the aforementioned Weka machine learning library, and (4) SVM, support vector regression with a linear kernel. As one can see, error correction is able to improve the performance of these models as well.
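
The following sketch illustrates how such a comparison can be set up with off-the-shelf regressors. It reuses the `corrected_labels` and `mae` helpers from the earlier sketches, assumes `X_train, y_train, X_test, y_test` come from one patient-based cross-validation round, and uses DecisionTreeRegressor and LinearSVR merely as stand-ins for Weka's M5P and the linear-kernel support vector regressor used in the article.

```python
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import LinearSVR
from sklearn.tree import DecisionTreeRegressor

models = {
    "k-NN (k=10)": KNeighborsRegressor(n_neighbors=10),
    "regression tree": DecisionTreeRegressor(),
    "SVM (linear)": LinearSVR(),
}

# Labels are corrected on the training set only; test labels stay untouched.
y_train_ec = corrected_labels(X_train, y_train, k=10)
for name, model in models.items():
    for tag, labels in (("-EC", y_train), ("+EC", y_train_ec)):
        pred = model.fit(X_train, labels).predict(X_test)
        print(f"{name} {tag}: MAE = {mae(y_test, pred):.2f}")
```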

Figure 3. (a) Performance (MAE) of various models when predicting the motor UPDRS score with (+EC) and without (–EC) error correction. (b) Performance of Net-5 with error correction (+EC) using various $k$-values, and without error correction (–EC).


As described previously, in order to perform error correction, the $k$-nearest neighbor relationships are computed first. Figure 3(b) shows the performance of Net-5 when performing error correction with various $k$-values. As can be seen, error correction systematically improves the performance of the model for all the examined $k$-values, and $k$-values between 5 and 10 are preferable.

Discussion

During spontaneous usage of tablets, smartphones, and laptops, these devices are able to capture unprecedented amounts and varieties of data about their users. The gathered information might range from the dynamics of typing recorded while the user writes short messages or e-mails, over voice recordings made during phone or Skype calls, to GPS coordinates and social connectivity features (such as how many people the user regularly writes messages to). It is hypothesized that such information might be related to various disorders or predict unexpected changes in the user’s health condition (Estrin Citation2013). Given the relatively high and increasing computational power of smartphones and tablets, processing and analysis of the data collected during spontaneous usage has become technologically possible. This is expected to give rise to visionary health-care applications that support medical doctors in diagnostic decisions and the treatment of diseases.

Although much of the research focuses on understanding the background of PD (Zimprich et al. Citation2003; Balicza et al. Citation2012), in this article we focused on the estimation of the UPDRS score from biomedical voice measurements. Because the user’s voice can simply be recorded during spontaneous interactions with tablets or smartphones, we strongly hope that our work is a step toward the usage of tablets and smartphones in the strict follow-up of PD. Such visionary applications, once realized, will not only allow for inexpensive and continuous monitoring of patient status; the resulting data is also expected to increase the patients’ and their relatives’ awareness of the disease, which is one of the key factors of successful treatment of the disease.

We hope that, over the long term, similar approaches will be used for the diagnosis and monitoring of other diseases as well. For example, Estrin (Citation2013) reported that a reduction of hearing abilities could be detected by examining the volume at which the user listened to music on his smartphone. This increased the patient’s awareness of the disease and convinced the patient to turn to medical doctors for examination. Dementia and cognitive impairment (Bereczki and Szatmári Citation2009) constitute another domain in which telemonitoring systems might be advantageous over the long term.

With regard to the limitations of our study, we have to mention that it is likely that the accuracy of UPDRS score estimation needs to be increased further for successful applications. Therefore, incorporation of further data mining techniques, such as monotonization (Horváth et al. Citation2011) and monotone models (Horváth and Vojtáš Citation2006) and the adaptation of hybrid solutions (Woźniak, Graña, and Corchado Citation2014) might be advantageous. We also mention that the proposed error-correction technique could contribute to avoiding overfitting to hub instances having nonrepresentative labels. In contrast, conventional regularization techniques focus on model complexity and treat all the instances as equally important. More detailed study of the relation between regularization and error correction is left for future work.

The data recorded during spontaneous usage of smartphones and tablets was originally not designed for diagnostic purposes, therefore, even the most useful pieces of the data can be expected to be only weakly correlated with medically relevant conditions. Note, however, that a combination of weak features can serve as a reasonable predictor even if the features are weak predictors separately, and, by training neural networks, the machine is expected to learn the appropriate combination of those individually weak predictors. However, medical doctors are expected to play a crucial role in the correct interpretation of the data and potential additional examinations. The overall workload of medical doctors will not necessarily decrease: although continuous telemonitoring might replace some of the scheduled face-to-face meetings between patients and doctors, due to the continuous monitoring of UPDRS scores or other conditions and possible false alarms generated by automated recognition systems, patients might ask for more appointments with medical doctors.

Conclusions

In this article, we focused on the automated estimation of UPDRS scores based on biomedical voice measures. This is a crucial component of telemonitoring systems for PD patients. We studied the hubness phenomenon in the context of UPDRS score estimation and proposed a hubness-aware error correction for artificial neural networks. We performed experiments on publicly available real-world datasets and showed that the proposed technique systematically improves estimation accuracy as measured by MAE and RMSE.

Funding

This research was performed within the framework of the grant of the Hungarian Scientific Research Fund – OTKA PD 111710. This article was supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences. N. Á. Varga was supported by the KTIA NAP 13 1-2013-0001.



References

  • Adamczak, R., A. Porollo, and J. Meller. 2004. Accurate prediction of solvent accessibility using neural networks-based regression. Proteins: Structure, Function, and Bioinformatics 56 (4):753–767. doi:10.1002/prot.20176.
  • Bache, K., and M. Lichman. 2013. UCI machine learning repository. http://archive.ics.uci.edu/ml/ (accessed on November 30, 2015).
  • Balicza, P., B. Bereznai, A. Takáts, P. Klivényi, G. Dibó, E. Hidasi, I. Balogh, and M. J. Molnár. 2012. The absence of the common LRRK2 G2019S mutation in 120 young onset Hungarian Parkinson’s disease patients. Ideggyogyaszati Szemle 65 (7–8):239–242.
  • Basak, D., S. Pal, and D. C. Patranabis. 2007. Support vector regression. Neural Information Processing-Letters and Reviews 11 (10):203–224.
  • Bereczki, D., and S. Szatmári. 2009. Treatment of dementia and cognitive impairment: What can we learn from the Cochrane Library. Journal of the Neurological Sciences 283 (1):207–210. doi:10.1016/j.jns.2009.02.351.
  • Bouckaert, R. R. 2003. Choosing between two learning algorithms based on calibrated tests. Paper presented at Proceedings of the 20th International Conference on Machine Learning (ICML-03), Washington, DC, August 21–24, 2003, 51–58.
  • Busra Celikkaya, E., C. R. Shelton, D. Kale, R. C. Wetzel, and R. G. Khemani. 2013. Non-invasive blood gas estimation for pediatric mechanical ventilation. Paper presented at Machine Learning for Clinical Data Analysis and Healthcare, NIPS Workshop, Lake Tahoe, Nevada, December 5–10, 2013.
  • Buza, K., A. Nanopoulos, and G. Nagy. 2015. Nearest neighbor regression in the presence of bad hubs. Knowledge-Based Systems 86:250–260. doi:10.1016/j.knosys.2015.06.010.
  • Buza, K., A. Nanopoulos, and L. Schmidt-Thieme. 2011. Insight: Efficient and effective instance selection for time-series classification. In Advances in knowledge discovery and data mining, ed. J. Z. Huang, L. Cao, and J. Srivastava, 149–160. Berlin Heidelberg: Springer.
  • Crosiers, D., J. Theuns, P. Cras, and V. B. Christine. 2011. Parkinson disease: Insights in clinical, genetic and pathological features of monogenic disease subtypes. Journal of Chemical Neuroanatomy 42 (2):131–141. doi:10.1016/j.jchemneu.2011.07.003.
  • Cyganek, B., and M. Wozniak. 2015. Tensor based representation and analysis of the electronic healthcare record data. In IEEE international conference on bioinformatics and biomedicine (BIBM), 1383–1390. IEEE.
  • De Lau, L. M. L., and M. M. B. Breteler. 2006. Epidemiology of Parkinson’s disease. The Lancet Neurology 5 (6):525–535. doi:10.1016/S1474-4422(06)70471-9.
  • De Rijk, M. C. D., C. Tzourio, M. M. Breteler, J. F. Dartigues, L. Amaducci, S. Lopez-Pousa, J. M. Manubens-Bertran, A. Alperovitch, and W. A. Rocca. 1997. Prevalence of Parkinsonism and Parkinson’s disease in Europe: The Europarkinson collaborative study. European community concerted action on the epidemiology of Parkinson’s disease. Journal of Neurology, Neurosurgery & Psychiatry 62 (1):10–15. doi:10.1136/jnnp.62.1.10.
  • Devroye, L., L. Györfi, A. Krzyzak, and G. Lugosi. 1994. On the strong universal consistency of nearest neighbor regression function estimates. The Annals of Statistics 1371–1385. doi:10.1214/aos/1176325633.
  • Estrin, D. 2013. Small, n=me, data. Invited Talk at the Conference of the Neural Information Processing Systems Foundation (NIPS), Lake Tahoe, Nevada, December 5–10, 2013.
  • Froelich, W., K. Wrobel, and P. Porwik. 2015. Diagnosis of Parkinson’s disease using speech samples and threshold-based classification. Journal of Medical Imaging and Health Informatics 5 (6):1358–1363. doi:10.1166/jmihi.2015.1539.
  • Goetz, C. G., W. Poewe, O. Rascol, et al. 2003. Movement disorder society task force on rating scales for Parkinson’s disease, the unified Parkinson’s disease rating scale (updrs): Status and recommendations. Movement Disorders 18:738–750. doi:10.1002/mds.10473.
  • Goetz, C. G., B. C. Tilley, S. R. Shaftman, G. T. Stebbins, S. Fahn, P. Martinez-Martin, W. Poewe, et al. 2008. Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (mds-updrs): Scale presentation and clinimetric testing results. Movement Disorders 23 (15):2129–2170. doi:10.1002/mds.22340.
  • Grana, M., M. Termenon, A. Savio, A. Gonzalez-Pinto, J. Echeveste, J. M. Pérez, and A. Besga. 2011. Computer aided diagnosis system for Alzheimer disease using brain diffusion tensor imaging features selected by Pearson’s correlation. Neuroscience Letters 502 (3):225–229. doi:10.1016/j.neulet.2011.07.049.
  • Hartelius, L., and P. Svensson. 1994. Speech and swallowing symptoms associated with Parkinsons disease and multiple sclerosis: A survey. Folia Phoniatrica Et Logopaedica 46 (1):9–17. doi:10.1159/000266286.
  • Horváth, T., A. Eckhardt, K. Buza, P. Vojtas, and L. Schmidt-Thieme. 2011. Value-transformation for monotone prediction by approximating fuzzy membership functions. In Computational Intelligence and Informatics (CINTI), 2011 IEEE 12th International Symposium on, 367–372. IEEE.
  • Horváth, T., and P. Vojtáš. 2006. Ordinal classification with monotonicity constraints. Advances in Data Mining. Applications in Medicine, Web Mining, Marketing, Image and Signal Mining 4065:217–225.
  • Little, M. A., P. E. McSharry, E. J. Hunter, J. Spielman, and L. O. Ramig. 2009. Suitability of dysphonia measurements for telemonitoring of Parkinson’s disease. Biomedical Engineering, IEEE Transactions on 56 (4):1015–1022. doi:10.1109/TBME.2008.2005954.
  • Logemann, J. A., H. B. Fisher, B. Boshes, and E. R. Blonsky. 1978. Frequency and cooccurrence of vocal tract dysfunctions in the speech of a large sample of Parkinson patients. Journal of Speech and Hearing Disorders 43 (1):47–57. doi:10.1044/jshd.4301.47.
  • Pang-Ning, T., M. Steinbach, and V. Kumar. 2006. Introduction to data mining. Harlow, UK: Pearson Addison-Wesley.
  • Pawlukowska, W., M. Gołab-Janowska, K. Safranow, I. Rotter, K. Amernik, K. Honczarenko, and P. Nowacki. 2015. Articulation disorders and duration, severity and l-dopa dosage in idiopathic Parkinson’s disease. Neurologia I Neurochirurgia Polska 49 (5):302–306. doi:10.1016/j.pjnns.2015.07.002.
  • Pringsheim, T., N. Jette, A. Frolkis, and T. D. L. Steeves. 2014. The prevalence of Parkinson’s disease: A systematic review and meta-analysis. Movement Disorders 29 (13):1583–1590. doi:10.1002/mds.v29.13.
  • Radovanović, M., A. Nanopoulos, and M. Ivanović. 2009. Nearest neighbors in high-dimensional data: The emergence and influence of hubs. In Proceedings of the 26th annual international conference on machine learning, 865–872. ACM.
  • Radovanović, M., A. Nanopoulos, and I. Mirjana. 2010a. Hubs in space: Popular nearest neighbors in high-dimensional data. The Journal of Machine Learning Research 11:2487–2531.
  • Radovanović, M., A. Nanopoulos, and I. Mirjana 2010b. Time-series classification in many intrinsic dimensions. In Proceedings of the 10th SIAM international conference on data mining (SDM), 677–688. SIAM.
  • Ramaker, C., J. Marinus, A. M. Stiggelbout, and B. J. Van Hilten. 2002. Systematic evaluation of rating scales for impairment and disability in Parkinson’s disease. Movement Disorders 17 (5):867–876. doi:10.1002/mds.10248.
  • Sakar, B. E., M. M. Isenkul, C. Okan Sakar, A. Sertbas, F. Gurgen, S. Delil, H. Apaydin, and O. Kursun. 2013. Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings. Biomedical and Health Informatics, IEEE Journal of 17 (4):828–834. doi:10.1109/JBHI.2013.2245674.
  • Soyiri, I. N., D. D. Reidpath, and C. Sarran. 2013. Forecasting peak asthma admissions in London: An application of quantile regression models. International Journal of Biometeorology 57 (4):569–578. doi:10.1007/s00484-012-0584-0.
  • Tomašev, N., K. Buza, K. Marussy, and P. B. Kis. 2015. Hubness-aware classification, instance selection and feature construction: Survey and extensions to time-series, Berlin Heidelberg: Springer.
  • Tomašev, N., and M. Dunja. 2013. Class imbalance and the curse of minority hubs. Knowledge-Based Systems 53:157–172. doi:10.1016/j.knosys.2013.08.031.
  • Tomašev, N., M. Radovanovic, D. Mladenic, and M. Ivanovic. 2011. A probabilistic approach to nearest-neighbor classification: Naive hubness bayesian knn. In Proceedings CIKM.
  • Tomašev, N., M. Radovanović, D. Mladenić, and M. Ivanović. 2015b. Hubness-based clustering of high-dimensional data. In Partitional clustering algorithms, ed. M. E. Celebi, 353–386. Switzerland: Springer.
  • Tsanas, A., M. A. Little, P. E. McSharry, and L. O. Ramig. 2010. Accurate telemonitoring of Parkinson’s disease progression by non-invasive speech tests. Biomedical Engineering, IEEE Transactions on 57 (4):884–893. doi:10.1109/TBME.2009.2036000.
  • Tsanas, A., M. A. Little, P. E. McSharry, and L. O. Ramig. 2011. Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average parkinson’s disease symptom severity. Journal of the Royal Society Interface 8 (59):842–855. doi:10.1098/rsif.2010.0456.
  • Witten, I. H., and E. Frank. 2005. Data mining: Practical machine learning tools and techniques. San Francisco, CA: Morgan Kaufmann.
  • Woźniak, M., M. Graña, and E. Corchado. 2014. A survey of multiple classifier systems as hybrid systems. Information Fusion 16:3–17. doi:10.1016/j.inffus.2013.04.006.
  • Zimprich, A., F. Asmus, P. Leitner, M. Castro, B. Bereznai, N. Homann, E. Ott, et al. 2003. Point mutations in exon 1 of the nr4a2 gene are not a major cause of familial Parkinson’s disease. Neurogenetics 4 (4):219–220. doi:10.1007/s10048-003-0156-x.
