Research Article

Combining Clinical Symptoms and Patient Features for Malaria Diagnosis: Machine Learning Approach

Article: 2031826 | Received 30 Oct 2021, Accepted 18 Jan 2022, Published online: 30 Jan 2022

ABSTRACT

Presumptive treatment and self-medication for malaria are common in resource-limited countries. However, these approaches are considered unreliable because they lead to unnecessary use of malaria medication. This study demonstrates the use of supervised machine learning models for diagnosing malaria from patient symptoms and demographic features. A malaria diagnosis dataset of 2556 patient records with 36 features was extracted from two regions of Tanzania: Morogoro and Kilimanjaro. Important features were selected to improve model performance and reduce processing time. Machine learning classifiers were trained and validated with the k-fold cross-validation method. The ranking of features differed between the regions and the combined dataset. The significant features selected were residence area, fever, age, general body malaise, visit date, and headache. Random Forest was the best classifier, with an accuracy of 95% in Kilimanjaro, 87% in Morogoro and 82% in the combined dataset. Based on clinical symptoms and demographic features, a region-specific malaria predictive model was developed to demonstrate relevant machine learning classifiers. Important features are useful in making the disease prediction.

Introduction

Machine learning (ML) is an emerging approach that has been shown to be effective in making decisions and predictions from the large quantity of data produced by the healthcare industry. It learns from experience and detects valuable patterns in large, unstructured, and complex datasets to predict future incidences. Today, the biggest challenge facing the healthcare industry is diagnosing disease accurately and at affordable cost. Hospitals hold a massive amount of complex data that can be used to extract useful information for diagnosis, and data mining makes it possible to use these data for future predictions. The healthcare field generates a massive amount of data about clinical assessment, patient records, disease treatment, clinical follow-ups, and medication (Fatima and Pasha 2017; Iyer, S, and Sumbaly 2015). These data can improve healthcare delivery when combined with machine learning techniques.

Accurate prediction of clinical outcomes is essential to successful decision-making and can lead to better patient care and disease management. For example, in malaria management, accurately predicting which patient should be prescribed a malaria medication and which should undergo further checkups may prevent unnecessary use of malaria drugs (Menard and Dondorp 2017; Mwai et al. 2009). Apart from that, a lack of proper diagnosis might result in mismanagement of other diseases whose symptoms resemble those of malaria. The common practice of self-medication with malaria drugs and the challenges facing health systems in most low-income countries, such as Tanzania, necessitate a machine learning-based diagnosis model. In addition, the model can assist in correctly diagnosing malaria for patients who cannot get a laboratory-based diagnosis.

The use of ML for malaria diagnosis is not necessarily the right solution. For example, a better solution would be to have rapid malaria diagnostic tests at pharmacies to ensure that only malaria patients or those with an anti-malarial prescription are given anti-malarial drugs. However, the rapid tests would be costly for pharmacies and require administration by trained pharmacists or personnel, who may not be available in rural or remote areas. A cheap but effective tool for determining possible malarial status is therefore needed, and an ML-based diagnostic tool could be one such tool. Different studies have shown how machine learning has assisted other areas of the healthcare system (Artificial Intelligence in Healthcare: Past, Present and Future, 2017; Davenport and Kalakota 2019; Khare et al. 2017; Shailaja, Seetharamulu, and Jabbar 2018; Sidey-Gibbons and Sidey-Gibbons 2019; Triantafyllidis and Tsanas 2019). Recently, supervised learning algorithms have been applied in various studies to diagnose malaria (Fuhad et al. 2020a; Madhu 2020; Masud et al. 2020; Muthumbi et al. 2019; Poostchi et al. 2018; Yang et al. 2019). Despite the successful application of machine learning in disease management, most of these applications focus on microscopic image analysis to detect malaria, neglecting that most health facilities do not have a microscope and that patients treat themselves. Noninvasive methods such as machine learning are reliable and efficient for classifying healthy people and people with malaria. While several studies (Bibin, Nair, and Punitha 2017; Das et al. 2013; Femi Aminu, Onyebuchi Ogbonnia, and Shehi Shehu 2016; Fuhad et al. 2020b; Madhu 2020; Masud et al. 2020; Patil, Yaligar, and Meena 2018; Pillay et al. 2019; Rajaraman et al. 2018; Rajaraman, Jaeger, and Antani 2019; Shekalaghe et al. 2013; van Driel 2020) have suggested that using clinical symptoms in the prediction of malaria is not a practical idea, the experiments performed in this study showed the feasibility of using clinical symptoms and patients’ demographic information to predict malaria using machine learning classifiers.

Related Work

Malaria shares similar symptoms with other febrile diseases such as dengue fever, typhoid fever, common cold, respiratory tract infection, dyspepsia, and pneumonia (Abba et al. 2011; Crump et al. 2017; Nadjm et al. 2010). Parasitological tests, in the form of microscopy and rapid diagnostic tests (RDT), are the recommended and standard tools for diagnosing malaria (WHO 2019, 2020, 2021). However, in areas where parasitological tests for malaria are not readily available, the complexity of malaria diagnosis may lead to misdiagnosis, overdiagnosis, and inappropriate presumptive treatment (D’Acremont et al. 2009; Gosling et al. 2008; Graz et al. 2011; Isiguzo et al. 2014; UM 2016). As specified by WHO, in situations such as rural areas where no parasitological test is available within 2 hours of presenting for treatment at a medical center, medical doctors can provide a clinical diagnosis based on a clinical and physical examination to treat suspected patients (WHO 2015, 2021; WHO-Guidelines 2015). Consequently, suspected patients are presumptively treated. Clinical diagnosis of malaria is traditional among medical doctors; it is the least expensive and most widely practised method. Clinical diagnosis, on which presumptive treatment rests, is based on the patient’s signs and symptoms and on physical findings at examination. The earliest symptoms of malaria are very nonspecific and include fever, headache, body weakness, chills, dizziness, abdominal pain, diarrhea, nausea, vomiting, anorexia, and pruritus. With clinical diagnosis, misdiagnosis is possible because of insufficient knowledge about significant malaria symptoms (other than shivering, fever, and sweating) and about non-malaria-related factors in the clinical diagnosis of malaria (Bria, Yeh, and Bedingfield 2021). Presumptive treatment could increase the unnecessary use of anti-malarial drugs, which have side effects, and increase the spread of resistance to the drugs (Attinsounon et al. 2019; Chipwaza et al. 2014; Debora and Moses 2017; Hertz et al. 2019; Kazaura 2017; Mwita et al. 2019).

Apart from that, there is a strong tendency toward self-treatment with over-the-counter medication when malaria-related symptoms are observed. Studies done in Tanzania observed that drug-dispensing shops still frequently sell prescription-only treatments without a prescription, although it is advised that anti-malarial medications be administered only after parasitological confirmation of the disease (Michael and Mkunde 2017; Ndomondo-Sigonda et al. 2004). This could lead to disease mismanagement, drug resistance, and drug shortages (Grobusch and Schlagenhauf 2019; Mboera, Makundi, and Kitua 2007; Metta et al. 2014; Mwai et al. 2009; Wang et al. 2019). In an effort to eliminate these issues, the government of Tanzania has established a “not every fever is Malaria” campaign, which aims to educate people that not every fever episode is a malaria case (Baltzell et al. 2019), since other diseases such as typhoid, dengue, chikungunya, and urinary tract infections present the same symptoms as malaria (Goodyer 2015). The significance of these issues was a substantial motivation for developing a malaria prediction model using patients’ symptoms and demographic information. Machine learning techniques have been used as tools for predicting the risk of diseases such as heart disease, diabetes, brain stroke, liver disease, thyroid disease, and brain cancer (DB, P, and N 2018; Habib et al. n.d.; Kim, Choo, and Chang 2021; MS, E, and J 2020; Priyadarshini, Dash, and Mishra 2014; Rao and Renuka 2020). In malaria diagnosis, machine learning has been applied both to diagnostic tools and to predicting disease presence from patient symptoms and signs. Over the past decade, malaria research has been done in the areas of diagnostic testing (RDT) and microscopy, specifically the automation of these tools (Brown et al. 2020; Dharap and Raimbault 2020; Ford et al. 2020; Ravalji, Shah, and Nai 2020; Shekalaghe et al. 2013). These studies showed how machine learning can assist in reading microscopic blood smear images to diagnose malaria and in automating the complete blood count, the test that screens for infection in the blood. The performance of machine learning in the automation of these tools has improved, and classifier prediction accuracy has shown potential (Fuhad et al. 2020b; Lee, Choi, and Shin 2021; Masud et al. 2020; van Driel 2020). Despite the promising results of these studies, the unavailability of microscopes and mRDTs in some health facilities in resource-constrained areas and the self-medication behavior of some patients (Bibin, Nair, and Punitha 2017; Das et al. 2013; Liang et al. 2017; Madhu 2020; Masud et al. 2020; Muthumbi et al. 2019; Poostchi et al. 2018; Rajaraman et al. 2018; Rajaraman, Jaeger, and Antani 2019) remain the major challenges.

On the other hand, several machine learning studies have used malaria symptoms, signs, and patient information to diagnose malaria. For example, the study by Bria, Yeh, and Bedingfield (2021) used malaria symptoms and non-symptom factors to diagnose malaria. It showed that good prediction accuracy is possible if the significant combined features are identified. However, these studies do not specifically identify significant or important symptoms, notwithstanding their contribution to improving malaria diagnosis. Furthermore, other studies that used malaria symptoms to diagnose malaria relied on data mining techniques such as rule-based classification, which are considered weak classifiers (A., n.d.; Bbosa, Wesonga, and Jehopio 2016; Oguntimilehin n.d.).

In Tanzania, most of the studies have been done in diagnostic testing (RDT and microscopy) (Mpapalika and Matowo 2020; Mwanga et al. 2019a, 2019b). A malaria diagnosis study using symptoms and patients’ demographic features has never been done in Tanzania. This study aims to fill this important gap in malaria research in Tanzania, since the country has settings where diagnostic tools are unavailable and self-treatment is widespread.

The findings of this study can be used to raise public awareness of the potential of machine learning for classifying malaria patients, by developing a simple tool to be used before anti-malarial drugs are administered. Apart from that, the study can raise awareness of the significant malaria symptoms and patient features for diagnosing malaria at early stages within Tanzanian communities vulnerable to malaria, and can help reduce the rate of self-medication and presumptive treatment in the country.

Theoretical Background

This study uses the most common supervised machine learning classifiers to build a malaria diagnosis model (Uddin et al. 2019). Popular machine learning classifiers for disease diagnosis, namely Logistic Regression (LR), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), were used in the model development (Moreno-Ibarra et al. 2021).

Logistic Regression (LR) is a robust and well-established method for supervised classification. It can be considered an extension of ordinary regression that, in its basic form, models only a dichotomous variable, usually representing the occurrence or nonoccurrence of an event. The algorithm estimates the probability that a new instance belongs to a particular class. The outcome lies between 0 and 1 since it is a probability (Swaminathan et al. 2017; Ullah et al. 2019).
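For reference, the standard form of the binary logistic model can be written as below. The symbols x (feature vector), \beta_0 (intercept) and \beta (coefficients) are generic notation rather than quantities reported in this study, and the 0.5 decision threshold is a common default that is assumed here.

```latex
P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta^{\top} x)}},
\qquad
\hat{y} =
\begin{cases}
1 & \text{if } P(y = 1 \mid x) \ge 0.5, \\
0 & \text{otherwise.}
\end{cases}
```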

K-Nearest Neighbor (KNN) is a simple instance-based method that classifies a new data point according to the classes of its k closest training points. The algorithm operates on a set of d-dimensional vectors, D = {xi | i = 1, …, N}, where xi ∈ Rd denotes the ith data point, each with a known class label. To classify a new point, the algorithm computes its distance (commonly the Euclidean distance) to every xi, selects the k nearest points, and assigns the majority class among them (Krishnani et al. 2019; Patil, Yaligar, and Meena 2018).
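The following is a minimal sketch of k-nearest-neighbour classification, assuming Euclidean distance and a simple majority vote. The function name knn_predict and the toy records (temperature and age) are hypothetical and are not taken from the study dataset.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k pairs with the smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy records: (temperature in deg C, age in years);
# label 1 = malaria, 0 = no malaria.
train_X = [(39.5, 5), (39.0, 7), (38.8, 30), (36.8, 25), (37.0, 40), (36.6, 8)]
train_y = [1, 1, 1, 0, 0, 0]

print(knn_predict(train_X, train_y, query=(39.2, 6), k=3))
```

In practice the features would usually be scaled before distances are computed, since a feature with a large numeric range (age, in this toy example) otherwise dominates the distance.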

The Support Vector Machine (SVM) algorithm can classify both linear and non-linear data. It first maps each data item into an n-dimensional feature space, where n is the number of features. It then identifies the hyperplane that separates the data items into two classes while maximizing the marginal distance for both classes and minimizing the classification errors (Krishnani et al. 2019; Moreno-Ibarra et al. 2021; Ullah et al. 2019).

Decision Tree (DT) is one of the earliest and most prominent machine learning algorithms. A decision tree models the decision logic, i.e., the tests and corresponding outcomes for classifying data items, as a tree-like structure. The nodes of a DT are typically arranged in multiple levels, where the first or top-most node is called the root node. All internal nodes (i.e., nodes having at least one child) represent tests on input variables or attributes. Depending on the test outcome, the classification algorithm branches toward the appropriate child node, where the process of testing and branching repeats until it reaches a leaf node. The leaf or terminal nodes correspond to the decision outcomes. DTs are easy to interpret and quick to learn, and they are a common component of many medical diagnostic protocols. When the tree is traversed to classify a sample, the outcomes of the tests at each node along the path provide sufficient information to infer its class (Krishnani et al. 2019; S et al. 2019; Saranya and Pravin 2020; Swaminathan et al. 2017).
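The traversal described above can be sketched as follows. The tree, its tests (fever, headache) and its outcomes are hypothetical and chosen only for illustration; they do not reproduce the model fitted in this study.

```python
# Hypothetical tree: internal nodes hold a test, leaves hold a decision.
tree = {
    "test": "fever",
    "yes": {
        "test": "headache",
        "yes": {"leaf": 1},  # 1 = malaria
        "no": {"leaf": 0},   # 0 = no malaria
    },
    "no": {"leaf": 0},
}

def classify(node, record):
    """Walk from the root to a leaf, branching on the outcome of each node's test."""
    while "leaf" not in node:
        branch = "yes" if record[node["test"]] else "no"
        node = node[branch]
    return node["leaf"]

print(classify(tree, {"fever": True, "headache": True}))
```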

Random Forest (RF) is an ensemble classifier consisting of many DTs, similar to how a forest is a collection of many trees. DTs grown very deep often overfit the training data, resulting in high variation in the classification outcome for a slight change in the input data. They are sensitive to their training data, which makes them error-prone on the test dataset. RF reduces this variance by training each tree on a different random sample of the training data and combining the trees’ predictions by majority vote (Azar et al. 2014; Chen, Liu, and Peng 2019; Iyer, S, and Sumbaly 2015).

Materials and Methods

This paper aims to develop a machine learning-based model to classify patients with malaria and those without malaria using their symptom and non-symptom factors. The development of the machine learning-based model for malaria diagnosis was structured in five stages, namely: (1) dataset description and preprocessing, (2) feature selection, (3) machine learning classifiers, (4) cross-validation methods and (5) classifier performance evaluation.

Dataset Description

Study Area

Data were collected from four health facilities in two regions of Tanzania: Morogoro and Kilimanjaro (). The four health facilities are Mawenzi regional hospital and Majengo health center in the Kilimanjaro region, and Morogoro regional hospital and Mzumbe health center in the Morogoro region. The dataset represents patients who live in areas with low malaria transmission, represented by the Kilimanjaro region, and those who live in areas with high malaria transmission, represented by the Morogoro region. The choice of these regions was based on the prevalence of malaria: Morogoro represents regions with high prevalence (15.0%), and Kilimanjaro represents regions with low prevalence (1.0%).

The Method Used and Participants

A malaria patient records extraction form was designed to summarize the MoH patient file and the information collected when the patient visited the health facility. The records were retrieved from the files of patients who were treated for malaria from 2015 to 2019. The aim was to identify the past state of clinical malaria diagnosis in the local health facilities and to understand the standard practice in malaria diagnosis and treatment. The critical information collected was: (i) the patients’ demographic information, (ii) the symptoms presented by the patient when consulting a doctor, (iii) the tests taken and their results, (iv) the diagnosis based on the laboratory results and (v) the treatment provided. Trained nurses administered data collection, and all participants provided written informed consent.

Ethical Clearance

The study was approved by the National Institute for Medical Research Tanzania (NIMR) before the participants were recruited and records were collected. All participants provided written informed consent to participate in the study. In the case of patient records, consent was given by the health facilities with guidance from NIMR.

Dataset Descriptions and Preprocessing

The malaria diagnosis dataset was used in this study to develop a machine learning model for malaria diagnosis. The dataset was obtained by extracting malaria patients’ diagnosis records from the Tanzania Ministry of Health’s patient files in two regions of Tanzania: Morogoro and Kilimanjaro.

The original malaria diagnosis dataset has a sample size of 2556 patient records with 36 features. The target output variable has two classes, representing patients with malaria (tested positive) and those with no malaria (tested negative). Information that could lead to individual patients being located or identified was removed to maintain patient confidentiality and ethical practice. Also, missing values were deleted from the dataset. Nominal features were encoded numerically to conform to Scikit-learn, and the target was coded 1 for patients with malaria and 0 for patients with no malaria (healthy people).
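The preprocessing steps described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the field names and values are hypothetical, "deleting missing values" is assumed to mean dropping any record with a missing field, and nominal values are given integer codes in order of first appearance. The study's actual encoding scheme is not specified in the text.

```python
# Hypothetical records; field names and values are illustrative only.
records = [
    {"sex": "F", "fever": "yes", "headache": "no", "result": "positive"},
    {"sex": "M", "fever": None, "headache": "yes", "result": "negative"},
    {"sex": "M", "fever": "no", "headache": "no", "result": "negative"},
]

def drop_missing(rows):
    """Keep only rows in which every field has a value (assumed interpretation)."""
    return [row for row in rows if all(value is not None for value in row.values())]

def encode(rows, target="result"):
    """Give each nominal value an integer code; code the target as 1/0."""
    codes = {}  # one value-to-code mapping per feature column
    X, y = [], []
    for row in rows:
        features = []
        for name, value in row.items():
            if name == target:
                continue
            mapping = codes.setdefault(name, {})
            # A value seen for the first time gets the next unused integer.
            features.append(mapping.setdefault(value, len(mapping)))
        X.append(features)
        y.append(1 if row[target] == "positive" else 0)
    return X, y

X, y = encode(drop_missing(records))
print(X, y)
```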

Feature Selection

Three sets of features were generated from the malaria diagnosis dataset. The first feature set was derived by applying feature selection to a dataset of only Kilimanjaro (low endemic area) patients, the second from a dataset of only Morogoro (high endemic area) patients, and the last from a dataset of both Morogoro and Kilimanjaro (combined areas) patients. A model-based feature selection method, which uses supervised machine learning algorithms to judge the importance of each feature in the dataset, was used in this study to select essential features. Feature selection is one of the vital processes in machine learning because including irrelevant features affects the classification performance of the model. Model-based feature selection has two approaches for selecting the most significant features: feature importance and selection from the model (Brodersen et al. 2011). The Random Forest algorithm was used as the feature selection algorithm to determine important features in the malaria diagnosis dataset. Tree-based strategies naturally rank features by how much they improve the purity of the nodes. Nodes with the most significant decrease in impurity occur at the start of the trees, while nodes with the slightest reduction in impurity occur at the end of the trees. Thus, a subset of the most important features was created by pruning trees below a particular node.
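To make the notion of a decrease in impurity concrete, the sketch below computes it for a single binary split. It assumes Gini impurity, which is a common choice for tree-based classifiers; the text does not state which impurity measure was used, and the patient counts are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# Hypothetical node: 4 malaria (1) and 4 non-malaria (0) patients, split on fever.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
fever_yes = [1, 1, 1, 0]
fever_no = [1, 0, 0, 0]

print(impurity_decrease(parent, fever_yes, fever_no))
```

A feature whose splits produce larger decreases, accumulated over the trees of the forest, receives a higher importance score.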

In all datasets, the feature selection algorithm identified a large set of important features (up to 20 features). However, to minimize the complexity of the model, only the top 10 significant variables were selected for the regional datasets, and only 15 features were selected for the whole malaria diagnosis dataset. Both feature sets were obtained from the feature selection methods and were employed for model development. Nevertheless, it was observed that the ranking of these features differed among datasets: some features that were among the most significant in one region were not as significant in another, as shown in . Apart from that, some features were specific to a particular region; for example, Joint Pain and Dizziness were only significant in the Kilimanjaro region, and Muscle Pain and Confusion were only important in the Morogoro region. In the combined malaria diagnosis dataset, the most important features are the residence area of the patient, fever, age of the patient, general body malaise, visit date, headache, abdominal pain, backache, chest pain, sex of the patient, vomiting, confusion, dizziness, coughing and joint pain.

Table 1. Features selected by random forest algorithm and their ranking for the different datasets

Prediction Classifiers

After the dataset had been described and preprocessed, features were selected using machine learning algorithms, and the importance of every feature for predicting the target variable was computed. Then, machine learning classification algorithms were used to classify patients with malaria and those without malaria. The popular machine learning classifiers for disease diagnosis, namely Logistic Regression (LR), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT) and Random Forest (RF), were used in model development. Finally, the classifiers’ performance for malaria diagnosis and feature selection was computed and compared to obtain the best-performing model.

Machine Learning Classifiers Validation

The study used the repeated k-fold cross-validation (CV) method and four performance evaluation metrics. In repeated k-fold cross-validation, the dataset was divided into k parts of equal size. k – 1 parts were used to train the classifiers, and the remaining part was used to check performance in each step. The process of validation was repeated k times, so that each part served once as the test set, and the classifier performance was computed from the k results. The whole procedure was repeated a number of times to obtain reliable results. Different values of k can be selected for CV. In this experiment, k = 10 was used because of its good performance and its recommendation in the literature. In the 10-fold CV process, 70% of data were used for training, and 30% were used for testing purposes. The process was repeated ten times for each fold of the process. All training and test instances were randomly shuffled over the whole dataset before new sets were selected and tested for the next cycle. At the end of the 10-fold process, averages of all performance metrics were computed.
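A minimal sketch of repeated 10-fold cross-validation over the five classifiers, using scikit-learn, is given below. It is illustrative only: the data are synthetic (make_classification stands in for the malaria dataset), the hyperparameters and the number of repeats are assumptions, and a stratified variant of repeated k-fold (RepeatedStratifiedKFold) is swapped in so that every fold contains both classes. With n_splits=10, each iteration holds out one tenth of the records for testing.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the malaria dataset (assumption: 10 numeric features).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hyperparameters are illustrative defaults, not the study's settings.
classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
}

# k = 10 folds, repeated 3 times (the number of repeats is an assumption).
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)
metrics = ["accuracy", "precision", "recall", "f1", "roc_auc"]

results = {}
for name, clf in classifiers.items():
    scores = cross_validate(clf, X, y, cv=cv, scoring=metrics)
    # Average each metric over all folds and repeats.
    results[name] = {m: scores[f"test_{m}"].mean() for m in metrics}

for name, summary in results.items():
    print(name, {m: round(float(v), 2) for m, v in summary.items()})
```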

Machine Learning Model Performance Evaluation

Various performance evaluation metrics were used in this study to check the performance of the classifiers. First, a confusion matrix was used, in which every observation in the testing set falls in precisely one cell (). A two-by-two matrix was used because there were two classes: malaria positive (1) and malaria negative (0). It gives two types of correct prediction and two types of incorrect prediction. In addition, a classification report was computed to obtain the classification accuracy, precision, recall and F1 score of the classifiers.

Table 2. Confusion matrix

The four cells of the confusion matrix are interpreted as follows. True positive (TP): a subject with malaria is correctly classified as having malaria. True negative (TN): a subject without malaria is correctly classified as healthy. False positive (FP): a subject without malaria is incorrectly classified as having malaria (a type 1 error). False negative (FN): a subject with malaria is incorrectly classified as healthy (a type 2 error).
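The metrics reported in the Results can be derived from these four cells using their standard definitions, as sketched below. The function name classification_metrics and the counts are hypothetical and are not taken from the study's results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Derive common metrics from the four cells of a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)  # recall: share of malaria cases detected
    specificity = tn / (tn + fp)  # share of non-malaria cases classified negative
    precision = tp / (tp + fp)    # share of positive predictions that are correct
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # F1 is the harmonic mean of precision and recall (sensitivity).
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts for illustration only.
m = classification_metrics(tp=80, tn=70, fp=30, fn=20)
print({name: round(value, 3) for name, value in m.items()})
```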

Results

Classifiers Performance on Full Features with K-Fold Cross-Validation

In this experiment, the five machine learning classifiers were evaluated with 10-fold cross-validation on the full 35 features of the complete malaria diagnosis dataset, as described in . Different parameter values were passed to the classifiers, and the mean over the 10 folds was computed.

Table 3. 10-fold CV classification performance evaluation of different classifiers on Malaria diagnosis dataset on full features

From this experiment with full features on the full malaria diagnosis dataset, the Random Forest classifier showed the best overall performance among the classifiers, with a classification accuracy of 79%, AUC of 80%, sensitivity of 82%, specificity of 69%, precision of 71% and recall of 76%, as shown in . The specificity of 69% for Random Forest is the probability that the prediction is negative given that the person does not have malaria. The Decision Tree classifier performed well, with a sensitivity of 85%, precision of 77% and recall of 76%. The K-Nearest Neighbor classifier underperformed, with a specificity of 49% and AUC of 69%, but scored a sensitivity of 78%, precision of 71% and accuracy of 72%. The Support Vector Machine achieved an accuracy of 73%, specificity of 61%, precision of 71%, AUC of 74% and sensitivity of 74%. The Logistic Regression classifier achieved an accuracy of 75%, specificity of 57%, precision of 74%, AUC of 76% and sensitivity of 77%. The performance comparison on AUC, specificity and sensitivity among the classifiers is shown in .

Figure 1. Performance of different classifiers with full features.

Results of Classifiers Performance on Selected Important Features with K-Fold Cross-Validation (n = 10)

The model was developed considering only 10 important features selected during the feature engineering process. In this experiment, all models had higher performance in all metrics than when the full features were used (). For Accuracy and AUC, the Random Forest classifier had the best performance, with an accuracy of 82% and AUC of 83%, followed by the Logistic Regression classifier with an accuracy of 76% and AUC of 78%. The Random Forest and Decision Tree classifiers had the best precision, at 81% and 76%, respectively. For Random Forest, this means that 81% of the patients predicted to have malaria actually had malaria.

Table 4. 10-fold CV classification performance evaluation of different classifiers on Malaria diagnosis dataset ten important features

For sensitivity, that is, the ability to avoid classifying a sick patient as a healthy person, Decision Tree performed well with a Sensitivity of 85%, followed by Random Forest with a Sensitivity of 84%. In this dataset, Random Forest had an F1 score of 81%. Support Vector Machine had the best Specificity, at 74%, while the KNN classifier performed the worst in all aspects, with scores of 72% accuracy, 70% AUC, 80% sensitivity, 60% specificity and 71% precision. It was also established that the Logistic Regression classifier’s accuracy and AUC dropped after selecting the important features. The average accuracy and AUC dropped from 76% to 75% to 75% and 73%, respectively, as shown in . This signifies that the dropped features dominated the predictive capacity of this classifier.

Figure 2. Performance of different classifiers on ten important features on the whole Malaria diagnosis dataset.

Results of Classifiers Performance on Selected 10 Important Features on Regional Datasets

The 10 important features selected from each regional dataset were evaluated on the five machine learning classifiers with the 10-fold cross-validation method, and the average metrics are presented in . The classifiers were trained and tested in phases with different numbers of features to see which features give the best performance. First, the classifiers were trained and tested on the three most important features. Then three more important features were added, and finally the last four important features were added. It was observed that the classifiers performed well with the ten important features. Results for classification accuracy, AUC, Specificity, Sensitivity, Precision and F1 score are shown on separate graphs for better demonstration. These performance metrics were computed automatically.

Figure 3. Performance of classifiers on ten important selected features on Kilimanjaro dataset.

In both experiments, the Random Forest classifier showed outstanding performance, with 95% and 87% classification accuracy, 96% and 85% Sensitivity, 92% and 78% Specificity, 92% and 80% Precision, and 97% and 86% AUC for Kilimanjaro and Morogoro, respectively. This classifier outperformed all the other classifiers in all performance metrics. The Decision Tree classifier performed second best to Random Forest, and its performance on the Kilimanjaro dataset was better than on the Morogoro dataset. While the classifier achieved 92% classification accuracy, 91% Sensitivity, 80% Specificity and 80% Precision on the Kilimanjaro dataset, its Specificity and Precision were poor, at 67% and 68%, on the Morogoro dataset.

For the Logistic Regression classifier, the classification accuracy, AUC and Sensitivity were good, at 81%, 82% and 85%, respectively, for the Kilimanjaro dataset and 76%, 77% and 74%, respectively, for the Morogoro dataset. On the other hand, the classifier had unsatisfactory Specificity (65%) and Precision (65%) on the Kilimanjaro dataset, and 68% Specificity and 67% Precision on the Morogoro dataset. KNN performed well on the same metrics as Logistic Regression in all the datasets. Unlike the Logistic Regression and KNN classifiers, the Support Vector Machine classifier performed reasonably well in all metrics for all the datasets, as shown in .

Table 5. Classification performance evaluation of different classifiers on Kilimanjaro and Morogoro dataset on ten important features

The main aim of these experiments is to create a machine learning model that can correctly distinguish patients with malaria from healthy patients based on the presented symptoms and some of the patient’s demographic information. When classification accuracy was compared between the regional datasets, Random Forest was the best classifier, with 95% accuracy for the Kilimanjaro dataset and 87% for the Morogoro dataset, as shown in Figure 5.

Figure 4. F1 score comparison on the two regions datasets.

The sensitivity score of the classifiers on each dataset is shown in Figure 6. The Random Forest and Decision Tree classifiers showed equally high performance of 96% Sensitivity in Kilimanjaro. In Morogoro, the Random Forest classifier showed a good performance of 85% Sensitivity. The experiment also computed the F1 score, the harmonic mean of Precision and Recall, which indicates how precise and robust the classifier is. As shown in Figure 4, the Support Vector Machine classifier achieved an 81% F1 score on the Morogoro dataset and the Random Forest classifier achieved a 95% F1 score on the Kilimanjaro dataset; the ROC plot for Random Forest is shown in Figure 7. A summary of the best performance metrics and the best classifiers is presented in Table 6.
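The harmonic mean behind the F1 score can be written out in a few lines. The function below is a generic sketch; plugging in the Random Forest precision (92%) and sensitivity (96%) reported for Kilimanjaro is only an illustration of the formula.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported Random Forest values for Kilimanjaro: precision 0.92, recall 0.96.
rf_kilimanjaro_f1 = f1_score(0.92, 0.96)
```

This gives roughly 0.94, close to the 95% F1 score reported. Exact agreement is not expected, because metrics averaged over cross-validation folds do not have to satisfy the formula exactly.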

Table 6. Excellent performance metrics results and best classifiers

Figure 5. Classification accuracy comparison in two regions dataset.

Figure 6. Classifier’s sensitivity comparison on the two regions datasets.

Figure 7. ROC plot for random forest performance evaluation.

Discussion

This study demonstrates the success of supervised learning models in diagnosing malaria from patient symptoms and demographic features. However, the ranking of the features differed among the regional datasets because of geographical location, which affects the rate of disease transmission. These findings align with studies (Chandramohan et al. 2001; Ngasala et al. 2008; Nkumama, O’Meara, and Osier 2017; UM 2016) indicating that malaria transmission depends on climatic conditions, such as rainfall patterns, temperature, and humidity, that may affect the number and survival of mosquitoes.

Coughing and joint pain were significant for malaria diagnosis in Morogoro but had zero significance in Kilimanjaro, whereas dizziness and confusion were important for diagnosis in Kilimanjaro and of no importance in Morogoro. A previous study conducted in Morogoro indicated that the community perceives coughing and joint pain as symptoms of malaria (Mariki, Mduma, and Mkoba 2021).

It was also observed that the month in which a patient visits the health facility with malaria-related symptoms is significant in malaria diagnosis. The significant months fall either during the rainy season or just after it. This aligns with the WHO guidance on malaria transmission behavior.
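A visit date only becomes usable by a classifier once it is turned into a feature such as the visit month or a rainy-season indicator. The sketch below shows one possible encoding. The function names are assumptions, and the rainy months must be supplied by the caller; the months used in the example are placeholders, not values taken from this study.

```python
from datetime import date


def visit_month(visit_date):
    """Return the calendar month (1-12) of a visit date."""
    return visit_date.month


def in_or_after_rainy_season(visit_date, rainy_months, lag_months=1):
    """True if the visit falls in a rainy month or within lag_months after one."""
    flagged = set(rainy_months)
    for month in rainy_months:
        for lag in range(1, lag_months + 1):
            # Wrap around the year so that December + 1 becomes January.
            flagged.add((month - 1 + lag) % 12 + 1)
    return visit_date.month in flagged


# Placeholder rainy season (hypothetical): March to May.
example_rainy_months = {3, 4, 5}
june_flag = in_or_after_rainy_season(date(2020, 6, 10), example_rainy_months)
august_flag = in_or_after_rainy_season(date(2020, 8, 1), example_rainy_months)
```

Because rainfall patterns differ between regions, such an indicator would need region-specific months, which is consistent with the regional differences in feature ranking discussed above.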

Six well-known machine learning classifiers, Logistic Regression (LR), K-Nearest Neighbor (KNN), Naïve Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT) and Random Forest (RF), were used together with RF as a feature selection algorithm. Across the regional and combined datasets, Random Forest showed good overall performance compared with the other classifiers, with an accuracy of 79%, AUC of 80%, sensitivity of 82%, specificity of 69%, precision of 71% and recall of 76%. Furthermore, all models performed well with the selected important features, except that Logistic Regression recorded lower AUC and accuracy. In both regions, the Random Forest classifier showed strong performance in predicting malaria. Although the random forest algorithm is considered a black box because its information is hidden inside the model structure, this study adopted it as a feature selection algorithm for its robustness, execution speed, and intensive searching procedure. Similar findings were reported in studies conducted in Senegal and Burkina Faso, which indicate that random forest is a promising classifier that predicts malaria from clinical symptoms with high accuracy (Harvey et al. 2021; Yadav et al. 2021).

In a clinical setting, our study demonstrates that clinicians can use the model to detect new malaria cases, provided that patients’ symptoms and demographic features are available. This aligns with the guidelines of both the WHO (WHO 2015, 2021) and Tanzania Mainland’s malaria treatment guideline (Michael and Mkunde 2017), which state that malaria diagnosis should consider symptoms and demographics such as age, fever, location, headache, joint pains, malaise, vomiting, diarrhea, body ache, body weakness, poor appetite, pallor and enlarged spleen as diagnostic criteria.

The results of this study, however, are subject to certain limitations. First, our sample is restricted to records extracted from patients’ files in the selected health facilities. More studies need to be conducted on patients in different regions and health facilities. A further limitation is that the models were developed from data obtained in four health facilities in two regions. Therefore, we cannot generalize our results to the entire population of the country.

While several studies have shown that using clinical symptoms to predict malaria is not a practical idea, the strength of this study lies in combining clinical symptoms with patients’ demographic information to predict malaria using machine learning classifiers. Another strength is that the study dataset represents patients who live in both low-endemic and high-endemic areas. In addition, our dataset included medical records from patients’ files and surveys of patients visiting the health facility.

Conclusion

This study developed a region-specific malaria predictive model for malaria diagnosis based on clinical symptoms and demographic data. The model can serve as the basis for a clinically based diagnosis system for malaria. Furthermore, this study demonstrates that using the right machine learning classifiers and the important features for each dataset is useful in predicting the disease. Overall, Random Forest showed outstanding performance in distinguishing malaria patients from healthy ones in both the low- and high-endemic areas. Our results are a necessary first step toward designing a decision support system built on the proposed model, which will be more suitable for people who cannot access laboratory-based diagnosis tools or a health facility before any treatment. Therefore, we recommend that future studies include more regions and enlarge the dataset to improve the model’s performance and inclusivity.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The author(s) reported there is no funding associated with the work featured in this article.

References

  • Abba, K., J. J. Deeks, P. L. Olliaro, C.-M. Naing, S. M. Jackson, Y. Takwoingi, S. Donegan, and P. Garner. 2011. Rapid diagnostic tests for diagnosing uncomplicated P. falciparum malaria in endemic countries. Cochrane Database of Systematic Reviews. doi:10.1002/14651858.CD008122.PUB2.
  • Attinsounon, C. A., Y. Sissinto, E. Avokpaho, A. Alassani, M. Sanni, and M. Zannou. 2019. Self-medication practice against malaria and associated factors in the City of Parakou in Northern Benin: Results of a population survey in 2017. Advances in Infectious Diseases 9 (3):263–2031. doi:10.4236/aid.2019.93020.
  • Azar, A. T., H. I. Elshazly, A. E. Hassanien, and A. M. Elkorany. 2014. A random forest classifier for lymph diseases. Computer Methods and Programs in Biomedicine 113 (2):465–73. doi:10.1016/J.CMPB.2013.11.004.
  • Baltzell, K., T. B. Kortz, E. Scarr, A. Blair, A. Mguntha, G. Bandawe, E. Schell, and S. Rankin. 2019. ’Not all fevers are Malaria’: A mixed methods study of non-malarial fever management in Rural Southern Malawi. Rural and Remote Health 19 (2). doi: 10.22605/RRH4818.
  • Bbosa, F., R. Wesonga, and P. Jehopio. 2016. Clinical malaria diagnosis: Rule-based classification statistical prototype. SpringerPlus 5 (1):1–14. doi:10.1186/S40064-016-2628-0/TABLES/7.
  • Bibin, D., M. S. Nair, and P. Punitha. 2017. Malaria parasite detection from peripheral blood smear images using deep belief networks. IEEE Access 5:9099–108. doi:10.1109/ACCESS.2017.2705642.
  • Bria, Y. P., C. H. Yeh, and S. Bedingfield. 2021. Significant symptoms and nonsymptom-related factors for malaria diagnosis in endemic regions of Indonesia. International Journal of Infectious Diseases 103:194–200. doi:10.1016/j.ijid.2020.11.177.
  • Brodersen, K. H., F. Haiss, C. S. Ong, F. Jung, M. Tittgemeyer, J. M. Buhmann, B. Weber, and K. E. Stephan. 2011. Model-based feature construction for multivariate decoding. NeuroImage 56 (2):601–15. doi:10.1016/j.neuroimage.2010.04.036.
  • Brown, B. J., P. Manescu, A. A. Przybylski, F. Caccioli, G. Oyinloye, M. Elmi, M. J. Shaw, V. Pawar, R. Claveau, J. Shawe-Taylor, et al. 2020. Data-driven malaria prevalence prediction in large densely populated urban holoendemic sub-Saharan West Africa. Scientific Reports 10 (1). doi: 10.1038/S41598-020-72575-6.
  • Chandramohan, D., I. Carneiro, A. Kavishwar, R. Brugha, V. Desai, and B. Greenwood. 2001. A clinical algorithm for the diagnosis of malaria: Results of an evaluation in an area of low endemicity. Tropical Medicine and International Health 6 (7):505–10. doi:10.1046/j.1365-3156.2001.00739.x.
  • Chen, P.-H. C., Y. Liu, and L. Peng. 2019. How to develop machine learning models for healthcare. Nature Materials 18 (5):410–14. doi:10.1038/s41563-019-0345-0.
  • Chipwaza, B., J. P. Mugasa, I. Mayumana, M. Amuri, C. Makungu, and P. S. Gwakisa. 2014. Self-medication with anti-malarials is a common practice in rural communities of Kilosa district in Tanzania despite the reported decline of malaria. Malaria Journal 13 (1):252. doi:10.1186/1475-2875-13-252.
  • Crump, J. A., P. N. Newton, S. J. Baird, and Y. Lubell. 2017. Febrile illness in adolescents and adults. Disease Control Priorities, Third Edition (Volume 6): Major Infectious Diseases 365–85. doi:10.1596/978-1-4648-0524-0_CH14.
  • D’Acremont, V., C. Lengeler, H. Mshinda, D. Mtasiwa, M. Tanner, and B. Genton. 2009. Time to move from presumptive malaria treatment to laboratory-confirmed diagnosis and treatment in African children with fever. PLoS Medicine 6 (1):e252. doi:10.1371/journal.pmed.0050252.
  • Das, D. K., M. Ghosh, M. Pal, A. K. Maiti, and C. Chakraborty. 2013. Machine learning approach for automated screening of malaria parasite using light microscopic images. Micron 45:97–106. doi:10.1016/J.MICRON.2012.11.002.
  • Davenport, T., and R. Kalakota. 2019. The potential for artificial intelligence in healthcare. Future Healthcare Journal 6 (2):94–98. doi:10.7861/futurehosp.6-2-94.
  • DB, D., F. P, and K. N. 2018. Machine learning approaches for clinical psychology and psychiatry. Annual Review of Clinical Psychology 14 (1):91–118. doi:10.1146/ANNUREV-CLINPSY-032816-045037.
  • Debora, C. K., and E. Moses. 2017. Self-medication practices and predictors for self-medication with antibiotics and anti-malarials among community in Mbeya City, Tanzania. Tanzania Journal of Health Research 19 (4). doi: 10.4314/THRB.V19I4.
  • Dharap, P., and S. Raimbault. 2020. Performance evaluation of machine learning-based infectious screening flags on the HORIBA Medical Yumizen H550 Haematology Analyzer for vivax malaria and dengue fever. Malaria Journal 19 (1):1–10. doi:10.1186/S12936-020-03502-3.
  • Fatima, M., and M. Pasha. 2017. Survey of machine learning algorithms for disease diagnostic. Journal of Intelligent Learning Systems and Applications 9 (1):1–16. http://file.scirp.org/Html/.
  • Femi Aminu, E., E. Onyebuchi Ogbonnia, and I. Shehi Shehu. 2016. A predictive symptoms-based system using support vector machines to enhanced classification accuracy of malaria and typhoid coinfection. International Journal of Mathematical Sciences and Computing 2 (4):54–66. doi:10.5815/ijmsc.2016.04.06.
  • Ford, C. T., G. Alemayehu, K. Blackburn, K. Lopez, C. C. Dieng, E. Lo, L. Golassa, and D. Janies. 2020. Modeling plasmodium falciparum diagnostic test sensitivity using machine learning with histidine-rich Protein 2 variants. MedRxiv 20114785. doi:10.1101/2020.05.27.20114785.
  • Fuhad, K. M. F., J. F. Tuba, M. R. A. Sarker, S. Momen, N. Mohammed, and T. Rahman. 2020a. Deep learning based automatic malaria parasite detection from blood smear and its smartphone based application. Diagnostics 10 (5):329. doi:10.3390/diagnostics10050329.
  • Fuhad, K. M. F., J. F. Tuba, M. R. A. Sarker, S. Momen, N. Mohammed, and T. Rahman. 2020b. Deep learning based automatic malaria parasite detection from blood smear and its smartphone based application. Diagnostics 10 (5):329. doi:10.3390/DIAGNOSTICS10050329.
  • Goodyer, L. 2015. Dengue fever and chikungunya: Identification in travellers. Clinical Pharmacist 7 (4). doi: 10.1211/cp.2015.20068429.
  • Gosling, R. D., C. J. Drakeley, A. Mwita, and D. Chandramohan. 2008. Presumptive treatment of fever cases as malaria: Help or hindrance for malaria control? In Malaria Journal 7(1):132. BioMed Central. doi:10.1186/1475-2875-7-132.
  • Graz, B., M. Willcox, T. Szeless, and A. Rougemont. 2011. Test and treat or presumptive treatment for malaria in high transmission situations? A reflection on the latest WHO guidelines. Malaria Journal 10 (1):1–8. doi:10.1186/1475-2875-10-136.
  • Grobusch, M. P., and P. Schlagenhauf. 2019. 16 – Self-diagnosis and self-treatment of malaria by the traveler. In Travel Medicine 169–78. doi:10.1016/B978-0-323-54696-6.00016-1.
  • Habib, P. T., A. M. Alsamman, S. E. Hassnein, G. A. Shereif, and A. Hamwieh. n.d. Assessment of machine learning algorithms for prediction of breast cancer malignancy based on mammogram numeric data. doi:10.1101/2020.01.08.20016949.
  • Harvey, D., W. Valkenburg, A. Amara, and T. R. Gadekallu. 2021. Predicting malaria epidemics in Burkina Faso with machine learning. PLoS ONE 16 (6):e0253302. doi:10.1371/JOURNAL.PONE.0253302.
  • Hertz, J. T., D. B. Madut, R. A. Tesha, G. William, R. A. Simmons, S. W. Galson, V. P. Maro, J. A. Crump, and M. P. Rubach. 2019. Self-medication with non-prescribed pharmaceutical agents in an area of low malaria transmission in northern Tanzania: A community-based survey. Transactions of the Royal Society of Tropical Medicine and Hygiene 113 (4):183–88. doi:10.1093/trstmh/try138.
  • Isiguzo, C., J. Anyanti, C. Ujuju, E. Nwokolo, A. De La Cruz, E. Schatzkin, S. Modrek, D. Montagu, J. Liu, and H. D. F. H. Schallig. 2014. Presumptive treatment of malaria from formal and informal drug vendors in Nigeria. PLoS ONE 9 (10):e110361. doi:10.1371/journal.pone.0110361.
  • Iyer, A., J. S, and R. Sumbaly. 2015. Diagnosis of diabetes using classification mining techniques. International Journal of Data Mining & Knowledge Management Process 5 (1):01–14. http://www.aircconline.com/ijdkp/V5N1/5115ijdkp01.pdf.
  • Jiang, F., Y. Jiang, H. Zhi, Y. Dong, H. Li, S. Ma, Y. Wang, Q. Dong, H. Shen, and Y. Wang. 2017. Artificial intelligence in healthcare: Past, present and future. Stroke and Vascular Neurology 2 (4):230–43. doi:10.1136/SVN-2017-000101. https://pubmed.ncbi.nlm.nih.gov/29507784/
  • Kazaura, M. R. 2017. Level and correlates of self-medication among adults in a rural setting of mainland Tanzania. Indian Journal of Pharmaceutical Sciences 79 (3):451–57. doi:10.4172/pharmaceutical-sciences.1000248.
  • Khare, A., M. Jeon, I. K. Sethi, and B. Xu. 2017. Machine learning theory and applications for healthcare. In Journal of Healthcare Engineering 2017:1–2. Hindawi Limited. doi:10.1155/2017/5263570.
  • Kim, J. K., Y. J. Choo, and M. C. Chang. 2021. Prediction of motor function in stroke patients using machine learning algorithm: Development of practical models. Journal of Stroke and Cerebrovascular Diseases 30 (8):105856. doi:10.1016/J.JSTROKECEREBROVASDIS.2021.105856.
  • Krishnani, D., A. Kumari, A. Dewangan, A. Singh, and N. S. Naik. 2019. Prediction of coronary heart disease using supervised machine learning algorithms. IEEE Region 10 Annual International Conference, Proceedings/TENCON, 17 20 October 2019. Kochi, India, 367–72, October. doi:10.1109/TENCON.2019.8929434. https://ieeexplore.ieee.org/document/8929434
  • Lee, Y. W., J. W. Choi, and E. H. Shin. 2021. Machine learning model for predicting malaria using clinical information. Computers in Biology and Medicine 129:104151. doi:10.1016/J.COMPBIOMED.2020.104151.
  • Liang, Z., A. Powell, I. Ersoy, M. Poostchi, K. Silamut, K. Palaniappan, P. Guo, M. A. Hossain, A. Sameer, and R. J. Maude, et al. 19 January 2017. CNN-based image analysis for malaria diagnosis. Proceedings - 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016, 15 - 18 December 2016. Shenzhen, China, 493–96. 25 January 2022. doi:10.1109/BIBM.2016.7822567. https://ieeexplore.ieee.org/document/7822567
  • Madhu, G. 2020. Computer vision and machine learning approach for malaria diagnosis in thin blood smears from microscopic blood images, 191–209. Singapore: Springer. doi:10.1007/978-981-15-3689-2_8.
  • Mariki, M., N. Mduma, and E. Mkoba. 2021. Feature selection approach to improve malaria diagnosis model for high and low endemic areas of Tanzania . doi:10.20944/PREPRINTS202111.0243.V1. https://www.preprints.org/manuscript/202111.0243/v1
  • Masud, M., H. Alhumyani, S. S. Alshamrani, O. Cheikhrouhou, S. Ibrahim, G. Muhammad, M. S. Hossain, and M. Shorfuzzaman. 2020. Leveraging deep learning techniques for malaria parasite detection using mobile application. Wireless Communications and Mobile Computing 2020:1–15. doi:10.1155/2020/8895429.
  • Mboera, L. E. G., E. A. Makundi, and A. Y. Kitua. 2007. Uncertainty in malaria control in Tanzania: Crossroads and challenges for future interventions. https://www.ncbi.nlm.nih.gov/books/NBK1714/
  • Menard, D., and A. Dondorp. 2017. Anti-malarial drug resistance: A threat to malaria elimination. Cold Spring Harbor Perspectives in Medicine 7 (7):1–24. doi:10.1101/cshperspect.a025619.
  • Metta, E., H. Haisma, F. Kessy, I. Hutter, and A. Bailey. 2014. “we have become doctors for ourselves”: Motives for malaria self-care among adults in southeastern Tanzania. Malaria Journal 13 (1):249. doi:10.1186/1475-2875-13-249.
  • Michael, D., and S. P. Mkunde. 2017. The malaria testing and treatment landscape in mainland Tanzania, 2016. Malaria Journal 16 (1):1–15. doi:10.1186/S12936-017-1819-7.
  • Moreno-Ibarra, M.-A., Y. Villuendas-Rey, M. D. Lytras, C. Yáñez-Márquez, and J.-C. Salgado-Ramírez. 2021. Classification of diseases using machine learning algorithms: A comparative study. Mathematics 9 (15):1817. doi:10.3390/MATH9151817.
  • Mpapalika, J. J., and N. Matowo. 2020. The application of Artificial Intelligence in the diagnosis and treatment of malaria in Tanzania. J Infectious Diseases Diagnosis 1. https://www.mdpi.
  • MS, S., F. E, and C. J. 2020. Machine learning for brain stroke: A review. Journal of Stroke and Cerebrovascular Diseases : The Official Journal of National Stroke Association 29 (10). doi: 10.1016/J.JSTROKECEREBROVASDIS.2020.105162.
  • Muthumbi, A., A. Chaware, K. Kim, K. C. Zhou, P. C. Konda, R. Chen, B. Judkewitz, A. Erdmann, B. Kappes, and R. Horstmeyer. 2019. Learned sensing: Jointly optimized microscope hardware for accurate image classification. Biomedical Optics Express 10 (12):6351. doi:10.1364/boe.10.006351.
  • Mwai, L., E. Ochong, A. Abdirahman, S. M. Kiara, S. Ward, G. Kokwaro, P. Sasi, K. Marsh, S. Borrmann, M. MacKinnon, et al. 2009. Chloroquine resistance before and after its withdrawal in Kenya. Malaria Journal 8 (1):106. doi:10.1186/1475-2875-8-106.
  • Mwanga, E. P., E. G. Minja, E. Mrimi, M. G. Jiménez, J. K. Swai, S. Abbasi, H. S. Ngowo, D. J. Siria, S. Mapua, C. Stica, et al. 2019b. Detection of malaria parasites in dried human blood spots using mid-infrared spectroscopy and logistic regression analysis. Malaria Journal 18 (1). doi: 10.1186/S12936-019-2982-9.
  • Mwanga, E. P., S. A. Mapua, D. J. Siria, H. S. Ngowo, F. Nangacha, J. Mgando, F. Baldini, M. González Jiménez, H. M. Ferguson, K. Wynne, et al. 2019a. Using mid-infrared spectroscopy and supervised machine-learning to identify vertebrate blood meals in the malaria vector, Anopheles arabiensis. Malaria Journal 18 (1). doi: 10.1186/s12936-019-2822-y.
  • Mwita, S., O. Meja, D. Katabalo, and C. Richard. 2019. Magnitude and factors associated with anti-malarial self-medication practice among residents of Kasulu Town Council, Kigoma-Tanzania. African Health Sciences 19 (3):2457–61. doi:10.4314/ahs.v19i3.20.
  • Nadjm, B., B. Amos, G. Mtove, J. Ostermann, S. Chonya, H. Wangai, J. Kimera, W. Msuya, F. Mtei, D. Dekker, et al. 2010. WHO guidelines for antimicrobial treatment in children admitted to hospital in an area of intense Plasmodium falciparum transmission: Prospective study. BMJ (Online) 340 (7751):848. doi:10.1136/BMJ.C1350.
  • Ngasala, B., M. Mubi, M. Warsame, M. G. Petzold, A. Y. Massele, L. L. Gustafsson, G. Tomson, Z. Premji, and A. Bjorkman. 2008. Impact of training in clinical and microscopy diagnosis of childhood malaria on anti-malarial drug prescription and health outcome at primary health care level in Tanzania: A randomized controlled trial. Malaria Journal 7 (1):199. doi:10.1186/1475-2875-7-199.
  • Nkumama, I. N., W. P. O’Meara, and F. H. A. Osier. 2017. Changes in malaria epidemiology in Africa and new challenges for elimination. Trends in Parasitology 33(2):128–40. Elsevier Ltd. doi:10.1016/j.pt.2016.11.006.
  • Oguntimilehin, A. n.d. A mobile malaria fever clinical diagnosis system based on Non-Nested Generalized Exemplar (NNGE). International Journal of Emerging Trends in Engineering Research 8 (2):259–64. Accessed January 8, 2022. https://www.academia.edu/43021750/A_Mobile_Malaria_Fever_Clinical_Diagnosis_System_Based_on_Non_Nested_Generalized_Exemplar_NNGE_
  • Patil, P., N. Yaligar, and S. Meena. 2018. Comparision of performance of classifiers - SVM, RF and ANN in potato blight disease detection using leaf images. 2017 IEEE International Conference on Computational Intelligence and Computing Research, ICCIC 2017, 14-16 Dec. 2017. 25 January 2022. Coimbatore, India. doi:10.1109/ICCIC.2017.8524301. https://ieeexplore.ieee.org/document/8524301
  • Pillay, E., S. Khodaiji, B. C. Bezuidenhout, M. Litshie, and T. L. Coetzer. 2019. Evaluation of automated malaria diagnosis using the Sysmex XN-30 analyser in a clinical setting. Malaria Journal 18 (1):15. doi:10.1186/s12936-019-2655-8.
  • Poostchi, M., K. Silamut, R. J. Maude, S. Jaeger, and G. Thoma. 2018. Image analysis and machine learning for detecting malaria. In Translational Research, vol. 194, 36–55. 25 January 2022. doi:10.1016/j.trsl.2017.12.004.
  • Priyadarshini, R., N. Dash, and R. Mishra. 2014. A novel approach to predict diabetes mellitus using modified extreme learning machine. 2014 International Conference on Electronics and Communication Systems, ICECS 2014, 13-14 Feb. 2014. 25 January 2022. Coimbatore, India. doi:10.1109/ECS.2014.6892740. https://ieeexplore.ieee.org/document/6892740?reload=true
  • Rajaraman, S., S. Jaeger, and S. K. Antani. 2019. Performance evaluation of deep neural ensembles toward malaria parasite detection in thin-blood smear images. PeerJ 7:e6977. doi:10.7717/peerj.6977.
  • Rajaraman, S., S. K. Antani, M. Poostchi, K. Silamut, M. A. Hossain, R. J. Maude, S. Jaeger, and G. R. Thoma. 2018. Pre-trained convolutional neural networks as feature extractors toward improved malaria parasite detection in thin blood smear images. PeerJ 2018(4). doi:10.7717/peerj.4568
  • Rao, A. R., and B. S. Renuka. 2020. A machine learning approach to predict thyroid disease at early stages of diagnosis. 2020 IEEE International Conference for Innovation in Technology, INOCON 2020, 6-8 Nov. 2020. 25 January 2022. Bangluru, India. doi:10.1109/INOCON50539.2020.9298252. https://ieeexplore.ieee.org/document/9298252
  • Ravalji, R., N. Shah, and M. Nai. 2020. Malaria disease detection using machine learning. International Research Journal of Engineering and Technology (IRJET) 7 (12): 1180–1188. https://www.irjet.net/archives/V7/i12/IRJET-V7I12212.pdf
  • Saranya, G., and A. Pravin. 2020. A comprehensive study on disease risk predictions in machine learning. International Journal of Electrical and Computer Engineering (IJECE) 10 (4):4217–25. doi:10.11591/ijece.v10i4.pp4217-4225.
  • Shailaja, K., B. Seetharamulu, and M. A. Jabbar. 2018. Machine learning in healthcare: A review. Proceedings of the 2nd International Conference on Electronics, Communication and Aerospace Technology, ICECA 2018, 29-31 March 2018. Coimbatore, India, 910–14. 25 January 2022. doi:10.1109/ICECA.2018.8474918. https://ieeexplore.ieee.org/document/8474918
  • Shekalaghe, S., M. Cancino, C. Mavere, O. Juma, A. Mohammed, S. Abdulla, and S. Ferro. 2013. Clinical performance of an automated reader in interpreting malaria rapid diagnostic tests in Tanzania. Malaria Journal 12 (1):1–9. doi:10.1186/1475-2875-12-141.
  • Sidey-Gibbons, J. A. M., and C. J. Sidey-Gibbons. 2019. Machine learning in medicine: A practical introduction. BMC Medical Research Methodology 19 (1):64. doi:10.1186/s12874-019-0681-4.
  • Sigonda-Ndomondo, M., R. Mbwasi, R. Shirima, N. Heltzer, and M. Clark. 2004. Accredited drug dispensing outlets: improving access to quality drugs and services in rural and periurban areas with few or no pharmacies. Second International Conference on Improving Use of Medicines, 30 March - 2 April 2004. Chiang Mai, Thailand.
  • Swaminathan, S., K. Qirko, T. Smith, E. Corcoran, N. G. Wysham, G. Bazaz, G. Kappel, A. N. Gerber, and M. Milanese. 2017. A machine learning approach to triaging patients with chronic obstructive pulmonary disease. PLOS ONE 12 (11):e0188532. doi:10.1371/JOURNAL.PONE.0188532.
  • Triantafyllidis, A. K., and A. Tsanas. 2019. Applications of machine learning in real-life digital health interventions: Review of the literature. In Journal of Medical Internet Research 21 (4):e12286. doi:10.2196/12286.
  • Uddin, S., A. Khan, M. E. Hossain, and M. A. Moni. 2019. Comparing different supervised machine learning algorithms for disease prediction. BMC Medical Informatics and Decision Making 19 (1):1–16/281. 25 January 2022. doi:10.1186/S12911-019-1004-8. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-019-1004-8
  • Ullah, R., S. Khan, H. Ali, I. I. Chaudhary, M. Bilal, and I. Ahmad. 2019. A comparative study of machine learning classifiers for risk prediction of asthma disease. Photodiagnosis and Photodynamic Therapy 28:292–96. doi:10.1016/J.PDPDT.2019.10.011.
  • UM, C. 2016. Malaria treatment in children based on presumptive diagnosis: A make or mar? Pediatric Infectious Diseases: Open Access 1 (2). doi: 10.21767/2573-0282.100006.
  • van Driel, N. 2020. Automating malaria diagnosis: a machine learning approach: Erythrocyte segmentation and parasite identification in thin blood smear microscopy images using convolutional neural networks. https://repository.tudelft.nl/islandora/object/uuid%3A2db4839e-7c68-4536-895e-13e51821c9a9
  • Wang, D., P. Chaki, Y. Mlacha, T. Gavana, M. G. Michael, R. Khatibu, J. Feng, Z. B. Zhou, K. M. Lin, S. Xia, et al. 2019. Application of community-based and integrated strategy to reduce malaria disease burden in southern Tanzania: The study protocol of China-UK-Tanzania pilot project on malaria control. Infectious Diseases of Poverty 8 (1):1–6. doi:10.1186/s40249-018-0507-3.
  • WHO-Guidelines. 2015. For the treatment of malaria guidelines. www.who.int
  • WHO. 2015. WHO | Guidelines for the treatment of malaria - 3rd Edition. Geneva: World Health Organization.
  • WHO. 2019. World Malaria Report 2019. Geneva: World Health Organization. https://www.who.int/publications/i/item/9789241565721
  • WHO. 2020. World malaria report 2020. Geneva: World Health Organization. https://www.who.int/publications/i/item/9789240015791
  • WHO. 2021. INTRODUCTION - WHO guidelines for malaria - NCBI Bookshelf. Geneva: NCBI. https://www.ncbi.nlm.nih.gov/books/NBK568497/.
  • Yadav, S. S., V. J. Kadam, S. M. Jadhav, S. Jagtap, and P. R. Pathak. 2021. Machine learning based malaria prediction using clinical findings. 2021 International Conference on Emerging Smart Computing and Informatics, ESCI 2021, 5-7 March 2021, Pune, India, 216–22. 25 January 2022. doi:10.1109/ESCI50559.2021.9396850. https://ieeexplore.ieee.org/document/9396850
  • Yang, F., H. Yu, K. Silamut, R. J. Maude, S. Jaeger, and S. Antani. 2019. Smartphone-supported malaria diagnosis based on deep learning. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11861 LNCS 73–80. doi:10.1007/978-3-030-32692-0_9.