Original Research

Predicting recurrent aphthous ulceration using genetic algorithms-optimized neural networks

Pages 7-13 | Published online: 14 May 2010

Abstract

Objective

To construct and optimize a neural network that is capable of predicting the occurrence of recurrent aphthous ulceration (RAU) based on a set of appropriate input data.

Participants and methods

Artificial neural network (ANN) software employing genetic algorithms to optimize the architecture of the neural networks was used. Input and output data of 86 participants (predisposing factors and status of the participants with regard to recurrent aphthous ulceration) were used to construct and train the neural networks. The optimized neural networks were then tested using untrained data of a further 10 participants.

Results

The optimized neural network that produced the most accurate predictions for the presence or absence of recurrent aphthous ulceration was found to employ gender, hematological (with or without ferritin) and mycological data of the participants, frequency of tooth brushing, and consumption of vegetables and fruits.

Conclusions

Factors appearing to be related to recurrent aphthous ulceration and appropriate for use as input data to construct ANNs that predict recurrent aphthous ulceration were found to include the following: gender, hemoglobin, serum vitamin B12, serum ferritin, red cell folate, salivary candidal colony count, frequency of tooth brushing, and the number of fruits or vegetables consumed daily.

Recurrent aphthous ulceration

Recurrent aphthous ulceration (RAU) affects healthy as well as medically-compromised people. Aphthous ulcers are painful, shallow, and usually covered with a grayish white pseudomembrane that is surrounded by an erythematous margin.Citation1

Although the clinical characteristics of RAU are well defined, the precise etiology remains unclear, and therefore the term “idiopathic” is widely used.Citation2 Nevertheless, a number of predisposing factors have been linked to a minority of patients. A genetic background has been found for some RAU patients; those having positive family history for oral ulcerations have shown an increased frequency of human leukocyte antigen (HLA) types A2, A11, B12, and DR2.Citation2

Dietary patterns could play a role in the pathogenesis, either by causing hypersensitivity or through deficiency of some vitamins, proteins, or minerals.Citation3 Results of recent studies implicate cow's milk in the etiology of RAU.Citation4–Citation6 Recurrent aphthous-like ulcers are seen as oral manifestations of hematinic deficiencies of vitamin B1, B2, B6, B12, folic acid, or iron.Citation2 While some researchers found a significant relationship between vitamin B12 deficiency and RAU, it was also found that hemoglobin level and serum levels of folic acid and ferritin did not have a statistically significant effect on RAU.Citation1 Some researchers noticed that RAU patients ate acidic foods like oranges and lemons more frequently than participants in a control group.Citation3 Allergies to foods and other substances, including chocolate, cheese, gluten, cinnamaldehyde, methyl methacrylate, mercury, wheat flour, tomatoes, peanuts, and strawberries, might be responsible for the onset of oral ulcers.Citation2,Citation7–Citation9

A minority of patients may be predisposed to aphthous-like ulcers by systemic conditions or diseases.Citation10 Gender seems to be unrelated to the occurrence of RAU;Citation1 however, patients affected by RAU are usually nonsmokers.Citation2

Based on knowledge of the aforementioned predisposing factors, the diagnosis of RAU can be established by obtaining a proper history that confirms recurrence and excludes trauma as a predisposing factor. The clinical features of RAU are also important tools in establishing the diagnosis. RAU can appear in one of three forms: minor, major, and herpetiform.Citation11,Citation12

An artificial neural network (ANN) is an example of an intelligent data analysis tool and is claimed to be superior to classic regression.Citation13,Citation14 ANNs function in much the same way as neurons in the brain, which have the capability of acquiring, storing, and utilizing experiential knowledge.Citation15,Citation16 An ANN consists of an interconnected group of artificial neurons that process information using a connectionist approach to computation. It is an adaptive system that changes the values of some constants related to certain input data based on their effect on the output data.Citation15,Citation16
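To make the idea of interconnected artificial neurons concrete, the following is a minimal sketch of a forward pass through a network with one hidden layer. It is not the software used in this study; the logistic activation, the layer sizes, and all weight values are illustrative assumptions. The "constants" that an adaptive system adjusts during training correspond to the weights and biases shown here.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate one input vector through one hidden layer to a single output neuron."""
    # Each hidden neuron computes a weighted sum of the inputs plus a bias.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output neuron combines the hidden activations in the same way.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical weights for a 3-input, 2-hidden-neuron network (illustrative only).
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_biases = [0.0, 0.1]
output_weights = [1.2, -0.7]
output_bias = 0.05

output = forward([1.0, 0.5, -1.0], hidden_weights, hidden_biases, output_weights, output_bias)
print(output)
```

Training consists of adjusting the weights and biases so that the network output moves closer to the desired output for each training case.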

Genetic algorithms (GAs) are based on the triangle of genetic reproduction, evaluation, and selection.Citation17 Genetic reproduction is performed by means of two basic genetic operators: crossover and mutation. Evaluation is performed by means of the fitness function, which is dependent on the specific problem. Selection is the mechanism that selects parent individuals with probability proportional to their relative fitness. Some genetic algorithms (like the one used in this work) consist of the following steps:

Initialization. An initial population comprising a number of individuals is randomly generated in this phase.

Evaluation. The fitness, a positive measure of quality that reflects the degree of goodness of the individual, is calculated for each individual in the population.

Selection. Individuals are chosen from the current population to enter a mating pool devoted to the creation of new individuals for the next generation, such that the chance of a given individual being selected to mate is proportional to its relative fitness. This means that the best individuals produce more copies in subsequent generations so that their desirable traits may be passed on to their offspring. This step ensures that the overall quality of the population increases from one generation to the next.

Crossover. Provides the means by which valuable information is shared among the population. It combines the features of two parent individuals to form two child individuals who may have new patterns compared to those of their parents. Crossover also plays a central role in GAs.

Mutation. Often introduced to guard against premature convergence. Generally, over a period of several generations, the gene pool tends to become more and more homogeneous. The purpose of mutation is to introduce occasional perturbations to the parameters to maintain genetic diversity within the population.

Replacement. After the offspring population is generated by applying the genetic operators to the parent population, the parent population is totally replaced by the offspring population. This is known as non-overlapping, generational replacement. This completes the “life cycle” of the population.

Termination. The GA is terminated when some convergence criterion is met. Possible convergence criteria are: the fitness of the best individual so far found exceeds a threshold value, or the maximum number of generations is reached. After terminating the algorithm, the optimal solution of the problem is the best individual so far found.

The block diagram of the genetic algorithm is given in Figure 1.
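The steps above can be sketched in a few lines of code. The example below is an illustrative toy, not the algorithm implemented in the software used for this study: individuals are bit strings, the fitness function simply counts 1-bits, and the function names, population size, mutation rate, and generation limit are all assumptions chosen for demonstration.

```python
import random


def fitness(individual):
    """Toy fitness: number of 1-bits. In practice this is problem-specific."""
    return sum(individual)


def select_parent(population, scores, rng):
    """Fitness-proportional (roulette-wheel) selection of one parent."""
    # A tiny constant keeps every weight positive even if all scores are zero.
    return rng.choices(population, weights=[s + 1e-9 for s in scores], k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover: two parents produce two children."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(individual, rate, rng):
    """Flip each bit with a small probability to maintain diversity."""
    return [1 - gene if rng.random() < rate else gene for gene in individual]


def run_ga(n_genes=20, pop_size=50, max_generations=200, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    # Initialization: random population.
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        # Evaluation: score every individual and track the best so far.
        scores = [fitness(ind) for ind in population]
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best
        # Termination: stop once the fitness threshold is reached.
        if fitness(best) == n_genes:
            break
        # Selection, crossover, and mutation build the offspring population.
        offspring = []
        while len(offspring) < pop_size:
            parent_1 = select_parent(population, scores, rng)
            parent_2 = select_parent(population, scores, rng)
            child_1, child_2 = crossover(parent_1, parent_2, rng)
            offspring.append(mutate(child_1, mutation_rate, rng))
            offspring.append(mutate(child_2, mutation_rate, rng))
        # Replacement: non-overlapping, generational.
        population = offspring[:pop_size]
    return best


best_individual = run_ga()
print(fitness(best_individual), best_individual)
```

Because the best individual found so far is tracked separately from the population, generational replacement cannot lose it, which matches the statement that the solution is the best individual found up to termination.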

Figure 1 Diagram of the steps of genetic algorithms.

The parameters that are optimized using the genetic algorithm are the number of layers, the number of neurons, and the corresponding weights during the training phase. The network’s output for each individual is compared with the desired output and the overall error rate is minimized throughout the evolution process of the genetic algorithm.Citation18

ANN was originally used in medicine to investigate the causality of a number of diseases and it was found to have relatively high accuracy.Citation19–Citation23 Some researchers used ANN to diagnose celiac disease based on the occurrence of oral lesions including RAU.Citation24 Others used it to predict survival rates of cancer patients undergoing esophagus and esophagogastric junction resections,Citation25 to predict relapse in breast cancer patients,Citation26 to predict lymph node metastasis in gastric cancer,Citation27 to diagnose and predict survival of patients with colon cancer,Citation28 to predict radiation-induced liver disease,Citation29 and to study pancreatic cancer.Citation21 Despite the promising medical applications of ANN, its use in oral medicine is still limited and is mainly focused on oral cancer and precancer.Citation30–Citation35

The aim of this study was to find the predisposing factors suitable for constructing artificial neural networks capable of predicting the occurrence of RAU.

Participants and method

Participants

All 96 participants included in this study were patients attending the Orthodontics Clinic in the Dental Department at The University of Jordan Hospital. Patients were first-time attendees seeking orthodontic treatment for mild to moderate malocclusion. Participants included in this study reported a medical history free of any disease except for common infectious diseases like flu or common colds and had no oral or dental pathologies. The 96 patients in this study were divided into two groups. Group 1 consisted of 86 patients for the construction phase of the ANNs and group 2 consisted of 10 patients for the reproduction (prediction) phase of the study.

Method

Patients were asked to fill out a questionnaire containing items concerning: oral hygiene habits (tooth brushing, use of mouth wash, and use of dental floss), nutritional habits (daily consumption of fresh fruits and vegetables), and history of recurrent nontraumatic oral ulceration.

All patients were investigated for complete blood count, serum vitamin B12, serum ferritin, and red cell folate. Blood samples for complete blood count and red cell folate were collected in EDTA tubes. Blood samples for ferritin and B12 were collected in plain tubes. All tests were analyzed in batch samples at the University of Jordan Hospital Clinical Laboratories.

A sample of saliva was collected from each patient. Patients were instructed to expectorate all saliva into a sterile container for a period of 5 minutes; additionally, they were asked not to eat or drink for at least 1 hour prior to the procedure. Salivary samples were cultured within 2 hours of collection on Sabouraud glucose agar plates using the streaking method, and incubated at 35°C for 24–48 hours. All yeast-like colonies were recorded and identified as Candida or budding yeast cells by using wet preparations, ChromCandida agar, and RapID Yeast Plus Systems for Yeast Species (Remel, KS, USA).

The software used to construct the ANNs employed genetic algorithms for network optimization. The population size was set to 50, ie, each generation was a batch of 50 ANNs. If no individual (ANN configuration) in a generation reached a fitness of 100%, the operation proceeded to the next generation, in which another batch of 50 ANNs was produced, and so on until at least one ANN with a fitness of 100% was produced.Citation18
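The stopping rule can be illustrated with a short sketch. The source does not define how the software computes fitness, so the definition used here (the percentage of training cases whose rounded output matches the recorded status) is an assumption, as are the function names. Candidate networks are represented by plain Python callables standing in for trained ANNs.

```python
def fitness_percent(predict, inputs, targets):
    """Assumed fitness: percentage of training cases reproduced correctly.

    Outputs below 1.5 count as class 1, all other outputs as class 2.
    """
    hits = sum(
        1 for x, target in zip(inputs, targets)
        if (1 if predict(x) < 1.5 else 2) == target
    )
    return 100.0 * hits / len(targets)


def first_perfect(generations, inputs, targets):
    """Scan generations in order; return (index, candidate) for the first
    candidate that reaches 100% fitness, or None if no candidate does."""
    for index, batch in enumerate(generations):
        for candidate in batch:
            if fitness_percent(candidate, inputs, targets) == 100.0:
                return index, candidate
    return None


# Toy data: one input feature per case, target 1 or 2 (hypothetical values).
inputs = [[0.2], [0.9], [0.4]]
targets = [1, 2, 1]

# Hypothetical stand-ins for two generations of candidate networks.
generation_0 = [lambda x: 1.0, lambda x: 2.0]
generation_1 = [lambda x: 1.0 if x[0] < 0.5 else 2.0]

result = first_perfect([generation_0, generation_1], inputs, targets)
print(result[0])  # index of the generation in which a perfect candidate appeared
```

In a real run each generation would hold 50 candidates produced by the genetic operators rather than hand-written functions.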

Nine ANNs were constructed. For each network, a different group of predisposing factors was used as input data, as detailed below. Output data were always the same for the networks and described the presence (expressed as 1) or absence (expressed as 2) of oral ulceration for each participant.

Input data (predisposing factors) for each network were as follows:

Network 1: Gender, hemoglobin, serum vitamin B12, serum ferritin, red cell folate, salivary candidal colony count, frequency of tooth brushing and flossing daily, frequency of using mouth wash weekly, and the number of fruit and vegetables consumed daily.

Network 2: Gender, hematological, and mycological results.

Network 3: Gender, hematological, and mycological results, tooth brushing, and consumption of fresh fruits and vegetables.

Network 4: Gender and hematological results.

Network 5: As network 3 but without gender.

Network 6: As network 3 but without the data for hemoglobin.

Network 7: As network 3 but without the data for vitamin B12.

Network 8: As network 3 but without the data for ferritin.

Network 9: As network 3 but without the data for red cell folate.
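The nine input sets listed above can be written down as a small configuration table, which makes the differences between networks easy to check. The column names are hypothetical, and two readings are assumptions: "hematological results" is taken to mean hemoglobin, vitamin B12, ferritin, and red cell folate, and brushing and flossing in network 1 are treated as separate inputs.

```python
# Assumed grouping of the measured variables (column names are hypothetical).
HEMATOLOGICAL = ["hemoglobin", "vitamin_b12", "ferritin", "red_cell_folate"]
MYCOLOGICAL = ["candidal_colony_count"]

NETWORK_3 = ["gender", *HEMATOLOGICAL, *MYCOLOGICAL, "tooth_brushing", "fruit_veg_daily"]


def without(features, dropped):
    """Return a copy of the feature list with one feature removed."""
    return [name for name in features if name != dropped]


NETWORK_INPUTS = {
    1: ["gender", *HEMATOLOGICAL, *MYCOLOGICAL, "tooth_brushing", "flossing",
        "mouthwash_weekly", "fruit_veg_daily"],
    2: ["gender", *HEMATOLOGICAL, *MYCOLOGICAL],
    3: NETWORK_3,
    4: ["gender", *HEMATOLOGICAL],
    5: without(NETWORK_3, "gender"),
    6: without(NETWORK_3, "hemoglobin"),
    7: without(NETWORK_3, "vitamin_b12"),
    8: without(NETWORK_3, "ferritin"),
    9: without(NETWORK_3, "red_cell_folate"),
}

for number, features in NETWORK_INPUTS.items():
    print(number, len(features), features)
```

Networks 5 to 9 each drop exactly one input from network 3, so comparing their accuracy against network 3 indicates how much that single factor contributes.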

All networks were designed and constructed based on GA optimization.

Following the construction phase of the networks, the output was reproduced (network output) and the network was trained on input and output data until the deviation of the network output from the actual output was very small.

In the final phase, each network was used to obtain predictions of the RAU status of patients in group 2 using the input data of the patients in that group.

Network predictions that were less than 1.5 were considered equivalent to 1, indicating presence of RAU. If the network prediction was 1.5 or more, it was considered equivalent to 2, indicating absence of RAU.
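This decision rule amounts to a single threshold at 1.5. A minimal sketch follows; the function name is hypothetical.

```python
def decode_prediction(value: float) -> int:
    """Map a continuous network output to the coded RAU status.

    Returns 1 (RAU present) for outputs below 1.5 and 2 (RAU absent)
    for outputs of 1.5 or more.
    """
    return 1 if value < 1.5 else 2


for raw_output in (0.8, 1.49, 1.5, 2.3):
    print(raw_output, decode_prediction(raw_output))
```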

Statistical tests of significance

Statistical tests of significance were used to explore differences in gender, hemoglobin, vitamin B12, ferritin, red cell folate, candidal colony count, oral hygiene, and dietary habits between the group with RAU and the group without RAU (Table 2).

Table 2 Results of tests of significance to differences in gender, hematology, and colonies counts between participants with and without RAU

Ethical committee approval (by the University of Jordan Hospital) was obtained to carry out this study. All participants (or their guardians when required) signed a consent form to participate in this study.

Results

Table 1 displays the predictions of the nine networks for patients in group 2. Networks 3 and 8 produced the most accurate predictions. Networks 5 and 7 produced the least accurate predictions.

Table 1 Accuracy of predictions made by the different networks to the status of RAU in patients of group 2

The accuracy of predictions was 90% with networks 3 and 8, 80% with networks 4, 6, and 9, 70% with networks 1 and 7, and 60% with networks 2 and 5.
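With 10 patients in group 2, each misclassified patient changes the accuracy by 10 percentage points, so 90% corresponds to 9 correct predictions out of 10. The sketch below shows the calculation; the status lists are hypothetical and are not the study data.

```python
def accuracy_percent(predicted, actual):
    """Percentage of cases in which the predicted status equals the actual status."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


# Hypothetical coded statuses for 10 test patients (1 = RAU present, 2 = absent).
actual = [1, 2, 2, 1, 2, 1, 2, 2, 1, 2]
predicted = [1, 2, 2, 1, 2, 1, 2, 2, 1, 1]  # one disagreement, in the last case

print(accuracy_percent(predicted, actual))  # 90.0
```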

Table 2 displays the results of statistical tests of significance. At the 95% confidence level, participants with RAU were not significantly different from those without RAU regarding all possible predisposing factors.

Discussion

Little work has been done on the use of artificial intelligence to predict diseases. This study is the first to utilize ANN for the prediction of the poorly understood disease entity termed RAU. Although a number of factors have been linked to RAU,Citation2,Citation3 it can be considered an idiopathic disease with an unknown etiology in most cases.Citation2

While ordinary statistical tests of significance could not detect significant differences in the possible predisposing factors between participants who were affected by RAU and those who were not, some ANNs constructed in this study and trained on the same values could detect a pattern that incriminates some of the above-mentioned factors as predisposing factors to RAU.

It is important to note that statistical tests were performed at the 95% confidence level. Some researchers advocate the use of different confidence levels when testing statistical significance in certain situations.Citation36

The better performance of ANN in this study over ordinary statistics is in agreement with the findings of Kattan.Citation14 The ANN software used in this study employed a specialized genetic algorithm for the build-up and optimization of all tested networks; however, the performance of the different networks was not consistent as accuracy depended on the choice of assumed input data (predisposing factors) for any given network.

A neural network's estimate commonly differs from the actual value. Hence, because the desired output in this study was either 1 or 2, each network output was rounded to the nearer of these two values.

It has been noticed that the more the network is trained on the supplied set of data, the more accurate it becomes.

For previously unseen data, and depending on the aforementioned factors, the accuracy of the ANNs can reach 90%. This in itself has significant clinical value.

Gender, hematological and mycological data, tooth brushing, and consumption of fruits and vegetables were the most important factors that produced networks with high accuracy. However, the elimination of ferritin as a predisposing factor did not affect the accuracy of the network. In fact, predictions made with network 8 had a sum of deviations equal to zero. This renders network 8 the most accurate network in this study (Figure 2).

Figure 2 Network 8 and the weights of its different neurons.

If trained on more input data and output data from new patients in the future, this network may be able to reach 100% accuracy.

Conclusion

Gender, hematological (without ferritin) and mycological data, tooth brushing, and fruits and vegetables consumed were found to be related to the occurrence of RAU.

Acknowledgements

We would like to thank the Deanship of Scientific Research, University of Jordan, Amman, Jordan for providing the necessary funds to carry out this study.

We would also like to thank Mrs Dareen Yaseen and Mrs Manal Saleh for performing the hematological and mycological laboratory tests.

Disclosure

The authors report no conflicts of interest in this work.

References

  • Koybasi S, Parlak AH, Serin E, Yilmaz F, Serin D. Recurrent aphthous stomatitis: investigation of possible etiologic factors. Am J Otolaryngol. 2006;27:229–232. PMID 16798397
  • Femiano F, Lanza A, Buonaiuto C. Guidelines for diagnosis and management of aphthous stomatitis. Pediatr Infect Dis J. 2007;26:728–732. PMID 17848886
  • Gonul M, Gul U, Cakmak SK, Kilic A. The role of the diet in patients with recurrent aphthous stomatitis. Eur J Dermatol. 2007;17:97–98. PMID 17324843
  • Calderon PE, Valenzuela FA, Carreno LE, Madrid AM. A possible link between cow milk and recurrent aphtous stomatitis. J Eur Acad Dermatol Venereol. 2008;22:898–899. PMID 18194234
  • Poddar U, Yachha SK, Krishnani N, Srivastava A. Cow’s milk protein allergy: An entity for recognition in developing countries. J Gastroenterol Hepatol. 2010;25:178–182. PMID 19817954
  • Besu I, Jankovic L, Magdu IU, Konic-Ristic A, Raskovic S, Juranic Z. Humoral immunity to cow’s milk proteins and gliadin within the etiology of recurrent aphthous ulcers. Oral Dis. 2009;15:560–564. PMID 19563417
  • Nolan A, Lamey PJ, Milligan KA, Forsyth A. Recurrent aphthous ulceration and food sensitivity. J Oral Pathol Med. 1991;20:473–475. PMID 1753349
  • O’Farrelly C, O’Mahony C, Graeme-Cook F, Feighery C, McCartan BE, Weir DG. Gliadin antibodies identify gluten-sensitive oral ulceration in the absence of villous atrophy. J Oral Pathol Med. 1991;20:476–478. PMID 1753350
  • Hay KD, Reade PC. The use of an elimination diet in the treatment of recurrent aphthous ulceration of the oral cavity. Oral Surg Oral Med Oral Pathol. 1984;57:504–507. PMID 6587298
  • Scully C, Gorsky M, Lozada-Nur F. The diagnosis and management of recurrent aphthous stomatitis: a consensus approach. J Am Dent Assoc. 2003;134:200–207. PMID 12636124
  • Scully C, Porter S. Oral mucosal disease: recurrent aphthous stomatitis. Br J Oral Maxillofac Surg. 2008;46:198–206. PMID 17850936
  • Woo SB, Sonis ST. Recurrent aphthous ulcers: a review of diagnosis and treatment. J Am Dent Assoc. 1996;127:1202–1213. PMID 8803396
  • Grossman R, Kamath CPK, Kumar V, Namburu R. Data Mining for Scientific and Engineering Applications. Dordrecht, The Netherlands: Kluwer Academic Publishers; 2001
  • Kattan MW. Editorial comment on: development, validation, and head-to-head comparison of logistic regression-based nomograms and artificial neural network models predicting prostate cancer on initial extended biopsy. Eur Urol. 2008;54:611. PMID 18207318
  • Zurada JM. Artificial Neural Systems. New York, NY: West Publishing Company; 1992
  • Haykin S. Neural Networks: a Comprehensive Foundation. New York, NY: Macmillan College Publishing Company; 1994
  • Goldberg DE. Genetic Algorithms in Search, Optimization, and Machine Learning. New York, NY: Addison-Wesley; 1989
  • Pythia, The Neural Network Designer © [computer program]. Carson City, NV: Runtime Software; 2000
  • Valavanis IK, Mougiakakou SG, Grimaldi KA, Nikita KS. Analysis of postprandial lipemia as a Cardiovascular Disease risk factor using genetic and clinical information: an Artificial Neural Network perspective. Conf Proc IEEE Eng Med Biol Soc. 2008:4609–4612. PMID 19163743
  • Colak MC, Colak C, Kocaturk H, Sagiroglu S, Barutcu I. Predicting coronary artery disease using different artificial neural network models. Anadolu Kardiyol Derg. 2008;8:249–254. PMID 18676299
  • Bartosch-Harlid A, Andersson B, Aho U, Nilsson J, Andersson R. Artificial neural networks in pancreatic disease. Br J Surg. 2008;95:817–826. PMID 18551536
  • Allison JS, Heo J, Iskandrian AE. Artificial neural network modeling of stress single-photon emission computed tomographic imaging for detecting extensive coronary artery disease. Am J Cardiol. 2005;95:178–181. PMID 15642548
  • Acharya UR, Kannathal N, Ng EY, Min LC, Suri JS. Computer-based classification of eye diseases. Conf Proc IEEE Eng Med Biol Soc. 2006;1:6121–6124. PMID 17945937
  • Campisi G, Di Liberto C, Iacono G. Oral pathology in untreated coeliac disease. Aliment Pharmacol Ther. 2007;26:1529–1536. PMID 17919276
  • Mofidi R, Deans C, Duff MD, de Beaux AC, Paterson Brown S. Prediction of survival from carcinoma of oesophagus and oesophago-gastric junction following surgical resection using an artificial neural network. Eur J Surg Oncol. 2006;32:533–539. PMID 16618533
  • Jerez JM, Franco L, Alba E. Improvement of breast cancer relapse prediction in high risk intervals using artificial neural networks. Breast Cancer Res Treat. 2005;94:265–272. PMID 16254686
  • Bollschweiler EH, Monig SP, Hensler K, Baldus SE, Maruyama K, Holscher AH. Artificial neural network for prediction of lymph node metastases in gastric cancer: a phase II diagnostic study. Ann Surg Oncol. 2004;11:506–511. PMID 15123460
  • Ahmed FE. Artificial neural networks for diagnosis and survival prediction in colon cancer. Molecular Cancer. 2005;4:29. PMID 16083507
  • Zhu J, Zhu XD, Liang SX. Prediction of radiation induced liver disease using artificial neural networks. Jpn J Clin Oncol. 2006;36:783–788. PMID 17068085
  • Speight PM, Elliott AE, Jullien JA, Downer MC, Zakrzewska JM. The use of artificial intelligence to identify people at risk of oral cancer and precancer. Br Dent J. 1995;179:382–387. PMID 8519561
  • van Staveren HJ, van Veen RL, Speelman OC, Witjes MJ, Star WM, Roodenburg JL. Classification of clinical autofluorescence spectra of oral leukoplakia using an artificial neural network: a pilot study. Oral Oncol. 2000;36:286–293. PMID 10793332
  • Wang CY, Tsai T, Chen HM, Chen CT, Chiang CP. PLS-ANN based classification model for oral submucous fibrosis and oral carcinogenesis. Lasers Surg Med. 2003;32:318–326. PMID 12696101
  • Paul RR, Mukherjee A, Dutta PK. A novel wavelet neural network based pathological stage detection technique for an oral precancerous condition. J Clin Pathol. 2005;58:932–938. PMID 16126873
  • Nayak GS, Kamath S, Pai KM. Principal component analysis and artificial neural network analysis of oral tissue fluorescence spectra: classification of normal premalignant and malignant pathological conditions. Biopolymers. 2006;82:152–166. PMID 16470821
  • Kan CW, Jiang BC, Nieman LT, Sokolov K, Markey MK. Comparison of linear and non-linear classifiers for oral cancer screening by optical spectroscopy. AMIA Annu Symp Proc. 2007:1003
  • Akobeng AK. Confidence intervals and p-values in clinical decision making. Acta Paediatr. 2008;97:1004–1007. PMID 18462462