Research Article

Deep learning in oral surgery for third molar extraction: empirical evidence and original model

Article: 2349564 | Received 09 Nov 2023, Accepted 25 Apr 2024, Published online: 12 May 2024

Abstract

Preemptive analgesia is an analgesic intervention intended to influence postoperative pain sensation. Adequate control of postoperative pain remains a major challenge for surgeons and for modern medicine as a whole. The advent of artificial intelligence (AI) in all spheres of life, including medicine, has created the technical ability to process a variety of types and characteristics of data related to many diseases. The application of artificial neural networks in medical science has made it possible to obtain an independent, objective assessment of the effect of preemptive analgesia. The data analysis by our original model, compared with the routinely used statistical methods, shows a tendency towards a positive effect of preemptive analgesia. In order to obtain an efficient self-learning neural network, it is necessary to use large arrays of properly selected data that serve as input parameters for the neural network. The results obtained from the original model are comparable to those of the traditionally used statistical methods. To a certain extent, this model objectifies the effect of preemptive analgesia in the surgery of mandibular third molars.

Introduction

The anatomical structures relevant for oral surgery include numerous bones that form the facial skull, rich innervation and blood supply, and small and large salivary glands. These characteristics make it a shockogenic area and present a significant challenge in diagnosing many pathological processes. The diagnosis relies heavily on image processing, imposing a substantial workload on specialists dealing with pathology in this area. Surgical removal of impacted third molars is one of the most common procedures in oral surgery. Radiological examinations provide images that allow us to determine the complexity of the impending surgical procedure and its predictability. Many patients associate this procedure with significant pain. Managing postoperative pain is a major challenge for every surgeon. In recent decades, there has been continuous work on developing medications for effective postoperative pain management, as well as technologies for new analgesic procedures. Despite advancements in medicine, adequate pain management after surgery remains a challenge, according to the literature. Three national studies on postoperative pain conducted in the United States at 10-year intervals found that postoperative pain management has even worsened in the last decade [Citation1].

In recent years, the integration of artificial intelligence (AI) into all spheres of life, including medicine, has provided a technical possibility to process large data sets and thereby save doctors’ time [Citation2]. Initially, images obtained through various modalities were processed, but subsequently, a method was developed to process other types of data, which significantly saves time and human resources. According to Werner et al. [Citation3], deep learning (DL) demonstrates promising results in the application of electronic healthcare systems with sensory data.

Preemptive analgesia is a tool to influence postoperative pain sensation, although it is ambiguously accepted by specialists dealing with pain management. Advances in science allow for innovative forms of assessing the effectiveness of its use. According to several authors [Citation4], the use of this method for pain management is possible only within specific surgical specialties (orthopedics, oral and maxillofacial surgery, etc.). In recent years, various surgical procedures have been based on the objective assessment of postsurgical pain management.

To objectify the effect of preemptive analgesia, we chose the so-called “deep learning” to evaluate the quality of the performed analgesic procedure. The assessment is based on the use of artificial neural networks that resemble the connections in the human brain. Deep learning (DL) is a technique for obtaining predictive results from statistical algorithms using large data sets. DL-based algorithms have proven to be particularly effective in medical image analysis [Citation2,Citation5,Citation6]. The heterogeneity of the output data poses a challenge for DL in neural network analysis and training [Citation7].

There are few studies related to the use of DL to analyse data sets other than images. In recent years, increasingly sophisticated techniques based on deep learning have been used to improve the diagnosis and treatment of patients with dental health problems. These systems address early diagnosis and treatment planning [Citation2,Citation5]. The accumulation of large arrays of imaging studies makes it difficult for clinicians to process them, which is why DL entered this field of medicine first and rapidly. The introduction of machine learning (ML) in oral and maxillofacial surgery has greatly improved diagnostic accuracy and the quality of treatment for patients with oral carcinomas [Citation8].

Analyzing predictive results in postsurgical pain management based on predetermined criteria for assessing pain intensity and the complexity of impacted third molar extraction is of great importance for both the doctor and the patient [Citation9]. Predictability in pain management is possible with the use of convolutional neural networks [Citation10], which is extremely important for clinical practice. According to Cui et al. [Citation11], electronic data in large volume would aid us in making decisions regarding the treatment plan.

The high frequency of impacted third molar extraction and the accompanying deterioration in quality of life justify the conduct of numerous studies in recent years. The development of DL as a technology, ranging from diagnosis using medical imaging to analysis of activity and emotional patterns, justifies its relatively frequent use in pain models based on impacted third molar extraction [Citation12]. According to some authors, the use of DL demonstrates the high performance of artificial neural networks [Citation13], creating prerequisites for its integration into routine practice.

The aim of this study was to develop an objective methodology for assessing the effect of preemptive analgesia on postsurgical pain intensity up to 48 h after impacted third molar extraction.

Subjects and methods

Ethics statement

Informed consent was obtained from all participants, in accordance with the requirements of the Ethics Committee at the Medical University of Plovdiv (Ref P-2898/10.11.2017).

Subjects

Forty clinically healthy subjects with bilaterally impacted mandibular third molars were included in the study. Inclusion criteria: Clinically healthy patients between 18 and 40 years of age with no evidence of pain in the area of the tooth to be extracted. Exclusion criteria: allergy to medications, acute inflammation in the area of the tooth to be extracted, regular alcohol consumption or drug abuse, women 5 d before and 5 d after their menstrual period (to exclude the influence of hormonal factors on pain), pregnancy and breastfeeding.

Study design

The allocation of medications prior to both surgical procedures followed a placebo-controlled, double-blind, randomized split-mouth design. The participants were divided into three groups, depending on the medication they were given preoperatively. The subjects were given a flask with the medication prior to the first surgery. Prior to the second surgery, the participants remained unaware of whether they received the same drug as the first time. This information was disclosed at the end of the trial. Fifteen participants received ibuprofen 400 mg (1 tablet 3 times a day; 9 tablets in total); another 15 participants received ibuprofen 400 mg (1 tablet 3 times a day; 9 tablets in total) together with gabapentin 300 mg (1 tablet in the evening; 3 tablets in total); and the last 10 participants received placebo (1 tablet 3 times a day). The time interval between the two surgical procedures of each patient was set at two weeks. Prior to the second surgical procedure, each patient received a vial with medication assigned by simple randomization. The patients were asked to self-report their pain levels using a 100 mm Visual Analogue Scale (VAS). The difficulty of the surgical procedure (extraction difficulty) was assessed using the 10-point Pederson scale.

Statistical analysis

Descriptive statistics were used for: (1) quantitative variables, presented as mean (standard deviation), or as median (25th percentile; 75th percentile) when the variables were not normally distributed, and (2) qualitative variables, presented as frequencies and percentages (n and %). Continuous variables were tested for normality of statistical distribution by the Shapiro–Wilk test. Comparisons between two groups were analyzed with a t-test or Mann–Whitney (U) test for independent samples. The Kruskal–Wallis (H) test was used to determine statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable. The paired-sample t-test, or the Wilcoxon signed-rank (W) test as its nonparametric equivalent, was applied to compare two related groups. Friedman’s two-way analysis of variance was applied to assess the statistically significant differences between the pain scores at the three time points with analgesic administration. A 2-sided p-value of <.05 was considered statistically significant. The systematization, processing and analysis of the data were performed using SPSS v.26 for Windows (IBM Corp. Released 2019. Armonk, NY: IBM Corp).
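The descriptive summaries named above (mean, standard deviation, median and quartiles) can be sketched with the Python standard library. This is only an illustration under stated assumptions: the study itself used SPSS, the function name `describe` is hypothetical, the scores are made-up values rather than study data, and the "inclusive" quartile method is one of several conventions and may differ from the one SPSS applies.

```python
import statistics


def describe(scores):
    """Return mean, SD, median and quartiles for a list of pain scores."""
    # quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),  # sample standard deviation
        "median": median,
        "q1": q1,
        "q3": q3,
    }


# Made-up pain scores (in points) for illustration only; NOT study data.
example_scores = [0.5, 1.0, 2.0, 2.5, 3.0, 4.5, 6.0, 8.0]
summary = describe(example_scores)
print(summary)
```

Reporting the median with quartiles rather than the mean with standard deviation is the choice the text describes for variables that fail the normality test.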

Results

Empirical evidence

The cohort contained 40 patients with a median age of 22 years (22, 25 years), and 30% (n = 12) of them were male. Two patients were lost to follow-up.

During the first surgical extraction, all patients were randomly assigned to three medications, as follows: Ibuprofen, 35% (n = 14); Combo therapy, 45% (n = 18); and Placebo, 20% (n = 8). During the second surgical procedure, all patients were again randomly assigned to the three medications: Ibuprofen, 37.5% (n = 15); Combo therapy, 30% (n = 12); and Placebo, 27.5% (n = 11), and 5% (n = 2) were lost to follow-up.

The pain levels were self-reported by the patients. The median pain levels at the 3rd and the 24th hour after the first surgical extraction were similar to those after the second surgical extraction: W = 319, p = .948 and W = 186.5, p = .093, respectively. We observed a statistically significant difference in pain at the 6th hour between the second extraction (5.0 ± 2.6 points) and the first extraction (4.0 ± 3.0 points) (t = 2.87, p = .007).

The difficulty of the surgical procedure was measured on a scale from 5 to 9. We transformed the numeric variable into a dichotomous categorical variable, defining the following levels of difficulty: ≤6 and ≥7. During the first surgical procedure, 55% (n = 22) of the molar extractions were categorized as 7 (7.0; 6.25, 7.0), in contrast to the second molar extractions, where 39.5% (n = 15) were reported to be 7 and 23.7% (n = 9) were on level 8 (7.0; 7.0, 8.0).
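The dichotomization described above can be sketched in a few lines. The cut-off (≤6 versus ≥7) comes from the text; the function name `difficulty_category` and the example scores are hypothetical and are not study data.

```python
from collections import Counter


def difficulty_category(score):
    """Map a numeric extraction-difficulty score to the two categories."""
    return "<=6" if score <= 6 else ">=7"


# Made-up difficulty scores for illustration only; NOT study data.
example_difficulty = [5, 6, 7, 7, 8, 9]
category_counts = Counter(difficulty_category(s) for s in example_difficulty)
print(category_counts)
```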

The results regarding the level of pain reported after the first and second surgical extraction on the 3rd, 6th and 24th hour are summarized in Table 1.

Table 1. Measurements of Central tendency and spread, and statistical inference of the self-reported pain levels, reported at different time points (3rd, 6th, 24th hour) after the first and second surgical extractions.

On the 24th hour after the second surgical extraction, the patients with difficulty ≥7 reported a statistically significant higher median pain (2.1; 0.9, 4.7 points) than those categorized with difficulty ≤6 (0.9; 0.0, 3.9 points) (U = 157, p = .031).

We observed statistically significant differences between the mean pain values reported at the 3rd hour after the second surgical extraction distributed by the categories of the medications prescribed: Ibuprofen and Placebo (H = 9.21; p = .036) and Combo and Placebo (H = 11.95, p = .010).

Statistically significant difference was also observed between the mean pain values, reported at the 6th hour after the second surgical extraction distributed by the categories of the medications prescribed: Ibuprofen (5.4 ± 4.0 points) and Placebo (3.0 ± 2.5 points) (t = 2.07; p = .049).

After the first surgical procedure, the highest pain score reported was at the 6th hour (5.4; 3.1, 8.0) in the Placebo group. The lowest score reported at the 3rd hour was 1.9; 0.5, 4.1 in the Ibuprofen group and 2.5; 1.2, 4.6 in the Combo group.

After the second surgical procedure, the highest pain score reported was at the 3rd hour (4.8; 2.6, 7.1) and 6th hour (4.8; 2.9, 9.0) in the Placebo group. The lowest score reported at the 3rd hour was 1.7; 1.0, 3.2 in the Ibuprofen group and 1.7; 0.08, 2.3 in the Combo group. Table 2 summarizes the results of Friedman’s two-way analysis of variance.

Table 2. Statistically significant differences between the pain score (in points) reported at the three time points (3rd, 6th, 24th hours) by surgical procedure and medication group.

Figure 1 demonstrates the pain reported at the three time points (3rd, 6th, 24th hour) for each surgical procedure (first and second visit) by the type of medication prescribed and distributed by the difficulty of the molar extraction.

Figure 1. Box-plot diagram of the pain by time point, visit, medication therapy and difficulty of the molar extraction.


Statistically significant differences were observed between the mean pain values at the 3rd hour after the first molar surgery, categorized with extraction difficulty ≥7, as reported by patients who received Combo (2.7 ± 2.1 points) vs. Placebo (5.8 ± 3.6 points) (t = 2.3, p = .033). For this subgroup, the results were confirmed after the second molar extraction: Combo (1.5 ± 1.0 points) vs. Placebo (5.1 ± 2.8 points) (t = 3.4, p = .004). Furthermore, statistically significant differences were found between the mean pain values registered at the 3rd hour after the second molar surgery, categorized with extraction difficulty ≥7, reported by patients treated with Ibuprofen (2.4 ± 2.1 points) vs. Placebo (5.1 ± 2.8 points) (t = 2.3, p = .031). No differences were observed in the mean pain values at any time point for either surgical procedure with extraction difficulty ≥7 between patients allocated to the active treatments (Ibuprofen vs. Combo).

Original model

For our model, the magnitude of pain is categorized into 4 possible classes based on the score, each associated with one of the following labels: “No pain – Mild”, “Mild – Moderate”, “Moderate – Severe”, “Severe – Pain as bad as it could possibly be”, in accordance with the “Numeric rating scale”. To forecast how each class aligns with a specific set of inputs, a neural network model was constructed using the Deep Learning (DL) approach.

Neural network design

From the point of view of modelling, the classification process requires a large number of training examples containing input and output data. In this case, we use a structured database consisting of 80 examples. Each of them contains 49 input parameters describing the patient’s condition and one output, showing the amount of pain at the 24th hour. In the learning process, the created model uses this database to determine how best to relate certain input parameters to a specific output class. Before being included in the modelling algorithm, the class labels are coded, and a unique integer is assigned to each of them.
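The label coding step mentioned above can be sketched as follows. The four class labels are taken from the text; the order in which integers are assigned and the additional one-hot step are assumptions, since the source states only that each label receives a unique integer. The names `CLASS_LABELS`, `LABEL_TO_INT` and `one_hot` are hypothetical.

```python
# The four pain classes named in the text.
CLASS_LABELS = [
    "No pain - Mild",
    "Mild - Moderate",
    "Moderate - Severe",
    "Severe - Pain as bad as it could possibly be",
]

# Assumption: integers are assigned in order of increasing pain.
LABEL_TO_INT = {label: index for index, label in enumerate(CLASS_LABELS)}


def one_hot(class_index, n_classes=len(CLASS_LABELS)):
    """Turn an integer class code into a one-hot target vector."""
    return [1.0 if i == class_index else 0.0 for i in range(n_classes)]


code = LABEL_TO_INT["Moderate - Severe"]
target = one_hot(code)
print(code, target)
```

A one-hot target of this kind is the usual companion of a softmax output and a categorical cross-entropy loss, which is why it is included in the sketch.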

To date, various types of algorithms are known that are successfully used for predictive modelling in solving classification problems. As no exact theory has been developed so far describing methods for comparing different algorithms for specific tasks, we have determined experimentally the type and configuration of the algorithm that we believe leads to the best accuracy. The model we created is based on a neural network and predicts the probability that the set of input data belongs to each output class.

Architecture

The neural network architecture is presented in Figure 2. It consists of three types of layers: input, hidden and output. The input layer is made up of 49 neurons, one for each of the 49 medical (clinical) indicators of the patient. The output of this layer is connected to the first of four hidden layers, which are located between the input and output layers, are connected in series and have 100, 100, 50 and 25 neurons, respectively. Each hidden layer receives data from the previous layer and transmits data to the next layer. The output layer of the neural network contains 4 neurons, which correspond to the output classes. Each of them can take values from 0 to 1, representing the probability that the input data belongs to the corresponding output class.

Figure 2. Network architecture.
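To give a sense of the size of this architecture, the number of trainable parameters can be worked out from the layer widths. This sketch assumes fully connected layers with one bias per neuron, which the text does not state explicitly; the helper name `dense_parameter_counts` is hypothetical.

```python
# Layer widths from the text: 49 inputs, four hidden layers, 4 outputs.
LAYER_SIZES = [49, 100, 100, 50, 25, 4]


def dense_parameter_counts(layer_sizes):
    """Weights plus biases for each pair of consecutive layers."""
    counts = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        counts.append(n_in * n_out + n_out)  # weight matrix + bias vector
    return counts


per_layer = dense_parameter_counts(LAYER_SIZES)
total_parameters = sum(per_layer)
print(per_layer, total_parameters)  # [5000, 10100, 5050, 1275, 104] 21529
```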


The function we have chosen to activate all the hidden layers in the neural network is ReLU. It determines the way in which the weighted sum of the inputs of each of the neurons in a particular layer is transformed into an output. If the input value is positive, ReLU passes this value directly to the output; in all other cases, the output is zero. The standard approach for initializing the weights of nodes in neural network layers that use the ReLU function is called “He” initialization [Citation14,Citation15]. In this method, each weight is drawn as a random number from a Gaussian probability distribution (G) with a mean of 0.0 and a standard deviation of sqrt(2/n), where n is the number of inputs to the corresponding neuron.
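A minimal sketch of the two ingredients just described, ReLU and Gaussian He initialization, is given here in plain Python. It is not the authors' implementation: the function names, the fixed random seed and the choice of the first hidden layer (49 inputs, 100 neurons) as the example are assumptions made for illustration.

```python
import math
import random
import statistics


def relu(x):
    """Pass positive inputs through unchanged; output zero otherwise."""
    return x if x > 0 else 0.0


def he_normal(n_inputs, n_neurons, rng):
    """Draw weights from a Gaussian with mean 0 and std sqrt(2 / n_inputs)."""
    std = math.sqrt(2.0 / n_inputs)
    return [[rng.gauss(0.0, std) for _ in range(n_inputs)]
            for _ in range(n_neurons)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = he_normal(49, 100, rng)  # first hidden layer: 49 inputs, 100 neurons
flat_weights = [w for row in weights for w in row]
empirical_std = statistics.pstdev(flat_weights)
target_std = math.sqrt(2.0 / 49)
print(relu(-1.5), relu(2.0), round(target_std, 4))
```

With 4900 sampled weights, the empirical standard deviation should land close to the target value of sqrt(2/49), roughly 0.202.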

The output layer of the neural network is activated by the softmax function, which outputs a vector with values that are summed up to 1. These values can be interpreted as probabilities of belonging to a particular class, and the number of vector components is equal to the number of classes [Citation16].
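The softmax step can be illustrated with a short, self-contained function. The raw output values used here are made up, and subtracting the maximum before exponentiating is a common numerical-stability habit rather than something the text specifies.

```python
import math


def softmax(logits):
    """Convert raw output values into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up raw outputs for the four pain classes.
example_logits = [2.0, 1.0, 0.1, -1.0]
probabilities = softmax(example_logits)
predicted_class = probabilities.index(max(probabilities))
print(probabilities, predicted_class)
```

The class with the largest probability (here, index 0) would be taken as the prediction.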

Categorical_crossentropy, which is widely used in classification tasks, serves as the loss function for the model. It is used in cases where an example can belong to only one of several possible categories and the model must decide which one it is. This function calculates the deviation of the predicted value from the real one by quantifying the difference between two probability distributions [Citation4,Citation17].

Categorical_crossentropy determines the error in a given data example by the following formula [Citation16]:

$$\mathrm{Loss} = -\sum_{i=1}^{\text{output size}} y_i \cdot \log \hat{y}_i$$

where $\hat{y}_i$ is the i-th scalar value at the model output, $y_i$ is the corresponding target value, output size is the number of scalar values at the model output, and Loss is an indication of how different the two discrete probability distributions $\hat{y}_i$ and $y_i$ are. Since $y_i$ represents the probability that event i will occur and the sum of all $y_i$ is equal to 1, it can be concluded that one of the events will always occur. The minus sign in the formula indicates that Loss decreases as the two distributions approach each other [Citation18].
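As a worked example of this loss, the sketch below evaluates the cross-entropy (the negative sum of target value times the logarithm of the predicted value) for one sample with four classes. The probability vectors are made up, the natural logarithm is assumed, and the small `eps` guard against log(0) is an implementation convenience, not part of the formula.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Loss = -sum_i y_true[i] * log(y_pred[i]) for a single example."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# One-hot target: the true class is the second of the four classes.
y_true = [0.0, 1.0, 0.0, 0.0]

# Two made-up predictions: one confident and correct, one uniform.
confident = [0.05, 0.85, 0.05, 0.05]
uniform = [0.25, 0.25, 0.25, 0.25]

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(round(loss_confident, 4), round(loss_uniform, 4))
```

With a one-hot target, only the predicted probability of the true class contributes, so the loss falls as that probability approaches 1.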

The loss function determines the accuracy of forecasting. For this reason, the effectiveness of the model will depend on the choice of the correct function.

As the optimization method in the developed model, we use Adam (Adaptive Moment Estimation). This algorithm is used to update the weights of the neural network at each iteration. It can be seen as a successor to the AdaGrad and RMSProp algorithms, which automatically adapt the learning rate for each parameter of the objective function. Adam is often used instead of the classic stochastic gradient descent method, which maintains a single learning rate (called alpha) for all weight updates, and this learning rate does not change during model training.
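The Adam update rule can be sketched on a one-parameter toy problem. This is an illustration only: the quadratic objective, the learning rate of 0.1 and the number of steps are assumptions chosen for the demo, the beta and epsilon values are the commonly quoted defaults, and none of these hyperparameters are reported in the text.

```python
import math


def adam_minimize(grad, w0, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-parameter function with Adam, given its gradient."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)  # per-parameter scaled step
    return w


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); minimum at w = 3.
w_final = adam_minimize(lambda w: 2 * (w - 3.0), w0=0.0, steps=1000)
print(round(w_final, 3))
```

The division by the square root of the second-moment estimate is what gives each weight its own effective step size, in contrast to plain stochastic gradient descent with a single fixed learning rate.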

Using the optimization algorithm allows us to minimize the error loss function. In this way, a minimal difference is achieved between the actual value and the predicted outcome, which makes the developed model more accurate in performing the specific task.

Model output

The network training experiments were performed with different batch sizes and numbers of epochs. One of them used the clinical data for the patients described in the first part of this study. When the model was trained for 92 epochs with a batch size of 13, an accuracy of approximately 92.6% was achieved. The loss function is presented in Figure 3.

Figure 3. The loss function.


The lowest value of the loss function obtained for the training data set is 5.6e-03, while that for the validation set is 0.23.
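One way to read these two loss values is to convert them into probabilities. If the reported figures are mean categorical cross-entropy values computed with the natural logarithm (an assumption, since the text does not say how they were averaged), then exp(-loss) is the geometric-mean probability the model assigns to the true class. The function name below is hypothetical.

```python
import math


def mean_true_class_probability(mean_loss):
    """Geometric-mean probability of the true class implied by a mean loss."""
    return math.exp(-mean_loss)


train_loss = 5.6e-03  # lowest training loss reported in the text
val_loss = 0.23       # lowest validation loss reported in the text

train_probability = mean_true_class_probability(train_loss)
val_probability = mean_true_class_probability(val_loss)
print(round(train_probability, 4), round(val_probability, 4))
```

Under that assumption, the training value corresponds to roughly 0.994 and the validation value to roughly 0.795.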

Numerous experiments with different values for epoch and batch size have been performed, showing that there is no significant change in accuracy. The model will use self-reported pain levels after a procedure of a given level of difficulty in one particular patient and then use this individual data as input to predict pain levels and analgesia dose/regimen for future procedures in patients with the same level of difficulty.

Prospects for the application of deep learning (DL) in oral surgery

In recent years, there has been a significant surge in research centered around artificial intelligence, particularly deep learning (DL). Extraction of impacted mandibular molars, being one of the most frequently performed surgical procedures, serves as an excellent model for evaluating the capabilities of neural networks. This model can be tailored to focus on both the surgical procedure’s complexity and the early diagnosis of potential complications. By implementing DL, it has become possible to objectify a contentious issue in medicine, namely preemptive analgesia. The tasks set before artificial neural networks to objectify the effect of preemptively administering medication are based on training resulting from the influence of multiple factors. Studies have demonstrated that the difficulty of extraction significantly impacts the efficacy of analgesic administration [Citation19–21]. Another critical aspect established is the influence of a thorough understanding of the surgical procedure. All of this holds immense importance in planning analgesic procedures prior to any surgical intervention related to the extraction of impacted mandibular third molars. The potential applications of DL involve creating a predictive model for pain intensity and the effectiveness of preemptive analgesia. Our conclusions align with those of other authors [Citation1]. Another crucial consideration is the possibility of error in case of improperly submitted data, which may misguide the neural network during its training. A similar perspective is advocated by Lötsch et al. [Citation22] in their work on the use of machine learning (ML) in pain research.

Limitations

The volume of data used in our study is limited, which allows only an incomplete assessment of the neural network’s capabilities.

Conclusions

The use of DL enables the prediction of pain intensity and the need for an additional dose of analgesics. The effectiveness of the applied analgesic procedure is directly proportional to the amount of data fed into the artificial neural network.

Disclosure statement

No potential competing interest was reported by the authors.

Data availability statement

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Additional information

Funding

The author(s) reported there is no funding associated with the work featured in this article.

References

  • Wang R, Wang S, Duan N, et al. From patient-controlled analgesia to artificial intelligence-assisted patient-controlled analgesia: practices and perspectives. Front Med. 2020;7:145. doi: 10.3389/fmed.2020.00145.
  • AbuSalim S, Zakaria N, Islam MR, et al. Analysis of deep learning techniques for dental informatics: a systematic literature review. Healthcare. 2022;10(10):1892. doi: 10.3390/healthcare10101892.
  • Werner P, Lopez-Martinez D, Walter S, et al. Automatic recognition methods supporting pain assessment: a survey. IEEE Trans Affective Comput. 2022;13(1):530–552. doi: 10.1109/TAFFC.2019.2946774.
  • Aida S, Baba H, Yamakura T, et al. The effectiveness of preemptive analgesia varies according to the type of surgery: a randomized, double-blind study. Anesth Analg. 1999;89(3):711–716. doi: 10.1097/00000539-199909000-00034.
  • Yan KX, Liu L, Li H. Application of machine learning in oral and maxillofacial surgery. AIMI. 2021;2(6):104–114. doi: 10.35711/aimi.v2.i6.104.
  • Celik ME. Deep learning based detection tool for impacted mandibular third molar teeth. Diagnostics. 2022;12(4):942. doi: 10.3390/diagnostics12040942.
  • Yang S, Zhu F, Ling X, et al. Intelligent health care: applications of deep learning in computational medicine. Front Genet. 2021;12:607471. doi: 10.3389/fgene.2021.607471.
  • Alhazmi A, Alhazmi Y, Makrami A, et al. Application of artificial intelligence and machine learning for prediction of oral cancer risk. J Oral Pathol Med. 2021;50(5):444–450. doi: 10.1111/jop.13157.
  • Yoo JH, Yeom HG, Shin W, et al. Deep learning based prediction of extraction difficulty for mandibular third molars. Sci Rep. 2021;11(1):1954. doi: 10.1038/s41598-021-81449-4.
  • Cascella M, Schiavo D, Cuomo A, et al. Artificial intelligence for automatic pain assessment: research methods and perspectives. Pain Res Manag. 2023;2023:6018736. doi: 10.1155/2023/6018736.
  • Cui Q, Chen Q, Liu P, et al. Clinical decision support model for tooth extraction therapy derived from electronic dental records. J Prosthet Dent. 2021;126(1):83–90. doi: 10.1016/j.prosdent.2020.04.010.
  • Kim BS, Yeom HG, Lee JH, et al. Deep learning-based prediction of paresthesia after third molar extraction: a preliminary study. Diagnostics. 2021;11(9):1572. doi: 10.3390/diagnostics11091572.
  • Kang IA, Ngnamsie Njimbouom S, Lee KO, et al. DCP: prediction of dental caries using machine learning in personalized medicine. Appl Sci. 2022;12(6):3043. doi: 10.3390/app12063043.
  • Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge: The MIT Press; 2016.
  • He K, Zhang X, Ren S, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. IEEE International Conference on Computer Vision (ICCV), 7-13 Dec. 2015, Santiago, Chile. IEEE Computer Society. p. 1026–1034.
  • Bishop CM. Neural networks for pattern recognition. Oxford: Clarendon Press; 1995.
  • Murphy K. Machine learning: a probabilistic perspective. Cambridge: MIT; 2012.
  • Barry J. A deep learning approach to diagnosing schizophrenia. Orlando FL: University of Central Florida; 2019.
  • Tenglikar P, Munnangi A, Mangalgi A, et al. An assessment of factors influencing the difficulty in third molar surgery. Ann Maxillofac Surg. 2017;7(1):45–50. doi: 10.4103/ams.ams_194_15.
  • Lago-Méndez L, Diniz-Freitas M, Senra-Rivera C, et al. Relationships between surgical difficulty and postoperative pain in lower third molar extractions. J Oral Maxillofac Surg. 2007;65(5):979–983. doi: 10.1016/j.joms.2006.06.281.
  • Barreiro-Torres J, Diniz-Freitas M, Lago-Méndez L, et al. Evaluation of the surgical difficulty in lower third molar extraction. Med Oral Patol Oral Cir Bucal. 2010;15(6):e869–e874. doi: 10.4317/medoral.15.e869.
  • Lötsch J, Ultsch A. Machine learning in pain research. Pain. 2018;159(4):623–630. doi: 10.1097/j.pain.0000000000001118.