Research Article

Evaluating machine and deep learning techniques in predicting blood sugar levels within the E-health domain

Article: 2279900 | Received 19 Jul 2023, Accepted 01 Nov 2023, Published online: 09 Nov 2023

Abstract

This paper focuses on exploring and comparing different machine learning algorithms in the context of diabetes management. The aim is to understand their characteristics, mathematical foundations, and practical implications specifically for predicting blood glucose levels. The study provides an overview of the algorithms, with a particular emphasis on deep learning techniques such as Long Short-Term Memory Networks. Efficiency is a crucial factor in practical machine learning applications, especially in the context of diabetes management. Therefore, the paper investigates the trade-off between accuracy, resource utilisation, time consumption, and computational power requirements, aiming to identify the optimal balance. By analysing these algorithms, the research uncovers their distinct behaviours and highlights their dissimilarities, even when their analytical underpinnings may appear similar.

1. Introduction

Machine learning has significantly transformed multiple fields by allowing computers to learn and make predictions without explicit programming. Beyond acknowledging this transformation, this paper addresses its practical implications, starting from a specific case study: predicting glucose levels in diabetic patients' blood. Our goal is to identify the most reliable algorithm among those we compare, and to understand their complexity and efficiency. Our primary focus is on thoroughly analysing and comparing various machine learning approaches, in preparation for future experiments, starting with a smaller dataset.

To achieve this, we begin with an overview of different algorithms, especially those related to regression. We then explore deep learning, focusing on neural networks like Recurrent Neural Networks (RNNs) (Medsker & Jain, Citation2001) and Long Short-Term Memory (LSTM) Networks (Hochreiter & Schmidhuber, Citation1997).

Efficiency is a key concern in real-world machine learning applications, especially when simple models might suffice. We investigate whether complexity is justified in straightforward situations and analyse the trade-offs between accuracy, resource usage, and computational power.

To assess the quality of our model's predictions, we rely on these metrics:

  • RMSE (Root Mean Square Error): RMSE is a measure of the average magnitude of the errors or residuals between predicted values and actual values in a regression or forecasting model. It quantifies the standard deviation of these errors: (1) $\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}$. Lower RMSE values indicate better model accuracy. RMSE is in the same units as the target variable.

  • R-squared ($R^2$): $R^2$ measures the proportion of the variance in the dependent variable (target) that is explained by the independent variables (features) in a regression model. It ranges from 0 to 1, with higher values indicating a better fit: (2) $R^2=1-\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2}$. An $R^2$ of 1 indicates a perfect fit, while lower values suggest a less accurate model fit.

  • MAPE (Mean Absolute Percentage Error): MAPE measures the percentage difference between predicted and actual values. It quantifies the average relative error of a model, making it suitable for comparing accuracy across different datasets: (3) $\mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right|\times 100\%$. Lower MAPE values represent better accuracy, with 0% indicating a perfect prediction.

  • MBD (Mean Bias Deviation): MBD measures the average bias or systematic error in predictions. It quantifies the average overestimation or underestimation of a model compared to actual values: (4) $\mathrm{MBD}=\frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i-y_i)$. A positive MBD indicates overestimation, while a negative MBD indicates underestimation. A value of 0 suggests no systematic bias.
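For concreteness, the following is a minimal NumPy sketch of the four metrics above; the function name evaluate and the toy glucose values are ours, not from the paper.

import numpy as np

def evaluate(y_true, y_pred):
    # Compute the four error metrics defined in Eqs. (1)-(4).
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))                                     # Eq. (1)
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)   # Eq. (2)
    mape = np.mean(np.abs(err / y_true)) * 100.0                          # Eq. (3); glucose is never 0 mg/dL
    mbd = np.mean(y_pred - y_true)                                        # Eq. (4); positive = overestimation
    return {"RMSE": rmse, "R2": r2, "MAPE": mape, "MBD": mbd}

# Toy usage with two glucose readings (mg/dL):
print(evaluate([120.0, 180.0], [130.0, 170.0]))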

The main contributions in this work are as follows:
  1. Comparative Analysis of Machine Learning Techniques: We conducted a comprehensive comparison of machine learning techniques, including Linear Regression, Polynomial Regression, Lasso Regression, Ridge Regression, and LSTM (Long Short-Term Memory) deep learning, for the analysis and prediction of glycemic trends. This study provides valuable insights into the performance of these algorithms when applied to real data obtained from glucose sensors.

  2. Development of Predictive Models: We developed predictive models using glucose levels measured over four months as input data and insulin boluses injected into a patient's bloodstream. Notably, our models successfully replicated glycemic trends with promising results.

  3. Algorithm Performance and Resource Considerations: Our findings revealed that Regression algorithms demonstrated good reliability, low resource consumption, and efficient training cycles. In contrast, LSTM outperformed other techniques in terms of Mean Squared Error (MSE), allowing for preemptive predictions of glycemic trends. However, it demands substantial computational resources, including a GPU, for optimal performance.

In addition to these main contributions, our work also points to future directions:
  1. Patient-Specific Modeling: Future experiments will involve testing the algorithms with data from different patients. This personalised approach acknowledges the inherent variability in individual glycemic patterns and responses to insulin boluses. Our goal is to develop a general, robust model capable of predicting glycemic trends for various patients, thereby enhancing patient care.

  2. Privacy-Aware Federated Learning: We are mindful of privacy concerns related to personal health data. In light of this, we are considering Federated Learning approaches that inherently prioritise data privacy by design. This signifies our commitment to advancing healthcare technology while respecting patient privacy.

These contributions collectively demonstrate our work's significance in the domain of glycemic trend analysis and its potential to drive improvements in patient care and privacy-conscious data analysis.

The paper is organised as follows: Section 2 reviews related work and describes the main algorithms for regression and deep learning applied in the proposed study; Section 3 introduces the E-health case study, describing the specific objectives that we want to achieve through artificial intelligence; Section 4 describes the use of Machine Learning algorithms, in particular for Regression, and compares the obtained results, also in terms of efficiency; Section 5 focuses on the LSTM approach, describing the exploited network and the results of the prediction model; Section 6 compares the results obtained by the Machine and Deep Learning approaches, and focuses on their resource consumption; Section 7 closes the paper with final remarks and pointers to future developments.

2. Related works

In recent years, there has been a growing body of research focused on the application of artificial intelligence and machine learning algorithms in the healthcare sector. Specifically, recurrent neural networks such as LSTM (Long Short-Term Memory) have gained significant attention due to their ability to handle sequential data, making them well-suited for analysing longitudinal medical data. This type of data refers to information collected over time from the same individuals.

One notable study that highlights the importance of these techniques utilised the ADNI (Alzheimer's Disease Neuroimaging Initiative) database and involved 1677 participants (Nguyen et al., Citation2020). The research demonstrated the effectiveness of an RNN-based model known as minimalRNN in predicting clinical diagnosis and disease biomarker progression up to 6 years into the future. These findings underscore how artificial intelligence and neural networks can serve as valuable predictive tools for diagnosing and monitoring diseases such as Alzheimer's. Studies have also demonstrated the usefulness of Machine Learning approaches for the prediction of Multiple Sclerosis progression, even with small datasets (Branco et al., Citation2022).

In a related domain, other research efforts have focused on using machine learning techniques to predict the risk of depression among the elderly (Xu et al., Citation2019). By leveraging data from a 22-year longitudinal survey, a multitask LSTM model was developed to capture temporal patterns that are often overlooked by traditional methods. This approach proved useful in assisting doctors and social workers in the early detection of depression among the elderly. The proposed deep learning methodologies hold potential for implementation as clinical decision support systems.

In the field of diabetes management, the University of Pavia proposed custom LSTM models for 100 patients using the UVA/Padova simulator. The objective is to predict blood glucose levels within a 40-minute forecasting window, taking into account factors such as meals, insulin intake, and previous blood glucose values as inputs (Iacono et al., Citation2022). The integration of Machine and Deep Learning methodologies into the realm of healthcare, specifically in the context of diabetes management, is a field of considerable interest and ongoing development. Notably, Reinforcement Learning techniques have already found application in forecasting the risk of Type 2 Diabetes in patients (Zohora et al., Citation2020). These endeavours underscore the potential of Artificial Intelligence (AI) to complement the efforts of healthcare professionals.

In parallel, the landscape of glucose level prediction has witnessed significant advancements, traversing a notable trajectory from basic feed-forward neural networks to the adoption of more intricate Recurrent Neural Networks (RNNs) (Pappada et al., Citation2008). This evolution is indicative of the dynamic nature of the field, propelled by the quest for greater accuracy and effectiveness.

Within this landscape, Long Short-Term Memory (LSTM) based approaches have emerged as particularly formidable. Numerous studies (Aliberti et al., Citation2020; Meijner & Persson, Citation2017) have illuminated the exceptional predictive capabilities and robustness of LSTM models. These models have excelled in both the prediction of glucose values and the early detection of hypoglycemia, a critical facet of diabetes management (Mujahid et al., Citation2021).

In sum, the integration of AI, machine learning, and deep learning into diabetes management holds substantial promise. These cutting-edge techniques not only provide invaluable support for healthcare practitioners but also show remarkable potential to enhance the quality of care for patients grappling with diabetes. The studies cited above utilise publicly accessible datasets, with a primary emphasis on the outcomes of their predictions rather than the resource utilisation of the algorithms they employ. However, it's crucial to underscore the significance of resource allocation in this context. This is particularly pertinent because insulin pumps, responsible for regulating insulin dosages in response to glucose levels, are constrained by limited computational capacity and battery power. Implementing complex learning algorithms directly on these pumps is practically unfeasible.

One potential solution lies in adopting Federated Learning approaches, which would alleviate the computational burden on individual pumps while concurrently augmenting the prediction capabilities of interconnected devices. However, it's important to note that this approach necessitates the presence of a remote, centralised server responsible for carrying out the primary computational tasks. Studies have been carried out with the goal of verifying the efficacy and efficiency of Federated Learning techniques, in particular, to satisfy privacy-related issues (Aral et al., Citation2023; Fan, Zhang et al., Citation2023; Pezzullo et al., Citation2023). The importance of model integrity and robustness against external malicious attacks, and the protection of users' data, are central in Deep Learning approaches, especially when such models are proposed as a Service (Fan, Xu et al., Citation2023).

The E-health domain, in general, has been the research subject of several studies, aiming at identifying useful approaches to support users' better lifestyles (Zappatore et al., Citation2023), or to trace supply chains of medical equipment (Li et al., Citation2023).

While these advancements are significant, there are still open questions to address, including the trade-off between accuracy and computational efficiency. This is particularly important when deploying complex Machine and Deep Learning models on resource-constrained portable devices. Our work aims to further explore this issue by conducting a comparative evaluation of different algorithms, paying particular attention to the balance between accuracy, resource usage, and computational power.

2.1. Algorithms background

Regression is a technique used to establish a relationship between independent variables and a dependent variable, which represents our output or the expected data. This technique is often employed in supervised learning, where well-defined input and output variables are required. Regression enables continuous prediction by assigning a continuous value to a given pattern. Common examples of regression applications include estimating selling prices of real estate, financial forecasting, healthcare trend prediction, and supporting marketing decisions.

In regression, the goal is to draw a line that closely fits the input dataset. This is achieved by calculating the distance between each data point and the line, ultimately determining the best possible fit.

Linear Regression is the simplest technique among those mentioned: it is easy to work with, and many simple phenomena are related in a linear way. Furthermore, if the variables are not linearly related to each other, we can often transform the relationship into a linear one and adapt it to our case studies.

In linear regression, the set of hypotheses is denoted as H, consisting of linear (affine) functions from X to Y: (5) $h\in H \Rightarrow h(x)=w_1+w_2x_2+\dots+w_\Theta x_\Theta$, with $w_1,\dots,w_\Theta\in\mathbb{R}$, where $w_1$ is the intercept (the model value when x is zero), while $w_k$ is the coefficient expressing the dependence of $h(x)$ on the k-th component of x.

By simple linear regression, we mean having a set of algorithms and techniques for Machine Learning capable of predicting an output variable given only one independent variable, i.e.: (6) $y=w_1+w_2x$. In order for Linear Regression to work effectively, there needs to be a linear relationship between the input data. If such a relationship doesn't exist, we can attempt to achieve a “best fit”. However, even with Linear Regression, results may still be less accurate, resulting in predictions that deviate significantly from reality.

To address non-linear relationships, Polynomial Regression is employed. With Polynomial Regression, we can establish a “curvilinear relationship” between the dependent and independent variables. This is achieved by incorporating polynomial terms into the equation: (7) $y=a_0+a_1x+a_2x^2+\dots+a_nx^n$. Determining the appropriate degree of the polynomial is not a straightforward process. Setting it too high could lead to overfitting, where the model becomes too closely tailored to the training data and performs poorly on new data. Conversely, setting it too low could result in underfitting, where the model fails to capture important patterns in the data.

It's crucial to strike a balance and choose an optimal degree of the polynomial to achieve the best fit for the given data.
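As an illustration of this trade-off, cross-validated scores can guide the choice of degree. The following sketch uses Scikit-Learn on synthetic data; the data and candidate degrees are ours, purely for illustration.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic 1-D data with a quadratic trend plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 0.5 * X.ravel() ** 2 - 2 * X.ravel() + rng.normal(0, 3, 200)

for degree in (1, 2, 5, 10):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"degree={degree:2d}  mean CV R^2 = {score:.3f}")  # overly high degrees tend to score worse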

Two specific issues can arise with Linear Regression. Firstly, the data may not be linearly related which, as we have already seen, can result in poor predictions. Secondly, as the number of features increases, there is a risk of overfitting, which should be avoided.

In Linear Regression, we aim to find the best-fitting straight line for our model, which can be represented as: (8) $y=wx+b$. Here, w represents the weight (the slope of the line), and b represents the bias (the intercept). The goal is to determine the optimal values for these two parameters. When using Ridge Regression, the slope and bias parameters are calculated by minimising the sum of squared errors, with the addition of an extra term: (9) $\lambda w^2$. Here, λ is a penalty term that determines the severity of the penalty. If λ is set to zero, Ridge Regression reduces to Linear Regression. The value of λ determines the slope of the line and indicates the sensitivity of the dependent variable to changes in the independent variable. It's worth noting that λ can take any value between 0 and $+\infty$. Increasing λ reduces the slope, approaching zero asymptotically. In practice, a technique called Cross Validation is often used to determine the optimal value of λ.

By employing cross-validation, we can determine the most suitable value for λ to ensure the best performance of the Ridge Regression model.

Now, with Lasso Regression, we introduce a different penalty parameter: (10) $\lambda |w|$. Similar to Ridge Regression, λ can take any value between 0 and $+\infty$. When λ is set to zero, Lasso Regression reduces to simple linear regression. It's worth noting that Cross Validation is used to find the optimal value of λ.

Intuitively, Lasso Regression, similar to Ridge Regression, results in a straight line with a lower slope compared to simple linear regression.

Therefore, both regression techniques exhibit similar behaviour and mathematical forms. In both cases, the predictions for the dependent variable are less sensitive to changes in the independent variables (features).
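In Scikit-Learn (the library used in Section 4), λ is called alpha, and cross-validated variants of both estimators are available. A minimal sketch on synthetic data follows; the data and the alpha grid are ours, not from the paper.

import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV

# Synthetic data: three features, the third one irrelevant.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X @ np.array([2.0, -1.0, 0.0]) + rng.normal(0, 0.5, 500)

alphas = np.logspace(-3, 3, 13)                  # candidate values of lambda
ridge = RidgeCV(alphas=alphas).fit(X, y)         # lambda chosen by cross-validation
lasso = LassoCV(alphas=alphas, cv=5).fit(X, y)
print("Ridge: lambda =", ridge.alpha_, "coefficients =", ridge.coef_)
print("Lasso: lambda =", lasso.alpha_, "coefficients =", lasso.coef_)  # Lasso can zero out the unused feature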

The LSTM (Long Short-Term Memory) architecture was developed to address the issue of long-term dependencies. Rather than replacing RNN networks, LSTM enables them to retain and process inputs over extended periods. Similar to a computer's memory, LSTM has the capability to read, write, and delete information from its memory.

Let's examine the structure of an LSTM neuron, as shown in Figure 1. Notably, there is a specific gate called the forget gate, which determines which data should be disregarded and which require attention. Referring to the figure, we can describe the behaviour of the forget gate as follows.

Figure 1. Structure of an LSTM neuron. Image taken from Liu et al. (Citation2020).


First, the current input ($x_t$) and the previous output ($h_{t-1}$) are considered, and the sigmoid function is applied to these data, producing an output between 0 and 1. This value is then multiplied by the previous state ($c_{t-1}$). If the value is 0, the previous state is forgotten, whereas if it is 1, the previous state is fully retained. The outputs of the input gate and forget gate together form the current state of the neuron, $c_t$. Finally, the output gate, utilising $x_t$ and $h_{t-1}$, determines whether the current state should be emitted as the next output, $h_t$.
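In the standard formulation (Hochreiter & Schmidhuber, Citation1997), the gate behaviour just described can be written compactly as follows, where $\sigma$ is the sigmoid function and $\odot$ the element-wise product:

$f_t = \sigma(W_f\,[h_{t-1}, x_t] + b_f)$  (forget gate)
$i_t = \sigma(W_i\,[h_{t-1}, x_t] + b_i)$  (input gate)
$\tilde{c}_t = \tanh(W_c\,[h_{t-1}, x_t] + b_c)$  (candidate state)
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$  (current cell state)
$o_t = \sigma(W_o\,[h_{t-1}, x_t] + b_o)$  (output gate)
$h_t = o_t \odot \tanh(c_t)$  (next output)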

It can be inferred that training networks with these neurons becomes more complex due to the increased number of parameters involved. However, these networks are fundamental in various applications: they excel in tasks such as classifying, processing and predicting based on time series. LSTM has achieved remarkable success, for example in OpenAI's agents that outperformed human players in the game Dota 2, or in robotics, where LSTMs help control human-like robotic hands to manipulate physical objects with exceptional precision.

3. Description of the E-health case study

The term “blood sugar” refers to the glucose concentration found in the bloodstream, measured in mg/dL. Typically, the normal range for blood sugar in adults settles around 60–70 mg/dL, with values increasing during the post-prandial phase, occasionally reaching levels of 130–150 mg/dL. Our body possesses various adaptive mechanisms to ensure that glucose remains within the physiological parameters, which is of utmost importance. Glucose serves as the primary energy source for all our tissues, especially the brain, which relies on a direct supply of glucose to function optimally.

However, there are instances where the body's regulatory mechanisms for maintaining blood sugar within the physiological range may become disrupted. It is during these cases that we truly appreciate the significance of blood sugar regulation. One such alteration is hyperglycemia, commonly associated with diabetes. In particular, the focus of this work is the prediction of glycemia rates in patients with Type 1 diabetes, in relation to the quantity of insulin injected and the current glucose levels in the patient's blood.

We should note that a patient with Type 1 diabetes has an altered blood sugar regulation mechanism, causing even fasting blood sugar values to settle around 126 mg/dL, which is notably higher than the norm (typically, fasting blood sugar levels in physiological conditions should be around 100 mg/dL). Figure 2 summarises the threshold values for glucose.

Figure 2. Reference glucose levels.


The dataset was retrieved from a Medtronic 780G insulin pump (Pintaudi et al., Citation2022). This insulin pump utilises feedback from glucose sensors, which are applied to the patient's skin, to regulate the delivery of insulin boluses. Patients can monitor their Glucose levels and Insulin boluses, and also download the data in CSV format. The data we have used in this study are related to a single patient, who has been monitored from 02/01/2022 to 15/03/2022. Considering that the Glucose sensors provide data every 5 minutes, we have more than 20 thousand samples to work with. The data that can be obtained from the pump are characterised by more than 50 attributes, covering every aspect of the patient's daily glucose trends (Diabetes, Citation2022).

However, most of these attributes are not useful for our research. Therefore, we can manipulate the original data and extract only the information we need. These include 'Bolus Volume,' 'Sensor Glucose,' and 'ISIG Value'. Also, we need to include Date and Time parameters. In the end, the relevant attributes taken into consideration are:

  • “Bolus Volume” is the quantity of insulin injected by the patient at a specific time.

  • “Sensor Glucose” is the glycemic value that the sensor, previously applied to the patient, is measuring.

  • “ISIG Value” is a “raw” parameter representing the interstitial signal. It is used to assess whether the sensor is functioning correctly.

However, even after this first filtering, the dataset does not fully meet our requirements. Since the values are digitally retrieved from a measuring instrument, there may be measurement errors resulting in values that are either too high or too low. Also, there are special situations in which very low or high values may be caused by improper use of the pump, problems with the insulin injection sites, or even temporary illness of the patient. This is a common condition: outlier values can lead to incorrect results, as machine learning algorithms can be penalised by their presence. While outliers can be useful in specific applications, such as fraud detection in the case of anomalous transactions, in our case we need to avoid them. There are several techniques to handle these outliers; for example, we can use methods like the interquartile range (IQR) or the standard deviation.

The IQR measures variability by dividing the dataset into quartiles: the quartiles $Q_1$, $Q_2$ and $Q_3$ are the values that divide the data into four equal parts. The IQR is the range between the first and third quartiles: (11) $IQR=Q_3-Q_1$. Outliers are the values falling outside this range, i.e. (12) $x<Q_1-1.5\cdot IQR$ or $x>Q_3+1.5\cdot IQR$. To implement this in Python, we can use the code shown in Listing 1.
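Listing 1 is not reproduced here; a minimal pandas sketch of the IQR filtering it refers to could look as follows (the file name is hypothetical; the column name “Sensor Glucose” is taken from the attributes above):

import pandas as pd

df = pd.read_csv("carelink_export.csv")              # hypothetical export file name
glucose = df["Sensor Glucose"].dropna()

q1, q3 = glucose.quantile(0.25), glucose.quantile(0.75)
iqr = q3 - q1                                        # Eq. (11)
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr        # Eq. (12)
print(f"Lower = {lower:.1f} mg/dL, Upper = {upper:.1f} mg/dL")

df = df[df["Sensor Glucose"].between(lower, upper)]  # drop the outliers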

We obtain a lower value of 151 mg/dL and an upper value of 252.2 mg/dL.

As one might expect, the 'Lower' threshold does not modify the dataset, while the 'Upper' threshold is relatively high and primarily removes a few peaks within the dataset. For completeness, we also introduce the standard deviation technique. The idea is straightforward: the standard deviation is a statistical dispersion index, providing an estimate of the variability within a population of data or a random variable. Graphically, this can be obtained as in Figure 3. Outliers can be determined by considering a Lower and an Upper threshold as before: (13) $Lower = Glucose.mean() - 1.5\cdot Glucose.std()$ (14) $Upper = Glucose.mean() + 1.5\cdot Glucose.std()$

Figure 3. Use of standard deviation to remove outliers.


The choice of the 1.5 value is somewhat arbitrary, but it is commonly used in the literature.

In our application, the 'Lower' and 'Upper' values calculated using both techniques are very close to each other. Therefore, we can use either method without significantly affecting the final result.
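The standard-deviation variant is equally short; a self-contained sketch under the same assumptions as the IQR code above:

import pandas as pd

df = pd.read_csv("carelink_export.csv")             # hypothetical export file name
glucose = df["Sensor Glucose"].dropna()
lower = glucose.mean() - 1.5 * glucose.std()        # Eq. (13)
upper = glucose.mean() + 1.5 * glucose.std()        # Eq. (14)
df = df[df["Sensor Glucose"].between(lower, upper)]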

4. Application of machine learning techniques

In order to run Machine Learning algorithms, it is necessary to identify the input to the algorithm and the requested output. In our case, the objective is to predict the future values of the Glucose concentration in the patient's blood, starting from the current measurement (Sensor Glucose) and the insulin Bolus injected. Therefore, we will have four inputs: the Date and Time parameters, the Sensor Glucose, and the Bolus Volume. As is common in all Machine Learning approaches, we need to divide the data into training and test sets. We have data from 02/01/2022 to 05/04/2022, and we've divided it as follows: data from 02/01/2022 to 15/03/2022 is used for training, while the remaining data is used for testing.
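A chronological split along these dates can be sketched in pandas as follows; the 'Date' and 'Time' column names are our assumption about the CareLink export, as is the file name:

import pandas as pd

df = pd.read_csv("carelink_export.csv")            # hypothetical export file name
df["DateTime"] = pd.to_datetime(df["Date"] + " " + df["Time"], dayfirst=True)
train = df[df["DateTime"] <= "2022-03-15"]         # 02/01/2022 - 15/03/2022 for training
test = df[df["DateTime"] > "2022-03-15"]           # remaining data for testing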

Before we begin applying Machine Learning techniques, let's take a look at the data we will use for testing, which is shown in Figure 4.

Figure 4. Graphical representation of the whole dataset. The x-axis shows the date, the y-axis represents the glucose values. All glucose levels are in mg/dL.


4.1. Regression techniques

In this section, the results obtained through the application of Regression Algorithms will be shown. We have applied four different algorithms:

  • The classic Linear Regression

  • The Polynomial Regression

  • The Ridge Regression

  • The Lasso Regression

To apply the algorithms, the ScikitLearn library has been used.

Figure 5 reports the results of the Lasso Regression, showing the test data in blue and the prediction in red. Training data are not shown, as they are only used to create the regression model. It is clear that the algorithm follows the overall data quite well, with limitations on the positive and negative peaks. However, the figure itself does not allow us to have a clear vision of what is happening, because the data are highly concentrated. We thus limit the visualisation to a single day, so that the results become clearer.

Figure 5. Results of the Lasso Regression over the whole test period. All glucose levels are in mg/dL.


It is much clearer from Figure 6 that the regression algorithms follow the test data but, as was already foreseeable from Figure 5, they struggle to accurately capture the lower and upper peaks (Table 1).

Figure 6. Comparison of the Regression results over a single day. (a) Results of Linear and Polynomial regression over one day, extracted from the test set. All glucose levels are in mg/dL. (b) Results of Lasso regression over one day, extracted from the test set. All glucose levels are in mg/dL and (c) Results of Ridge regression over one day, extracted from the test set. All glucose levels are in mg/dL.


Table 1. Performance comparison of the Regression algorithms in terms of MSE and R2.

Let's examine analytically the errors obtained in this case by the different regressors. Table 1 reports the values of Mean Square Error and R2 for the tested algorithms.

Table 2. Performance comparison of the Regression algorithms on Google Colab (Free access).

From an analytical point of view, the regression algorithms perform quite well, but they are far from acceptable for predictions on a real patient. As already mentioned, blood sugar trends are hard to predict, as they don't follow any pattern known a priori. Despite this, we can notice two things. First, in this case the four algorithms behave in essentially the same way, with some small differences, and the resulting graphs are pretty much the same for all of them. Second, although the MSE value is quite high, and we would therefore expect forecasts far from the true results, in reality the real values are approximated acceptably. This also shows that the value of R2 can often be considered more significant than the MSE value.

4.2. Performance evaluation

The algorithms were executed in the Google Colab environment, enabling the execution of Python code within an online notebook. The default CPU for Colab is an Intel Xeon CPU with 2 vCPUs (virtual CPUs) and 13GB of RAM.

Table 2 reports the performance evaluations of the four regression algorithms considered here. It is evident that the Linear Regression, despite being the simplest, is the best if we consider performance-to-complexity ratios.

5. Application of deep learning techniques

In order to test Deep Learning algorithms for the Case Study, we employed the Keras library and, in particular, its Long Short-Term Memory (LSTM) implementation. The data used as input are the same as in the Regression case. Figure 7 describes the neural network that has been created with the Keras library. The network consists of an initial LSTM layer with three input neurons, corresponding to the variables 'Bolus Volume,' 'Sensor Glucose,' and 'Date + Time,' along with four intermediate neurons. The output layer consists of a single neuron, corresponding to the predicted Glucose value. Two intermediate Dense layers are used to build the LSTM-based regressor. The look_back parameter is set to 3. This parameter tells the LSTM layer to keep past information in memory, up to the specified look_back number of previous inputs. It is generally set to a relatively high value, typically in the order of 15 to 20 steps. However, glycemic values do not depend on samples very far in the past, so a much lower look_back value suffices. Tests were made with look_back values of 3, 5, 10, 15 and 20, and 3 was considered the best one.

Figure 7. Organisation of the LSTM neural network.
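Since the exact sizes and activations of the intermediate Dense layers are not stated in the text, the following Keras sketch is a plausible reconstruction of the described network rather than the authors' exact code (the ReLU activations and the Dense layer sizes are our assumptions):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

look_back, n_features = 3, 3   # 3 past steps; 'Bolus Volume', 'Sensor Glucose', 'Date + Time'

model = Sequential([
    LSTM(4, input_shape=(look_back, n_features)),  # four intermediate LSTM neurons
    Dense(4, activation="relu"),                   # two intermediate Dense layers
    Dense(4, activation="relu"),                   #   (sizes are our assumption)
    Dense(1),                                      # single output: predicted glucose value
])
model.compile(optimizer="adam", loss="mse")
model.summary()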


After 35 epochs of training, the following results were achieved: Training Loss = 293.73, Validation Loss = 455.64.

Figure 8 shows how the training loss evolved during the training. Already from the fifth epoch we have results very close to the final ones; however, we tried to obtain the best outcome with more epochs. Using more epochs may have led the system to overfit, but this was not the case, as we stopped at 35 epochs when the loss had stabilised. As we did before with the regression algorithms, we considered a specific day as a test bench, in order to compare the actual glycemic trend and the predicted one. Let's take into consideration the day of April 4, 2022. Figure 9 shows the reference values: red is used for the real trend, and green for the predicted one.

Figure 8. Training loss variation during the training epochs.


Figure 9. Comparison between the real glycemic trend (in red) and the predicted one (in green).


The prediction is not as accurate as with the regression algorithms, but it seems to follow the real trend quite well. In particular, the LSTM approach seems to predict the glycemic trend with a slight time shift, that is, its predictions seem to anticipate the real trend. This should not be viewed as a negative trait: indeed, the possibility to predict the glycemic trends with slight anticipation has to be considered a major advantage, as an automatic insulin pump would be able to inject an insulin bolus to regulate future glucose levels with more accuracy.

The high computational demand of the LSTM approach is probably its major drawback. Indeed, the LSTM algorithm requires the use of a GPU to speed up the learning process. While a GPU is not strictly necessary, the training time without one is much higher (3 minutes per epoch) than with one (25 seconds per epoch), making it unfeasible to train over a good number of epochs.

6. Approaches comparison

Now we can compare the results obtained through the LSTM neural network with those of the Linear, Lasso, Ridge and Polynomial models. The first thing we compare is the error rate, calculated through the Mean Square Error (Table 3).

Table 3. Performance comparison between LSTM and Linear Regression.

Comparing the best results from the regression algorithms with the LSTM outcome, it appears that the LSTM neural network achieves significantly better results, according to the considered metrics.

An important factor to consider is the training time required for the models. Linear, Ridge, Lasso, and Polynomial regression techniques can quickly prepare the model for predictions. However, training the LSTM neural network takes considerably more time to reach a satisfactory loss compared to the regression models. Therefore, striking a balance between the number of epochs and the desired loss is crucial.

Regarding the computational resources needed for training, the regression models typically do not require GPU usage, and their RAM consumption remains relatively low. The Polynomial regression, though, may have higher memory demands (exceeding 12 GB of RAM). On the other hand, while using a GPU is not mandatory for the LSTM neural network, its absence significantly extends the execution time per epoch. Therefore, if a GPU-equipped machine is unavailable, opting for simpler regression models is advisable.

Lastly, managing an LSTM neural network can be more intricate compared to the Linear, Ridge, Lasso, and Polynomial models. One must pay attention to various parameters, including the look_back, the number of dense layers, the optimisation method, and the number of epochs, to ensure proper functionality.

Table 4 provides a comprehensive comparison of the characteristics of the algorithms used for this case study. It is important to notice that LSTM produces a considerable improvement in the results in comparison with the other considered approaches, but it requires considerable computational power, with a GPU being highly recommended for training.

Table 4. Performance comparison of the regression algorithms on Google Colab (Free access).

In conclusion, the LSTM neural network exhibits greater potential than the Linear, Ridge, Lasso, and Polynomial models. However, for straightforward case studies like this one, it may be more cost-effective to utilise simpler models that deliver excellent results with enhanced efficiency and minimal maintenance.

7. Conclusions and future work

In this work, Machine and Deep Learning techniques were compared for the analysis and prediction of glycemic trends. Real data obtained from glucose sensors were used to train Regression algorithms, such as Linear, Polynomial, Lasso and Ridge, together with an LSTM deep network. Using the glucose levels measured over four months as input, together with the insulin boluses injected into a patient's bloodstream, it was possible to build a predictive model that replicated the glycemic trends with interesting results. In particular, the Regression algorithms provided good reliability, with low resource consumption and a fast training cycle. LSTM performed better in terms of MSE and allowed for a preemptive prediction of the glycemic trend, which could be exploited to better manage a patient's glucose levels. However, it proved to be much more resource-consuming, as it required a GPU to perform well.

In the future, further tests will be carried out using data from different patients. This will allow us to compare and evaluate the algorithms in different situations. Indeed, each patient generally has a different glycemic pattern and responds differently to insulin boluses. Using a general, robust model to predict the glycemic trends of different patients would be optimal for improving patient care. Furthermore, as privacy issues may arise when using personal health data from patients, Federated Learning approaches will be taken into consideration, in order to exploit their privacy by design characteristics.

Acknowledgments

The work described in this paper has been supported by the Project VALERE “SSCeGov - Semantic, Secure and Law Compliant e-Government Processes”.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by Università degli Studi della Campania Luigi Vanvitelli.

References

  • Aliberti, A., Bagatin, A., Acquaviva, A., Macii, E., & Patti, E. (2020). Data driven patient-specialized neural networks for blood glucose prediction. In 2020 IEEE international conference on multimedia & expo workshops (ICMEW) (pp. 1–6).
  • Aral, A., Esposito, A., Nagiyev, A., Benkner, S., Di Martino, B., & Bochicchio, M. A. (2023). Experiences in architectural design and deployment of ehealth and environmental applications for cloud-edge continuum. In International conference on advanced information networking and applications (pp. 136–145). Springer.
  • Branco, D., Martino, B. d., Esposito, A., Tedeschi, G., Bonavita, S., & Lavorgna, L. (2022). Machine learning techniques for prediction of multiple sclerosis progression. Soft Computing, 26(22), 12041–12055. https://doi.org/10.1007/s00500-022-07503-z
  • Diabetes, M. (2022). CareLink system user guide. Retrieved September 10, 2023, from https://www.medtronicdiabetes.com/sites/default/files/library/download-library/workbooks/CareLink-System-User-Guide.pdf
  • Fan, Y., Xu, B., Zhang, L., Song, J., Zomaya, A., & Li, K.-C. (2023). Validating the integrity of convolutional neural network predictions based on zero-knowledge proof. Information Sciences, 625, 125–140. https://doi.org/10.1016/j.ins.2023.01.036
  • Fan, Y., Zhang, W., Bai, J., Lei, X., & Li, K. (2023). Privacy-preserving deep learning on big data in cloud. China Communications, 1–11. https://doi.org/10.23919/JCC.ea.2020-0684.202302
  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
  • Iacono, F., Magni, L., & Toffanin, C. (2022). Personalized LSTM models for glucose prediction in type 1 diabetes subjects. In 2022 30th Mediterranean conference on control and automation (MED) (pp. 324–329). IEEE.
  • Li, J., Han, D., Wu, Z., Wang, J., Li, K.-C., & Castiglione, A. (2023). A novel system for medical equipment supply chain traceability based on alliance chain and attribute and role access control. Future Generation Computer Systems, 142, 195–211. https://doi.org/10.1016/j.future.2022.12.037
  • Liu, B., Zhao, S., Yu, X., Zhang, L., & Wang, Q. (2020). A novel deep learning approach for wind power forecasting based on WD-LSTM model. Energies, 13(18), 4964. https://doi.org/10.3390/en13184964
  • Medsker, L. R., & Jain, L. (2001). Recurrent neural networks: Design and applications. CRC Press.
  • Meijner, C., & Persson, S. (2017). Blood glucose prediction for type 1 diabetes using machine learning: Long short-term memory based models for blood glucose prediction.
  • Mujahid, O., Contreras, I., & Vehi, J. (2021). Machine learning techniques for hypoglycemia prediction: trends and challenges. Sensors, 21(2), 546. https://doi.org/10.3390/s21020546
  • Nguyen, M., He, T., An, L., Alexander, D. C., Feng, J., & Yeo, B. T., & Alzheimer's Disease Neuroimaging Initiative (2020). Predicting Alzheimer's disease progression using deep recurrent neural networks. NeuroImage, 222, 117203. https://doi.org/10.1016/j.neuroimage.2020.117203
  • Pappada, S. M., Cameron, B. D., & Rosman, P. M. (2008). Development of a neural network for prediction of glucose concentration in type 1 diabetes patients. Journal of Diabetes Science and Technology, 2(5), 792–801. https://doi.org/10.1177/193229680800200507
  • Pezzullo, G. J., Esposito, A., & di Martino, B. (2023). Federated learning of predictive models from real data on diabetic patients. In International conference on advanced information networking and applications (pp. 80–89). Springer.
  • Pintaudi, B., Gironi, I., Nicosia, R., Meneghini, E., Disoteo, O., Mion, E., & Bertuzzi, F. (2022). Minimed medtronic 780G optimizes glucose control in patients with type 1 diabetes mellitus. Nutrition, Metabolism and Cardiovascular Diseases, 32(7), 1719–1724. https://doi.org/10.1016/j.numecd.2022.03.031
  • Xu, Z., Zhang, Q., Li, W., Li, M., & Yip, P. S. F. (2019). Individualized prediction of depressive disorder in the elderly: A multitask deep learning approach. International Journal of Medical Informatics, 132, 103973. https://doi.org/10.1016/j.ijmedinf.2019.103973
  • Zappatore, M., Longo, A., Martella, A., Di Martino, B., Esposito, A., & Gracco, S. A. (2023). Semantic models for IoT sensing to infer environment–wellness relationships. Future Generation Computer Systems, 140, 1–17. https://doi.org/10.1016/j.future.2022.10.005
  • Zohora, M. F., Tania, M. H., Kaiser, M. S., & Mahmud, M. (2020). Forecasting the risk of type II diabetes using reinforcement learning. In 2020 Joint 9th international conference on informatics, electronics & vision (ICIEV) and 2020 4th international conference on imaging, vision & pattern recognition (icIVPR) (pp. 1–6). IEEE.