1 BACKGROUND
Postbariatric hypoglycaemia (PBH) is an increasingly recognized late complication
of bariatric surgery, particularly Roux‐en‐Y gastric bypass (RYGB).[1,2] Between 20%
and 80% of patients after RYGB may develop the condition, which is characterized by
postprandial hypoglycaemic episodes that are more severe after ingestion of carbohydrates
with a high glycaemic impact.[3] Prevalence estimates range widely because standardized
diagnostic criteria are currently lacking. Although the pathophysiology is incompletely
understood, accelerated nutrient absorption together with excessive postprandial incretin
and insulin exposure are key features.[4] In the absence of an approved pharmacotherapy,
dietary management is the first‐line treatment of PBH.[5] Dietary measures, however,
can be very restrictive, insufficiently effective and challenging to implement in the
long term. Given these limitations, continuous glucose monitoring (CGM) devices, which
provide real‐time (RT) information on current glucose levels and rate of change, have
the potential to support PBH management.[6] CGM can be leveraged to develop RT predictive
algorithms allowing for preventive or timely corrective actions (e.g. carbohydrate
intake), which may be particularly useful for patients with PBH who have hypoglycaemia
unawareness, a frequent finding, and related safety concerns.[7,8] While hypoglycaemia
forecasting has been widely studied in type 1 diabetes (T1D),[9] the topic remains
understudied in the PBH population. The first and only contribution in this field was
the development of a heuristic‐based predictive algorithm for a glucose‐responsive
glucagon delivery system in an experimental inpatient session.[10,11] To address this
gap, the purpose of this work was to assess the feasibility of forecasting PBH episodes
by exploiting three different predictive algorithms using only CGM data.
2 METHODS
2.1 Dataset and postbariatric hypoglycaemia event definition
Data were generated by 39 adults with confirmed PBH after RYGB (defined as symptomatic
plasma or sensor glucose <54 mg/dl relieved by glucose administration) wearing the
Dexcom G6 (Dexcom Inc., San Diego, CA, USA) CGM sensor for a median of 10 days (IQR
9‐30) in daily life conditions. Data were obtained from usual care and research settings
(NCT04330196, NCT04334161, NCT04332289). Overall, CGM was used in blinded, unblinded
and unknown mode on 14.5%, 62.7% and 22.8% of days, respectively. Based on a PBH event
definition of sensor glucose <54 mg/dl for at least 15 min,[7,12] we identified, in
total, 542 PBH episodes (≈4 every 10 days per subject) with an average duration of
25 min. Participants' details are summarized in Table S1 (see Supplementary Material,
Appendix A). Following preprocessing for anomalies and noise (for more details see
Supplementary Material, Appendix A), the dataset was split into a training set (31
subjects and 489 PBH events) and a test set (the eight remaining subjects and 53 PBH
events). Because the length of CGM recordings may vary significantly between individuals,
we applied the following criteria for including a subject in the test set, to keep
it as balanced as possible and to avoid biasing the results: (a) the patient has >5
and <20 consecutive monitoring days, and (b) the patient showed at least one PBH episode
over four monitoring days.
Clinical and demographic information about the training‐test partition is detailed
in Table S1 (Supplementary Material, Appendix A).
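The event definition used here (sensor glucose <54 mg/dl for at least 15 min) can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function name `find_pbh_events` is hypothetical, a 5‐min sampling period is assumed, a run of n consecutive samples is assumed to last n × 5 min, and data gaps and the preprocessing of Appendix A are ignored.

```python
from typing import List, Tuple

SAMPLING_MIN = 5        # assumed CGM sampling period (min)
TH_PBH = 54.0           # hypoglycaemia threshold (mg/dl) from the event definition
MIN_DURATION_MIN = 15   # minimum time below threshold (min) from the event definition


def find_pbh_events(cgm: List[float]) -> List[Tuple[int, int]]:
    """Return (start_index, end_index) for each run of samples below TH_PBH
    that lasts at least MIN_DURATION_MIN; end_index is inclusive.

    Assumption: n consecutive samples below threshold count as n * SAMPLING_MIN min.
    """
    min_samples = -(-MIN_DURATION_MIN // SAMPLING_MIN)  # ceiling division
    events = []
    start = None  # index where the current below-threshold run began
    for i, value in enumerate(cgm):
        if value < TH_PBH:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_samples:
                events.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the recording.
    if start is not None and len(cgm) - start >= min_samples:
        events.append((start, len(cgm) - 1))
    return events


trace = [90, 70, 53, 52, 50, 60, 53, 52, 80, 50, 49, 48, 47]
print(find_pbh_events(trace))
```

In this toy trace the two‐sample dip (10 min under the stated assumption) is not counted as an event, whereas the three‐sample and four‐sample runs are.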
2.2 Predictive algorithms
Based on our previous work on the prediction of hypoglycaemia in T1D,[13] we considered
the following three algorithms: an autoregressive model with recursive parameter estimation
(AR1),[14] which is a good example of a consolidated adaptive method; an autoregressive
integrated moving average (ARIMA) model,[13] which turned out to be the best linear
predictor in T1D; and a feed‐forward neural network (NN),[15] as a representative of
non‐linear methodologies. These methods, besides being considered state‐of‐the‐art
glucose predictive algorithms for T1D, were also shown to be the best performing for
short‐term prediction when CGM data are the only available source of information.[13]
Regarding the prediction horizon (PH; i.e. how far ahead the method predicts the event),
we considered 15, 20, 25 and 30 min.
For each combination of algorithm and PH, model parameters and/or hyperparameters
were estimated in the training set. Then, the algorithms were applied to the test
set, simulating the acquisition of CGM data in RT (see Supplementary Material, Appendix
B). Two examples of RT PBH forecasting using the proposed algorithms with PH = 20 min
are visualized in Figure 1.
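To make the idea of an autoregressive predictor with recursive parameter estimation concrete, the sketch below fits a first‐order model y(t) = a·y(t−1) by recursive least squares with a forgetting factor and extrapolates PH/5 steps ahead. It is an illustration under assumptions, not the AR1 configuration used in this study: the class and function names, the forgetting factor of 0.9, the initial values and the absence of an intercept are all hypothetical choices, and a 5‐min sampling period is assumed.

```python
from typing import Iterable, Optional


class RecursiveAR1:
    """First-order autoregressive model y(t) = a * y(t-1), with the coefficient
    a updated by recursive least squares (RLS) with a forgetting factor."""

    def __init__(self, forgetting: float = 0.9, a0: float = 1.0, p0: float = 1e3):
        self.mu = forgetting                 # hypothetical value; favours recent samples
        self.a = a0                          # current estimate of the AR coefficient
        self.p = p0                          # scalar covariance of the estimate
        self.last: Optional[float] = None    # most recent CGM sample (mg/dl)

    def update(self, y: float) -> None:
        """Incorporate a new CGM sample and refresh the estimate of a."""
        if self.last is not None:
            x = self.last
            denom = self.mu + x * self.p * x
            gain = self.p * x / denom
            self.a += gain * (y - self.a * x)   # correct by the one-step error
            self.p = self.p / denom             # equals (p - gain * x * p) / mu
        self.last = y

    def predict(self, horizon_min: int, sampling_min: int = 5) -> float:
        """Predict glucose horizon_min ahead by iterating the model."""
        if self.last is None:
            raise ValueError("at least one sample is needed before predicting")
        steps = horizon_min // sampling_min
        return (self.a ** steps) * self.last


def run_realtime(cgm: Iterable[float], horizon_min: int = 20) -> float:
    """Feed samples one at a time, mimicking real-time acquisition, and
    return the prediction issued after the last sample."""
    model = RecursiveAR1()
    for y in cgm:
        model.update(y)
    return model.predict(horizon_min)


# Synthetic, noise-free trace falling by 2% per sample (illustrative data only).
falling = [100 * 0.98 ** k for k in range(20)]
print(run_realtime(falling, horizon_min=20))
```

Under the 5‐min sampling assumption, the PH values considered here (15 to 30 min) correspond to iterating the model three to six steps ahead.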
FIGURE 1
Examples of real‐time forecasting of postbariatric hypoglycaemia (PBH) events and
preventive alert generation using the three model‐based algorithms fed by the past
continuous glucose monitoring (CGM) values (blue dotted line, sampling time 5 min).
Green circles indicate future CGM samples. Top panel: at time 21:15 (actual time),
the three algorithms, fed by the past CGM values, predict the next four CGM values.
Red asterisks, ARIMA; magenta triangles, AR1; black squares, neural network (NN).
As the last predicted CGM value is below th_PBH and there are no recent alarms, a
preventive PBH alarm (red, black and magenta arrows for ARIMA, NN and AR1, respectively)
is triggered. Bottom panel: at time 16:07 (actual time), AR1 predicts a value below
th_PBH and raises a false alarm (magenta arrow), whereas ARIMA and NN correctly predict
the increase in glucose concentration and do not generate any alert.
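The alarm rule described in the caption (raise an alert when the last predicted value is below the PBH threshold and no alarm was raised recently) could be sketched as follows. The threshold of 54 mg/dl and the 30‐min window defining a "recent" alarm are assumptions made for illustration, since the values actually used are not stated in this section; the function name is hypothetical.

```python
from typing import Optional, Sequence


def should_raise_alarm(
    predicted: Sequence[float],
    now_min: float,
    last_alarm_min: Optional[float],
    th_pbh: float = 54.0,          # assumed threshold (mg/dl)
    refractory_min: float = 30.0,  # assumed window defining a "recent" alarm (min)
) -> bool:
    """Return True if a preventive alarm should be raised at time now_min.

    predicted holds the forecast CGM values up to the prediction horizon;
    only the last one (the value at the horizon) is compared with th_pbh.
    """
    below_threshold = predicted[-1] < th_pbh
    recent_alarm = (
        last_alarm_min is not None and now_min - last_alarm_min < refractory_min
    )
    return below_threshold and not recent_alarm


# Four predicted values, as for PH = 20 min with 5-min sampling (toy numbers).
print(should_raise_alarm([70, 63, 57, 52], now_min=0, last_alarm_min=None))
```

Suppressing alarms within a refractory window avoids repeated alerts for the same impending episode; the appropriate window length is a design choice.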
Algorithm performance was evaluated as the ability to predict/detect PBH events (see
Supplementary Appendix B for details). (This work does not consider the Dexcom Urgent
Low Soon alert algorithm and does not provide any evaluation of its performance.)
For each raised PBH alarm, we counted a true positive (TP) if a PBH event occurred
in the following 45 min and a false positive (FP) if no PBH event occurred in the
following 45 min. A false negative (FN) was counted when no alarm was generated despite
the occurrence of a PBH event.[13] Based on TP, FP and FN, the following aggregated
metrics were calculated: precision (P), recall (R) and F1 score (F1).
P is the percentage of correct alarms over the total number of raised alarms. R, also
known as sensitivity or TP rate, is the ratio of correctly predicted hypoglycaemic
events over the total number of events. F1 is the harmonic mean of P and R. In addition,
we evaluated the daily number of false alarms (FP/day) raised by the algorithms, and
the time gain (TG), defined as the temporal distance between a TP alarm and the corresponding
PBH event onset, thus representing the time window for a preventive intervention.
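A minimal sketch of how these quantities could be computed from alarm times and event onsets is given below, using P = TP/(TP + FP), R = (events − FN)/events and F1 = 2PR/(P + R). It is an illustration only: the function name is hypothetical, times are assumed to be in minutes from the start of the recording, an alarm is assumed to count only if it precedes (or coincides with) the onset by at most 45 min, and the exact matching rules of Supplementary Appendix B may differ.

```python
from statistics import median
from typing import List

WINDOW_MIN = 45  # an alarm is a TP if an event starts within the following 45 min


def evaluate_alarms(alarms: List[float], onsets: List[float], n_days: float) -> dict:
    """Compute precision, recall, F1, FP/day and median time gain (TG).

    alarms, onsets: times in minutes from the start of the recording.
    n_days: total monitoring time in days (used for FP/day).
    """
    time_gains = []  # one entry per TP alarm: distance to the next event onset
    fp = 0
    for a in alarms:
        gaps = [o - a for o in onsets if 0 <= o - a <= WINDOW_MIN]
        if gaps:
            time_gains.append(min(gaps))
        else:
            fp += 1
    tp = len(time_gains)
    # An event is missed (FN) if no alarm was raised in the 45 min before its onset.
    fn = sum(
        1 for o in onsets if not any(0 <= o - a <= WINDOW_MIN for a in alarms)
    )
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = (len(onsets) - fn) / len(onsets) if onsets else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fp_per_day": fp / n_days,
        "median_tg_min": median(time_gains) if time_gains else None,
    }


# Toy example: three alarms, three events, two days of monitoring.
print(evaluate_alarms(alarms=[100, 400, 900], onsets=[120, 915, 1500], n_days=2))
```

In the toy example, two alarms anticipate an event (TG of 20 and 15 min), one alarm is false and one event is missed.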
Because of the short CGM recording period of the test set (median 10 days) and the
consequently low number of PBH events, the hypoglycaemia prediction metrics were computed
by pooling all hypoglycaemic events across subjects, according to a population‐based
approach. The results are expressed as a single value for all the considered metrics
except for TG, which is expressed as median [25th‐75th percentile], as it can be computed
for each TP.
To contrast our results with the previously published work, we reimplemented and trained
the PBH Detection System (PBH‐DS) algorithm developed by Laguna Sanz et al.;[10] specifically,
we used the version denoted as PBH‐DS v002 in that study.[10] All implementations
were done in MATLAB (version 2021a).
3 RESULTS
Performance metrics of the AR1, ARIMA and NN algorithms for each considered PH, as
well as of the previously published PBH‐DS, are shown in Table 1 (for details on parameter
identification see Supplementary Material, Appendix C). The ARIMA configuration with
PH = 20 min performed best, achieving P = 79.10%, R = 100%, F1 = 88.33%, FP/day = 0.17
and median TG = 20 min. In practical terms, provided that CGM reflects blood glucose
precisely and accurately, this means that PBH episodes can be predicted 20 min beforehand,
with no missed events and with about one false alert every 6 days.
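As a quick consistency check of these figures, F1 can be recomputed as the harmonic mean of P and R, and the interval between false alerts as the reciprocal of FP/day. The helper names below are hypothetical.

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall (both in %)."""
    return 2 * precision_pct * recall_pct / (precision_pct + recall_pct)


def days_per_false_alert(fp_per_day: float) -> float:
    """Average number of days between two false alerts."""
    return 1.0 / fp_per_day


# Values reported for ARIMA with PH = 20 min.
print(round(f1_score(79.10, 100.0), 2))        # 88.33
print(round(days_per_false_alert(0.17), 1))    # 5.9
```

The reciprocal of 0.17 FP/day is about 5.9 days, which the text rounds to one false alert every 6 days.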
TABLE 1
PBH prediction metrics for the algorithms under investigation (ARIMA, AR1, NN and
PBH‐DS) according to different PHs based on a test set containing 53 PBH events
                     Metrics
Algorithm  PH (min)  P (%)   R (%)   F1 (%)  FP/day  TG (min)
ARIMA      15        72.15   98.28   83.21   0.27    15 [15‐15]
AR1        15        36.11   98.11   52.79   1.15    10 [10‐15]
NN         15        68.29   96.55   80      0.32    15 [10‐15]
ARIMA      20        79.10   100     88.33   0.17    20 [15‐20]
AR1        20        35.97   94.34   52.08   1.11    10 [5‐10]
NN         20        82.26   96.23   88.70   0.14    15 [15‐20]
ARIMA      25        54.08   100     70.20   0.56    25 [20‐25]
AR1        25        42.24   92.45   57.99   0.84    10 [5‐10]
NN         25        62.32   81.13   70.49   0.32    20 [15‐25]
ARIMA      30        41.94   98.11   58.76   0.89    25 [20‐30]
AR1        30        44.45   90.57   59.63   0.76    10 [5‐10]
NN         30        54.67   77.36   64.06   0.43    25 [20‐30]
PBH‐DS     —         23.87   100     38.55   2.11    25 [20‐30]
Abbreviations: AR1, autoregressive model; ARIMA, autoregressive integrated moving
average; F1, F1‐score; FP/day, false positives per day; NN, neural network; P, precision;
PBH‐DS, postbariatric hypoglycaemia detection system; PH, prediction horizon; R, recall;
TG, time gain.
Note: Results of TG are reported as median [25th‐75th] percentile.
ARIMA predictors with PH = 25 and 30 min, despite achieving a larger TG (i.e. window
for intervention), resulted in inferior P and FP/day, and R was also slightly lower
at PH = 30 min. This is illustrated by the F1 trend, which decreased as the PH increased
beyond 20 min (F1 = 88.33%, 70.20% and 58.76% for PH = 20, 25 and 30 min, respectively)
because of the marked decrease of P (P = 79.10%, 54.08% and 41.94%).
Compared with ARIMA, AR1 was inferior for all PHs, particularly in terms of P (P = 36.11%,
35.97%, 42.24% and 44.45% for PH = 15, 20, 25 and 30 min, respectively). In addition,
AR1 produced the largest FP/day among the three model‐based algorithms (1.15 at PH = 15
min), in line with its known susceptibility to unstable predictions.[14] The NN configuration
performed similarly to ARIMA but yielded a lower TG.
Reimplementation of the previously published PBH‐DS resulted in R = 100% with a median
TG of 25 min and P = 23.87%. Consequently, the FP/day was 2.11, which is more than
10 times the number of FPs raised by ARIMA for PH = 20 min (FP/day = 0.17).
In addition, we analysed the performance of the predictive algorithms for individuals
wearing the CGM sensor in blinded and unblinded mode (full results are reported in
the Supplementary Material, Appendix D). The test set comprises four patients with
blinded recordings (24 hypoglycaemic episodes in total) and four patients with unblinded
recordings (29 hypoglycaemic episodes in total). The results are consistent with those
reported for the complete dataset (Table 1): (a) the best performing algorithm is confirmed
to be ARIMA with PH = 20 min, granting high precision (77.42% and 80.56% for the blinded
and unblinded subsets, respectively), high recall (100% in both cases) and low FP/day
(0.17 vs 0.18 for the blinded and unblinded subsets, respectively); (b) AR1 is inferior
to ARIMA for all PHs in both the blinded and unblinded subsets; and (c) NN performed
similarly to ARIMA but yielded a slightly lower median TG (15 min) in both the blinded
and unblinded subsets. Of note, the number of FP/day is slightly larger in the unblinded
than in the blinded subset.
4 CONCLUSIONS
In this proof‐of‐concept study, we assessed the feasibility of forecasting PBH events
in RT using various linear and non‐linear black‐box predictive algorithms fed by CGM
data only. The highest performance was achieved with the ARIMA approach using a PH
of 20 min, which was able to predict PBH events with a median lead time of 20 min,
with no missed events and only about one false alert every 6 days. The ARIMA approach
outperformed the previously published hypoglycaemia prediction algorithm, which yielded
two false alarms per day when applied to our data.[10,11] Apart from usability aspects,
avoidance of false alarms is particularly important for the PBH population, as unnecessary
corrective ingestion of carbohydrates can cause rebound hypoglycaemia and predispose
to weight regain.[1] Although comparability is limited, the performance metrics achieved
here for hypoglycaemia prediction are competitive with those reached in T1D and type
2 diabetes (T2D) populations using similar methods.[13] Thus, our findings are encouraging
and support the feasibility of forecasting PBH episodes by leveraging CGM data in
combination with an ARIMA‐based predictor. Of note, compared with the models used
in Prendin et al.,[13] the proposed ARIMA model requires fewer parameters to describe
the glucose dynamics (i.e. a lower autoregressive model order), which suggests that
glucose dynamics are faster in PBH than in the T1D population.
4.1 Limitations of the study
Despite the promising results obtained in this study, we acknowledge various limitations.
First, assessing the models on the original, non‐processed CGM data would have been
interesting but not robust, because noise could have negatively impacted the identification
and training of the algorithms. However, it is worth noting that, although the data
were preprocessed offline to remove noise/anomalies that could have biased the evaluation,
all predictive algorithms were applied simulating an RT application.
Another potential source of bias in the analysis is the presence of CGM recordings
acquired in unblinded mode. In fact, we found that the number of FPs is higher in the
unblinded subset, which might be explained as follows: low glucose alarms generated
by the unblinded CGM sensor and/or the possibility of reading CGM values in RT may
have triggered a preventive carbohydrate intake, thereby averting the impending PBH
episode. Unfortunately, the lack of information on either alert settings or preventive
carbohydrate intakes in the dataset precludes a definitive confirmation. Still, it
is important to note that the inclusion of unblinded CGM recordings could have led
to an underestimation of the performance of our algorithms because of such FPs.
Finally, we acknowledge that the previously reported time lag of about 10‐15 min[16]
between interstitial and blood glucose concentrations could reduce the actual anticipation
of PBH events via CGM sensor data to 5‐10 min. As intravascular sensors are currently
not a viable option, further studies are required to estimate the blood‐to‐interstitial
fluid time lag during the rapid dynamics of a PBH episode and thus assess the true
effectiveness of CGM‐based PBH predictive algorithms.
4.2 Future developments
Future work will focus on the development of subject‐specific algorithms, which would
account for the large heterogeneity that characterizes the PBH population. This will
only be possible once large longitudinal CGM datasets are available. A further and
natural extension of this work will be to assess whether forecasting of PBH events
improves when the input data are augmented with additional information such as meals
and physical activity. A more thorough understanding of the correlation/causation
between PBH episodes and adverse clinical events will further help to determine the
practical clinical impact of RT PBH prediction.
In conclusion, CGM data can be leveraged to forecast PBH; future research and clinical
validation trials will show whether the technology translates into patient benefits.
AUTHOR CONTRIBUTIONS
FP, GC, DH, AF and LB designed the analysis, AT recruited participants and collected
the CGM data, AT and FP reviewed and prepared the data for analysis, FP, GC and AF
performed the analysis, FP, GC, AT, DH, AF and LB interpreted the data. FP and GC
wrote the first draft of the manuscript. AT, DH, AF and LB critically reviewed the
manuscript. LB and AF are the guarantors of this work and, as such, had full access
to all the data in the study and take responsibility for the integrity of the data
and the accuracy of the data analysis. All authors approved the final draft of the
manuscript for submission.
FUNDING INFORMATION
Swiss National Science Foundation (PCEGP3_186978), product support from the Dexcom
External Research Program (OUS‐2020‐014), ‘SID‐Networking Project 2021’ (DVTDSS project).
Product support was provided by Dexcom.
CONFLICT OF INTEREST
The authors declare that they have no competing interests.
PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/dom.14783.
Supporting information
Appendix S1 Supporting information