Review

Repeated maximal exercise tests of peak oxygen consumption in people with myalgic encephalomyelitis/chronic fatigue syndrome: a systematic review and meta-analysis

Pages 119-135 | Received 31 May 2022, Accepted 29 Jul 2022, Published online: 16 Aug 2022

ABSTRACT

Background

Repeated maximal exercise separated by 24 hours may be useful in identifying possible objective markers in people with ME/CFS that are not present in healthy controls.

Aim

We aimed to synthesise studies in which the test-to-retest (24 hours) changes in VO2 and work rate have been compared between people with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and controls.

Methods

Seven databases (CINAHL, PubMed, PsycINFO, Web of Knowledge, Embase, Scopus and MEDLINE) were searched. Included studies were observational studies that assessed adults over the age of 18 years with a clinical diagnosis of ME/CFS compared to healthy controls. The methodological quality of included studies was assessed using the Systematic Appraisal of Quality for Observational Research critical appraisal framework. Data from included studies were synthesised using a random effects meta-analysis.

Results

The pooled mean decrease in peak work rate (five studies), measured at retest, was greater in ME/CFS by −8.55 W (95% CI −15.38 to −1.72 W). The pooled mean decrease in work rate at anaerobic threshold (four studies), measured at retest, was greater in ME/CFS by −21 W (95% CI −38 to −4 W, tau = 9.8 W). The likelihood that a future study in a similar setting would report a difference in work rate at anaerobic threshold exceeding a minimal clinically important difference (10 W) is 78% (95% CI 40%–91%).

Conclusion

Synthesised data indicate that people with ME/CFS demonstrate a clinically significant test–retest reduction in work rate at the anaerobic threshold when compared to apparently healthy controls.

Introduction

Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is an illness characterised by debilitating fatigue which causes a reduction in levels of physical activity [Citation1], a diminished ability to carry out tasks of daily living [Citation2, Citation3] and a decline in quality of life [Citation4]. Post-exertional malaise (PEM), cognitive dysfunction and disturbed/unrefreshing sleep are common symptoms for the majority of people with ME/CFS [Citation5].

PEM or post-exertional symptom exacerbation (PESE) has been described as a hallmark of ME/CFS [Citation6] and refers to an increase in the severity of symptoms (such as fatigue, flu-like symptoms, pain, sore throat, cognitive impairment, sensory overload) following a physical, cognitive or psychological stressor [Citation6, Citation7]. This worsening may occur 24–48 hours later [Citation6] and can last for over five days [Citation8].

Findings from a systematic review and meta-analysis provided evidence of reduced peak oxygen uptake (VO2peak) in people with ME/CFS [Citation9], and this may increase their risk of cardiovascular and all-cause mortality. However, it was not clear from this review how high-intensity exercise may affect the symptoms of people with ME/CFS. It was also not clear from this review if findings from high-intensity exercise tests can provide a measurable physiological marker in people with ME/CFS, which is not demonstrated in healthy controls. This would be of particular importance in providing an understanding of the consequences of high-intensity exercise in this group as well as identifying possible objective markers such as changes in VO2peak that may be found in people with ME/CFS but are not present in healthy controls.

One method used to explore the effects of high-intensity exercise in people with ME/CFS is the use of provocation studies [Citation10], which involve participants completing two repeated VO2peak tests separated by 24 hours [Citation11]. To date, research evidence assessing VO2peak in repeated maximal exercise tests in ME/CFS is equivocal. Two studies have demonstrated a reduced VO2peak in people with ME/CFS in their second test while controls demonstrated a similar or increased VO2peak [Citation7, Citation8]. One study demonstrated a reduced VO2peak in both the ME/CFS group and the control group [Citation12]. One study reported no change in VO2peak in either people with ME/CFS or controls [Citation13], while another study reported an improvement in both the ME/CFS group and the control group in the second test [Citation14]. Nelson et al. [Citation13] reported that work rate (WR) at anaerobic threshold (AT) was lower in people with ME/CFS but not in controls in the second of the two exercise tests. This indicates that the variable of interest may be the change in WR at AT rather than the change in VO2peak.

A meta-analysis has been conducted assessing this difference [Citation15]. However, whilst this review reported the mean difference in the change and reported statistical significance, we aim to build on this work further to assess the magnitude of the difference in the change. We also aim to define a clinically relevant threshold and assess the pooled effect in relation to this threshold and we place inferential emphasis on a 95% prediction interval [Citation16]. We further aim to estimate the proportion of future studies that if conducted in a similar setting would report findings that would exceed a clinically relevant threshold. It was therefore the aim of this review to quantify the size of the difference in the change in VO2, and WR at peak and AT over two maximal exercise tests separated by 24 hours in people with ME/CFS compared to apparently healthy controls.

Methods

The research design used in this study was a systematic review and meta-analysis of observational studies.

This review was registered in the Prospero register for systematic reviews (CRD42019117837) (https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=117837). There were two changes from the original protocol: (1) The original intent was to assess the variability of the change between the two groups. However, due to the small number of studies and the wide confidence interval around the estimate of the pooled mean difference, this was not conducted. (2) The meta-analyses assessing the difference in VO2peak between people with ME/CFS and controls at test 1, and a separate analysis at test 2, were not conducted because only a small number of studies were retrieved. The PRISMA guidelines were used in the reporting of this review [Citation17].

Criteria for selecting studies

The eligibility criteria for this review were:

Exposure. Adults (over 18 years old) with any clinical diagnosis of ME/CFS using any recognised diagnostic definition, including the Canadian Criteria [Citation2], the International Consensus Criteria (ICC) [Citation18], Fukuda et al. [Citation19] and Holmes et al. [Citation20]. To be included, studies were required to compare people with ME/CFS with apparently healthy controls.

Outcome. Any study that assessed VO2max or VO2peak as a maximal test was included. Studies were required to include two VO2peak tests separated by 24 hours. Studies must have collected data on expired air to be included and were excluded if a predicted VO2peak was calculated from other variables or from a submaximal test. Studies had to report the primary outcome (VO2peak) to be included and data pertaining to the secondary outcomes were then extracted from these papers.

Types of study. Any observational study was included. Studies were required to be published in a peer-reviewed journal with a description of the data collection methods.

Search strategy

A comprehensive literature search was conducted from inception to February 2022 of CINAHL, PubMed, PsycINFO, Web of Science, Embase, Scopus and MEDLINE. The comprehensive search strategy was peer reviewed by a Senior Librarian at Teesside University to ensure its accuracy and effectiveness at retrieving appropriate studies. The following search terms and strategy (involving Boolean operators) were used:

(MH ‘Fatigue Syndrome, Chronic’) OR ‘myalgic encephalo*’ OR ‘CFS’ OR ‘ME’ OR ‘CFS/ME’ OR ‘ME/CFS’

AND

‘Repeated’

AND

‘VO2peak’ OR (MH ‘Aerobic Capacity’) OR ‘VO2’ OR ‘oxygen uptake’ OR ‘maximal oxygen uptake’ OR ‘maximal oxygen consumption’ OR (MH ‘Oxygen Consumption’)

As well as searching online databases, reference lists were checked. Grey literature was not included, as papers were required to be peer reviewed. During the searching process, a number of grey literature studies were identified and the authors were contacted directly to assess if this data had been published. Following discussions with authors, it became clear that the published versions of this data were already retrieved in the literature search and therefore this process retrieved no new papers. The website MEpedia was also checked (https://me-pedia.org/wiki/Two-day_cardiopulmonary_exercise_test).

Selection of studies

Selection of studies was conducted in two stages and was conducted independently by JF and MG. The first selection was based on title and abstract and this was recorded as a ‘yes’ if all inclusion criteria stated above were met, a ‘maybe’ if it was unclear if all criteria were met and a ‘no’ if any of the criteria was not met. After the first selection, JF and MG discussed the results of this process and came to a consensus on the studies for the second selection. Any paper that was deemed by one of the reviewers as a yes or a maybe was included in the second selection. This involved the two reviewers independently reading the full texts and assessing for eligibility using the inclusion criteria stated above. Following second selection, a discussion took place between the reviewers regarding the suitability of the final papers. A consensus was then reached on the final papers for inclusion.

Assessment of methodological quality

To assess the quality of the included papers, the Systematic Appraisal of Quality for Observational Research (SAQOR) framework [Citation21] was used. This framework was modified as described in Franklin et al. [Citation9]. Question 1, which related to the representativeness of the sample, was removed as it was unclear how this would be assessed in the ME/CFS literature. Questions on follow-up were also removed as these were not relevant to the research question. The potential confounders were identified as: (1) the methods and criteria used to assess maximum effort; (2) any methods used to standardise equipment during testing, such as calibration of equipment or the use of familiarisation; and (3) any other confounder, such as controlling for physical activity or caffeine and/or alcohol intake in the days leading up to testing. This may have also included standardising the timing of testing on both days. For each question on the framework, a numerical score was provided. To calculate the score, we awarded a ‘0’ if the criterion was not met in the study, a ‘1’ if the study had made an attempt and a ‘2’ if the criterion was met. This resulted in a maximum score of 32 for each paper. Methodological quality was assessed by JF and two papers were peer reviewed by MG. Each paper was scored twice and then the scores were checked for consistency. A discussion took place between the researchers to agree the key design features to focus on, to ensure consistency across all papers.
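The scoring rule above can be sketched as a small helper. This is an illustrative sketch only: the function name and the example scores are hypothetical, and the figure of 16 scored items is inferred from the stated maximum of 32 at 2 points per item rather than taken from the framework itself.

```python
def quality_score(item_scores):
    """Sum 0/1/2 item scores and return (total, maximum possible)."""
    for score in item_scores:
        # 0 = criterion not met, 1 = attempt made, 2 = criterion met
        if score not in (0, 1, 2):
            raise ValueError(f"invalid item score: {score!r}")
    return sum(item_scores), 2 * len(item_scores)


# Hypothetical paper with 16 scored items (values invented for illustration).
example_scores = [2, 2, 1, 0, 2, 1, 1, 2, 0, 2, 1, 2, 2, 1, 0, 2]
total, maximum = quality_score(example_scores)
print(f"{total}/{maximum}")
```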

Data extraction

Sample size and data relating to VO2 and WR at peak and AT were extracted from each paper and inputted directly into Microsoft Excel. For the analysis, the aim was to assess the difference in the mean change in VO2peak between test 1 and test 2 (test 2 VO2peak minus test 1 VO2peak), between people with ME/CFS and controls. To conduct the meta-analysis, the mean change and the standard deviation (SD) of the change were required for both the ME/CFS and control groups [Citation22]. One of the included papers reported this data [Citation7]. For the remaining studies, authors were contacted directly and asked to provide this data. Two authors provided this information [Citation13, Citation14]. For the three papers that did not provide this information, the SD of the change was estimated using the method described by Higgins et al. [Citation22]. This involved firstly calculating a correlation coefficient for a study that provided an SD of the change [Citation13]. The correlation coefficients derived from Nelson et al. [Citation13] were used to calculate the SDs of the change in the remaining papers. The mean change and SD of the change for Lien et al. [Citation23] were presented in figures and extracted using the DigitizeIt computer programme [Citation24] and inputted directly into the data extraction spreadsheet. For data relating to VO2 at AT, peak WR and WR at AT, the correlation coefficients were calculated using the same method as described above and can be found in Table 1. The final data that were inputted into the meta-analyses can be found in Tables 2 and 3. The data extraction and the process described above were conducted independently by JF and MG before comparing both datasets for accuracy. For more information on this process, please refer to the Cochrane Handbook [Citation22].
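The imputation step can be illustrated with the standard relationship between the SDs at each test, their correlation and the SD of the change, as given in the Cochrane Handbook. The sketch below is not the review's spreadsheet; the function names and numeric inputs are hypothetical.

```python
import math


def correlation_from_sds(sd_test1, sd_test2, sd_change):
    """Correlation implied by a study that reports all three SDs."""
    return (sd_test1**2 + sd_test2**2 - sd_change**2) / (2 * sd_test1 * sd_test2)


def sd_change_from_correlation(sd_test1, sd_test2, r):
    """Impute the SD of the change for a study that reports only test SDs."""
    return math.sqrt(sd_test1**2 + sd_test2**2 - 2 * r * sd_test1 * sd_test2)


# Hypothetical donor study reporting test 1 SD, test 2 SD and change SD.
r = correlation_from_sds(5.0, 6.0, 3.0)

# Hypothetical study missing the change SD: borrow r from the donor study.
imputed_sd = sd_change_from_correlation(4.5, 5.5, r)
print(round(r, 3), round(imputed_sd, 3))
```

Because the borrowed correlation comes from a different sample, the imputed SD is an estimate rather than the study's true change SD.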

Table 1. Correlation coefficients used to calculate change SDs.

Table 2. Data inputted into meta-analysis for difference in change VO2peak and WR at peak between people with ME/CFS and controls.

Table 3. Data inputted into meta-analysis for difference in change VO2 and WR at AT between people with ME/CFS and controls.

Although five of the included papers provided WR at AT data, only four of these studies were included in this analysis. The information provided relating to change in WR at AT in Lien et al. [Citation23] displayed two results at +10 W, two results at −10 W and the remaining values on exactly 0 W. These results seemed highly improbable and therefore this research team was contacted to clarify these findings. However, we were unable to verify this data and consequently this data set was excluded from the analysis. The data for the other variables in this paper were extracted and included in the other analyses. Anomalies which were identified during data extraction and the decisions which were made in relation to these can be found in Table 4.

Table 4. Anomalies and decision making during data extraction.

Data analysis

Mean change, change SDs and sample sizes were inputted into Stata Statistical Software version 16 in duplicate for data analysis. A random effects meta-analysis was conducted; however, because only a small number of studies were included in this review, care was taken to ensure that the data analysis methods accounted for this as far as possible. The restricted maximum likelihood (REML) estimator with a t-distribution (truncated Knapp and Hartung) was applied to assess heterogeneity. Notably, when using a small number of studies, a wider confidence interval is needed to reflect the uncertainty in the between-study variance [Citation25], which is more effectively provided by the Knapp and Hartung (t-distribution) approach. For these reasons, the Knapp and Hartung method is consistently viewed as a more precise approach when the number of studies is <20, and specifically when the number of studies is ≤5 [Citation25–27].
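To make the pooling step concrete, the sketch below shows a random effects meta-analysis with a truncated Knapp and Hartung confidence interval. It is an illustration under stated assumptions, not the Stata analysis used in this review: the between-study variance is estimated with the DerSimonian-Laird method (swapped in for REML to keep the code short), the t quantiles come from a small lookup table, and the input effects are hypothetical.

```python
import math

# Two-sided 95% t critical values for small degrees of freedom (table values).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def random_effects_hk(effects, ses):
    """Return (pooled mean, 95% CI, tau) for study effects and their SEs.

    Assumption: tau^2 is estimated with DerSimonian-Laird, not REML.
    """
    k = len(effects)
    v = [se**2 for se in ses]

    # Fixed-effect weights and Cochran's Q, used for the tau^2 estimate.
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights and pooled mean.
    w_star = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)

    # Knapp-Hartung variance, truncated so it never falls below
    # the conventional random-effects variance.
    q_star = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w_star, effects)) / (k - 1)
    var_hk = max(1.0, q_star) / sum(w_star)

    half_width = T_975[k - 1] * math.sqrt(var_hk)
    return mu, (mu - half_width, mu + half_width), math.sqrt(tau2)


# Hypothetical differences in change (W) and standard errors for four studies.
mu, ci, tau = random_effects_hk([-12.0, -30.0, -18.0, -25.0], [5.0, 6.0, 7.0, 8.0])
print(round(mu, 2), tuple(round(x, 2) for x in ci), round(tau, 2))
```

With four studies the interval uses a t-distribution on 3 degrees of freedom, which is why it is noticeably wider than a normal-based interval.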

To estimate the magnitude of the effect, the standardised mean difference (SMD) was calculated for variables of interest by dividing the pooled mean by the pooled SD, generating a Cohen’s d statistic which reports the effect size in standard deviation units [Citation28]. The pooled SD was calculated using the test 1 ME/CFS group SD for the respective variable. This allowed the generation of a statistic for the test 2 minus test 1 change in ME/CFS vs. controls (pooled mean) in relation to the variability in the ME/CFS group (pooled SD). The pooled SD was calculated by meta-analysing the variance and the standard error (SE) of the variance, derived using the formula described in Hopkins [Citation29] (SE of variance = √(2 × SD^4/df)). The point estimate (pooled variance) was then converted to an SD. SMD was interpreted as: 0.2, a small effect; 0.5, a moderate effect; and 0.8 or greater, a large effect [Citation28].
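The pooled SD calculation can be sketched as follows, taking the SE of a variance as SD² × √(2/df), the standard normal-theory result. This is a simplified illustration: it pools variances with fixed inverse-variance weights, whereas the model used to meta-analyse the variances in this review is not specified, and the SDs and sample sizes are hypothetical.

```python
import math


def se_of_variance(sd, n):
    """Standard error of a sample variance, with df = n - 1."""
    df = n - 1
    return math.sqrt(2 * sd**4 / df)


def pooled_sd(sds, ns):
    """Inverse-variance pool of the study variances, converted back to an SD."""
    variances = [sd**2 for sd in sds]
    weights = [1.0 / se_of_variance(sd, n) ** 2 for sd, n in zip(sds, ns)]
    pooled_var = sum(w * v for w, v in zip(weights, variances)) / sum(weights)
    return math.sqrt(pooled_var)


# Hypothetical test 1 SDs (W) and sample sizes for the ME/CFS groups.
sd_pooled = pooled_sd([20.0, 25.0, 18.0, 24.0], [15, 16, 10, 22])
print(round(sd_pooled, 2))
```

The pooling is done on the variance scale because variances, unlike SDs, have a simple sampling distribution; the square root is taken only at the end.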

When defining a clinically relevant threshold, data was available for WR at AT in Nelson et al. [Citation13] which provided a range of between 7.5 W and 12.5 W and appeared to provide a reasonable degree of sensitivity and specificity. However, there was uncertainty where on this range to apply a point estimate for a clinically relevant threshold. As a second method, it was hypothesised that half an SD may be useful in providing an estimate to be used alongside the data generated in Nelson et al. [Citation13]. Half SD estimates have been useful in defining the MCID in previous studies and have been equivalent to MCID derived from anchor-based methods [Citation30].

Half an SD for the pooled data was 10.9 W; these findings are similar to the centre of the range provided by Nelson et al. [Citation13]. As there was a satisfactory level of agreement between the two methods, the research team agreed that the midpoint from Nelson et al. [Citation13] provided an appropriate point estimate for the MCID threshold for this review. Therefore, the MCID for WR at AT defined in this review was a difference in the mean change between people with ME/CFS and controls of 10 W. As no other data were available to calculate an anchor-based MCID for the other variables these were not defined in this review.
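The arithmetic behind this threshold can be checked directly. The sketch below uses only figures reported in this review (the 7.5 W to 12.5 W range, the half SD of 10.9 W and the pooled difference of −21 W); the variable names are illustrative, and treating the pooled SD as exactly twice the rounded half SD is an approximation.

```python
# Range for a clinically relevant threshold taken from Nelson et al. (W).
range_low, range_high = 7.5, 12.5
midpoint = (range_low + range_high) / 2

# Half an SD of the pooled data (W), as reported, and the implied pooled SD.
half_sd = 10.9
implied_pooled_sd = 2 * half_sd

# SMD for WR at AT: pooled mean difference divided by the pooled SD.
pooled_difference = -21.0
smd = pooled_difference / implied_pooled_sd

print(midpoint, implied_pooled_sd, round(smd, 2))
```

The resulting SMD is consistent with the effect size reported for WR at AT in the Results.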

A prediction interval was calculated to provide the range of likely effects in a future study conducted in a similar setting [Citation16]. Using the methods described by Mathur and VanderWeele [Citation31], we calculated the estimated proportion of future studies conducted in a similar setting that would exceed the MCID (10 W) for WR at AT.
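Both quantities can be sketched from summary statistics. The code below is an illustration under several assumptions, not the review's own computation: the standard error is backed out of the reported (rounded) confidence interval assuming a t-distribution on k − 1 degrees of freedom, the prediction interval uses a t-distribution on k − 2 degrees of freedom, and the proportion uses a simple normal-theory form, Φ((q − μ)/τ), which is only one variant of this type of estimate. The function names are hypothetical.

```python
import math

# Two-sided 95% t critical values (table values) for df = 2 and df = 3.
T_975 = {2: 4.303, 3: 3.182}


def prediction_interval(mu, ci_low, ci_high, tau, k):
    """95% prediction interval from a pooled mean, its CI, tau and study count."""
    # Assumption: the CI was built as mu +/- t(k-1) * SE.
    se = (ci_high - ci_low) / (2 * T_975[k - 1])
    # Assumption: the prediction interval uses t on k - 2 degrees of freedom.
    half_width = T_975[k - 2] * math.sqrt(tau**2 + se**2)
    return mu - half_width, mu + half_width


def proportion_below(q, mu, tau):
    """Normal-theory share of true effects below threshold q."""
    z = (q - mu) / tau
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Summary values for WR at AT as reported in the Results (four studies).
low, high = prediction_interval(mu=-21.0, ci_low=-38.0, ci_high=-4.0, tau=9.84, k=4)
share = proportion_below(q=-10.0, mu=-21.0, tau=9.84)
print(round(low), round(high))
```

Under these assumptions the interval is close to the reported −69 W to 27 W. The normal-theory proportion is not expected to reproduce the reported 78% exactly, since the review does not state which variant of the estimator was applied.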

Results

Results of the search

The comprehensive literature search yielded 439 papers in total, all identified through electronic databases, and this was reduced to 307 after the removal of duplicates. Following the first selection of studies, 293 papers were removed; the percentage agreement between the reviewers was 80%. Fourteen papers were assessed for eligibility through the second selection stage. This was reduced to six papers after the second selection; the percentage agreement between the reviewers for this stage was 100%. An overview of the selection process can be seen in Figure 1.

Figure 1. PRISMA flow chart. From: Page et al. [Citation17].

Summary of included papers

Table 5 provides a summary of the included papers. Five of the included papers [Citation7, Citation8, Citation12–14] used Fukuda as a diagnostic definition; however, Hodges et al. [Citation14] and Nelson et al. [Citation13] also required patients to meet the Canadian [Citation2] and ICC [Citation18] definitions. Lien et al. [Citation23] used the Carruthers et al. [Citation2] diagnostic definition. The six papers performed a VO2peak test using a cycle ergometer, except for two participants with ME/CFS in VanNess et al. [Citation8] who performed a modified Bruce protocol on a treadmill. All papers included a healthy and/or sedentary control group; however, two papers, Vermeulen et al. [Citation7] and Lien et al. [Citation23], did not match for any characteristics such as age, sex or activity levels.

Table 5. Summary of included papers.

Overview of assessment of methodological quality

Table 6 provides a summary of the methodological quality score for included papers. The quality scores for the six papers ranged from 14 to 24, with VanNess et al. [Citation8] having the lowest quality score and Nelson et al. [Citation13] having the highest score. With regard to the overview of the ME/CFS group, although five of the six papers provided information about the source of the sample [Citation7, Citation8, Citation12, Citation13, Citation23], only one paper gave any information about the sampling process [Citation8], and sample size was calculated in one of the six studies [Citation23]. The six papers all provided a good overview of the inclusion and exclusion criteria. All papers included a comparison group, and this was easily identifiable from the patient group. The source of the control group was reported in two of the included papers [Citation13, Citation23]. Matching for any characteristics was conducted in four of the six studies [Citation8, Citation12–14]. Two papers did not assess for any statistical differences between the two groups at baseline [Citation8, Citation14].

Table 6. Summary of the assessment of methodological quality.

The assessment of the exposure (ME/CFS) was good in all six papers. Assessment of the outcome (VO2peak) was good in four of the six papers [Citation12–14, Citation23]. The assessment of the outcome was poor in VanNess et al. [Citation8]. The description of the maximum effort test was good in three papers [Citation12–14] and adequate in three papers [Citation7, Citation8, Citation23]. Four of the six papers [Citation8, Citation12–14] discussed the use of criteria (or a measure) to assess maximum effort; however, only two of these papers provided specific detail about what the criteria for assessing maximum effort were [Citation12, Citation13]. Information on how many participants achieved each criterion was not reported in any of the included papers. Three papers provided a good overview of controlling for other possible extraneous variables [Citation8, Citation13, Citation14]. Missing data were not discussed in five of the papers and were only addressed in Lien et al. [Citation23]. Five papers provided an adequate overview of the results [Citation7, Citation8, Citation12–14]; however, only Vermeulen et al. [Citation7] reported the change data between test 1 and test 2. The overview of the results was scored as poor in Lien et al. [Citation23].

Results of data synthesis

The pooled group mean difference in test-to-retest change in VO2peak over 24 hours between people with ME/CFS and controls was −1.15 (95% CI −3.33 to 1.04) ml kg−1 min−1, indicating that people with ME/CFS had a lower VO2peak at test 2 compared to test 1. Tau, which provides an estimate of the between-study variation, was 1.59 ml kg−1 min−1, which demonstrates substantial heterogeneity. The 95% PI (−6.16 to 3.87 ml kg−1 min−1) demonstrates a wide range of possible effects, from favouring ME/CFS to favouring controls, with no clear indication of the direction of the difference.

The pooled group mean difference in test-to-retest changes in VO2 at AT was −1.83 (95% CI −3.98 to 0.32) ml kg−1 min−1. Tau was 1.4 ml kg−1 min−1 and the 95% PI was −6.91 to 3.25 ml kg−1 min−1.

The pooled group mean difference in test-to-retest changes in WR at peak exercise was −8.55 W (95% CI −15.38 W to −1.72 W), indicating that people with ME/CFS have a reduced peak WR in the second test compared to controls. Tau was 4.4 W and the 95% PI was −24.59 W to 7.49 W.

The pooled group mean difference in test-to-retest changes in WR at AT was −21 W (95% CI −38 W to −4 W), demonstrating that people with ME/CFS had a reduced power output at AT in the second of the two tests compared to apparently healthy controls. Tau was 9.84 W (95% CI 0–W). The effect size for this difference was large (d = −0.96), providing evidence that WR at AT effectively discriminates between ME/CFS and controls. The 95% PI (−69 W to 27 W) indicated a high degree of uncertainty. The probability that the effect in a new study conducted in a similar setting (difference between ME/CFS and controls for the test 2 – test 1 difference) would be beyond the MCID of a reduction of 10 W is 78% (95% CI 40%–91%).

The point estimates in each study are all beyond 10 W (Table 7). Although the estimate of the heterogeneity variance is approximate given the small number of studies, there is reasonable evidence supporting a meaningful reduction in power output at the AT in test 2 in people with ME/CFS vs. apparently healthy controls.

Table 7. Individual study level differences for changes in WR at AT (W).

Discussion

The main findings from this review support the findings of Lim et al. [Citation15], that WR at AT is reduced in the second of two tests in people with ME/CFS when compared to apparently healthy controls. However, building on the work of Lim et al. [Citation15], we demonstrate that the SMD of the difference in WR at AT was large, almost 1 standard deviation, indicating that WR at AT effectively differentiates between people with ME/CFS and controls. There are only four studies in the meta-analysis for WR at AT, which is reflected in the wide prediction interval, ranging from a possible greater reduction in favour of controls (−69 W) to a possible improvement in the ME/CFS group (27 W).

Although there are only a limited number of studies in this review, the overall findings are supported by the following observations. At individual study level, all studies demonstrated a reduced WR at AT in ME/CFS compared with controls, and this difference exceeded the MCID of 10 W in all four studies. The lower end of each study's 95% CI was in the direction of a reduction in ME/CFS compared to controls. The overall pooled effect reported in this review demonstrated a reduced WR at AT in favour of controls that exceeded the MCID defined in this review. Further still, we estimate that 78% of future studies would demonstrate a difference that would exceed the MCID (10 W) in favour of controls. Therefore, while acknowledging the small number of studies in this analysis, we believe these factors combined provide evidence that the change in WR at AT for ME/CFS is clinically relevant, and that changes in WR over two maximal exercise tests separated by 24 hours may be effective in discriminating between ME/CFS and apparently healthy controls.

A possible mechanism for this decrease in WR at AT, at test 2 was described in Vermeulen et al. [Citation7], that people with ME/CFS may have a limited oxygen transport capacity which would explain an increase in anaerobic metabolism in the second test. A further possible explanation was discussed by Tomas et al. [Citation32] that people with ME/CFS may have a reduced aerobic respiratory capacity which results in transferring the cells towards anaerobic energy sources to fulfil energy demands. Tomas et al. [Citation32] provided evidence that for someone with ME/CFS, mitochondrial dysfunction may be a contributing factor to their symptoms. However, these mechanisms were not explored in this review.

While the mechanism which resulted in the reduced WR at test 2 is unclear, data from this review suggests that in the 24 hours following high-intensity exercise people with ME/CFS are unable to achieve the same WR at AT that they had 24 hours earlier. Further to this, these results would appear to demonstrate a physiological response in the 24 hours that follow high-intensity exercise in people with ME/CFS but not controls, which could be related to PEM. However, this requires further exploration to fully understand the mechanisms which cause this reduction and how this relates to ME/CFS symptoms.

Results from this review demonstrated a marginal decrease in the pooled point estimate of VO2peak in people with ME/CFS compared to apparently healthy controls of approximately 1 ml kg−1 min−1. However, the 95% PI allowed no conclusion about the direction of the difference between patients and controls in a future study. Therefore, it is not clear if the difference in the change in VO2peak between people with ME/CFS and apparently healthy controls in two maximal exercise tests separated by 24 hours is clinically relevant.

It is unclear from these findings if the absence of any meaningful difference in the VO2 between people with ME/CFS and controls reflects a true absence of difference in oxygen uptake/utilisation or is attributable to measurement error. For example, the VO2 differences observed in the individual studies in this review may be a consequence of the different averaging strategies applied during expired gas analysis [Citation33], such as breath averages, time averaging or trial averaging [Citation34–36]. Only one of the six studies in this review presented details of their averaging strategy [Citation23]. Lien et al. [Citation23] used a 30 s time average during analysis of the breath-by-breath data. Despite this being a common approach [Citation34, Citation35], time averaging filters between 6 and 60 s can lead to between 6 and 10% decay in VO2 values [Citation33]. Furthermore, Martin-Rincon et al. [Citation33] reported that increases in the time and breath intervals led to higher levels of decay in the accuracy of the data, and that this error was higher in untrained subjects. The level of decay was also higher in bike tests compared to treadmill tests. Therefore, it is plausible that the differences in VO2 data, or lack thereof, may be due to differences in time averaging used between research groups. An alternative explanation could lie in the exercise mode used during the maximal exercise tests. For example, a treadmill and bicycle ergometer represent distinctly different physiological demands and have been used in previous research to elicit different dimensions of fatigue (e.g. central versus peripheral fatigue) [Citation36], with termination of the bicycle tests due to exhaustion/fatigue of the leg muscles [Citation7]. The lack of difference in the VO2 data in the included studies may therefore be affected by the predominant use of a bicycle ergometer as the exercise choice for the exercise tests, with less of a contribution from central/respiratory fatigue to termination of the exercise test.

Findings from this review further demonstrated reductions in workload at peak exercise, although the change in peak workload was smaller than that at the AT. A possible explanation could be the methods used for measuring peak exercise. The method used to assess maximum effort was only described in detail in three of the six papers [Citation12–14]; however, data showing how many participants met each criterion were not reported in any of the papers. It is therefore difficult to establish if the difference reported in this review is accurate or due to participants not achieving their true physiological maximum during testing.

Methods to assess maximum effort, such as an inability to maintain a particular pedal rate as in Hodges et al. [Citation14] or a particular power output, have also been criticised as this may be due to lack of effort, rather than an indication of maximum effort [Citation37]. Another area to study could be to assess subjects’ willingness and perceived ability to give maximal effort prior to each test [Citation37]. Indeed, Poole and Jones [Citation38] argued that even using criteria for maximum effort could underestimate VO2peak by 30–40% due to individual variations, and that achieving true peak is limited to those who are familiar with the protocol and are highly motivated.

Moreover, only one of the included papers [Citation13] described using any method of familiarisation with the test. This is important to control for any learning effect which could result in an improvement in test 2 due to experience with the test rather than any physiological improvement. Poole and Jones [Citation38] reported that VO2peak may be overestimated in repeated testing as the subject gains experience and possibly enhanced confidence, and therefore the accuracy of the initial VO2peak is questionable.

Importantly, a maximal exercise test cannot discriminate among subjects who cease exercise because of lack of motivation, perceived discomfort, or any other reasons, none of which are necessarily related to their maximal rate of oxygen transport/utilisation [Citation38]. However, a strength of the findings of this review is that the AT is not dependent on motivation and is more of an objective marker than peak exercise. Therefore, there may not be a need for people with ME/CFS to exercise to peak to produce this response. In this instance, it would instead be important to know the lowest demand needed to produce a measurable response in people with ME/CFS. Testing at lower exercise intensities, which would continue above the AT but terminate before peak exercise, may be a possible direction for future studies. This may also place less demand on those with ME/CFS and possibly widen recruitment to studies.

Although these results provide information relating to possible measurable differences in WR at AT between people with ME/CFS and apparently healthy controls, the following limitations should be noted. (1) There is a high degree of uncertainty around the pooled mean difference in the change in WR, demonstrated by the wide 95% prediction interval, which ranges from a larger reduction in WR at test 2 in people with ME/CFS than in controls (−69 W) to a reduction in WR at test 2 in the control group but not the ME/CFS group (27 W). (2) The heterogeneity estimated with tau was large. (3) The estimates in this review are based on a limited number of studies, and only four papers (effects) were meta-analysed to estimate the pooled mean difference for change in WR at AT. (4) The SD of the change was estimated in three of the studies, which will affect the precision of the pooled effect. Although the process used to estimate the SDs of the change is based on the methods described by Higgins et al. [Citation22], it must be acknowledged that these are only estimates and not the true SDs of the change for these studies. Higgins et al. [Citation22] stated that these methods should be used carefully, as there is no way of ensuring that the calculated correlation coefficients are accurate and they may be affected by factors such as the characteristics of the participants themselves. For the purpose of this analysis, the most important statistic was viewed to be the change in the outcomes over the 24 hours; nevertheless, the limitations of estimating the SD of the change should be considered when interpreting the results. Therefore, caution is recommended when interpreting these findings, and it must be acknowledged that these results require verification by much larger, well-powered studies.
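As a rough illustration of two of the quantities discussed in these limitations, the sketch below shows (a) the standard formula for the SD of a change score when a test–retest correlation has to be assumed, and (b) an approximate 95% prediction interval built from the pooled estimate, its confidence interval and tau. It is a minimal sketch and not the analysis code used in this review. The function names, the illustrative SDs and the correlation of 0.8 are hypothetical, and the degrees of freedom (k − 1 for the confidence interval, k − 2 for the prediction interval) are assumptions about how the reported intervals were constructed.

```python
import math

# Two-sided 95% Student-t critical values (0.975 quantiles), hard-coded so the
# sketch needs no external libraries. Only df = 2 and df = 3 are included,
# which covers the four-study case (k = 4) discussed in the text.
T_CRIT_975 = {2: 4.302653, 3: 3.182446}


def sd_of_change(sd_test1, sd_test2, r):
    """SD of the (test 2 - test 1) change for an assumed correlation r."""
    return math.sqrt(sd_test1**2 + sd_test2**2 - 2.0 * r * sd_test1 * sd_test2)


def prediction_interval(pooled, ci_low, ci_high, tau, k):
    """Approximate 95% prediction interval for the effect in a new study.

    Assumption: the reported 95% CI used a t-distribution on k - 1 df,
    and the prediction interval uses k - 2 df.
    """
    se = (ci_high - ci_low) / (2.0 * T_CRIT_975[k - 1])  # back-calculated SE
    half_width = T_CRIT_975[k - 2] * math.sqrt(tau**2 + se**2)
    return pooled - half_width, pooled + half_width


# Hypothetical SDs and correlation, not taken from any included study.
print(round(sd_of_change(20.0, 22.0, 0.8), 1))  # 13.4

# Pooled difference in change in WR at AT as reported in this review:
# -21 W (95% CI -38 to -4), tau = 9.8, four studies.
low, high = prediction_interval(-21.0, -38.0, -4.0, 9.8, 4)
print(round(low), round(high))  # -69 27
```

Under these assumptions the reported summary values give an interval of approximately −69 W to 27 W, which agrees with the prediction interval quoted above. The first function also makes clear why the assumed correlation matters: a lower assumed correlation gives a larger SD of the change and therefore a less precise study-level estimate.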

Finally, it should be noted that the methods used in these studies are likely to trigger or exacerbate ME/CFS symptoms [Citation12]. These studies have undoubtedly provided essential information in understanding the condition; however, future research could assess if similar trends can be identified at lower exercise intensities.

Conclusion

In conclusion, results from this review demonstrate that people with ME/CFS have a reduced WR at AT in the second of two maximal effort tests, which is not the case in apparently healthy controls. These findings provide some evidence of possible limitations of aerobic capacity which would appear to happen in the 24 hours following high-intensity exercise in ME/CFS, but not for controls. Based on these findings, it would be useful to explore the lowest demand needed to elicit this response and assess the feasibility of repeated exercise at lower intensities. This review provides evidence that people with ME/CFS appear to demonstrate a measurable response to high-intensity exercise that is not present in apparently healthy controls. These findings add support to the hypothesis of a possible physiological mechanism associated with ME/CFS.

Acknowledgements

We would like to thank Professor Greg Atkinson and Professor Alan Batterham for their guidance and support throughout the development and completion of this review.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

All data that has been collated in the development of this review is presented in this manuscript.

Additional information

Funding

This study was a component of John Franklin’s PhD, which was part-funded by Teesside University.

Notes on contributors

John Derek Franklin

Dr. John Franklin is a Senior Lecturer in Research Methods at Teesside University and Fellow of the Higher Education Academy.

Michael Graham

Dr. Michael Graham is a Senior Lecturer in Sport and Exercise Science at Teesside University with a specialism in health and physical activity promotion and a Fellow of the Higher Education Academy.

References

  • Evering RMH, Van Weering MGH, Groothuis-Oudshoorn KCGM, et al. Daily physical activity of patients with the chronic fatigue syndrome: a systematic review. Clin Rehabil. 2011;25:112–133.
  • Carruthers BM, Jain AK, De Meirleir KL, et al. Myalgic encephalomyelitis/chronic fatigue syndrome: clinical working case definition. Diagnostic and treatment protocols. J Chronic Fatigue Syndr. 2003;11:7–115.
  • Jason L. The energy envelope theory and myalgic encephalomyelitis/chronic fatigue syndrome. AAOHN J. 2008.
  • Collin SM, Crawley E, May MT, et al. The impact of CFS/ME on employment and productivity in the UK: A cross-sectional study based on the CFS/ME national outcomes database. BMC Health Serv Res. 2011;11(217). https://doi.org/10.1186/1472-6963-11-217.
  • Collin SM, Nikolaus S, Heron J, et al. Chronic fatigue syndrome (CFS) symptom-based phenotypes in two clinical cohorts of adult patients in the UK and The Netherlands. J Psychosom Res. 2016;81:14–23.
  • Chu L, Valencia IJ, Garvert DW, et al. Deconstructing post-exertional malaise in myalgic encephalomyelitis/chronic fatigue syndrome: A patient-centered, cross-sectional survey. PLoS One. 2018;13(6): e0197811, https://doi.org/10.1371/journal.pone.0197811.
  • Vermeulen RCW, Kurk RM, Visser FC, et al. Patients with chronic fatigue syndrome performed worse than controls in a controlled repeated exercise study despite a normal oxidative phosphorylation capacity. J Transl Med. 2010;8(93). https://doi.org/10.1186/1479-5876-8-93.
  • Vanness JM, Snell CR, Stevens SR. Diminished cardiopulmonary capacity during post-exertional malaise. J Chronic Fatigue Syndr. 2007;14:77–85.
  • Franklin JD, Atkinson G, Atkinson JM, et al. Peak oxygen uptake in chronic fatigue syndrome/myalgic encephalomyelitis: A meta-analysis. Int J Sports Med. 2019;40:77–87.
  • Komaroff AL. Advances in understanding the pathophysiology of chronic fatigue syndrome. JAMA. 2019;322:499–500.
  • Davenport TE, Stevens SR, Stevens J, et al. Properties of measurements obtained during cardiopulmonary exercise testing in individuals with myalgic encephalomyelitis/chronic fatigue syndrome. Work. 2020;66:247–256.
  • Snell CR, Stevens SR, Davenport TE, et al. Discriminative validity of metabolic and workload measurements for identifying people with chronic fatigue syndrome. Phys Ther. 2013;93:1484–1492.
  • Nelson MJ, Buckley JD, Thomson RL, et al. Diagnostic sensitivity of 2-day cardiopulmonary exercise testing in myalgic encephalomyelitis/chronic fatigue syndrome. J Transl Med. 2019;17:1–8. DOI:10.1186/s12967-019-1836-0
  • Hodges LD, Nielson T, Baken D. Physiological measures in participants with chronic fatigue syndrome, multiple sclerosis and healthy controls following repeated exercise: a pilot study. Clin Physiol Funct Imaging. 2018;37:639–644. DOI:10.1111/cpf.12460
  • Lim EJ, Kang EB, Jang ES, et al. The prospects of the two-day cardiopulmonary exercise test (CPET) in ME/CFS patients: A meta-analysis. J Clin Med. 2020;9:1–13.
  • Inthout J, Ioannidis JPA, Rovers MM, et al. Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open. 2016;6, Available from: http://www.r-project.org/
  • Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Br Med J. 2021;372.
  • Carruthers BM, Van de Sande MI, De Meirleir KL, et al. Myalgic encephalomyelitis: international consensus criteria. J Intern Med. 2011;270:327–338.
  • Fukuda K, Straus SE, Hickie I, et al. The chronic fatigue syndrome: a comprehensive approach to its definition and study. Ann Intern Med. 1994;121:953–959.
  • Holmes GP, Kaplan JE, Gantz NM, et al. Chronic fatigue syndrome: a working case definition. Ann Intern Med. 1988;108:387–389.
  • Ross LE, Grigoriadis S, Mamisashvili L, et al. Quality assessment of observational studies in psychiatry: an example from perinatal psychiatric research. Int J Methods Psychiatr Res. 2011;20:224–234.
  • Higgins JPT, Li T, Deeks JJ. Chapter 6: choosing effect measures and computing estimates of effect. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane handbook for systematic reviews of interventions version 6.3 (updated February 2022). Cochrane. 2022. Available from: www.training.cochrane.org/handbook
  • Lien K, Johansen B, Veierød MB, et al. Abnormal blood lactate accumulation during repeated exercise testing in myalgic encephalomyelitis/chronic fatigue syndrome. Physiol Rep. 2019;7:e14138. DOI:10.14814/phy2.14138
  • DigitizeIt. Digitizer software. [Accessed 2021 Jul]. Available from: http://www.digitizeit.de/
  • Jackson D, Law M, Rücker G, et al. The Hartung-Knapp modification for random-effects meta-analysis: A useful refinement but are there any residual concerns? Stat Med. 2017;36:3923–3934.
  • Inthout J, Ioannidis JP, Borm GF. The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method. BMC Med Res Methodol. 2014;14(25). https://doi.org/10.1186/1471-2288-14-25.
  • Röver C, Knapp G, Friede T. Hartung-Knapp-Sidik-Jonkman approach and its modification for random-effects meta-analysis with few studies. BMC Med Res Methodol. 2015;15(99). https://doi.org/10.1186/s12874-015-0091-1.
  • Vacha-Haase T, Thompson B. How to estimate and interpret various effect sizes. J Couns Psychol. 2004;51:473–481.
  • Hopkins G. Individual responses made easy. J Appl Physiol. 2015;118:1444–1446.
  • Farivar SS, Liu H, Hays RD. Half standard deviation estimate of the minimally important difference in HRQOL scores? Expert Rev Pharmacoecon Outcome Res. 2004;4:515–523.
  • Mathur MB, VanderWeele TJ. New metrics for meta-analyses of heterogeneous effects. Stat Med. 2018: 1–7.
  • Tomas C, Elson JL. The role of mitochondria in ME/CFS: a perspective. Fatigue Biomed Heal Behav. 2019;7:52–58.
  • Martin-Rincon M, González-Henríquez JJ, Losa-Reyna J, et al. Impact of data averaging strategies on V̇O2max assessment: mathematical modeling and reliability. Scand J Med Sci Sport. 2019;29:1473–1488.
  • Robergs RA, Dwyer D, Astorino T. Recommendations for improved data processing from expired gas analysis indirect calorimetry. Sport Med. 2010;40:95–111.
  • McNulty CR, Robergs RA. Repeat trial and breath averaging: recommendations for research of VO2 kinetics of exercise transitions to steady-state. Mov Sport Sci - Sci Mot. 2019;106:37–44.
  • McLaren SJ, Graham M, Spears IR, et al. The sensitivity of differential ratings of perceived exertion as measures of internal load. Int J Sports Physiol Perform. 2016;11:404–406.
  • Midgley AW, McNaughton LR, Polman R, et al. Criteria for determination of maximal oxygen uptake: A brief critique and recommendations for future research. Sport Med. 2007;37:1019–1028.
  • Poole DC, Jones AM. Measurement of the maximum oxygen uptake Vo2max: Vo2peak is no longer acceptable. J Appl Physiol. 2017;122:997–1002.