Abstract
Background Physical activity (PA) measurements are becoming common in interstitial lung disease (ILD); however, standardisation has not been achieved. We aimed to systematically review PA measurement methods, present PA levels and provide practical recommendations on PA measurement in ILD.
Methods We searched four databases up to November 2022 for studies assessing PA in ILD. We collected information about the studies and participants, the methods used to measure PA, and the PA metrics. Studies were scored using 12 items regarding PA measurements to evaluate the reporting quality of activity monitor use.
Results In the 40 included studies, PA was measured using various devices or questionnaires with numerous metrics. Of the 33 studies that utilised activity monitors, a median of five out of 12 items were not reported, with the definition of nonwear time being the most frequently omitted. The meta-analyses showed that the pooled means (95% CI) of steps, time spent in moderate to vigorous PA, total energy expenditure and sedentary time were 5215 (4640–5791) steps·day−1, 82 (58–106) min·day−1, 2130 (1847–2412) kcal·day−1 and 605 (323–887) min·day−1, respectively, with considerable heterogeneity.
Conclusion The use of activity monitors and questionnaires in ILD lacks consistency. Improvement is required in the reporting quality of PA measurement methods using activity monitors.
Tweetable abstract
Reporting quality of physical activity measurement in ILD can be low due to the severe heterogeneity of the measurement methods and metrics used. Standardisation of measurement and improvement in reporting quality is essential to compare studies. https://bit.ly/3GRyq3p
Introduction
Physical activity (PA) is defined as any bodily movement produced by skeletal muscles that results in energy expenditure [1]. PA is a complex behaviour described according to the type of PA, movement intensity, movement duration or a combination thereof. The Physical Activity Guidelines for Americans recommend a minimum of 150 min·week−1 of at least moderate-intensity PA to gain health benefits across the adult population, including adults with chronic diseases [2]. The benefits of regular PA include reducing the risk of all-cause and cardiovascular disease mortality, cardiovascular disease, hypertension, type 2 diabetes and other chronic diseases [2].
Participation in regular PA is also crucial in patients with interstitial lung disease (ILD). PA is reduced in patients with ILD compared to healthy controls [3, 4]. Greater dyspnoea and exercise intolerance are associated with lower PA [5]. Reduced PA is one of the strong risk factors for hospitalisation and all-cause mortality in patients with ILD [6, 7]. Therefore, the number of studies regarding PA has been increasing.
Currently, there is no systematic review of PA measurements in patients with ILD. Some researchers have used questionnaires [4, 6–9], while others have used activity monitors [3–5, 10]. Questionnaires have multiple limitations, including recall bias, missing data and lower precision [11]. Thus, activity monitors containing accelerometers are preferred. Furthermore, previous studies used different activity monitors and different methods of collecting and processing data. Inaccurate assessment of PA can adversely impact the advancement of PA research in ILD. Therefore, understanding how activity monitors or questionnaires have been used in ILD is crucial.
This systematic review, therefore, aimed 1) to explore the types of activity monitors or questionnaires used for collecting PA data in patients with ILD, 2) to evaluate activity monitor-based or questionnaire-based metrics used for assessing PA, 3) to examine the quality of reporting on data collection and processing using activity monitors, 4) to describe PA levels using each metric, and 5) to provide practical recommendations on how to measure PA and sedentary time (ST).
Methods
This review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (supplementary material 1). The review protocol was registered on the International Prospective Register of Systematic Reviews (CRD42021264114).
Eligibility criteria
Study designs
Observational and interventional studies were included. Observational studies were cross-sectional studies, cohort studies or case-control studies. Interventional studies were studies that investigated the effects of interventions. Case series, case reports and grey literature (e.g. conference abstracts) were excluded.
Participants
We included studies examining adults with ILD of any origin, diagnosed according to investigator definitions. Participants with exacerbation histories in the preceding 4 weeks [12] were excluded to minimise the influence of exacerbations. There were no restrictions by a history of pulmonary rehabilitation because it is a standard treatment for ILD [13].
PA measurements
We included studies that used activity monitors or questionnaires to measure PA. Additionally, studies had to report on at least the characteristics of patients with ILD separately.
Setting
There were no restrictions on the type of setting.
Language
We included studies reported in English.
Information sources and search strategy
We searched PubMed, CENTRAL, PEDro and OTseeker from the inception of the databases up to 28 November 2022. Search strategies were developed using medical subject headings and text words related to ILD and PA. No study design, date or language limits were imposed on the search. The full search strategy is presented in supplementary material 1.
Selection process and data collection
Two review pairs (M.I. and A.K.; Y.O. and Y.O.) independently performed the first screening (titles and abstracts) and the second screening (full text). Reviewers independently extracted data from each study that met inclusion criteria. Reviewers resolved disagreements by discussion and an arbitrator (A.T. or M.A.S.) adjudicated unresolved disputes.
Data items and outcomes
We extracted the following data:
patient characteristics – age, sex, forced vital capacity (FVC), diffusing capacity of the lungs for carbon monoxide (DLCO), modified Medical Research Council (mMRC) dyspnoea score [14], 6-min walk distance (6MWD) and use of long-term oxygen therapy (LTOT);
types of activity monitors or questionnaires – brand, model, sensor type and sensor location;
PA data collection – period of wear (requested days wear time, weekend/weekday wear requirements and overnight wear) and number of hours wear for a valid day;
PA data management – valid days requirement, rules for exclusion of days and a method for nonwear time detection;
PA metrics – steps, time spent in the specific intensity of PA (e.g. time spent in moderate to vigorous PA (MVPA)), energy expenditure (EE) (e.g. total EE (TEE) and activity-related EE (AEE)), ST and any other types of metrics.
We extracted data from baseline assessments for cohort or interventional studies to avoid influences of exposures or treatments on their PA. Additionally, we extracted the same data from healthy controls in included studies.
We assessed the reporting quality of objectively measured PA following the checklist by Montoye et al. [15]. This checklist consists of 12 questions on accelerometer information, data collection and processing. Reviewers gave a “+1” score for a sufficiently reported item and a “−1” score for an insufficiently reported item. The “−1” scores were summed so that each study received a score from 0 to −12, with scores closer to 0 indicating more complete reporting.
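The scoring rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `reporting_quality_score` and the list-of-booleans input are assumptions made for the example, and the 12 checklist items themselves are not reproduced here.

```python
from typing import Sequence

N_ITEMS = 12  # number of checklist items, as stated in the text


def reporting_quality_score(reported: Sequence[bool]) -> int:
    """Return a score from 0 (all items reported) to -12 (none reported).

    `reported[i]` is True when checklist item i was sufficiently reported.
    Only insufficiently reported items (the "-1" scores) are summed.
    """
    if len(reported) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} items, got {len(reported)}")
    return -sum(1 for item in reported if not item)


# Hypothetical study that omits five of the 12 items.
example = [True] * 7 + [False] * 5
print(reporting_quality_score(example))  # -5
```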
Risk of bias assessment
The two pairs independently evaluated the risk of bias following the Joanna Briggs Institute (JBI) critical appraisal checklist for an analytical cross-sectional study [16]. We used the JBI checklist for all studies regardless of their design because only baseline data were collected. A judgment on the possible risk of bias on criteria 1–7 was made from the extracted information, rated as “yes”, “no”, “unclear” or “not applicable”. We did not use criterion 8 (was appropriate statistical analysis used?) because we focused on baseline descriptive data. If there was insufficient detail, we judged the risk of bias as “unclear”.
Data synthesis process
We performed all statistical analyses using RStudio (version 1.2.5001). Each metric was combined and calculated using the “meta” package [17].
Assessment of heterogeneity
We tested the clinical heterogeneity by considering the variability in participant factors, types of activity monitors or questionnaires, and PA data collection and management. Statistical heterogeneity was tested using the I2 statistic.
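For readers unfamiliar with the statistic, I2 is commonly derived from Cochran's Q as I2 = max(0, (Q − df)/Q) × 100%, where df is the number of studies minus one. The Python sketch below illustrates that calculation under inverse-variance weighting. It is not the code used in this review (which relied on the R “meta” package); the function name and inputs are assumptions for illustration.

```python
from typing import Sequence, Tuple


def cochran_q_and_i2(means: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Return Cochran's Q and I^2 (%) for study means and their standard errors."""
    if len(means) != len(ses) or len(means) < 2:
        raise ValueError("need at least two studies with matching lengths")
    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    q = sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))
    df = len(means) - 1
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical daily step means and standard errors from three studies.
q, i2 = cochran_q_and_i2([4000.0, 5500.0, 7000.0], [200.0, 250.0, 300.0])
print(f"Q = {q:.1f}, I2 = {i2:.0f}%")
```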
Dealing with missing data
After contacting the study authors, data that remained missing were excluded using listwise deletion.
Data synthesis
If high heterogeneity existed among the studies (I2≥50% or p<0.1), we conducted meta-analyses using a random-effects model. The pooled mean and 95% confidence interval of each metric were calculated using the inverse variance method. If a study reported only a median with an interquartile range or range, we estimated the mean and sd from the sample size, median and interquartile range or range following the method reported by Wan et al. [18].
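The two steps in this paragraph can be sketched as follows. The first two functions use the approximations published by Wan et al. for the median-with-IQR and median-with-range scenarios. The pooling function uses the DerSimonian-Laird estimate of between-study variance and a normal-based 95% CI; both are assumptions, because the text does not state which estimator or CI method was used in the “meta” package. All function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple

_Z = NormalDist()  # standard normal distribution, used for quantiles


def mean_sd_from_median_iqr(n: int, q1: float, median: float, q3: float) -> Tuple[float, float]:
    """Approximate mean and SD from a median and interquartile range."""
    if n < 2:
        raise ValueError("sample size must be at least 2")
    mean = (q1 + median + q3) / 3.0
    sd = (q3 - q1) / (2.0 * _Z.inv_cdf((0.75 * n - 0.125) / (n + 0.25)))
    return mean, sd


def mean_sd_from_median_range(n: int, low: float, median: float, high: float) -> Tuple[float, float]:
    """Approximate mean and SD from a median and range (min, max)."""
    if n < 2:
        raise ValueError("sample size must be at least 2")
    mean = (low + 2.0 * median + high) / 4.0
    sd = (high - low) / (2.0 * _Z.inv_cdf((n - 0.375) / (n + 0.25)))
    return mean, sd


def pool_means_random_effects(
    means: Sequence[float], sds: Sequence[float], ns: Sequence[int]
) -> Tuple[float, float, float]:
    """Inverse-variance random-effects pooled mean with a 95% CI.

    Between-study variance is estimated with DerSimonian-Laird (assumption).
    """
    variances = [sd ** 2 / n for sd, n in zip(sds, ns)]  # squared SE of each mean
    w = [1.0 / v for v in variances]
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(means) - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


# Hypothetical study reporting a median (IQR) of 5000 (4000-6000) steps in 50 patients.
est_mean, est_sd = mean_sd_from_median_iqr(50, 4000.0, 5000.0, 6000.0)
print(est_mean, round(est_sd, 1))
```

The estimated mean and sd can then be passed to `pool_means_random_effects` alongside studies that reported means directly.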
Investigation of heterogeneity
Subgroup analyses using aggregate data were performed to explore possible sources of heterogeneity based on the types and locations of activity monitors and patients’ characteristics (ILD subtype, age, FVC, DLCO, 6MWD and LTOT). We divided the included studies into two groups as follows: not old (<65 years) and old (≥65 years); preserved FVC (≥65% predicted) and low FVC (<65% predicted) [19]; preserved DLCO (≥45% predicted) and low DLCO (<45% predicted) [19]; and preserved 6MWD (≥350 m) and low 6MWD (<350 m) [20]. Regarding LTOT, studies were divided into three groups: studies including patients with and without LTOT (mix), only without LTOT (without) and only with LTOT (with). Moreover, studies were divided by the MVPA or ST definition. Pooled mean, 95% confidence interval and I2 for each subgroup were estimated if a subgroup included ≥3 studies.
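The dichotomisation rules and the minimum of three studies per subgroup can be expressed compactly. The sketch below uses the cut-offs stated above; the dictionary keys (`age_years`, `fvc_pct_pred`, `dlco_pct_pred`, `sixmwd_m`) and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional

# (cut-off, label when value >= cut-off, label otherwise), taken from the text
CUTOFFS = {
    "age_years": (65.0, "old", "not old"),
    "fvc_pct_pred": (65.0, "preserved FVC", "low FVC"),
    "dlco_pct_pred": (45.0, "preserved DLCO", "low DLCO"),
    "sixmwd_m": (350.0, "preserved 6MWD", "low 6MWD"),
}


def assign_subgroups(study: Dict[str, Optional[float]]) -> Dict[str, Optional[str]]:
    """Label a study's aggregate characteristics; None when a value is unreported."""
    labels: Dict[str, Optional[str]] = {}
    for key, (cut, at_or_above, below) in CUTOFFS.items():
        value = study.get(key)
        labels[key] = None if value is None else (at_or_above if value >= cut else below)
    return labels


def poolable_subgroups(labels: List[Optional[str]], min_studies: int = 3) -> List[str]:
    """Return subgroup labels that contain at least `min_studies` studies."""
    counts = Counter(label for label in labels if label is not None)
    return sorted(label for label, k in counts.items() if k >= min_studies)


# Hypothetical study-level means.
print(assign_subgroups({"age_years": 70.0, "fvc_pct_pred": 64.9,
                        "dlco_pct_pred": 45.0, "sixmwd_m": None}))
```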
Sensitivity analysis
Sensitivity analysis was performed to explore the impact of reporting quality of PA measurements on the heterogeneity. We divided included studies into adequate reporting quality (−6–0 points) and low reporting quality (−12–−7 points), based on the quartile 1 of −6.5 points in this review. Pooled mean, 95% confidence interval and I2 for each subgroup were estimated if a subgroup included ≥3 studies.
We conducted meta-analysis and subgroup analyses on TEE by omitting Khor et al. [21] and Prasad et al. [22] to test the robustness of the estimations because the means of TEE in those studies are approximately 3.5 times higher than the other studies included in the meta-analysis.
Narrative synthesis
A narrative synthesis was provided to summarise and explain the characteristics and PA metrics following the Centre for Reviews and Dissemination guidance [23].
Confidence in cumulative evidence
The quality of evidence on PA metrics, included in meta-analyses, was assessed across the domains of risk of bias [24], inconsistency [25], indirectness [26] and imprecision [27] following the Grading of Recommendations Assessment, Development and Evaluation working group methodology [28]. Quality was adjudicated as high, moderate, low or very low.
Results
Description of studies
Details of the included studies are available in table 1.
Table 1. Characteristics of included studies (40 studies)
Results of the search
The search yielded 15 407 citations and ended with 40 studies [3–5, 7–10, 21, 22, 29–59] from 49 citations (figure 1).
Figure 1. Flowchart of literature review. ILD: interstitial lung disease.
Hur et al. [60] and Hur et al. [8] used the same cohort data. All PA metrics reported by Hur et al. [60] were shown in the other study [8]. Thus, we excluded Hur et al. [60]. Three studies by Vainshelboim et al. [6, 7, 61] used the same participant data (Registration No. NCT01499745). A PA metric (overall PA calculated in metabolic equivalent of tasks (METs) min·week−1) shown in two studies [6, 61] was not reported in the other one [7]. Therefore, we combined the two studies [6, 61] with Vainshelboim et al. [7] and excluded them [6, 61]. Two studies by Dale et al. [44, 62] used the same cohort data (Registration No. ACTRN12608000147381). We excluded the latter one [62]. Two reports by Bahmer et al. [5, 63] used the same participant data (Registration No. DRKS00006170) and Bahmer et al. [5] reported complete data at baseline. Thus, we excluded Bahmer et al. [63]. We excluded four studies that did not show the participant characteristics with ILD separately from other participants [64–67].
Included studies
Table 1 shows the characteristics of the included studies. Of the 40 studies, 11 studies [9, 21, 34, 35, 37, 46, 49–52, 58] were intervention studies, 13 studies [3, 7, 8, 22, 29, 30, 33, 36, 41, 43, 47, 54, 56] were cohort studies and 16 studies [4, 5, 10, 31, 32, 38–40, 42, 44, 45, 48, 53, 55, 57, 59] were cross-sectional studies. Of the 16 cross-sectional studies, one was a validation study of a PA questionnaire [8].
15 studies included patients with ILD due to any causes [8, 21, 30–34, 37, 46–48, 51, 53, 55, 59], 15 studies included only idiopathic pulmonary fibrosis (IPF) [5, 7, 9, 22, 29, 40–43, 45, 49, 50, 54, 56, 58], seven studies included only sarcoidosis [4, 35, 36, 38, 39, 52, 57] and three studies included another ILD subtype [3, 10, 44]. The total sample size of each study ranged from 13 [46] to 629 [57]. The mean or median age was more than 65 years in 26 studies (65%).
PA measurement
Of 40 studies, 31 (78%) studies used only activity monitors [3, 5, 10, 21, 22, 29–32, 34–45, 48–52, 55–59], seven (18%) used only questionnaires [7, 9, 33, 46, 47, 53, 54] and two (5%) used both [4, 8]. Thus, 33 (83%) of 40 studies used activity monitors and nine (23%) used questionnaires.
Table 2 summarises the activity monitor-based PA measurements. Of 33 studies, 14 studies used SenseWear Armband [3, 5, 21, 22, 35–37, 39, 40, 44, 49, 50, 55, 56], nine studies used ActiGraph [8, 10, 34, 38, 48, 51, 52, 58, 59] and four studies used Lifecorder [29, 32, 42, 45].
Table 2. Details of activity monitor-based physical activity measurement in included studies (33 studies)
12 studies did not report the sensor location [21, 31, 32, 35, 37, 38, 40, 43, 50, 52, 57, 58], nine studies located the monitors at the arm (upper arm: seven studies [3, 22, 36, 39, 44, 55, 56]; not specified: two studies [5, 49]), three studies at the wrist [30, 34, 51], four studies at the waist [29, 42, 45, 59], two studies at the hip [41, 48], one study at the upper thigh [4], one study at the wrist and waist [8], and one study at the wrist or waist [10].
Of nine studies that used questionnaires, five studies used the International Physical Activity Questionnaire (IPAQ) (long form: one study [8]; short form: three studies [4, 7, 9]; not reported: one study [54]), and one study each used the Human Activity Profile [46], the Rapid Assessment of Physical Activity Questionnaire [47], the PA component of the frailty assessment by the Fried Frailty Criteria [33] and the Minnesota Leisure-Time Physical Activity Questionnaire short form [53] (table S1 in supplementary material 2).
Reporting quality of PA measurements
Table 3 summarises the reporting quality of PA measurements using activity monitors (33 studies). Accelerometer brand and model, the number of days of data collected, PA metrics, and the number of people not meeting wear-time criteria were well reported in most studies. However, epoch length, placement of the accelerometer (especially on the side of the body), the number of participants receiving the accelerometer, distribution method of the accelerometer, criteria for defining nonwear time, minutes requirement for a valid day, and the number of valid days needed were poorly reported (table S2 in supplementary material 2).
Table 3. Reporting quality of accelerometer-based physical activity measurement (33 studies)
Risk of bias
Table S3 in supplementary material 2 shows the results of the risk of bias. The domains with the highest risks of bias were item 7 regarding the method of PA measurements and items 5 and 6 regarding confounding factors.
PA metrics
Of 40 studies, 35 studies (activity monitor only: 29 studies [3, 5, 10, 21, 22, 29–32, 35–45, 48–52, 55, 56, 58, 59]; questionnaire only: four studies [7, 46, 53, 54]; both: two studies [4, 8]) displayed at least one PA metric of patients with ILD. The PA metrics used were heterogeneous (tables S4 and S5 in supplementary material 2).
Of 29 studies that used activity monitors, 28 studies (97%) reported steps [3–5, 8, 10, 22, 29–32, 35–42, 44, 45, 48–50, 52, 55, 56, 58, 59], 16 studies (55%) MVPA [3, 5, 8, 10, 21, 22, 31, 35, 37, 41, 44, 48, 50, 51, 55, 56], eight studies (20%) TEE [3, 21, 22, 31, 35, 36, 44, 45], four studies (10%) AEE [8, 22, 41, 45] and nine studies (23%) ST [8, 10, 21, 31, 41, 43, 50, 55, 56]. Definitions of MVPA and ST were heterogeneous (table S4 in supplementary material 2). For example, seven of 16 studies defined moderate intensity of PA as >3 METs [5, 22, 31, 37, 44, 55, 56], two studies defined it as >2.5 METs [3, 35] and others used acceleration magnitude [8, 51] or EE [41] for defining MVPA.
Of nine studies that used questionnaires, four studies presented IPAQ overall EE (MET min·week−1) [4, 7, 8, 54], two studies IPAQ walking EE (MET min·week−1) [4, 7] and ST (min·day−1) [7, 8], and one study IPAQ MVPA EE (MET min·week−1) [8] (table S4 in supplementary material 2). Other metrics measured by questionnaires are described in table S5 in supplementary material 2.
Estimation of PA levels in steps, MVPA, TEE and ST
The pooled mean (95% CI) of steps was 5215 (4640–5791) steps·day−1 (I2=97 (97–98) %) (figure 2a). Subgroup analyses found that people with IPF or ILD took fewer steps than those with sarcoidosis. Activity monitors worn on the wrist or upper arm recorded more steps than those worn on the waist or lower extremity. Additionally, people with lower FVC, DLCO or 6MWD and those using LTOT took fewer steps (figures S1–S8 in supplementary material 3). However, I2 improved only slightly in the subgroup analysis by ILD subtype.
Figure 2. Forest plots of estimation of overall means of a) steps (steps·day−1), b) moderate to vigorous physical activity (MVPA) (min·day−1) and c) sedentary time (ST) (min·day−1). Weight is calculated by the random-effects model. Wrist and waist refer to the activity monitor location. C: control; I: intervention.
The pooled mean of MVPA was 82 (58–106) min·day−1 (I2=99 (99–99) %) (figure 2b). Subgroup analyses revealed that people with sarcoidosis spent more time in MVPA than those with other ILD subtypes (figures S9–S17 in supplementary material 3). MVPA measured by SenseWear (91 (61–121) min·day−1) appeared to be higher than that measured by ActiGraph (49 (3–96) min·day−1), but the difference was not significant. Patients with lower FVC or using LTOT spent less time in MVPA than those with preserved FVC or without LTOT (all p<0.05). The mean of MVPA defined as 2.5 METs or more was more than twice that defined as 3.0 METs (152 (123–181) min·day−1 versus 67 (51–82) min·day−1), with a slight improvement in heterogeneity.
The pooled mean of TEE was 3574 (1684–5464) kcal·day−1 (I2=100 (100–100) %) (figure S18 in supplementary material 3). We did not perform a subgroup analysis by FVC because no study was classified into the low FVC group. Activity monitors worn on the upper extremity showed higher TEE than those worn on the waist or lower extremity (figures S19–S25 in supplementary material 3). Heterogeneity was improved only in subgroup analysis by the ILD subtype.
The pooled mean of ST was 605 (323–887) min·day−1 (I2=100 (100–100) %) (figure 2c). Types of activity monitors showed significant differences in ST. People with worse FVC or DLCO exhibited longer ST than those with better FVC or DLCO. ST significantly differed between studies that used different definitions of ST (figures S26–S34 in supplementary material 3). I2 did not improve in any subgroup analysis.
Sensitivity analysis
Subgroup analyses by reporting quality of PA measurements found that studies with low reporting quality reported longer MVPA than those with adequate reporting quality (p=0.02) (figures S35–S38 in supplementary material 4). Substantial heterogeneities were observed in all PA metrics.
After omitting two studies [21, 22], the pooled mean value of TEE was 2130 (1847–2412) kcal·day−1 (figure S39 in supplementary material 4). I2 remained high (96 (94–97) %). Subgroup analyses showed similar results (figures S40–S47 in supplementary material 4).
Reference data of healthy controls
Of 40 studies, seven studies recruited healthy controls (tables S6 and S7 and figures S48 and S49 in supplementary material 5). The pooled mean of steps from six studies was 10 167 (8433–11 901) steps·day−1 (I2=88 (76–94) %). The pooled mean of TEE from three studies was 2618 (2505–2730) kcal·day−1 (I2=15 (0–91) %). The means (sd) of MVPA in three studies were 261 (118), 132 (72) and 86 (8) min·day−1. No study reported ST.
Quality of the body of evidence
The quality of the body of evidence on steps, MVPA, TEE and ST was very low, mainly due to severe inconsistency and imprecision (table S8 in supplementary material 6).
Discussion
This systematic review revealed that 1) measurement procedures varied tremendously between studies, 2) reporting quality of PA measurements was poor in most studies, 3) types and definitions of PA metrics were heterogeneous and influenced the PA metrics values, and 4) use of PA questionnaires is limited in patients with ILD. Additionally, there was very low-quality evidence in the pooled means of steps, MVPA, TEE and ST. Therefore, clinicians and researchers should improve the quality of PA measurements.
Reporting quality of PA measurements
Four systematic reviews have assessed the reporting quality of PA measurements using accelerometers in the general population [15], cancer survivors [68] and patients with chronic heart failure [69] or COPD [70]. Although direct comparisons of our results to these reviews are difficult due to the differences in methodology of reporting quality assessment, we selected the same tool used in two reviews [15, 69], enabling us to compare the reporting quality in patients with ILD to other populations.
More than 50% of studies included in this review failed to report six of 12 items related to data collection and processing. Specifically, 88% of the included studies failed to report the criteria for defining nonwear, compared with 69% in general populations [15], 49% in cancer survivors [68] and 80% in heart failure [69]. Moreover, the included studies often failed to report how accelerometers were distributed to participants (79%), the placement of the accelerometer (76%) and the epoch length used (73%). These percentages are worse than those in the general population (69, 51 and 36%) [15], cancer survivors (46, 7 and 52%) [68] and heart failure (0, 64 and 60%) [69]. In contrast, 48% of the included studies did not report the number of valid days needed and 42% failed to report how many minutes were needed to be considered a valid day. These percentages are similar to those in the general population (52 and 50%) [15], cancer survivors (43 and 38%) [68] and heart failure (76 and 78%) [69]. Burtin et al. [70] revealed that only 37 of 110 (34%) studies with COPD patients fulfilled the following minimal preferred methodologic quality of PA assessment: measurement period ≥7 days; minutes needed to be considered a valid day ≥8 h·day−1; ≥4 consecutive or nonconsecutive valid days; and invalid days excluded from analysis. In our review, only 36% of studies met these criteria. We are reluctant to use these results to claim that the reporting quality in studies with ILD is similar or inferior to that in other populations. However, we believe that reporting on PA measurements should be improved because these factors are crucial for replicating and comparing studies. Therefore, we encourage clinicians and researchers to report data collection and processing methods following the checklist by Montoye et al. [15], as shown in table 3 and table S2 in supplementary material 2.
We have developed a template for adequate reporting quality (table S9 in supplementary material 7), which can be used to report required information as supplementary materials.
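The minimal methodological criteria quoted from Burtin et al. [70] in this section lend themselves to a simple automated check of a study protocol. The sketch below is illustrative only; the `WearProtocol` class, its field names and the function name are hypothetical and are not part of any published tool.

```python
from dataclasses import dataclass


@dataclass
class WearProtocol:
    measurement_days: int        # requested measurement period (days)
    min_hours_valid_day: float   # wear time required for a valid day (h/day)
    min_valid_days: int          # valid days required per participant
    invalid_days_excluded: bool  # whether invalid days were excluded from analysis


def meets_minimal_criteria(p: WearProtocol) -> bool:
    """Check the four minimal criteria: >=7 days, >=8 h/day, >=4 valid days, exclusion."""
    return (
        p.measurement_days >= 7
        and p.min_hours_valid_day >= 8
        and p.min_valid_days >= 4
        and p.invalid_days_excluded
    )


# Hypothetical protocols.
print(meets_minimal_criteria(WearProtocol(7, 10.0, 4, True)))   # True
print(meets_minimal_criteria(WearProtocol(5, 10.0, 3, True)))   # False
```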
Risk of bias
The studies included in this review recruited well-defined patients with ILD using objective measures following international guidelines. In contrast, the included studies did not adequately define and deal with confounding factors. Difficulties in recruiting enough patients with ILD may be a possible reason for this, because a large sample size is required to control confounding factors. Furthermore, most studies failed to measure PA validly and reliably. Therefore, the domains concerning confounding factors and PA measurement were at high risk of bias, which may have contributed to the heterogeneity of the PA metric values.
PA metrics
This review showed the pooled mean values of four commonly used PA metrics (steps, MVPA, TEE and ST) in patients with ILD. However, we need to be cautious when interpreting these values due to the substantial heterogeneity caused by differences in the activity monitors used, PA measurement methods and participant characteristics. For example, although most studies used validated activity monitors and PA metrics in COPD [71–74], there was no validation study of them in patients with ILD. However, we believe that these values are valid because PA metrics are broadly distributed in the general population or people with chronic diseases.
First, the pooled mean value of steps in ILD (5215 (95% CI 4640–5791) steps·day−1) was similar to that in patients with COPD (5723 (sd 3768) steps·day−1) [75] and about half that in their healthy counterparts (10 167 (95% CI 8433–11 901) steps·day−1). The mean value is slightly higher than a physical inactivity threshold (5000 steps·day−1) [76], but the 95% CI includes 5000 steps·day−1. Moreover, subgroup analyses revealed that type of activity monitor, sensor location, ILD subtype, FVC, DLCO, 6MWD and use of LTOT are associated with steps, which aligns with previous findings. Although caution is required when interpreting the values due to variations in how activity monitors count steps [77], about half or more of patients with ILD are inactive in terms of walk-related PA.
Second, the pooled mean value of MVPA in ILD was 82 (95% CI 58–106) min·day−1. This value is lower than those of healthy controls in included studies (132 and 261 min·day−1) and that of a previous study that recruited similar-aged people (156 min·day−1) [78]. Subgroup analysis showed that the mean was 152 (95% CI 123–181) min·day−1 in studies which defined MVPA as time spent in PA of ≥2.5 METs, while the mean of MVPA defined as >3.0 METs was 67 (95% CI 51–82) min·day−1. Patients with COPD spent similar times in MVPA, defined as >3.0 METs (65 (sd 11) min·day−1) [55]. These results suggest that using a consistent definition of MVPA across studies and populations helps to ensure valid comparisons.
The mean values of MVPA in patients with ILD and healthy controls were dramatically higher than the international guideline recommendation of 150 min·week−1 (approximately 21 min·day−1) [2]. Similar values were reported in other populations (e.g. COPD, healthy controls) [55]. This highlights the need for a new PA recommendation based on MVPA measured by activity monitors. Moreover, calibration studies are needed to establish ILD-specific cut points to distinguish between various PA intensity levels using indirect calorimetry for the following reasons. First, cut points are drastically different between populations (e.g. children versus adults versus elderly) [79, 80]. Second, using cut points derived from healthy people in patients with exercise intolerance (e.g. COPD, ILD) may be flawed.
Third, the pooled mean of TEE was 3574 (95% CI 1684–5464) kcal·day−1. This value was higher than that in healthy controls in included studies (2618 kcal·day−1) and older adults (2501 kcal·day−1) [81]. After omitting two studies with outlying values [21, 22], the mean decreased to 2130 (95% CI 1847–2412) kcal·day−1. A TEE of 2130 kcal·day−1 may be the more valid estimate because the mean TEE values in the omitted studies might be outliers.
Fourth, the pooled mean value of ST was 605 (95% CI 323–887) min·day−1. Subgroup analyses suggested that possible sources of heterogeneity were the differences in ST definition, sensor types and DLCO. Additionally, differences in calculation method and wear-time requirements may affect ST. For example, the means of ST in Hur et al. [8] and Alexandre et al. [31] were 349 and 536 min·day−1, respectively. Hur et al. [8] measured PA 24 h·day−1 and calculated daytime ST by excluding sleeping time. Alexandre et al. [31] measured daytime (12 h·day−1) PA and calculated daytime ST. Interestingly, similar ST was observed in healthy older people in studies that measured daytime ST [82, 83]. Thus, standardisation of definition, wear-time requirement and calculation method is required to compare studies.
Finally, the use of questionnaires is limited in ILD. The most common questionnaire was IPAQ. Concurrent validity, internal consistency and responsiveness of the IPAQ long form were acceptable, and Hur et al. [8] also estimated its minimal important difference. Thus, the IPAQ long form can be used in combination with activity monitors or as an alternative in situations where activity monitors are unavailable.
Quality of the body of evidence
The quality of a body of evidence on steps, MVPA, TEE and ST was very low due to serious inconsistency and imprecision. Their sources were not only participant characteristics but also PA data collection and processing. Thus, future studies which measure PA using a standardised method with high reporting quality will change the pooled mean values of PA metrics.
Limitations
There are limitations in this review. First, the authors did not have access to Embase, which may reduce its comprehensiveness. Although the Cochrane Handbook for Systematic Reviews of Interventions encourages researchers to search Embase if accessible, Cochrane regularly searches Embase for trial reports and includes them in CENTRAL [84]. Thus, our review included at least relevant interventional studies, contributing to the comprehensive search. Second, ILD subtypes were inconsistent among included studies and affected the pooled value of PA metrics. Third, non-English studies were excluded during the selection process, potentially causing language bias. However, language restrictions during the selection process appeared to have little impact on language bias [85]. Fourth, studies that did not report separate participant characteristics and PA metrics for ILD patients were excluded. Fifth, multiple subgroup analyses were performed to investigate the sources of heterogeneity. Multiple subgroup analyses could reduce statistical power, leading to the potential of overlooking significant differences between subgroups [86]. Finally, we used several estimated mean values of individual studies not reporting the mean and sd of PA metrics to estimate the pooled means. These limitations could lead to biased results.
Practical recommendations
1) Researchers and clinicians are encouraged to report details of PA measurements following the checklist by Montoye et al. [15] or the template presented in table S9 in supplementary material 7.
2) Unfortunately, there is no standardised method for measuring PA in patients with ILD. Following expert consensus on objectively measured PA in COPD [87, 88] may be crucial to facilitate interpretation, pooling of PA data and comparisons with COPD.
3) The same MVPA and ST definitions should be used. We propose time spent in ≥3.0 METs PA as a definition of MVPA and time spent in ≤1.5 METs as a definition of ST, following the World Health Organization guidelines [89].
4) For identifying nonwear time, using activity monitors with automated algorithms for calculating nonwear time [90, 91] is encouraged if available. Alternatively, a wear diary could be used.
5) Patient characteristics, including ILD subtype, the severity of ILD, pulmonary function, exercise capacity and the number of patients with LTOT should be reported because these variables influence PA.
6) Validation studies are necessary to ensure the accuracy of major activity monitors and PA metrics in patients with ILD. Additionally, calibration studies are required to establish an ILD-specific cut-off for counting steps and distinguishing different intensities of PA.
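Recommendation 3 can be illustrated with a minimal Python sketch that applies the proposed cut-offs (MVPA ≥3.0 METs; ST ≤1.5 METs) to minute-by-minute MET estimates. The input format and function name are assumptions; real devices export data differently, and sedentary behaviour is usually also defined by posture and waking state, which this sketch ignores. The second call shows how lowering the MVPA cut-off to 2.5 METs changes the result for the same hypothetical day.

```python
from typing import Iterable, Tuple

MVPA_MIN_METS = 3.0  # proposed MVPA definition: >= 3.0 METs
ST_MAX_METS = 1.5    # proposed ST definition: <= 1.5 METs


def daily_mvpa_and_st(
    minute_mets: Iterable[float],
    mvpa_cut: float = MVPA_MIN_METS,
    st_cut: float = ST_MAX_METS,
) -> Tuple[int, int]:
    """Return (MVPA minutes, sedentary minutes) from one MET value per minute."""
    mvpa = st = 0
    for mets in minute_mets:
        if mets >= mvpa_cut:
            mvpa += 1
        elif mets <= st_cut:
            st += 1
    return mvpa, st


# Hypothetical 15-hour waking day: 600 min at 1.2 METs, 200 at 2.0, 60 at 2.7, 40 at 3.5.
day = [1.2] * 600 + [2.0] * 200 + [2.7] * 60 + [3.5] * 40
print(daily_mvpa_and_st(day))                # (40, 600)
print(daily_mvpa_and_st(day, mvpa_cut=2.5))  # (100, 600)
```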
Conclusions
In this systematic review of 40 studies using activity monitors or questionnaires in patients with ILD, we found severe heterogeneity in the methods used for PA data collection, processing and definition of PA metrics. The heterogeneity makes it difficult to interpret and pool PA data and compare results in ILD with other diseases. Therefore, we encourage researchers and clinicians to improve the quality of PA measurements. Our recommendations could be helpful when measuring and reporting PA in patients with ILD.
Points for clinical practice and questions for future research
We propose that PA measurements in patients with ILD should be conducted using validated activity monitors and PA metrics following expert consensus on objectively measured PA in COPD. Therefore, clinicians and researchers are encouraged to report details of PA measurements based on the checklist by Montoye et al. [15] or the template shown in table S9 in supplementary material 7. In addition, validation and calibration studies are required for more accurate measurements of PA in patients with ILD.
Supplementary material
Please note: supplementary material is not edited by the Editorial Office, and is uploaded as it has been supplied by the author.
Supplementary methods: 1) PRISMA 2020 Checklist and 2) Search strategy ERR-0165-2022.SUPPLEMENT1
Tables S1-S5 ERR-0165-2022.SUPPLEMENT2
Results of meta-analyses and investigation of heterogeneity on physical activity metrics (figures S1-S34) ERR-0165-2022.SUPPLEMENT3
Results of sensitivity analysis (figures S35-S47) ERR-0165-2022.SUPPLEMENT4
Reference data of healthy controls in included studies (tables S6, S7; figures S48, S49) ERR-0165-2022.SUPPLEMENT5
Quality of a body of evidence evaluated by GRADE approach (table S8) ERR-0165-2022.SUPPLEMENT6
A template for reporting activity monitor use (table S9) ERR-0165-2022.SUPPLEMENT7
Acknowledgements
The authors would like to thank all authors of included studies for contributing to the advancement of PA measurement in ILD.
Footnotes
Provenance: Submitted article, peer reviewed.
Conflict of interest: M.A. Spruit reports grants from Lung Foundation Netherlands and Stichting Astma Bestrijding, grants and fees from Boehringer Ingelheim, AstraZeneca, Chiesi and TEVA, outside the submitted work. All grants and fees were paid to the author’s institute. All other authors have no conflict of interest to report.
- Received August 22, 2022.
- Accepted April 11, 2023.
- Copyright ©The authors 2023
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions@ersnet.org