Introduction

Chronic obstructive pulmonary disease (COPD) and lung cancer are two of the most prominent lung diseases, with high and increasing morbidity and mortality worldwide [1, 2]. In China, COPD has reached a prevalence of 8.2 % in adults over 40 years old and accounts for more than one million deaths and over five million disabilities each year, while lung cancer has been the leading cause of cancer incidence and mortality for years [1, 2]. The two diseases are closely related: COPD patients have a lung cancer incidence of 16.7 cases per 1,000 person-years, four times the incidence in the general smoking population [3]. Moreover, 40–70 % of lung cancer patients have concomitant COPD [4].

Recently, great attention has been paid to the inherent relation between COPD and lung cancer [5, 6]. It is well recognized that the two diseases share some pivotal pathologic mechanisms, such as chronic inflammation in response to extracellular stimuli [6], which suggests that they also share etiological factors. Indeed, a number of epidemiologic studies have documented environmental and genetic factors associated with each disease, some of which overlap [2, 7–9]. Tobacco smoking is the most important risk factor for both, and a general decline in the smoking rate in the USA population has been accompanied by a modest reduction in the incidence of and deaths from both COPD and lung cancer [10]. However, other factors such as prior chronic lung diseases (e.g., emphysema, chronic bronchitis) [11, 12], environmental factors such as occupational exposure to dust [13, 14], the home environment, and lifestyle [2, 15] have also been reported to be related to COPD or lung cancer risk. This leaves considerable room for prevention of the two diseases.

Revealing the shared factors of COPD and lung cancer would not only help prevent both diseases but also deepen our knowledge of their etiological link. However, current assumptions about these shared factors can only be inferred from independent studies of COPD or lung cancer that were conducted in different areas or different populations. No study has simultaneously investigated risk factors for COPD and lung cancer to identify the factors shared by both diseases. Moreover, a recent study reported a mediation effect of COPD on the association between smoking and lung cancer [16], suggesting a novel role of COPD in lung carcinogenesis: COPD may act as an intermediate phenotype that precedes malignant transformation in the lung. We have previously shown that COPD and lung cancer share some genetic susceptibility factors [9, 17]. In the current study, to reveal the shared risk factors for COPD and lung cancer in Chinese, we conducted four independent case–control studies in southern and eastern Chinese populations to test and validate associations between twenty-three environmental factors and the risk of the two diseases in a total of 1,511 COPD patients and 1,677 controls with normal lung function, and 1,559 lung cancer cases and 1,679 cancer-free controls, during 2002–2011. We then used the lung cancer case–control studies to analyze the mediation effect of COPD on associations between these shared factors and lung cancer risk.

Methods

Study subjects

Four independent case–control studies of COPD and lung cancer were conducted in southern and eastern Chinese populations during 2002–2011. The studies were approved by the institutional review boards of Guangzhou Medical University and Soochow University. Subjects with incomplete or inconsistent information on exposure variables were excluded. Briefly, the lung function of all COPD patients and controls was measured by spirometry (EasyOne Spirometer, ndd Medizintechnik AG, Switzerland). Subjects with a ratio of forced expiratory volume in one second (FEV1) to forced vital capacity (FVC) <70 % after inhalation of 400 μg salbutamol, and with at least one of the following chronic airway symptoms lasting over 2 weeks (chronic cough, dyspnea, sputum production, or wheezing), were diagnosed as COPD cases. A total of 1,025 COPD patients and 1,061 controls with normal lung function were recruited from Guangzhou city; 486 COPD patients and 616 normal controls were enrolled from Suzhou city, as described previously [9, 17]. According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) [18], there were 359 (35.0 %) cases of stage I, 356 (34.7 %) of stage II, 217 (21.2 %) of stage III, and 93 (9.1 %) of stage IV in southern Chinese, and 213 (43.8 %) cases of stage I, 206 (42.4 %) of stage II, 54 (11.1 %) of stage III, and 13 (2.7 %) of stage IV in eastern Chinese. A total of 1,056 cases with histopathologically confirmed primary lung cancer and 1,056 cancer-free controls were recruited from Guangzhou city; 503 lung cancer cases and 623 cancer-free controls were enrolled from Suzhou city, as described previously [19–22]. All controls were frequency-matched to cases on age (±5 years) and sex. Furthermore, 217 lung cancer patients had pre-existing COPD that had been physician-diagnosed with spirometry at least 1 year before the lung cancer diagnosis. Because information on pre-existing COPD in lung cancer cases was obtained by interview with a structured questionnaire, we did not have any data on their GOLD stage. Detailed information on subject recruitment is presented in the Appendix as supplementary material.
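The case definition and the stage percentages described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's actual procedure: the function names and the example spirometry values are hypothetical, and only the 70 % post-bronchodilator threshold, the symptom list, and the Guangzhou stage counts are taken from the text.

```python
# Illustrative sketch; function names and example measurements are hypothetical.
SYMPTOMS = {"chronic cough", "dyspnea", "sputum production", "wheezing"}

def is_copd_case(fev1_litres, fvc_litres, symptoms):
    """Post-bronchodilator FEV1/FVC < 0.70 plus at least one chronic airway symptom."""
    ratio = fev1_litres / fvc_litres
    return ratio < 0.70 and len(SYMPTOMS & set(symptoms)) > 0

def stage_percentages(counts):
    """Percentage of cases in each GOLD stage, rounded to one decimal place."""
    total = sum(counts.values())
    return {stage: round(100 * n / total, 1) for stage, n in counts.items()}

# Stage counts reported for the Guangzhou (southern) COPD cases.
guangzhou = {"I": 359, "II": 356, "III": 217, "IV": 93}

print(is_copd_case(1.8, 3.0, ["dyspnea"]))  # hypothetical subject, ratio 0.60
print(stage_percentages(guangzhou))
```

Recomputing the percentages from the counts in this way is a quick consistency check on the reported stage distribution.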

Data collection

After each subject gave signed informed consent, two trained technicians used a structured questionnaire to collect data on demographic characteristics and exposure variables. The same questionnaire and scales were used to collect all data in the two populations. The demographic characteristics covered four items: age, sex, body mass index (BMI: <18.0, 18.0–25.0, >25.0), and education (never, primary school, secondary school, and college or university). The exposure variables included twenty-three possible risk factors, namely pre-existing tuberculosis, pre-existing chronic bronchitis, pre-existing emphysema, pre-existing silicosis, smoking status and pack-years smoked, passive smoking and its source, drinking status, occupational exposure (to dust, arsenic, asbestos, paint, or metallic toxicants), house ventilation, kitchen ventilator, coal burning, liquefied gas burning, biomass burning, cooking times in 1 week, vegetables/fruits consumption, cured meat consumption, Chinese sauerkraut/pickles consumption, and salted fish/meat consumption. Each of these variables has been reported as a risk factor for COPD, lung cancer, or both in previous studies. The detailed definitions of the selected variables are presented in the Appendix as supplementary material. Subjects who had a pre-existing pulmonary disease at least 1 year before case diagnosis or control enrollment were coded as “Yes” if they provided reliable medical records. Individuals with at least 10 years of occupational exposure were coded as “Yes”, while the remainder were coded as “No”. All individuals were Han Chinese, and subjects with inconsistent or missing information on the above factors were excluded.

Statistical analysis

Differences in the distribution of demographics between cases and controls were analyzed using the chi-square test. Odds ratios (ORs) and 95 % confidence intervals (95 % CIs) were estimated with unconditional logistic regression models. The Breslow–Day test was used to test the homogeneity of the variables’ contributions to the risk of COPD and lung cancer. Multinomial logistic regression analysis was performed to compare the ORs of the shared factors between individuals with and without pre-existing COPD in the lung cancer studies [23]. A mediation model with the Sobel test (http://quantpsy.org/sobel/sobel.htm) was used to test the indirect effects of the shared factors on lung cancer via COPD [24–28]. Furthermore, we applied a multiplicative interaction model to evaluate possible interactions between the shared factors and COPD in affecting lung cancer risk [29]. The detailed statistical protocol for the mediation test is presented in the Appendix as supplementary material. A sensitivity analysis was performed to assess the mediation effect of COPD on the smoking/lung cancer association under an assumed 3 % measurement error rate in smoking. All tests were two-sided and were performed with SAS software (version 9.3; SAS Institute, Cary, NC). p < 0.05 was considered statistically significant.
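For readers unfamiliar with the Sobel test, the following minimal Python sketch shows the standard first-order form of the statistic, z = αβ / sqrt(β²·SEα² + α²·SEβ²). It is not the authors' SAS code or the cited online tool; the function name is hypothetical, and the input values are invented for illustration because the standard errors of the coefficients are not given in this section.

```python
import math

def sobel_test(alpha, se_alpha, beta, se_beta):
    """First-order Sobel test for the indirect effect alpha * beta.

    alpha: coefficient for exposure -> mediator, with standard error se_alpha
    beta:  coefficient for mediator -> outcome, with standard error se_beta
    Returns the z statistic and the two-sided p value under a normal approximation.
    """
    indirect = alpha * beta
    se_indirect = math.sqrt(beta ** 2 * se_alpha ** 2 + alpha ** 2 * se_beta ** 2)
    z = indirect / se_indirect
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Hypothetical inputs, not values from the study.
z, p = sobel_test(alpha=0.5, se_alpha=0.1, beta=0.4, se_beta=0.1)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

The normal approximation used here is the same one the Sobel test relies on; it can be conservative in small samples, which is less of a concern with sample sizes of the order used in this study.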

Results

As shown in Table 1, results were concordant across all case–control studies: age and sex were well matched between cases and controls (p > 0.05 for all). However, cases included more individuals with low BMI (<18.0) and less education than controls (p < 0.05 for all). These variables were therefore adjusted for in the multivariate logistic regression models to control for possible confounding of the main effects of the selected factors.

Table 1 Frequency distribution of demographics in COPD/lung cancer cases and controls

The frequency distributions of the selected factors in the southern Chinese and their associations with the risk of COPD and lung cancer are presented in Table 2. Sixteen factors were significantly associated with COPD risk: pre-existing tuberculosis, pre-existing chronic bronchitis, pre-existing emphysema, smoking (or high pack-years smoked), passive smoking (especially passive smoking from parents), occupational exposure to dust, arsenic, or metallic toxicants, house ventilation, kitchen ventilator, coal burning, liquefied gas burning, biomass burning, vegetables/fruits consumption, cured meat consumption, and Chinese sauerkraut/pickles consumption (p < 0.05 for all). Among them, pre-existing lung diseases such as chronic bronchitis (OR = 3.02, p = 5.28 × 10^−15) and emphysema (OR = 4.24, p = 2.06 × 10^−13) conferred an extremely high risk of COPD, which is expected because both conditions are generally classified as COPD when patients have irreversible airflow limitation [30]. After the prior lung diseases, occupational exposure to arsenic carried the next highest risk (OR = 2.98, p = 1.00 × 10^−4), while high pack-years smoked reached the greatest statistical significance (OR = 1.88, p = 8.63 × 10^−9). Likewise, thirteen factors conferred significantly increased risks of lung cancer (p < 0.05 for all): pre-existing tuberculosis, pre-existing chronic bronchitis, pre-existing emphysema, smoking (or high pack-years smoked), passive smoking (from parents or children), occupational exposure to dust, asbestos, or metallic toxicants, poor house ventilation, no kitchen ventilator, biomass burning, cured meat consumption, and infrequent vegetables/fruits consumption. Among these factors, occupational exposure to metallic toxicants carried the highest risk (OR = 2.85, p = 9.00 × 10^−4), and high pack-years smoked reached the greatest statistical significance (OR = 2.02, p = 8.63 × 10^−9).
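As a reading aid for the ORs above, the sketch below computes a crude (unadjusted) OR and a Woolf-type 95 % CI from a 2×2 exposure table. It is a simplified illustration, not the study's method: the reported ORs come from logistic regression models adjusted for covariates, the function name is hypothetical, and the counts are invented rather than taken from Table 2.

```python
import math

def crude_odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Crude odds ratio with a Woolf (log-based) 95 % confidence interval."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not data from the study.
or_, lo, hi = crude_odds_ratio(120, 80, 90, 110)
print(f"OR = {or_:.2f} (95 % CI {lo:.2f} to {hi:.2f})")
```

An interval that excludes 1 corresponds to p < 0.05 for the crude association; adjusted estimates such as those in Table 2 can differ from the crude value when covariates like BMI and education are unevenly distributed.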

Table 2 Frequency distributions and ORs of physical and environmental factors on risk of COPD and lung cancer in southern and eastern Chinese

Findings in the eastern Chinese were generally consistent with the above results, as listed in Table 2. The sixteen factors associated with COPD risk in the southern Chinese were confirmed as risk factors for COPD, except for occupational exposure to arsenic, kitchen ventilator, coal burning, and Chinese sauerkraut/pickles consumption. Likewise, the risk factors for lung cancer in the southern Chinese were confirmed, except for pre-existing chronic bronchitis, occupational exposure to dust, and kitchen ventilator.

Factors that were significantly associated with both COPD and lung cancer risk in both the southern and eastern Chinese were considered shared risk factors for COPD and lung cancer, as shown in Table 3. The risk of COPD and lung cancer was increased in individuals with pre-existing tuberculosis, pre-existing emphysema, smoking or high pack-years smoked, passive smoking, occupational exposure to metallic toxicants, poor house ventilation, biomass burning, cured meat consumption, and infrequent vegetables/fruits consumption (p < 0.05 for all). The homogeneity test further indicated that the differences in the frequency distributions of these shared factors between cases and controls were consistent between the COPD groups and the lung cancer groups (Breslow–Day test: p > 0.05 for all), except for pre-existing emphysema (p = 6.48 × 10^−5). In addition, given the dramatic sex difference in active tobacco smoking throughout China, we specifically tested the effect of smoking on COPD and lung cancer risk stratified by sex. Although smoking was significantly associated with both COPD and lung cancer risk in males but not in females, the differences between the sex-specific ORs were not significant in any of the case–control studies (Breslow–Day test: p > 0.05 for all).

Table 3 The frequency distributions and ORs of shared risk factors for COPD and lung cancer in the pooled population

Because almost all of these shared factors conferred consistent risks of COPD and lung cancer, we further performed multinomial logistic regression analysis, using the cancer-free controls as the reference group, to compare the effects of these factors on lung cancer development according to COPD status (Table 4). Comparison of the ORs for the eight shared factors revealed that smoking (p = 4.53 × 10^−6), high pack-years smoked (p = 0.001 for <20; p = 2.00 × 10^−4 for ≥20), and biomass burning (p = 8.42 × 10^−5) conferred a significantly higher risk of lung cancer in individuals with pre-existing COPD than in those without, while the other factors did not (p > 0.05 for all).

Table 4 Comparison of the ORs for shared factors associated with COPD and lung cancer risk in lung cancer groups with and without pre-existing COPD by the multinomial logistic regression analysis

We also fitted the mediation model to assess the mediation effect of COPD on associations between these shared factors and lung cancer risk. Only smoking showed a borderline significant interaction with pre-existing COPD in increasing lung cancer risk (p = 0.062); therefore, this interaction was included in the mediation model. As shown in Fig. 1, α was the comparable regression coefficient for the association between the shared risk factor and COPD; β was the comparable regression coefficient for the association between COPD and lung cancer; τ′ was the comparable regression coefficient for the association between the shared risk factor and lung cancer; and θ was the comparable regression coefficient for the interaction between the risk factor and pre-existing COPD on lung cancer risk. We found that COPD acted as a mediator of the associations of smoking (α = 0.168, β = 0.068, τ′ = 0.111, θ = 0.115), high pack-years smoked (α = 0.099, β = 0.065, τ′ = 0.058), passive smoking (α = 0.031, β = 0.065, τ′ = 0.051), and biomass burning (α = 0.078, β = 0.065, τ′ = 0.079) with lung cancer risk. The indirect effect of COPD was statistically significant according to the Sobel test (p values were 0.005 for smoking, 0.006 for pack-years smoked, 0.041 for passive smoking, and 0.039 for biomass burning), and COPD explained about 12.0, 9.9, 3.8, and 6.1 %, respectively, of the effects of these factors on cancer risk. COPD also accounted for 13.0 % of the effect of pre-existing tuberculosis on lung cancer risk (p = 0.005). For the other factors, although the mediation model suggested that COPD might explain about 2.6 % of the effect of house ventilation, 9.8 % of the effect of vegetables/fruits consumption, and 0.06 % of the effect of cured meat consumption on cancer risk, none of these mediation effects was statistically significant (p values were 0.273, 0.392, and 0.984, respectively). In addition, we performed a sensitivity analysis of the mediation effect of COPD on the smoking/lung cancer association under an assumed 3 % measurement error rate in smoking. The mediation effects of COPD on smoking-related lung cancer were significant (all p < 0.05) in all three scenarios (minimum, current, and maximum smoking rate), and the mediated proportions were approximately the same.
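The mediated proportions quoted above can be approximately recovered from the reported coefficients. The sketch below assumes the common product-of-coefficients definition, proportion = αβ / (αβ + τ′); this formula is our assumption about how the percentages were derived (the detailed protocol is in the Appendix), and the function name is hypothetical. Smoking is left out because its model also includes the interaction term θ, whose handling is not described in this section.

```python
def proportion_mediated(alpha, beta, tau_prime):
    """Share of the total effect carried by the indirect path alpha * beta.

    Assumes total effect = indirect effect (alpha * beta) + direct effect (tau_prime),
    with no exposure-mediator interaction.
    """
    indirect = alpha * beta
    return indirect / (indirect + tau_prime)

# Coefficients (alpha, beta, tau') as reported in the Results text.
reported = {
    "pack-years smoked": (0.099, 0.065, 0.058),
    "passive smoking": (0.031, 0.065, 0.051),
    "biomass burning": (0.078, 0.065, 0.079),
}

for factor, coefficients in reported.items():
    share = proportion_mediated(*coefficients)
    print(f"{factor}: {100 * share:.1f} % mediated by COPD")
```

Under this assumption the computed shares fall within a few tenths of a percentage point of the reported 9.9, 3.8, and 6.1 %; small differences are plausibly due to rounding of the published coefficients.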

Fig. 1

Path models for the mediation effect of COPD on associations between the shared factors and lung cancer risk. a Smoking. b Pack-years smoked. c Passive smoking. d Biomass burning. The indirect effects of COPD were statistically significant according to the Sobel test: COPD explained about 12.0, 9.9, 3.8, and 6.1 %, respectively, of the effects of these factors (p values were 0.005 for smoking, 0.006 for pack-years smoked, 0.041 for passive smoking, and 0.039 for biomass burning)

Discussion

We conducted four independent case–control studies of COPD and lung cancer in southern and eastern Chinese and identified eight factors that consistently increased the risk of both diseases: pre-existing tuberculosis, smoking status (or high pack-years smoked), passive smoking, occupational exposure to metallic toxicants, poor house ventilation, biomass burning, cured meat consumption, and infrequent vegetables/fruits consumption. Smoking (or high pack-years smoked) and biomass burning conferred a significantly higher lung cancer risk in individuals with pre-existing COPD than in those without. Moreover, COPD acted as a mediator of the associations of smoking, passive smoking, and biomass burning with lung cancer risk.

All of the shared factors identified in the current study have been proposed in previous studies to be associated with COPD or lung cancer risk. Pre-existing lung diseases such as tuberculosis [31] and emphysema [30], smoking and passive smoking [2], occupational exposure to metallic toxicants [32], house ventilation [33], biomass burning [2], vegetables/fruits consumption [34], and cured meat consumption [35] have been reported as risk factors for COPD. Likewise, these factors are also related to lung cancer [11, 36–38]. Our study was unique in that we compared the effects of these factors on COPD and lung cancer directly, which was not possible in previous epidemiological studies that examined each disease separately. We found that, except for pre-existing emphysema, the associations of these shared factors with COPD and lung cancer risk were consistent. Pre-existing emphysema conferred a significantly higher risk of COPD than of lung cancer, which is expected because emphysema is generally classified as COPD when patients have irreversible airflow limitation [30]. Moreover, we found no significant difference between males and females in the effect of smoking on either COPD or lung cancer, although most smokers were male. This was consistent with a recently published cohort study [39].

Many studies have reviewed the shared pathological mechanisms of COPD and lung cancer, such as airway inflammation, DNA damage, and epithelial-to-mesenchymal transition (EMT) [6, 40]. The shared factors above are all well-established inducers of lung lesions through chronic infection, DNA damage, or functional changes in various genes [41], and they in turn influence the risk of both diseases. For instance, emphysema is accompanied by over-activated inflammation [42]; smoking can induce cell proliferation, apoptosis resistance, inflammation, and DNA alterations [43]; and metallic toxicants can trigger EMT in the lung [44]. Taken together, these shared factors are all plausible causes of COPD and lung cancer, and they are therefore potentially controllable targets for the prevention of both diseases.

Given that COPD is a risk factor for lung cancer, we used the lung cancer case–control studies to examine the role of COPD in the associations between the shared factors and lung cancer risk. Smokers, especially those who had smoked 20 or more pack-years, and biomass users had a significantly higher risk of lung cancer if they had pre-existing COPD than if they did not, implying that COPD may modulate the effect of smoking and biomass burning on lung cancer risk. Likewise, the mediation model revealed that COPD had a significant indirect effect on the associations of smoking, passive smoking, and biomass burning with lung cancer risk. COPD explained about 12.0 % of the effect of smoking, 9.9 % of the effect of high pack-years smoked, 3.8 % of the effect of passive smoking, and 6.1 % of the effect of biomass burning on lung cancer risk. These results suggest that smokers, particularly heavy smokers, and individuals who use biomass as fuel are more likely to develop COPD first and lung cancer subsequently. The indirect effect of COPD on the smoking association is much lower than the 32.1 % reported in a previous study [16]. One possible reason is that the authors of that study used physician-diagnosed emphysema as COPD, whereas we used a much more rigorous criterion based on spirometry. In addition, their estimate of the association between smoking and COPD was likely biased, because they used standard logistic regression to estimate the regression coefficient of the smoking-COPD association, which would bias the estimate of the indirect effect of COPD [27]. Moreover, we also observed a mediation effect of COPD on the association between pre-existing tuberculosis and lung cancer risk. However, tuberculosis has been reported to increase the risk of COPD [45], and COPD to increase the risk of tuberculosis [46], so it was difficult for us to determine the causal sequence between COPD and tuberculosis. Thus, we could not conclude that COPD mediates the association between pre-existing tuberculosis and lung cancer risk. Overall, we support conducting COPD screening for the prevention of COPD and lung cancer wherever feasible, and initiating it as early as possible in individuals with these high-risk exposures, so that they can take steps, such as quitting smoking, to reduce their predisposition to lung cancer.

Our study has several strengths. First, we discovered and validated the associations between the candidate factors and the risk of the two diseases in both southern and eastern Chinese populations. Second, the subjects were selected according to strict standards: all COPD cases and controls were assessed by spirometry, and all lung cancer patients were histopathologically confirmed. Third, the sample size was relatively large. Moreover, under an assumed 3 % measurement error rate in smoking, the sensitivity analysis still supported the mediating role of COPD in smoking-related lung cancer. Finally, to our knowledge, this is the first study to apply the standardization approach proposed by MacKinnon and Dwyer [47] to a scenario with an interaction. The approach appears reasonable, as supported by a recent study [48]. Nevertheless, limitations such as selection bias and information bias cannot be ruled out, because the study used a case–control design restricted to the Han Chinese population. In addition, the mediation analysis relied on the assumptions of no unmeasured confounding, namely no exposure–outcome confounding, no mediator–outcome confounding, no exposure–mediator confounding, and no mediator–outcome confounders affected by the exposure. Thus, omitted confounders might have led to biased estimates of the mediation effect of COPD [49]. Furthermore, because we counted only individuals diagnosed by spirometry as having pre-existing COPD, the frequency of COPD in the current study is probably somewhat lower than in reality, which may have led to underestimation of the mediation effect of COPD on lung cancer development.

In conclusion, in the current case–control studies, we identified eight factors that consistently increase the risks of both COPD and lung cancer. Among them, the effects of smoking (or pack-years smoked) and biomass burning on lung cancer risk are modulated by COPD, and COPD acts as a mediating phenotype in the relationships of smoking, passive smoking, and biomass burning with lung cancer development. Our data reveal a shared spectrum of etiological factors for COPD and lung cancer in Chinese, which should be taken into consideration in the prevention of both diseases.