Abstract
COPD is a major cause of morbidity and mortality globally. While the significance of environmental exposures in disease pathogenesis is well established, the functional contribution of genetic factors has only in recent years drawn attention. Notably, many genes associated with COPD risk are also linked with lung function. Because reduced lung function precedes COPD onset, this association is consistent with the possibility that derangements leading to COPD could arise during lung development. In this review, we summarise the role of leading genes (HHIP, FAM13A, DSP, AGER and TGFB2) identified by genome-wide association studies in lung development and COPD. Because many COPD genome-wide association study genes are enriched in lung epithelial cells, we focus on the role of these genes in the lung epithelium in development, homeostasis and injury.
Shareable abstract
This review provides a summary of the involvement of key genes uncovered in COPD GWAS within the lung epithelium, in the context of development, homeostasis and injury. https://bit.ly/3UWdgsL
Introduction
Chronic obstructive pulmonary disease (COPD) is a progressive, debilitating inflammatory lung disease affecting 65 million people worldwide. Pathologically, COPD is characterised by emphysema together with airway inflammation and remodelling that collectively lead to deficits in airway and alveolar function and associated shortness of breath, chronic cough, hypoxaemia, susceptibility to infection and, in some cases, death. Presently, there are no treatments to halt the progression of COPD pathogenesis. While an association with environmental exposures, most notably cigarette smoke, is well established, only a proportion (25%) of long-term smokers develop COPD [1], suggesting a significant contribution of genetic factors to disease susceptibility [2].
Initial evidence supporting a genetic basis for COPD came from familial clustering and twin studies, which estimated the heritability of COPD to be 60% [3, 4]. The first specific genetic cause for COPD was identified as α1-antitrypsin deficiency, caused by a point mutation in the SERPINA1 gene [5]. Rare variant studies in families have identified mutations in TERT1 and PTPN6 in COPD patients [6, 7]. In the last decade, genome-wide association studies (GWAS) have further extended our understanding of the genetics of COPD. These analyses have highlighted a range of single nucleotide variants (SNVs) identifying regions of interest close to or within genes of interest. While GWAS identify a region rather than a gene, gene expression and other analyses [8, 9] have implicated specific genes likely associated with GWAS variants, including HHIP, FAM13A, DSP, AGER and TGFB2 [8, 10–15].
Multiple GWAS have identified overlapping risk alleles associated with both COPD and low lung function [12, 14–16]. A genetic overlap between lung function and COPD is logical given that reduced function as measured by spirometry is one factor that defines the disease. Recent studies demonstrate that in a large proportion of cases, reduced spirometric lung function in early adulthood precedes COPD development [17–20]. In addition, GWAS of both COPD and lung function have high levels of statistical enrichment in regulatory regions from fetal versus adult lung, and pathway and gene set analyses consistently identify lung development processes [8, 9]. This suggests that events in early life, including aberrant lung development, may contribute to COPD pathogenesis [21].
The respiratory tree is lined by airway and alveolar epithelial cells (figure 1). The conducting airways deliver air to the alveoli, where the close connection between alveolar epithelial cells and capillaries allows gas exchange. The airway epithelium is mainly composed of ciliated cells (which facilitate mucociliary clearance), secretory club and goblet cells (which produce mucus and antimicrobials) and basal cells (which serve as stem cells to replenish ciliated and secretory cells) [22], accompanied by relatively rare cells such as pulmonary neuroendocrine cells, ionocytes and brush cells [23, 24]. The alveoli are lined by type 1 (AT1) and type 2 (AT2) alveolar epithelial cells. AT1s are exquisitely thin to facilitate passive gas exchange [25, 26], whereas AT2s produce surfactant to alleviate surface tension and function as facultative progenitors of the alveolus, capable of both self-renewal and transdifferentiation to AT1 cells following injury [27, 28]. During development, the lung epithelium arises from the endodermal germ layer and is first specified during the embryonic phase at embryonic day (E) 9.0 in mice [29] and 4 weeks post conception in humans [30]. The subsequent pseudoglandular and canalicular stages of lung development enable extensive lung growth through branching morphogenesis and differentiation of airway epithelial cells [31]. During the saccular stage, branching morphogenesis ceases as alveolar epithelial cells develop and begin to secrete surfactant [32]. The final alveolar stage, which in humans but not mice commences prior to birth and continues postnatally, allows septa to divide the saccules into alveoli to increase the gas exchange surface area.
In this review, we evaluate the body of literature delineating the biological roles played by leading GWAS candidate genes (HHIP, FAM13A, DSP, AGER and TGFB2) (figure 1) chosen for 1) genome-wide significance in both lung function and COPD, 2) strong evidence (e.g. through gene expression) for a link between the GWAS region and the target gene and 3) relevance for developmental biology. In particular, we focus on how these genes regulate lung epithelial development and response to injury, processes that likely share common mechanistic features [33]. Moreover, because COPD GWAS gene expression is enriched in lung epithelial cells [34], we explore the possible role of the epithelium in disease inception and/or progression. We also highlight new avenues of research to further extend understanding of COPD inception and progression.
HHIP
Hedgehog interacting protein (HHIP) is a key regulatory member of the hedgehog (Hh) family. GWAS of lung function or COPD have identified multiple SNVs in linkage disequilibrium on chromosome 4q31 (summarised in table 1) [11, 12, 14, 15, 35–44]. These SNVs are not located within a gene body, but are clustered upstream of the HHIP gene in what is likely an enhancer region [45]. Both HHIP mRNA and protein are significantly decreased in the lungs of COPD patients compared with healthy controls [45]. SNVs associated with COPD and lung function lie within a genomic region located ∼85 kb upstream of HHIP and have been found to alter HHIP expression, with COPD risk alleles associated with reduced expression of HHIP [45, 46, 58]. This enhancer region interacts with the HHIP promoter through a chromatin loop [45]. In certain cancers, including lung adenocarcinoma, HHIP is epigenetically silenced through promoter hypermethylation [59–62].
Sonic hedgehog (SHH), Indian hedgehog (IHH) and desert hedgehog (DHH) make up the mammalian Hh family. Together, they coordinate crucial developmental processes, including left–right symmetry, neural tube formation, branching morphogenesis and limb patterning. SHH is the predominant Hh family member expressed in the lung epithelium, where it determines branching morphogenesis [63–65]. In general, Hh signalling is absent in adult tissues, except in adult stem cells [66] or in cancerous cells [67].
HHIP functions as a negative regulator of Hh signalling in vertebrates [68]. HHIP is expressed either on the cell surface [69] or in a soluble form, where it acts as a decoy receptor [70, 71]. In mice, overexpression of Hhip downregulates Hh signalling to cause severe skeletal defects [68], while Hhip deficiency results in severe developmental phenotypes affecting the lungs, pancreas and skeleton [72, 73]. Hhip−/− mice, which die shortly after birth due to respiratory failure [72], are born with one left lobe and one right lobe (instead of the usual five lobes), and display defects in branching morphogenesis and reduced airspace late in gestation [72], strongly implying that Hhip is critical for lung development in mice. In humans, HHIP expression in the lung steadily increases with gestational age [74]; however, how this contributes to development of the human lung epithelium is unknown.
In adult human lungs, HHIP is expressed mainly by AT2 cells [34, 75, 76]. By contrast, in the fetal human and mouse lung, Hhip is expressed in stromal cells, in particular myofibroblasts (figure 1) [76, 77]. These differences in expression patterns could result in biologically significant and distinct hedgehog signalling phenotypes in the adult lung between mice and humans. For instance, in the mouse lung the SHH signal originates from epithelial cells and is received by mesenchymal cells, a subset of which express HHIP [78, 79]. It is possible that the Hh “sending” and “receiving” cells are different in adult humans, where HHIP is expressed by a subpopulation of AT2 cells (AT2B cells) [34]. Whether there is cross-talk between the epithelium and mesenchyme in human lungs remains to be seen.
A potential biological role for HHIP in COPD has been the topic of extensive investigation. HHIP expression in lung tissue is significantly reduced in patients with COPD [45]. HHIP knock-down in BEAS2B cells (immortalised human airway epithelial cells) upregulates genes associated with the extracellular matrix and proliferation [80]. Hhip haplo-insufficient mice exposed to cigarette smoke develop alveolar simplification, reminiscent of emphysema in COPD patients [81]. Even in the absence of cigarette smoke, these mice develop an emphysema-like phenotype with age [82]. In mice, it is likely that the absence of Hhip leads to increased T-cell-driven inflammation and oxidative stress [81, 82]. Given that ectopic Hh expression by fibroblasts in the distal lung leads to airspace enlargement [78], these effects could be directly dependent on a lack of control of canonical Hh signalling within the lung. In addition, as mentioned above, the cellular distribution of HHIP differs between mice and humans; abundant expression of HHIP in human AT2 cells might dictate alternative outcomes in response to injury in humans. We recently applied induced pluripotent stem cell-derived AT2 (iAT2) cells to interrogate the function of COPD GWAS genes, and found that HHIP expression regulated SFTPC expression in iAT2 cells but did not alter other aspects of AT2 cells (such as proliferation or differentiation capacity) [83]. Future studies, potentially employing iAT2 cells or precision-cut lung slices, elucidating responses to injury (e.g. cigarette smoke exposure) or cross-talk with the mesenchyme will extend our understanding of HHIP in the adult human lung.
FAM13A
SNVs in the family with sequence similarity 13 member A (FAM13A) gene were first identified by Cho et al. [10], who reported SNVs in linkage disequilibrium at 4q22.1 in an intronic region of the FAM13A gene in COPD. This finding has since been replicated in other GWAS of COPD [10, 12, 15, 36, 37, 39, 46, 49, 50] and lung function [38, 40, 43] (summarised in table 1). FAM13A is broadly expressed in both human and mouse lung epithelial cell types, including AT2 cells, ciliated cells and club cells [75, 84], and is expressed at lower levels in lung endothelial cells and fibroblasts (figure 1). Expression of FAM13A steadily increases in the human fetal lung across development [74]. Fam13a-deficient mice, generated through deletion of exon 5, are viable at birth and developmentally normal, suggesting that in mice Fam13a is dispensable for normal lung development [84, 85].
The human FAM13A gene encodes a 117 kDa protein (ENST00000264344) that contains a Rho GTPase activating protein (GAP) domain, suggesting it could modulate cytoskeletal dynamics through Rho GTPases. In the absence of FAM13A, primary human airway epithelial cells and A549 cells (a lung carcinoma cell line) exhibit increased RhoA activity and induction of F-actin stress fibres in vitro [86], consistent with altered Rho GTPase activity. Interestingly, deletion of FAM13A also reduces E-cadherin and increases vimentin expression [86]. FAM13A modulation of epithelial-to-mesenchymal transition (EMT) could potentially occur through Rho GTPase modulation, which is known to mediate EMT [87]. It remains unknown whether FAM13A modulates Rho activity in humans in vivo or in other lung epithelial cell subtypes, such as AT2 cells, where FAM13A is highly expressed.
In addition to the full-length FAM13A gene, an alternative transcriptional start site encodes smaller isoform proteins, ranging from 77 to 80 kDa. The major transcript (ENST00000395002) using this alternative transcriptional start site corresponds with the only Fam13a isoform in mice (ENSMUST00000089860). It is important to note this distinction because the majority of studies to date have focused on this smaller isoform (in mouse studies) or do not distinguish between the isoforms (by using small interfering RNA approaches that target all transcripts). Because only the full-length transcript encodes a Rho GAP domain, it will be important for future studies to delineate the biological role of multiple FAM13A isoforms.
FAM13A represses Wnt/β-catenin signalling by interacting with protein phosphatase 2A (PP2A), thus promoting the degradation of β-catenin via glycogen synthase kinase-3β (GSK-3β) [84]. Furthermore, Akt phosphorylates FAM13A at serine-322 to promote its degradation [88]. To counter this, PP2A dephosphorylates this residue [85] and together PP2A and FAM13A interact with GSK-3β. Overexpression of FAM13A in Xenopus embryos or A549 cells leads to stabilisation of β-catenin by reducing Axin, a component of the β-catenin destruction complex [85]. Conversely, it has been demonstrated that FAM13A overexpression in immortalised 16HBE cells or 3T3-L1 preadipocytes reduces overall β-catenin expression by promoting the phosphorylation and degradation of β-catenin [84, 89]. These seemingly discordant data may stem from different experimental setups, i.e. the cell lines used, or may reflect actual differences in the interaction of FAM13A and β-catenin in distinct lung cell types. It is well known that Wnt/β-catenin signalling is vital in lung development; it controls the emergence of lung progenitors in the mouse foregut [90, 91] and regulates specification of proximal and distal epithelial fates [92, 93], and inappropriate hyperactive Wnt drives non-lung lineages in the developing lung [94]. As such, the mechanisms through which FAM13A regulates β-catenin could have significance in human lung epithelial development and/or regeneration.
Gene expression studies suggest that the COPD GWAS risk allele leads to increased FAM13A expression in the lung [46, 49, 95]. Moreover, in experimental animal models, Fam13a-deficient mice are protected from cigarette smoke-induced alveolar simplification [84]. Mechanistically, the absence of Fam13a in mice is protective through increased β-catenin signalling and subsequent increased cellular proliferation [84] and decreased fatty acid oxidation and associated cellular stress [96], suggesting that elevated FAM13A expression in COPD patients might impede lung epithelial repair in response to injury. To explore the function of FAM13A in human AT2 cells, we recently applied CRISPR interference to knock down full-length FAM13A in iAT2 cells [83]. Surprisingly, in contrast to the in vivo mouse studies, loss of FAM13A slowed iAT2 proliferation [83], potentially on the basis of differences in FAM13A isoforms expressed in mice versus humans. Further work is thus needed to differentiate the functions of the different FAM13A isoforms in COPD pathogenesis in humans.
In addition to genetic factors, conditions in the COPD lung may also impact FAM13A expression. Hypoxia and inflammation modulate FAM13A expression [86, 97–99], although it remains to be seen how these environmental effects may affect FAM13A function in COPD. Finally, in addition to COPD, variants in FAM13A have also been associated with asthma and idiopathic pulmonary fibrosis (IPF) [38, 100], and elevated FAM13A expression has been observed in patients with severe cystic fibrosis [86] and in non-small cell lung cancer [99], suggesting that FAM13A may contribute to the pathogenesis of other chronic lung diseases.
DSP
Desmoplakin (DSP) is an integral component of desmosomes, cell–cell junctions that provide mechanical stability and that are involved in cellular proliferation, differentiation, migration and apoptosis [101]. In recent GWAS, SNVs in chromosome 6p24 in linkage disequilibrium have been associated with COPD and lung function [12, 13, 39, 52, 53]. These SNVs lie in coding or intronic regions of the DSP gene. DSP is expressed by epithelial cells that experience shear stress, including airway and alveolar epithelial cells [102], interacting with keratin intermediate filaments to maintain mechanical integrity [103, 104]. Expression quantitative trait loci (eQTL) analyses indicate that rs2076295 changes gene expression in the lung but not in other tissues [105]: the major allele increases DSP expression and is associated with risk for COPD, whereas the minor allele reduces expression and increases risk for IPF [8, 53, 106]. Interestingly, evidence strongly suggests this locus specifically modulates expression of DSP in lung epithelial cells [12, 107, 108], potentially functioning as an enhancer.
DSP plays an important role in development. Dsp−/− mouse embryos implant but do not survive beyond E6.5 due to the dissociation of extraembryonic tissues that rely on desmosomes [109]. Using tetraploid rescue to generate wild-type extraembryonic tissue, the same authors found that Dsp−/− embryos could survive to E12.5 but had severe developmental defects, including impaired skin, heart and brain formation [110]. While no defects in lung epithelial development were appreciated, they may not have been detectable so soon after lung specification at E9.0. In humans, mutations in DSP have been associated with Mendelian disorders such as lethal acantholytic epidermolysis bullosa [111], palmoplantar keratoderma [112], left ventricular cardiomyopathy [112], heart failure in childhood [113] and familial arrhythmogenic right ventricular dysplasia [114]. Most of these mutations truncate the C-terminus of the DSP protein, which impairs interactions with intermediate filaments. Collectively, these data indicate that DSP is crucial for development in both mice and humans.
Limited literature has explored the role of DSP in the pathobiology of COPD. Owing to lethality among Dsp−/− mice, it has not been straightforward to investigate the effect of cigarette smoke or ageing in the global absence of DSP. To address this question, we recently characterised the functional role of DSP in the lung using a combination of human iAT2 cells and mice lacking Dsp in lung epithelium [83]. DSP expression modulates the formation of desmosomes, as well as tight junctions, adherens junctions and cytoskeletal organisation in AT2 cells [83]. Intriguingly, reduction or ablation of DSP expression increased AT2 proliferation through extracellular signal-regulated kinase (ERK) signalling, both in iAT2 cells and in vivo [83]. As a consequence, DSP expression levels modulated responses to injury via cigarette smoke exposure and to treatment with transforming growth factor-β (TGF-β) [83].
Mounting evidence in non-lung organ systems suggests that DSP modulates β-catenin signalling. For example, DSP knock-down was found to induce nuclear localisation of plakoglobin (γ-catenin) with an associated decrease in β-catenin binding to transcription factors [115]. In a separate study, DSP knock-down reduced plakoglobin expression and increased β-catenin signalling [116], consistent with the possibility that DSP modulation of β-catenin may be cell type or context specific. This relationship is yet to be established in the lung, especially following injury such as cigarette smoke exposure. Moreover, DSP has been reported to bind to telomeres and maintain their length, while its absence led to shortened telomeres and DNA damage [117], reminiscent of the shortened telomeres observed in COPD [118].
Beyond COPD, DSP has been implicated in other chronic lung diseases. SNVs in DSP, specifically rs2076295, have been reproducibly associated with IPF [8, 12, 106, 107]. While overall DSP expression is substantially elevated in IPF [106, 107], likely due to increased matrix stiffness or TGF-β signalling in IPF lungs as shown in vitro [119, 120], lower DSP expression was found in IPF AT2 cells [108], consistent with its association with the IPF G risk allele at rs2076295. It is interesting that differing expression of the same gene in the lung confers risk in opposite directions for two distinct chronic lung diseases. For instance, heightened expression of DSP increases risk for COPD, potentially through impairment of AT2 cell proliferation in response to injury; however, decreased DSP expression promotes AT2 migration and response to TGF-β [83] and increases risk for IPF.
AGER
Receptor for advanced glycation end products (RAGE, encoded by AGER) [121] was initially described for its involvement in inflammation, diabetes and atherosclerosis. RAGE is expressed as a membrane-bound receptor, but can be shed to a soluble (sRAGE) form [122] that acts as a decoy receptor to inhibit activation of RAGE. SNVs in AGER were initially discovered by GWAS of lung function in the general population, which identified SNVs on chromosome 6p21 in AGER [38, 41]. These findings have been replicated in other GWAS of lung function [16, 43]. In addition, SNVs in AGER are reproducibly associated with COPD and emphysema [8, 12–14, 44, 54]. While most GWAS variants likely act through regulatory regions, the main AGER association is a nonsynonymous variant, rs2070600, that leads to a glycine to serine change (Gly82Ser). This change increases the glycation rate and ligand-binding capacity of RAGE [123], with the A allele associated with reduced risk of COPD yet paradoxically lower sRAGE levels in blood, and sRAGE and splicing in sputum [55].
RAGE was discovered as a receptor for advanced glycation end products; however, it is now appreciated that it can be activated by a range of ligands, including S100 proteins, high mobility group box 1 (HMGB1) and nucleic acids. It is likely that specific ligands activate preferred signalling cascades, although evidence for this specificity is currently lacking. In both the human and mouse lung, RAGE is expressed predominantly by AT1 cells on the basolateral surface [124]. Indeed, the abundant expression in AT1 cells has been exploited to study AT1 cells using lineage tracing mouse studies [125]. RAGE is also expressed in various immune cells, including mononuclear cells and macrophages [126, 127], and is expressed at low levels in AT2 cells [128].
RAGE is abundantly expressed in many organs during development but is mostly downregulated by adulthood except in the lung [129]. In the mouse lung, RAGE expression can be detected by the late canalicular stage (E17.5), is substantially upregulated in AT1 cells as the lung enters alveologenesis (postnatal day 4) and is more highly expressed in the adult lung [130, 131]. To understand the role of RAGE during lung development, Reynolds et al. [132] engineered an inducible transgenic mouse overexpressing RAGE. These studies found that overexpression of RAGE throughout development increased mortality following birth, likely due to respiratory failure. When assessed at E18.5, RAGE overexpression led to large vacuous areas in the lungs, decreased absolute numbers of epithelial cells (assessed by NK2 homeobox 1 staining), increased pro-surfactant protein C (SPC) staining within individual AT2 cells and fewer total AT1 cells [132], suggesting that RAGE critically regulates lung epithelial fate. Similar phenotypes are observed when human RAGE is overexpressed in the developing mouse lung [133] or when RAGE overexpression is targeted specifically to AT2 cells during development, which leads to decreased collagen production and weakened basement membrane [134]. Interestingly, while RAGE overexpression during development is detrimental, Ager−/− mice have no discernible developmental lung malformations, suggesting that there are redundant receptors that can compensate for the absence of RAGE during lung development.
Despite serving as a key lineage marker of AT1 cells [125], our understanding of homeostatic RAGE function in this cell type is limited. In vitro, the absence of RAGE impairs transdifferentiation of AT1 cells from AT2 cells [135], suggesting that RAGE may control this important repair mechanism [136]; however, further in vivo work is required to confirm this relationship. Experiments using HEK293 cells demonstrated that RAGE promotes adherence of cells to collagen and cell spreading [137], which hypothetically suggests that RAGE may mediate AT1 cell attachment and flattening. In support of this function, increased albumin is found in the bronchoalveolar lavage fluid of Ager−/− mice, and primary murine alveolar epithelial cells isolated from Ager−/− mice exhibit impaired barrier formation [135]. Finally, acute lung injury leads to increased release of RAGE into the bronchoalveolar lavage fluid [138], whereas Ager−/− mice are protected from alveolar damage and inflammation induced by hyperoxia [139]. Together, these findings suggest that while RAGE may be required for AT1 attachment, it likewise contributes to inflammation and damage. These studies also highlight the requirement for improved in vitro systems, especially of human AT1 cells, to understand the cell intrinsic role RAGE plays in these cells.
Studies of injury mediated by cigarette smoke constituents have identified upregulation of RAGE expression and RAGE ligands (HMGB1 and S100A12) [131] in cells cultured in vitro in the presence of cigarette smoke extract. In vivo, Ager−/− mice exposed to cigarette smoke are protected from developing emphysema [140], likely due to impaired recruitment and activation of inflammatory cells [135, 140, 141], although it is plausible that the absence of RAGE may also alter the AT1 response to cigarette smoke. Among patients with COPD, RAGE expression is increased in alveolar epithelial cells [142] while systemic sRAGE is reduced and thus can be used as a biomarker of emphysema [54, 143–145]. Interestingly, there are correlations between COPD and RAGE ligands (e.g. HMGB1 [146]), suggesting that in COPD increased membrane-bound RAGE, and expression of ligands, drives the effects of RAGE.
RAGE has also been implicated in other chronic lung diseases and lung injury. Postnatal exposure to hyperoxia significantly reduces sRAGE expression [130, 147], which perhaps contributes to the excessive inflammation in these models. In models of pulmonary fibrosis, the absence of RAGE is protective [148, 149]; paradoxically, however, RAGE expression is downregulated in whole lung homogenates and isolated AT2 cells in patients with IPF [150]. Moreover, in mouse models of asthma, Ager−/− mice are protected from developing allergic asthma [151, 152] but develop virally triggered paucigranulocytic asthma [153]. Given the expression of RAGE on both AT1 cells and immune cells, it will be important for future studies to delineate the relative contribution of each compartment to fully elucidate the role of RAGE in chronic lung diseases.
TGFB2
TGF-β2 is a member of the TGF-β superfamily that modulates immune signalling, stem cell differentiation and extracellular matrix. Variants near the TGFB2 locus have been repeatedly identified by GWAS of both lung function and COPD [12, 13, 15, 16, 36, 39, 154]. In patients with COPD, TGF-β2 expression is reduced [15, 154] and eQTL analyses have implicated certain SNVs (rs1690789, rs6684205) in TGF-β2 expression [56, 155] (summarised in table 1).
TGF-β1, TGF-β2 and TGF-β3 signal through a heteromeric receptor comprising TGF-β type I (TGF-βRI) and type II (TGF-βRII) receptors [156]. The three TGF-β isoforms are encoded by distinct genes and share highly conserved structure and function [157] but differ in terms of cellular expression pattern. In the lung, TGF-β1 and TGF-β3 transcripts are most enriched in immune cells (such as alveolar macrophages), mesenchymal cells and the airway epithelium [158]. By contrast, TGF-β2 is predominantly expressed in the airway and alveolar epithelium (figure 1).
TGF-β signalling is critical in lung development. Tgfb2−/− mice display gross developmental defects affecting the heart, craniofacial structures, limbs, spine and lungs [159]. Despite lungs that appear grossly normal at E18.5, these mice are cyanotic, display respiratory distress with associated collapse of the bronchioles [159] and die shortly following birth. Tgfb3 mutant mice similarly fail to survive postnatally due to respiratory distress [160]. Interestingly, while Tgfb1 knockout mice have normal development but succumb to a hyperinflammatory disease several weeks post birth [161, 162], expression of constitutively active TGF-β1 causes arrested lung morphogenesis [163]. Transgenic mice deficient for Tgf-βRII in AT2 cells (using a Sftpc-Cre) or in all epithelial cells (using a Nkx2-1-Cre) are born viable but experience postnatal lung defects such as perturbed alveolarisation [164] or develop emphysema, respectively [165]. In contrast, loss of Tgf-βRII from the mesoderm (using Twist2-Cre) significantly alters lung branching morphogenesis and the ultrastructure of the lungs postnatally, including disorganised tracheal cartilage, decreased airway smooth muscle and dilated airways [164, 166]. Collectively, these data suggest that all TGF-β isoforms influence gross lung development in a spatial and temporal manner.
In the lung, TGF-β is significant in both the development and maintenance of epithelial cells. For instance, during development either overexpression or lack of expression of TGF-β reduces the number of secretoglobin family 1A member 1 (Scgb1a1)+ club cells and pro-SPC+ AT2 cells [163, 167]. Primary mouse AT2 cells upregulate all three TGF-β isoforms during the course of AT2 to AT1 transdifferentiation in vitro [168], suggesting they contribute to that process as well. Further, loss of TGF-β in vivo negatively impacts the number of AT1 cells and AT1 cell spreading [164, 169]. Finally, while TGF-β may contribute to lung epithelial homeostasis, it has also been shown to promote EMT [170], whereby epithelial cells lose cell–cell junctions and polarity while acquiring migratory properties. Exposure of primary AT2 cells or bronchial epithelial cells to TGF-β decreases expression of tight junction proteins and upregulates vimentin [171, 172], suggesting that TGF-β levels must be carefully balanced to maintain lung epithelial cells without promoting EMT.
Despite more than three decades of study, the contribution of TGF-β signalling to COPD remains uncertain. While little work has focused mechanistically on TGF-β2, findings related to TGF-β1 may be relevant given their functional redundancy. Although TGF-β2 expression is decreased in COPD [15, 154], some but not all reports reveal increased expression of TGF-β1 in COPD patients, especially in epithelial cells [173–178]. Further, studies featuring mice or in vitro cigarette smoke exposure have identified upregulation of TGF-β and TGF-βR [179–182]. In contrast, TGF-β1, TGF-β2 and other TGF-β signalling components were downregulated in human bronchial epithelial cells following cigarette smoke exposure [183]. These findings suggest that in vivo there is likely an important interplay between epithelial cells and mesenchymal and/or immune cells in response to cigarette smoke and the corresponding TGF-β response. Owing to the lethality of Tgf-β−/− mice, models of emphysema have instead used mice lacking genes involved in TGF-β signalling. Spontaneous emphysema develops in Smad3−/− [184], integrin αVβ6−/− (which binds and activates TGF-β) [185] and fucosyltransferase 8−/− (which exhibit dysregulated TGF-βRI activation) [186] mice. Finally, mice lacking TGF-β signalling are protected from fibrosis [165], while instillation of TGF-β causes lung fibrosis [187, 188]. Thus, in the lung a delicate balance of TGF-β signalling is required to facilitate repair following injury and to circumvent emphysema without inducing inappropriate EMT and/or fibrosis.
Questions for future research
Large-cohort GWAS of lung function and COPD have identified a number of candidate genes to explore in COPD, including HHIP, FAM13A, DSP, AGER and TGFB2, as well as other genes not reviewed here but also of interest (e.g. RARB, SFTPD, MFAP2, ADGRG6 and NPNT). Mounting evidence suggests that these genes may have significance in COPD based on their involvement in lung development. Significantly, these same developmental processes may be important for repair following injury.
A number of discussed genes converge on shared developmental pathways. Given the complex genetics underlying COPD, it will be interesting to understand how GWAS genes interact. For instance, FAM13A and DSP have both been directly implicated in altering Wnt/β-catenin signalling [84, 85, 115, 116], a vital pathway in repairing lung injury, through cellular proliferation and differentiation, including stem cell renewal. FAM13A and TGF-β2 can both regulate EMTs [87, 170], an often maladaptive response to injury. TGF-β signalling interacts with retinoic acid signalling (RARB is a COPD GWAS gene) during lung development [189]. These examples highlight the potential for shared pathways to be acted upon by multiple genes of interest and to contribute to COPD pathogenesis through dysfunction of processes central to lung development and repair. This concept is supported by our recent work demonstrating that the 559 genes associated with COPD were enriched in only 29 pathways [9].
This review of these COPD GWAS genes highlighted considerable overlap between genes associated with COPD and IPF. While both are diseases of ageing and associated with cigarette smoke exposure, the pathophysiology of the diseases differs considerably, with COPD causing airflow obstruction while IPF leads to lung restriction. Thus, it is perhaps unsurprising that while SNVs in FAM13A and DSP are common across these diseases, the direction of effect on expression is opposite. That is, IPF risk variants lead to lower cellular expression of FAM13A and DSP while COPD risk variants lead to increased expression [46, 49, 95, 108]. Given these genes also converge on common pathways (e.g. Wnt/β-catenin, discussed above), these observations may uncover both shared and divergent processes that lead to inception and progression of COPD versus IPF following the same environmental insult (i.e. cigarette smoke).
Despite this accumulating evidence, additional studies will be required to determine the function of these genes in the pathobiology of COPD. Omics approaches, such as high-throughput genetic perturbation (e.g. Perturb-Seq), single-cell lung assay for transposase-accessible chromatin sequencing and RNA-sequencing, that can identify which variants or regulatory regions affect gene expression in specific cell types have yet to be extensively applied to genes implicated through COPD GWAS. A recent study integrated single-cell RNA-sequencing with eQTL across 38 cell types in IPF [190]. Future studies employing the same technique for COPD have the potential to significantly advance the field.
While functional validation of GWAS variants remains a bottleneck in the field generally, functional studies in biologically relevant human cells representative of the distal lung, including small airway and alveolar epithelial cells, have been particularly slow to emerge. This is likely due in part to the difficulty in sourcing, genetically manipulating and maintaining primary cells from that compartment, such as AT2 cells, in long-term culture [191]. Technical advances in primary cell culture [192, 193] together with studies in human induced pluripotent stem cell-derived airway and alveolar epithelial cells may circumvent these problems [194–197]. While there are some limitations to murine models, such as differences in expression patterns of GWAS genes of interest (e.g. HHIP), lineage-specific deletion of genes of interest in mouse models will further illuminate how these genes affect lung development, epithelial cell homeostasis and response to injury in coordination with surrounding mesenchymal and immune cells. Finally, CRISPR technology allows the knock out, knock down or activation of genes [198–202], the manipulation of enhancer regions [203] and precise editing of single bases [204]. Further application of these techniques will allow loss/gain of function studies to interrogate gene function, as well as precise editing of SNVs to understand how specific variants regulate the gene of interest, in specific cell types, in COPD.
Footnotes
Provenance: Submitted article, peer reviewed.
Conflict of interest: X. Zhou has received grant support from Bayer. M.H. Cho has received grant support from GSK and Bayer; and has received speaking/consulting fees from Illumina, AstraZeneca and Genentech. R.B. Werder and A.A. Wilson declared no conflicts of interest.
Support statement: This work was supported by a CJ Martin Early Career Fellowship from the Australian National Health and Medical Research Council awarded to R.B. Werder; NIH grants U01TR001810, R01DK101501, R01DK117940 and R01HL166407 awarded to A.A. Wilson; and NIH grants R01HL153248 and R01HL147148 awarded to M.H. Cho. Funding information for this article has been deposited with the Crossref Funder Registry.
- Received February 5, 2024.
- Accepted February 23, 2024.
- Copyright ©The authors 2024
This version is distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0. For commercial reproduction rights and permissions contact permissions@ersnet.org