For many respiratory physicians, point-of-care chest ultrasound is now an integral part of clinical practice. The diagnostic accuracy of ultrasound to detect abnormalities of the pleura, the lung parenchyma and the thoracic musculoskeletal system is well described. However, the efficacy of a test extends beyond just diagnostic accuracy. The true value of a test depends on the degree to which diagnostic accuracy efficacy influences decision-making efficacy, and the subsequent extent to which this impacts health outcome efficacy. We therefore reviewed the demonstrable levels of test efficacy for bedside ultrasound of the pleura, lung parenchyma and thoracic musculoskeletal system.
For bedside ultrasound of the pleura, there is evidence supporting diagnostic accuracy efficacy, decision-making efficacy and health outcome efficacy, predominantly in guiding pleural interventions. For the lung parenchyma, chest ultrasound has an impact on diagnostic accuracy and decision-making for patients presenting with acute respiratory failure or breathlessness, but there are no data as yet on actual health outcomes. For ultrasound of the thoracic musculoskeletal system, there is robust evidence only for diagnostic accuracy efficacy.
We therefore outline avenues to further validate bedside chest ultrasound beyond diagnostic accuracy, with an emphasis on confirming enhanced health outcomes.
The next challenge in bedside chest ultrasound is to refocus from diagnostic accuracy toward patient outcomes http://ow.ly/NyNR3027WLU
Point-of-care chest ultrasound is now an integral part of practice for many respiratory and critical care clinicians [1–3]. Its widespread adoption has been facilitated by the advent of portable high-performance scanners, and the transfer of skills from other ultrasound applications in respiratory medicine including endobronchial ultrasound [4–6]. However, the greatest single driver behind the expansion in bedside chest ultrasound is probably the identification of characteristic sonographic features for common thoracic conditions. The diagnostic accuracy of such ultrasound findings is high, especially for pleural and parenchymal abnormalities .
Yet diagnostic accuracy should not be the only consideration when evaluating a test such as chest ultrasound. Other aspects of test efficacy are also important, in particular the impact on patient outcomes. Indeed, Fryback and Thornbury proposed six levels of test efficacy relevant to medical imaging: technical efficacy, diagnostic accuracy efficacy, diagnostic thinking efficacy, therapeutic efficacy, patient outcome efficacy and societal efficacy [8, 9]. We shall touch on each briefly, as our subsequent narrative is structured around these levels of efficacy in relation to bedside chest ultrasound. To simplify discussion, we group these six levels of efficacy under three broad domains: test attributes, clinical decision-making and health outcomes (figure 1).
Levels of chest ultrasound efficacy
Test attributes comprise technical efficacy and diagnostic accuracy efficacy, which relate to the physical testing system. Technical efficacy for chest ultrasound encompasses machine characteristics, operator proficiency and the semiology of imaging findings. Diagnostic accuracy efficacy refers to test characteristics (e.g. sensitivity and specificity) for detecting various pathological conditions.
Clinical decision-making involves diagnostic thinking efficacy and therapeutic efficacy. These levels of efficacy relate to the cognitive impact of the test result on clinician behaviour . Diagnostic thinking efficacy describes the usefulness of a test to influence a clinician's thinking. Therapeutic efficacy is the degree to which test results affect patient management, such as a decision regarding further testing or treatment.
Health outcomes consist of patient outcome efficacy and societal efficacy. These levels of efficacy reflect the test's impact on real-world outcomes. Patient outcome efficacy is probably the best measure of test value, since the main aim of medical care is to improve patient well-being. A recent example is the reduction in lung cancer mortality through chest computed tomography (CT) screening . Societal efficacy is usually described in terms of cost–benefit and is often estimated by economic analyses. Randomised controlled trials (RCTs) can also be employed to address this question.
In general, the higher levels of test efficacy build upon the lower levels. Some degree of diagnostic accuracy is usually required to change clinician decision-making, which in turn is needed to improve final patient outcomes . Of course, diagnostic accuracy does not always change decision-making, nor do changes in medical decision-making automatically confer outcome efficacy .
Scope of this review
The levels of test efficacy described above form a useful framework when evaluating the usefulness of a diagnostic test. The intention of this review is to summarise and appraise the evidence for bedside chest ultrasound at each level of test efficacy. We conclude by outlining future avenues to further validate bedside chest ultrasound beyond diagnostic accuracy, with an emphasis on demonstrating improved health outcomes.
Readers will find that substantial evidence is available for the lower levels of chest ultrasound efficacy (figure 1). Yet even here, some uncertainties remain around the diagnostic accuracy for common conditions. At higher levels of test efficacy, less evidence for benefit exists. This issue requires attention, since chest ultrasound only confers real-world value if health outcomes are improved.
In this review, we define chest ultrasound as sonography of the pleura, the lung parenchyma and the thoracic musculoskeletal system. Echocardiography is not included.
Test attributes: technical and diagnostic accuracy efficacy
This section will be discussed according to three anatomical areas: pleural syndromes, comprising pneumothorax and pleural effusion; parenchymal syndromes, comprising lung consolidation and the interstitial syndrome; and musculoskeletal syndromes, comprising chest wall and diaphragm abnormalities.
Ultrasound is used to diagnose pneumothorax. Three sonographic features have been described: the absence of “lung sliding”, the absence of “B-lines” and the presence of “lung point” (table 1).
“Lung sliding” is a to-and-fro movement along the pleural line in time with respiration [1, 13]. This gives a granular lung artefact on M-mode termed the “sea-shore sign” (figure 2a). An associated finding is “lung pulse”, where the pleural line moves in time with transmitted cardiac pulsations. The presence of lung sliding or lung pulse excludes pneumothorax in the area scanned. The absence of lung sliding is suggestive of, but not specific for, pneumothorax. It causes a linear artefact on M-mode known as the “bar-code sign” (figure 2b). Importantly, hyperinflation in chronic obstructive pulmonary disease (COPD) and pleural adhesions can also cause loss of lung sliding (table 1).
“B-lines” (figure 2c) are vertical artefacts projecting from the pleural line to the bottom of the screen that move with respiration [1, 16]. The presence of B-lines excludes pneumothorax, but their absence does not confirm it .
“Lung point” has been defined as “the absence of any sliding or moving B-lines at a physical location where this pattern consistently transitions into an area of sliding, which represents the physical limit of pneumothorax as mapped on the chest wall” . It is 100% specific for partial pneumothorax [18, 19], and represents the location where visceral and parietal pleurae part company. Animal and human studies suggest that the position of lung point in relation to the mid-axillary line correlates with pneumothorax size [20, 21]. Lung point is not seen in complete pneumothorax, when no lung is in contact with parietal pleura.
The accuracy of ultrasound for pneumothorax diagnosis has been compared to chest radiography (table 2) in four meta-analyses [24, 27–29]. Pooled ultrasound sensitivity was 78–90% and pooled specificity was >98% [24, 27–29]. Chest radiography had a poorer pooled sensitivity of 39–52%, but a similar pooled specificity [24, 25, 27]. Important caveats apply. First, the populations studied were mainly trauma and critically ill patients, or those who had undergone percutaneous thoracic procedures. Thus, the results may not be applicable to patients presenting with suspected spontaneous pneumothorax, particularly those with underlying COPD . Secondly, the prevalence of pneumothorax in the reviews was between 13 and 30%, reflecting the highly selected study population. Many studies also excluded patients in whom ultrasound was not technically possible. Thirdly, the chest radiograph comparator was generally a supine film, which is poorly sensitive for pneumothorax.
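The influence of prevalence noted in these caveats can be made concrete with a short calculation. The following Python sketch applies Bayes' theorem to convert sensitivity and specificity into predictive values. The inputs are illustrative only: a sensitivity of 78% and a specificity of 98% are taken from the pooled ranges above, the 30% and 13% prevalences are the bounds reported in the reviews, and the 2% prevalence is a hypothetical low-prevalence setting that is not drawn from any study. The sketch also assumes that sensitivity and specificity stay constant across settings, which may not hold in practice.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test using Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# 0.30 and 0.13: prevalence bounds from the reviews; 0.02: hypothetical setting
for prevalence in (0.30, 0.13, 0.02):
    ppv, npv = predictive_values(0.78, 0.98, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

Under these assumptions, the positive predictive value falls substantially as prevalence falls, even though sensitivity and specificity are unchanged. This is one reason why results from highly selected cohorts may not transfer to other populations.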
Substantial heterogeneity was present among all four meta-analyses, possibly due to operator performance . Also, some reviews meta-analysed by patient and others by hemithorax. We have previously found that using different units of analysis can give very different test characteristics .
The evidence therefore suggests that ultrasound is superior to chest radiography in detecting pneumothorax among trauma, critically ill and post-procedural patients. These settings combine a high prevalence of traumatic pneumothorax with the immediate availability of trained sonographers, and the use of ultrasound as the initial test of choice is well supported. However, the accuracy of ultrasound for spontaneous pneumothorax is unclear and requires further study. Until more evidence has accrued in this area, we recommend that conventional radiology be used to diagnose spontaneous pneumothorax.
In pleural effusion, ultrasound is useful for diagnosis, prognosis and to guide therapeutic interventions. Ultrasound has been used to detect pleural effusion for more than 50 years (figure 2d) . Simple effusions on B-mode are anechoic [34, 35], bounded by parietal pleura above, visceral pleura below and rib shadows bilaterally (table 1). This is termed the “quad sign” . The depth of the visceral pleura from the probe oscillates with the respiratory cycle, giving rise to the “sinusoid sign” on M-mode [37, 38]. The presence of “fluid colour” on Doppler imaging due to pleural fluid movement is also characteristic [39, 40].
Ultrasound can characterise pleural effusion. While all transudates are anechoic (figure 2d), exudates may be either anechoic or echoic . Visible internal echogenicity or stranding indicates an exudate . Ultrasound may suggest the cause of pleural effusion. In an area where tuberculosis is uncommon, pleural thickening of >1 cm or pleural nodularity predicted malignancy with a modest sensitivity, but a high degree of specificity . Echogenic swirling also suggests malignancy . Ultrasound can quantify pleural effusion, but measurements and formulae are only valid for free-flowing effusions [44, 45]. Ultrasound may predict the likelihood of successful pleural drainage. In one study, all anechoic para-pneumonic effusions were drained, whereas complex and septated effusions were successfully drained only half the time . The movement and strain pattern of atelectatic lung in response to the cardiac impulse can also predict the presence of trapped lung .
With minimal training, novice operators can achieve a diagnostic accuracy for un-loculated effusions similar to expert operators . With greater experience, respiratory physicians can be as accurate as radiologists at diagnosing pleural effusion, and have equally low complication rates for ultrasound-guided pleural interventions . There are now validated instruments to assess an individual's skill at using ultrasound to guide pleural effusion drainage .
The accuracy of ultrasound for pleural effusion has been compared with that of chest radiography in two systematic reviews (table 2) [51, 52]. The earlier review included four studies and found that ultrasound had a sensitivity of 92–96%, compared with a chest radiography sensitivity of 24–100% . Ultrasound and chest radiography specificity were comparable at 88–100%. The more recent review included 12 studies, and reported a pooled sensitivity of 94% and pooled specificity of 98% for ultrasound, compared with a pooled sensitivity of 51% and pooled specificity of 91% for chest radiography . Both reviews comprised trauma, cardiac, critically ill (some with acute respiratory distress syndrome (ARDS)) and surgical cohorts, with a mean prevalence of 27–41% for pleural effusion. Not all studies used CT as a reference standard . The morphology of pleural effusions (simple versus complex) in the included studies was not described.
The evidence suggests that ultrasound is superior to chest radiography for detecting pleural effusion in specific populations with suggestive clinical features. Where available, ultrasound is therefore a reasonable initial test for this indication.
In consolidation, the alveolar space fills with fluid. Key sonographic features are: loss of the bright pleural line; a real image as opposed to an artefact; a tissue-like pattern; echogenic air-bronchograms; hypoechoic vascular structures; and an irregular serrated distal border (figure 2e) [51, 52].
Consolidation is nonspecific (table 1). The many possible aetiologies include infectious pneumonia, organising pneumonia, pulmonary infarction and ARDS. However, some additional ultrasound features, when present, may suggest the underlying cause. Hypoechoic tubular fluid-bronchograms without flow signal on Doppler were found exclusively in obstructive pneumonia in one study . Microabscesses in necrotising pneumonia are seen within consolidated lung as rounded hypoechoic or anechoic lesions with ill-defined margins . Lung tumours causing obstructive consolidation appear as homogeneous nodules with well-defined margins (figure 2f) . Dynamic air bronchograms differentiate pneumonic consolidation from resorptive atelectasis . Pneumonic consolidation has more rapid and marked enhancement with sonographic contrast . Consolidation may be difficult to detect when it is hidden behind bony structures  or does not extend to the pleura . However, the latter is reportedly rare .
Other ultrasound features apart from consolidation have been described in patients with pneumonia. These include focal B-lines and pleural effusion . In one study, anterior consolidation, anterior diffuse B-lines with abolished lung sliding, anterior asymmetric B-lines, and posterior consolidation or effusion without anterior diffuse B-lines were all suggestive of pneumonia .
Occlusion of a pulmonary artery may lead to pulmonary infarction, atelectasis and local pulmonary oedema [59, 60]. These pathophysiological changes are visible sonographically as wedge-shaped or rounded pleural-based consolidation [61, 62], often in association with pleural effusion , and usually in posterior basal segments . The use of Doppler and sonographic contrast to detect alterations in blood flow [54, 64] may differentiate pulmonary infarction from consolidation caused by other aetiologies.
The diagnostic accuracy of ultrasound for lung consolidation has been examined in two ways. One approach has been to measure its accuracy to detect radiological consolidation. Another approach has been to measure its accuracy for specific disease entities which cause consolidation, chiefly pneumonia and pulmonary infarction.
A recent systematic review examined the accuracy of ultrasound for consolidation referenced to CT (table 3). This review focused on the imaging findings of consolidation rather than the underlying aetiology. The sensitivity of ultrasound was greater than that of chest radiography (91–100% versus 38–68%), while specificity was similar (78–100% versus 89–95%). This review was restricted to hospitalised adults with respiratory failure who also underwent CT scanning. The risk of selection bias was high in the included studies, and clinician sonographers who were unblinded to clinical data may have inflated the sensitivity of ultrasound. All studies were set in intensive care and most patients were ventilated.
This systematic review included a study reporting two different units of analysis (the unit of analysis is a statistical term denoting the major entity analysed in a study). The use of lung region (12 regions per patient) instead of lung (two lungs per patient) as the unit of analysis decreased sensitivity, increased specificity, inflated the sample size and gave the misleading impression of greater precision. The review concluded that studies reporting different units of analysis should not be meta-analysed together, even though this is sometimes attempted. We therefore suggest that primary studies should report individual patients as the unit of analysis even when findings are acquired by lung or lung region. This will facilitate future comparisons between studies or pooling of results.
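The point about apparent precision can be illustrated numerically. The sketch below is a minimal example with entirely hypothetical counts: 36 of 40 patients correctly identified, against the same proportion expressed as 432 of 480 lung regions (12 regions per patient). It computes a Wilson 95% confidence interval for each. It deliberately treats regions as independent observations, which they are not, since regions are clustered within patients. That flawed assumption is exactly what makes the narrower interval misleading. The sketch does not model the changes in sensitivity and specificity themselves.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    )
    return centre - half_width, centre + half_width


# Hypothetical counts: same 90% proportion, different units of analysis
per_patient = wilson_interval(36, 40)    # patient as the unit of analysis
per_region = wilson_interval(432, 480)   # region as the unit (12 per patient)

print("per patient:", [round(x, 3) for x in per_patient])
print("per region: ", [round(x, 3) for x in per_region])
```

The per-region interval is much narrower although no additional patients were studied, which is why reporting by patient gives a more honest picture of uncertainty.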
Conclusion for detection of consolidation
Therefore, based on a small body of evidence at high risk of selection and index test bias, ultrasound is more sensitive than chest radiography for consolidation in adults in intensive care with acute respiratory failure. It would appear reasonable to deploy ultrasound as the initial testing modality in this scenario. It is important to bear in mind that ultrasound sensitivity may be lower for less unwell patients in other settings, because they may have less extensive consolidation.
With the second approach of measuring the accuracy of ultrasound for a specific consolidative aetiology, three reviews suggest ultrasound is sensitive (94–97%) and specific (90–96%) for pneumonia in adults. One review also included children and neonates (table 3) [65–67]. The most recent review found that ultrasound had greater sensitivity than chest radiography (95% versus 77%).
A number of caveats apply. First, the high prevalence of pneumonia in all three meta-analyses implies a highly selected population. Secondly, ultrasound features used to diagnose pneumonia comprised a combination of alveolar consolidation and interstitial changes. However, there is a wide range of other pathologies that may cause identical sonographic findings (table 1). Thirdly, the reference standards of included studies ranged from clinical diagnosis (of doubtful reliability), through to chest radiography (of limited sensitivity), and CT (in the minority). A number of studies were at risk of differential verification bias, because not all patients underwent the same reference testing.
There is also an inherent limitation in using ultrasound to “diagnose” pneumonia. In reality, ultrasound only detects imaging findings . Clinician input is required to integrate imaging findings with clinical data to make the clinical diagnosis of pneumonia. (This limitation applies to all imaging modalities, not just ultrasound, and ideally all imaging research should report detection rates of imaging abnormalities, not of clinical diagnoses.) Therefore research studies examining the accuracy of ultrasound to diagnose pneumonia are compelled to incorporate a clinician interpretation into reported test characteristics. This interpretation is influenced by the prevalence of pneumonia relative to other causes of sonographic abnormalities in the test population. In the three meta-analyses cited, the prevalence of pneumonia was 50–67%. In settings with a lower prevalence of pneumonia, the probability of ultrasound detecting changes due to a different disease would be correspondingly higher, reducing the specificity for pneumonia.
Conclusion for diagnosis of pneumonia
It therefore appears, from an evidence base at significant risk of selection and verification bias, that ultrasound is highly sensitive and specific for pneumonia. However, this may only apply when the suspicion of pneumonia is high and other pathology is unlikely, limiting the applicability of reported test performance to other settings.
Finally, two meta-analyses have also examined the accuracy of ultrasound for pulmonary embolism. The included studies mainly detected pleural-based consolidation suggestive of pulmonary infarction (table 3) [68, 69]. No included studies used Doppler or ultrasound contrast to differentiate infarction from other causes of consolidation. Sensitivity and specificity were 85–87% and 81.8–83%, respectively, but when results were summarised only from studies of higher quality, they fell to 77% and 75%, respectively.
Patient selection presented a high risk of bias or applicability concerns in many included studies. This concern was supported by the very high prevalence (50–61%) of pulmonary embolus, which is threefold higher than the prevalence of pulmonary embolus in studies of patients undergoing CT pulmonary angiogram (CTPA) [71–73].
The difficulties regarding the use of ultrasound imaging to “diagnose” the clinical condition of pneumonia also apply to pulmonary embolus. Small peripheral areas of consolidation may be due to causes other than pulmonary infarction. In settings where the probability of pulmonary embolus is lower, the likelihood of an alternative explanation for sonographic consolidation will increase.
CTPA remains the gold standard for diagnosis of pulmonary embolism, but involves radiation and intravenous contrast. It has been suggested that chest ultrasound might be employed as a screening test prior to CTPA [63, 74], but this strategy has not been tested prospectively.
Chest ultrasound combined with targeted echocardiography and lower limb vein ultrasound increases the sensitivity and specificity for pulmonary embolus to 90% and 86.2%, respectively. This may be a good option in patients unable to undergo CTPA.
Conclusion for diagnosis of pulmonary embolus
The evidence therefore suggests that in settings with an extremely high probability of pulmonary embolus, chest ultrasound has modest sensitivity and specificity for pulmonary embolus. These test characteristics are inferior to CTPA, which therefore remains the test of choice. In patients unsuitable for CTPA, it appears reasonable to proceed with the combination of cardiac, chest and peripheral venous ultrasound.
The interstitial syndrome
There is a clear relationship between B-lines (defined earlier) and the interstitial syndrome . B-lines are probably formed by the reverberation of ultrasound waves between thickened interstitial septa just below the pleura. A scanning region is considered positive when three or more B-lines are visible within a rib space, and the ultrasound examination is considered positive for the interstitial syndrome when two or more regions are positive bilaterally . Multiple B-lines can also be found in healthy individuals in the lowest intercostal spaces posteriorly. There is no consensus regarding how B-lines should be counted [76, 77]. Some groups have utilised automated B-line counting, but this cannot yet be performed in real-time .
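The positivity rule described above can be written as a short decision function. This is a minimal sketch rather than a validated scoring tool. It reads "two or more regions positive bilaterally" as at least two positive regions on each side, which is our assumption about the wording. The function names and the example B-line counts are hypothetical.

```python
POSITIVE_REGION_MIN_B_LINES = 3     # three or more B-lines within a rib space
MIN_POSITIVE_REGIONS_PER_SIDE = 2   # assumption: "bilaterally" = on each side


def region_is_positive(b_line_count):
    """A scanning region is positive when it shows three or more B-lines."""
    return b_line_count >= POSITIVE_REGION_MIN_B_LINES


def exam_is_positive(left_counts, right_counts):
    """Examination is positive when each side has >= 2 positive regions."""
    left_positive = sum(region_is_positive(c) for c in left_counts)
    right_positive = sum(region_is_positive(c) for c in right_counts)
    return (left_positive >= MIN_POSITIVE_REGIONS_PER_SIDE
            and right_positive >= MIN_POSITIVE_REGIONS_PER_SIDE)


# Hypothetical B-line counts for four regions per hemithorax
print(exam_is_positive([3, 4, 0, 1], [5, 3, 1, 0]))  # True
print(exam_is_positive([3, 4, 0, 1], [5, 0, 1, 0]))  # False
```

Because there is no consensus on how B-lines should be counted, the integer counts passed to such a function would themselves depend on the counting method used.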
Increased B-lines may be focal or diffuse (table 1). Pneumonia [57, 58, 79], pulmonary contusion  or focal fibrosis cause focal B-lines. Increased lung water or diffuse fibrotic conditions cause diffuse B-lines .
Ultrasound predicts the volume of extravascular lung water. There is a linear relationship between the total number of B-lines and lung weight estimated using CT. One study used whole lung lavage to correlate sonographic imaging with lung water. Increasing lung water was represented initially by B-lines, then as “white lung” formed by coalescent B-lines, and ultimately as alveolar consolidation.
Non-cardiogenic pulmonary oedema (ARDS) is one important cause of increased lung water that causes pathological B-lines. The ultrasound-derived lung oedema score reflects sepsis-induced acute lung injury and correlates with sepsis severity . Lung recruitment and re-aeration by the application of positive end-expiratory pressure can be demonstrated by the transformation of consolidation into B-lines, or the disappearance of B-lines . Chest ultrasound may also guide fluid resuscitation in critically ill patients .
Cardiogenic pulmonary oedema also increases B-lines. However, ultrasound does not accurately predict pulmonary artery occlusion pressure (PAOP) [82, 87]. The absence of B-lines had a specificity of 95%, but a sensitivity of only 50% to predict PAOP ≤18 mmHg . This is not surprising, since the interstitial syndrome is not specific to cardiogenic pulmonary oedema.
The differentiation of cardiogenic pulmonary oedema from ARDS therefore requires other supportive findings. Compared to patients with cardiogenic pulmonary oedema, patients with ARDS may more often have: areas without B-lines on the anterior chest, increased posterior consolidation, an irregular and thickened pleural line, and reduced pleural lung sliding . However, substantial overlap remains and further work-up may be needed to distinguish the two.
Patients with pulmonary fibrosis also have diffusely increased B-lines, which correlate with the severity of fibrosis on high-resolution CT imaging. Ultrasound may be used to screen for pulmonary fibrosis in patients with systemic sclerosis. Other ultrasound features such as pleural thickening and subpleural nodules have been reported in patients with connective tissue disease-associated fibrosis.
Systematic reviews and meta-analyses have examined the diagnostic accuracy of ultrasound for acute cardiogenic pulmonary oedema. Two systematic reviews found that the sensitivity of increased B-lines for acute cardiogenic pulmonary oedema was 85–94%, while specificity was 92–93% (table 4) [94, 95].
One review also summarised the diagnostic accuracy of other tests for acute heart failure: history, examination, ECG, chest radiography, brain natriuretic peptide (BNP) and pro-BNP, echocardiography and bio-impedance . Lung ultrasound had the highest positive likelihood ratio (7.1) and the second lowest negative likelihood ratio (0.16), after BNP and pro-BNP (0.09–0.11).
Both reviews included studies examining patients with acute breathlessness. Different studies used differing methods of B-line counting. Almost all studies used clinical impression as the reference standard for heart failure diagnosis. In both reviews, the prevalence of acute heart failure was very high at 45%. In contrast, in an intensive care study where pneumonia was more prevalent than cardiogenic pulmonary oedema, the presence of B-lines was neither sensitive nor specific for pulmonary oedema .
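The likelihood ratios quoted above become easier to interpret when converted into post-test probabilities. The following sketch uses the standard odds form of Bayes' theorem. The likelihood ratios (7.1 and 0.16) and the 45% pre-test probability are taken from the reviews discussed above, and the function name is ours.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


PRE_TEST = 0.45  # prevalence of acute heart failure reported in the reviews

after_positive = post_test_probability(PRE_TEST, 7.1)   # positive LR
after_negative = post_test_probability(PRE_TEST, 0.16)  # negative LR
print(f"after positive scan: {after_positive:.2f}")
print(f"after negative scan: {after_negative:.2f}")
```

At a 45% pre-test probability, a positive scan raises the probability of acute heart failure to approximately 85% and a negative scan lowers it to approximately 12%. With a hypothetical pre-test probability of 10%, the same positive likelihood ratio would yield only about 44%. This arithmetic assumes that the likelihood ratios are themselves stable across settings, which the intensive care data cited above call into question.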
Conclusion for the diagnosis of cardiogenic pulmonary oedema
The evidence therefore suggests that in patients presenting to emergency departments with breathlessness and a high pre-test probability for acute cardiogenic pulmonary oedema, increased B-lines are highly sensitive and specific for this diagnosis. In settings where alternative causes for B-lines are more likely, the specificity of ultrasound is lower. The best method for B-line counting remains to be determined. The combination of lung ultrasound with other tests such as BNP and echocardiography increases diagnostic accuracy, and in our view forms the optimal diagnostic approach.
Chest wall abnormalities
Chest wall invasion by lung cancer may be suggested by absent lung sliding, interruption of the pleural reflection, visible tumour growth into the chest wall, and rib invasion [97, 98]. Using intraoperative findings and final pathology as the reference standard, ultrasound had superior diagnostic accuracy for invasion compared with CT, with a sensitivity and specificity of 89–100% and 95–98%, respectively, whereas CT had a sensitivity and specificity of 42–68% and 66–100%, respectively [98, 99].
Ultrasound may identify unprotected intercostal arteries vulnerable to percutaneous pleural interventions. Salamonsen et al. visualised intercostal arteries using colour Doppler and confirmed significant variation in their course. In a subsequent study validated against contrast CT, ultrasound had excellent sensitivity (86%) but poor specificity (30%) for detecting such vulnerable intercostal vessels. Further refinements are needed to improve the specificity of this technique and define its clinical utility.
Diaphragm excursion and thickness are both readily identified on ultrasound and can be used to assess diaphragm function.
Diaphragm excursion can be measured on M-mode by visualising the dome of each hemi-diaphragm via the anterior subcostal approach [104, 105]. Diaphragm thickness is measured laterally at the zone of apposition at the lower intercostal spaces. The diaphragm thickening fraction is calculated as (end-inspiratory thickness minus end-expiratory thickness)/end-expiratory thickness and has been used to assess diaphragm activity [107, 108].
Ultrasound has been used to detect diaphragm dysfunction in small cohorts of ventilated patients [109, 110]. Diaphragmatic excursion of <1 cm had a sensitivity and specificity of 83% and 41%, respectively, for predicting weaning failure from mechanical ventilation. Diaphragm thickening has also been correlated with the work of breathing in ventilated patients and may be used to predict weaning failure. In a small prospective study, a diaphragm thickening fraction >36% predicted a successful spontaneous breathing trial with a sensitivity of 82% and a specificity of 88%. These preliminary findings compare favourably with current weaning indices, but warrant further study in larger well-defined cohorts. We suggest that until more data become available, clinical decisions regarding the timing of extubation should not be based on ultrasound indices alone.
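For readers unfamiliar with the index, the thickening fraction can be computed as in the sketch below. It uses end-expiratory thickness as the denominator, which is the conventional definition. The 36% cut-off is the one reported in the small prospective study above, and the thickness values are hypothetical. This is an arithmetic illustration only and not a tool for clinical decisions.

```python
def thickening_fraction(end_inspiratory_mm, end_expiratory_mm):
    """Diaphragm thickening fraction relative to end-expiratory thickness."""
    if end_expiratory_mm <= 0:
        raise ValueError("end-expiratory thickness must be positive")
    return (end_inspiratory_mm - end_expiratory_mm) / end_expiratory_mm


THRESHOLD = 0.36  # cut-off reported in the small prospective study

# Hypothetical measurements in millimetres
fraction = thickening_fraction(end_inspiratory_mm=3.0, end_expiratory_mm=2.0)
print(f"thickening fraction {fraction:.0%}, above cut-off: {fraction > THRESHOLD}")
```

For these hypothetical values the thickening fraction is 50%, which exceeds the reported cut-off.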
Impact on clinician decision-making: diagnostic thinking and therapeutic efficacy
Many of the studies examining the influence of chest ultrasound on medical decision-making have been conducted in critically ill patients with acute respiratory failure or dyspnoeic patients in the emergency department.
Acute respiratory failure
A series of studies over the past decade have demonstrated diagnostic thinking efficacy for chest ultrasound in acute respiratory failure. Lichtenstein and Mezière performed chest and leg vein ultrasound in ventilated patients on intensive care unit admission. Ultrasound correctly identified the underlying condition (pneumonia, cardiogenic pulmonary oedema, pneumothorax, COPD and pulmonary embolus) in 90%. In a cohort with similar causes of acute respiratory failure, Silva et al. compared a protocol of cardiac, chest and leg vein ultrasound to no ultrasound. Most patients received either invasive or noninvasive ventilation. Sonography increased diagnostic accuracy from 63% to 83%. Finally, Bataille et al. showed that echocardiography combined with chest ultrasound had superior diagnostic accuracy to chest ultrasound alone (81% versus 63%).
Some caveats apply. First, patients with multiple pathologies (4–5%) were excluded from all studies, although the number excluded from the third study was not reported. The earliest study also excluded patients with rare or unclear diagnoses. If accuracy is recalculated with the excluded patients included, it drops to 78% and 79%, respectively, for the first two studies. Secondly, none of the three studies reported ARDS as a cause of respiratory failure. However, in other series, sepsis and ARDS were important causes of acute respiratory failure. In a recent report, ARDS caused 67% of acute respiratory failure in patients with pulmonary infiltrates. Importantly, the ultrasound features of ARDS overlap with pneumonia and cardiogenic pulmonary oedema [89, 116]. Therefore, in cohorts that include ARDS, ultrasound is probably less discriminating. Thirdly, chest ultrasound in all studies was coupled to either echocardiography, leg vein ultrasound or both. Only one study reported the yield of chest ultrasound as a single modality (63%), and this was not compared to the diagnostic accuracy of clinical diagnosis without ultrasound. Finally, the reference standard in all studies was expert clinical diagnosis. Based on autopsy series, however, the clinical diagnosis in critically ill patients may be inaccurate in 20–30% of cases [117, 118].
Overall, the evidence does suggest that the early use of chest ultrasound can help diagnose the cause of acute respiratory failure, particularly when supplemented by cardiac and leg vein ultrasound. In our view, it is reasonable to perform such a multisystem ultrasound survey when such expertise is available. However, the presence of either ARDS or multiple pathologies is likely to lower the diagnostic yield.
It also appears that the use of ultrasound in acute respiratory failure can change the management of patients already in intensive care. Xirouchaki et al. reported that the use of chest ultrasound changed management in 47% of critically ill patients with unexplained worsening hypoxia, or the suspicion of pneumothorax, atelectasis, pneumonia, pleural effusion or pulmonary oedema. In this study, ultrasound was employed to answer specific clinical questions. Notably, all but one of 253 scans demonstrated consolidation, effusion or interstitial syndrome. From this, the authors concluded that indiscriminate use of ultrasound would not have been as useful as their selective approach.
Chest ultrasound may also influence medical decision-making in the emergency department. In a small series of patients presenting with acute dyspnoea, the use of chest ultrasound changed the diagnosis in 44% and altered management in 58% .
More recently, a RCT by Laursen et al.  found the use of combined cardiac, lung and deep vein ultrasound in patients presenting to the emergency department with breathlessness, desaturation, chest pain or cough increased the rate of correct initial diagnosis within 4 h from 67% to 88%. Consequently, appropriate treatment was commenced within 4 h in more patients undergoing ultrasound (78% versus 56%).
In this single-centre trial, all scans were performed by one experienced sonographer within an hour of admission to the medical emergency ward. Only patients admitted during his shifts were eligible for screening, and more than 50% of those screened did not meet the inclusion criteria. No differences were found in terms of length of stay or mortality, but this trial was not powered to detect changes in such outcomes. Despite these caveats, the results suggest that early multisystem sonography (including chest ultrasound) is effective in achieving rapid diagnosis and appropriate management of this patient group.
Impact on health outcomes: patient outcome and societal efficacy
Robust evidence for patient and societal benefit can be found for ultrasound-guided pleural interventions. However, there is a paucity of evidence for outcome efficacy of chest ultrasound in relation to parenchymal lung disorders and musculoskeletal conditions.
A systematic review from 2010 concluded that the use of ultrasound to guide pleural aspiration reduced the risk of pneumothorax with an odds ratio of 0.3 . The meta-analysis included data from 22 observational studies and two randomised trials. Only one of the two trials was positive. The negative trial used remote guidance rather than immediate guidance, with ultrasound marking being performed in radiology but aspiration taking place on the ward . A more recent trial also found that ultrasound reduced the risk of pneumothorax from 12% to 1% and increased the likelihood of aspirating fluid .
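To show how the pneumothorax rates quoted for the more recent trial (12% without ultrasound, 1% with ultrasound) translate into common effect measures, the following is a minimal sketch only. It assumes the two rounded percentages from the text can be used directly as risks; the function name `risk_summary` is hypothetical, and the results carry no confidence intervals because the underlying trial counts are not given here. The odds ratio computed this way describes that single trial and is not the pooled odds ratio of 0.3 from the meta-analysis.

```python
def risk_summary(control_risk, treated_risk):
    """Summarise two event risks (each strictly between 0 and 1).

    control_risk: risk of the event without the intervention.
    treated_risk: risk of the event with the intervention.
    """
    for risk in (control_risk, treated_risk):
        if not 0.0 < risk < 1.0:
            raise ValueError("risks must lie strictly between 0 and 1")

    # Absolute risk reduction: difference between the two risks.
    arr = control_risk - treated_risk
    # Relative risk: ratio of the two risks.
    relative_risk = treated_risk / control_risk
    # Odds ratio: ratio of the odds (risk / (1 - risk)) in each group.
    odds_ratio = (treated_risk / (1.0 - treated_risk)) / (
        control_risk / (1.0 - control_risk)
    )
    # Number needed to treat: reciprocal of the absolute risk reduction
    # (undefined when the two risks are equal).
    nnt = 1.0 / arr if arr != 0 else float("inf")

    return {
        "absolute_risk_reduction": arr,
        "relative_risk": relative_risk,
        "odds_ratio": odds_ratio,
        "number_needed_to_treat": nnt,
    }


# Rounded percentages as quoted in the text (assumption: used as exact risks).
summary = risk_summary(control_risk=0.12, treated_risk=0.01)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Because the inputs are rounded percentages, the outputs should be read as approximate illustrations of scale rather than as reported trial results.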
A second meta-analysis confined to pleural aspiration in mechanically ventilated patients also found that ultrasound use reduced the odds for pneumothorax (OR 0.3), but this was not statistically significant perhaps due to the small number of studies . Observational studies using insurance databases have confirmed a reduction in pneumothorax and haemorrhagic complications with ultrasound guidance [126, 127].
On the basis of evidence for patient outcome efficacy, ultrasound guidance for drainage of pleural effusion is now the recommended standard of care , although uptake remains incomplete in some areas of medical practice [129, 130].
Based on decision-tree analysis, ultrasound-guided pleural aspiration is cost-effective, mainly due to a reduction in pneumothorax rates [131, 132]. This has been confirmed in the database studies mentioned above, with a 6% reduction in total hospitalisation cost [126, 127]. In addition, if radiology-performed ultrasound guidance is substantially delayed, clinician-performed ultrasound guidance provides further cost–benefit.
Our review has highlighted the impressive reported accuracy of ultrasound for a number of conditions. However, substantial gaps are present in the current evidence (figure 3). We have shown that evidence for efficacy is greatest in pleural ultrasound, where there is an impact on health outcomes. There is less evidence for health outcome efficacy in parenchymal lung ultrasound, although it clearly influences clinical decision making. There is least evidence for efficacy in musculoskeletal chest ultrasound, with data generally limited to the diagnostic accuracy for some conditions. Studies of diagnostic accuracy require extension to new settings; for example, is ultrasound as accurate for spontaneous pneumothorax as it is for traumatic pneumothorax?
There should be greater emphasis on measuring comparative accuracy . In some scenarios, chest ultrasound offers compelling advantages over other tests, as it does not require ionising radiation (in pregnancy and paediatrics), intravenous contrast (in renal impairment or contrast allergy) or transport to the radiology department (in patients requiring complex organ support). However, the real challenge is to examine to what degree ultrasound is superior to current diagnostics in patients without these specific characteristics.
Accuracy studies should also evaluate chest ultrasound embedded within diagnostic pathways or test combinations, rather than as an isolated modality [134, 135]. The intended role of chest ultrasound in such pathways should be explicitly defined, whether as an add-on test, replacement test or triage test since the desired performance characteristics differ for each . In conducting such research, there is scope to enhance the quality of studies and systematic reviews by observing methodology guidelines [136–141].
Studies surrounding decision-making should ideally address the complexity of real-life clinical scenarios. To assist Bayesian diagnostic thinking and decision-making in complex patients, such as those with respiratory failure in intensive care, a positive or negative signal in one ultrasound domain (e.g. B-lines) may be coupled to signals in one or more other domains (e.g. consolidation or pleural effusion). Advanced statistical methods have already been employed to support such algorithms  and are likely to enter widespread use. This approach may be especially powerful when there is a specific clinical question to generate a pre-test probability for each differential diagnosis .
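The Bayesian coupling of findings described above can be sketched as sequential updating of a pre-test probability with one likelihood ratio per ultrasound finding. This is an illustrative sketch only: it assumes the findings are conditionally independent given the diagnosis (which may not hold for related signs such as B-lines and effusion), the function name `post_test_probability` is hypothetical, and the pre-test probability and likelihood ratios below are invented placeholder values, not figures from any study cited in this review.

```python
def post_test_probability(pre_test_probability, likelihood_ratios):
    """Update a pre-test probability with a sequence of likelihood ratios.

    Assumes the findings are conditionally independent, so the
    likelihood ratios can simply be multiplied on the odds scale.
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")

    # Convert probability to odds.
    odds = pre_test_probability / (1.0 - pre_test_probability)
    # Each finding multiplies the odds by its likelihood ratio.
    for lr in likelihood_ratios:
        if lr < 0:
            raise ValueError("likelihood ratios must be non-negative")
        odds *= lr
    # Convert the updated odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical placeholder values for illustration only.
pre_test = 0.30      # assumed clinical pre-test probability of a diagnosis
lr_b_lines = 4.0     # assumed likelihood ratio for a positive B-line pattern
lr_effusion = 2.0    # assumed likelihood ratio for a coexisting effusion

probability = post_test_probability(pre_test, [lr_b_lines, lr_effusion])
print(f"post-test probability: {probability:.3f}")
```

In this form, a specific clinical question supplies the pre-test probability, and each ultrasound domain contributes one multiplicative update. The more advanced statistical methods referred to in the text would be needed where findings are correlated.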
However, it is in health outcomes where the most research effort should be focused. It could be argued that the evident diagnostic accuracy of chest ultrasound obviates any need for demonstrating outcome efficacy. However, “evidence of test accuracy often provides low-quality evidence for making recommendations” , and confirmation of outcome efficacy provides a more robust basis for guidelines .
Despite the lack of ionising radiation, ultrasound is not risk-free. The greatest risk is of providing misleading results . There are also opportunity costs consumed by sonographic training, capital investment and scarce bedside time with patients. The rise and fall of the pulmonary artery catheter offers a cautionary tale of a test implemented beyond its evidence base [145, 146]. Concerns regarding possible increased mortality have now limited its routine use, although it is still believed to have a role in specific instances.
While we support the increasing uptake of bedside chest ultrasound for indications with evidence for accuracy, we also believe there is a need for further research efforts to confirm that chest ultrasound improves hard outcomes (for applications other than guiding pleural interventions) .
In general, RCTs are the best study design to demonstrate patient outcome efficacy. However, such studies are challenging to design and execute. Consequently, very few tests have been shown to have patient outcome efficacy [11, 148]. Successful trials require direct test-to-treatment coupling and measurable patient outcomes; for example, trials of ultrasound-guided pleural drainage (direct test-treatment coupling) to measure the rate of iatrogenic pneumothorax (immediate patient outcome). Selecting appropriate test-treatment couplings and immediate outcomes relevant to acute respiratory failure and other respiratory syndromes is therefore a high priority.
Because conducting RCTs in this area is so challenging, alternative methods of demonstrating patient outcome efficacy should also be explored. It may be possible to avoid a full-scale RCT by focusing on so-called “critical comparisons” between an existing diagnostic pathway and a new ultrasound pathway, and still allow the impact on patient outcomes to be modelled . To design such studies, ultrasound researchers may need to collaborate closely with diagnostic test methodology experts.
In an era of cost-containment, there is also an increasing need to confirm societal efficacy for chest ultrasound, whether by RCTs, observational data, or robust economic analyses and modelling [134, 149, 150].
Finally, chest ultrasound has potential to change healthcare delivery. For example, the use of handheld ultrasound devices for suspected pleural effusion in a recent study removed the need for any further testing in 95% of cases . Future implementation of such disruptive innovation may reconfigure the fundamental structure of health systems.
We believe advances in all these aspects of test efficacy will underpin the future of bedside chest ultrasound.
We thank Tom Kotsimbos and Matthew Naughton of the Alfred Hospital (Melbourne, Australia) for their invaluable comments on the manuscript.
Conflict of interest: None declared.
Provenance: Submitted article, peer reviewed.
- Received May 8, 2016.
- Accepted July 5, 2016.
- Copyright ©ERS 2016.
ERR articles are open access and distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0.