Introduction

Lung cancer is the most common cause of cancer-related death. In 2000, it accounted for 17 % of total cancer mortality [1]. Despite advances in treatment, the 5-year survival rate is still only 15 % or less, as many lung cancers are found at a relatively late stage [2]. Screening for lung cancer may improve the prognosis. About a decade ago, low-dose computed tomography (CT) was proposed as a promising method to screen for lung cancer. Several cohort studies and randomised clinical trials using low-dose CT were started, aiming to investigate the effect of lung cancer screening on the distribution of tumour stages and eventually on survival [3–5]. Recently, the initial, encouraging results of one of the largest trial cohorts were published: lung cancer screening by low-dose CT reduced lung cancer-specific mortality by 20 % compared with chest radiography [3].

Efficient lung cancer screening depends on the accurate distinction between benign and malignant pulmonary lesions, but it starts with sensitive detection of pulmonary nodules by the observer. Validation data on the detectability of small pulmonary nodules by low-dose CT are scarce [6, 7]. In one previous study using older generation CT equipment with slightly thicker collimation, the sensitivity of the detection of pulmonary nodules was evaluated in a thoracic phantom without pulmonary vessels, in which artificial nodules were placed at known locations. Pulmonary nodules as small as 2.4 mm in diameter could be detected by conventional dose CT [8]. However, no data exist on observer sensitivity with current thin-slice low-dose CT for pulmonary nodules in an anthropomorphic pulmonary background with pulmonary vessels. In addition, limited data are available on the accuracy of volumetry of pulmonary nodules on low-dose CT. Therefore, the aim of this study was to assess the sensitivity of detection of artificial pulmonary nodules on low-dose CT, randomly placed in an anthropomorphic pulmonary background with pulmonary vessels, and to determine the accuracy of volumetry of detected nodules by manual and semi-automated measurements.

Materials and methods

An anthropomorphic thoracic phantom (Lungman, Kyoto Kagaku, Tokyo, Japan) with artificial thoracic wall, heart, mediastinum, diaphragm and lung with pulmonary vessels was used (Fig. 1). The phantom consisted of an accurate life-size anatomical model of a male thorax with soft tissue substitute materials made of polyurethane resin composites and synthetic bones made of epoxy resin, with X-ray absorption rates very close to those of human tissue. The space between the pulmonary vessels in the thoracic cavity consisted of air. In addition, we used 15 artificial spherical pulmonary nodules with a smooth surface in five diameters (3, 5, 8, 10 and 12 mm, corresponding to volumes of 14, 65, 268, 523 and 904 mm³) and three CT densities [-800, -630 and +100 Hounsfield Units (HU)]. The artificial solid nodules (+100 HU) were made of polyurethane resin and the artificial non-solid nodules (-800 and -630 HU) were made of polyurethane foam resin.
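The nominal volumes quoted above follow from the sphere formula V = πd³/6. The short Python sketch below recomputes them as a check; the function name is ours and purely illustrative, and the comparison suggests that the values in the text are truncated to whole cubic millimetres.

```python
import math


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume of a sphere (mm^3) from its diameter (mm): V = pi * d^3 / 6."""
    return math.pi * diameter_mm ** 3 / 6.0


# Nominal nodule diameters used in the phantom (mm).
for d in (3, 5, 8, 10, 12):
    print(f"{d:>2} mm -> {sphere_volume_mm3(d):.1f} mm^3")
```

The same relation underlies the manual volumetry described later, where the volume is derived from a measured diameter.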

Fig. 1

(a, b) The anthropomorphic thoracic phantom and (c) the artificial pulmonary nodules in five different diameters (3, 5, 8, 10 and 12 mm, corresponding to volumes of 14, 65, 268, 523 and 904 mm³) and three different densities (-800, -630 and +100 HU)

A 16-row multi-detector CT (16-MDCT) and a 64-row multi-detector CT (64-MDCT) (Sensation 16 and Sensation 64, respectively, Siemens, Forchheim, Germany) were used. A clinically used low-dose protocol for lung cancer screening was applied for image acquisition [9]: spiral acquisition at 120 kV, 20 mAs, 0.5 s rotation time, pitch 1.5 and a collimation of 16 × 0.75 mm for 16-MDCT and 2 × 32 × 0.6 mm for 64-MDCT. The CT images were reconstructed with a slice thickness of 1 mm and an increment of 0.7 mm using a medium B30f kernel and a field of view of 300 mm.

The artificial nodules were randomly positioned in both artificial lungs. All nodules were attached to pulmonary vessels. None of the nodules were attached to the pleura or positioned sub-pleurally. For each set-up, a random, pre-determined set of six nodules was positioned in the artificial lungs, after which CT examination was performed. Each of the 15 nodules was included in five sets; thus, 13 different sets of nodules were positioned in the phantom (the last set consisted of only 3 nodules). For each phantom nodule set-up, the CT examination was repeated five times, with a small translocation and rotation of the phantom between examinations to simulate participant movement. The thoracic phantom was also examined five times without pulmonary nodules to serve as a control. Thus, per CT technique, 70 examinations were performed, comprising 65 test examinations and 5 control examinations. The thoracic phantom was examined with the same settings for the two CT techniques. Furthermore, all the nodules were examined in air, on the CT table, to confirm their visibility on low-dose CT.
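The examination counts in this paragraph can be checked with a few lines of arithmetic. The Python sketch below is only a bookkeeping aid; the constant names are ours, and it assumes (consistent with the 25 examinations per nodule reported in the Results) that each nodule appeared in five sets and each set was scanned five times.

```python
import math

N_NODULES = 15       # distinct artificial nodules
SETS_PER_NODULE = 5  # number of sets in which each nodule appears
SET_SIZE = 6         # nodules per phantom set-up
REPEATS = 5          # repeated acquisitions per set-up
CONTROLS = 5         # acquisitions without nodules

placements = N_NODULES * SETS_PER_NODULE              # nodule placements in total
n_sets = math.ceil(placements / SET_SIZE)             # phantom set-ups needed
last_set_size = placements - (n_sets - 1) * SET_SIZE  # nodules in the final set
test_exams = n_sets * REPEATS                         # examinations with nodules
total_exams = test_exams + CONTROLS                   # per CT technique
exams_per_nodule = SETS_PER_NODULE * REPEATS          # readings per nodule

print(n_sets, last_set_size, test_exams, total_exams, exams_per_nodule)
```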

The reconstructed data were evaluated on a dedicated workstation (Leonardo, Siemens, Forchheim, Germany) by two independent observers, both with experience in thoracic diagnostic imaging for more than 8 years, who were blinded to information about the presence, properties and location of the artificial pulmonary nodules. All the examinations were read by both observers. The observers were instructed to review the images for the presence of nodules within a clinically relevant time duration (about 2 min per examination). The observers were asked to report whether there were one or more pulmonary nodules present or not. If a potential nodule was observed, the images were compared with the images of the control CT examination without nodules to confirm it was not a false-positive finding caused by pulmonary background structures. Subsequently, the slice with maximal cross-sectional area of the nodule was selected. Then, the diameter and CT density were manually measured. The diameter was measured according to the Response Evaluation Criteria in Solid Tumors (RECIST) [10]. A region of interest was drawn as large as possible within the nodule border to measure the CT density. Because all nodules were spherical, the volume of the nodules could be easily calculated from the measured diameter. Additionally, a dedicated semi-automated software tool (LungCARE, Siemens, Forchheim, Germany) was used to measure the volume of the detected solid nodules (CT density +100 HU). The diameter and volume of identified nodules were automatically calculated by this three-dimensional volumetric assessment software.

Statistics

The sensitivity of detection of artificial pulmonary nodules was calculated for three densities (-800, -630 and +100 HU) and for 16- and 64-row multidetector CT. The inter-observer reliability for both manual and semi-automated measurements was assessed using an intraclass correlation coefficient. The correlation of measurements between 16- and 64-row multidetector CT was expressed as a Pearson’s correlation coefficient. The inter-observer and inter-machine agreement of nodule volumetry was analysed using Bland-Altman plots. Differences in measurements between 16- and 64-row multidetector CT, and between the two observers, were tested with an independent-samples t-test. Differences in diameter and volume between the manual and semi-automated methods were tested with a paired-samples t-test. Differences between the observed values (diameter, volume and density) and the actual values were tested with a one-sample t-test. Results are given as mean ± standard deviation (SD). P < 0.05 was considered statistically significant. All statistical analyses were performed with a software package (SPSS 18.0.3, IBM, New York, USA).
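For readers unfamiliar with Bland-Altman analysis, the following Python sketch shows the usual computation of the bias and the 95 % limits of agreement (bias ± 1.96 SD of the paired differences). It is an illustration under stated assumptions, not the SPSS procedure used in this study: the function name is ours and the example volumes are hypothetical, not study data.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)   # mean paired difference
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical volumes (mm^3) from two observers; NOT data from this study.
observer_1 = [12.0, 60.0, 250.0, 500.0, 880.0]
observer_2 = [13.0, 58.0, 255.0, 495.0, 890.0]

bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias = {bias:.2f} mm^3, limits of agreement = [{lower:.2f}, {upper:.2f}] mm^3")
```

In a Bland-Altman plot, each paired difference is drawn against the mean of the two measurements, with horizontal lines at the bias and at both limits.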

Results

Representative CT images of the anthropomorphic thoracic phantom are shown in Fig. 2. Nodules sized 5 mm in diameter and larger of all CT densities were detected by both observers on all examinations obtained with 16- and 64-MDCT. For nodules sized 3 mm in diameter, solid nodules (CT density +100 HU) were detected on 15 out of 25 examinations (60 %) for both reviewers in 16-MDCT, and 15 (60 %) and 20 (80 %) examinations for the first and the second reviewer respectively in 64-MDCT. However, non-solid nodules with CT density of -630 HU were only detected on 5 examinations out of 25 (20 %) by the second reviewer on 16-MDCT. Non-solid nodules with CT density of -800 HU were not detected on any examination (Fig. 3). Each observed nodule was compared with the control examination to confirm the presence of the nodule; no false-positive nodules were found. All the nodules found were measured successfully using the manual or semi-automated method, except for one nodule with a diameter of 3 mm and a density of +100 HU for which segmentation by the semi-automated method failed. Furthermore, on the CT examinations of the nodules in air (without the phantom), all 15 nodules were visualised on low-dose CT.
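The detection rates above are simple proportions of the 25 examinations per nodule in which the nodule was reported. As an illustration only, the Python sketch below recomputes the percentages for the 3-mm nodules from the counts stated in the text; the function name and the dictionary layout are our own, and combinations not listed explicitly in the text are omitted.

```python
def sensitivity_percent(detected: int, total: int) -> float:
    """Percentage of examinations in which the nodule was detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * detected / total


TOTAL_EXAMS = 25  # examinations per nodule, per scanner

# (density, scanner, observer) -> examinations with detection, as reported
detected_3mm = {
    ("+100 HU", "16-MDCT", 1): 15,
    ("+100 HU", "16-MDCT", 2): 15,
    ("+100 HU", "64-MDCT", 1): 15,
    ("+100 HU", "64-MDCT", 2): 20,
    ("-630 HU", "16-MDCT", 2): 5,
}

for (density, scanner, observer), n in detected_3mm.items():
    pct = sensitivity_percent(n, TOTAL_EXAMS)
    print(f"{density}, {scanner}, observer {observer}: {pct:.0f} %")
```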

Fig. 2

(a) Axial, (b) coronal and (c) sagittal images of the anthropomorphic thoracic phantom with (d) a non-solid pulmonary nodule measured manually and (e) a solid nodule assessed semi-automatically

Fig. 3

Sensitivity of the detection of artificial pulmonary nodules for three densities (-800, -630 and +100 HU) and two CT techniques (16-MDCT and 64-MDCT)

The inter-observer reliability was very good, with an intraclass correlation coefficient of 0.985 (P < 0.001) and 1.000 (P < 0.001) for manual and semi-automated measurements, respectively. The correlation of nodule measurements between 16-MDCT and 64-MDCT was high, with a Pearson’s correlation coefficient of 0.983 (P < 0.001) and 0.999 (P < 0.001) for manual and semi-automated measurements, respectively. The relative inter-observer and inter-machine volumetry differences increased with decreasing nodule size (Figs. 4 and 5). The mean absolute value of the relative inter-observer difference was 11.7 ± 14.4 % and 3.3 ± 6.6 % for manual and semi-automated volumetry, respectively. The mean absolute value of the relative inter-scanner difference was 12.9 ± 12.4 % and 4.0 ± 5.6 %, respectively. There was no significant difference in the diameter measurements or CT density between the two techniques or between the two observers (P > 0.05).

Fig. 4

Bland-Altman plots for inter-observer agreement of the measured volume

Fig. 5

Bland-Altman plots for inter-scanner agreement of the measured volume

In both the manual and the semi-automated method, nodule diameter and volume were significantly underestimated compared with the actual properties (P < 0.01) (Table 1). An increasing underestimation of nodule diameter and volume at smaller nodule size was found (Figs. 6 and 7). In diameter evaluation, the overall underestimation for nodules of any density was 9.2 ± 6.0 % using the manual method. For solid nodules, the underestimation was 10.1 ± 6.9 % using the manual method, compared with 3.7 ± 7.1 % (P < 0.01) using the semi-automated method. In volumetry, the overall underestimation for nodules of any density was 24.1 ± 14.0 % using the manual method. For solid nodules, the underestimation was 26.4 ± 15.5 % using the manual method, compared with 7.6 ± 8.5 % (P < 0.01) using the semi-automated method.

Table 1 Measurements of diameter and volume of artificial pulmonary nodules by the manual and semi-automated methods

Fig. 6

Deviation of the measured diameter from the actual diameter by manual measurement for nodules of -800, -630 and +100 HU, and by semi-automated measurement for nodules of +100 HU

Fig. 7

Deviation of the measured volume from the actual volume by manual volumetry for nodules of -800, -630 and +100 HU, and by semi-automated volumetry for nodules of +100 HU

The mean measured CT density was -813 ± 23 HU, -647 ± 9 HU and 123 ± 61 HU for nodule density of -800 HU, -630 HU and +100 HU, respectively (Table 1), thus deviating 1.7 ± 2.3 %, -2.7 ± 1.5 % and 26 ± 57 % from the expected density (P < 0.01).

Discussion

In one of the first pulmonary nodule validation studies using an anthropomorphic thoracic phantom and current low-dose CT technology, we have shown that a clinically used lung cancer screening protocol with low-dose CT has 100 % sensitivity of detection for spherical pulmonary nodules sized 5 mm in diameter and larger. In addition, we have shown that low-dose CT yields more accurate nodule volume measurements when using a semi-automated method than in the case of the manual method, with negligible underestimation of actual size, especially for small nodules.

We found a sensitivity of 100 % for nodules with a diameter of 5 mm or larger for all three nodule densities in this anthropomorphic thoracic phantom, and 60–80 % and 0–20 % for solid and non-solid nodules with a diameter of 3 mm, respectively. The same phantom was used in another study, which found a sensitivity of 95 % for solid nodules and 74–81 % for non-solid nodules with a diameter of 5 mm or larger [11]. In contrast to the present study, the authors of that study did not use a low-dose acquisition protocol, which limits comparability. In addition, they used a reconstructed slice thickness of 5 mm, compared with 1 mm in this study. It is well known that a larger slice width results in lower sensitivity for pulmonary nodules [6]. This phantom was also used for an image database of nodules of several shapes with diameters larger than 5 mm, but nodule detectability and a comparison between manual and semi-automated measurements were not reported [12]. In some nodule detectability studies, solid nodules with a diameter of 2 to 3 mm were detected in all cases [8, 13, 14]. In these studies, however, the nodules were placed in a known order and examined in a thoracic phantom without lung vessels. In our study, by contrast, the artificial nodules were randomly positioned inside the lungs of an anthropomorphic thoracic phantom, thus limiting detection bias and strengthening the findings. Adjacent tissue can interfere with nodule image reconstruction and reading, especially for non-solid nodules, and this interference increases with decreasing radiation dose; we expect that this explains why some of the 3-mm nodules were not detected.

No false-positive nodules were found compared with the control examinations. That is to say, all nodules detected on low-dose CT were actual nodular lesions. Nevertheless, in clinical practice, pulmonary parenchyma can be distorted and may contain scars and variations, which can erroneously be interpreted as a pulmonary nodule; thus, the specificity in a clinical setting is usually decreased.

We found an increasing underestimation of the nodule volume at smaller nodule diameters, which was also found in some previous studies [15, 16]. However, some other studies reported an increasing overestimation of the nodule volume at smaller nodule diameters [17–22]. For small pulmonary nodules, the transition zone between nodule and pulmonary background caused by the partial volume effect was found to be important for accurate volumetry [23]. Thus, measurement errors should be taken into account when small nodules are measured manually.

In this study, we found that the measured nodule density was significantly different from the expected density. In lung cancer screening by unenhanced CT examination, accurate CT density is important mainly to differentiate between solid and non-solid nodules, and to evaluate increases in density over time in the case of non-solid nodules. However, as we only had one type of solid nodule, and two types of non-solid nodule with a relatively large difference in CT density compared with the solid nodules, no reliable conclusion can be drawn about the potential impact of CT density variation on lung cancer screening results. Future studies with more variation in the density of solid nodules, with CT densities within the clinically relevant range, have been planned to investigate the impact of CT density on nodule volumetry in more detail.

No difference was found between low-dose 16- and 64-row multi-detector CT from the same vendor regarding manual and semi-automated volumetry. However, a previous study found different nodule volumetry outcomes for four 16-row CT systems from different vendors [17]. As the follow-up of screened participants or clinical patients can last for an extensive period, different CT systems from different vendors can be used. A direct comparison of nodule volumes obtained from different CT systems from the same vendor seems valid, at least for the vendor investigated in this study. However, whether similar nodule volumes would have been found for other vendors is unknown.

Clinical implications

Some lung cancer screening projects were mainly based on nodule diameter [3], whereas others were mainly based on nodule volume measurements [9]. In the National Lung Screening Trial (NLST), a positive result indicating suspected lung cancer on low-dose CT was defined as the presence of a nodule with a largest transverse axis of at least 4 mm [4]. In the Dutch-Belgian Randomized Lung Cancer Screening Trial (NELSON), a positive result was defined as either a fast-growing nodule with a volume of at least 50 mm³, i.e., nearly 5 mm in diameter, or a nodule with a volume of at least 500 mm³ [9]. The results of this study support these screening protocols, as all solid and non-solid nodules with a diameter of 5 mm or larger could be detected by observers on low-dose CT against an anthropomorphic pulmonary background.

Pulmonary nodule volumetry is used to guide the diagnostic strategy in the follow-up of lung cancer screening [9]. Repeated CT-derived volumetry of pulmonary nodules can be used to determine the risk of lung cancer and to monitor tumour response in the case of non-surgical therapy. The accuracy and precision of pulmonary nodule volumetry depend on a number of factors, including image acquisition and reconstruction parameters, nodule characteristics, and the performance of algorithms for nodule segmentation and volume estimation [24]. Size measurement needs to be as accurate as possible in order to enable the assessment of nodule growth. A commonly used criterion for pulmonary nodule growth is given by the Response Evaluation Criteria in Solid Tumors (RECIST), which state that lesions in the stable disease category should show neither an increase of 20 % or more nor a decrease of 30 % or more in diameter on subsequent CT examinations [10]. However, because the volume of a sphere scales with the cube of its diameter, a 20 % error in diameter measurement could result in an error in volume of up to 73 % for a spherical nodule, which could result in inaccurate growth rate evaluation. To improve accuracy in growth rate evaluation, semi-automated volumetry is favoured over manual volumetry.
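The 73 % figure follows from the cubic relation between diameter and volume for a sphere: a relative diameter error e gives a relative volume error of (1 + e)³ − 1. A minimal Python sketch of this propagation is given below; the function name is ours and the sketch assumes a perfectly spherical nodule.

```python
def volume_error_percent(diameter_error_percent: float) -> float:
    """Relative volume error (%) of a sphere for a given relative diameter error (%)."""
    scale = 1.0 + diameter_error_percent / 100.0
    return (scale ** 3 - 1.0) * 100.0


# RECIST diameter thresholds translated into volume changes.
print(round(volume_error_percent(20.0), 1))   # +20 % diameter -> about +72.8 % volume
print(round(volume_error_percent(-30.0), 1))  # -30 % diameter -> about -65.7 % volume
```

The same relation is consistent with the Results: a 9.2 % underestimation of diameter corresponds to roughly a 25 % underestimation of volume, close to the 24.1 % reported for manual volumetry.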

Limitations

There are limitations to this study. Firstly, only spherical nodules of five discrete sizes were used. Additional data on the detection of nodules with diameters between 3 and 5 mm are needed in order to determine the sensitivity of current low-dose CT and to optimise diagnostic screening strategies for small nodules. A further extension to our study is the assessment of the sensitivity of low-dose CT for non-spherical and irregularly shaped (lobulated and/or spiculated) nodules. Secondly, we simulated healthy pulmonary tissue. The sensitivity of nodule detection depends on pulmonary structures, and surrounding pathological lesions such as fibrosis, emphysema or consolidation could make pulmonary nodules undetectable or could themselves be erroneously interpreted as pulmonary nodules. We therefore expect that the sensitivity in an in vivo setting may be lower, and that the false-positive rate may be higher, compared with the findings in this study. Thirdly, we used only one clinical low-dose CT screening protocol. Although the sensitivity for pulmonary nodules also depends on the CT protocol, the protocol we used is the most common in current lung cancer screening projects using thin-slice, low-dose CT. Therefore, we expect that this protocol is the most relevant for the sensitivity of pulmonary nodule detection in lung cancer screening. Finally, semi-automated volumetry was not performed for non-solid nodules, because the commercially available volumetry software at the time of the study was intended for solid nodules only. Given the considerable deviation of manually measured volumes from the actual volumes in non-solid nodules, a dedicated software package for semi-automated volume measurement of non-solid nodules is needed.

In conclusion, this anthropomorphic phantom study shows that a lung cancer screening protocol with low-dose CT is highly reliable for the detection of spherical pulmonary nodules of 5 mm in diameter and larger. Low-dose CT yields more accurate nodule volumetry when using a semi-automated software tool than manual measurements, with negligible underestimation of actual size, especially for small nodules. For early lung cancer detection, in which mostly small nodules are found, accurate measurement is especially necessary to enable assessment of volume doubling time. Thus, a semi-automated volume measurement should be used in the setting of lung cancer screening. No difference in the accuracy of volumetry was found between 16- and 64-row multi-detector CT.
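To illustrate why accurate volumetry matters for volume doubling time (VDT), the Python sketch below uses the standard exponential growth relation VDT = Δt · ln 2 / ln(V2/V1). This formula is not described in the present study; the function name and the example volumes are hypothetical and serve only as an illustration.

```python
import math


def volume_doubling_time_days(v1_mm3: float, v2_mm3: float, interval_days: float) -> float:
    """VDT under exponential growth: interval * ln(2) / ln(v2 / v1)."""
    if v1_mm3 <= 0 or v2_mm3 <= 0 or interval_days <= 0:
        raise ValueError("volumes and interval must be positive")
    if v1_mm3 == v2_mm3:
        raise ValueError("no volume change: doubling time is undefined")
    return interval_days * math.log(2.0) / math.log(v2_mm3 / v1_mm3)


# Hypothetical example: a 65 mm^3 nodule that measures 130 mm^3 after 90 days.
true_vdt = volume_doubling_time_days(65.0, 130.0, 90.0)

# Same nodule, but the follow-up volume is underestimated by 25 % (hypothetical).
biased_vdt = volume_doubling_time_days(65.0, 130.0 * 0.75, 90.0)

print(f"VDT without error: {true_vdt:.0f} days")
print(f"VDT with a 25 % underestimate at follow-up: {biased_vdt:.0f} days")
```

In this hypothetical case, an underestimate at a single time point lengthens the apparent doubling time, so a growing nodule would appear to grow more slowly than it does.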