Elsevier

Academic Radiology

Volume 13, Issue 10, October 2006, Pages 1254-1265

Original investigation
Evaluation of Lung MDCT Nodule Annotation Across Radiologists and Methods1

https://doi.org/10.1016/j.acra.2006.07.012

Rationale and Objectives

Integral to the mission of the National Institutes of Health–sponsored Lung Imaging Database Consortium is the accurate definition of the spatial location of pulmonary nodules. Because the majority of small lung nodules are not resected, a reference standard from histopathology is generally unavailable. It is therefore necessary to assess the sources of variability when expert radiologists, using different software tools, define the spatial location of lung nodules as an alternative form of truth.

Materials and Methods

The relative differences in performance of six radiologists each applying three annotation methods to the task of defining the spatial extent of 23 different lung nodules were evaluated. The variability of radiologists’ spatial definitions for a nodule was measured using both volumes and probability maps (p-map). Results were analyzed using a linear mixed-effects model that included nested random effects.
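The two metrics named above can be illustrated with a minimal sketch. It assumes that a p-map is the per-voxel fraction of annotations that include the voxel, and that a volume is the count of included voxels times the voxel volume; the abstract does not spell out either definition, so both are assumptions. The function names, the toy masks (a single 2D slice for brevity), and the voxel volume of 0.5 mm³ are hypothetical.

```python
from typing import List

# One 2D slice per annotation; 1 marks a voxel inside the drawn boundary.
Mask = List[List[int]]


def probability_map(masks: List[Mask]) -> List[List[float]]:
    """Per-voxel fraction of annotations that include the voxel.

    Assumed definition of a p-map; not taken from the paper.
    """
    if not masks:
        raise ValueError("need at least one mask")
    rows, cols = len(masks[0]), len(masks[0][0])
    n = len(masks)
    return [
        [sum(m[r][c] for m in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


def volume_mm3(mask: Mask, voxel_volume_mm3: float) -> float:
    """Volume as the number of included voxels times the voxel volume."""
    return sum(sum(row) for row in mask) * voxel_volume_mm3


# Hypothetical toy data: four annotations of the same 2 x 3 slice.
masks = [
    [[0, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
    [[1, 1, 1], [0, 1, 0]],
    [[0, 1, 1], [0, 0, 0]],
]

pmap = probability_map(masks)
volumes = [volume_mm3(m, voxel_volume_mm3=0.5) for m in masks]
```

In this toy case the p-map keeps the spatial pattern of disagreement, whereas two annotations with the same volume (the second and fourth masks) can still cover different voxels.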

Results

Across the combination of all nodules, volume and p-map model parameters were found to be significant at P < .05 for all methods, all radiologists, and all second-order interactions except one. The radiologist and methods variables accounted for 15% and 3.5% of the total p-map variance, respectively, and 40.4% and 31.1% of the total volume variance, respectively.
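As a quick arithmetic check on the reported variance shares, the short sketch below computes the ratio of the radiologist share to the method share for each metric. The percentages are those quoted above; the dictionary and function names are hypothetical.

```python
# Variance shares (percent of total variance) as reported in the Results.
pmap_share = {"radiologist": 15.0, "method": 3.5}
volume_share = {"radiologist": 40.4, "method": 31.1}


def radiologist_to_method_ratio(share: dict) -> float:
    """Ratio of the radiologist variance share to the method variance share."""
    return share["radiologist"] / share["method"]


pmap_ratio = radiologist_to_method_ratio(pmap_share)
volume_ratio = radiologist_to_method_ratio(volume_share)

print(f"p-map ratio:  {pmap_ratio:.2f}")
print(f"volume ratio: {volume_ratio:.2f}")
```

The ratios come out at about 4.29 for the p-map analysis and about 1.30 for the volume analysis, so the radiologist term exceeds the method term under both metrics, far more markedly for p-maps.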

Conclusion

Radiologists are a larger source of variance than the drawing tools, regardless of the metric used. Although the random noise component is larger for the p-map analysis than for volume estimation, the p-map analysis appears to have more power to detect differences in radiologist-method combinations. The standard deviation of the volume measurement task appears to be proportional to nodule volume.

Section snippets

Materials and methods

The annotations of the radiologists were evaluated in two generations of drawing training sessions before the final drawing experiment was performed. In the initial session, example slices of several nodules were sent to the participating radiologist at each of the five LIDC institutions in Microsoft PowerPoint (Redmond, WA) slides. Using PowerPoint, each radiologist was requested to draw the boundaries of the nodule as seen in the slice. The spectrum of nodules varied from complex and

Results

For the parameter estimates of the nodule p-map model in Table 1, note that only one term (ie, the interaction term between radiologist 6 and method 2) is not significantly different from zero at P < .05. Further analysis of the sum of squares attributed to each variable category leads to the following summary of the model’s resolution of signal and noise shown in Table 3 across all nodules. Also note from Table 3 that the radiologists’ term accounts for four times the variance compared with

P-maps

All of the coefficients of the linear mixed-effects model shown in Table 1 derived from p-map values are statistically different from zero at P < .05 except one interaction term. Although the modeled variance for radiologists was more than four times that of methods, by far the largest variance, almost 60% and four times larger than that of the radiologists, was due to random error. The magnitude of the residual error for the p-map analysis accentuates the point that segmentation is


1 Funded in part by the National Institutes of Health, National Cancer Institute, Cancer Imaging Program by the following grants: 1U01 CA 091085, 1U01 CA 091090, 1U01 CA 091099, 1U01 CA 091100, and 1U01 CA 091103.
