Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer

Cited by: 2136
Authors
Bejnordi, Babak Ehteshami [1]
Veta, Mitko [2]
van Diest, Paul Johannes [3]
van Ginneken, Bram [1]
Karssemeijer, Nico [1]
Litjens, Geert [4]
van der Laak, Jeroen A. W. M. [4]
Affiliations
[1] Radboud Univ Nijmegen, Med Ctr, Dept Radiol & Nucl Med, Diagnost Image Anal Grp, Nijmegen, Netherlands
[2] Eindhoven Univ Technol, Med Image Anal Grp, Eindhoven, Netherlands
[3] Univ Med Ctr Utrecht, Dept Pathol, Utrecht, Netherlands
[4] Radboud Univ Nijmegen, Med Ctr, Dept Pathol, Nijmegen, Netherlands
Source
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION | 2017, Vol. 318, No. 22
Keywords
DIGITAL PATHOLOGY; CLASSIFICATION; MULTIREADER; FEATURES; SCALE; ROC;
DOI
10.1001/jama.2017.14585
Chinese Library Classification
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
IMPORTANCE Application of deep learning algorithms to whole-slide pathology images can potentially improve diagnostic accuracy and efficiency.
OBJECTIVE Assess the performance of automated deep learning algorithms at detecting metastases in hematoxylin and eosin-stained tissue sections of lymph nodes of women with breast cancer and compare it with pathologists' diagnoses in a diagnostic setting.
DESIGN, SETTING, AND PARTICIPANTS Researcher challenge competition (CAMELYON16) to develop automated solutions for detecting lymph node metastases (November 2015-November 2016). A training data set of whole-slide images from 2 centers in the Netherlands with (n = 110) and without (n = 160) nodal metastases verified by immunohistochemical staining were provided to challenge participants to build algorithms. Algorithm performance was evaluated in an independent test set of 129 whole-slide images (49 with and 80 without metastases). The same test set of corresponding glass slides was also evaluated by a panel of 11 pathologists with time constraint (WTC) from the Netherlands to ascertain likelihood of nodal metastases for each slide in a flexible 2-hour session, simulating routine pathology workflow, and by 1 pathologist without time constraint (WOTC).
EXPOSURES Deep learning algorithms submitted as part of a challenge competition or pathologist interpretation.
MAIN OUTCOMES AND MEASURES The presence of specific metastatic foci and the absence vs presence of lymph node metastasis in a slide or image using receiver operating characteristic curve analysis. The 11 pathologists participating in the simulation exercise rated their diagnostic confidence as definitely normal, probably normal, equivocal, probably tumor, or definitely tumor.
RESULTS The area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.556 to 0.994. The top-performing algorithm achieved a lesion-level, true-positive fraction comparable with that of the pathologist WOTC (72.4% [95% CI, 64.3%-80.4%]) at a mean of 0.0125 false-positives per normal whole-slide image. For the whole-slide image classification task, the best algorithm (AUC, 0.994 [95% CI, 0.983-0.999]) performed significantly better than the pathologists WTC in a diagnostic simulation (mean AUC, 0.810 [range, 0.738-0.884]; P<.001). The top 5 algorithms had a mean AUC that was comparable with the pathologist interpreting the slides in the absence of time constraints (mean AUC, 0.960 [range, 0.923-0.994] for the top 5 algorithms vs 0.966 [95% CI, 0.927-0.998] for the pathologist WOTC).
CONCLUSIONS AND RELEVANCE In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints. Whether this approach has clinical utility will require evaluation in a clinical setting.
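Illustrative note: the abstract reports slide-level AUCs derived from five-level confidence ratings. The sketch below shows one common way such an AUC can be computed, as the probability that a randomly chosen metastasis-positive slide receives a higher rating than a randomly chosen negative slide, with ties counted as one half. It is a minimal sketch and not the study's analysis code; the function name `auc_from_ratings`, the 1-5 numeric mapping, and the example ratings are all assumptions made for illustration.

```python
from itertools import product

# Five-point confidence scale named in the abstract. The mapping to the
# integers 1-5 is an assumption; only the ordering matters for the AUC.
RATING_SCALE = {
    "definitely normal": 1,
    "probably normal": 2,
    "equivocal": 3,
    "probably tumor": 4,
    "definitely tumor": 5,
}


def auc_from_ratings(positive_ratings, negative_ratings):
    """Estimate AUC from ordinal ratings (hypothetical helper).

    Compares every positive slide with every negative slide: a higher
    rating on the positive slide scores 1, a tie scores 0.5, otherwise 0.
    The mean score equals the empirical area under the ROC curve.
    """
    pos = [RATING_SCALE[r] for r in positive_ratings]
    neg = [RATING_SCALE[r] for r in negative_ratings]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    score = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return score / (len(pos) * len(neg))


# Made-up ratings for three positive and four negative slides.
example_positive = ["definitely tumor", "probably tumor", "equivocal"]
example_negative = [
    "definitely normal",
    "probably normal",
    "equivocal",
    "probably tumor",
]
example_auc = auc_from_ratings(example_positive, example_negative)
print(f"AUC = {example_auc:.3f}")
```

With a coarse five-level scale, ties between positive and negative slides are frequent, which is why the half-credit rule for ties matters in this kind of estimate.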
Pages: 2199-2210
Page count: 12