Active semi-supervised expectation maximization learning for lung cancer detection from Computerized Tomography (CT) images with minimally label training data

Date

2020-03-16

Citation of Original Publication

Nguyen, Phuong; Chapman, David; Menon, Sumeet; Morris, Michael; Yesha, Yelena; Active semi-supervised expectation maximization learning for lung cancer detection from Computerized Tomography (CT) images with minimally label training data; SPIE Medical Imaging (2020); https://www.spiedigitallibrary.org/conference-proceedings-of-spie/11314/113142E/Active-semi-supervised-expectation-maximization-learning-for-lung-cancer-detection/10.1117/12.2549655.short?SSO=1

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
©2020 Society of Photo-Optical Instrumentation Engineers (SPIE). One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited.

Abstract

Artificial intelligence (AI) has great potential in medical imaging to augment the clinician as a virtual radiology assistant (vRA) by enriching information and providing clinical decision support. Deep learning is a type of AI that has shown promising performance on Computer-Aided Diagnosis (CAD) tasks. A current barrier to implementing deep learning for clinical CAD tasks in radiology is that it requires a training set that is representative and as large as possible in order to generalize appropriately and achieve high-accuracy predictions. Despite the abundance of diagnostic imaging examinations performed in routine clinical practice, there is a lack of available, reliable, discretized, and annotated labels for computer vision research in radiology. Furthermore, the process of creating reliable labels is tedious and time-consuming, and it requires expertise in clinical radiology. We present an Active Semi-supervised Expectation Maximization (ASEM) learning model for training a Convolutional Neural Network (CNN) for lung cancer screening using Computed Tomography (CT) imaging examinations. Our learning model is novel in that it combines semi-supervised learning via the Expectation-Maximization (EM) algorithm with active learning via Bayesian experimental design for use with 3D CNNs for lung cancer screening. ASEM simultaneously infers image labels as a latent variable and predicts which images, if additionally labeled, are likely to improve classification accuracy. The performance of this model has been evaluated using three publicly available chest CT datasets: Kaggle2017, NLST, and LIDC-IDRI. Our experiments showed that ASEM-CAD can identify suspicious lung nodules and detect lung cancer cases with an accuracy of 92% (Kaggle2017), 93% (NLST), and 73% (LIDC-IDRI) and an Area Under Curve (AUC) of 0.94 (Kaggle2017), 0.88 (NLST), and 0.81 (LIDC-IDRI). These results are comparable to those of fully supervised training while using only slightly more than 50% of the training data labels.
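
The loop the abstract describes alternates between inferring labels for unlabeled images and choosing which images to send for expert labeling. It can be illustrated with a minimal sketch. This is not the authors' implementation, and every name and parameter in it (`asem_sketch`, `fit_logistic`, `unlabeled_weight`, `query_size`) is hypothetical. It makes three simplifying assumptions: a NumPy logistic regression on synthetic 2D points stands in for the 3D CNN on CT volumes, soft pseudo-labels from the current model stand in for the latent-label E-step, and predictive entropy stands in for the Bayesian experimental design criterion.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, targets, weights, w=None, lr=0.1, steps=300):
    """M-step stand-in: weighted gradient descent on cross-entropy with soft targets."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (weights * (p - targets)) / weights.sum()
        w = w - lr * grad
    return w

def entropy(p):
    """Binary predictive entropy; largest where the model is least certain."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def asem_sketch(X, y_oracle, labeled_idx, rounds=5, em_iters=5,
                query_size=5, unlabeled_weight=0.5):
    """EM-style self-training with entropy-based label queries (illustrative only).

    y_oracle simulates the expert: its entries are only used for samples
    whose `labeled` flag is True.
    """
    n = X.shape[0]
    labeled = np.zeros(n, dtype=bool)
    labeled[labeled_idx] = True
    # Initial supervised fit on the small labeled seed set.
    w = fit_logistic(X[labeled], y_oracle[labeled].astype(float),
                     np.ones(labeled.sum()))
    for _ in range(rounds):
        for _ in range(em_iters):
            # E-step: soft labels for unlabeled samples from the current model.
            q = sigmoid(X @ w)
            targets = np.where(labeled, y_oracle, q)
            weights = np.where(labeled, 1.0, unlabeled_weight)
            # M-step: refit on true labels plus down-weighted soft labels.
            w = fit_logistic(X, targets, weights, w=w)
        # Active step: query the most uncertain unlabeled samples.
        score = np.where(labeled, -np.inf, entropy(sigmoid(X @ w)))
        query = np.argsort(score)[-query_size:]
        labeled[query] = True
    return w, labeled

# Synthetic two-class data (assumption: Gaussian blobs, plus a bias column).
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) + 3.0 * y[:, None]
X = np.hstack([X, np.ones((n, 1))])

# Seed set: five labeled samples per class.
seed = np.concatenate([np.flatnonzero(y == 0)[:5], np.flatnonzero(y == 1)[:5]])
w, labeled = asem_sketch(X, y, seed)
acc = np.mean((sigmoid(X @ w) > 0.5) == y)
print(f"labels used: {labeled.sum()} of {n}, accuracy: {acc:.2f}")
```

In this toy setting only the seed set and the queried samples ever expose their true labels to the model, which mirrors the abstract's goal of approaching fully supervised accuracy with a fraction of the labels. The accuracy printed here says nothing about the reported CT results.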