Active Semi-Supervised Learning via Bayesian Experimental Design for Lung Cancer Classification Using Low Dose Computed Tomography Scans
Date
2023-03-15
Citation of Original Publication
Nguyen, Phuong, Ankita Rathod, David Chapman, Smriti Prathapan, Sumeet Menon, Michael Morris, and Yelena Yesha. 2023. "Active Semi-Supervised Learning via Bayesian Experimental Design for Lung Cancer Classification Using Low Dose Computed Tomography Scans." Applied Sciences 13, no. 6: 3752. https://doi.org/10.3390/app13063752
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Attribution 4.0 International (CC BY 4.0)
Abstract
We introduce an active, semi-supervised algorithm that uses Bayesian experimental design to address the shortage of annotated images required to train and validate Artificial Intelligence (AI) models for lung cancer screening with computed tomography (CT) scans. Our approach combines active learning with semi-supervised expectation maximization to emulate a human in the loop who supplies additional ground-truth labels, which are used to train, evaluate, and update the neural network models. Bayesian experimental design is used to identify which unlabeled samples need ground-truth labels in order to improve the model's performance. We evaluate the proposed Active Semi-supervised Expectation Maximization for Computer-aided diagnosis (CAD) tasks (ASEM-CAD) on lung cancer classification using three public CT scan datasets: the National Lung Screening Trial (NLST), the Lung Image Database Consortium (LIDC), and the Kaggle Data Science Bowl 2017. ASEM-CAD classifies suspicious lung nodules and lung cancer cases with an area under the curve (AUC) of 0.94 (Kaggle), 0.95 (NLST), and 0.88 (LIDC) while using significantly fewer labeled images than a fully supervised model. This study addresses one of the significant challenges in early lung cancer screening with low-dose computed tomography (LDCT) scans and contributes to the development and validation of deep learning algorithms for lung cancer screening and other diagnostic radiology examinations.
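
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which acquisition criterion ASEM-CAD uses, so the code below assumes predictive entropy as a simple stand-in for the Bayesian experimental design score. All function names and the example probabilities are hypothetical and are not taken from the paper. The most uncertain scans are sent to the annotator, and the remaining scans keep the model's soft predictions as pseudo-labels, in the spirit of an expectation step.

```python
import numpy as np

def predictive_entropy(p_cancer):
    """Binary entropy (in nats) of each predicted cancer probability."""
    p = np.clip(np.asarray(p_cancer, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def select_for_annotation(p_cancer, budget):
    """Indices of the `budget` most uncertain unlabeled samples.

    Assumption: entropy stands in for the paper's acquisition score.
    """
    scores = predictive_entropy(p_cancer)
    order = np.argsort(-scores, kind="stable")  # highest entropy first
    return order[:budget]

def pseudo_labels(p_cancer, queried):
    """Soft labels for the samples that were NOT sent to the annotator."""
    p = np.asarray(p_cancer, dtype=float)
    keep = np.ones(p.shape[0], dtype=bool)
    keep[queried] = False
    return np.flatnonzero(keep), p[keep]

# Hypothetical model outputs for five unlabeled scans.
p_unlabeled = np.array([0.02, 0.48, 0.91, 0.55, 0.10])

# Ask the (simulated) annotator for the two most ambiguous scans.
queried = select_for_annotation(p_unlabeled, budget=2)

# Everything else keeps its soft prediction as a pseudo-label.
rest_idx, soft = pseudo_labels(p_unlabeled, queried)
```

In a full loop, the queried scans would receive ground-truth labels, the model would be retrained on the labeled and pseudo-labeled sets together, and the selection would repeat until the labeling budget is spent.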