Scaling the Inference of Digital Pathology Deep Learning Models Using CPU-Based High-Performance Computing
Author/Creator
Li, Weizhe; Mikailov, Mike; Chen, Weijie
Date
2023-02-17
Citation of Original Publication
Li, Weizhe, Mike Mikailov, and Weijie Chen. “Scaling the Inference of Digital Pathology Deep Learning Models Using CPU-Based High-Performance Computing.” IEEE Transactions on Artificial Intelligence 4, no. 6 (December 2023): 1691–1704. https://doi.org/10.1109/TAI.2023.3246032.
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
CC BY 4.0 DEED Attribution 4.0 International
Abstract
Digital pathology whole-slide images (WSIs) are gigapixel images, and deep-learning-based image analysis often involves pixelwise testing of a trained deep learning neural network (DLNN) on hundreds of WSIs, which is time-consuming. We take advantage of high-performance computing (HPC) facilities to parallelize this procedure into multiple independent (and hence delightfully parallel) tasks. However, traditional software parallelization techniques and regular file formats can have significant scaling problems on HPC clusters. In this work, a computational strategy is designed to localize and extract relevant patches from WSI files and group them in Hierarchical Data Format version 5 (HDF5) files well suited for parallel I/O. HPC array-job facilities are adapted for hierarchical scaling and parallelization of WSI preprocessing and testing of trained algorithms. Applying these techniques to testing a trained DLNN on the CAMELYON datasets with 399 WSIs reduced the theoretical processing time from 18 years on a single central processing unit (CPU), or 30 days on a single graphics processing unit, to less than 45 h on an HPC cluster of 4000 CPU cores. The efficiency–accuracy tradeoff we demonstrated on this dataset further reinforces the importance of efficient computation techniques, without which accuracy may be sacrificed. The framework developed here for testing DLNNs does not rely on any specific neural network architecture or HPC cluster setup and can be utilized for any large-scale image processing and big-data analysis task.
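The strategy the abstract describes — tiling a WSI into patches, grouping each slide's patches into one HDF5 file, and letting each array-job task read only its own slice — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slide here is a small synthetic array (real code would read a gigapixel WSI with a library such as OpenSlide), the patch size and tissue filtering are omitted, and the file name and task indices are made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

PATCH = 256  # illustrative patch side length


def extract_patches(slide, patch=PATCH):
    """Tile a slide array into non-overlapping square patches.

    A real pipeline would also localize tissue regions and skip
    background patches; that filtering is omitted here.
    """
    h, w = slide.shape[:2]
    patches = [
        slide[y : y + patch, x : x + patch]
        for y in range(0, h - patch + 1, patch)
        for x in range(0, w - patch + 1, patch)
    ]
    return np.stack(patches)


# Synthetic stand-in for one slide (a real WSI is gigapixel-sized).
slide = np.random.randint(0, 255, (1024, 1024, 3), dtype=np.uint8)
patches = extract_patches(slide)  # shape: (16, 256, 256, 3)

# Group all patches of one slide into a single HDF5 file. Chunking by
# patch means a reader can fetch any patch slice without touching the
# rest of the file, which is what makes parallel I/O efficient.
path = os.path.join(tempfile.mkdtemp(), "slide_0001.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("patches", data=patches, chunks=(1, PATCH, PATCH, 3))

# Each HPC array-job task (indexed e.g. by SGE_TASK_ID on a Grid Engine
# cluster) would open the file read-only and process one contiguous slice.
task_id, n_tasks = 3, 4  # hypothetical: task 3 of 4
with h5py.File(path, "r") as f:
    n = f["patches"].shape[0]
    lo, hi = task_id * n // n_tasks, (task_id + 1) * n // n_tasks
    batch = f["patches"][lo:hi]  # only this task's patches are read
```

Because the tasks share no state, the same slicing scheme scales from a handful of cores to thousands: the scheduler simply launches one task per (slide, slice) pair.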