Scaling the Inference of Digital Pathology Deep Learning Models Using CPU-Based High-Performance Computing

dc.contributor.author: Li, Weizhe
dc.contributor.author: Mikailov, Mike
dc.contributor.author: Chen, Weijie
dc.date.accessioned: 2024-04-15T17:15:11Z
dc.date.available: 2024-04-15T17:15:11Z
dc.date.issued: 2023-02-17
dc.description.abstract: Digital pathology whole-slide images (WSIs) are large-size gigapixel images, and image analysis based on deep learning artificial intelligence technology often involves pixelwise testing of a trained deep learning neural network (DLNN) on hundreds of WSIs, which is time-consuming. We take advantage of high-performance computing (HPC) facilities to parallelize this procedure into multiple independent (and hence delightfully parallel) tasks. However, traditional software parallelization techniques and regular file formats can have significant scaling problems on HPC clusters. In this work, a useful computational strategy is designed to localize and extract relevant patches in WSI files and group them in Hierarchical Data Format version 5 (HDF5) files well suited for parallel I/O. HPC's array job facilities are adapted for hierarchical scaling and parallelization of WSI preprocessing and testing of trained algorithms. Applying these techniques to testing a trained DLNN on the CAMELYON datasets with 399 WSIs reduced the theoretical processing time of 18 years on a single central processing unit (CPU) or 30 days on a single graphics processing unit to less than 45 h on an HPC cluster of 4000 CPU cores. The efficiency–accuracy tradeoff we demonstrated on this dataset further reinforced the importance of efficient computation techniques, without which accuracy may be sacrificed. The framework developed here for testing DLNNs does not rely on any specific neural network architecture and HPC cluster setup and can be utilized for any large-scale image processing and big-data analysis.
dc.description.uri: https://ieeexplore.ieee.org/abstract/document/10048520
dc.format.extent: 14 pages
dc.genre: journal articles
dc.identifier: doi:10.13016/m2v2nz-wjcj
dc.identifier.citation: Li, Weizhe, Mike Mikailov, and Weijie Chen. “Scaling the Inference of Digital Pathology Deep Learning Models Using CPU-Based High-Performance Computing.” IEEE Transactions on Artificial Intelligence 4, no. 6 (December 2023): 1691–1704. https://doi.org/10.1109/TAI.2023.3246032.
dc.identifier.uri: https://doi.org/10.1109/TAI.2023.3246032
dc.identifier.uri: http://hdl.handle.net/11603/33041
dc.publisher: IEEE
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: CC BY 4.0 DEED Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Scaling the Inference of Digital Pathology Deep Learning Models Using CPU-Based High-Performance Computing
dc.type: Text
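The patch-extraction and HDF5-grouping step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: a NumPy array stands in for a decoded WSI region (real pipelines would read slides with a library such as OpenSlide), and a simple mean-intensity threshold stands in for tissue localization. The function name, threshold value, and file names are all illustrative assumptions.

```python
import numpy as np
import h5py

def extract_patches(slide, patch_size=256, bg_threshold=220):
    """Tile the slide and keep patches that appear to contain tissue.

    WSI background is near-white, so a patch whose mean intensity is
    below bg_threshold is assumed to contain tissue.
    """
    h, w = slide.shape[:2]
    patches, coords = [], []
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patch = slide[y:y + patch_size, x:x + patch_size]
            if patch.mean() < bg_threshold:
                patches.append(patch)
                coords.append((y, x))
    return np.asarray(patches), np.asarray(coords)

# Synthetic stand-in for a decoded WSI region: white background with
# a darker "tissue" block in the upper-left quadrant.
slide = np.full((1024, 1024, 3), 255, dtype=np.uint8)
slide[:512, :512] = 120

patches, coords = extract_patches(slide)

# Group the tissue patches in one HDF5 file; chunking one patch per
# chunk keeps per-patch reads by many parallel workers efficient.
with h5py.File("patches.h5", "w") as f:
    f.create_dataset("patches", data=patches,
                     chunks=(1, 256, 256, 3), compression="gzip")
    f.create_dataset("coords", data=coords)
```

Storing only the tissue-bearing patches, rather than the full gigapixel slide, is what makes the downstream inference tasks both small and independent.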
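The abstract's use of HPC array jobs to fan work out across a cluster might look something like the submission script below. This is a hedged sketch assuming a Slurm scheduler (SGE and other schedulers offer equivalent array-job facilities); the script names, paths, and resource figures are hypothetical, not taken from the paper.

```shell
#!/bin/bash
#SBATCH --array=1-399          # one array task per WSI (399 slides in CAMELYON)
#SBATCH --cpus-per-task=8      # illustrative resource request
#SBATCH --mem=16G

# wsi_files.txt: hypothetical list of slide paths, one per line.
SLIDE=$(sed -n "${SLURM_ARRAY_TASK_ID}p" wsi_files.txt)

# Stage 1: extract tissue patches for this slide into an HDF5 file.
python extract_patches.py --input "$SLIDE" \
    --output "patches/${SLURM_ARRAY_TASK_ID}.h5"

# Stage 2: run the trained model over the patch file.
python run_inference.py --patches "patches/${SLURM_ARRAY_TASK_ID}.h5" \
    --output "preds/${SLURM_ARRAY_TASK_ID}.npy"
```

Because each task touches only its own HDF5 file, the 399 tasks are fully independent and the scheduler can pack them across thousands of CPU cores.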

Files

Original bundle

Name: Scaling_the_Inference_of_Digital_Pathology_Deep_Learning_Models_Using_CPU-Based_High-Performance_Computing.pdf
Size: 7.54 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Description: Item-specific license agreed upon to submission