FedPseudo: Privacy-Preserving Pseudo Value-Based Deep Learning Models for Federated Survival Analysis

Date

2023-08-04

Citation of Original Publication

Rahman, Md Mahmudur, and Sanjay Purushotham. “FedPseudo: Privacy-Preserving Pseudo Value-Based Deep Learning Models for Federated Survival Analysis.” In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1999–2009. KDD ’23. New York, NY, USA: Association for Computing Machinery, 2023. https://doi.org/10.1145/3580305.3599348.

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Survival analysis, also known as time-to-event analysis, has a wide-ranging impact on patient care. Federated Survival Analysis (FSA) is an emerging Federated Learning (FL) paradigm for performing survival analysis on decentralized data distributed across multiple medical institutions. FSA enables individual medical institutions, referred to as clients, to improve their survival predictions while ensuring privacy. However, FSA faces challenges due to non-linear and non-IID data distributions among clients, as well as bias caused by censoring. Although recent studies have adapted Cox Proportional Hazards (CoxPH) survival models for FSA, a systematic exploration of these challenges is currently lacking. In this paper, we address these critical challenges by introducing FedPseudo, a pseudo value-based deep learning framework for FSA. FedPseudo uses deep learning models to learn robust representations from non-linear survival data, leverages pseudo values to handle non-uniform censoring, and employs FL algorithms such as FedAvg to learn model parameters. We propose a novel and simple approach for estimating pseudo values for FSA. We provide a theoretical proof that the estimated pseudo values, referred to as Federated Pseudo Values, are consistent. Moreover, our empirical results demonstrate that they can be computed faster than traditional methods of deriving pseudo values. To ensure and enhance the privacy of both the estimated pseudo values and the shared model parameters, we systematically investigate the application of differential privacy (DP) to both the federated pseudo values and local model updates. Furthermore, we adapt the V-usable information metric to quantify the informativeness of a client's data for training a survival model and use this metric to show the advantages of participating in FSA.
We conducted extensive experiments on synthetic and real-world survival datasets to demonstrate that our FedPseudo framework achieves better performance than other FSA approaches and performs similarly to the best centrally trained deep survival model. Moreover, FedPseudo consistently achieves superior results across different censoring settings.
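
For readers unfamiliar with pseudo values, the sketch below illustrates the classical, centralized jackknife construction on top of a Kaplan-Meier estimate, which is presumably the kind of "traditional method" the abstract contrasts against. It is not the paper's federated estimator, which the abstract does not specify. The function names `kaplan_meier` and `jackknife_pseudo_values` are hypothetical and chosen for illustration only.

```python
import numpy as np


def kaplan_meier(times, events, eval_times):
    """Kaplan-Meier survival estimate S(t) at each time in eval_times."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    eval_times = np.asarray(eval_times, dtype=float)
    surv = np.ones(len(eval_times))
    # Walk through the distinct observed event times in increasing order.
    for t in np.unique(times[events]):
        at_risk = np.sum(times >= t)
        died = np.sum((times == t) & events)
        # The factor applies to every evaluation time at or after t.
        surv[eval_times >= t] *= 1.0 - died / at_risk
    return surv


def jackknife_pseudo_values(times, events, eval_times):
    """Leave-one-out pseudo values: n * S(t) - (n - 1) * S^{(-i)}(t).

    Illustrative, centralized version (assumption): every subject's data
    is available in one place, unlike in the federated setting.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n = len(times)
    full = kaplan_meier(times, events, eval_times)
    pseudo = np.empty((n, len(eval_times)))
    for i in range(n):
        keep = np.arange(n) != i  # drop subject i
        loo = kaplan_meier(times[keep], events[keep], eval_times)
        pseudo[i] = n * full - (n - 1) * loo
    return pseudo
```

Calling `jackknife_pseudo_values(times, events, eval_times)` returns one row per subject and one column per evaluation time. When no observation is censored, each pseudo value reduces to the indicator that the subject survived past the evaluation time; under censoring the values can fall outside [0, 1], and they can then serve as regression targets for censored and uncensored subjects alike. The leave-one-out loop recomputes the estimator once per subject, which is the cost that makes this construction slow on large datasets.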