Speaker
Description
In the healthcare domain, sensitive patient data is inherently decentralized across institutions and cannot be centralized due to strict privacy regulations. Federated learning enables collaborative model training without sharing raw patient data, instead communicating model parameters or soft labels. These approaches, however, remain vulnerable to privacy leakage and typically restrict the choice of models to those trainable by gradient-based methods. We present a novel federated co-training (FEDCT) method that substantially enhances privacy by exchanging only hard (definitive) labels on a shared public unlabeled dataset. Participating healthcare institutions collaboratively generate consensus labels, which are used as pseudo-labels to train local models. This approach not only empirically improves resistance to membership inference attacks but also supports a wider range of models, including interpretable and non-differentiable algorithms such as decision trees and ensemble methods. FEDCT is particularly well suited to healthcare use cases such as distributed radiology analysis or clinical data modeling across several hospitals, where model interpretability, privacy, and communication efficiency are paramount. Our theoretical analysis and empirical evaluations on both general and medical datasets demonstrate that FEDCT achieves high model performance with significantly improved privacy, facilitating secure and practical federated learning in sensitive healthcare environments.
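To make the exchange concrete, the sketch below simulates one FEDCT-style round under simplifying assumptions: two hypothetical clients, toy one-dimensional private data, and a hand-rolled nearest-centroid classifier standing in for an arbitrary non-differentiable local model. Each client trains locally, predicts hard labels on a shared public unlabeled set, a majority vote forms the consensus pseudo-labels, and clients retrain on private plus pseudo-labeled data; only the hard labels ever leave a client. All names and data here are illustrative, not the authors' implementation.

```python
import random
from collections import Counter

def consensus(client_labels):
    """Majority vote over each client's hard labels on the shared public set."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*client_labels)]

class NearestCentroid:
    """Toy non-differentiable local model: classify by the nearest class centroid."""
    def fit(self, xs, ys):
        self.centroids = {c: sum(x for x, y in zip(xs, ys) if y == c)
                             / sum(1 for y in ys if y == c)
                          for c in set(ys)}
        return self

    def predict(self, xs):
        return [min(self.centroids, key=lambda c: abs(x - self.centroids[c]))
                for x in xs]

# Hypothetical 1-D data: class 0 clustered near 0.0, class 1 near 1.0.
random.seed(0)
per_class = {c: [random.gauss(c, 0.1) for _ in range(20)] for c in (0, 1)}
# Two clients, each holding a private mix of both classes.
clients = [(per_class[0][i::2] + per_class[1][i::2], [0] * 10 + [1] * 10)
           for i in range(2)]
# Shared public dataset: unlabeled points from both clusters.
public = [random.gauss(c, 0.1) for c in (0, 1) for _ in range(10)]

# One FEDCT-style round: local training -> hard labels on the public set
# -> majority-vote consensus -> retrain with pseudo-labels.
models = [NearestCentroid().fit(xs, ys) for xs, ys in clients]
votes = [m.predict(public) for m in models]       # only hard labels are shared
pseudo = consensus(votes)                         # consensus pseudo-labels
models = [NearestCentroid().fit(xs + public, ys + pseudo) for xs, ys in clients]
```

Because only definitive class labels on public data cross institutional boundaries, no gradients or soft scores that could leak private training examples are exchanged, and each institution is free to use any model class that can be fit to labeled data.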