Prediction of Drug-Induced Autoimmunity Using X Gradient Boost Machine Learning

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Drug-induced autoimmunity (DIA) comprises immune-mediated adverse events, such as lupus, hepatitis, and uveitis, that can arise after extended drug exposure, which complicates prospective risk assessment. We built a gradient-boosted tree (XGBoost) classifier using 196 RDKit-derived molecular descriptors for 477 compounds[1] and addressed class imbalance with SMOTE. On a held-out test set, the model achieved a ROC-AUC of 0.888, with 66.7% recall and 57.1% precision for the positive class; five-fold cross-validation indicated strong generalization (ROC-AUC 0.974 ± 0.067). Gain-based feature importance highlighted topological complexity, aromaticity, and polarity-related descriptors as the most salient. The framework enables rapid, cost-effective screening for autoimmune risk during early discovery, so that compounds can be prioritized for deeper evaluation.
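
As an illustration of the class-imbalance step named in the abstract, the following is a minimal, standard-library sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. The abstract does not say which SMOTE implementation or settings were used, so the function name `smote_oversample`, the parameters `k` and `seed`, and the two-dimensional toy vectors are all assumptions made for this example; they are not the study's code or data.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration only; not the implementation
    used in the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate: base + gap * (neighbour - base), with gap in [0, 1).
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy descriptor vectors (assumed values, not data from the study).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_oversample(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real minority samples, it always lies within the range spanned by the minority class in each descriptor. In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points derived from test compounds do not leak into model fitting.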