TY - JOUR
T1 - Labels as a feature: Network homophily for systematically annotating human GPCR drug-target interactions
AU - Hansson, Frederik G.
AU - Madsen, Niklas Gesmar
AU - Hansen, Lea G.
AU - Jakočiūnas, Tadas
AU - Lengger, Bettina
AU - Keasling, Jay D.
AU - Jensen, Michael K.
AU - Acevedo-Rocha, Carlos G.
AU - Jensen, Emil D.
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025
Y1 - 2025
AB - Machine learning has revolutionized drug discovery by enabling the exploration of vast, uncharted chemical spaces essential for discovering novel patentable drugs. Despite the critical role of human G protein-coupled receptors in FDA-approved drugs, exhaustive in-distribution drug-target interaction testing across all pairs of human G protein-coupled receptors and known drugs is rare due to significant economic and technical challenges. This often leaves off-target effects unexplored, which poses a considerable risk to drug safety. In contrast to the traditional focus on out-of-distribution exploration (drug discovery), we introduce a neighborhood-to-prediction model termed Chemical Space Neural Networks that leverages network homophily and training-free graph neural networks with labels as features. We show that Chemical Space Neural Networks’ ability to make accurate predictions strongly correlates with network homophily. Thus, labels as features strongly increase a machine learning model’s capacity to enhance in-distribution prediction accuracy, which we show by integrating labeled data during inference. We validate these advancements in a high-throughput yeast biosensing system (3773 drug-target interactions, 539 compounds, 7 human G protein-coupled receptors) to discover novel drug-target interactions for FDA-approved drugs and to expand the general understanding of how to build reliable predictors to guide experimental verification.
U2 - 10.1038/s41467-025-59418-6
DO - 10.1038/s41467-025-59418-6
M3 - Journal article
C2 - 40316519
AN - SCOPUS:105004059826
SN - 2041-1723
VL - 16
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 4121
ER -