Machine learning can identify newly diagnosed patients with CLL at high risk of infection

Rudi Agius, Christian Brieghel, Michael A. Andersen, Alexander T. Pearson, Bruno Ledergerber, Alessandro Cozzi-Lepri, Yoram Louzoun, Christen L. Andersen, Jacob Bergstedt, Jakob H. von Stemann, Mette Jørgensen, Man Hung Eric Tang, Magnus Fontes, Jasmin Bahlo, Carmen D. Herling, Michael Hallek, Jens Lundgren, Cameron Ross MacPherson, Jan Larsen, Carsten U. Niemann*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review


Infections have become the major cause of morbidity and mortality among patients with chronic lymphocytic leukemia (CLL) due to immune dysfunction and cytotoxic CLL treatment. Yet, predictive models for infection are missing. In this work, we develop the CLL Treatment-Infection Model (CLL-TIM) that identifies patients at risk of infection or CLL treatment within 2 years of diagnosis as validated on both internal and external cohorts. CLL-TIM is an ensemble algorithm composed of 28 machine learning algorithms based on data from 4,149 patients with CLL. The model is capable of dealing with heterogeneous data, including the high rates of missing data to be expected in the real-world setting, with a precision of 72% and a recall of 75%. To address concerns regarding the use of complex machine learning algorithms in the clinic, for each patient with CLL, CLL-TIM provides explainable predictions through uncertainty estimates and personalized risk factors.
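For readers less familiar with the two reported metrics, the following is a minimal sketch of how precision and recall are computed for a binary "high-risk" prediction. The function name `precision_recall` and the toy labels are assumptions made purely for illustration; they are not taken from the study and do not reproduce its results or its evaluation code.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 marks the high-risk class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged as high risk
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged, but no event occurred
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # event occurred, but not flagged
    # Precision: share of flagged patients who actually had the event.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of patients with the event who were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels, invented for illustration only; not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

In this reading, a precision of 72% would mean that roughly 72 of every 100 patients flagged as high risk went on to have an infection or CLL treatment, and a recall of 75% that roughly 75 of every 100 patients with such an event were flagged.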

Original language: English
Article number: 363
Journal: Nature Communications
Issue number: 1
Number of pages: 17
Publication status: Published - 1 Dec 2020
