Abstract
Improving the identification and treatment of mental health conditions is a key priority worldwide, as mental health disorders significantly impact both individuals and societies through increased healthcare demands, reduced quality of life and premature mortality. In recent years, data-driven psychiatry has emerged as a promising field that applies machine learning techniques to large-scale health data with the aim of improving the prediction, diagnosis, and treatment of mental health conditions. By analyzing rich and complex datasets, machine learning models have the potential to improve risk prediction, personalize treatment, and support clinical decision-making. This could enable clinicians to identify high-risk patients earlier and intervene proactively, ultimately improving patient outcomes and optimizing healthcare resources.
However, the integration of predictive models into real-world psychiatric practice remains limited. One reason for this is that existing studies often suffer from methodological shortcomings that undermine the reliability and clinical usefulness of the models. Key challenges include difficulties in handling highly imbalanced outcomes, limited evaluation of model generalizability, and inadequate evaluation of model performance across clinically relevant subgroups.
The goal of this thesis is to address these challenges and advance the research related to the proper development and evaluation of prediction models in psychiatry. Using nationwide registry data and machine learning methods, the thesis develops and evaluates four predictive models targeting different psychiatric outcomes, including the risk of suicide and suicide attempts after discharge from a psychiatric hospital stay, the risk of cardiovascular diseases in patients with schizophrenia, and the risk of mechanical restraint during a psychiatric hospitalization. In doing so, the thesis addresses and discusses several methodological challenges critical for clinical applicability, including threshold selection, choice of evaluation metrics, balancing model complexity with interpretability, and defining meaningful populations and outcomes.
A central contribution of the thesis is the systematic evaluation of group fairness, i.e. how the developed predictive models perform across diverse subgroups defined by e.g. age, sex, and migrant background. This provides insights into potential disparities in predictive performance - an aspect often overlooked in the existing literature. In addition, the thesis addresses the challenge of class imbalance, a common issue in psychiatric datasets where many clinical outcomes are rare yet critical to predict. Through both a literature review and an empirical analysis, the thesis assesses and discusses different sampling strategies for highly imbalanced data and provides guidelines for practitioners in handling high class imbalance.
Overall, the thesis contributes methodological frameworks and empirical findings that advance the field of data-driven psychiatry. The approaches and insights developed support the responsible deployment of machine learning models in mental health care and can also be adapted to other clinical contexts where similar challenges are present. By addressing challenges related to model development and evaluation, the thesis aims to support the responsible integration of machine learning into mental health services, with the ultimate aim of contributing to better care for patients.
| Original language | English |
|---|---|
| Publisher | Technical University of Denmark |
| Number of pages | 318 |
| Publication status | Published - 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Fingerprint
Dive into the research topics of 'Predictive Modeling and Evaluation in Data-Driven Psychiatry'. Together they form a unique fingerprint.
Projects
- 1 Finished
- AI modeling and evaluation in precision psychiatry
Nielsen, S. D. (PhD Student), Clemmensen, L. K. H. (Main Supervisor), Eriksen Benros, M. (Supervisor), Ganz-Benjamin, M. (Examiner) & Mikaelsen, K. Ø. (Examiner)
01/12/2021 → 05/11/2025
Project: PhD