An inter-rater variability study between human and automatic scorers in 5-s mini-epochs of sleep

Louise Frøstrup Follin*, Alexander Neergaard Zahid, Rannveig Viste, Janita Vevelstad, Tobias Kaufmann, Anette Ramm-Pettersen, Hilde T Juvodden, Berit Hjelde Hansen, Julie Anja Engelhard Christensen, Stine Knudsen-Heier

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review


Abstract

Study objective: Sleep is traditionally scored using 30-s epochs of polysomnographies. As sleep is physiologically dynamic and 30-s epochs may conceal important characteristics, we aim to challenge this standard by scoring sleep in 5-s mini-epochs and analyzing inter-rater variability between human and automatic scorers.
Methods: In 40 polysomnography recordings, 120 mini-epochs per polysomnography were scored manually by three human experts (expert1_5s, expert2_5s and expert3_5s) and automatically by a validated sleep classifier (USleep_5s). Additionally, 5-s mini-epochs (clinical_5s) extracted from conventional human-scored 30-s epochs were considered. We assessed inter-rater variability and stage shifting in epochs and mini-epochs and further in narcolepsy type 1 (NT1) patients and siblings.
Results: Agreement for mini-epochs was κ = 0.50 ± 0.11 (expert1_5s vs clinical_5s) and κ = 0.51 ± 0.12 (expert1_5s vs USleep_5s). Between human experts, agreement was κ = 0.51 ± 0.16 (expert1_5s vs expert2_5s) and κ = 0.57 ± 0.11 (expert1_5s vs expert3_5s). Stage shift percentages were significantly higher in mini-epochs scored by expert1_5s (27.75 %) and USleep_5s (22.88 %) than in the corresponding conventional epochs (5.12 %), with no significant difference between NT1 patients and siblings.
Conclusion: While mini-epoch scoring agreement was generally high, it was still lower than for conventional epochs, likely due to the lack of a standard mini-epoch scoring procedure and to the automatic classifier having been trained on epochs. However, stage discrepancies between epochs and mini-epochs and increased stage shifting in mini-epochs support that epochs can contain several stages, and that mini-epochs could supplement conventional scoring with more detailed sleep characterization, potentially enabling more precise diagnosis and the discovery of new polysomnographic biomarkers. Future studies should include larger datasets to refine mini-epoch scoring rules and exploit automatic classifiers, e.g., via transfer learning.
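Illustrative sketch (not part of the published abstract): the quantities reported above can be approximated with a few lines of Python. The abstract does not state which κ statistic or which exact definition of stage shift percentage was used, so this sketch assumes Cohen's κ for pairs of raters and defines stage shift percentage as the share of consecutive label pairs whose stage differs. The function names expand_epochs, cohen_kappa and stage_shift_percentage are hypothetical, as are the toy hypnograms.

```python
from collections import Counter


def expand_epochs(epoch_stages, factor=6):
    """Repeat each 30-s epoch label `factor` times to get 5-s mini-epoch labels.

    Assumption: a clinical_5s-style sequence simply inherits the stage of the
    30-s epoch that contains each mini-epoch (30 s / 5 s = 6 mini-epochs).
    """
    return [stage for stage in epoch_stages for _ in range(factor)]


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa between two equally long label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of positions with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is formally
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def stage_shift_percentage(stages):
    """Percentage of consecutive label pairs whose stage differs (assumed definition)."""
    if len(stages) < 2:
        return 0.0
    shifts = sum(a != b for a, b in zip(stages, stages[1:]))
    return 100.0 * shifts / (len(stages) - 1)


if __name__ == "__main__":
    # Toy data only: two 30-s epochs and a hypothetical 5-s rescoring of them.
    clinical_5s = expand_epochs(["W", "N1"])
    expert_5s = ["W", "W", "W", "N1", "W", "W",
                 "N1", "N1", "W", "N1", "N1", "N1"]
    print("kappa:", cohen_kappa(expert_5s, clinical_5s))
    print("shift % (clinical_5s):", stage_shift_percentage(clinical_5s))
    print("shift % (expert_5s):", stage_shift_percentage(expert_5s))
```

In this toy case the mini-epoch sequence contains several transitions that the two 30-s epoch labels cannot represent, which is the kind of discrepancy the abstract attributes to epochs containing more than one stage.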
Original language: English
Journal: Sleep Medicine
Volume: 128
Pages (from-to): 139-150
ISSN: 1389-9457
Publication status: Published - 2025

Keywords

  • Automatic sleep classification
  • Computerized analysis
  • Inter-rater variability
  • Mini-epochs
  • Polysomnography
  • Sleep stage scoring
