Multi-modal deep learning for joint prediction of otitis media and diagnostic difficulty

Josefine Vilsbøll Sundgaard*, Morten Rieger Hannemose, Søren Laugesen, Peter Bray, James Harte, Yosuke Kamide, Chiemi Tanaka, Rasmus R. Paulsen, Anders Nymark Christensen

*Corresponding author for this work

Research output: Contribution to journal, Journal article, Research, peer-review



Objectives: In this study, we propose a diagnostic model for automatic detection of otitis media based on combined input of otoscopy images and wideband tympanometry measurements.
Methods: We present a neural network-based model for the joint prediction of otitis media and diagnostic difficulty. We use the subclassifications acute otitis media and otitis media with effusion. The proposed approach is based on deep metric learning, and we compare this with the performance of a standard multi-task network.
Results: The proposed deep metric approach shows good performance on both tasks, and we show that the multi-modal input increases the performance for both classification and difficulty estimation compared to the models trained on the modalities separately. An accuracy of 86.5% is achieved for the classification task, and a Kendall rank correlation coefficient of 0.45 is achieved for difficulty estimation, corresponding to a correct ranking of 72.6% of the cases.
Conclusion: This study demonstrates the strengths of a multi-modal diagnostic tool using both otoscopy images and wideband tympanometry measurements for the diagnosis of otitis media. Furthermore, we show that deep metric learning improves the performance of the models.
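The Results pair a Kendall rank correlation coefficient with a percentage of correctly ranked cases. The two are linked by a simple identity: when there are no ties, the fraction of concordant pairs equals (tau + 1) / 2. The following is a minimal sketch of that relationship, not the authors' evaluation code; the function names `kendall_tau_a` and `concordant_fraction` and the toy rankings are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau computed from concordant and discordant pairs.

    Assumption: tied pairs are skipped, so this matches tau-a only
    when neither ranking contains ties.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total = concordant + discordant
    if total == 0:
        raise ValueError("need at least one untied pair")
    return (concordant - discordant) / total


def concordant_fraction(tau):
    """Fraction of pairs ranked in the correct order, assuming no ties."""
    return (tau + 1) / 2


# Hypothetical toy data: one of six pairs is ranked in the wrong order.
true_rank = [1, 2, 3, 4]
predicted_rank = [1, 2, 4, 3]
tau = kendall_tau_a(true_rank, predicted_rank)
fraction = concordant_fraction(tau)
```

Applied to the rounded coefficient of 0.45, the identity gives about 72.5%, close to the reported 72.6%; the small gap is presumably due to the coefficient being rounded to two decimals.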
Original language: English
Article number: e1199
Journal: Laryngoscope Investigative Otolaryngology
Issue number: 1
Number of pages: 7
Publication status: Published - 2024

