Audiovisual Quality Fusion based on Relative Multimodal Complexity

Junyong You, Jari Korhonen, Ulrich Reiter

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review


In multimodal presentations, perceived audiovisual quality is significantly influenced by the content of both the audio and visual tracks. Building on our earlier subjective quality test, which sought the optimal trade-off between audio and video quality, this paper proposes a novel method of relative multimodal complexity analysis for deriving the fusion parameter in objective audiovisual quality metrics. Audio and video quality are first estimated separately using advanced quality models and then combined into an overall audiovisual quality score by linear fusion. The relative complexity analysis model across sensory modalities, which supplies the fusion parameter, is built on carefully designed auditory and visual features. Experimental results demonstrate that the content-adaptive fusion parameter improves the prediction accuracy of objective audiovisual quality metrics compared with fusion parameters obtained from the subjective quality tests using other known optimization methods.
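The linear fusion step can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes a convex combination with a single weight `alpha` in [0, 1]; the function name `fuse_quality` and its parameters are hypothetical, and the complexity model that would supply `alpha` per clip is not reproduced here.

```python
def fuse_quality(q_audio: float, q_video: float, alpha: float) -> float:
    """Combine separate audio and video quality scores by linear fusion.

    Assumption (not confirmed by the abstract): the fusion is a convex
    combination, with alpha weighting audio and (1 - alpha) weighting video.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * q_audio + (1.0 - alpha) * q_video


# Illustrative values only, not results from the paper.
# A content-adaptive scheme would choose alpha per clip.
overall = fuse_quality(q_audio=3.0, q_video=4.0, alpha=0.25)
print(overall)
```

With `alpha` closer to 1 the audio score dominates, and with `alpha` closer to 0 the video score dominates, which is the degree of freedom a content-adaptive parameter would exploit.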
Original language: English
Title of host publication: 2011 18th IEEE International Conference on Image Processing (ICIP)
Publication date: 2011
ISBN (Print): 978-1-4577-1304-0
ISBN (Electronic): 978-1-4577-1302-6
Publication status: Published - 2011
Event: 18th IEEE International Conference on Image Processing - Brussels, Belgium
Duration: 11 Sep 2011 → 14 Sep 2011
Conference number: 18


Conference: 18th IEEE International Conference on Image Processing
Series: International Conference on Image Processing. Proceedings


  • Content analysis
  • Quality fusion
  • Audiovisual quality assessment
  • Multimodal complexity

