Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons

Research output: Chapter in Book/Report/Conference proceeding – Article in proceedings – Annual report year: 2015 – Research – peer-review

Standard

Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons. / Madsen, Jens; Jensen, Bjørn Sand; Larsen, Jan.

Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). International Society for Music Information Retrieval, 2014. p. 319-324.

Research output: Chapter in Book/Report/Conference proceeding – Article in proceedings – Annual report year: 2015 – Research – peer-review

Harvard

Madsen, J, Jensen, BS & Larsen, J 2014, Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons. in Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). International Society for Music Information Retrieval, pp. 319-324, 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, Province of China, 27/10/2014.

APA

Madsen, J., Jensen, B. S., & Larsen, J. (2014). Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons. In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014) (pp. 319-324). International Society for Music Information Retrieval.

CBE

Madsen J, Jensen BS, Larsen J. 2014. Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons. In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). International Society for Music Information Retrieval. pp. 319-324.

MLA

Madsen, Jens, Bjørn Sand Jensen, and Jan Larsen. "Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons". Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). International Society for Music Information Retrieval. 2014, 319-324.

Vancouver

Madsen J, Jensen BS, Larsen J. Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons. In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). International Society for Music Information Retrieval. 2014. p. 319-324.

Author

Madsen, Jens ; Jensen, Bjørn Sand ; Larsen, Jan. / Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons. Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). International Society for Music Information Retrieval, 2014. pp. 319-324

Bibtex

@inproceedings{67beee62c5154a55bbe7e3b725efd911,
title = "Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons",
abstract = "The temporal structure of music is essential for the cognitive processes related to the emotions expressed in music. However, such temporal information is often disregarded in typical Music Information Retrieval modeling tasks of predicting higher-level cognitive or semantic aspects of music such as emotions, genre, and similarity. This paper addresses the specific hypothesis whether temporal information is essential for predicting expressed emotions in music, as a prototypical example of a cognitive aspect of music. We propose to test this hypothesis using a novel processing pipeline: 1) Extracting audio features for each track resulting in a multivariate ”feature time series”. 2) Using generative models to represent these time series (acquiring a complete track representation). Specifically, we explore the Gaussian Mixture model, Vector Quantization, Autoregressive model, Markov and Hidden Markov models. 3) Utilizing the generative models in a discriminative setting by selecting the Probability Product Kernel as the natural kernel for all considered track representations. We evaluate the representations using a kernel based model specifically extended to support the robust two-alternative forced choice self-report paradigm, used for eliciting expressed emotions in music. The methods are evaluated using two data sets and show increased predictive performance using temporal information, thus supporting the overall hypothesis.",
author = "Jens Madsen and Jensen, {Bj{\o}rn Sand} and Jan Larsen",
note = "copyright: Jens Madsen, Bj{\o}rn Sand Jensen, Jan Larsen. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Jens Madsen, Bj{\o}rn Sand Jensen, Jan Larsen. “Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons”, 15th International Society for Music Information Retrieval Conference, 2014.",
year = "2014",
language = "English",
pages = "319--324",
booktitle = "Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014)",
publisher = "International Society for Music Information Retrieval",

}

RIS

TY - GEN

T1 - Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons

AU - Madsen, Jens

AU - Jensen, Bjørn Sand

AU - Larsen, Jan

N1 - copyright: Jens Madsen, Bjørn Sand Jensen, Jan Larsen. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Jens Madsen, Bjørn Sand Jensen, Jan Larsen. “Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons”, 15th International Society for Music Information Retrieval Conference, 2014.

PY - 2014

Y1 - 2014

N2 - The temporal structure of music is essential for the cognitive processes related to the emotions expressed in music. However, such temporal information is often disregarded in typical Music Information Retrieval modeling tasks of predicting higher-level cognitive or semantic aspects of music such as emotions, genre, and similarity. This paper addresses the specific hypothesis of whether temporal information is essential for predicting expressed emotions in music, as a prototypical example of a cognitive aspect of music. We propose to test this hypothesis using a novel processing pipeline: 1) Extracting audio features for each track, resulting in a multivariate “feature time series”. 2) Using generative models to represent these time series (acquiring a complete track representation). Specifically, we explore the Gaussian Mixture model, Vector Quantization, Autoregressive model, Markov and Hidden Markov models. 3) Utilizing the generative models in a discriminative setting by selecting the Probability Product Kernel as the natural kernel for all considered track representations. We evaluate the representations using a kernel-based model specifically extended to support the robust two-alternative forced choice self-report paradigm used for eliciting expressed emotions in music. The methods are evaluated using two data sets and show increased predictive performance when temporal information is included, thus supporting the overall hypothesis.

AB - The temporal structure of music is essential for the cognitive processes related to the emotions expressed in music. However, such temporal information is often disregarded in typical Music Information Retrieval modeling tasks of predicting higher-level cognitive or semantic aspects of music such as emotions, genre, and similarity. This paper addresses the specific hypothesis of whether temporal information is essential for predicting expressed emotions in music, as a prototypical example of a cognitive aspect of music. We propose to test this hypothesis using a novel processing pipeline: 1) Extracting audio features for each track, resulting in a multivariate “feature time series”. 2) Using generative models to represent these time series (acquiring a complete track representation). Specifically, we explore the Gaussian Mixture model, Vector Quantization, Autoregressive model, Markov and Hidden Markov models. 3) Utilizing the generative models in a discriminative setting by selecting the Probability Product Kernel as the natural kernel for all considered track representations. We evaluate the representations using a kernel-based model specifically extended to support the robust two-alternative forced choice self-report paradigm used for eliciting expressed emotions in music. The methods are evaluated using two data sets and show increased predictive performance when temporal information is included, thus supporting the overall hypothesis.

M3 - Article in proceedings

SP - 319

EP - 324

BT - Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014)

PB - International Society for Music Information Retrieval

ER -
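
As a rough illustration of step 3 in the pipeline described in the abstract, the sketch below computes the Probability Product Kernel (with rho = 1, the expected-likelihood case, for which a closed form exists) between two Gaussian Mixture Model track representations. This is a minimal sketch under assumptions, not the authors' implementation; the function names, GMM parameters, and dimensionalities are hypothetical.

# Minimal sketch (not the authors' code): Probability Product Kernel (PPK)
# between two GMM track representations, rho = 1 (expected-likelihood case).
# Each track is assumed to be summarized by a GMM fitted to its feature time series.

import numpy as np

def ppk_gaussian(mu_p, cov_p, mu_q, cov_q):
    """Closed-form PPK (rho = 1) between two multivariate Gaussians:
    integral of N(x; mu_p, cov_p) * N(x; mu_q, cov_q) dx = N(mu_p; mu_q, cov_p + cov_q)."""
    d = mu_p.shape[0]
    cov = cov_p + cov_q
    diff = mu_p - mu_q
    norm = (2 * np.pi) ** (-d / 2) * np.linalg.det(cov) ** (-0.5)
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(cov, diff))

def ppk_gmm(weights_p, means_p, covs_p, weights_q, means_q, covs_q):
    """PPK (rho = 1) between two GMMs: weighted sum of pairwise Gaussian kernels
    over all component pairs."""
    k = 0.0
    for w_i, mu_i, cov_i in zip(weights_p, means_p, covs_p):
        for w_j, mu_j, cov_j in zip(weights_q, means_q, covs_q):
            k += w_i * w_j * ppk_gaussian(mu_i, cov_i, mu_j, cov_j)
    return k

# Toy usage on two hypothetical 2-component GMMs over 3-dimensional features.
rng = np.random.default_rng(0)
w = np.array([0.6, 0.4])
means_a = rng.normal(size=(2, 3))
means_b = rng.normal(size=(2, 3))
covs = np.stack([np.eye(3), np.eye(3)])
print(ppk_gmm(w, means_a, covs, w, means_b, covs))

In the paper's setting, a Gram matrix of such kernel values over all track pairs would feed a kernel-based model of the two-alternative forced choice comparisons; the same kernel can also be evaluated for the other generative representations listed in the abstract, which is what makes it a natural common choice across them.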