Modelling Digital Media Objects

Rasmus Troelsgaard

Research output: Book/ReportPh.D. thesisResearch

211 Downloads (Pure)


The goal of this thesis is to investigate two relevant issues regarding computational representation and classification of digital multi-media objects. With a special focus on music, a model for representation of objects comprising multiple heterogeneous data types is investigated. Necessary to this work are considerations regarding integration of multiple diverse data modalities and evaluation of the resulting concept representation.

Regarding modelling of data exhibiting certain sequential structure, a number of theoretical and empirical results are presented. These are results related to model parameter estimation and the use of sequence models in a classification scenario. The latter being of importance in various digital multimedia navigation and retrieval tasks.

In the fields of topic modelling and multi-modal integration, we formulate a model to describe entities composed of multiple aspects. The particular aspects considered in the publications are sound, song lyrics, and user-provided metadata. This model integrates the diverse data types comprising the objects and defines concrete unified representations in a joint “semantic” space. Within the context of this model, general measures of similarity between such multi-modal objects are investigated.

In the fields of method of moments and sequence modelling, we increase practical applicability of a certain moment based parameter estimation method for Hidden Markov models by showing how to use full-length sequences in the estimation process. Consequently, this impacts the quality of the estimated model parameters.

Subsequently, we show how to perform time series classification using a composite likelihood formulated from third order moments defined by the Hidden Markov model. Compared to the conventional likelihood based method, our contribution is less computationally expensive, while retaining the level of classification performance.
Original languageEnglish
Place of PublicationKgs. Lyngby
PublisherTechnical University of Denmark
Number of pages97
Publication statusPublished - 2017
SeriesDTU Compute PHD-2016


Dive into the research topics of 'Modelling Digital Media Objects'. Together they form a unique fingerprint.

Cite this