This thesis investigates and develops a number of unsupervised machine learning algorithms based on Bayesian statistics. Many machine learning algorithms remain demanding to apply, one reason being that they often require the user to provide expert knowledge, such as a ground truth obtained from the result of an experiment or from querying an expert. As more processes become documented by data collection, more opportunities arise, along with a growing demand for machine learning algorithms that can learn in an unsupervised manner without requiring a ground truth. Here, Bayesian statistics can provide ways to exploit the potential created by this extensive data collection. This work includes three papers on unsupervised probabilistic machine learning methods using Bayesian inference. The probabilistic approaches make it possible to infer automatically the model complexity required to describe the analyzed data, instead of requiring the user to determine it manually. The first paper investigates the benefit of using a probabilistic framework for a multiway decomposition method known as PARAFAC2, which is commonly applied to chromatographic data. The second paper proposes approaches for using the same probabilistic framework for binary or one-class classification problems, as encountered in food authentication tasks. The third paper proposes an online algorithm for learning a clustering on a data stream, based on a model known as Bayesian Hierarchical Clustering. Throughout these papers, the power of the probabilistic methods is demonstrated in these settings on synthetic and real data.
|Publisher||Technical University of Denmark|
|Number of pages||119|
|Publication status||Published - 2022|