Extracting Essential Information and Making Inference from Data

Research output: Book/Report › Ph.D. thesis

Abstract

In recent years, machine learning and artificial intelligence systems have seen great success across a variety of domains. These systems are fueled by their underlying training data, which often stem from extensive historical datasets. Inspired by these advancements, the collection of data has exploded, and significant effort has been funneled into improving the models. Nonetheless, the data sources used to train machine learning models are often unstructured and noisy, and encode historical biases. These aspects are frequently neglected during model optimization and can seep into the trained model, resulting in poor or biased predictive inference. The goal of this thesis is to shift the focus towards the data by assessing and developing data reduction methods for learning summary data representations, which capture the essential information and are representative of the original data source. A central consideration along the way is defining and evaluating what representative data entail, and how such data may be appropriately extracted. The thesis is divided into two parts. Part I presents an overview of current practices in data reduction, demonstrating how dimensionality and numerosity reduction can be used to learn smaller data representations that reduce computational burdens and improve inference. Part II presents the research contributions, which highlight and address various intricacies of the problem by evaluating the ways data can be representative of a target population, and how such data representations can be learned across several scientific domains.
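
For intuition, a minimal sketch of the two reduction axes named in the abstract: dimensionality reduction (fewer features per example, here via PCA) and numerosity reduction (fewer examples, here via uniform random subsampling). The synthetic data, the scikit-learn PCA choice, and the target sizes are illustrative assumptions, not methods taken from the thesis.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(10_000, 50))  # hypothetical dataset: 10,000 rows, 50 features

# Dimensionality reduction: project onto the 10 directions of largest variance.
X_low_dim = PCA(n_components=10).fit_transform(X)  # shape (10000, 10)

# Numerosity reduction: keep a uniform random subset of 1,000 rows.
idx = rng.choice(X.shape[0], size=1_000, replace=False)
X_subset = X[idx]  # shape (1000, 50)

print(X_low_dim.shape, X_subset.shape)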
Original language: English
Publisher: Technical University of Denmark
Number of pages: 217
Publication status: Published - 2023
