Multiblock PLS: Block dependent prediction modeling for Python

Andreas Baum*, Laurent Vermue

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review


Abstract

Partial Least Squares (PLS) regression is a statistical method for supervised multivariate analysis. It relates two data blocks X and Y to each other with the aim of establishing a prediction model. When deployed in production, this model can be used to predict an outcome y from a newly measured feature vector x. PLS is popular in chemometrics, process control and other analytic fields due to its striking advantages, namely the ability to analyze small sample sizes and the ability to handle high-dimensional data with cross-correlated features (where Ordinary Least Squares regression typically fails). In addition, and in contrast to many other machine learning approaches, PLS models can be interpreted through their latent variable structure, just as principal components are interpreted in a PCA analysis.
Original language: English
Article number: 1190
Journal: The Journal of Open Source Software
Volume: 4
Issue number: 34
Number of pages: 5
DOIs: 10.21105/joss.01190
Publication status: Published - 2019

Cite this

@article{c2297d21ee654de990dcf2f553ce868b,
title = "Multiblock PLS: Block dependent prediction modeling for Python",
abstract = "Partial Least Squares (PLS) regression is a statistical method for supervised multivariate analysis. It relates two data blocks X and Y to each other with the aim of establishing a prediction model. When deployed in production, this model can be used to predict an outcome y from a newly measured feature vector x. PLS is popular in chemometrics, process control and other analytic fields due to its striking advantages, namely the ability to analyze small sample sizes and the ability to handle high-dimensional data with cross-correlated features (where Ordinary Least Squares regression typically fails). In addition, and in contrast to many other machine learning approaches, PLS models can be interpreted through their latent variable structure, just as principal components are interpreted in a PCA analysis.",
author = "Andreas Baum and Laurent Vermue",
year = "2019",
doi = "10.21105/joss.01190",
language = "English",
volume = "4",
journal = "The Journal of Open Source Software",
issn = "2475-9066",
publisher = "Open Journals",
number = "34",
}

Multiblock PLS: Block dependent prediction modeling for Python. / Baum, Andreas; Vermue, Laurent.

In: The Journal of Open Source Software, Vol. 4, No. 34, 1190, 2019.


TY - JOUR

T1 - Multiblock PLS: Block dependent prediction modeling for Python

AU - Baum, Andreas

AU - Vermue, Laurent

PY - 2019

Y1 - 2019

N2 - Partial Least Squares (PLS) regression is a statistical method for supervised multivariate analysis. It relates two data blocks X and Y to each other with the aim of establishing a prediction model. When deployed in production, this model can be used to predict an outcome y from a newly measured feature vector x. PLS is popular in chemometrics, process control and other analytic fields due to its striking advantages, namely the ability to analyze small sample sizes and the ability to handle high-dimensional data with cross-correlated features (where Ordinary Least Squares regression typically fails). In addition, and in contrast to many other machine learning approaches, PLS models can be interpreted through their latent variable structure, just as principal components are interpreted in a PCA analysis.

AB - Partial Least Squares (PLS) regression is a statistical method for supervised multivariate analysis. It relates two data blocks X and Y to each other with the aim of establishing a prediction model. When deployed in production, this model can be used to predict an outcome y from a newly measured feature vector x. PLS is popular in chemometrics, process control and other analytic fields due to its striking advantages, namely the ability to analyze small sample sizes and the ability to handle high-dimensional data with cross-correlated features (where Ordinary Least Squares regression typically fails). In addition, and in contrast to many other machine learning approaches, PLS models can be interpreted through their latent variable structure, just as principal components are interpreted in a PCA analysis.

U2 - 10.21105/joss.01190

DO - 10.21105/joss.01190

M3 - Journal article

VL - 4

JO - The Journal of Open Source Software

JF - The Journal of Open Source Software

SN - 2475-9066

IS - 34

M1 - 1190

ER -