Process Knowledge Discovery Using Sparse Principal Component Analysis

Huihui Gao, Shriram Gajjar, Murat Kulahci, Qunxiong Zhu, Ahmet Palazoglu

Research output: Contribution to journal, Journal article, Research, peer-review


As the goals of ensuring process safety and energy efficiency become ever more challenging, engineers increasingly rely on process data for informed decision making. During recent decades, extracting and interpreting valuable process information from large historical data sets has been an active area of research. Among the methods used, principal component analysis (PCA) is a well-established technique that allows for dimensionality reduction of large data sets by finding new uncorrelated variables, namely principal components (PCs). However, it is difficult to interpret the derived PCs, as each PC is a linear combination of all of the original variables and its loadings are typically all nonzero. Sparse principal component analysis (SPCA) is a relatively recent technique proposed for producing PCs with sparse loadings via the variance-sparsity trade-off. We propose a forward SPCA approach that helps uncover the underlying process knowledge regarding variable relations. This approach systematically determines the optimal sparse loadings for each sparse PC while improving interpretability and minimizing information loss. The salient features of the proposed approach are demonstrated through the Tennessee Eastman process simulation. The results indicate how knowledge and process insight can be discovered through a systematic analysis of sparse loadings.
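The contrast between dense PCA loadings and sparse loadings can be illustrated with a small sketch. The code below is not the forward SPCA approach of the paper; it is a simple soft-thresholded power iteration, chosen only because it fits in a few lines of standard-library Python. The synthetic data, the function names (`covariance`, `leading_loading`, `explained_variance`), and the threshold value `lam=0.5` are all assumptions made for illustration.

```python
import math
import random

def covariance(X):
    """Sample covariance matrix of the rows of X (list of lists)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    C = [[0.0] * p for _ in range(p)]
    for row in X:
        d = [row[j] - means[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                C[i][j] += d[i] * d[j]
    return [[C[i][j] / (n - 1) for j in range(p)] for i in range(p)]

def normalize(v):
    """Scale v to unit Euclidean length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v

def soft_threshold(v, lam):
    """Shrink each entry toward zero by lam; entries below lam become zero."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in v]

def leading_loading(C, lam=0.0, iters=200):
    """Power iteration for the first loading vector.

    lam = 0 gives the ordinary (dense) PCA loading; lam > 0 soft-thresholds
    each iterate, which drives small loadings to exactly zero.
    """
    p = len(C)
    v = normalize([1.0] * p)
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        v = normalize(soft_threshold(w, lam))
    return v

def explained_variance(C, v):
    """Variance captured along the unit direction v, i.e. v^T C v."""
    p = len(C)
    return sum(v[i] * C[i][j] * v[j] for i in range(p) for j in range(p))

# Synthetic data (an assumption): variables 0 and 1 follow a common factor t,
# variable 2 depends on it only weakly, and variable 3 is pure noise.
random.seed(0)
X = []
for _ in range(500):
    t = random.gauss(0, 1)
    X.append([
        t + 0.1 * random.gauss(0, 1),
        t + 0.1 * random.gauss(0, 1),
        0.2 * t + 0.5 * random.gauss(0, 1),
        0.5 * random.gauss(0, 1),
    ])

C = covariance(X)
dense = leading_loading(C, lam=0.0)    # ordinary PCA loading
sparse = leading_loading(C, lam=0.5)   # thresholded, sparse loading

print("dense loading: ", [round(x, 3) for x in dense])
print("sparse loading:", [round(x, 3) for x in sparse])
print("variance (dense): ", round(explained_variance(C, dense), 3))
print("variance (sparse):", round(explained_variance(C, sparse), 3))
```

In this sketch the sparse loading keeps only the two strongly related variables and gives up a small amount of explained variance in return, which is the variance-sparsity trade-off in miniature. How the threshold should be chosen is exactly the kind of question a systematic selection procedure has to answer; the fixed value used here is arbitrary.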
Original language: English
Journal: Industrial and Engineering Chemistry Research
Issue number: 46
Pages (from-to): 12046-12059
Publication status: Published - 2016


  • PCA

