Semi-supervised eigenvectors for large-scale locally-biased learning

Toke Jansen Hansen, Michael W. Mahoney

Research output: Contribution to journalJournal articleResearchpeer-review

Abstract

In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks nearby that prespecified target region. For example, one might be interested in the clustering structure of a data graph near a prespecified seed set of nodes, or one might be interested in finding partitions in an image that are near a prespecified ground truth set of pixels. Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities, thus limiting the applicability of eigenvector-based methods in situations where one is interested in very local properties of the data.

In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these locally-biased eigenvectors can be used to perform locally-biased machine learning. These semi-supervised eigenvectors capture successively-orthogonalized directions of maximum variance, conditioned on being well-correlated with an input seed set of nodes that is assumed to be provided in a semi-supervised manner. We show that these semi-supervised eigenvectors can be computed quickly as the solution to a system of linear equations; and we also describe several variants of our basic method that have improved scaling properties. We provide several empirical examples demonstrating how these semi-supervised eigenvectors can be used to perform locally-biased learning; and we discuss the relationship between our results and recent machine learning algorithms that use global eigenvectors of the graph Laplacian.
Original languageEnglish
JournalJournal of Machine Learning Research
Volume15
Issue number1
Pages (from-to)3691-3734
ISSN1533-7928
Publication statusPublished - 2014

Keywords

  • Semi-supervised learning
  • Spectral clustering
  • Kernel methods
  • Large-scale machine learning
  • Local spectral methods
  • Locally-biased learning

Fingerprint

Dive into the research topics of 'Semi-supervised eigenvectors for large-scale locally-biased learning'. Together they form a unique fingerprint.

Cite this