Adaptive regularization

Lars Kai Hansen, Carl Edward Rasmussen, C. Svarer, Jan Larsen

    Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

    Abstract

    Regularization, e.g., in the form of weight decay, is important for training and optimization of neural network architectures. In this work the authors provide a tool, based on asymptotic sampling theory, for iterative estimation of weight decay parameters. The basic idea is to perform gradient descent on the estimated generalization error with respect to the regularization parameters. The scheme is implemented in the authors' Designer Net framework for network training and pruning, i.e., it is based on the diagonal Hessian approximation. The scheme requires no significant computational overhead beyond what is already needed for training and pruning. The viability of the approach is demonstrated in an experiment on prediction of the chaotic Mackey-Glass series. The authors find that the optimized weight decays are relatively large for densely connected networks in the initial pruning phase, while they decrease as pruning proceeds.
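
The idea of descending an error estimate with respect to a regularization parameter can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes a linear model (ridge regression) instead of a neural network, a single weight decay parameter, and a held-out validation error as a stand-in for the asymptotic generalization error estimate used in the paper. All function names, the learning rate, and the synthetic data are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Weights minimizing 0.5*||Xw - y||^2 + 0.5*lam*||w||^2, plus A = X^T X + lam*I."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y), A

def heldout_error_and_grad(theta, Xt, yt, Xv, yv):
    """Held-out error and its derivative w.r.t. theta = log(lam).

    Assumption: the held-out error replaces the paper's estimated
    generalization error. Working in log(lam) keeps lam positive.
    """
    lam = np.exp(theta)
    w, A = ridge_fit(Xt, yt, lam)
    r = Xv @ w - yv                          # held-out residuals
    err = 0.5 * np.mean(r ** 2)
    dw_dlam = -np.linalg.solve(A, w)         # differentiate A w = X^T y w.r.t. lam
    derr_dlam = (Xv.T @ r) @ dw_dlam / len(yv)
    return err, lam * derr_dlam              # chain rule: d/dtheta = lam * d/dlam

def adapt_weight_decay(Xt, yt, Xv, yv, theta0=0.0, lr=0.1, steps=100):
    """Plain gradient descent on the held-out error w.r.t. log weight decay."""
    theta = theta0
    for _ in range(steps):
        _, grad = heldout_error_and_grad(theta, Xt, yt, Xv, yv)
        theta -= lr * grad
    return float(np.exp(theta))

# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
w_true = rng.normal(size=10)
Xt = rng.normal(size=(40, 10))
yt = Xt @ w_true + rng.normal(size=40)
Xv = rng.normal(size=(40, 10))
yv = Xv @ w_true + rng.normal(size=40)

lam_hat = adapt_weight_decay(Xt, yt, Xv, yv)
print(f"adapted weight decay: {lam_hat:.4f}")
```

In this sketch the weights are refit exactly for every value of the weight decay, which is only practical for small linear problems; the abstract indicates that the authors instead rely on a diagonal Hessian approximation that is already available from training and pruning.
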
    Original language: English
    Title of host publication: Proceedings of the 4th IEEE Workshop Neural Networks for Signal Processing
    Publisher: IEEE
    Publication date: 1994
    Pages: 78-87
    ISBN (Print): 0-7803-2026-3
    Publication status: Published - 1994
    Event: 1994 IEEE Workshop on Neural Networks for Signal Processing - Ermioni, Greece
    Duration: 6 Sept 1994 to 8 Sept 1994
    Conference number: 4
    https://ieeexplore.ieee.org/xpl/conhome/2959/proceeding
    http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=2959

    Conference

    Conference: 1994 IEEE Workshop on Neural Networks for Signal Processing
    Number: 4
    Country/Territory: Greece
    City: Ermioni
    Period: 06/09/1994 to 08/09/1994
    Bibliographical note

    Copyright: 1994 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE.
