Abstract
Regularization, e.g., in the form of weight decay, is important for the training and optimization of neural network architectures. In this work the authors provide a tool, based on asymptotic sampling theory, for the iterative estimation of weight decay parameters. The basic idea is to perform gradient descent on the estimated generalization error with respect to the regularization parameters. The scheme is implemented in the authors' Designer Net framework for network training and pruning, i.e., it is based on the diagonal Hessian approximation. The scheme requires no substantial computational overhead beyond what is needed for training and pruning. The viability of the approach is demonstrated in an experiment concerning prediction of the chaotic Mackey-Glass series. The authors find that the optimized weight decays are relatively large for densely connected networks in the initial pruning phase, while they decrease as pruning proceeds.
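The abstract describes an iterative loop: train under a given weight decay, form an asymptotic estimate of the generalization error, and move the decay parameter down the gradient of that estimate. The snippet below is a minimal sketch of that loop, assuming a ridge-regression stand-in for the network and a generic FPE/GPE-style error estimator under a diagonal Hessian approximation; the toy data, the finite-difference gradient, and all names are illustrative assumptions, not the paper's own estimator or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem standing in for the Mackey-Glass prediction task.
N, d = 100, 20
X = rng.standard_normal((N, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + 0.3 * rng.standard_normal(N)

def fit(lam):
    """Weight-decay (ridge) solution: minimizes mean squared error + lam * ||w||^2."""
    return np.linalg.solve(X.T @ X / N + lam * np.eye(d), X.T @ y / N)

def estimated_gen_error(lam):
    """FPE/GPE-style asymptotic estimate of the generalization error under a
    diagonal Hessian approximation (one common convention); this stands in
    for the paper's sampling-theory estimator, which is not reproduced here."""
    w = fit(lam)
    train_err = np.mean((y - X @ w) ** 2)
    h_diag = np.diag(X.T @ X) / N                  # diagonal of the unregularized Hessian
    p_eff = np.sum(h_diag / (h_diag + 2.0 * lam))  # effective number of parameters
    return train_err * (N + p_eff) / (N - p_eff)

# Gradient descent in the estimated generalization error with respect to the
# weight decay parameter, here via a central finite difference.
lam, eta, eps = 0.1, 0.2, 1e-5
for _ in range(200):
    g = (estimated_gen_error(lam + eps) - estimated_gen_error(lam - eps)) / (2 * eps)
    lam = max(lam - eta * g, 1e-8)                 # keep the decay positive
print(f"optimized weight decay: {lam:.4g}, "
      f"estimated generalization error: {estimated_gen_error(lam):.4g}")
```

In the paper the update is interleaved with network training and pruning; the finite-difference gradient above is simply the most compact stand-in for the derivative of the estimate, and the single shared decay could equally be a vector of per-group parameters.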
Original language | English |
---|---|
Title of host publication | Proceedings of the 4th IEEE Workshop on Neural Networks for Signal Processing |
Publisher | IEEE |
Publication date | 1994 |
Pages | 78-87 |
ISBN (Print) | 0-7803-2026-3 |
Publication status | Published - 1994 |
Event | 1994 IEEE Workshop on Neural Networks for Signal Processing - Ermioni, Greece. Duration: 6 Sept 1994 → 8 Sept 1994. Conference number: 4. https://ieeexplore.ieee.org/xpl/conhome/2959/proceeding |
Conference
Conference | 1994 IEEE Workshop on Neural Networks for Signal Processing |
---|---|
Number | 4 |
Country/Territory | Greece |
City | Ermioni |
Period | 06/09/1994 → 08/09/1994 |
Internet address | https://ieeexplore.ieee.org/xpl/conhome/2959/proceeding |