Abstract
The present thesis concerns optimization of recurrent neural networks applied to time series modeling. In particular, it considers fully recurrent networks working from a single external input, with one layer of nonlinear hidden units and a linear output unit, applied to prediction of discrete time series. The overall objectives are to improve training by application of second-order methods and to improve generalization ability by architecture optimization accomplished by pruning. The major topics covered in the thesis are:

1. The problem of training recurrent networks is analyzed from a numerical point of view. In particular, it is analyzed how numerical ill-conditioning of the Hessian matrix might arise.
2. Training is significantly improved by application of the damped Gauss-Newton method, which involves the Hessian. This method is found to outperform gradient descent in terms of both the quality of the solution obtained and the computation time required.
3. A theoretical definition of the generalization error for recurrent networks is provided. This definition justifies a commonly adopted approach for estimating generalization ability.
4. The viability of pruning recurrent networks by the Optimal Brain Damage (OBD) and Optimal Brain Surgeon (OBS) pruning schemes is investigated. OBD is found to be very effective, whereas OBS is severely influenced by numerical problems, which leads to pruning of important weights.
5. A novel operational tool for examination of the internal memory of recurrent networks is proposed. The tool allows for assessment of the length of the effective memory of previous inputs built up in the recurrent network during application.

Time series modeling is also treated from a more general point of view, namely modeling of the joint probability distribution function of the observed series.
Two recurrent models rooted in statistical physics are considered in this respect, namely the "Boltzmann chain" and the "Boltzmann zipper", and a comprehensive tutorial on these models is provided. Boltzmann chains and zippers are found to benefit as well from second-order training and architecture optimization by pruning, which is illustrated on artificial problems and a small speech recognition problem.
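The damped Gauss-Newton method mentioned in topic 2 can be sketched in a few lines. This is a generic illustration of the technique, not the thesis's own implementation: the residual vector and its Jacobian (which for a recurrent network would come from backpropagation through time) are taken as given, and the damping term added to the Gauss-Newton approximation of the Hessian counteracts exactly the ill-conditioning discussed in topic 1.

```python
import numpy as np

def damped_gauss_newton_step(w, jacobian, residuals, damping=1e-3):
    """One damped Gauss-Newton update for the loss 0.5 * ||e||^2.

    w         : current weight vector, shape (n,)
    jacobian  : J[i, j] = d e_i / d w_j, shape (m, n)
    residuals : prediction errors e, shape (m,)
    damping   : value added to the diagonal of the Gauss-Newton
                Hessian approximation J^T J to keep the linear
                system well conditioned (Levenberg-Marquardt style)
    """
    J, e = jacobian, residuals
    H = J.T @ J + damping * np.eye(w.size)  # damped Hessian approximation
    g = J.T @ e                             # gradient of the loss
    return w - np.linalg.solve(H, g)        # Newton-like step
```

For a linear model the residuals are linear in the weights, so a single step with small damping lands essentially on the least-squares solution; for a recurrent network the step is repeated, typically with the damping adapted between iterations.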
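The Optimal Brain Damage scheme of topic 4 is likewise compact in its standard form. The sketch below shows only the generic OBD idea, under its usual diagonal-Hessian approximation, with hypothetical helper names; removing weight w_i is estimated to increase the training error by the saliency s_i = 0.5 * H_ii * w_i^2, and the least salient weights are pruned first.

```python
import numpy as np

def obd_saliencies(weights, hessian_diag):
    """OBD saliency of each weight under the diagonal approximation:
    s_i = 0.5 * H_ii * w_i^2, the estimated error increase from
    setting w_i to zero."""
    return 0.5 * hessian_diag * weights**2

def prune_smallest(weights, hessian_diag, n_prune):
    """Zero out the n_prune weights with the smallest saliency."""
    s = obd_saliencies(weights, hessian_diag)
    idx = np.argsort(s)[:n_prune]  # least salient weights
    pruned = weights.copy()
    pruned[idx] = 0.0
    return pruned
```

In practice the network is retrained after each round of pruning, and the generalization estimate of topic 3 is used to decide when to stop.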
Original language: English
Place of Publication: Kgs. Lyngby
Publisher: Technical University of Denmark
Number of pages: 322
Publication status: Published - Oct 1997
Series: IMM-PHD-1997-37

Projects

Signal Processing with Feedback Networks (original title: Signalbehandling med feedback netværk)
Pedersen, M. W., Hansen, L. K., Larsen, J., Lautrup, B. & Sørensen, H. B. D.
01/09/1994 → 31/10/1997
Project: PhD