Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation

Payman Sadegh, J. C. Spall

    Research output: Contribution to journal › Journal article › Research › peer-review

    Abstract

    The simultaneous perturbation stochastic approximation (SPSA) algorithm has attracted considerable attention for challenging optimization problems where it is difficult or impossible to obtain a direct gradient of the objective (say, loss) function. The approach relies on a highly efficient simultaneous perturbation approximation to the gradient, built from loss function measurements. As part of generating this approximation, SPSA picks a simultaneous perturbation (random) vector in a Monte Carlo fashion. This paper derives the optimal distribution for the Monte Carlo process. The objective is to minimize the mean square error of the estimate. The authors also consider maximizing the likelihood that the estimate is confined within a bounded symmetric region around the true parameter. In both cases, the optimal distribution for the components of the simultaneous perturbation vector is found to be symmetric Bernoulli. The paper ends with a numerical study related to the area of experiment design.
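
The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' code. It forms a simultaneous perturbation gradient estimate from two loss measurements, drawing each component of the perturbation vector from a symmetric Bernoulli distribution (+1 or -1 with probability 1/2). The function names, the noise-free quadratic test loss, and the gain constants (`a`, `c`, `A`, `alpha`, `gamma`) are assumptions chosen for illustration and do not come from the abstract.

```python
import random


def spsa_gradient(loss, theta, c, rng):
    """Simultaneous perturbation estimate of the gradient of `loss` at `theta`."""
    # Symmetric Bernoulli perturbation: each component is +1 or -1, equally likely.
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]
    theta_plus = [t + c * d for t, d in zip(theta, delta)]
    theta_minus = [t - c * d for t, d in zip(theta, delta)]
    # Only two loss measurements are needed, whatever the dimension of theta.
    diff = loss(theta_plus) - loss(theta_minus)
    return [diff / (2.0 * c * d) for d in delta]


def spsa_minimize(loss, theta0, iterations=2000, a=0.1, c=0.1,
                  A=100.0, alpha=0.602, gamma=0.101, seed=0):
    """Stochastic approximation recursion driven by the estimate above.

    The decaying gain sequences a_k and c_k are illustrative assumptions.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # step-size gain
        c_k = c / (k + 1) ** gamma       # perturbation-size gain
        g = spsa_gradient(loss, theta, c_k, rng)
        theta = [t - a_k * g_i for t, g_i in zip(theta, g)]
    return theta


def quadratic_loss(theta):
    """Hypothetical noise-free test loss with its minimum at (1, 1, 1)."""
    return sum((t - 1.0) ** 2 for t in theta)


estimate = spsa_minimize(quadratic_loss, [0.0, 0.0, 0.0])
print(estimate)
```

In practice the loss measurements would typically be noisy; the noise-free loss here only keeps the sketch deterministic for a given seed.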
    Original language: English
    Journal: IEEE Transactions on Automatic Control
    Volume: 43
    Issue number: 10
    Pages (from-to): 1480-1484
    ISSN: 0018-9286
    Publication status: Published - 1998

    Bibliographical note

    Copyright: 1998 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
