Subset Selection by Local Convex Approximation

Henrik Øjelund, Payman Sadegh, Henrik Madsen, Poul Thyregod

    Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

    Abstract

    This paper concerns selection of the optimal subset of variables in a lenear regression setting. The posed problem is combinatiorial and the globally best subset can only be found in exponential time. We define a cost function for the subset selection problem by adding the penalty term to the usual least squares criterion. We propose an optimization technique for the posed probelm based on a modified version of the Newton-Raphson iterations, combined with a backward elimination type algorithm. THe Newton-Raphson modification concerns iterative approximations to the non-convex cost function of the subset selection problem so as to guarantee positive definiteness of the Hessian term, hence avoiding numerical instability. The backward Elemination type algorithm attempts to improve the results upon termination of the modified Newton-Raphson search by sing the current solution as an initial guess. The efficiency of the method is illustrated by a numerical example with highly correlated explanatory variables for which the commonly used techiques such as forward selection/backward elimination perform poorly.
    Original languageEnglish
    Title of host publicationSymposium i Anvendt Statistik
    Place of PublicationCopenhagen
    PublisherAKF - SFI SUrvey
    Publication date1999
    Pages220-230
    Publication statusPublished - 1999
    Event21st Symposium in Applied Statistics - Handelshøjskolen (CBS), Copenhagen, Denmark
    Duration: 25 Jan 199927 Jan 1999
    Conference number: 21

    Conference

    Conference21st Symposium in Applied Statistics
    Number21
    LocationHandelshøjskolen (CBS)
    CountryDenmark
    CityCopenhagen
    Period25/01/199927/01/1999

    Cite this

    Øjelund, H., Sadegh, P., Madsen, H., & Thyregod, P. (1999). Subset Selection by Local Convex Approximation. In Symposium i Anvendt Statistik (pp. 220-230). AKF - SFI SUrvey.