Approximating The DCM

Rasmus Elsborg Madsen

    Research output: Book/Report › Report › Research › peer-review

    Abstract

    The Dirichlet compound multinomial (DCM), which has recently been shown to be well suited for modeling word burstiness in documents, is investigated here. A number of conceptual explanations that account for these recent results are provided. An exponential family approximation of the DCM is also provided; it is substantially faster to train while still producing similar probabilities and classification performance.
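
For readers unfamiliar with the model, the following is a minimal sketch of the standard DCM (Pólya) log-probability of a word-count vector, included only to illustrate what "burstiness" means here. It is not the report's exponential family approximation, and the function name `dcm_log_prob` and its parameters are assumptions chosen for illustration.

```python
from math import lgamma, exp


def dcm_log_prob(counts, alpha):
    """Log-probability of a count vector under the standard DCM (Polya) distribution.

    counts: non-negative integer word counts for one document (hypothetical input).
    alpha:  positive Dirichlet parameters, one per vocabulary word.
    """
    n = sum(counts)          # document length
    a = sum(alpha)           # sum of the Dirichlet parameters
    # Multinomial coefficient: n! / prod(x_w!)
    log_coef = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # Normalizer: Gamma(a) / Gamma(a + n)
    log_norm = lgamma(a) - lgamma(a + n)
    # Per-word terms: Gamma(x_w + alpha_w) / Gamma(alpha_w)
    log_terms = sum(lgamma(x + al) - lgamma(al) for x, al in zip(counts, alpha))
    return log_coef + log_norm + log_terms


# With small alpha, repeating one word is more probable than using two different words.
bursty = exp(dcm_log_prob([2, 0], [0.1, 0.1]))
spread = exp(dcm_log_prob([1, 1], [0.1, 0.1]))
print(bursty > spread)
```

With small parameters, the two-word document that repeats a single word receives more probability than the one that uses each word once, which is the burstiness effect the abstract refers to.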
    Original language: English
    Publication status: Published - 2005