Semantic analysis of links in the musical Wikipedia

Lasse Lohilahti Mølgaard, Lars Kai Hansen, Jan Larsen

    Research output: Book/Report › Report › Research

    Abstract

    Wikipedia has significant potential in music information retrieval research. In this work we analyze the link structure in the musical Wikipedia. Wikipedia links differ in certain ways from links on the Web at large. There is an over-abundance of internal links in Wikipedia, links are generated automatically, and they may even be used maliciously to promote certain topics. Wikipedia has recently been analyzed using methods from Web and text mining; however, the fact that its link structure differs from the Web's makes this approach questionable. To better understand the link structure, and specifically to test the level of consistency between links and page content, we perform Probabilistic Latent Semantic Analysis (PLSA) to extract topics from Wikipedia articles. The PLSA model is used to quantify how articles are related. The PLSA-based similarity of documents is then used to evaluate the semantic relevance of the actual links. Our analysis highlights the diversity of Wikipedia links, and we conclude that semantic analysis could be a useful tool for Wikipedia.
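
To make the idea in the abstract concrete, the following is a minimal sketch of PLSA fitted by expectation-maximization on a document-term count matrix, followed by a comparison of topic-space similarity for a linked and an unlinked pair of pages. It is not the report's implementation: the toy counts, the function names `plsa` and `cosine_similarity`, the number of topics, and the choice of cosine similarity over the topic mixtures P(z|d) are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit PLSA by EM on a (documents x words) count matrix.

    Returns P(z|d) with shape (docs, topics) and P(w|z) with shape
    (topics, words). Dense arrays are used, so this suits toy data only.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random, row-normalised starting points for both distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]  # docs x topics x words
        post /= post.sum(axis=1, keepdims=True) + eps

        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z

def cosine_similarity(a, b):
    """Cosine similarity between two topic-mixture vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Hypothetical toy corpus: pages 0-1 share one vocabulary, pages 2-3 another.
counts = np.array([
    [5, 4, 3, 0, 0, 0],
    [4, 5, 3, 0, 0, 0],
    [0, 0, 0, 5, 3, 4],
    [0, 0, 0, 4, 4, 5],
], dtype=float)

p_z_d, p_w_z = plsa(counts, n_topics=2)

# Suppose page 0 links to page 1 (related) and to page 2 (unrelated).
sim_linked = cosine_similarity(p_z_d[0], p_z_d[1])
sim_unlinked = cosine_similarity(p_z_d[0], p_z_d[2])
print("similarity, page 0 -> page 1:", sim_linked)
print("similarity, page 0 -> page 2:", sim_unlinked)
```

In this sketch, a link whose endpoints have a low topic-space similarity would be flagged as semantically weak. Any threshold for that decision would have to be chosen empirically.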
    Original language: English
    Number of pages: 7
    Publication status: Published - 2008
