Abstract
Although primarily an encyclopedia, Wikipedia’s expansive content provides a knowledge base that has been continuously exploited by researchers in a wide variety of domains. This article systematically reviews the scholarly studies that have used Wikipedia as a data source, and investigates the means by which Wikipedia has been employed in three main computer science research areas: information retrieval, natural language processing, and ontology building. We report and discuss the research trends of the identified and examined studies. We further identify and classify a list of tools that can be used to extract data from Wikipedia, and compile a list of currently available data sets extracted from Wikipedia.
| Original language | English |
|---|---|
| Journal | Information Processing & Management |
| Volume | 53 |
| Issue number | 2 |
| Pages (from-to) | 505–529 |
| Number of pages | 25 |
| ISSN | 0306-4573 |
| Publication status | Published - 2017 |
Keywords
- Information retrieval
- Information extraction
- Natural language processing
- Ontologies
- Wikipedia
- Literature review