NELA-GT-2018: A large multi-labelled news dataset for the study of misinformation in news articles

Jeppe Nørregaard, Benjamin D. Horne, Sibel Adalı

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

Abstract

In this paper, we present a dataset of 713k articles collected between 02/2018 and 11/2018. These articles were collected directly from 194 news and media outlets, including mainstream, hyper-partisan, and conspiracy sources. We incorporate ground truth ratings of the sources from 8 different assessment sites covering multiple dimensions of veracity, including reliability, bias, transparency, adherence to journalistic standards, and consumer trust. The NELA-GT-2018 dataset can be found at https://doi.org/10.7910/DVN/ULHLCB.
Original language: English
Title of host publication: Proceedings of the Thirteenth International AAAI Conference on Web and Social Media
Publisher: AAAI Press
Publication date: 2019
Pages: 630-638
ISBN (Print): 978-1-57735-806-0
Publication status: Published - 2019
Event: 13th International Conference on Web and Social Media - Munich, Germany
Duration: 11 Jun 2019 to 14 Jun 2019

Conference

Conference: 13th International Conference on Web and Social Media
Country/Territory: Germany
City: Munich
Period: 11/06/2019 to 14/06/2019
