News Category Dataset
Published Sep 23, 2022 · Rishabh Misra · ArXiv
104 citations · 9 influential citations
Abstract
People rely on news to know what is happening around the world and to inform their daily lives. Today, with fake news proliferating, a large-scale, high-quality source of authentic news articles with published category information is valuable for learning the natural-language syntax and semantics of authentic news. In this work, we present the News Category Dataset, which contains around 210k news headlines from 2012 to 2022 obtained from HuffPost, along with useful metadata to enable various NLP tasks. We also derive some novel insights from the dataset and describe its existing and potential applications.