Common Crawl Financial News Dataset

This dataset contains financial news articles related to companies in the S&P 500 index, covering the period from September 2016 to February 2020. The articles were extracted from the Common Crawl dataset following the methodology of Jazbec, Pásztor, Faltings, Antulov-Fantulin and Kolm (2020).

Data and Resources
  • Common_Crawl_Financial_News (CSV)

Personal Data Attributes

Description: Personal Data related Information

Field Value
Anonymised No
ChildrenData No
General Data Yes
Personal Data No
Personal data was manifestly made public by the data subject No
Sensitive Data No
Additional Info
Field Value
Accessibility Both
Accessibility Mode Download
Availability On-Line
Basic rights Download
Creation Date 2024-02-01
Creator Antulov-Fantulin, Nino
Dataset Citation Dataset from Jazbec, M., Pásztor, B., Faltings, F., Antulov-Fantulin, N., & Kolm, P. N. (2021). On the impact of publicly available news and information transfer to financial markets. Royal Society Open Science, 8(7), 202321.
Dataset Re-Use Safeguards None
Field/Scope of use Non-commercial only
Group Demography, Economy and Finance 2.0
License term 2024-02-01 / 2040-12-31
Manifestation Type Virtual
Processing Degree Primary
Retention Period 2024-02-01 / 2050-12-31
SoBigData Node SoBigData EU
Sublicense rights No
Territory of use World Wide
Thematic Cluster Social Data [SD]
Time Coverage 2016-09-01 / 2020-02-01
system:type Dataset
Management Info
Field Value
Author Vasiliauskaite Vaiva
Maintainer Antulov-Fantulin Nino
Version 1
Last Updated 27 February 2024, 09:37 (CET)
Created 1 February 2024, 16:50 (CET)