Reddit Echo Chamber dataset

In digital environments, the term echo chamber refers to an alarming phenomenon in which beliefs are amplified or reinforced by repeated communication inside a closed system insulated from rebuttal. To date, both a formal definition and a platform-independent approach for its detection are still lacking. This paper proposes a general framework to identify echo chambers on online social networks, built on top of features they commonly share. Our approach is based on a four-step pipeline that involves (i) the identification of a controversial issue; (ii) the inference of users’ ideology on the controversy; (iii) the construction of users’ debate network; and (iv) the detection of homogeneous meso-scale communities. We further apply our framework in a detailed case study on Reddit, covering the first two and a half years of Donald Trump’s presidency. Our main purpose is to assess the existence of Pro-Trump and Anti-Trump echo chambers across three sociopolitical issues, as well as to analyze their stability and consistency over time. Although users appear strongly polarized in their ideology, most tend not to insulate themselves in echo chambers. However, the identified polarized communities proved to be stable over time.
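
As a rough illustration of steps (ii) to (iv) of the pipeline, the sketch below flags ideologically homogeneous groups in a toy debate network. It is not the authors' implementation: the user labels, the edge list, the use of connected components as a stand-in for community detection, and the 0.9 homogeneity threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: user -> inferred leaning (output of step ii).
leaning = {"a": "pro", "b": "pro", "c": "pro",
           "d": "anti", "e": "anti", "f": "pro"}

# Hypothetical debate network (step iii): undirected reply edges between users.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f")]


def connected_components(edges):
    """Stand-in for step (iv): group users by connected component.

    A real study would use a proper community detection algorithm;
    connected components are used here only to keep the sketch small.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def homogeneity(community, leaning):
    """Return (majority label, share of members holding that label)."""
    counts = Counter(leaning[u] for u in community)
    label, n = counts.most_common(1)[0]
    return label, n / len(community)


def echo_chamber_candidates(edges, leaning, threshold=0.9):
    """Keep communities whose majority share reaches the assumed threshold."""
    result = []
    for comp in connected_components(edges):
        label, share = homogeneity(comp, leaning)
        if share >= threshold:
            result.append((sorted(comp), label, share))
    return result


candidates = echo_chamber_candidates(edges, leaning)
print(candidates)
```

On this toy input only the group a, b, c passes the threshold, since d, e, f mixes both leanings. Ideological homogeneity alone does not capture how insulated a group is from the rest of the network, so a fuller sketch would also measure how many edges leave each community.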

Data and Resources
  • Reddit Echochamber (ZIP)

Personal Data Attributes

Description: Personal Data related Information

Field: Value
Anonymised: Pseudo Anonymized
ChildrenData: No
General Data: Yes
Personal Data: No
Personal data was manifestly made public by the data subject: No
Sensitive Data: No
Additional Info
Field: Value
Accessibility: Both
Accessibility Mode: Download
Availability: On-Line
Basic rights: Download, Copying, Distribution
Creation Date: 2024-02-02
Creator: Morini, Virginia
Dataset Citation: Morini, V., Pollacci, L., & Rossetti, G. (2021). Toward a standard approach for echo chamber detection: Reddit case study. Applied Sciences, 11(12), 5390.
Dataset Re-Use Safeguards: None
External Identifier: 10.5281/zenodo.10606191
Field/Scope of use: Any use
Group: Societal Debates and Misinformation
License term: 2024-02-02 / 2030-12-31
Manifestation Type: Virtual
Processing Degree: Primary
Retention Period: 2024-02-02 / 2050-12-31
Size: 13 MB
SoBigData Node: SoBigData EU, SoBigData IT
Sublicense rights: No
Territory of use: World Wide
Thematic Cluster: Social Network Analysis [SNA]
system:type: Dataset
Management Info
Field: Value
Author: Rossetti Giulio
Maintainer: Rossetti Giulio
Version: 1
Last Updated: 2 February 2024, 09:44 (CET)
Created: 2 February 2024, 09:36 (CET)