6 items found

Types: Dataset
Groups: Societal Debates and Misinformation
Tags: Web data

  • Dataset

    ClueWeb09

    The ClueWeb09 dataset consists of about 1 billion web pages in ten languages that were collected in January and February 2009. It was created to support research on...
  • Dataset

    DE webarchive

    The dataset consists of all content from the .de top-level domain as crawled by the Internet Archive.
    • HTML
      Resource 'Internet Archive Wayback ...' (login required; not accessible to guest users).
  • Dataset

    .ee Web archive

    The .ee Web archive consists of snapshots from 2015.
  • Dataset

    UK election abuse data

    The GATE team (gate.ac.uk) at the University of Sheffield have collected 1.4 million tweets sent to and by UK members of parliament in the months leading up to the 2015 and...
    • XLS
      Resource 'uk-election-abuse.tar.gz' (login required; not accessible to guest users).
  • Dataset

    Articles and comments of major Estonian newspapers

    The dataset contains articles and comments from four major Estonian news portals, from the early 2000s to 2016.
  • Dataset

    ClueWeb12

    The ClueWeb12 dataset consists of 733,019,372 English web pages, collected between February 10, 2012 and May 10, 2012. It was created to support research on information...
You can also access this registry using the API (see API Docs).