-
Articles and comments of major Estonian newspapers
The dataset contains articles and comments from four major Estonian news portals, covering the early 2000s to 2016. -
ClueWeb12
The ClueWeb12 dataset consists of 733,019,372 English web pages, collected between February 10, 2012 and May 10, 2012. It was created to support research on information... -
Sheffield NERD Tweet Corpus
The dataset contains 794 tweets annotated with named entities disambiguated against DBpedia, and split into equally sized training and test portions. 400 tweets from 2013 come...-
FINF
-
GPS Tracks - Tuscany 2011
This dataset contains GPS trajectories of private vehicles crossing the region of Tuscany in Italy. It comprises about 11 million trips by 150,000 users, collected in May... -
GeoLife - GPS trajectories dataset
This entry links to a GPS trajectory dataset collected in the Microsoft Research Asia GeoLife project by 182 users over a period of more than three years (from April 2007 to August 2012)....-
ZIP
-
Aalto-Twitter
The dataset consists of about 418 million tweets from June 25, 2015 to September 19, 2015. The tweets concern trending hashtags and were gathered through the public Twitter API. -
Aalto-Foursquare
The dataset consists of about 15 million tweets that point to public Foursquare check-ins. -
Open data from NervousNet
This dataset contains anonymized proximity information sent by 154 mobile phones (both Android and iPhone) via phone apps. This information is sent by Bluetooth beacons every...-
ZIP
-
Micro Project Datasets: Academic Migration and Academic Networks
Datasets used in and produced by the micro project titled 'Academic Migration and Academic Networks: Evidence from Scholarly Big Data and the Iron Curtain'-
HTML
-
Activity data from the Covid19 period
Activity data from the telecommunications company Telia (Finland), reporting the number of people dwelling in an area for a certain amount of time. More precisely, activity count... -
Private 64-tiles tessellation of Chicago
Square tessellation of the city center of Chicago, Illinois, into 64 tiles. The tessellation covers only the central part of Chicago, namely the neighborhoods 'LOOP', 'NEAR SOUTH... -
ZIP
-
Covid infodemic in Italy -- Most retweeted accounts
Top 10 most retweeted accounts on Covid-related keywords, between Jan 30 and Mar 20, 2020.-
ZIP
-
Twitter Dataset British MPs
This dataset contains the tweet_ids from the Twitter timelines of 584 members of the British Parliament (collected between 4 and 6 March 2022). The users are identified from...-
TSV
-
A dataset of journalists on Twitter
This dataset comprises the Twitter timelines of journalists from 17 countries across 8 continental regions, downloaded in May 2018. We used the Twitter...-
HTML
-
Interaction bias. Experiments dataset
Artificial Intelligence (AI) is increasingly used to build Decision Support Systems (DSS) across many domains. In our work, we conducted a series of experiments designed to...-
JSON
-
Multi-Task Faces (MTF) dataset
The Multi-Task Faces (MTF) dataset consists of cropped human faces for classification tasks or other research purposes. Each image in the dataset is labelled according to four...-
ZIP
-
DBLP Network
The DBLP computer science bibliography provides a comprehensive list of research papers in computer science. This dataset is a co-authorship network constructed upon the DBLP...-
HTML
-
Wikinews dataset
This dataset consists of a sample of 365 news articles published by Wikinews from November 2004 to June 2014 and annotated with about 5000 entities, each associated with a saliency...-
JSON