Items - SoBigData.eu Catalogue

Dataset

FANCY Dataset

(NLI) FANCY (FActivity, Negation, Common-sense, hYpernymy) is a new dataset with 4,000 sentence pairs concerning complex linguistic phenomena such as factivity, negation,...

Dataset

Santorini Tweets July-August 2021

This dataset contains 225,501 tweets written by 141,277 users. The tweets are either geolocated in Santorini or contain the word or hashtag "santorini" in the text. They...
- ZIP (tweet_santorini.csv)

Dataset

The Italian Music Dataset

The dataset was built using the Spotify and SoundCloud APIs. It comprises over 14,500 different songs by both famous and lesser-known Italian musicians. Each song...
- JSON

Dataset

Conversational search dataset with labels

The CAsT 2019 data is split into two files, one for training and one for testing. - Training set: CAsT 2019 conversations from the training set and from the test set without...

Dataset

Dataset for Evaluating Abstractive Summaries of Crisis-Related Social Media

The dataset was created for evaluating summaries generated from social media content posted during five natural disasters. The dataset contains: ground truth reports created by human...

You can also access this registry using the API (see API Docs).