-
Compounds with Activity against the Dopamine D2 Receptor
Database containing compounds active against the dopamine D2 receptor together with random inactive compounds as negative samples for learning purposes. Train, validation, and...-
ZIP
-
FANCY Dataset
(NLI) FANCY (FActivity, Negation, Common-sense, hYpernymy) is a new dataset with 4000 sentence pairs concerning complex linguistic phenomena such as factivity, negation,... -
Santorini Tweets July-August 2021
This dataset contains 225,501 tweets written by 141,277 users. These tweets are geolocated in Santorini, or they contain the word or the hashtag "santorini" in the text. They...-
ZIP
-
Private Italian Thesaurus for Tourism domain
An Italian thesaurus for the tourism domain, comprising 2,684 concepts organized according to semantic relationships (equivalence, hierarchical and associative). The... -
Italian Tourism Dataset
A set of user comments crawled and scraped from two major tourism websites (Booking.com and Tripadvisor.com) related to the main tourist points of interest in Italy and, in...-
HTML
-
Introduction to Data Curation
This course is an introduction to data collection, data preparation & transformation and data analysis. It contains the essential concepts for a researcher in order to...-
PDF
-
Resource: 'geo-annotated tweets.zip'
ZIP
-
Wyscout soccer-logs dataset
A dataset of soccer-logs for all the main soccer leagues in the world, from season 2014/2015 to the current one. -
Soccer Events
This dataset contains data regarding one full season of soccer games. For each player there are locations (positions on the pitch) visited and all the events they generated...-
ZIP
-
Ephemerality metric
https://github.com/HPAI-BSC/ephemerality Code for calculating the ephemerality metrics that can be used to estimate how "ephemeral" discussion topics are based on their...-
ZIP
-
The PGM-index: a fully-dynamic compressed learned index with provable worst-ca...
We present the first learned index that supports predecessor, range queries and updates within provably efficient time and space bounds in the worst case. In the (static)...-
PDF
-
Python library for direct and indirect discrimination prevention in data mining
This Python library implements the discrimination discovery and prevention method proposed in the paper: “A methodology for direct and indirect discrimination prevention in...-
GitHub
-
GSP - Geo-Semantic-Parsing
GSP receives a text document as input and returns an enriched document, where all mentions of places/locations are associated with the corresponding geographic coordinates. To... -
SMAPH Query Entity Linker
The SMAPH system links queries to the entities they mention, disambiguating mentions if needed. Entities are Wikipedia pages. This problem is known as "entity recognition and...-
HTML
-
Quantum Distance-Based Classifier
The Quantum Distance-Based Classifier is a technique inspired by the classical k-Nearest Neighbors that leverages quantum properties to perform prediction. -
CLiQS
CLiQS is a Python software package for summarizing social media texts with a diversified approach. -
Dictionary creator
This tool creates a dictionary with inverse document frequency (idf) values from the Google NGrams dataset. -
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump with the following parameters: -cbow 0... -
Conversational search dataset with labels
CAsT 2019 data is split into two files, one for training and the other for testing. - Training set: CAsT 2019 conversations from training set and from test set without...