43 items found

Tags: Information Retrieval

  • Dataset

    Compounds with Activity against the Dopamine D2 Receptor

    Database containing compounds active against the dopamine D2 receptor together with random inactive compounds as negative samples for learning purposes. Train, validation, and...
    • ZIP
      Resource: 'compound_activity_dopamine_d2'
  • Dataset

    FANCY Dataset

    (NLI) FANCY (FActivity, Negation, Common-sense, hYpernymy) is a new dataset with 4000 sentence pairs concerning complex linguistic phenomena such as factivity, negation,...
    • Resource: 'FANCY Dataset'
  • Dataset

    Santorini Tweets July-August 2021

    This dataset contains 225,501 tweets written by 141,277 users. These tweets are geolocated in Santorini, or they contain the word or the hashtag "santorini" in the text. They...
    • ZIP
      Resource: 'tweet_santorini.csv'
  • Dataset

    Private Italian Thesaurus for Tourism domain

    An Italian thesaurus for the tourism domain, comprising 2,684 concepts organized according to semantic relationships (equivalence, hierarchical, and associative). The...
  • Dataset

    Italian Tourism Dataset

    A set of user comments crawled and scraped from two major tourism websites (Booking.com and Tripadvisor.com) concerning the main tourist points of interest in Italy and, in...
    • HTML
      Resource: 'tourism-dataset'
  • TrainingMaterial

    Introduction to Data Curation

    This course is an introduction to data collection, data preparation & transformation and data analysis. It contains the essential concepts for a researcher in order to...
    • PDF
      Resource: 'Introduction to Data Curation'
    • ZIP
      Resource: 'geo-annotated tweets.zip'
  • Dataset

    Wyscout soccer-logs dataset

    A dataset of soccer-logs for all the main soccer leagues in the world, from season 2014/2015 to the current one.
  • Dataset

    Soccer Events

    This dataset contains data on one full season of soccer games. For each player, it records the locations (positions on the pitch) visited and all the events they generated...
    • ZIP
      Resource: 'Soccer event data'
  • Method

    Ephemerality metric

    https://github.com/HPAI-BSC/ephemerality Code for calculating the ephemerality metrics that can be used to estimate how "ephemeral" discussion topics are based on their...
    • ZIP
      Resource: 'ephemerality-main'
  • JournalArticle

    The PGM-index: a fully-dynamic compressed learned index with provable worst-ca...

    We present the first learned index that supports predecessor, range queries and updates within provably efficient time and space bounds in the worst case. In the (static)...
    • PDF
      Resource: 'link to publication'
  • Method

    Python library for direct and indirect discrimination prevention in data mining

    This Python library implements the discrimination discovery and prevention method proposed in the paper: “A methodology for direct and indirect discrimination prevention in...
    • GitHub
      Resource: 'Link to library'
  • Method

    GSP - Geo-Semantic-Parsing

    GSP receives a text document as input and returns an enriched document in which all mentions of places/locations are associated with the corresponding geographic coordinates. To...
  • Application

    SMAPH Query Entity Linker

    The SMAPH system links queries to the entities they mention, disambiguating mentions if needed. Entities are Wikipedia pages. This problem is known as "entity recognition and...
    • HTML
      Resource: 'SMAPH documentation'
  • Method

    Quantum Distance-Based Classifier

    The Quantum Distance-Based Classifier is a technique inspired by the classical k-Nearest Neighbors that leverages quantum properties to perform prediction.
  • Method

    CLiQS

    CLiQS is a Python software package for summarizing social media texts with a diversified approach.
    • Resource: 'CLiQS-CM'
  • Method

    Fast-Forward Indexes

    Efficient interpolation-based ranking on CPUs
    • Resource: 'PyPI package for ...'
  • Method

    Dictionary creator

    This tool creates a dictionary with inverse document frequency (idf) values from the Google NGrams dataset.
    • Resource: 'Source code'
  • Dataset

    Wikipedia Word Embeddings

    Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump with the following parameters: -cbow 0...
    • Resource: 'Embeddings'
  • Dataset

    Conversational search dataset with labels

    The CAsT 2019 data is split into two files, one for training and one for testing. - Training set: CAsT 2019 conversations from training set and from test set without...
    • Resource: 'Conversational dataset ...'
You can also access this registry using the API (see API Docs).