-
SWH Filenames
A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...
ZIP
-
Private Smart Cities Weather and Pollution conditions
A set of weather and climatic conditions gathered during the Toolsmart PoN project (Open Community PA 2020 – Pon Governance 2014-2020). Data are obtained from IoT based...
-
Santorini Tweets July-August 2021
This dataset contains 225,501 tweets written by 141,277 users. These tweets are geolocated in Santorini, or they contain the word or the hashtag "santorini" in the text. They...
ZIP
-
SMAPH Query Entity Linker
The SMAPH system links queries to the entities they mention, disambiguating mentions if needed. Entities are Wikipedia pages. This problem is known as "entity recognition and...
HTML
-
Quantum Distance-Based Classifier
The Quantum Distance-Based Classifier is a technique inspired by the classical k-Nearest Neighbors that leverages quantum properties to perform prediction.
-
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump, with the following parameters: -cbow 0...