-
Private Cybersecurity NER dataset
Our dataset was created by merging the APTNER and CyNER datasets and contains 13601 sentences, 347779 tokens, and 37684 entities. The split ratio was roughly 70% for training and...
-
SWH Filenames
A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...
-
ZIP
-
Quantum Distance-Based Classifier
The Quantum Distance-Based Classifier is a technique inspired by the classical k-Nearest Neighbors algorithm that leverages quantum properties to perform prediction.
-
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump, with the following parameters: -cbow 0...
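
In the original word2vec command-line tool, `-cbow 0` selects the skip-gram model, which learns to predict the surrounding context words from each center word. As a rough illustration of what that means, the sketch below enumerates the (center, context) training pairs for a tokenized sentence. The function name `skipgram_pairs` and the `window` value are hypothetical, chosen for this example only; the listing's remaining parameters are truncated, and the real tool additionally samples a reduced window size per word and subsamples frequent words, which this sketch omits.

```python
from typing import Iterator


def skipgram_pairs(tokens: list[str], window: int = 2) -> Iterator[tuple[str, str]]:
    """Yield (center, context) pairs for tokens within `window` positions of each other.

    Illustrative only: `window` is an assumed value, not one taken from the listing.
    """
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                # leftmost context index
        hi = min(len(tokens), i + window + 1)  # one past the rightmost context index
        for j in range(lo, hi):
            if j != i:                         # a word is not its own context
                yield center, tokens[j]


pairs = list(skipgram_pairs(["the", "cat", "sat"], window=1))
```

Each pair becomes one training example: the model adjusts the center word's vector so that it scores the observed context word higher than randomly drawn words.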