Private Cybersecurity NER BERT-base-cased model
This method includes a Python script and files of a BERT-base-cased model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that...
Cybersecurity NER RoBERTa-base model
This method includes a Python script and files of a RoBERTa-base model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that will...
Resources (login required): config (JSON), merges (TXT), model (BIN), model_args (JSON), scheduler (ZIP), special_tokens_map (JSON), tokenizer_config (JSON), training_args (ZIP), tokenizer (JSON), vocab (JSON), optimizer (ZIP), inference (py)
Cybersecurity NER SecureBERT model
This method includes a Python script and files of a SecureBERT model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that will be...
Resources (login required): config (JSON), merges (TXT), model (BIN), model_args (JSON), optimizer (ZIP), scheduler (ZIP), special_tokens_map (JSON), tokenizer (JSON), tokenizer_config (JSON), training_args (ZIP), vocab (TXT), inference (text/x-python)
Private Cybersecurity NER dataset
Our dataset was created by merging the APTNER and CyNER datasets and contains 13,601 sentences, 347,779 tokens, and 37,684 entities. The split ratio was roughly 70% for training and...
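The counts quoted for the Cybersecurity NER dataset can be turned into a few per-sentence figures. The sketch below is only illustrative arithmetic on those three numbers; treating "roughly 70%" as exactly 0.70 is an assumption, and the function name `corpus_stats` is hypothetical.

```python
# Counts quoted in the dataset description.
SENTENCES = 13601
TOKENS = 347779
ENTITIES = 37684

# Assumption: "roughly 70%" is taken as exactly 0.70 for illustration only.
TRAIN_FRACTION = 0.70


def corpus_stats(sentences, tokens, entities, train_fraction):
    """Derive simple per-sentence averages and an approximate training size."""
    return {
        "tokens_per_sentence": tokens / sentences,
        "entities_per_sentence": entities / sentences,
        "approx_train_sentences": round(sentences * train_fraction),
    }


stats = corpus_stats(SENTENCES, TOKENS, ENTITIES, TRAIN_FRACTION)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

On these figures a sentence carries about 25.6 tokens and 2.8 entities on average, and a 70% training share would cover roughly 9,500 sentences.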
SWH Filenames
A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...
Resources (login required): SWH Filenames (ZIP)
Introduction to Data Curation
This course is an introduction to data collection, data preparation and transformation, and data analysis. It contains the essential concepts for a researcher in order to...
Resources (login required): Introduction to Data Curation (PDF)
Python library for direct and indirect discrimination prevention in data mining
This Python library implements the discrimination discovery and prevention method proposed in the paper: “A methodology for direct and indirect discrimination prevention in...
Resources (login required): Link to library (GitHub)
Quantum Distance-Based Classifier
The Quantum Distance-Based Classifier is a technique inspired by the classical k-Nearest Neighbors algorithm that leverages quantum properties to perform prediction.
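As a rough illustration of how a distance-based quantum classifier weighs training points, the sketch below classically simulates one commonly described interference scheme: each unit-normalized training vector contributes |x + x_m|² to the score of its class, so closer vectors count more. Whether the catalogued method uses this exact weighting is an assumption, and the names `class_probabilities` and `predict` are hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit length, as amplitude encoding requires."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def class_probabilities(train_vectors, labels, sample):
    """Classically simulate the class probabilities of the interference scheme."""
    x = normalize(sample)
    weights = {}
    for vector, label in zip(train_vectors, labels):
        x_m = normalize(vector)
        # |x + x_m|^2 is large when x_m is close to x and 0 when x_m == -x.
        weight = sum((a + b) ** 2 for a, b in zip(x, x_m))
        weights[label] = weights.get(label, 0.0) + weight
    total = sum(weights.values())
    if total == 0:
        raise ValueError("sample is antipodal to every training vector")
    return {label: weight / total for label, weight in weights.items()}


def predict(train_vectors, labels, sample):
    """Return the label with the highest simulated probability."""
    probs = class_probabilities(train_vectors, labels, sample)
    return max(probs, key=probs.get)


probs = class_probabilities([[1, 0], [0, 1]], ["a", "b"], [1, 0])
print(probs, predict([[1, 0], [0, 1]], ["a", "b"], [1, 0]))
```

Unlike k-Nearest Neighbors, this weighting uses every training point rather than only the k closest ones.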
Dictionary creator
This tool creates a dictionary with inverse document frequency (idf) values from the Google NGrams dataset.
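The description does not say which idf formula the tool applies, so the following is a minimal sketch that assumes the common definition idf(t) = ln(N / df(t)), where df(t) is the number of documents containing the term and N is the total number of documents. The function name `build_idf` and the toy counts are hypothetical and are not taken from the Google NGrams data.

```python
import math


def build_idf(doc_freq, total_docs):
    """Map each term to ln(total_docs / document_frequency).

    Terms with a non-positive or impossible document frequency are skipped.
    """
    idf = {}
    for term, df in doc_freq.items():
        if 0 < df <= total_docs:
            idf[term] = math.log(total_docs / df)
    return idf


# Toy document frequencies (hypothetical values for illustration).
toy_doc_freq = {"the": 100, "firewall": 10, "broken": 0}
idf = build_idf(toy_doc_freq, total_docs=100)
for term, value in sorted(idf.items()):
    print(f"{term}: {value:.4f}")
```

A term that appears in every document gets an idf of 0, while rarer terms get larger values.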
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump, with the following parameters: -cbow 0...
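In the word2vec tool, `-cbow 0` selects the skip-gram architecture, which learns to predict the surrounding words from each center word. The sketch below shows only how (center, context) training pairs are formed from a token sequence. The window size of 2 is a hypothetical value, not one of the listed parameters, and the function name `skipgram_pairs` is illustrative; the real tool also samples a smaller window at random for each position.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = ["embeddings", "capture", "word", "similarity"]
for center, context in skipgram_pairs(sentence, window=2):
    print(center, "->", context)
```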