-
Private EnviroStream
This repository contains datasets, queries and a generator for EnviroStream, a benchmark for Stream Reasoning (SR) systems. SR focuses on applying inference to dynamic...
-
Gene Disease Association Data and Features
This dataset contains data that can be used for disease gene discovery purposes. The data cover ten different diseases with associated seed genes (derived from DisGeNET) and...
-
RAR
-
The Hackernews dataset
This corpus has been extracted from The Hacker News website (https://thehackernews.com), a CS news platform that attracts over 8 million readers monthly, which is daily...
-
-
CSV
Resource: 'World Trade Web_2000'
-
SWH Filenames
A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...
-
ZIP
-
Synthetic Dataset for Causal Analysis
The dataset is a synthetic version of the well-known German Credit dataset (https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data). It includes variables such as...
-
CSV
-
-
CSV
Resource: 'WTN_adj_2020'
-
Carbon Trade Network_2020
Weighted, directed adjacency matrix of the Carbon Trade Network in the year 2020.
-
CSV
-
Carbon Trade Network_2000
Weighted, directed adjacency matrix of the Carbon Trade Network in the year 2000.
-
CSV
-
DeLag: Microservices execution traces
The dataset contains execution traces collected from the well-known open-source microservices system Train-ticket. The traces are generated over a variety of scenarios,...
-
parquet
-
Private Identified CNVs from whole exome sequencing data of BRCA1/2 negative breast c...
This dataset offers a comprehensive analysis of Copy Number Variations (CNVs) identified in Whole Exome Sequencing (WES) data from patients with breast cancer who tested...
-
Prognostic stratification of patients with differentiated thyroid cancer
Proper risk stratification of patients with differentiated thyroid cancer (DTC) is essential to avoid both unnecessary diagnostic procedures in low-risk patients and clinical...
-
A comparison of approaches for type-2 diabetes treatment
This experiment compares the performance of several GNN-based approaches for predicting the therapy recommended to type-2 diabetes patients.
-
A hybrid approach for PPI
We propose a new framework that can exploit topological and biological information to predict protein-protein interactions. The algorithm relies on the underlying hypothesis...
-
EpiCID: A framework for discovering interactions between SNPs
Epistatic interactions (EIs) of gene loci often determine complex trait phenotypes. EIs may indicate the underlying molecular mechanisms of multifactorial traits and diseases....
-
Introduction to Data Curation
This course is an introduction to data collection, data preparation & transformation and data analysis. It contains the essential concepts for a researcher in order to...
-
PDF
-
Thyroid-cancer patients
The data used originate from the web-based database of the Italian Thyroid Cancer Observatory (ITCO), opened in 2013 at the Thyroid Cancer Center of the Sapienza University of...
-
BioTAGME: A comprehensive platform for biological knowledge network analysis
This network was built through BioTAGME, a system that combines TAGME, an entity-annotation framework based on the Wikipedia corpus, with a network-based inference methodology (i.e.,...
-
Scientific Publications Dataset
This is the sciMAG2015 dataset, i.e., the open dataset linking Microsoft Academic Graph and sciMAGO's journal classification for bibliometrics studies. It includes publication...
-
Data
-
Soccer Team Performance
The dataset contains the performance features (passes, shots, goals, tackles, etc.) of soccer teams during the games of six major European leagues in three seasons. The dataset...