Gene Disease Association Data and Features

This dataset supports disease gene discovery. It covers ten diseases, each with associated seed genes (derived from DisGeNET) and features designed specifically for network-based disease gene prioritization. A BioGRID-derived protein-protein interaction network, restricted to human genes, is also provided.
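
As a rough illustration of how seed genes and an interaction network can be combined for prioritization, the sketch below scores each non-seed gene by the fraction of its network neighbors that are seed genes. This is an assumption-laden toy example: the gene names, the edge list, and the `seed_neighbor_fraction` feature are hypothetical and are not taken from the dataset or from the NIAPU method, and the file layout inside the archive is not described on this page, so the data are defined in memory rather than loaded from disk.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def seed_neighbor_fraction(adjacency, seeds):
    """Hypothetical feature: share of each non-seed gene's neighbors that are seeds."""
    seeds = set(seeds)
    return {
        gene: len(neighbors & seeds) / len(neighbors)
        for gene, neighbors in adjacency.items()
        if gene not in seeds and neighbors
    }


# Toy data (hypothetical identifiers, not real genes from the dataset).
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G2", "G4"), ("G4", "G5")]
toy_seeds = {"G2"}

scores = seed_neighbor_fraction(build_adjacency(toy_edges), toy_seeds)

# Rank candidates by descending score, breaking ties alphabetically.
ranked = sorted(scores, key=lambda gene: (-scores[gene], gene))
print(ranked)
```

With real data, the edge list would come from the provided interaction network and the seed set from the per-disease seed gene lists, once the actual file formats are known.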

Personal Data Attributes

Description: Personal Data related Information

Field Value
Anonymised No
ChildrenData N/A (Not applicable)
Cross Border Authorised Yes
Data Flow Legal Basis Data were downloaded from the BioGRID and DisGeNET online knowledge bases and processed with in-house scripts for data cleaning and feature generation.
Data Protection Impact Assessment No
Ethics Committee Approval No
General Data No
Informed Consent Template No
Non Personal Data Explanation The dataset is derived from publicly available data containing no personal information.
Personal Data No
Personal data was manifestly made public by the data subject N/A (Not applicable)
Sensitive Data No
Additional Info
Field Value
Accessibility Virtual Access
Accessibility Mode Download
Attribution requirements The data were generated/processed in and for the following work: Paola Stolfi, Andrea Mastropietro, Giuseppe Pasculli, Paolo Tieri, Davide Vergni, NIAPU: network-informed adaptive positive-unlabeled learning for disease gene identification, Bioinformatics, Volume 39, Issue 2, February 2023, btac848, https://doi.org/10.1093/bioinformatics/btac848
Availability On-Line
Basic rights Download
Basic rights Copying
Basic rights Distribution
Basic rights Modification
Creation Date 2022-05-27 16:30
Creator Mastropietro, Andrea, mastropietro@diag.uniroma1.it, orcid.org/0000-0002-3456-9428
Dataset Citation Paola Stolfi, Andrea Mastropietro, Giuseppe Pasculli, Paolo Tieri, Davide Vergni, NIAPU: network-informed adaptive positive-unlabeled learning for disease gene identification, Bioinformatics, Volume 39, Issue 2, February 2023, btac848, https://doi.org/10.1093/bioinformatics/btac848
Dataset Re-Use Safeguards None
DiskSize 7
Field/Scope of use Research only
Format .rar
Group Health Studies
Language eng, English
License term 2022-05-27 16:30/2028-05-27 16:30
Manifestation Type Replica
Processing Degree Secondary
Retention Period 2028-05-27 16:30/2029-05-27 16:30
Semantic Coverage genes, network data, protein-protein interaction
Size 21 files
SoBigData Node SoBigData EU
SoBigData Node SoBigData IT
Sublicense rights No
Territory of use World Wide
Thematic Cluster Other
system:type Dataset
Management Info
Field Value
Author Mastropietro Andrea
Maintainer Mastropietro Andrea
Version 1
Last Updated 27 February 2024, 09:45 (CET)
Created 1 December 2023, 17:31 (CET)