Machine Learning Approaches to Classify Primary and Metastatic Cancers Using Tissue of Origin-Based DNA Methylation Profiles
Additional Info
Field Value
Detailed description Metastatic cancers account for up to 90% of cancer-related deaths. Clearly differentiating metastatic cancers from primary cancers is crucial for identifying the cancer type and developing targeted treatment for each type. DNA methylation patterns have been suggested as a promising target for cancer prediction and are also considered an important mediator of the transition to metastatic cancer. In the present study, we used 9303 methylome samples covering 24 cancer types, downloaded from publicly available data repositories, including The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). We constructed machine learning classifiers to discriminate metastatic, primary, and non-cancerous methylome samples. We applied support vector machine (SVM), Naive Bayes (NB), extreme gradient boosting (XGBoost), and random forest (RF) models to classify the cancer types based on their tissue of origin. RF outperformed the other classifiers, with an average accuracy of 99%. Moreover, we applied local interpretable model-agnostic explanations (LIME) to identify the methylation biomarkers that are important for classifying cancer types.
Group Health Studies
Involved Institutions University of Tartu
Involved People Sharma, Rajesh, rajesh.sharma@ut.ee, orcid.org/0000-0003-3581-1332
SoBigData Node SoBigData EU
State Complete
Thematic Cluster Visual Analytics [VA]
Thematic Cluster Other
system:type Experiment
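
As an illustration of the classification task summarized in the detailed description, the following is a minimal sketch of one of the four listed model families, Naive Bayes, written in plain Python. It is not the study's implementation. The data are synthetic beta values generated on the spot, the per-class mean methylation levels are invented for illustration, and the function names (make_synthetic_samples, fit_gaussian_nb, predict, accuracy) are hypothetical. The choice of the Gaussian variant of NB is also an assumption, since the description does not say which variant was used.

```python
import math
import random
from collections import defaultdict


def make_synthetic_samples(n_per_class, n_sites, seed=0):
    """Generate toy beta values in [0, 1]. These are NOT real methylation data."""
    rng = random.Random(seed)
    # Hypothetical per-class mean methylation levels, chosen only for illustration.
    class_means = {"non-cancerous": 0.2, "primary": 0.5, "metastatic": 0.8}
    X, y = [], []
    for label, mean in class_means.items():
        for _ in range(n_per_class):
            # Clip each draw so it stays a valid beta value.
            row = [min(1.0, max(0.0, rng.gauss(mean, 0.1))) for _ in range(n_sites)]
            X.append(row)
            y.append(label)
    return X, y


def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Estimate the log prior and per-site mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, s2 in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, row) == label for row, label in zip(X, y))
    return correct / len(y)


# Train and evaluate on independently generated toy samples.
X_train, y_train = make_synthetic_samples(n_per_class=50, n_sites=20, seed=1)
X_test, y_test = make_synthetic_samples(n_per_class=20, n_sites=20, seed=2)
model = fit_gaussian_nb(X_train, y_train)
test_accuracy = accuracy(model, X_test, y_test)
print(f"toy test accuracy: {test_accuracy:.2f}")
```

The toy classes are deliberately well separated, so the printed accuracy says nothing about the results reported for the real methylome samples.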
Management Info
Field Value
Author Sharma, Rajesh
Maintainer Sharma, Rajesh
Version 1
Last Updated 3 October 2023, 16:36 (CEST)
Created 10 October 2022, 12:29 (CEST)