Code and data accompanying the paper: Quantifying Privacy Risks in Synthetic Data: A Study on Black-Box Membership Inference

This repository contains the code and data for the paper “Quantifying Privacy Risks in Synthetic Data: A Study on Black-Box Membership Inference”. It enables full reproducibility of the empirical study that evaluates privacy–utility trade-offs in synthetic tabular data under multiple black-box membership inference attacks (MIAs). Using the Adult, COMPAS, and Southern German Credit datasets, the code repeatedly generates synthetic datasets (30 train/test splits) with four generators (Mixup, TVAE, CTGAN, CTAB-GAN+), measures downstream utility via XGBoost performance, and quantifies privacy risk using several MIAs (distance-, distribution-, Monte Carlo-based, and DOMIAS) together with common privacy metrics (DCR, NNDR, and Privacy Loss).
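As a rough illustration of two of the privacy metrics named above, the sketch below computes the Distance to Closest Record (DCR) and the Nearest Neighbour Distance Ratio (NNDR) for each synthetic row against a set of real rows. It is not taken from the repository: the function name `dcr_and_nndr`, the choice of Euclidean distance, and the assumption that all features are already numerically encoded and scaled are illustrative assumptions, and the paper's code may preprocess the data and measure distance differently.

```python
import numpy as np


def dcr_and_nndr(synthetic, real):
    """Return per-row (DCR, NNDR) of synthetic rows against real rows.

    Hypothetical helper for illustration only. Assumes numeric, already
    encoded and scaled features, Euclidean distance, and at least two
    real rows.
    """
    synthetic = np.asarray(synthetic, dtype=float)
    real = np.asarray(real, dtype=float)

    # Pairwise Euclidean distances, shape (n_synthetic, n_real).
    diffs = synthetic[:, None, :] - real[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # Smallest and second-smallest distance for every synthetic row.
    two_smallest = np.partition(dists, 1, axis=1)[:, :2]
    dcr = two_smallest[:, 0]
    second = two_smallest[:, 1]

    # NNDR = nearest / second-nearest distance. When both are zero the
    # ratio is undefined; this sketch sets it to 1.0 by convention.
    nndr = np.divide(dcr, second, out=np.ones_like(dcr), where=second > 0)
    return dcr, nndr


if __name__ == "__main__":
    real_rows = [[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]]
    synthetic_rows = [[0.0, 0.0], [3.0, 0.0]]
    dcr, nndr = dcr_and_nndr(synthetic_rows, real_rows)
    print("DCR:", dcr)
    print("NNDR:", nndr)
```

A DCR near zero means a synthetic row almost coincides with a real record. An NNDR near zero means the row sits much closer to one real record than to any other. Both are commonly read as signs of higher disclosure risk.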

Additional Info
Associate Project: FAIR
Group: Social Impact of AI and explainable ML
Involved Institutions: Nexa Center for Internet & Society
Involved Institutions: Politecnico di Torino
Involved People: Fantino, Giacomo, giacomo.fantino@polito.it, orcid.org/0009-0009-5808-604X
Involved People: Rondina, Marco, marco.rondina@polito.it, orcid.org/0009-0008-8819-3623
Involved People: Vetrò, Antonio, antonio.vetro@polito.it, orcid.org/0000-0003-2027-3308
Involved People: De Martin, Juan Carlos, juancarlos.demartin@polito.it, orcid.org/0000-0002-7867-1926
SoBigData Node: SoBigData EU
SoBigData Node: SoBigData IT
State: Complete
Thematic Cluster: Privacy Enhancing Technology [PET]
system:type: Experiment
Management Info
Author: Fantino Giacomo
Maintainer: Fantino Giacomo
Version: 1
Last Updated: 12 January 2026, 16:15 (CET)
Created: 9 January 2026, 09:46 (CET)