Quantifying Privacy Risks for Continuous Trait Data

dc.contributor.author: He, Muqing
dc.contributor.author: Zou, Deqing
dc.contributor.author: Qiang, Weizhong
dc.contributor.author: Xu, Shouhuai
dc.contributor.author: Wu, Wenbo
dc.contributor.author: Jin, Hai
dc.date.accessioned: 2022-10-26T11:08:24Z
dc.date.available: 2022-10-26T11:08:24Z
dc.date.issued: 2022-10-20
dc.date.updated: 2022-10-26T11:08:25Z
dc.description.abstract: In the life sciences, rapid biotechnological development has led to the creation of huge amounts of biological data. The use of such data naturally raises concerns about breaches of human genetic privacy, which in turn discourage biological data sharing. Prior studies have investigated possible privacy issues associated with individuals' trait data. However, few studies have quantitatively analyzed the probability of the privacy risk. In this paper, we fill this void by proposing a scheme for systematically breaching genomic privacy, which is centered on quantifying the probability of the privacy risk of continuous trait data. With well-designed synthetic datasets, our theoretical analysis and experiments lead to several important findings: (i) the size of the genetic signatures and the sensitivity (true positive rate) significantly affect the accuracy of the re-identification attack; (ii) both the size of the genetic signatures and the minor allele frequency have a significant impact on distinguishing true-positive from false-positive matches between traits and genetic profiles; (iii) the size of the matching quantitative trait locus dataset has a large impact on the confidence of the privacy risk assessment. Validation with a real dataset shows that our findings can effectively estimate the privacy risks of a continuous trait dataset.
dc.description.department: Management Science and Statistics
dc.identifier: doi: 10.3390/app122010586
dc.identifier.citation: Applied Sciences 12 (20): 10586 (2022)
dc.identifier.uri: https://hdl.handle.net/20.500.12588/1153
dc.rights: Attribution 4.0 United States
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: genomic privacy
dc.subject: re-identification
dc.subject: quantitative trait locus
dc.subject: sensitivity
dc.title: Quantifying Privacy Risks for Continuous Trait Data
dc.type: Article (en_US)

Files

Original bundle

Name:
applsci-12-10586.pdf
Size:
3.65 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.86 KB
Format:
Item-specific license agreed upon to submission