Inference attacks on genomic privacy with an improved HMM and an RCNN model for unrelated individuals

Information Sciences - Volume 512 - Pages 207-218 - 2020
Hongfa Ding1,2, Youliang Tian3, Changgen Peng1,4, Youshan Zhang5, Shuwen Xiang1
1State Key Laboratory of Public Big Data, College of Mathematics and Statistics, Guizhou University, Guiyang 550025, China
2College of Information, Guizhou University of Finance and Economics, Guiyang 550025, China
3College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
4CETC Big Data Research Institute Co., Ltd., Guiyang 550025, China
5Department of Computer Science and Engineering, Lehigh University, Bethlehem 18015, USA
