Privacy in pharmacogenetics: an end-to-end case study of personalized warfarin dosing

USENIX Security, (2014): 17-32


Abstract

We initiate the study of privacy in pharmacogenetics, wherein machine learning models are used to guide medical treatments based on a patient's genotype and background. Performing an in-depth case study on privacy in personalized warfarin dosing, we show that suggested models carry privacy risks, in particular because attackers can perfor...

Introduction
  • Technical advances have enabled inexpensive, high-fidelity molecular analyses that characterize the genetic make-up of an individual.
  • Overestimating the dose can, just as seriously, lead to uncontrolled bleeding events because the drug interferes with clotting.
  • Because of these risks, in existing clinical practice patients starting on warfarin are given a fixed initial dose but must visit a clinic many times over the first few weeks or months of treatment to determine the dosage that gives the desired therapeutic effect.
Highlights
  • In recent years, technical advances have enabled inexpensive, high-fidelity molecular analyses that characterize the genetic make-up of an individual
  • We study the degree to which these models leak sensitive information about patient genotype, which would pose a danger to genomic privacy
  • We showed that models used in warfarin therapy introduce a threat to patients’ genomic privacy
  • When models are produced using state-of-the-art differential privacy mechanisms, genomic privacy is protected for small ε (≤ 1), but as ε increases towards larger values this protection vanishes
  • We evaluated the utility of differential privacy mechanisms by simulating clinical trials that use private models in warfarin therapy
  • The model inversion attack predicts genetic markers almost as well as regression models trained to predict these markers, suggesting that model inversion can be nearly as effective as learning in an “ideal” setting
  • We show that differential privacy substantially interferes with the main purpose of these models in personalized medicine: for ε values that protect genomic privacy, which is the central privacy concern in our application, the risk of negative patient outcomes increases beyond acceptable levels
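To make the role of ε in the highlights above concrete, here is a minimal sketch of the generic Laplace mechanism applied to a bounded mean. This is not one of the mechanisms evaluated in the paper (its reference list points to private histograms and the functional mechanism); it is swapped in only because it shows most directly how the noise scale grows as ε shrinks. The function names, the dose values, and the bounds are all hypothetical.

```python
import random

def noise_scale(n, lower, upper, epsilon):
    # Sensitivity of a mean over n values clipped to [lower, upper] is
    # (upper - lower) / n; the Laplace scale is sensitivity / epsilon.
    return (upper - lower) / (n * epsilon)

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_mean(values, lower, upper, epsilon, rng):
    # Clip, average, then perturb the average with Laplace noise.
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    scale = noise_scale(len(clipped), lower, upper, epsilon)
    return true_mean + laplace_noise(scale, rng)

def mean_abs_error(values, lower, upper, epsilon, trials=2000, seed=0):
    # Average absolute deviation of the private mean from the true mean.
    rng = random.Random(seed)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    errors = [abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
              for _ in range(trials)]
    return sum(errors) / trials

# Hypothetical weekly doses in mg; illustrative values, not data from the paper.
doses = [21.0, 28.0, 35.0, 35.0, 42.0, 49.0, 56.0, 63.0]
error_small_eps = mean_abs_error(doses, 0.0, 100.0, epsilon=0.25)
error_large_eps = mean_abs_error(doses, 0.0, 100.0, epsilon=100)
```

Under these assumptions the noise scale at ε = 0.25 is 400 times the scale at ε = 100, which mirrors the trade-off the highlights describe: the ε values that give strong privacy are the ones that distort the released statistic the most.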
Results
  • The authors evaluated the inference algorithms on both mechanisms discussed above at a range of ε values: 0.25, 1, 5, 20, and 100.
  • For each algorithm and ε, the authors generated 100 private models on the training cohort, and attempted to infer VKORC1 and CYP2C9 for each individual in both the training and validation cohorts.
  • All computations were performed in R.
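The evaluation protocol described in the Results bullets (a grid of ε values, 100 private models per setting, one inference attempt per individual) can be sketched as a small harness. The authors worked in R; Python is used here only for illustration. Everything except the ε grid and the count of 100 models is an assumption: `train_private` and `infer` are hypothetical callables, the toy stand-ins below are not the paper's mechanisms or attack, plain accuracy on a single cohort replaces whatever metric the paper reports, and the cohort is made up.

```python
import random
from statistics import mean

EPSILONS = [0.25, 1, 5, 20, 100]  # privacy budgets listed in the text

def evaluate(train_private, infer, cohort, epsilons=EPSILONS, n_models=100, seed=0):
    # For each epsilon, train n_models private models and average the
    # fraction of individuals whose genotype is inferred correctly.
    rng = random.Random(seed)
    results = {}
    for eps in epsilons:
        accuracies = []
        for _ in range(n_models):
            model = train_private(cohort, eps, rng)
            hits = sum(infer(model, row) == row["genotype"] for row in cohort)
            accuracies.append(hits / len(cohort))
        results[eps] = mean(accuracies)
    return results

def toy_train_private(cohort, eps, rng):
    # Stand-in for a private training procedure: a dose threshold of 5.0
    # perturbed by Laplace noise with scale 1 / eps.
    noise = rng.expovariate(eps) - rng.expovariate(eps)
    return 5.0 + noise

def toy_infer(threshold, row):
    # Stand-in attack: guess genotype 1 when the dose is below the threshold.
    return 1 if row["dose"] < threshold else 0

# Hypothetical cohort in which genotype 1 corresponds to doses below 5.0.
toy_cohort = [{"dose": d, "genotype": 1 if d < 5.0 else 0}
              for d in (2.0, 3.0, 4.0, 4.5, 5.5, 6.0, 7.0, 8.0)]

results = evaluate(toy_train_private, toy_infer, toy_cohort)
```

Passing the training and inference steps in as callables keeps the loop independent of any particular mechanism, so the same harness could in principle be reused for each algorithm under comparison.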
Conclusion
  • The authors have argued that Aπ is optimal in one particular sense, i.e., it minimizes the expected misclassification rate on the maximum-entropy prior given the available information.
  • It is not hard to specify joint priors p for which the marginals p1,...,d,y convey little useful information, so the expected misclassification rate minimized here diverges substantially from the true rate.
  • In these cases, Aπ may perform poorly, and more background information is needed to accurately predict model inputs.
  • The authors show that differential privacy substantially interferes with the main purpose of these models in personalized medicine: for ε values that protect genomic privacy, which is the central privacy concern in the application, the risk of negative patient outcomes increases beyond acceptable levels.
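The idea behind an inversion algorithm that scores candidates under a maximum-entropy prior can be illustrated with a deliberately simplified sketch. It assumes that the maximum-entropy joint distribution consistent with known marginals is their product (independent attributes), that the attacker observes the model output exactly, and that the attribute domains are small enough to enumerate. It is not the authors' Aπ, which also accounts for model error; the toy dosing model, its coefficients, and the marginal probabilities are hypothetical.

```python
from itertools import product

def invert(model, domains, marginals, known, target, observed_output, tol=1e-9):
    # Enumerate every assignment of the unknown attributes, keep those that
    # reproduce the observed output, weight each by the product of its
    # marginal probabilities, and return the best-scoring target value.
    unknown = [name for name in domains if name not in known]
    scores = {value: 0.0 for value in domains[target]}
    for combo in product(*(domains[name] for name in unknown)):
        candidate = dict(known)
        candidate.update(zip(unknown, combo))
        if abs(model(candidate) - observed_output) > tol:
            continue
        weight = 1.0
        for name, value in zip(unknown, combo):
            weight *= marginals[name][value]
        scores[candidate[target]] += weight
    return max(scores, key=scores.get), scores

def toy_dose_model(x):
    # Hypothetical linear dosing rule; coefficients are illustrative only.
    return 20 - 2 * x["vkorc1"] - x["cyp2c9"] - x["age_decade"]

# Variant-allele counts for two genes, with made-up marginal frequencies.
domains = {"vkorc1": [0, 1, 2], "cyp2c9": [0, 1, 2]}
marginals = {
    "vkorc1": {0: 0.4, 1: 0.4, 2: 0.2},
    "cyp2c9": {0: 0.7, 1: 0.25, 2: 0.05},
}

# The attacker knows the patient's age decade and the model's output.
guess, scores = invert(toy_dose_model, domains, marginals,
                       known={"age_decade": 6}, target="vkorc1",
                       observed_output=12)
```

In this toy setting two genotype combinations reproduce the observed output, and the marginals break the tie in favour of `vkorc1 = 1`. When the marginals carry little information about the true joint distribution, the same scoring rule can mislead, which is the limitation the conclusion points out.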
Related work
  • The tension between privacy and data utility has been explored by several authors. Brickell and Shmatikov [6] found strong evidence for a tradeoff in attribute privacy and predictive performance in common data mining tasks when k-anonymity, ℓ-diversity, and t-closeness are applied before releasing a full dataset. Differential privacy arose partially as a response to Dalenius’ desideratum: anything that can be learned from the database about a specific individual should be learnable without access to the database [9]. Dwork showed the impossibility of achieving this result in the presence of utility requirements [11], and proposed an alternative goal that proved feasible to achieve in many settings: the risk to one’s privacy should not substantially increase as a result of participating in a statistical database. Differential privacy formalizes this goal, and constructive research on the topic has subsequently flourished.

    Differential privacy is often misunderstood by those who wish to apply it, as pointed out by Dwork and others [13]. Kifer and Machanavajjhala [25] addressed several common misconceptions about the topic, and showed that under certain conditions, it fails to achieve a privacy goal related to Dwork’s: nearly all evidence of an individual’s participation should be removed. Using hypothetical examples from social networking and census data release, they demonstrate that when rows in a database are correlated, or when previous exact statistics for a dataset have been released, this notion of privacy may be violated even when differential privacy is used. Part of our work extends theirs by giving a concrete example from a realistic application where common misconceptions about differential privacy lead to surprising privacy breaches, i.e., that it will protect genomic attributes from unwanted disclosure. We further extend their analysis by providing a quantitative study of the tradeoff between privacy and utility in the application.
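For reference, the goal that differential privacy formalizes is usually stated as the following inequality. The symbols are chosen here for illustration and are not taken from this page: K is a randomized mechanism, D and D' are any two databases that differ in one individual's row, and S is any set of possible outputs.

```latex
\Pr[\,K(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,K(D') \in S\,]
\qquad \text{for all neighboring } D, D' \text{ and all } S \subseteq \mathrm{Range}(K)
```

A smaller ε forces the two output distributions closer together, so one person's participation changes little about what an observer can conclude. The bound constrains the effect of participation only; it does not by itself prevent inferences that are possible from population-level correlations, which is the gap the misconceptions discussed above tend to overlook.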
Funding
  • Open access to the Proceedings of the 23rd USENIX Security Symposium is sponsored by USENIX
References
  • Clarification of optimal anticoagulation through genetics. http://coagstudy.org.
  • The pharmacogenomics knowledge base. http://www.pharmgkb.org.
  • J. L. Anderson, B. D. Horne, S. M. Stevens, A. S. Grove, S. Barton, Z. P. Nicholas, S. F. Kahn, H. T. May, K. M. Samuelson, J. B. Muhlestein, J. F. Carlquist, and for the Couma-Gen Investigators. Randomized trial of genotype-guided versus standard warfarin dosing in patients initiating oral anticoagulation. Circulation, 116(22):2563–2570, 2007.
  • P. L. Bonate. Clinical trial simulation in drug development. Pharmaceutical Research, 17(3):252–256, 2000.
  • L. D. Brace. Current status of the international normalized ratio. Lab Medicine, 32(7):390–392, 2001.
  • J. Brickell and V. Shmatikov. The cost of privacy: destruction of data-mining utility in anonymized data publishing. In KDD, 2008.
  • J. Carlquist, B. Horne, J. Muhlestein, D. Lapp, B. Whiting, M. Kolek, J. Clarke, B. James, and J. Anderson. Genotypes of the Cytochrome P450 Isoform, CYP2C9, and the Vitamin K Epoxide Reductase Complex Subunit 1 conjointly determine stable warfarin dose: a prospective study. Journal of Thrombosis and Thrombolysis, 22(3), 2006.
  • G. Cormode. Personal privacy vs population privacy: learning to attack anonymization. In KDD, 2011.
  • T. Dalenius. Towards a methodology for statistical disclosure control. Statistik Tidskrift, 15:429–444, 1977.
  • F. K. Dankar and K. El Emam. The application of differential privacy to health data. In ICDT, 2012.
  • C. Dwork. Differential privacy. In ICALP. Springer, 2006.
  • C. Dwork. The promise of differential privacy: A tutorial on algorithmic techniques. In FOCS, 2011.
  • C. Dwork, F. McSherry, K. Nissim, and A. Smith. Differential privacy: A primer for the perplexed. In Joint UNECE/Eurostat work session on statistical data confidentiality, 2011.
  • V. A. Fusaro, P. Patil, C.-L. Chi, C. F. Contant, and P. J. Tonellato. A systems approach to designing effective clinical trials using simulations. Circulation, 127(4):517–526, 2013.
  • S. R. Ganta, S. P. Kasiviswanathan, and A. Smith. Composition attacks and auxiliary information in data privacy. In KDD, 2008.
  • A. K. Hamberg, M. L. Dahl, M. Barban, M. G. Scordo, M. Wadelius, V. Pengo, R. Padrini, and E. Jonsson. A PK-PD model for predicting the impact of age, CYP2C9, and VKORC1 genotype on individualization of warfarin therapy. Clinical Pharmacology & Therapeutics, 81(4):529–538, 2007.
  • D. Hand and R. Till. A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning, 45(2):171–186, 2001.
  • N. Holford, S. C. Ma, and B. A. Ploeger. Clinical trial simulation: A review. Clinical Pharmacology & Therapeutics, 88(2):166–182.
  • N. H. G. Holford, H. C. Kimko, J. P. R. Monteleone, and C. C. Peck. Simulation of clinical trials. Annual Review of Pharmacology and Toxicology, 40(1):209–234, 2000.
  • N. Homer, S. Szelinger, M. Redman, D. Duggan, W. Tembe, J. Muehling, J. V. Pearson, D. A. Stephan, S. F. Nelson, and D. W. Craig. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics, 4(8), 08 2008.
  • International Warfarin Pharmacogenetic Consortium. Estimation of the warfarin dose with clinical and pharmacogenetic data. New England Journal of Medicine, 360(8):753–764, 2009.
  • E. Jaynes. On the rationale of maximum-entropy methods. Proceedings of the IEEE, 70(9), Sept 1982.
  • F. Kamali and H. Wynne. Pharmacogenetics of warfarin. Annual Review of Medicine, 61(1):63–75, 2010.
  • S. P. Kasiviswanathan, M. Rudelson, and A. Smith. The power of linear reconstruction attacks. In SODA, 2013.
  • D. Kifer and A. Machanavajjhala. No free lunch in data privacy. In SIGMOD, 2011.
  • M. J. Kim, S. M. Huang, U. A. Meyer, A. Rahman, and L. J. Lesko. A regulatory science perspective on warfarin therapy: a pharmacogenetic opportunity. J Clin Pharmacol, 49:138–146, Feb 2009.
  • S. E. Kimmel, B. French, S. E. Kasner, J. A. Johnson, J. L. Anderson, B. F. Gage, Y. D. Rosenberg, C. S. Eby, R. A. Madigan, R. B. McBane, S. Z. Abdel-Rahman, S. M. Stevens, S. Yale, E. R. Mohler, M. C. Fang, V. Shah, R. B. Horenstein, N. A. Limdi, J. A. Muldowney, J. Gujral, P. Delafontaine, R. J. Desnick, T. L. Ortel, H. H. Billett, R. C. Pendleton, N. L. Geller, J. L. Halperin, S. Z. Goldhaber, M. D. Caldwell, R. M. Califf, and J. H. Ellenberg. A pharmacogenetic versus a clinical algorithm for warfarin dosing. New England Journal of Medicine, 369(24):2283–2293, 2013. PMID: 24251361.
  • T. Komarova, D. Nekipelov, and E. Yakovlev. Estimation of Treatment Effects from Combined Data: Identification versus Data Security. NBER volume Economics of Digitization: An Agenda, To appear.
  • M. J. Kovacs, M. Rodger, D. R. Anderson, B. Morrow, G. Kells, J. Kovacs, E. Boyle, and P. S. Wells. Comparison of 10-mg and 5-mg warfarin initiation nomograms together with low-molecular-weight heparin for outpatient treatment of acute venous thromboembolism. Annals of Internal Medicine, 138(9):714–719, 2003.
  • J. Lee and C. Clifton. How much is enough? Choosing ε for differential privacy. In ISC, 2011.
  • J. Lee and C. Clifton. Differential identifiability. In KDD, 2012.
  • J. Lei. Differentially private m-estimators. In NIPS, 2011.
  • Y. Lindell and E. Omri. A practical application of differential privacy to personalized online advertising. IACR Cryptology ePrint Archive, 2011.
  • G. Loukides, J. C. Denny, and B. Malin. The disclosure of diagnosis codes can breach research participants’ privacy. Journal of the American Medical Informatics Association, 17(3):322–327, 2010.
  • G. Loukides, A. Gkoulalas-Divanis, and B. Malin. Anonymization of electronic medical records for validating genome-wide association studies. Proceedings of the National Academy of Sciences, 107(17):7898–7903, Apr. 2010.
  • A. Narayanan and V. Shmatikov. Robust deanonymization of large sparse datasets. In Oakland, 2008.
  • A. Narayanan and V. Shmatikov. Myths and fallacies of Personally Identifiable Information. Commun. ACM, 53(6), June 2010.
  • J. Reed, A. J. Aviv, D. Wagner, A. Haeberlen, B. C. Pierce, and J. M. Smith. Differential privacy for collaborative security. In Proceedings of the Third European Workshop on System Security, EUROSEC, 2010.
  • S. Sankararaman, G. Obozinski, M. I. Jordan, and E. Halperin. Genomic privacy and limits of individual detection in a pool. Nature Genetics, 41(9):965–967, 2009.
  • E. A. Sconce, T. I. Khan, H. A. Wynne, P. Avery, L. Monkhouse, B. P. King, P. Wood, P. Kesteven, A. K. Daly, and F. Kamali. The impact of CYP2C9 and VKORC1 genetic polymorphism and patient characteristics upon warfarin dose requirements: proposal for a new dosing regimen. Blood, 106(7):2329–2333, 2005.
  • S. V. Sorensen, S. Dewilde, D. E. Singer, S. Z. Goldhaber, B. U. Monz, and J. M. Plumb. Cost-effectiveness of warfarin: Trial versus real-world stroke prevention in atrial fibrillation. American Heart Journal, 157(6):1064–1073, 2009.
  • L. Sweeney. Simple demographics often identify people uniquely. 2000.
  • F. Takeuchi, R. McGinnis, S. Bourgeois, C. Barnes, N. Eriksson, N. Soranzo, P. Whittaker, V. Ranganath, V. Kumanduri, W. McLaren, L. Holm, J. Lindh, A. Rane, M. Wadelius, and P. Deloukas. A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose. PLoS Genet, 5(3), 03 2009.
  • S. Vinterbo. Differentially private projected histograms: Construction and use for prediction. In ECML-PKDD, 2012.
  • D. Vu and A. Slavkovic. Differential privacy for clinical trial data: Preliminary evaluations. In ICDM Workshops, 2009.
  • R. Wang, Y. F. Li, X. Wang, H. Tang, and X. Zhou. Learning your identity and disease from research papers: information leaks in genome wide association studies. In CCS, 2009.
  • J. Zhang, Z. Zhang, X. Xiao, Y. Yang, and M. Winslett. Functional mechanism: regression analysis under differential privacy. In VLDB, 2012.