Privacy in pharmacogenetics: an end-to-end case study of personalized warfarin dosing
USENIX Security Symposium, 2014, pp. 17–32
We initiate the study of privacy in pharmacogenetics, wherein machine learning models are used to guide medical treatments based on a patient's genotype and background. Performing an in-depth case study on privacy in personalized warfarin dosing, we show that suggested models carry privacy risks, in particular because attackers can perform…
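The model inversion attack described in this abstract can be sketched roughly as follows: given a released dosing model, a patient's known non-genomic attributes and observed stable dose, and publicly available genotype frequencies, the attacker scores each candidate genotype by its prior probability times the likelihood of the observed dose, and returns the best-scoring candidate. This is a minimal illustration under a Gaussian-error assumption; the model, coefficients, and interface below are hypothetical, not the paper's code.

```python
import math

def invert_genotype(model, known_attrs, observed_dose, candidates, priors, sigma=1.0):
    """Guess a hidden genotype: weigh each candidate's marginal prior by how
    well the model's predicted dose matches the observed dose, assuming
    Gaussian prediction error with standard deviation sigma."""
    def score(g):
        pred = model({**known_attrs, "vkorc1": g})
        likelihood = math.exp(-((observed_dose - pred) ** 2) / (2 * sigma ** 2))
        return priors[g] * likelihood
    return max(candidates, key=score)

# Hypothetical linear dosing model: intercept + genotype effect - age term.
GENOTYPE_EFFECT = {"GG": 3.5, "AG": 2.0, "AA": 0.5}
def toy_model(x):
    return 5.0 + GENOTYPE_EFFECT[x["vkorc1"]] - 0.05 * x["age"]

guess = invert_genotype(
    toy_model,
    known_attrs={"age": 50},
    observed_dose=toy_model({"age": 50, "vkorc1": "AG"}),  # true genotype: AG
    candidates=["GG", "AG", "AA"],
    priors={"GG": 0.35, "AG": 0.45, "AA": 0.20},  # illustrative marginals
)
print(guess)  # prints "AG": the true genotype best explains the observed dose
```

With an exactly matching dose the true genotype wins; in the paper's setting the attacker works from noisier information, and the summary below notes the attack still approaches the accuracy of a regression trained to predict the marker directly.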
- Technical advances have enabled inexpensive, high-fidelity molecular analyses that characterize the genetic make-up of an individual.
- Overestimating the dose can, just as seriously, lead to uncontrolled bleeding events because the drug interferes with clotting
- Because of these risks, in existing clinical practice, patients starting on warfarin are given a fixed initial dose but must visit a clinic many times over the first few weeks or months of treatment to determine the dosage that gives the desired therapeutic effect
- We study the degree to which these models leak sensitive information about patient genotype, which would pose a danger to genomic privacy
- We showed that models used in warfarin therapy introduce a threat to patients’ genomic privacy
- When models are produced using state-of-the-art differential privacy mechanisms, genomic privacy is protected for small ε (≤ 1), but as ε increases towards larger values this protection vanishes
- We evaluated the utility of differential privacy mechanisms by simulating clinical trials that use private models in warfarin therapy
- The model inversion attack does almost as well as regression models trained to predict these markers, suggesting that model inversion can be nearly as effective as learning in an “ideal” setting
- We show that differential privacy substantially interferes with the main purpose of these models in personalized medicine: for ε values that protect genomic privacy, which is the central privacy concern in our application, the risk of negative patient outcomes increases beyond acceptable levels
- The authors evaluated the inference algorithms on both mechanisms discussed above at a range of ε values: 0.25, 1, 5, 20, and 100.
- For each algorithm and ε, the authors generated 100 private models on the training cohort, and attempted to infer VKORC1 and CYP2C9 for each individual in both the training and validation cohort.
- All computations were performed in R.
- The authors have argued that Aπ is optimal in one particular sense, i.e., it minimizes the expected misclassification rate on the maximum-entropy prior given the available information.
- It is not hard to specify joint priors p for which the marginals p1,...,d,y convey little useful information, so the expected misclassification rate minimized here diverges substantially from the true rate.
- In these cases, Aπ may perform poorly, and more background information is needed to accurately predict model inputs.
- The tension between privacy and data utility has been explored by several authors. Brickell and Shmatikov found strong evidence for a tradeoff between attribute privacy and predictive performance in common data mining tasks when k-anonymity, ℓ-diversity, and t-closeness are applied before releasing a full dataset. Differential privacy arose partially as a response to Dalenius’ desideratum: anything that can be learned from the database about a specific individual should be learnable without access to the database. Dwork showed the impossibility of achieving this result in the presence of utility requirements, and proposed an alternative goal that proved feasible to achieve in many settings: the risk to one’s privacy should not substantially increase as a result of participating in a statistical database. Differential privacy formalizes this goal, and constructive research on the topic has subsequently flourished.
Differential privacy is often misunderstood by those who wish to apply it, as pointed out by Dwork and others. Kifer and Machanavajjhala addressed several common misconceptions about the topic, and showed that under certain conditions, it fails to achieve a privacy goal related to Dwork’s: nearly all evidence of an individual’s participation should be removed. Using hypothetical examples from social networking and census data release, they demonstrate that when rows in a database are correlated, or when previous exact statistics for a dataset have been released, this notion of privacy may be violated even when differential privacy is used. Part of our work extends theirs by giving a concrete example from a realistic application where a common misconception about differential privacy, namely that it will protect genomic attributes from unwanted disclosure, leads to surprising privacy breaches. We further extend their analysis by providing a quantitative study of the tradeoff between privacy and utility in this application.
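The evaluation summarized above sweeps ε across 0.25, 1, 5, 20, and 100, generating 100 private models per setting. The paper's private training uses specialized mechanisms (the functional mechanism and private projected histograms); as a generic illustration of how ε trades privacy against accuracy, the sketch below applies the basic Laplace mechanism to a toy statistic, a private mean of simulated doses. All numbers and names here are illustrative, not the paper's code.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) via the inverse-CDF transform."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_mean(values, lo, hi, eps, rng):
    """eps-differentially private mean of values clipped to [lo, hi].
    Changing one record shifts the clipped mean by at most (hi - lo) / n,
    so Laplace noise with scale (hi - lo) / (n * eps) suffices."""
    clipped = [min(max(v, lo), hi) for v in values]
    scale = (hi - lo) / (len(clipped) * eps)
    return sum(clipped) / len(clipped) + laplace_noise(scale, rng)

rng = random.Random(0)
doses = [rng.uniform(2.0, 10.0) for _ in range(1000)]  # toy weekly doses (mg)
true_mean = sum(doses) / len(doses)

results = {}
for eps in [0.25, 1, 5, 20, 100]:  # the eps range used in the evaluation
    errs = [abs(private_mean(doses, 0.0, 15.0, eps, rng) - true_mean)
            for _ in range(100)]  # 100 private releases per eps
    results[eps] = sum(errs) / len(errs)
    print(f"eps={eps:>6}: mean abs error {results[eps]:.5f}")
# small eps (strong privacy) yields large error; error shrinks as eps grows
```

The same tension drives the paper's headline result: ε small enough to protect genotype also distorts the dosing model enough to raise the simulated risk of adverse outcomes.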
- Clarification of optimal anticoagulation through genetics. http://coagstudy.org.
- The pharmacogenomics knowledge base. http://www.pharmgkb.org.
- J. L. Anderson, B. D. Horne, S. M. Stevens, A. S. Grove, S. Barton, Z. P. Nicholas, S. F. Kahn, H. T. May, K. M. Samuelson, J. B. Muhlestein, J. F. Carlquist, and for the Couma-Gen Investigators. Randomized trial of genotype-guided versus standard warfarin dosing in patients initiating oral anticoagulation. Circulation, 116(22):2563–2570, 2007.
- P. L. Bonate. Clinical trial simulation in drug development. Pharmaceutical Research, 17(3):252–256, 2000.
- L. D. Brace. Current status of the international normalized ratio. Lab Medicine, 32(7):390–392, 2001.
- J. Brickell and V. Shmatikov. The cost of privacy: destruction of data-mining utility in anonymized data publishing. In KDD, 2008.
- J. Carlquist, B. Horne, J. Muhlestein, D. Lapp, B. Whiting, M. Kolek, J. Clarke, B. James, and J. Anderson. Genotypes of the Cytochrome P450 Isoform, CYP2C9, and the Vitamin K Epoxide Reductase Complex Subunit 1 conjointly determine stable warfarin dose: a prospective study. Journal of Thrombosis and Thrombolysis, 22(3), 2006.
- G. Cormode. Personal privacy vs population privacy: learning to attack anonymization. In KDD, 2011.
- T. Dalenius. Towards a methodology for statistical disclosure control. Statistik Tidskrift, 15:429–444, 1977.
- F. K. Dankar and K. El Emam. The application of differential privacy to health data. In ICDT, 2012.
- C. Dwork. Differential privacy. In ICALP. Springer, 2006.
- C. Dwork. The promise of differential privacy: A tutorial on algorithmic techniques. In FOCS, 2011.
- C. Dwork, F. McSherry, K. Nissim, and A. Smith. Differential privacy: A primer for the perplexed. In Joint UNECE/Eurostat work session on statistical data confidentiality, 2011.
- V. A. Fusaro, P. Patil, C.-L. Chi, C. F. Contant, and P. J. Tonellato. A systems approach to designing effective clinical trials using simulations. Circulation, 127(4):517–526, 2013.
- S. R. Ganta, S. P. Kasiviswanathan, and A. Smith. Composition attacks and auxiliary information in data privacy. In KDD, 2008.
- A. K. Hamberg, M. L. Dahl, M. Barban, M. G. Scordo, M. Wadelius, V. Pengo, R. Padrini, and E. Jonsson. A PK-PD model for predicting the impact of age, CYP2C9, and VKORC1 genotype on individualization of warfarin therapy. Clinical Pharmacology & Therapeutics, 81(4):529–538, 2007.
- D. Hand and R. Till. A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning, 45(2):171–186, 2001.
- N. Holford, S. C. Ma, and B. A. Ploeger. Clinical trial simulation: a review. Clinical Pharmacology & Therapeutics, 88(2):166–182, 2010.
- N. H. G. Holford, H. C. Kimko, J. P. R. Monteleone, and C. C. Peck. Simulation of clinical trials. Annual Review of Pharmacology and Toxicology, 40(1):209–234, 2000.
- N. Homer, S. Szelinger, M. Redman, D. Duggan, W. Tembe, J. Muehling, J. V. Pearson, D. A. Stephan, S. F. Nelson, and D. W. Craig. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics, 4(8), 2008.
- International Warfarin Pharmacogenetic Consortium. Estimation of the warfarin dose with clinical and pharmacogenetic data. New England Journal of Medicine, 360(8):753–764, 2009.
- E. Jaynes. On the rationale of maximum-entropy methods. Proceedings of the IEEE, 70(9), Sept 1982.
- F. Kamali and H. Wynne. Pharmacogenetics of warfarin. Annual Review of Medicine, 61(1):63–75, 2010.
- S. P. Kasiviswanathan, M. Rudelson, and A. Smith. The power of linear reconstruction attacks. In SODA, 2013.
- D. Kifer and A. Machanavajjhala. No free lunch in data privacy. In SIGMOD, 2011.
- M. J. Kim, S. M. Huang, U. A. Meyer, A. Rahman, and L. J. Lesko. A regulatory science perspective on warfarin therapy: a pharmacogenetic opportunity. J Clin Pharmacol, 49:138–146, Feb 2009.
- S. E. Kimmel, B. French, S. E. Kasner, J. A. Johnson, J. L. Anderson, B. F. Gage, Y. D. Rosenberg, C. S. Eby, R. A. Madigan, R. B. McBane, S. Z. Abdel-Rahman, S. M. Stevens, S. Yale, E. R. Mohler, M. C. Fang, V. Shah, R. B. Horenstein, N. A. Limdi, J. A. Muldowney, J. Gujral, P. Delafontaine, R. J. Desnick, T. L. Ortel, H. H. Billett, R. C. Pendleton, N. L. Geller, J. L. Halperin, S. Z. Goldhaber, M. D. Caldwell, R. M. Califf, and J. H. Ellenberg. A pharmacogenetic versus a clinical algorithm for warfarin dosing. New England Journal of Medicine, 369(24):2283–2293, 2013. PMID: 24251361.
- T. Komarova, D. Nekipelov, and E. Yakovlev. Estimation of treatment effects from combined data: identification versus data security. In Economics of Digitization: An Agenda, NBER, to appear.
- M. J. Kovacs, M. Rodger, D. R. Anderson, B. Morrow, G. Kells, J. Kovacs, E. Boyle, and P. S. Wells. Comparison of 10-mg and 5-mg warfarin initiation nomograms together with low-molecular-weight heparin for outpatient treatment of acute venous thromboembolism. Annals of Internal Medicine, 138(9):714–719, 2003.
- J. Lee and C. Clifton. How much is enough? Choosing ε for differential privacy. In ISC, 2011.
- J. Lee and C. Clifton. Differential identifiability. In KDD, 2012.
- J. Lei. Differentially private m-estimators. In NIPS, 2011.
- Y. Lindell and E. Omri. A practical application of differential privacy to personalized online advertising. IACR Cryptology ePrint Archive, 2011.
- G. Loukides, J. C. Denny, and B. Malin. The disclosure of diagnosis codes can breach research participants’ privacy. Journal of the American Medical Informatics Association, 17(3):322–327, 2010.
- G. Loukides, A. Gkoulalas-Divanis, and B. Malin. Anonymization of electronic medical records for validating genome-wide association studies. Proceedings of the National Academy of Sciences, 107(17):7898–7903, Apr. 2010.
- A. Narayanan and V. Shmatikov. Robust deanonymization of large sparse datasets. In Oakland, 2008.
- A. Narayanan and V. Shmatikov. Myths and fallacies of Personally Identifiable Information. Commun. ACM, 53(6), June 2010.
- J. Reed, A. J. Aviv, D. Wagner, A. Haeberlen, B. C. Pierce, and J. M. Smith. Differential privacy for collaborative security. In Proceedings of the Third European Workshop on System Security, EUROSEC, 2010.
- S. Sankararaman, G. Obozinski, M. I. Jordan, and E. Halperin. Genomic privacy and limits of individual detection in a pool. Nature Genetics, 41(9):965–967, 2009.
- E. A. Sconce, T. I. Khan, H. A. Wynne, P. Avery, L. Monkhouse, B. P. King, P. Wood, P. Kesteven, A. K. Daly, and F. Kamali. The impact of CYP2C9 and VKORC1 genetic polymorphism and patient characteristics upon warfarin dose requirements: proposal for a new dosing regimen. Blood, 106(7):2329–2333, 2005.
- S. V. Sorensen, S. Dewilde, D. E. Singer, S. Z. Goldhaber, B. U. Monz, and J. M. Plumb. Cost-effectiveness of warfarin: trial versus real-world stroke prevention in atrial fibrillation. American Heart Journal, 157(6):1064–1073, 2009.
- L. Sweeney. Simple demographics often identify people uniquely. Data Privacy Working Paper 3, Carnegie Mellon University, 2000.
- F. Takeuchi, R. McGinnis, S. Bourgeois, C. Barnes, N. Eriksson, N. Soranzo, P. Whittaker, V. Ranganath, V. Kumanduri, W. McLaren, L. Holm, J. Lindh, A. Rane, M. Wadelius, and P. Deloukas. A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose. PLoS Genetics, 5(3), 2009.
- S. Vinterbo. Differentially private projected histograms: Construction and use for prediction. In ECML-PKDD, 2012.
- D. Vu and A. Slavkovic. Differential privacy for clinical trial data: Preliminary evaluations. In ICDM Workshops, 2009.
- R. Wang, Y. F. Li, X. Wang, H. Tang, and X. Zhou. Learning your identity and disease from research papers: information leaks in genome wide association studies. In CCS, 2009.
- J. Zhang, Z. Zhang, X. Xiao, Y. Yang, and M. Winslett. Functional mechanism: regression analysis under differential privacy. In VLDB, 2012.