Allocating Unique Property Reference Numbers to Patient Addresses Using A Deterministic Address-Matching Algorithm: Evaluation of Accuracy, Match Rate and Bias

International Journal for Population Data Science (2020)

Abstract
Introduction: Representing patient-registered addresses as pseudonymised Unique Property Reference Numbers (UPRNs) enables linkage of environmental and household information to electronic health records (EHRs). However, the accuracy and potential biases of address-matching algorithms applied to patient addresses are unknown.

Objectives and Approach: We investigated accuracy, match rate, and biases in assigning UPRNs to general practitioner (GP)-registered patient addresses for a geographically defined UK population, using a bespoke deterministic address-matching algorithm developed for the Discovery Data Service. The algorithm comprises 213 rules applied in rank order to minimise false positives. We ran it to match 906,220 GP-registered addresses of adult patients (48% female, 47% non-White, 89% aged 20-64 years), sampled in mid-2018 from 159 GP practices in four London boroughs, to Ordnance Survey's AddressBase Premium database. We evaluated error rates using a gold-standard dataset. We used binary logistic regression to estimate the likelihood (odds ratio [OR]; 95% confidence interval [CI]) of no UPRN match according to, and adjusted for, patient age, sex, ethnic background, deprivation, residential mobility and multiple GP registrations.

Results: 96% of patient addresses were successfully assigned a UPRN. Algorithm sensitivity, specificity, positive predictive value, negative predictive value and F-measure were, respectively, 0.993, 0.019, 0.914, 0.204 and 0.9516.
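The relationship between the reported F-measure and the other accuracy metrics can be illustrated with a short calculation. The sketch below assumes the F-measure is the standard F1 score, that is, the harmonic mean of positive predictive value (precision) and sensitivity (recall); the abstract does not state which variant was used, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0  # both inputs are zero, so the score is defined as zero
    return (1 + b2) * precision * recall / denominator


# Rounded values taken from the abstract.
ppv = 0.914          # positive predictive value (precision)
sensitivity = 0.993  # sensitivity (recall)

f1 = f_measure(ppv, sensitivity)
print(round(f1, 3))
```

Under this assumption the result is roughly 0.952, close to the reported 0.9516; the small difference is presumably due to rounding of the published inputs.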
After mutual adjustment, UPRN assignment was less likely for: men (OR 0.87; 95% CI 0.83, 0.91); adolescents and the elderly (15-19 years: 0.57; 0.43, 0.77; ≥90 years: 0.39; 0.18, 0.84); those from Chinese ethnic backgrounds (0.87; 0.8, 0.91), living in the least deprived areas (0.25; 0.21, 0.31), or with two or more distinct UPRNs across multiple registrations (0.37; 0.28, 0.49). It was more likely for: those from Bangladeshi ethnic backgrounds (1.79; 1.61, 2.00), registered before 2018 (5.10; 4.42, 5.87), or with multiple GP registrations (2.36; 1.82, 3.05).

Conclusion / Implications: The open-source Discovery algorithm achieves a high and accurate match rate, and this evaluation identifies the demographic groups that may be under-represented among those successfully matched. This is the first time that bias in matching rates for an address-matching algorithm has been evaluated using patient-registered addresses.
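The odds ratios and confidence intervals in the Results follow the usual logistic regression convention: the odds ratio is the exponential of a model coefficient, and the interval is the exponential of the coefficient plus or minus a normal quantile times its standard error. The sketch below is illustrative only; the function names are hypothetical, the coefficient and standard error are back-calculated from the rounded published values rather than taken from the study, and a Wald-type interval is assumed.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Recover the standard error of the log-odds from a reported Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Rounded values reported for men in the abstract: OR 0.87 (95% CI 0.83, 0.91).
beta_men = math.log(0.87)
se_men = se_from_ci(0.83, 0.91)

or_men, lower_men, upper_men = odds_ratio_ci(beta_men, se_men)
print(round(or_men, 2), round(lower_men, 2), round(upper_men, 2))
```

The round trip reproduces the published interval to two decimal places, which suggests the reported interval is approximately symmetric on the log scale, as a Wald interval would be.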