Predicting the risk of diabetes complications using machine learning and social administrative data in a country with ethnic inequities in health: Aotearoa New Zealand

medRxiv (Cold Spring Harbor Laboratory)(2023)

引用 0|浏览0
暂无评分
摘要
Background In the age of big data, linked social and administrative health data in combination with machine learning (ML) is being increasingly used to improve prediction in cardiovascular diseases (CVD). We aimed to apply ML methods on extensive national-level health and social administrative datasets to predict future diabetes complications by ethnicity. Methods Five ML models were used to predict CVD events among all people with known diabetes in the population of New Zealand, utilizing national-level administrative data at the individual level. Results The Xgboost ML model had the best predictive power for predicting CVD events three years into the future among the population with diabetes. The optimization procedure also found limited improvement in AUC by ethnicity. The results indicated no trade-off between model predictive performance and equity gap of prediction by ethnicity. The list of variables of importance was different among different models/ethnic groups, for examples: age, deprivation, having had a hospitalization event, and the number of years living with diabetes. Discussion and conclusions We provide further evidence that ML with administrative health data can be used for meaningful future prediction of health outcomes. As such it could be utilized to inform health planning and healthcare resource allocation for diabetes management and the prevention of CVD events. Our results may suggest limited scope for developing prediction models by ethnic group and that the major ways to reduce inequitable health outcomes is probably via improved delivery of prevention and management to those groups with diabetes at highest need. ### Competing Interest Statement The authors have declared no competing interest. ### Funding Statement This work was funded by the Royal Society Te Aparangi. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. ### Author Declarations I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Ethics committee of University of Otago gave ethical approval for this work I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes Access to the anonymised data used in this study was provided by Stats NZ under the security and confidentiality provisions of the Statistics Act 1975. Only people authorised by the Statistics Act 1975 are allowed to see data about a particular person, household, business, or organisation, and the results in this paper have been confidentialised to protect these groups from identification and to keep their data safe. * AUC : Area under the Receiver Operating Characteristics curve CVD : Cardiovascular diseases IDI : Integrated Data Infrastructure ML : Machine learning NZ : Aotearoa New Zealand RF : Random Forest SNZ : Stats New Zealand
更多
查看译文
关键词
diabetes complications,new zealand,aotearoa new zealand,social administrative data
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要