Generalizability of an acute kidney injury prediction model across health systems

Nature Machine Intelligence (2022)

Delays in the identification of acute kidney injury in hospitalized patients are a major barrier to the development of effective interventions for treatment. A recent study described a series of models that outperformed previously published models in predicting acute kidney injury up to 48 h in advance, including a recurrent neural network that achieved state-of-the-art performance (area under the curve 0.92) and a gradient-boosted decision tree model that was close behind (area under the curve 0.89). Because these models were trained in a population of US veterans that was 94% male, questions have arisen about their generalizability to other health systems where the populations are more sex balanced. In this study, we aimed to evaluate how well an acute kidney injury model trained in a population of US veterans performs in females within Veterans Affairs, and the extent to which its performance generalizes to a large academic hospital setting. We found that the model performed worse in predicting acute kidney injury in females in both populations, with miscalibration in lower stages of acute kidney injury and worse discrimination (a lower area under the curve) in higher stages of acute kidney injury. We demonstrate that, while this discrepancy in performance can be largely corrected in non-veterans by updating the original model using data from a sex-balanced academic hospital cohort, the worse model performance persists in veterans. Our study sheds light on the importance of characterizing the generalizability of artificial intelligence studies, and on the complexity of discrepancies in model performance in subgroups, which cannot be explained simply on the basis of sample size.
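The two performance notions named in the abstract, discrimination (area under the curve) and calibration, can be checked per subgroup with very little code. The following is a minimal sketch only, not the authors' evaluation pipeline: the function names (`auc`, `calibration_gap`, `by_subgroup`) are hypothetical, the records are synthetic toy values rather than study data, and calibration is reduced to a single "mean predicted risk minus observed event rate" summary as a simplifying assumption.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_gap(labels, scores):
    """Mean predicted risk minus observed event rate.
    Positive values mean the model over-predicts risk on average."""
    return sum(scores) / len(scores) - sum(labels) / len(labels)


def by_subgroup(records):
    """records: iterable of (group, label, score).
    Returns {group: (auc, calibration_gap)}."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: (auc(ys, ss), calibration_gap(ys, ss))
            for g, (ys, ss) in grouped.items()}


# Synthetic toy records (hypothetical, NOT data from the study):
# (sex, observed outcome, predicted risk)
toy = [
    ("F", 1, 0.9), ("F", 0, 0.8), ("F", 1, 0.4), ("F", 0, 0.2),
    ("M", 1, 0.9), ("M", 1, 0.7), ("M", 0, 0.3), ("M", 0, 0.1),
]
results = by_subgroup(toy)
for group, (group_auc, gap) in sorted(results.items()):
    print(f"{group}: AUC={group_auc:.2f}, calibration gap={gap:+.3f}")
```

Reporting both numbers per subgroup matters because they can diverge: a model may rank patients well within a subgroup while still systematically over- or under-estimating their risk, which is the kind of split the abstract describes across stages of acute kidney injury.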
Acute kidney injury; Mathematics and computing; Medical research; Engineering, general