50 Assessment of Skin Graft in Pediatric Burn Patients Using Machine Learning Is Comparable to Human Expert Performance

Journal of Burn Care & Research(2020)

Abstract

Introduction
Though widely used, current scar assessment scales, such as the Vancouver Scar Scale and the Visual Analog Scale, are inaccurate and highly subjective, further complicating the already difficult task of determining the optimal management of burn patients. Additional disadvantages of these tools include the need for direct examination by an experienced clinician and the inability to review assessments retrospectively. The lack of an accurate assessment tool inevitably impairs any research examining novel therapeutic strategies designed to improve burn scar outcomes, because it introduces observer bias at every step. New imaging and processing technologies have the potential to bring accuracy, reproducibility, and accessibility to burn scar assessments. With these goals in mind, our team developed a novel scoring system and a classification model based on Machine Learning algorithms and analyzed 87 pictures to obtain scores on Inflammation (I), Scar (S), Uniformity (U), and Pigmentation (P).

Methods
All algorithms were trained using both the sub-acute and the long-term phase pictures. The classification model is based on supervised learning, which requires many examples of annotated pictures and corresponding scar scores. The model used a Linear Discriminant Analysis (LDA) algorithm and visual features of the scars and the natural skin. To train and evaluate this model, four burn care providers individually annotated 186 pictures of skin grafts and later formed a committee to annotate by consensus a subset of representative pictures. The individual annotations served as the human accuracy baseline, while the consensus annotation was treated as the true score and used to train the model.

Results
The model predictions were more accurate for the scores based mainly on color (I and P) than for those based on texture (S and U), as shown by the micro-averaged Area Under the Curve (AUC) of 0.86, 0.61, 0.51, and 0.80 for I, S, U, and P, respectively (Figure 1). The model accuracy was higher than the human baseline for the I score (F1 of 0.60 vs. 0.59±0.13, respectively) and the P score (0.54 vs. 0.51±0.09), but lower for the S score (0.30 vs. 0.63±0.22) and the U score (0.62 vs. 0.86±0.19).

Conclusions
Our findings are encouraging and suggest that the accuracy of the algorithm could be further improved in the second phase of our assessment development project by increasing the number of pictures it learns from and adding more visual features related to skin texture.

Applicability of Research to Practice
Our study provides an accurate and reproducible evaluation of burn scars, which can lead to newer therapeutic strategies employed by specialized burn care facilities.
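
The Results report a micro-averaged AUC for each score. As a rough illustration of what that metric measures, the sketch below computes it in plain Python under two assumptions that the abstract does not state: that each score takes one of a small number of discrete levels, and that the classifier outputs one confidence value per level for every picture. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def micro_averaged_auc(
    true_levels: Sequence[int],
    level_scores: Sequence[Sequence[float]],
    n_levels: int,
) -> float:
    """Micro-average: pool every (picture, level) decision, then compute one AUC."""
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for level, row in zip(true_levels, level_scores):
        for k in range(n_levels):
            # One-vs-rest indicator: 1 if level k is the consensus level.
            flat_labels.append(1 if level == k else 0)
            flat_scores.append(row[k])
    return binary_auc(flat_labels, flat_scores)


# Hypothetical toy data: 4 pictures, 3 possible levels for one score.
toy_true_levels = [0, 1, 2, 1]
toy_level_scores = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.4, 0.4, 0.2],
]
toy_auc = micro_averaged_auc(toy_true_levels, toy_level_scores, n_levels=3)
```

Because micro-averaging pools all picture-level decisions before ranking them, frequent levels weigh more heavily than rare ones. A value near 0.5, such as the 0.51 reported for U, means the ranking is close to chance.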