MLcps: Machine Learning Cumulative Performance Score for classification problems

biorxiv(2022)

引用 0|浏览5
暂无评分
摘要
A performance metric is a tool to measure the correctness of a trained Machine Learning (ML) model. Numerous performance metrics have been developed for classification problems making it overwhelming to select the appropriate one since each of them represents a particular aspect of the model. Furthermore, selection of a performance metric becomes harder for problems with imbalanced and/or small datasets. Therefore, in clinical studies where datasets are frequently imbalanced and, in situations when the prevalence of a disease is low or the collection of patient samples is difficult, deciding on a suitable metric for performance evaluation of an ML model becomes quite challenging. The most common approach to address this problem is measuring multiple metrics and compare them to identify the best-performing ML model. However, comparison of multiple metrics is laborious and prone to user preference bias. Furthermore, evaluation metrics are also required by ML model optimization techniques such as hyperparameter tuning, where we train many models, each with different parameters, and compare their performances to identify the best-performing parameters. In such situations, it becomes almost impossible to assess different models by comparing multiple metrics. Here, we propose a new metric called Machine Learning Cumulative Performance Score (MLcps) as a Python package for classification problems. MLcps combines multiple pre-computed performance metrics into one metric that conserves the essence of all pre-computed metrics for a particular model. We tested MLcps on 4 different publicly available biological datasets and the results reveal that it provides a comprehensive picture of overall model robustness. MLcps is available at https://pypi.org/project/MLcps/ and cases of use are available at https://mybinder.org/v2/gh/FunctionalUrology/MLcps.git/main. ### Competing Interest Statement The authors have declared no competing interest.
更多
查看译文
关键词
classification,machine learning,performance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要