Relationship preserving feature selection for unlabelled clinical trials time-series.
BCB '10: ACM International Conference on Bioinformatics and Computational Biology, Niagara Falls, New York, August 2010
Abstract
Feature selection has been widely studied in supervised data mining applications, where the typical goal is to create clusters through the selection of a reduced attribute set that maximizes classification accuracy. Such a goal may not be appropriate for preserving the inter-attribute relationships of unlabelled time-series, as is the case with clinical trials data. In this paper, we select features based on the time-series relationships of attributes by measuring their inter-attribute movement. We present performance measures and methods for feature selection over unlabelled time-series with the aim of preserving inter-attribute relationships. The performance metrics estimate how well a given feature set represents the data by comparing nearest neighbors before and after feature selection. We provide techniques to combine and compare data from non-standard, variable-length time-series sources, and a mechanism to inject expert opinion into the feature selection process. The methodologies and comparative results are presented in the context of a real pharmaceutical database application.
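The nearest-neighbor comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's metric: it assumes fixed-length numeric rows, Euclidean distance, and a simple overlap score, and the names `knn_indices` and `neighbor_preservation` are hypothetical. Variable-length time-series and expert weighting are not handled here.

```python
import math


def knn_indices(rows, k):
    """For each row, return the set of indices of its k nearest other rows.

    Assumption: Euclidean distance over fixed-length numeric rows.
    Ties are broken by row index so the result is deterministic.
    """
    neighbors = []
    for i, a in enumerate(rows):
        dists = sorted(
            (math.dist(a, b), j) for j, b in enumerate(rows) if j != i
        )
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighbor_preservation(rows, selected, k=2):
    """Mean fraction of each row's k nearest neighbors that are kept
    when only the columns listed in `selected` are retained.

    Returns a value between 0.0 (no neighbors kept) and 1.0 (all kept).
    """
    reduced = [[row[c] for c in selected] for row in rows]
    full_nn = knn_indices(rows, k)
    reduced_nn = knn_indices(reduced, k)
    overlaps = [len(f & r) / k for f, r in zip(full_nn, reduced_nn)]
    return sum(overlaps) / len(overlaps)


# Illustrative data only: the third column is constant, so it carries
# no neighborhood information.
rows = [[0, 0, 5], [1, 0, 5], [0, 1, 5], [10, 10, 5]]
score_informative = neighbor_preservation(rows, selected=[0, 1], k=2)
score_constant = neighbor_preservation(rows, selected=[2], k=2)
```

Under these assumptions, a feature subset that drops only uninformative columns keeps every neighborhood intact, while a subset that discards the informative columns scores lower.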