Relationship preserving feature selection for unlabelled clinical trials time-series.

BCB '10: ACM International Conference on Bioinformatics and Computational Biology, Niagara Falls, New York, August 2010

Feature selection has been widely studied in supervised data mining applications, where the typical goal is to select a reduced attribute set that maximizes classification accuracy. Such a goal may not be appropriate for preserving the inter-attribute relationships of unlabelled time-series, such as clinical trial data. In this paper, we select features based on the time-series relationships of attributes, measured through their inter-attribute movement. We present performance measures and methods for feature selection over unlabelled time-series with the aim of preserving inter-attribute relationships. The performance metrics estimate how well a given feature set represents the original data by comparing nearest neighbors before and after feature selection. We provide techniques to combine and compare data from non-standard, variable-length time-series sources, and a mechanism to inject expert opinion into the feature selection process. The methodologies and comparative results are presented in the context of a real pharmaceutical database application.
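The nearest-neighbor comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the paper's actual metric, so everything here is an assumption made for illustration: each record is a fixed-length numeric vector, neighbors are found by Euclidean distance, and the score is the average fraction of each record's k nearest neighbors that survive when only the selected columns are kept. The names `knn_indices` and `neighbor_preservation` are hypothetical.

```python
import math


def knn_indices(rows, k):
    """For each row, return the set of indices of its k nearest other rows."""
    neighbors = []
    for i, a in enumerate(rows):
        # Euclidean distance to every other row; ties are broken by index.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(rows) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighbor_preservation(rows, selected, k=3):
    """Mean fraction of k nearest neighbors kept after feature selection.

    Assumed illustrative score, not the paper's metric: 1.0 means every
    row keeps the same k neighbors when only `selected` columns are used.
    """
    reduced = [[row[c] for c in selected] for row in rows]
    before = knn_indices(rows, k)
    after = knn_indices(reduced, k)
    return sum(len(b & a) / k for b, a in zip(before, after)) / len(rows)


# Made-up toy data: five records with three attributes each.
rows = [
    [0.0, 0.0, 5.0],
    [1.0, 0.1, 1.0],
    [2.0, 0.2, 9.0],
    [3.0, 0.1, 2.0],
    [4.0, 0.0, 7.0],
]

full = neighbor_preservation(rows, [0, 1, 2], k=2)  # keep every attribute
reduced = neighbor_preservation(rows, [0], k=2)     # keep only the first
print(full, reduced)
```

Under these assumptions, a score near 1 suggests the reduced attribute set preserves the neighborhood structure of the full data, while a lower score suggests that relationships were lost. Variable-length time-series, which the paper addresses, would need a different distance than the fixed-length Euclidean one used here.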