Mining alpha/beta concepts as relevant bi-sets from transactional data

msra(2004)

Abstract
We are designing new data mining techniques on boolean contexts to identify a priori interesting bi-sets (i.e., sets of objects or transactions associated with sets of attributes or items). A typical important case concerns formal concept mining (i.e., maximal rectangles of true values, or associated closed sets by means of the so-called Galois connection). It has been applied with some success to, e.g., gene expression data analysis, where objects denote biological situations and attributes denote gene expression properties. However, in such real-life application domains, the Galois association turns out to be too strong when considering intrinsically noisy data. Strong associations that nonetheless tolerate a bounded number of exceptions would clearly be extremely useful. We study the new pattern domain of α/β concepts, i.e., consistent maximal bi-sets with fewer than α false values per row and fewer than β false values per column. We provide a complete algorithm that computes all the α/β concepts, based on the generation of concept unions pruned by means of anti-monotonic constraints. An experimental validation on synthetic data is given. It illustrates that more relevant associations can be discovered in noisy data. We also discuss a practical application in molecular biology that illustrates an incomplete but quite useful extraction when the concepts needed beforehand cannot all be discovered.
Keywords
synthetic data,data mining,molecular biology,transaction data,gene expression
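
The bounded-exception condition described in the abstract can be illustrated with a minimal sketch. It only checks the two counting bounds (fewer than α false values per selected row, fewer than β per selected column) on a given bi-set. It does not implement the paper's consistency or maximality conditions, nor its algorithm based on concept unions. The function names, the list-of-lists encoding of the boolean context, and the toy data are all assumptions made for illustration.

```python
def false_counts_per_row(context, rows, cols):
    """Number of false values each selected row has over the selected columns."""
    return {r: sum(1 for c in cols if not context[r][c]) for r in rows}


def false_counts_per_col(context, rows, cols):
    """Number of false values each selected column has over the selected rows."""
    return {c: sum(1 for r in rows if not context[r][c]) for c in cols}


def within_exception_bounds(context, rows, cols, alpha, beta):
    """Check only the counting bounds from the abstract (hypothetical helper):
    fewer than alpha false values per row and fewer than beta per column."""
    rows_ok = all(n < alpha for n in false_counts_per_row(context, rows, cols).values())
    cols_ok = all(n < beta for n in false_counts_per_col(context, rows, cols).values())
    return rows_ok and cols_ok


# Toy boolean context (made up): rows are objects, columns are attributes.
context = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
]

# Rows {0, 1, 2} x columns {0, 1, 2} holds two false values, at most one
# per row and per column, so it passes with alpha = beta = 2.
noisy_ok = within_exception_bounds(context, [0, 1, 2], [0, 1, 2], alpha=2, beta=2)

# With alpha = beta = 1 no false value is allowed, which is the exact
# all-true rectangle requirement, so the same bi-set is rejected.
exact_ok = within_exception_bounds(context, [0, 1, 2], [0, 1, 2], alpha=1, beta=1)
```

Setting both bounds to 1 recovers the requirement that the bi-set be a rectangle of true values, which is why the noisy 3 x 3 bi-set above is accepted only under the relaxed bounds.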