Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant Factor (Just Accepted)

Journal of the ACM (2021)

Abstract
We consider the numerical taxonomy problem of fitting a positive distance function \(\mathcal{D}: \binom{S}{2} \rightarrow \mathbb{R}_{>0}\) by a tree metric. We want a tree T with positive edge weights and including S among the vertices so that their distances in T match those in \(\mathcal{D}\). A nice application is in evolutionary biology where the tree T aims to approximate the branching process leading to the observed distances in \(\mathcal{D}\) [Cavalli-Sforza and Edwards 1967]. We consider the total error, that is the sum of distance errors over all pairs of points. We present a deterministic polynomial time algorithm minimizing the total error within a constant factor. We can do this both for general trees, and for the special case of ultrametrics with a root having the same distance to all vertices in S. The problems are APX-hard, so a constant factor is the best we can hope for in polynomial time. The best previous approximation factor was O((log n)(log log n)) by Ailon and Charikar [2005] who wrote “Determining whether an O(1) approximation can be obtained is a fascinating question”.
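To make the objective concrete, the following minimal sketch evaluates the total error of a given candidate tree, reading "distance errors" as absolute differences between \(\mathcal{D}\) and the tree distances. It is not the approximation algorithm from the paper; it only computes the quantity being minimized. The function names `tree_distances` and `total_error`, the edge-list representation, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def tree_distances(edges, source):
    """Return path lengths from `source` to every vertex of a weighted tree.

    `edges` is a list of (u, v, weight) triples (hypothetical representation).
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0.0}
    stack = [source]
    while stack:
        u = stack.pop()
        for v, w in adj.get(u, []):
            if v not in dist:  # in a tree each vertex is reached exactly once
                dist[v] = dist[u] + w
                stack.append(v)
    return dist


def total_error(D, edges, S):
    """Sum of |D(u, v) - d_T(u, v)| over all unordered pairs {u, v} in S.

    `D` maps frozenset({u, v}) to the target distance (hypothetical encoding).
    """
    S = list(S)
    from_source = {u: tree_distances(edges, u) for u in S}
    return sum(
        abs(D[frozenset((u, v))] - from_source[u][v])
        for u, v in combinations(S, 2)
    )


# Toy instance (invented data): three points joined through a Steiner vertex "x".
S = ["a", "b", "c"]
D = {
    frozenset(("a", "b")): 2.0,
    frozenset(("a", "c")): 3.0,
    frozenset(("b", "c")): 4.0,
}
edges = [("a", "x", 1.0), ("b", "x", 1.0), ("c", "x", 2.0)]

print(total_error(D, edges, S))  # 1.0: only the pair {b, c} is off, by 1
```

The tree may contain vertices outside S (here `x`), matching the abstract's requirement that T merely include S among its vertices; only pairs from S contribute to the error.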
Keywords
approximation algorithms, phylogenic reconstructions, hierarchical clustering, tree metrics, ultrametrics