On the Complexity of Recognizing Wheeler Graphs

Algorithmica (2022)

Abstract
In recent years, several compressed indexes based on variants of the Burrows–Wheeler transform have been introduced. Some of these index structures far more complex than the single string originally indexed by the FM-index (Ferragina and Manzini in J. ACM 52(4):552–581, https://doi.org/10.1145/1082036.1082039, 2005). Consequently, there has been increasing effort to understand the conditions under which such an indexing scheme is possible. This has led to the introduction of Wheeler graphs (Gagie et al. in Theor Comput Sci 698:67–78, https://doi.org/10.1016/j.tcs.2017.06.016, 2017). Gagie et al. showed that de Bruijn graphs, generalized compressed suffix arrays, and several other BWT-related structures can be represented as Wheeler graphs, and that Wheeler graphs can be indexed in a space-efficient way. Hence, being able to recognize whether a given graph is a Wheeler graph, or to approximate a given graph by a Wheeler graph, could have numerous applications in indexing. Here we resolve the open question of whether there exists an efficient algorithm for recognizing whether a given graph is a Wheeler graph. We show:

(1) The problem of recognizing whether a given graph G = (V, E) is a Wheeler graph is NP-complete for any edge label alphabet of size σ ≥ 2, even when G is a DAG. This holds even on a restricted subset of graphs called d-NFAs for d ≥ 5. This is in contrast to recent results demonstrating that the problem can be solved in polynomial time for d-NFAs where d ≤ 2. We also show that the recognition problem can be solved in linear time for σ = 1 on graphs without self-loops.

(2) There exists a 2^(e log σ + O(n + e))-time exact algorithm, where n = |V| and e = |E|. This algorithm relies on graph isomorphism being computable in strictly sub-exponential time.

(3) We define an optimization variant of the problem called Wheeler Graph Violation, abbreviated WGV, where the aim is to identify the smallest set of edges that have to be removed from a graph to obtain a Wheeler graph. We show WGV is APX-hard, even when G is a DAG, implying there exists a constant C > 1 for which there is no C-approximation algorithm (unless P = NP). Also, conditioned on the Unique Games Conjecture, for all C > 1, it is NP-hard to find a C-approximation, implying WGV is not in APX.

(4) We define the Wheeler Subgraph problem, abbreviated WS, where the aim is to find the largest subgraph which is a Wheeler graph (the dual of WGV). In contrast to WGV, we give an O(σ)-approximation algorithm for the WS problem, implying it is in APX for σ = O(1).

The above findings suggest that most problems under this theme are computationally difficult. However, we identify a class of graphs for which the recognition problem is polynomial-time solvable, raising the question of which properties determine this problem's difficulty.
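To make the recognition problem concrete, the following is a minimal illustrative sketch. It is not the exact algorithm from the paper. It assumes the standard Wheeler graph definition from Gagie et al., which the abstract does not restate: there must be a total order on the nodes in which nodes of in-degree 0 come first, and for any two edges (u, v) and (u', v') with labels a and a', a < a' implies v < v', while a = a' and u < u' implies v ≤ v'. The function names `is_wheeler_order` and `find_wheeler_order` are hypothetical. The search tries every permutation of the nodes, so it is only usable on very small graphs.

```python
from itertools import permutations


def is_wheeler_order(order, edges):
    """Check whether `order` (a list of nodes) satisfies the Wheeler conditions.

    `edges` is a list of (source, target, label) triples; labels must be
    comparable with `<` (for example, single characters).
    """
    pos = {node: i for i, node in enumerate(order)}
    has_incoming = {v for _, v, _ in edges}

    # Nodes with in-degree 0 must precede every node with positive in-degree.
    seen_incoming = False
    for node in order:
        if node in has_incoming:
            seen_incoming = True
        elif seen_incoming:
            return False

    # Pairwise edge conditions (assumed definition, see the lead-in).
    for u, v, a in edges:
        for u2, v2, a2 in edges:
            if a < a2 and not pos[v] < pos[v2]:
                return False
            if a == a2 and pos[u] < pos[u2] and not pos[v] <= pos[v2]:
                return False
    return True


def find_wheeler_order(nodes, edges):
    """Brute-force search over all node orders; returns one valid order or None."""
    for order in permutations(nodes):
        if is_wheeler_order(order, edges):
            return list(order)
    return None
```

For instance, a two-edge path with labels 'a' then 'b' admits a valid order, whereas a node with two incoming edges carrying different labels cannot, because the first edge condition would require that node to precede itself.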
Keywords
Wheeler graphs, FM-index, Burrows–Wheeler transform