A sequence landscape is a data structure that records the exact occurrences of substrings of a source string in a target string, together with the frequency of each of these occurrences. The original construction algorithm [1] uses a directed acyclic word graph (DAWG), the smallest automaton that accepts all the substrings of a string, as the underlying data structure. Although the running time and space requirements for building the DAWG are asymptotically linear, the constants are relatively large.
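To make the definition concrete, the following is a minimal brute-force sketch, not the DAWG-based construction of [1]. It assumes that the landscape is indexed by (start position in the target, substring length) and that the frequency is the number of possibly overlapping occurrences in the source. The function names `substring_counts` and `naive_landscape` are hypothetical and chosen for illustration only. The sketch takes quadratic time and space in the length of the source, so it is only suitable for short strings.

```python
from collections import Counter


def substring_counts(source):
    """Count every (possibly overlapping) occurrence of every substring of `source`."""
    counts = Counter()
    n = len(source)
    for i in range(n):
        for j in range(i + 1, n + 1):
            counts[source[i:j]] += 1
    return counts


def naive_landscape(source, target):
    """Map (start, length) in `target` to the frequency of that substring in `source`.

    Assumption: only substrings that occur at least once in `source` are stored.
    """
    counts = substring_counts(source)
    landscape = {}
    for i in range(len(target)):
        for j in range(i + 1, len(target) + 1):
            freq = counts.get(target[i:j], 0)
            if freq == 0:
                # If target[i:j] is absent from the source, every longer
                # substring starting at i is absent as well.
                break
            landscape[(i, j - i)] = freq
    return landscape


if __name__ == "__main__":
    for key, freq in sorted(naive_landscape("abab", "bab").items()):
        print(key, freq)
```

A DAWG or suffix automaton of the source can answer the same frequency queries without enumerating all substrings, which is the motivation for the construction discussed above.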
HyDA-Vista: Towards Optimal Guided Selection of k-mer Size for Sequence Assembly
Basir Shariat, Narjes Sadat Movahedi, Hamidreza Chitsaz and Christina Boucher. To appear in BMC Bioinformatics