학술논문
Contextual Pattern Matching in Less Space
Document Type
Conference
Source
2023 Data Compression Conference (DCC) DCC Data Compression Conference (DCC), 2023. :160-167 Mar, 2023
Subject
Language
ISSN
2375-0359
Abstract
We revisit the Contextual Pattern Matching Problem, defined as follows: preprocess a text T[1, n], so that given a query consisting of a string P and a length P, the occurrences of all distinct strings XPY where |X|=|Y|=P can be reported. This problem was introduced by Navarro, who presented an O($\overline{r}\log(n/\overline{r}))$ space data structure, where $\overline{r}$ is the maximum of the number of runs in the BWT of the text $\mathrm{T}[1,n]$ and its reverse. His solution reports all c contextual occurrences in $O(|P|+c\log n)$ time. However, the only known bounds on $\overline{r}$ are $\overline{r}=O(r\log^{2}n)$ where r is the number of runs in the BWT of T, making it desirable to avoid using structures with space dependent on $\overline{r}$. We demonstrate that this is possible without a significant sacrifice in query time by providing an $O(r\log(n/r))$ space solution that answers queries in $O(|P|+c\log P\cdot\log(n/r))$ time.