Implementation and Evaluation of an External Memory String B-Tree
Fellipe Bernardes Lima
Preprocessing texts of huge size to answer substring queries is not trivial whenever considering realistic models. We approach this problem by offering an efficient implementation of the String B-Tree data structure, which aims to solve the sub-string search problem under the dynamic operations. We achieve optimal space usage for the Patricia Tries by representing them via multiarray encoding and our approach for the String B-Tree uses less space than any other external memory full-text index available. The tree representation is well suited for entropy compression and the String B-Tree is efficiently constructed regarding CPU processing and I/O operations. During that construction process we reuse the SA and the LCP arrays, which can be generated by other efficient algorithms.