Wavelet Tree Construction on GPUs

  • Bit Vectors are one of the most basic data structures in computer science. Operations on bit vectors include rank and select queries.

     

    - rank_1(k) returns the number of 1-bits up to position k in the bit vector and
    - select_1(k) returns the position at which the k-th 1-bit is stored.

     

    One of the many applications of bit vectors with rank and select support are Wavelet Trees. A Wavelet Tree can store a compressed sequence of integers and answer rank and select queries on those integers. This can be used, for example, in bioinformatics. Wavelet trees can be constructed in parallel by splitting the input sequence, constructing independent wavelet trees, and merging the results. The idea of merging is simple but it is rather slow because it requires a large amount of non-contiguous data to be copied. GPUs support fast memory access, which could make this parallel wavelet tree construction more efficient. There seem to be efficient implementations of neither rank and select data structures, nor wavelet tree construction on GPUs. The Nvidia nvbio library provides an implementation but does not use state of the art algorithms.

     

    A detailed description can be found here.