Scalable Construction of Text Indexes with Thrill

Conference:

IEEE International Conference on Big Data

Publisher:

IEEE

Location:

Seattle, WA, USA

Year:

2018

Date:

December 10-13, 2018

Links: PDF
Authors:

Timo Bingmann, Simon Gog, Florian Kurpicz

Speaker:

Timo Bingmann

The suffix array is the key to efficient solutions for a myriad of string processing problems in application domains such as data compression, data mining, and bioinformatics. With the rapid growth of available data, suffix array construction algorithms have to be adapted to advanced computational models such as external memory and distributed computing. In this article, we present five suffix array construction algorithms built on Thrill, a new algorithmic framework for big data batch processing, which allows scalable processing on distributed systems of inputs that are orders of magnitude larger than those considered before.