
Communication Efficient Checking of Big Data Operations

Author:

Lorenz Hübschle-Schneider and Peter Sanders

Source:

Technical Report, October 2017, arXiv:1710.08255

Date: 23 October 2017

We propose fast probabilistic algorithms with low (i.e., sublinear in the input size) communication volume to check the correctness of operations in Big Data processing frameworks and distributed databases. Our checkers cover many of the commonly used operations, including sum, average, median, and minimum aggregation, as well as sorting, union, merge, and zip. An experimental evaluation of our implementation in Thrill (Bingmann et al., 2016) confirms the low overhead and high failure detection rate predicted by theoretical analysis.
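To give a feel for what a low-communication probabilistic checker can look like, here is a minimal sketch of a sort checker. It is not taken from the report and is not the authors' algorithm; it uses a standard polynomial multiset fingerprint as a stand-in technique. The names `multiset_fingerprint` and `check_sort`, the modulus `P`, and the restriction to integers in the range 0 to P-1 are all assumptions made for illustration. Each partition (standing in for a worker) reduces its data to a single number, so only one value per partition would need to be exchanged, regardless of input size.

```python
import random

# Assumption: elements are integers in [0, P). P is a Mersenne prime.
P = (1 << 61) - 1


def multiset_fingerprint(items, r):
    """Evaluate prod (r - x) mod P over all items.

    Two different multisets of size at most n yield the same value
    for a uniformly random r with probability at most n / P.
    """
    fp = 1
    for x in items:
        fp = fp * ((r - x) % P) % P
    return fp


def check_sort(input_parts, output_parts, r=None):
    """Probabilistically check that output_parts is a sorted
    permutation of input_parts (both are lists of partitions).

    Hypothetical helper: in a distributed setting each partition would
    compute its fingerprint locally and only that single number (plus
    its boundary elements) would be communicated.
    """
    if r is None:
        r = random.randrange(P)  # shared random evaluation point

    # Combine per-partition fingerprints; the product is order-independent.
    fp_in = 1
    for part in input_parts:
        fp_in = fp_in * multiset_fingerprint(part, r) % P
    fp_out = 1
    for part in output_parts:
        fp_out = fp_out * multiset_fingerprint(part, r) % P
    if fp_in != fp_out:
        return False  # the output is certainly not a permutation

    # Sortedness: walked sequentially here for simplicity; workers could
    # check locally and compare only partition boundaries.
    prev = None
    for part in output_parts:
        for x in part:
            if prev is not None and x < prev:
                return False
            prev = x
    return True
```

A call such as `check_sort([[3, 1, 2], [5, 4]], [[1, 2, 3], [4, 5]])` accepts a correct result, while an output with a dropped, altered, or misordered element is rejected, except with the small false-acceptance probability noted in the docstring. The check is one-sided: a correct output is never rejected.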