Parallele Algorithmen Programmieraufgaben

1. Reduce and AllReduce for non-powers-of-two (1 participant)
2. Parallel Breadth-First-Search Frontier (1 participant)
3. Bisecting \(k\)-means Clustering (1 participant)
4. HyperLogLog (2 participants) [UNFINISHED]
5. Gaussian Mixture Clustering (1 participant) [UNFINISHED]
6. Linear Regression with Stochastic Gradient Descent Optimization (3 participants) [UNFINISHED]

1 Reduce and AllReduce for non-powers-of-two (1 participant)

Goal: Implement AllReduce() and Reduce() in Thrill's network layer for \(p\) workers with \(p \neq 2^k\). Add enough test cases to check the code's correctness. Prepare a short presentation on how this algorithm works.

Thrill requires AllReduce and Reduce collective operations. For \(2^k\) the hypercube algorithm presented in the lecture is used. For non-\(2^k\), however, currently only a trivial \(2 \log_2 p\) reduction algorithm is used. For performance, one really wants to use a \(\log_2 p + O(1)\) algorithm. The following paper proposes such an algorithm, which is composed of a hierarchy of butterfly exchanges:

Rabenseifner, Rolf, and Jesper Larsson Träff. "More Efficient Reduction Algorithms for Non-Power-of-Two Number of Processors in Message-Passing Parallel Systems." In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 36–46. LNCS 3241. Springer, 2004. http://link.springer.com/chapter/10.1007/978-3-540-30218-6_13

See the trivial and hypercube AllReduce() implementations in Thrill as a starting point: https://github.com/thrill/thrill/blob/master/thrill/net/collective.hpp

2 Parallel Breadth-First-Search Frontier (1 participant)

Goal: Write a parallel breadth-first search in Thrill. Apply the algorithm on generated and real-world graphs.

The graph should be stored as a DIA of vector<size_t>, within which each index describes the outgoing edges of a vertex. Determine how to perform a BFS using ReduceToIndex to represent the queues. In the presentation discuss into how ReduceToIndex works internally and how the queues are processed.

Buluç, Aydin, and Kamesh Madduri. "Parallel breadth-first search on distributed memory systems." Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 2011.

3 Bisecting \(k\)-means Clustering (1 participant)

Goal: Implementation of the bisecting K-Means clustering Algorithm in Thrill. Test it on generated or real-world benchmarks inputs.

The "standard" \(k\)-means algorithm is Lloyd's algorithms which performs \(k\) iterations and calculates new centers as the average of previous classifications. The following authors present a different algorithms for iteratively improving \(k\)-means clusters:

Steinbach, Michael, George Karypis, and Vipin Kumar. "A comparison of document clustering techniques." KDD workshop on text mining. Vol. 400. No. 1. 2000. http://glaros.dtc.umn.edu/gkhome/fetch/papers/docclusterKDDTMW00.pdf

Figure out how to parallelize the sequential base algorithm or transcribe the Spark code into Thrill/C++. Test your code on appropriate inputs.

Spark Implementation: https://github.com/apache/spark/blob/v2.0.2/mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala

4 HyperLogLog (2 participants) [UNFINISHED]

Goal: a new Thrill DIA Operation HyperLogLog = Count Distinct Approximation.

Flajolet, Philippe, et al. "Hyperloglog: the analysis of a near-optimal cardinality estimation algorithm." DMTCS Proceedings 1 (2008).

Heule, Stefan, Marc Nunkesser, and Alexander Hall. "HyperLogLog in practice: algorithmic engineering of a state of the art cardinality estimation algorithm." Proceedings of the 16th International Conference on Extending Database Technology. ACM, 2013.

Presentation by Robert Sedgewick: https://www.cs.princeton.edu/~rs/talks/AC11-Cardinality.pdf