The Standard Template Library for XXL data sets, STXXL, lacks a frequently requested feature: matrix calculations. The objects of interest are matrices that are too large to fit into the main memory of a typical computer, e.g. of dimension 100000 x 100000 (80 GB, assuming 8-byte double-precision entries). Instead, they have to be stored in so-called external memory, usually hard disks or SSDs.
Specific layouts and algorithms exist to ensure I/O-efficient processing of the matrix operations, i.e. to access the disks as rarely as possible and only in coarse-grained blocks. Thus, work on this topic will start with a literature study. The software engineering part of this work consists of designing the API.
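To illustrate why the layout matters, the following minimal sketch compares a plain row-major layout with a tile-major layout, in which the matrix is cut into b x b tiles and each tile is stored contiguously. All names (`row_major_offset`, `tile_major_offset`, `blocks_touched`) are hypothetical and not part of STXXL; the sketch assumes that one disk block holds exactly b*b elements and that b divides n.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Offset of element (row, col) in a plain row-major n x n layout.
// The parameter b is unused; it only keeps both signatures identical.
std::size_t row_major_offset(std::size_t row, std::size_t col,
                             std::size_t n, std::size_t /*b*/) {
    assert(row < n && col < n);
    return row * n + col;
}

// Offset of element (row, col) in a tile-major layout (hypothetical helper).
// Tiles are stored one after another, row-major over tiles; the elements
// inside a tile are stored row-major as well. Assumes b divides n.
std::size_t tile_major_offset(std::size_t row, std::size_t col,
                              std::size_t n, std::size_t b) {
    assert(b > 0 && n % b == 0 && row < n && col < n);
    const std::size_t tiles_per_row = n / b;
    const std::size_t tile_index = (row / b) * tiles_per_row + (col / b);
    const std::size_t inside_tile = (row % b) * b + (col % b);
    return tile_index * b * b + inside_tile;
}

using OffsetFn = std::size_t (*)(std::size_t, std::size_t,
                                 std::size_t, std::size_t);

// Counts the distinct disk blocks (b*b elements each, an assumption of this
// sketch) that must be read to fetch the b x b submatrix starting at (r0, c0).
std::size_t blocks_touched(std::size_t r0, std::size_t c0,
                           std::size_t n, std::size_t b, OffsetFn offset) {
    std::set<std::size_t> blocks;
    for (std::size_t r = r0; r < r0 + b; ++r)
        for (std::size_t c = c0; c < c0 + b; ++c)
            blocks.insert(offset(r, c, n, b) / (b * b));
    return blocks.size();
}
```

For a tile-aligned submatrix, `blocks_touched(r0, c0, n, b, tile_major_offset)` is 1, whereas the row-major layout spreads the same submatrix over several blocks once a matrix row is longer than a block.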
Using this interface, the most important algorithms are to be implemented as an extension of STXXL, based on an I/O-efficient matrix storage. The algorithms include scalar multiplication (scan), transposition, and matrix-matrix multiplication. To speed up internal computations, multi-core parallelism should be used through OpenMP and/or the Multi-Core Standard Template Library (MCSTL). Different algorithms and implementation variants should be evaluated and compared on that basis.
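The three operations named above can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: a `std::vector<double>` in row-major order stands in for the external-memory storage, the function names are hypothetical and not an STXXL interface, and the tile size would in practice be chosen so that the tiles in use fit into main memory. The OpenMP pragma is guarded, so the code also compiles without OpenMP support.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// In-memory stand-in for an external-memory matrix: row-major, n x n.
using Matrix = std::vector<double>;

// Scalar multiplication is a single scan over all elements.
void scale(Matrix& a, double s) {
    for (double& x : a) x *= s;
}

// Tile-wise transposition: every tile of the input is read once and written
// to its mirrored position, so accesses stay local to two tiles at a time.
Matrix transpose_blocked(const Matrix& a, std::size_t n, std::size_t tile) {
    assert(tile > 0 && a.size() == n * n);
    Matrix t(n * n);
    for (std::size_t i0 = 0; i0 < n; i0 += tile)
        for (std::size_t j0 = 0; j0 < n; j0 += tile)
            for (std::size_t i = i0; i < std::min(i0 + tile, n); ++i)
                for (std::size_t j = j0; j < std::min(j0 + tile, n); ++j)
                    t[j * n + i] = a[i * n + j];
    return t;
}

// Tile-wise matrix-matrix multiplication C = A * B.
// Each iteration of the outer loop writes a disjoint band of rows of C,
// so the iterations can run in parallel without synchronization.
Matrix multiply_blocked(const Matrix& a, const Matrix& b,
                        std::size_t n, std::size_t tile) {
    assert(tile > 0 && a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    const std::ptrdiff_t sn = static_cast<std::ptrdiff_t>(n);
    const std::ptrdiff_t st = static_cast<std::ptrdiff_t>(tile);
#ifdef _OPENMP
#pragma omp parallel for schedule(static)
#endif
    for (std::ptrdiff_t i0 = 0; i0 < sn; i0 += st) {
        for (std::ptrdiff_t j0 = 0; j0 < sn; j0 += st) {
            for (std::ptrdiff_t k0 = 0; k0 < sn; k0 += st) {
                const std::ptrdiff_t i_end = std::min(i0 + st, sn);
                const std::ptrdiff_t j_end = std::min(j0 + st, sn);
                const std::ptrdiff_t k_end = std::min(k0 + st, sn);
                // Multiply one tile of A with one tile of B.
                for (std::ptrdiff_t i = i0; i < i_end; ++i)
                    for (std::ptrdiff_t k = k0; k < k_end; ++k)
                        for (std::ptrdiff_t j = j0; j < j_end; ++j)
                            c[i * sn + j] += a[i * sn + k] * b[k * sn + j];
            }
        }
    }
    return c;
}
```

In an external-memory variant, the three inner loops would operate on tiles that have been loaded from disk, while the outer loops determine the order and number of tile transfers; comparing such loop orders and tile sizes is the kind of evaluation the text asks for.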
- Good knowledge of C++ and the Standard Template Library, or the willingness to learn the language
- Software Engineering Competence
- Develop software that is actually used by other researchers
- Gain insight into external-memory algorithms, and learn how to compute fast despite accessing hard disks
Link to the STXXL homepage.