|Chair:||Chair of Prof. Sanders|
|Time:||Preliminary meeting on 26.04.2006, 13.15 - 13.45|
|Lecturers:||P. Sanders, R. Dementiev, D. Schultes, J. Singler|
Massive data sets arise naturally in many domains: geographic information systems, computer graphics, database systems, telecommunication billing systems, network analysis, and scientific computing. Applications in these domains have to process terabytes of data, but the internal memory of a computer can hold only a small fraction of such huge data sets, so during processing the applications need to access external storage (e.g. hard disks). A single such access can be about 10^6 times slower than a main memory access, whereas subsequent accesses to neighboring elements in external memory are much cheaper. In order to amortize the high cost of a random access, one reads or writes contiguous chunks of size B. I/O then becomes the main bottleneck for applications dealing with large data sets, so one tries to minimize the number of I/O operations performed. To increase I/O bandwidth, applications use multiple disks in parallel: in each I/O step the algorithm tries to transfer D blocks between main memory and the disks (one block from each disk). This model has been formalized by Vitter and Shriver as the Parallel Disk Model (PDM) and is the standard theoretical model for designing and analyzing I/O-efficient algorithms.
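The PDM cost measure described above can be sketched numerically. The helper names and parameter values below are illustrative, not from the course; the sorting bound is the classical Vitter/Shriver result, with all sizes counted in elements rather than bytes:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Parallel Disk Model parameters: N = input size, M = internal memory size,
// B = block size, D = number of disks. All counted in elements, not bytes.

// A scan touches every element once, moving D blocks per parallel I/O step:
// ceil(N / (D*B)) steps.
std::uint64_t scan_ios(std::uint64_t N, std::uint64_t B, std::uint64_t D) {
    return (N + D * B - 1) / (D * B);  // ceiling division
}

// The classical PDM sorting bound:
// (N / (D*B)) * ceil(log_{M/B}(N/B)) parallel I/O steps,
// i.e. one scan per merge pass, with M/B-way merging.
std::uint64_t sort_ios(std::uint64_t N, std::uint64_t M,
                       std::uint64_t B, std::uint64_t D) {
    double passes =
        std::ceil(std::log2(double(N) / B) / std::log2(double(M) / B));
    return scan_ios(N, B, D) * std::uint64_t(passes);
}
```

For example, with N = 2^30 elements, M = 2^28, B = 2^18 and D = 4 disks, a scan takes 1024 parallel I/O steps and a sort takes two passes, i.e. 2048 steps.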
In this practical course (Praktikum) we develop I/O-efficient algorithms that can solve very large problems that do not fit into the main memory of a computer. We will use the Stxxl library to implement algorithms and data structures for processing huge graphs, texts, and matrices.
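Stxxl provides external-memory containers and sorting out of the box. As a plain-standard-library illustration of the underlying pattern (run formation followed by a multiway merge), a toy external merge sort might look like the sketch below; the file names, memory budget, and demo values are made up for the example and are far from Stxxl's block-oriented, parallel-disk implementation:

```cpp
#include <algorithm>
#include <fstream>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Phase 1: read the input in memory-sized pieces, sort each piece internally,
// and write it out as a sorted "run" file.
std::vector<std::string> form_runs(const std::string& in, std::size_t mem_budget) {
    std::ifstream is(in);
    std::vector<std::string> runs;
    std::vector<int> buf;
    auto flush = [&]() {
        if (buf.empty()) return;
        std::sort(buf.begin(), buf.end());  // internal sort of one run
        std::string name = in + ".run" + std::to_string(runs.size());
        std::ofstream os(name);
        for (int v : buf) os << v << '\n';
        runs.push_back(name);
        buf.clear();
    };
    int x;
    while (is >> x) {
        buf.push_back(x);
        if (buf.size() == mem_budget) flush();
    }
    flush();
    return runs;
}

// Phase 2: k-way merge of all runs via a min-heap of (value, run index).
void merge_runs(const std::vector<std::string>& runs, const std::string& out) {
    std::vector<std::ifstream> fs;
    for (const auto& r : runs) fs.emplace_back(r);
    using Item = std::pair<int, std::size_t>;
    std::priority_queue<Item, std::vector<Item>, std::greater<>> heap;
    int x;
    for (std::size_t i = 0; i < fs.size(); ++i)
        if (fs[i] >> x) heap.push({x, i});
    std::ofstream os(out);
    while (!heap.empty()) {
        auto [v, i] = heap.top();
        heap.pop();
        os << v << '\n';
        if (fs[i] >> x) heap.push({x, i});  // refill from the same run
    }
}

// Round-trip demo: sort 10 ints while pretending only 3 fit in memory.
bool external_sort_demo() {
    {
        std::ofstream os("ems_input.txt");
        for (int v : {5, 3, 9, 1, 7, 2, 8, 4, 6, 0}) os << v << '\n';
    }
    merge_runs(form_runs("ems_input.txt", 3), "ems_output.txt");
    std::ifstream is(" ems_output.txt"[1] ? "ems_output.txt" : "");
    std::vector<int> out;
    int x;
    while (is >> x) out.push_back(x);
    return out.size() == 10 && std::is_sorted(out.begin(), out.end());
}
```

The same two-phase structure underlies Stxxl's external sorting, which additionally overlaps I/O with computation and stripes blocks across parallel disks.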
|Algorithms for Memory Hierarchies Book||LNCS 2625 Tutorial|
|STXXL: Standard Template Library for XXL Data Sets||Technical Report 2005/18, Fakultät für Informatik, Universität Karlsruhe|