The algorithms of STL can be divided into two groups by their memory access pattern: scanning algorithms and random access algorithms.
Scanning algorithms. These are the algorithms that work with Input, Output, Forward, and Bidirectional iterators only. Since random access operations are not allowed with these kinds of iterators, the algorithms inherently exhibit strong spatial locality of reference. STXXL containers and their iterators are STL-compatible, therefore one can directly apply STL scanning algorithms to them, and they will run I/O-efficiently (see the use of std::generate and std::unique algorithms in the Listing 4). Scanning algorithms are the majority of the STL algorithms (62 out of 71). STXXL also offers specialized implementations of some scanning algorithms (stxxl::for_each, stxxl::generate, etc.), which perform better in terms of constant factors in the I/O volume and internal CPU work. These implementations benefit from accessing lower level interfaces of the BM layer instead of using iterator interfaces, resulting in smaller CPU overhead. Being aware of the sequential access pattern of the applied algorithm, the STXXL implementations can do prefetching and use queued writing, thereby leading to the overlapping of I/O with computation.
Random access algorithms. These algorithms require RandomAccess iterators, hence may perform many random I/Os 4. For such algorithms, STXXL provides specialized I/O-efficient implementations that work with STL-user layer external memory containers. Currently the library provides two implementations of sorting: an std::sort-like sorting routine - stxxl::sort, and a sorter that exploits integer keys - stxxl::ksort. Both sorters are highly efficient parallel disk implementations. The algorithm they implement guarantees close to optimal I/O volume and almost perfect overlapping between I/O and computation [16]. The performance of the sorters scales well. With eight disks which have peak bandwidth of 380 MB/s it sorts 128 byte elements with 32 bit keys achieving I/O bandwidth of 315 MB/s.
Listing 4 shows how to program using the STL-user layer
and how STXXL containers can be used together with both STXXL algorithms and
STL algorithms. Definitions of classes edge, random_edge and
edge_cmp are in Listing 3.
The purpose of our example is to generate a huge random directed
graph in sorted edge array representation. The edges in the edge array must be sorted
lexicographically. A straightforward procedure to do this is to: 1) generate a sequence
of random edges, 2) sort the sequence, 3) remove duplicate edges from it.
If we ignore definitions of helper classes the STL/STXXL code for it is only
five lines long:
Line 1 creates an STXXL external memory vector with 10 billion edges.
Line 2 fills the vector with random edges (generate from
STL is used).
In the next line the STXXL external memory sorter sorts randomly generated edges
using 512 megabytes of internal memory. The lexicographical order is defined by
functor my_cmp, stxxl::sort also requires the comparison
functor to provide upper and lower bounds for the elements being sorted.
Line 6 deletes duplicate edges in the external memory vector
with the help of the STL unique algorithm. The NewEnd vector iterator
points to the right boundary of the range without duplicates.
Finally (Line 7), we chop the vector
at the NewEnd boundary.
Now we count the number of I/Os performed by this example: external vector
construction takes no I/Os; filling with random values requires a scan --
I/Os;
sorting will take
I/Os; duplicate removal needs no more than
I/Os;
chopping a vector is I/O free. The total number of I/Os is
.