The algorithms of STL can be divided into two groups
by their memory access pattern:
*scanning* algorithms and *random access* algorithms.

**Scanning algorithms.** These are the algorithms that work with Input, Output,
Forward, and Bidirectional iterators only.
Since random access operations are not
allowed with these kinds of iterators, the algorithms inherently
exhibit strong spatial locality of reference. STXXL containers and their iterators are STL-compatible, therefore
one can directly apply STL scanning algorithms to them,
and they will
run I/O-efficiently (see the use of `std::generate` and `std::unique`
algorithms in the Listing 4).
Scanning algorithms are the majority of the STL algorithms (62 out of 71).
STXXL also offers specialized implementations of some scanning
algorithms (`stxxl::for_each`, `stxxl::generate`, etc.),
which perform better in terms of constant factors in the
I/O volume and internal CPU work. These implementations
benefit from accessing lower level interfaces of the BM layer
instead of using iterator interfaces, resulting
in smaller CPU overhead. Being aware of the sequential
access pattern of the applied algorithm, the STXXL
implementations can do prefetching
and use queued writing, thereby leading to the overlapping of
I/O with computation.

**Random access algorithms.** These algorithms require RandomAccess iterators,
hence may perform many random I/Os ^{4}.
For such algorithms, STXXL provides
specialized I/O-efficient implementations that work with STL-user layer external
memory containers. Currently the library provides two implementations of
sorting: an `std::sort`-like sorting routine - `stxxl::sort`,
and a sorter that exploits integer keys - `stxxl::ksort`. Both sorters
are highly efficient parallel disk implementations. The algorithm
they implement guarantees close to optimal I/O volume and almost perfect
overlapping between I/O and computation [16].
The performance of the sorters scales well. With eight disks which have peak
bandwidth of 380 MB/s it sorts 128 byte elements with 32 bit keys
achieving I/O bandwidth of 315 MB/s.

Listing 4 shows how to program using the STL-user layer
and how STXXL containers can be used together with both STXXL algorithms and
STL algorithms. Definitions of classes `edge`, `random_edge` and
`edge_cmp` are in Listing 3.
The purpose of our example is to generate a huge random directed
graph in sorted edge array representation. The edges in the edge array must be sorted
lexicographically. A straightforward procedure to do this is to: 1) generate a sequence
of random edges, 2) sort the sequence, 3) remove duplicate edges from it.
If we ignore definitions of helper classes the STL/STXXL code for it is only
five lines long:
Line 1 creates an STXXL external memory vector with 10 billion edges.
Line 2 fills the vector with random edges (`generate` from
STL is used).
In the next line the STXXL external memory sorter sorts randomly generated edges
using 512 megabytes of internal memory. The lexicographical order is defined by
functor `my_cmp`, `stxxl::sort` also requires the comparison
functor to provide upper and lower bounds for the elements being sorted.
Line 6 deletes duplicate edges in the external memory vector
with the help of the STL `unique` algorithm. The `NewEnd` vector iterator
points to the right boundary of the range without duplicates.
Finally (Line 7), we chop the vector
at the `NewEnd` boundary.
Now we count the number of I/Os performed by this example: external vector
construction takes no I/Os; filling with random values requires a scan -- I/Os;
sorting will take I/Os; duplicate removal needs no more than I/Os;
chopping a vector is I/O free. The total number of I/Os is .