STXXL is a layered library; it consists of three layers (see Figure 1). The lowest layer, the Asynchronous I/O primitives layer (AIO layer) abstracts away the details of how asynchronous I/O is performed on a particular operating system. Other existing external memory algorithm libraries rely only on synchronous I/O APIs  or allow reading ahead sequences stored in a file using the POSIX asynchronous I/O API . These libraries also rely on uncontrolled operating system I/O caching and buffering in order to overlap I/O and computation in some way. However, this approach has significant performance penalties for accesses without locality. Unfortunately, asynchronous I/O APIs are very different on different operating systems (e.g. POSIX AIO and Win32 Overlapped I/O). Therefore, we have introduced the AIO layer to make porting STXXL easy. Porting the whole library to a different platform (for example Windows) requires only reimplementing the AIO layer using native file access methods and/or native multithreading mechanisms. STXXL has already several implementations of the layer which use different file access methods under POSIX/UNIX systems.
The Block management layer (BM layer) provides a programming interface simulating the parallel disk model. The BM layer provides abstraction for a fundamental concept in the external memory algorithm design - block of elements. The block manager implements block allocation/deallocation allowing several block-to-disk assignment strategies: striping, randomized striping, randomized cycling, etc. The block management layer provides implementation of parallel disk buffered writing , optimal prefetching , and block caching. The implementations are fully asynchronous and designed to explicitly support overlapping between I/O and computation.
The top of STXXL consists of two modules. The STL-user layer provides external memory sorting, external memory stack, external memory priority queue, etc. which have (almost) the same interfaces (including syntax and semantics) as their STL counterparts. The Streaming layer provides efficient support for pipelining external memory algorithms. Many external memory algorithms, implemented using this layer, can save factor of 2-3 in I/Os. For example, the algorithms for external memory suffix array construction implemented with this module  require only of I/Os which must be performed by implementations that use conventional data structures and algorithms (either from STXXL STL-user layer, or LEDA-SM, or TPIE). The win is due to an efficient interface, that couples the input and the output of the algorithm-components (scans, sorts, etc.). The output from an algorithm is directly fed into another algorithm as input, without needing to store it on the disk in between. This generic pipelining interface is the first of this kind for external memory algorithms.