Posted on 2019-03-22 16:00 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ stxxl thrill
Last week I had the pleasure of being invited to the Dagstuhl seminar 19111 on Theoretical Models of Storage Systems. I gave a talk on the history of STXXL and Thrill, but also wanted to include some current developments. Most interesting I found is the gap closing between RAM and disk bandwidth due to the (relatively) new Non-Volatile Memory Express (NVMe) storage devices.
Since I am involved in many projects using external memory, I decided to perform a simple set of fundamental experiments to compare rotational disks and newer solid-state devices (SSDs). The results were interesting enough to write this blog article about.
Among the tools of STXXL/FOXXLL there are two benchmarks which perform two distinct access patterns: Scan (benchmark_disks
) and Random (benchmark_disks_random
).
The Scan experiment is probably the fastest access method as it reads or writes the disk (actually: storage device) sequentially. The Random experiment is good to determine the access latency of the disk as it first has to seek to the block and then transfer the data. Notice that the Random experiment does batched block accesses like one would perform in a query/answering system where the next set of random blocks depends on calculations performed with the preceding blocks (like in a B-Tree). This is a different experiment than done by most "throughput" measurement tools which issue a continuous stream of random block accesses.
Posted on 2016-09-21 20:00 by Timo Bingmann at Permlink with 0 Comments. Tags: talk thrill stxxl
Today, I gave a technical presentation comparing STXXL and Project Thrill at the STXXL Workshop organized within the DFG SPP 1736. The main topic of the workshop was to determine the future development course of STXXL, and the biggest question in this regard was how to bring more multi-core parallelization into STXXL. Thrill or an adaptation of its ideas may be the solution to this challenge: 2016-09-21 STXXL and Thrill Slides.pdf .
Posted on 2015-06-30 17:11 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ talk stxxl
We are very glad to have been given the opportunity to present our work on bulk-parallel priority queues for external memory at the 14th International Symposium on Experimental Algorithms (SEA 2015) in Paris. Our paper is in the proceedings and also available here: paper-SEA15-Bulk-Parallel-Priority-Queue.pdf
The talk was given by Thomas Keh, and the slides of the presentation are available online: 2015-06-29 A Bulk-Parallel Priority Queue in External Memory with STXXL.pdf . The implementation is available in the current master branch of STXXL at github.
Posted on 2015-04-03 14:54 by Timo Bingmann at Permlink with 0 Comments. Tags: research stxxl
Today, our technical report on "A Bulk-Parallel Priority Queue in External Memory with STXXL" is now available on arXiv as 1504.00545 or locally: 1504.00545v1.pdf with source 1504.00545v1.tar.gz
(130 KiB). A big thanks goes to Thomas Keh for the hard work he did in his bachelor thesis, and to Peter Sanders for all the insights into priority queues. The technical report is an extended version of our paper that will appear at the 14th International Symposium on Experimental Algorithms - SEA 2015.
The bulk-parallel priority queue is current available in the development repository of STXXL on Github.
On 2015-06-29, Thomas Keh presented the PPQ at SEA'15 in Paris.
Posted on 2014-10-29 11:17 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ university stxxl
STXXL is an implementation of the C++ standard template library STL for external memory (out-of-core) computations, i.e., STXXL implements containers and algorithms that can process huge volumes of data that only fit on disks. While the compatibility to the STL supports ease of use and compatibility with existing applications, another design priority is high performance.
More history about STXXL can be found in the blog post to 1.4.0. Today, the second release of the new 1.4 branch was published:
What's new in 1.4.1 ?
linuxaio
"), which exploits Native Command Queuing (NCQ) if available.stxxl::unordered_map
branch, which provides a hash map backed by external memory.struct default_completion_handler
with a NULL
pointer, thus avoiding superfluous new/delete work for each I/O request.stxxl::external_shared_ptr
which is a proxy class to allow use of shared_ptr classes inside stxxl containers.atomic_counted_object
in class file for request reference counting.See the CHANGELOG for further minor changes.
Posted on 2014-06-22 20:00 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ stxxl talk frontpage
This is a recording of a talk that I gave last week at the 3rd LSDMA Topical Meeting in Berlin. The talk covers a basic introduction into the STXXL library's features and contains many short code examples that serve as a tutorial.
The slides of the presentation 2014-06-22 STXXL 1.4.0 and Beyond.pdf are available online and the recorded talk can be seen on Youtube video below.
And here is the video of the recording: https://www.youtube.com/watch?v=UswxcAOJKBE
Posted on 2013-12-12 16:50 by Timo Bingmann at Permlink with 0 Comments. Tags: c++ university stxxl
STXXL is an implementation of the C++ standard template library STL for external memory (out-of-core) computations, i.e., STXXL implements containers and algorithms that can process huge volumes of data that only fit on disks. While the compatibility to the STL supports ease of use and compatibility with existing applications, another design priority is high performance.
The project was originally created by Roman Dementiev and Peter Sanders at MPI Informatik in Saarbrücken. It moved to Karlsruhe with them in 2004. After Roman's PhD defense, there was a cooperation with the Algorithm Engineering group at the University of Frankfurt to create better parallel asynchronous sorting. Afterwards, stewardship moved to Frankfurt, where work on flash/SSD drives and various external memory graph algorithms was done.
After a longer stretch without further work, I have decided to take part in future development as a maintainer. This is partly due to my previous experience with it while implement in eSAIS, the external memory suffix and LCP array construction algorithm. And thus, today, the first release of the new 1.4 branch was published:
What's new in 1.4.0 ?
stxxl::sequence
and stxxl::sorter
stxxl::queue
and stxxl::priority_queue
Posted on 2012-11-19 15:49 by Timo Bingmann at Permlink with 2 Comments. Tags: research stringology stxxl c++
This web page accompanies our conference paper "Inducing Suffix and LCP Arrays in External Memory", which we presented at the Workshop on Algorithm Engineering and Experiments (ALENEX 2013). A PDF of the publication is available from this site as alenex13esais.pdf or from the online proceedings. The paper was joint work with my colleagues Johannes Fischer and Vitaly Osipov.
The slides to my presentation of the paper on January 7th, 2013 in New Orleans, LA, USA is available: alenex13esais-slides.pdf . They contain little text and an example of the eSAIS algorithm with a simplified PQ.
We have also submitted a full version of the eSAIS paper to a journal. Due to long publication cycles, we make a pre-print of the journal article is available here: esais-preprint.pdf . The full paper contains more details on the inducing algorithm for the LCP array and additional experimental details.
Our implementations of eSAIS, the eSAIS-LCP variants, DC3 and DC3-LCP algorithms as described in the paper are available below under the GNU General Public License v3 (GPL).
eSAIS and DC3 with LCP version 0.5.4 (current) updated 2013-12-13 | ||
Source code archive: (includes STXXL 1.4.0) | eSAIS-DC3-LCP-0.5.4.tar.bz2 | Browse online |
Git repositories | Suffix and LCP construction algorithmsgit clone https://github.com/bingmann/eSAIS cd eSAIS; git submodule init; git submodule update | |
STXXL 1.4.0git clone https://github.com/stxxl/stxxl |
For more information about compiling and testing the implementation, please refer to the README included in the source.