panthema / 2014
STXXL simple logo

Released STXXL 1.4.1

Posted on 2014-10-29 11:17 by Timo Bingmann at Permlink with 0 Comments. Tags: #c++ #university #stxxl

STXXL is an implementation of the C++ standard template library STL for external memory (out-of-core) computations, i.e., STXXL implements containers and algorithms that can process huge volumes of data that only fit on disks. While the compatibility to the STL supports ease of use and compatibility with existing applications, another design priority is high performance.

More history about STXXL can be found in the blog post to 1.4.0. Today, the second release of the new 1.4 branch was published:

What's new in 1.4.1 ?

  • Integrated support for kernel based asynchronous I/O on Linux (new file type "linuxaio"), which exploits Native Command Queuing (NCQ) if available.
  • Merged stxxl::unordered_map branch, which provides a hash map backed by external memory.
  • Replaced struct default_completion_handler with a NULL pointer, thus avoiding superfluous new/delete work for each I/O request.
  • Added stxxl::external_shared_ptr which is a proxy class to allow use of shared_ptr classes inside stxxl containers.
  • Fixing bugs and warnings on 32-bit systems (yes, they still exist).
  • Use atomic_counted_object in class file for request reference counting.
  • Adding support for MinGW-w64 (64-bit) systems with working SJLJ thread implementations.

See the CHANGELOG for further minor changes.

Screenshot of YouTube View Counter of the "15 Sorting Algorithms in 6 Minutes"

1.000.000 Views of Sound of Sorting YouTube Video

Posted on 2014-10-26 17:30 by Timo Bingmann at Permlink with 1 Comments. Tags: #sorting #sound of sorting #frontpage

Some time last week my YouTube video "15 Sorting Algorithms in 6 Minutes" reached 1 million views. The original video was uploaded on 2013-05-20 which is just 524 days ago, so on average about every 45.2 seconds someone started to watch the video (which itself is about 6 minutes long). The world is a really big place. Most of the views, however, occured in spikes of interest, as seen in the graph below.

The original video contained only 15 algorithms. The program "Sound of Sorting" itself, which was used to create the animations, now contains 30 algorithms or variants. For some of the additional algorithms I also created videos on YouTube, which are also worthwhile watching.

There are two parallel algorithms based on sorting networks, where each left-to-right sweep is one parallel sorting step:

An algorithm, which is optimal in terms of the number of writes to sort the array:

An adaptive sorting algorithm which detects presorted areas, and is used in many modern programming language runtime libraries:

An in-place stable mergesort with O(1) extra space:

And two algorithms, that have a high asymptotic worst-case complexity, and are thus more a joke, but still pretty to watch:

This article continues on the next page ...

Algorithm schema of Recurse Last Parallel Multiway Mergesort

Practical Massively Parallel Sorting -- Basic Algorithmic Ideas

Posted on 2014-10-24 22:30 by Timo Bingmann at Permlink with 0 Comments. Tags: #research #massive-sorting

Last Friday we were able to finish a technical report on "Practical Massively Parallel Sorting -- Basic Algorithmic Ideas", which is now available on arXiv as 1410.6754 or locally: 1410.6754v1.pdf 1410.6754v1.pdf with source 1410.6754v1.tar.gz 1410.6754v1.tar.gz (130 KiB). A big thanks goes to all the other authors, Michael Axtmann, Peter Sanders, and Christian Schulz.

The report proposes two new hierarchical algorithms "Recurse Last Mergesort" (RLM-Sort) and "Adaptive Multipass Sample Sort" (AMS-Sort), together with randomized data exchange schemes to avoid worst case behavior on large instances.

Download 1410.6754v1.pdf

We now plan to implement some of the new algorithms.


To obtain sorting algorithms that scale to the largest available machines, conventional parallel sorting algorithms cannot be used since they either have prohibitive communication volume or prohibitive critical path length for computation. We outline ideas how to combine a number of basic algorithmic techniques which overcome these bottlenecks.

Short screencast from xubuntu.

Recording Frame-Perfect, High-Resolution Screencasts on Linux in the Year 2014

Posted on 2014-06-30 20:35 by Timo Bingmann at Permlink with 2 Comments. Tags: #vncrec

Last week I produced a recording of a talk about "STXXL 1.4.0 and Beyond". I thought it would be trivial in the year 2014 to create a frame-perfect high-resolution recording of a PDF presentation slideshow with moving mouse cursor on Linux. I was so wrong.

In the end, I created a patched version of vncrec: now called "vncrec-rgb 0.4", and the tutorial on the vncrec-rgb webpage. But now, why do screen recording with vncrec?

This article continues on the next page ...

First slide of the talk showing the inside of a hard disk

Recording of a Talk "STXXL 1.4.0 and Beyond"

Posted on 2014-06-22 20:00 by Timo Bingmann at Permlink with 0 Comments. Tags: #c++ #stxxl #talk #frontpage

This is a recording of a talk that I gave last week at the 3rd LSDMA Topical Meeting in Berlin. The talk covers a basic introduction into the STXXL library's features and contains many short code examples that serve as a tutorial.

The slides of the presentation 2014-06-22 STXXL 1.4.0 and Beyond.pdf 2014-06-22 STXXL 1.4.0 and Beyond.pdf are available online and the recorded talk can be seen on Youtube video below.

Download 2014-06-22 STXXL 1.4.0 and Beyond.pdf

And here is the video of the recording:

This article continues on the next page ...

Ternary search tree used in parallel super scalar string sample sort and LCP-aware tournament tree

Released parallel-string-sorting 0.6
including Parallel Super Scalar String Sample Sort and Parallel Multiway LCP-Mergesort

Posted on 2014-03-09 10:20 by Timo Bingmann at Permlink with 0 Comments. Tags: #c++ #parallel-string-sorting #sorting

This short post announces the second public version of our parallel string sorting project. It is a test framework and algorithm collection containing most sequential and parallel string sorting implementations.

The collection includes parallel super scalar string sample sort (pS5), which we developed and showed to have the highest parallel speedups on modern single-socket multi-core shared memory systems. Additionally, the collection now contains parallel multiway LCP-mergesort, which can be used to speed up string sorting on NUMA multi-socket machines.

See the parallel-string-sorting project page for our technical report and more information about version 0.6.