panthema / 2015
First slide of the talk

Presentation "Massive Suffix Array Construction with Thrill" at DFG SPP 1736 Annual Colloquium

Posted on 2015-10-01 19:40 by Timo Bingmann at Permlink with 0 Comments. Tags: #c++ #talk #thrill

Today, we gave an overview presentation of the vision behind Project Thrill, its current state, and how it will be used to implement suffix and LCP array construction as well as many other distributed algorithms: 2015-10-01 Massive Suffix Array Construction with Thrill.pdf.
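For readers unfamiliar with the terms: the suffix array of a text is the lexicographically sorted sequence of the starting positions of all its suffixes, and the LCP array stores the length of the longest common prefix of lexicographically adjacent suffixes. The following tiny sequential C++ sketch only illustrates these definitions (a naive quadratic construction, not the distributed algorithm from the talk):

    #include <algorithm>
    #include <iostream>
    #include <numeric>
    #include <string>
    #include <vector>

    int main() {
        std::string text = "banana";
        size_t n = text.size();

        // suffix array: starting positions of all suffixes, sorted
        // lexicographically (naive comparison sort for illustration only)
        std::vector<size_t> sa(n);
        std::iota(sa.begin(), sa.end(), 0);
        std::sort(sa.begin(), sa.end(), [&](size_t a, size_t b) {
            return text.compare(a, n - a, text, b, n - b) < 0;
        });

        // LCP array: lcp[i] = length of the longest common prefix of the
        // lexicographically adjacent suffixes sa[i-1] and sa[i]
        std::vector<size_t> lcp(n, 0);
        for (size_t i = 1; i < n; ++i) {
            size_t a = sa[i - 1], b = sa[i], k = 0;
            while (a + k < n && b + k < n && text[a + k] == text[b + k]) ++k;
            lcp[i] = k;
        }

        for (size_t i = 0; i < n; ++i)
            std::cout << sa[i] << '\t' << lcp[i] << '\t'
                      << text.substr(sa[i]) << '\n';
    }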


Thumbnail of a small ternary search tree used for classification, and an LCP-aware tournament tree.

Publication: Engineering Parallel String Sorting in Algorithmica Journal

Posted on 2015-09-22 15:46 by Timo Bingmann at Permlink with 0 Comments. Tags: #research #sorting #university

Our paper "Engineering Parallel String Sorting" was accepted for publication in Springer's Algorithmica Journal, and is available online at http://dx.doi.org/10.1007/s00453-015-0071-1.

Compared to our older technical report, the journal edition contains many (minor and major) corrections, additional references, and better explanations.

A pre-print version is available here: Engineering-Parallel-String-Sorting.pdf.


Please refer to the main parallel-string-sorting page for details and source code.


emacs at work, editing the file that turns into this webpage.

emacs Tutorial: Beating the Learning Curve - From Zero to Lightspeed

Posted on 2015-08-19 13:21 by Timo Bingmann at Permlink with 2 Comments. Tags: #talk

At our institute I gave an ambitious presentation today which showcased much of my daily workflow in emacs. People have dubbed it the "emacs lightshow" due to the speed of flashing and changing lights on my screens. Of course, the idea is to show people what emacs can do and at the same time get them to try it. Emacs is different from other editors in that it is a live operating system: infinitely complex, constantly extended, and adapting to what you need.

For this presentation I made an emacs tutorial. There are many emacs tutorials online, and they are probably better than this one. However, I focused on listing something like the top 100 key sequences that you need in real day-to-day editing, instead of the flashiest features. This makes it a tutorial that you can print out and go through step by step once to try everything out, and then start over and learn the most important keys from the top.

The emacs tutorial is available as a PDF: emacs Tutorial - Beating the Learning Curve - From Zero to Lightspeed.pdf


or as the org-mode source file: emacs Tutorial - Beating the Learning Curve - From Zero to Lightspeed.org

and in my GitHub .emacs.d repository: https://github.com/bingmann/dot-emacs.git


First slide of the talk showing priority queues at the airport

Presentation of Parallel Priority Queue at the Conference SEA'2015

Posted on 2015-06-30 17:11 by Timo Bingmann at Permlink with 0 Comments. Tags: #c++ #talk #stxxl

We are very glad to have been given the opportunity to present our work on bulk-parallel priority queues for external memory at the 14th International Symposium on Experimental Algorithms (SEA 2015) in Paris. Our paper is in the proceedings and also available here: paper-SEA15-Bulk-Parallel-Priority-Queue.pdf

The talk was given by Thomas Keh, and the slides of the presentation are available online: 2015-06-29 A Bulk-Parallel Priority Queue in External Memory with STXXL.pdf. The implementation is available in the current master branch of STXXL on GitHub.


Abstract

We propose the design and an implementation of a bulk-parallel external memory priority queue to take advantage of both shared-memory parallelism and high external memory transfer speeds to parallel disks. To achieve higher performance by decoupling item insertions and extractions, we offer two parallelization interfaces: one using "bulk" sequences, the other by defining "limit" items. In the design, we discuss how to parallelize insertions using multiple heaps, and how to calculate a dynamic prediction sequence to prefetch blocks and apply parallel multiway merge for extraction. Our experimental results show that in the selected benchmarks the priority queue reaches 64% of the full parallel I/O bandwidth of SSDs and 49% of rotational disks, or the speed of sorting in external memory when bounded by computation.
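To make the two central ideas concrete, here is a small self-contained shared-memory toy in C++ (my own illustration, not the STXXL code; no external memory, no prefetching, and no "bulk"/"limit" interface): every thread inserts into its own insertion heap, and extraction performs a k-way merge over the heap minima.

    #include <functional>
    #include <iostream>
    #include <limits>
    #include <queue>
    #include <thread>
    #include <vector>

    int main() {
        const int num_threads = 4;
        const int items_per_thread = 1000;

        // one min-heap per thread: parallel insertions need no synchronization
        using MinHeap =
            std::priority_queue<int, std::vector<int>, std::greater<int>>;
        std::vector<MinHeap> heaps(num_threads);

        std::vector<std::thread> workers;
        for (int t = 0; t < num_threads; ++t) {
            workers.emplace_back([&, t] {
                for (int i = 0; i < items_per_thread; ++i)
                    heaps[t].push(t + num_threads * i);
            });
        }
        for (auto& w : workers) w.join();

        // extraction: k-way merge over the current heap minima (sequential
        // here; the paper instead merges sorted external blocks with a
        // parallel multiway merge and prefetches the blocks predicted next)
        long long extracted = 0;
        int last = std::numeric_limits<int>::min();
        while (true) {
            int best = -1;
            for (int t = 0; t < num_threads; ++t) {
                if (heaps[t].empty()) continue;
                if (best == -1 || heaps[t].top() < heaps[best].top()) best = t;
            }
            if (best == -1) break;  // all heaps drained
            int value = heaps[best].top();
            heaps[best].pop();
            if (value < last) std::cerr << "order violated\n";
            last = value;
            ++extracted;
        }
        std::cout << "extracted " << extracted
                  << " items in nondecreasing order\n";
    }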


Two figures from the technical report

A Bulk-Parallel Priority Queue in External Memory with STXXL

Posted on 2015-04-03 14:54 by Timo Bingmann at Permlink with 0 Comments. Tags: #research #stxxl

Our technical report "A Bulk-Parallel Priority Queue in External Memory with STXXL" is now available on arXiv as 1504.00545, or locally: 1504.00545v1.pdf with source 1504.00545v1.tar.gz (130 KiB). A big thanks goes to Thomas Keh for the hard work he did in his bachelor thesis, and to Peter Sanders for all the insights into priority queues. The technical report is an extended version of our paper that will appear at the 14th International Symposium on Experimental Algorithms - SEA 2015.


The bulk-parallel priority queue is currently available in the development repository of STXXL on GitHub.

On 2015-06-29, Thomas Keh presented the PPQ at SEA'15 in Paris.

Abstract

We propose the design and an implementation of a bulk-parallel external memory priority queue to take advantage of both shared-memory parallelism and high external memory transfer speeds to parallel disks. To achieve higher performance by decoupling item insertions and extractions, we offer two parallelization interfaces: one using "bulk" sequences, the other by defining "limit" items. In the design, we discuss how to parallelize insertions using multiple heaps, and how to calculate a dynamic prediction sequence to prefetch blocks and apply parallel multiway merge for extraction. Our experimental results show that in the selected benchmarks the priority queue reaches 75% of the full parallel I/O bandwidth of rotational disks and 65% of SSDs, or the speed of sorting in external memory when bounded by computation.

First slide of the talk showing sparks forming a C++

Presentation of DALKIT (work in progress) in Berlin

Posted on 2015-03-27 21:00 by Timo Bingmann at Permlink with 0 Comments. Tags: #c++ #talk #thrill

Today, I presented our work in progress on a distributed computation platform for Big Data algorithms at the LSDMA All-Hands-Meeting in Berlin. One of the currently proposed names is DALKIT. The talk covers the current state of our student project, which so far consists mainly of the design of the framework's interface, architecture, and future components.
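To give an idea of the interface style we are designing towards, here is a self-contained single-machine toy in C++ (hypothetical names, not the actual DALKIT/Thrill API, and with none of the distribution machinery): a chainable "distributed immutable array" with Map, Filter, and a summing reduction.

    #include <iostream>
    #include <numeric>
    #include <utility>
    #include <vector>

    template <typename T>
    class Array {  // stand-in for a distributed immutable array
    public:
        explicit Array(std::vector<T> items) : items_(std::move(items)) {}

        template <typename F>
        auto Map(F f) const {  // apply f to every item, yielding a new Array
            std::vector<decltype(f(std::declval<T>()))> out;
            out.reserve(items_.size());
            for (const T& x : items_) out.push_back(f(x));
            return Array<decltype(f(std::declval<T>()))>(std::move(out));
        }

        template <typename F>
        Array<T> Filter(F keep) const {  // keep only items where keep(x) holds
            std::vector<T> out;
            for (const T& x : items_) if (keep(x)) out.push_back(x);
            return Array<T>(std::move(out));
        }

        T ReduceSum() const {  // an "action" that would trigger execution
            return std::accumulate(items_.begin(), items_.end(), T());
        }

    private:
        std::vector<T> items_;
    };

    int main() {
        std::vector<int> input(100);
        std::iota(input.begin(), input.end(), 1);  // 1..100

        // sum of squares of the even numbers, written as a chained dataflow
        int result = Array<int>(input)
                         .Filter([](int x) { return x % 2 == 0; })
                         .Map([](int x) { return x * x; })
                         .ReduceSum();

        std::cout << "result = " << result << "\n";  // prints 171700
    }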

The slides of the presentation are available online: 2015-03-27 Project DALKIT.pdf. However, as usual, my slides are very difficult to understand without the audio track. For future presentations of "final" versions there will probably also be videos.
