panthema / 2020 / 0518-distributed-string-sorting
Distributed Merge String Sort

IPDPS Paper "Communication-Efficient String Sorting" and Talk Recording

Posted on 2020-05-18 15:30 by Timo Bingmann at Permlink with 0 Comments. Tags: talk university

Due to the coronavirus this year's IPDPS conference is held in a virtual fashion, and we sadly missed a chance to visit New Orleans. Instead, I recorded a YouTube video of our conference talk, because the slides usually are illustrations requiring more explanation.

The paper "Communication-Efficient String Sorting", which I coauthored with Peter Sanders and Matthias Schimek, will still be published in the IEEE proceedings. A preprint of the full paper is available on arXiv:2001.08516 and also from this webpage:

2001.08516v1-Communication-Efficient-String-Sorting.pdf 2001.08516v1-Communication-Efficient-String-Sorting.pdf,

There are two versions of the slides: the longer presentation version slides-20200518-distributed-string-sorting-ipdps.pdf below used for the YouTube recording and a short ten page teaser version slides-20200518-distributed-string-sorting-ipdps-short.pdf for the virtual conference.

Download slides-20200518-distributed-string-sorting-ipdps.pdf

The source code and more documentation about the implementations of our communication-efficient distributed string sorting algorithms can be found on my GitHub repository https://github.com/bingmann/distributed-string-sorting.

Matthias Schimek's master thesis, on which this paper and presentation are based on, can be downloaded from this website 2019_Schimek_Distributed_String_Sorting_Algorithms.pdf as well.

Below you can watch the video recording of my presentation, or head over to YouTube: https://youtu.be/uWro8fsfs5I.

Abstract

There has been surprisingly little work on algorithms for sorting strings on distributed-memory parallel machines. We develop efficient algorithms for this problem based on the multi-way merging principle. These algorithms inspect only characters that are needed to determine the sorting order. Moreover, communication volume is reduced by also communicating (roughly) only those characters and by communicating repetitions of the same prefixes only once. Experiments on up to 1280 cores reveal that these algorithm are often more than five times faster than previous algorithms.


Post Comment
Name:
E-Mail or Homepage:
 

URLs (http://...) are displayed, e-mails are hidden and used for Gravatar.

Many common HTML elements are allowed in the text, but no CSS style.