panthema (page 1 of 1 2 3 4 5 6 7 8 9 10)

Welcome to

This website is a diverse collection of interesting ideas, thus it is panthematic. It contains free open-source software and projects (FOSS), computer science research results, blog articles and more, all created by myself, Timo Bingmann. Over the years, the amount of information, source code and other content has grown rather large. All entries are ordered chronologically in the weblog, with some special projects highlighted in the following summary:

Open-Source Projects

Thrill (in development)
A C++ framework for distributed Big Data computations with emphasis on high performance and a convenient interface like Apache Spark or Flink.
Library of Advanced C++ Data Structures and Algorithms
Library of external memory algorithms, including external block paging and efficient external sorting
malloc_count 0.7.1
Simple tool for run-time memory usage analysis and profiling under Linux
vncrec-rgb 0.4
Patched vncrec for frame-perfect high-resolution screencasts.
The Sound of Sorting 0.6.5
Viral "audibilization" and visualization of sorting algorithms
Automatically import key=value experimental RESULTS into SQL tables and generate plots from them.
pmbw 0.6.2
Benchmark tool for parallel memory bandwidth / measurement under Linux
disk-filltest 0.7.1
Tool to detect bad disks by filling with random data
STX B+ Tree 0.9
Main memory B+ tree implementation with STL compatible interfaces
digup 0.6.40
Console tools to verify file integrity by updating MD5 and SHA digest files
CryptoTE 0.5.390
Text editor with transparent strong encryption, useful for password lists and more.
STX ExecPipe 0.7.1
Convenient C++ interface to execute child programs connected via Unix pipes.
Flex Bison C++ Example 0.1.4
Example of using GNU flex and bison in a C++ program.

Computer Science Research

External Memory Algorithms
Additionally to maintaining STXXL, we also developed a bulk-parallel priority queue for EM.
Parallel String Sorting
Experimental implementations of many string sorting algorithms, including Parallel Super Scalar String Sample Sort (pS5) and Parallel Multiway LCP-Mergesort
External Memory Suffix Sorting
Experimental implementation of eSAIS and DC3, two suffix and LCP array construction algorithms for external memory, using STXXL.

Miscellaneous Weblog Posts

2014-10-26  1.000.000 Views of Sound of Sorting YouTube Video
2014-06-22  Recording of a Talk "STXXL 1.4.0 and Beyond"
2013-10-24  Sound of Sorting: Viral Video on KIT Informatik Webpage
2013-05-06  STX B+ Tree Speed Test Measurements on Raspberry Pi (Model B)
2013-05-05  STX B+ Tree Measuring Memory Usage with malloc_count
2013-01-24  Coding Tricks 101: How to Save the Assembler Code Generated by GCC
2008-09-01  C++ Code Snippet - Print Stack Backtrace Programmatically with Demangled Function Names
2007-03-28  C++ Code Snippet - Compressing STL Strings with zlib


First slide of the talk

Presentation "Scalable Construction of Text Indexes with Thrill" at IEEE Big Data 2018

Posted on 2018-12-12 16:00 by Timo Bingmann at Permlink with 0 Comments. Tags: talk thrill

Today, I gave a presentation of our paper "Scalable Construction of Text Indexes with Thrill" at the IEEE International Conference on Big Data 2018 in Seattle, WA, USA.

The slides of the presentation at the IEEE conference are available here:
slides-Scalable-Construction-of-Text-Indexes-with-Thrill.pdf slides-Scalable-Construction-of-Text-Indexes-with-Thrill.pdf.

The full paper is available from this webpage: paper-Scalable-Construction-of-Text-Indexes-with-Thrill.pdf paper-Scalable-Construction-of-Text-Indexes-with-Thrill.pdf
or refer to the longer version in my dissertation on scalable suffix array construction.

Download Scalable-Construction-of-Text-Indexes-with-Thrill.pdf


The suffix array is the key to efficient solutions for myriads of string processing problems in different application domains, like data compression, data mining, or bioinformatics. With the rapid growth of available data, suffix array construction algorithms have to be adapted to advanced computational models such as external memory and distributed computing. In this article, we present five suffix array construction algorithms utilizing the new algorithmic big data batch processing framework Thrill, which allows scalable processing of input sizes on distributed systems in orders of magnitude that have not been considered before.

Print Quality of Print-On-Demand Books from Amazon Createspace/KDP,, and Ingram Spark

Posted on 2018-11-05 20:00 by Timo Bingmann at Permlink with 0 Comments. Tags: dissertation

Having just finished my PhD thesis in computer science (see the corresponding dissertation), I ventured to actually print it as a proper book. In this article I want to share some of my experience with three print-on-demand book publishers:

Disclaimer: these are my experiences, yours will probably be different. And I hope print-on-demand quality will improve further in the future.

Below are macro photographs of the various print proof and other copies I received from the publishers. The photographs were taken with a Samsung Galaxy S9 smartphone with a (cheap) macro lens. Hence the colors and blur in the photos should be considered with caution, but the sharpness and detail level is sufficient for some discussion.

TL;DR: Amazon's print proof from the USA has the nicest print and color, but they only produce paperback covers. IngramSpark's prints are second best, have a lower resolution, but they produce hardcovers and consistent quality.

Amazon Print Proof from the USA


Uploading the PDF to Amazon CreateSpace is straight-forward due to the convenient web interface. The Amazon CreateSpace print proof was manufactured in Lexington, KY, USA, it was shipped three days after ordering, and arrived eleven days after ordering. For an international shipment from the USA to Germany that is a very acceptable delivery time.

The print quality of the Amazon Print Proof was in my option the best. It has the highest resolution, bright colors, and solid black. The paper and entire book has the feel of high-quality color laser printer output. Sadly, they the only produce paperback softcover books and I preferred a hardcover. On the plus side, publishing via Amazon gives you a free CreateSpace-ISBN and it is immediately listed in the world-wide Amazon catalog.

Original PDF


As a comparison to the printed books, the pictures above show the same excepts rendered as a bitmap image. Note that the layout of the words in each cover image may differ, because each cover needed to by typeset individually to the individual specifications of the publisher.

This blog entry continues on the next page ...

First slide of the talk

Tutorial on Boost.Spirit at C++ User Group Karlsruhe

Posted on 2018-09-12 19:30 by Timo Bingmann at Permlink with 0 Comments. Tags: talk c++ parsing

On September 12th, 2018, I gave another 90min talk with live-coding examples in German at the C++ User Group Karlsruhe in rooms of the Karlsruhe Institute of Technology (KIT).

This time I was asked to present a more advanced topic around C++ and libraries and I chose to present a tutorial on Boost.Spirit.

Boost.Spirit is a parser and generator template meta-programming framework and maybe one of the most crazy and advanced uses of C++. It enables one to write context-free grammars inline as C++ code, which are translated into recursive descent parsers and fully optimized by the compiler.

This powerful framework is however not easy to get started with. I hope my tutorial helps more people to skip the steep learning curve and use Boost.Spirit for securely parsing user input and other structure data.

The tutorial consisted of a set of introduction slides: slides-2018-09-12-Cpp-Meetup.pdf slides-2018-09-12-Cpp-Meetup.pdf. Followed by a live-coding session in German which was recorded by the KIT (see below for the youtube video).

Download slides-2018-09-12-Cpp-Meetup.pdf

The extensive code examples presented in the live coding session are available on this webpage
or on github:

The examples can be seen as instructive templates and copy & paste sources for new development. The examples are:

  1. Learn to walk and parse simple integers and lists.
    Parse 5, [5, 42, 69, 256].
  2. Create a parser for a simple arithmetic grammar (and part two).
    Parse 5 + 6 * 9 + 42 and evaluate correctly.
  3. Parse CSV data directly into a C++ struct.
    Parse AAPL;Apple;252.50; into a struct.
  4. Create an abstract syntax tree (AST) from arithmetic (and part two).
    Parse y = 6 * 9 + 42 * x and evaluate with variables.
  5. Ogle some more crazy examples, e.g. how to parse.
    <h1>Example for <b>C++ HTML Parser<b></h1>
    This HTML <b>snippet</b> parser can also interpret
    *Markdown* style and enables additional tags
    to <% invoke("C++", 42) %> functions.
Furthermore, a recording of the live-coding in German is available on Youtube:

Book Cover of Dissertation

Dissertation "Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools"

Posted on 2018-07-03 18:00 by Timo Bingmann at Permlink with 1 Comments. Tags: talk university dissertation

The road to a Dr. title (PhD in the Anglo-Saxon world) is often long, rough, and twisted. First you have to do original research, produce novel results, publish articles, and then write and ultimately publish a dissertation. Defending your research in the dissertation in front of a panel of professors is one of the final milestones on that journey.

On July 3rd, I successfully defended my dissertation at the Karlsruhe Institute of Technology and the dissertation text has now been published as a book.

Published Dissertation

My final dissertation PDF is available here: dissertation-Bingmann-Scalable-String-and-Suffix-Sorting.pdf dissertation-Bingmann-Scalable-String-and-Suffix-Sorting.pdf.

It has also been published at the KIT library, on arXiv:1808.00963, and finally as a print-on-demand paperback from Amazon.

The published book cover's background shows a list of most common words in the Wikipedia. The words are sorted and their distinguishing prefix is marked in blue. The cover is available as a double-page PDF: dissertation-cover.pdf dissertation-cover.pdf, and as front and back separately: dissertation-cover-front.pdf dissertation-cover-front.pdf and dissertation-cover-back.pdf dissertation-cover-back.pdf.

The LaTeX source code for my dissertation is available for download: (791 KiB). The complete text is in one .tex file, all figures (except the creative commons logo) are generated from the LaTeX code.

Dissertation Defense Presentation

The slides of my presentation during the defense are available for download here: dissertation-defense-slides.pdf dissertation-defense-slides.pdf.

The presentation was only part of the whole defense. My actual slide set also had almost 200 more backup slides, which however were collected from all the other talks already available on this homepage. These backup slides helped greatly in the examination question after the presentation.

Download dissertation-defense-slides.pdf
This blog entry continues on the next page ...

TLX Logo

Note about the new tlx library of Advanced C++ Data Structures and Algorithms

Posted on 2018-05-28 18:20 by Timo Bingmann at Permlink with 0 Comments. Tags: c++

Last year on February 19th, I started a new github repository called tlx with the goal of de-duplicating code from three projects: Thrill, STXXL, and a private project. The idea came up after a STXXL code workshop in Frankfurt (fashionably called hackathons nowadays).

Link to library: and Doxygen Documentation

The first main common pieces of code were:

  1. the fast loser tree implementations from MCSTL by Johannes Singler necessary for efficient multiway merging,
  2. my die() macros for testing and run-time assertions,
  3. a common intrusive reference counter called counting_ptr, and
  4. simple but vital std::string manipulation functions missing from the STL.

The initial reason for tlx to come about was to consolidate all the bug fixes to the loser tree implementations that I had scattered across the three projects. Efficient multiway merging is such a fundamental task and there was no universally available C++ library that implements the tournament tree well.

A long search for an appropriate vacant user account with three letters on github lead to "tlx". This is definitely a good C++ namespace name, but to this day, it is unclear what the letters stand for. Template Libraries for CXX? The missing Library for CXX? Template Library and more eXtensions. Have your pick, someday someone will find a good official expansion.

Since its inception, tlx has grown a lot. Its goal is to consolidate algorithms and data structures from multiple projects. In a sense tlx maybe aims to be the Boost for advanced algorithms. The goals and constraints of tlx are:

Currently, tlx contains

And much more, which one can find on the front page of the Doxygen Documentation

First slide of the talk

Presentation "C++ Goodies" at C++ User Group Karlsruhe

Posted on 2017-03-08 00:00 by Timo Bingmann at Permlink with 0 Comments. Tags: talk c++

On March 8th, 2017, I gave a 90min talk in German at the C++ User Group Karlsruhe which consisted mostly of live-coding examples of how to use new C++11/14/17 features in a practical setting. The contents were:

The source code I wrote for the presentation and the slides
are available in a github repository

Furthermore, a recording of talk in German is available on Youtube:

First slide of the talk

Presentation "Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++" at IEEE Big Data 2016

Posted on 2016-12-06 16:00 by Timo Bingmann at Permlink with 0 Comments. Tags: talk thrill

Today, I gave a presentation of our paper "Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++" at the IEEE International Conference on Big Data 2016 in Washington D.C., USA. An extended technical report of our paper is also available on this website or on arXiv.

The slides of the presentation at the IEEE conference are available here:
slides-Thrill-High-Performance-Algorithmic-Distributed-Batch-Data-Processing-with-CPP-TalkAsGiven.pdf slides-Thrill-High-Performance-Algorithmic-Distributed-Batch-Data-Processing-with-CPP-TalkAsGiven.pdf.

Below a longer version of the slides is available for download:
slides-Thrill-High-Performance-Algorithmic-Distributed-Batch-Data-Processing-with-CPP.pdf slides-Thrill-High-Performance-Algorithmic-Distributed-Batch-Data-Processing-with-CPP.pdf.
These slides contain additional figures which are useful to understand the DIA operations in Thrill, along with many extra design slides omitted from shorter talks.

Download slides-Thrill-High-Performance-Algorithmic-Distributed-Batch-Data-Processing-with-CPP.pdf

First slide of the talk

Presentation "STXXL and Thrill (Parallel Batch Processing)" at STXXL Workshop in DFG SPP 1736

Posted on 2016-09-21 20:00 by Timo Bingmann at Permlink with 0 Comments. Tags: talk thrill stxxl

Today, I gave a technical presentation comparing STXXL and Project Thrill at the STXXL Workshop organized within the DFG SPP 1736. The main topic of the workshop was to determine the future development course of STXXL, and the biggest question in this regard was how to bring more multi-core parallelization into STXXL. Thrill or an adaptation of its ideas may be the solution to this challenge: 2016-09-21 STXXL and Thrill Slides.pdf 2016-09-21 STXXL and Thrill Slides.pdf.

Download 2016-09-21 STXXL and Thrill Slides.pdf

A figure from the technical report

Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++

Posted on 2016-08-20 09:54 by Timo Bingmann at Permlink with 0 Comments. Tags: research c++ thrill

Our technical report on "Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++" is now available on arXiv as 1608.05634 or locally: 1608.05634v1.pdf 1608.05634v1.pdf with source 1608.05634v1.tar.gz 1608.05634v1.tar.gz (780 KiB).

This report is the first technical documentation about our new distributed computing prototype called Thrill. Thrill is written in modern C++14, and open source under the BSD-2 license. More information on Thrill is available from the project homepage.

Thrill's source is available from Github.

Download 1608.05634v1.pdf


We present the design and a first performance evaluation of Thrill -- a prototype of a general purpose big data processing framework with a convenient data-flow style programming interface. Thrill is somewhat similar to Apache Spark and Apache Flink with at least two main differences. First, Thrill is based on C++ which enables performance advantages due to direct native code compilation, a more cache-friendly memory layout, and explicit memory management. In particular, Thrill uses template meta-programming to compile chains of subsequent local operations into a single binary routine without intermediate buffering and with minimal indirections. Second, Thrill uses arrays rather than multisets as its primary data structure which enables additional operations like sorting, prefix sums, window scans, or combining corresponding fields of several arrays (zipping).

We compare Thrill with Apache Spark and Apache Flink using five kernels from the HiBench suite. Thrill is consistently faster and often several times faster than the other frameworks. At the same time, the source codes have a similar level of simplicity and abstraction.

Sketch of a edge-split-attach operation, which breaks a unique exchange from e.

On the Structure of the Graph of Unique Symmetric Base Exchanges of Bispanning Graphs - Diploma Thesis in Mathematics

Posted on 2016-01-14 18:30 by Timo Bingmann at Permlink with 0 Comments. Tags: maths university

After a long exhausting period with many interruptions my diploma thesis in mathematics "On the Structure of the Graph of Unique Symmetric Base Exchanges of Bispanning Graphs" is finalized and submitted. The full abstract of the thesis is shown below, and an additional German abstract is available further down the page.

The final version of the thesis is available here: thesis pdf Bingmann-On-the-Structure-of-the-Graph-of-Unique-Symmetric-Base-Exchanges-of-Bispanning-Graphs.pdf, and was also uploaded to as 1601.03526.

The underlying problem discussed in the thesis is best explained using a game on a bispanning graph. You can play the game with this Java Applet or using the Java WebStart Launcher (if the Applet does not work). You play Alice's role and want to flip the colors of all edges in the graph. Bob will try to prevent this from happening.

In addition to the thesis itself, the source code of the accompanying computer program is also available. It was used while preparing the thesis to calculate exchanges graphs and to test many hypothesis about bispanning graphs. The program can enumerate all bispanning graphs and their exchange graphs for small numbers of vertices. See the program page for downloadable lists and PDFs of all bispanning graphs for small numbers of vertices.


Bispanning graphs are undirected graphs with an edge set that can be decomposed into two disjoint spanning trees. The operation of symmetrically swapping two edges between the trees, such that the result is a different pair of disjoint spanning trees, is called an edge exchange or a symmetric base exchange. The graph of symmetric base exchanges of a bispanning graph contains a vertex for every valid pair of disjoint spanning trees, and edges between them to represent all possible edge exchanges. We are interested in a restriction of these graphs to only unique symmetric base exchanges, which are edge exchanges wherein selecting one edge leaves only one choice for selecting the other. In this thesis, we discuss the structure of the graph of unique symmetric edge exchanges, and the open question whether these are connected for all bispanning graphs.

This abstract problem can be nicely rephrased into a coloring game with two players: Alice and Bob are given a bispanning graph colored with two disjoint spanning trees, and Alice gets to flip the color of any edge. This creates a cycle in one color and a cut in the other, and Bob must then flip a different edge to repair the constraint that both colors represent disjoint spanning trees. Alice's objective is to invert the color of all edges in the graph, and Bob's to prevent this. We are interested in whether Alice can find a sequence of unique edge exchanges for any bispanning graph, since these leave Bob no choice in which edge to select, hence allowing Alice to win with certainty.

In this thesis, we first define and discuss the properties of bispanning graphs in depth. Intuitively, these are locally dense enough to allow the two disjoint spanning trees to reach all vertices, but sparse enough such that disjoint edge sets do not contain cycles. The whole class of bispanning graphs can be inductively constructed using only two operations, which makes the class tractable for inductive proofs.

We then describe in detail directed, undirected, and simplified versions of edge exchange graphs, first with unrestricted edge exchanges, and then with the restriction to unique symmetric base exchanges. These exchange graphs are related to a set of conjectures put forth by White in 1980 on base exchanges in matroids, and also to conjectures on cyclic base orderings of matroids. To date, these conjectures have not been proven in full generality, despite overwhelming computational evidence.

As steps towards showing the conjecture that the graph of unique symmetric base exchanges is connected for all bispanning graphs, we prove a composition method to construct the unique exchange graph of any bispanning graph from the exchange graphs of smaller bispanning graphs. Furthermore, using a computer program developed alongside this thesis, we are able to enumerate and make statements about all small bispanning graphs and their exchanges graphs.

Our composition method classifies bispanning graphs by whether they contain a non-trivial bispanning subgraph, and by vertex and edge connectivity. For bispanning graphs containing a non-trivial bispanning subgraph, we prove that the unique exchange graph is the Cartesian graph product of two smaller exchange graphs. For 2-vertex-connected bispanning graphs, we show that the bispanning graph is the 2-clique sum of two smaller bispanning graphs, and that the unique exchange graph can be built by joining their exchange graphs and forwarding edges at the join seam. And for all remaining bispanning graphs, we prove a composition method at a vertex of degree three, wherein the unique exchange graph is constructed from the exchange graphs of three reduced bispanning graphs.

We conclude this thesis with ideas and evidence for future approaches to proving the connectivity of the unique exchange graphs and show the most difficult bispanning graphs instances.

This blog entry continues on the next page ...

Show Page: 1 2 3 4 5 6 7 8 9 10 Next >
RSS 2.0 Weblog Feed Atom 1.0 Weblog Feed Valid XHTML 1.1 Valid CSS (2.1)
Copyright 2005-2018 Timo Bingmann - Impressum