<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule" xmlns:atom="http://www.w3.org/2005/Atom">
<channel><title>panthema.net - Timo's Weblog</title><link>http://panthema.net</link><description>panthema.net - RSS 2.0 Feed</description><language>en</language><copyright>Copyright 2005-2013, panthema.net</copyright><pubDate>Wed, 08 May 2013 12:47:00 +0200</pubDate><lastBuildDate>Mon, 20 May 2013 02:02:28 +0200</lastBuildDate><webMaster>tbrss@panthema.net (Timo Bingmann)</webMaster><atom:link href="http://panthema.net/xmlfeed/weblog-rss20.xml" rel="self" title="panthema.net Weblog Feed RSS 2.0" type="application/rss+xml"/><item><title>Released parallel-string-sorting 0.5 including Parallel Super Scalar String Sample Sort</title><link>http://panthema.net/2013/0508-parallel-string-sorting-0.5.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2013/parallel-string-sorting/&quot;&gt;&lt;img src=&quot;/2013/parallel-string-sorting/thumb.jpg&quot; width=&quot;300&quot; height=&quot;170&quot; alt=&quot;Ternary search tree used in parallel super scalar string sample sort&quot; title=&quot;Ternary search tree used in parallel super scalar string sample sort&quot; /&gt;&lt;/a&gt;&lt;/span&gt;Released parallel-string-sorting 0.5 including Parallel Super Scalar String Sample Sort&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-05-08 11:47 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0508-parallel-string-sorting-0.5.html&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0508-parallel-string-sorting-0.5.html#notes&quot;&gt;0 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/c++.html&quot;&gt;c++&lt;/a&gt; &lt;a href=&quot;/tags/parallel-string-sorting.html&quot;&gt;parallel-string-sorting&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;This short post announces the first &lt;strong&gt;public version&lt;/strong&gt; of our parallel string sorting project. 
It is a test framework and algorithm collection containing many sequential and parallel string sorting implementations.&lt;/p&gt;&lt;p&gt;The collection includes parallel super scalar string sample sort (pS&lt;sup&gt;5&lt;/sup&gt;), which we developed and showed to have the &lt;strong&gt;highest parallel speedups&lt;/strong&gt; on modern multi-core shared memory systems.&lt;/p&gt;See the &lt;a href=&quot;/2013/parallel-string-sorting/&quot;&gt;parallel-string-sorting project page&lt;/a&gt; for our technical report and more information about version 0.5. &lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0508-parallel-string-sorting-0.5.html</guid><pubDate>Wed, 08 May 2013 12:47:00 +0200</pubDate></item><item><title>Publishing STX B+ Tree 0.9 - Speed Gains over 0.8.6</title><link>http://panthema.net/2013/0507-STX-B+Tree-0.9/index.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2007/stx-btree/&quot;&gt;&lt;img src=&quot;/2007/stx-btree/btree-thumb-border.png&quot; width=&quot;252&quot; height=&quot;193&quot; alt=&quot;Small drawing of a B+ tree&quot; title=&quot;Small drawing of a B+ tree&quot; /&gt;&lt;/a&gt;&lt;/span&gt;Publishing STX B+ Tree 0.9 - Speed Gains over 0.8.6&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-05-07 21:16 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0507-STX-B+Tree-0.9/&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0507-STX-B+Tree-0.9/#notes&quot;&gt;0 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/stx-btree.html&quot;&gt;stx-btree&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;This blog post announces the new version 0.9 of my popular &lt;a href=&quot;/2007/stx-btree&quot;&gt;STX B+ Tree&lt;/a&gt; library of C++ templates, speedtests and demos. 
Since the last release in 2011, many patches and new ideas have accumulated. Here is a summary of the most prominent changes:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Changed the &lt;code&gt;find_lower()&lt;/code&gt; function, which is central to finding keys or insertion points, so that it no longer uses binary search for small node sizes. The reasoning behind this change is discussed in &lt;a href=&quot;/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/&quot;&gt;another blog post&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Added a &lt;code&gt;bulk_load()&lt;/code&gt; function to all &lt;code&gt;map&lt;/code&gt; and &lt;code&gt;set&lt;/code&gt; variants to construct a B+ tree from a pre-sorted iterator range. The sorted range is first copied into leaf nodes, over which the B+ tree inner nodes are then iteratively built.&lt;/li&gt;&lt;li&gt;The B+ tree template source code is now published under the &lt;a href=&quot;http://www.boost.org/users/license.html&quot;&gt;Boost Software License&lt;/a&gt;! Use it!&lt;/li&gt;&lt;li&gt;More minor changes are listed in the &lt;a href=&quot;/2007/stx-btree/#changelog&quot;&gt;ChangeLog&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;STX B+ tree version 0.9 is available &lt;a href=&quot;/2007/stx-btree&quot;&gt;from the usual project webpage&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The switch from binary search to linear scan, and all further patches and optimizations, call for a direct comparison of versions 0.8.6 and 0.9. 
Because of special optimizations to the &lt;code&gt;btree_set&lt;/code&gt; specializations, the following plots differentiate between &lt;code&gt;set&lt;/code&gt; and &lt;code&gt;map&lt;/code&gt;s.&lt;/p&gt;&lt;div style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;/2013/0507-STX-B+Tree-0.9/&quot;&gt;This blog entry continues on the next page ...&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0507-STX-B+Tree-0.9/index.html</guid><pubDate>Tue, 07 May 2013 22:16:00 +0200</pubDate></item><item><title>STX B+ Tree Speed Test Measurements on Raspberry Pi (Model B)</title><link>http://panthema.net/2013/0506-STX-B+Tree-on-Raspberry-Pi/index.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2013/0506-STX-B+Tree-on-Raspberry-Pi/&quot;&gt;&lt;img src=&quot;/2013/0506-STX-B+Tree-on-Raspberry-Pi/thumb.jpg&quot; width=&quot;300&quot; height=&quot;206&quot; alt=&quot;Photo of my Raspberry Pi Model B&quot; title=&quot;Photo of my Raspberry Pi Model B&quot; /&gt;&lt;/a&gt;&lt;/span&gt;STX B+ Tree Speed Test Measurements on Raspberry Pi (Model B)&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-05-06 09:48 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0506-STX-B+Tree-on-Raspberry-Pi/&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0506-STX-B+Tree-on-Raspberry-Pi/#notes&quot;&gt;0 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/stx-btree.html&quot;&gt;stx-btree&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;The &lt;a href=&quot;http://www.raspberrypi.org/&quot;&gt;Raspberry Pi&lt;/a&gt; is maybe one of the most hyped embedded system projects in the last year, and I also got myself one for experiments. People are doing amazing things with this Linux-in-a-box SoC. 
Doubtlessly, the popularity is due to the standardized platform and a large community forming around it, which makes fixing the many small problems with Linux on ARM systems feasible. For me, the Raspberry Pi is an alternative architecture on which to test my algorithms and libraries, which exhibits somewhat &lt;strong&gt;different characteristics&lt;/strong&gt; than the highly optimized desktop CPUs.&lt;/p&gt;&lt;p&gt;So I decided to run my &lt;a href=&quot;/2007/stx-btree/&quot;&gt;STX B+ Tree speed test&lt;/a&gt; on the Raspberry Pi Model B, because most people use the SoC for multimedia purposes and little other memory performance data is available. The B+ tree speed test gives &lt;strong&gt;insight into the platform&amp;apos;s overall memory and processing performance&lt;/strong&gt;, and thus yields a better assessment of how useful the system is for general purpose applications (unlike multimedia decoding). Most benchmarks focus solely on floating point or integer arithmetic, which alone are very poor indicators for overall system performance. 
The Raspberry Pi forums say it has performance similar to a &amp;quot;Pentium 2 with 300 MHz&amp;quot;, but that is for arithmetic.&lt;/p&gt;&lt;div style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;/2013/0506-STX-B+Tree-on-Raspberry-Pi/&quot;&gt;This blog entry continues on the next page ...&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0506-STX-B+Tree-on-Raspberry-Pi/index.html</guid><pubDate>Mon, 06 May 2013 10:48:00 +0200</pubDate></item><item><title>STX B+ Tree Measuring Memory Usage with malloc_count</title><link>http://panthema.net/2013/0505-STX-B+Tree-Memory-Usage/index.html</link><description>&lt;h1&gt;STX B+ Tree Measuring Memory Usage with malloc_count&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-05-05 09:44 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0505-STX-B+Tree-Memory-Usage/&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0505-STX-B+Tree-Memory-Usage/#notes&quot;&gt;2 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/stx-btree.html&quot;&gt;stx-btree&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;Within the next few days, a new version of my popular &lt;a href=&quot;/2007/stx-btree&quot;&gt;STX B+ Tree&lt;/a&gt; library will be released. In light of this imminent release, I created a &lt;strong&gt;memory profile&lt;/strong&gt; with my &lt;a href=&quot;/2013/malloc_count/&quot;&gt;&lt;code&gt;malloc_count&lt;/code&gt; tool&lt;/a&gt;, comparing the requirements of four different C++ maps with integer keys and values.&lt;/p&gt;&lt;p&gt;The test is really simple: create a map container, insert 8 Mi random integer key/value pairs, and destruct it. The memory profile shows the amount of memory over time as allocated via &lt;code&gt;malloc()&lt;/code&gt; or &lt;code&gt;new&lt;/code&gt;. 
The test encompasses the usual gcc STL&amp;apos;s &lt;strong&gt;&lt;code&gt;map&lt;/code&gt;&lt;/strong&gt; which is a red-black tree, the older &lt;strong&gt;&lt;code&gt;hash_map&lt;/code&gt;&lt;/strong&gt; from gcc&amp;apos;s STL extensions, the newer gcc C++ &lt;strong&gt;&lt;code&gt;tr1::unordered_map&lt;/code&gt;&lt;/strong&gt;, and of course the &lt;strong&gt;&lt;code&gt;stx::btree_map&lt;/code&gt;&lt;/strong&gt; with default configuration. As a reference, I also added the usual STL vector and deque (not map containers), to verify the plotting facilities.&lt;/p&gt;&lt;p&gt;To isolate &lt;strong&gt;heap fragmentation&lt;/strong&gt;, the profiler &lt;code&gt;fork()&lt;/code&gt;s separate process contexts before each run. To avoid problems with multiple equal random keys, the multimap variant of all containers is used. Here is the memory profile (also included in the STX B+ Tree tarball):&lt;/p&gt;&lt;p style=&quot;text-align: center&quot;&gt; &lt;a href=&quot;/2013/0505-STX-B+Tree-Memory-Usage/memprofile.pdf&quot;&gt; &lt;img width=&quot;700&quot; height=&quot;490&quot; title=&quot;Memory profile of map containers&quot; src=&quot;/2013/0505-STX-B+Tree-Memory-Usage/memprofile.png&quot; alt=&quot;Memory profile of map containers&quot;/&gt; &lt;/a&gt;&lt;/p&gt;&lt;div style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;/2013/0505-STX-B+Tree-Memory-Usage/&quot;&gt;This blog entry continues on the next page ...&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0505-STX-B+Tree-Memory-Usage/index.html</guid><pubDate>Sun, 05 May 2013 10:44:00 +0200</pubDate></item><item><title>STX B+ Tree Revisiting Binary Search</title><link>http://panthema.net/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/index.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a 
href=&quot;/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/&quot;&gt;&lt;img src=&quot;/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/thumb.gif&quot; width=&quot;300&quot; height=&quot;112&quot; alt=&quot;Animation showing binary search and linear scan&quot; title=&quot;Animation showing binary search and linear scan&quot; /&gt;&lt;/a&gt;&lt;/span&gt;STX B+ Tree Revisiting Binary Search&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-05-04 12:44 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/#notes&quot;&gt;0 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/stx-btree.html&quot;&gt;stx-btree&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;While developing another piece of software, I happened to require a customizable binary search implementation, which led me to revisit the binary search function of my quite popular &lt;a href=&quot;/2007/stx-btree/&quot;&gt;STX B+ Tree implementation&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The binary search is a central component in the container, both for performance and correctness, as it is used when traversing the tree in search of a key or an insertion point. It is implemented in the &lt;code&gt;find_lower()&lt;/code&gt; (and &lt;code&gt;find_upper()&lt;/code&gt;) functions. &lt;/p&gt;&lt;p&gt;In a first step, I cleaned up the implementation to use an &lt;strong&gt;exclusive right boundary&lt;/strong&gt;. In this binary search variant, &lt;code&gt;hi&lt;/code&gt; points to the first element &lt;strong&gt;beyond&lt;/strong&gt; the end (with the same meaning as the usual C++ &lt;code&gt;end()&lt;/code&gt; iterator). This got rid of the special case handled after the while loop. 
The exclusive right boundary is also a &lt;strong&gt;more natural&lt;/strong&gt; implementation variant (even though most computer science textbooks contain the inclusive version).&lt;/p&gt;&lt;p&gt;Having thus changed the binary search, I reran the speed tests. However, to my surprise, the performance of the library &lt;strong&gt;decreased slightly&lt;/strong&gt;, but consistently. See the code &lt;a href=&quot;https://github.com/bingmann/stx-btree/commit/39580c19dd2ff344d19ebda97efc70b4a5208598&quot;&gt;diff 39580c19&lt;/a&gt; and resulting &lt;a href=&quot;/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/btree-speedtest-diff-39580c19.pdf&quot;&gt;speed test PDF&lt;/a&gt;, where solid lines are after the patch and dashed ones before.&lt;/p&gt;&lt;p&gt;After some research, I found a good, independent &lt;a href=&quot;http://create.stephan-brumme.com/binary-search/&quot;&gt;study of binary search variants by Stephan Brumme&lt;/a&gt;. His summary is that a linear scan is more efficient than binary search, if the keys are located in only one cache line. 
This (of course) explained the performance decrease I measured, as my &amp;quot;special case&amp;quot; after the search was in fact a very short linear scan of two elements.&lt;/p&gt;&lt;div style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/&quot;&gt;This blog entry continues on the next page ...&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0504-STX-B+Tree-Binary-vs-Linear-Search/index.html</guid><pubDate>Sat, 04 May 2013 13:44:00 +0200</pubDate></item><item><title>Released disk-filltest 0.7 - Simple Tool to Detect Bad Disks by Filling with Random Data</title><link>http://panthema.net/2013/0327-disk-filltest-0.7.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2013/disk-filltest/&quot;&gt;&lt;img src=&quot;/2013/disk-filltest/thumb.gif&quot; width=&quot;300&quot; height=&quot;215&quot; alt=&quot;Thumbnail of a pie chart filling to 100%&quot; title=&quot;Thumbnail of a pie chart filling to 100%&quot; /&gt;&lt;/a&gt;&lt;/span&gt;Released disk-filltest 0.7 - Simple Tool to Detect Bad Disks by Filling with Random Data&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-03-27 21:32 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0327-disk-filltest-0.7.html&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0327-disk-filltest-0.7.html#notes&quot;&gt;0 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/c++.html&quot;&gt;c++&lt;/a&gt; &lt;a href=&quot;/tags/utilities.html&quot;&gt;utilities&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;This post announces the first version of &lt;code&gt;disk-filltest&lt;/code&gt;, a very simple tool to test for bad blocks on a disk by filling it with random data. 
The function of disk-filltest is simple:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Write&lt;/strong&gt; files &lt;code&gt;random-########&lt;/code&gt; to the current directory until the disk is full.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Read&lt;/strong&gt; the files again and &lt;strong&gt;verify&lt;/strong&gt; the pseudo-random sequence written.&lt;/li&gt;&lt;li&gt;Any write or read error will be reported, either by the operating system or by checking the pseudo-random sequence.&lt;/li&gt;&lt;li&gt;Optionally, delete the random files after a successful run.&lt;/li&gt;&lt;/ul&gt;See &lt;a href=&quot;/2013/disk-filltest/&quot;&gt;the disk-filltest&lt;/a&gt; project page for more information about version 0.7. &lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0327-disk-filltest-0.7.html</guid><pubDate>Wed, 27 Mar 2013 21:32:00 +0100</pubDate></item><item><title>Released malloc_count 0.7 - Tools for Runtime Memory Usage Analysis and Profiling</title><link>http://panthema.net/2013/0316-malloc_count-0.7.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2013/malloc_count/&quot;&gt;&lt;img src=&quot;/2013/malloc_count/thumb.png&quot; width=&quot;300&quot; height=&quot;240&quot; alt=&quot;Memory profile plot as generated by example in malloc_count tarball&quot; title=&quot;Memory profile plot as generated by example in malloc_count tarball&quot; /&gt;&lt;/a&gt;&lt;/span&gt;Released malloc_count 0.7 - Tools for Runtime Memory Usage Analysis and Profiling&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-03-16 22:17 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0316-malloc_count-0.7.html&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0316-malloc_count-0.7.html#notes&quot;&gt;1 Comments&lt;/a&gt;. 
Tags: &lt;a href=&quot;/tags/c++.html&quot;&gt;c++&lt;/a&gt; &lt;a href=&quot;/tags/coding_tricks.html&quot;&gt;coding tricks&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;This post announces the first version of &lt;code&gt;malloc_count&lt;/code&gt;, a very useful tool that I have been fine-tuning in the past months. The code library provides facilities to&lt;/p&gt;&lt;ul&gt;&lt;li&gt;measure the current and peak heap memory allocation, and&lt;/li&gt;&lt;li&gt;write a memory profile for plotting.&lt;/li&gt;&lt;li&gt;Furthermore, a separate &lt;code&gt;stack_count&lt;/code&gt; function can measure stack usage.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The tool works by intercepting the standard &lt;code&gt;malloc()&lt;/code&gt;, &lt;code&gt;free()&lt;/code&gt;, and related functions. Thus &lt;strong&gt;no changes&lt;/strong&gt; are necessary to the inspected source code.&lt;/p&gt;See &lt;a href=&quot;/2013/malloc_count/&quot;&gt;the malloc_count project page&lt;/a&gt; for more information about version 0.7. 
&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0316-malloc_count-0.7.html</guid><pubDate>Sat, 16 Mar 2013 22:17:00 +0100</pubDate></item><item><title>Coding Tricks 101: How to Save the Assembler Code Generated by GCC</title><link>http://panthema.net/2013/0124-GCC-Output-Assembler-Code/index.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2013/0124-GCC-Output-Assembler-Code/&quot;&gt;&lt;img src=&quot;/2013/0124-GCC-Output-Assembler-Code/thumb.jpg&quot; width=&quot;300&quot; height=&quot;209&quot; alt=&quot;Instacode coloring of assembler code&quot; title=&quot;Instacode coloring of assembler code&quot; /&gt;&lt;/a&gt;&lt;/span&gt;Coding Tricks 101: How to Save the Assembler Code Generated by GCC&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2013-01-24 18:07 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2013/0124-GCC-Output-Assembler-Code/&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2013/0124-GCC-Output-Assembler-Code/#notes&quot;&gt;1 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/c++.html&quot;&gt;c++&lt;/a&gt; &lt;a href=&quot;/tags/coding_tricks.html&quot;&gt;coding tricks&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;This is the first issue of a series of blog posts about some Linux coding tricks I have collected in the last few years.&lt;/p&gt;&lt;p&gt;Folklore says that compilers are among the most complex computer programs written today. They incorporate many optimization algorithms, inline functions and fold constant expressions; all without changing output, correctness or side effects of the code. 
If you think about it, the work &lt;a class=&quot;exp&quot; href=&quot;http://gcc.gnu.org&quot;&gt;&lt;code&gt;gcc&lt;/code&gt;&lt;/a&gt;, &lt;a class=&quot;exp&quot; href=&quot;http://www.llvm.org&quot;&gt;&lt;code&gt;llvm&lt;/code&gt;&lt;/a&gt; and other compilers do is really amazing and mostly works just great.&lt;/p&gt;&lt;p&gt;Sometimes, however, you want to know exactly what a compiler does with your C/C++ code. Most straightforward questions can be answered using a debugger. However, if you want to verify whether the compiler really applies the optimizations to your program that your intuition expects, then a debugger is usually not useful, because optimized programs can look very different from the original. Some example questions are:&lt;/p&gt;&lt;ul&gt; &lt;li&gt;Is a local integer variable stored in a register and how long does it exist?&lt;/li&gt; &lt;li&gt;Does the compiler use special instructions for a simple copy loop?&lt;/li&gt; &lt;li&gt;Are special conditional instructions used for an &lt;code&gt;if&lt;/code&gt; or &lt;code&gt;switch&lt;/code&gt; statement?&lt;/li&gt; &lt;li&gt;Is a specific function inlined or called each time?&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;These questions can be answered definitively by investigating the compiler&amp;apos;s output. On the Net, there are multiple &amp;quot;online compilers,&amp;quot; which can visualize the assembler output of popular compilers for small pieces of code: see the &amp;quot;&lt;a href=&quot;http://gcc.godbolt.org/&quot;&gt;GCC Explorer&lt;/a&gt;&amp;quot; or &amp;quot;&lt;a href=&quot;http://assembly.ynh.io/&quot;&gt;C/C++ to Assembly v2&lt;/a&gt;&amp;quot;. However, for inspecting parts of a larger project, these tools are unusable, because the interesting pieces are embedded in much larger source files.&lt;/p&gt;&lt;p&gt;Luckily, &lt;code&gt;gcc&lt;/code&gt; does not output binary machine code directly. 
Instead, it internally writes assembler code, which then is translated by &lt;code&gt;as&lt;/code&gt; into binary machine code (actually, &lt;code&gt;gcc&lt;/code&gt; creates more intermediate structures). This internal assembler code can be outputted to a file, with some annotation to make it easier to read.&lt;/p&gt;&lt;div style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;/2013/0124-GCC-Output-Assembler-Code/&quot;&gt;This blog entry continues on the next page ...&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2013/0124-GCC-Output-Assembler-Code/index.html</guid><pubDate>Thu, 24 Jan 2013 18:07:00 +0100</pubDate></item><item><title>eSAIS - Inducing Suffix and LCP Arrays in External Memory</title><link>http://panthema.net/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/index.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/&quot;&gt;&lt;img src=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/thumb.png&quot; width=&quot;300&quot; height=&quot;443&quot; alt=&quot;Example of the Inducing Process&quot; title=&quot;Example of the Inducing Process&quot; /&gt;&lt;/a&gt;&lt;/span&gt;eSAIS - Inducing Suffix and LCP Arrays in External Memory&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2012-11-19 15:49 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/&quot;&gt;Permlink&lt;/a&gt; with &lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/#notes&quot;&gt;2 Comments&lt;/a&gt;. 
Tags: &lt;a href=&quot;/tags/research.html&quot;&gt;research&lt;/a&gt; &lt;a href=&quot;/tags/stringology.html&quot;&gt;stringology&lt;/a&gt; &lt;a href=&quot;/tags/c++.html&quot;&gt;c++&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;This web page accompanies our conference paper &amp;quot;Inducing Suffix and LCP Arrays in External Memory&amp;quot;, which we presented at the &lt;a href=&quot;http://www.siam.org/meetings/alenex13/&quot;&gt;Workshop on Algorithm Engineering and Experiments (ALENEX 2013)&lt;/a&gt;. A PDF of the publication is available from this site as &lt;a title=&quot;Download alenex13esais.pdf (524 KiB)&quot; href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/alenex13esais.pdf&quot;&gt;alenex13esais.pdf &lt;img width=&quot;13&quot; height=&quot;17&quot; style=&quot;margin-bottom: -2px&quot; src=&quot;/img/filelink-pdf.png&quot; alt=&quot;alenex13esais.pdf&quot;/&gt;&lt;/a&gt; or from the &lt;a href=&quot;http://knowledgecenter.siam.org/0238-000015/0238-000015/1&quot;&gt;online proceedings&lt;/a&gt;. The paper was joint work with my colleagues Johannes Fischer and Vitaly Osipov.&lt;/p&gt;&lt;p&gt;The slides of my presentation of the paper on January 7th, 2013 in New Orleans, LA, USA are also available: &lt;a title=&quot;Download alenex13esais-slides.pdf (599 KiB)&quot; href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/alenex13esais-slides.pdf&quot;&gt;alenex13esais-slides.pdf &lt;img width=&quot;13&quot; height=&quot;17&quot; style=&quot;margin-bottom: -2px&quot; src=&quot;/img/filelink-pdf.png&quot; alt=&quot;alenex13esais-slides.pdf&quot;/&gt;&lt;/a&gt;. 
They contain little text and an example of the eSAIS algorithm with a simplified PQ.&lt;/p&gt;&lt;p&gt;Our implementations of eSAIS, the eSAIS-LCP variants, DC3 and DC3-LCP algorithms as described in the paper are available below under the &lt;a class=&quot;exp&quot; href=&quot;http://www.gnu.org/licenses/gpl.html&quot;&gt;GNU General Public License v3 (GPL)&lt;/a&gt;.&lt;/p&gt;&lt;table class=&quot;darkfullframe&quot;&gt; &lt;tr&gt; &lt;td colspan=&quot;3&quot;&gt;&lt;b&gt;eSAIS and DC3 with LCP version 0.5.2&lt;/b&gt; (current) &lt;span style=&quot;color:red&quot;&gt;updated 2013-03-30&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td&gt;Source code archive:&lt;br /&gt;(includes STXXL)&lt;/td&gt; &lt;td&gt;&lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/eSAIS-DC3-LCP-0.5.2.tar.bz2&quot;&gt;eSAIS-DC3-LCP-0.5.2.tar.bz2 (975 KiB)&lt;/a&gt;&lt;br /&gt;&lt;code&gt;MD5: 18abfd0836810d7755b7fcabf09ce5dd&lt;/code&gt;&lt;/td&gt; &lt;td&gt;&lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/eSAIS-DC3-LCP-0.5.2/&quot;&gt;Browse online&lt;/a&gt;&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td rowspan=&quot;3&quot;&gt;Git repositories&lt;/td&gt; &lt;td colspan=&quot;2&quot;&gt;Suffix and LCP construction algorithms&lt;br /&gt; &lt;code&gt;git clone &lt;a href=&quot;http://algohub.iti.kit.edu/algo2/eSAIS/&quot;&gt;http://algohub.iti.kit.edu/algo2/eSAIS/&lt;/a&gt;&lt;/code&gt;&lt;br /&gt; &lt;code&gt;cd eSAIS; git submodule init; git submodule update&lt;/code&gt;&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td colspan=&quot;2&quot;&gt;STXXL with custom patches&lt;br /&gt; &lt;code&gt;git clone &lt;a href=&quot;http://algohub.iti.kit.edu/algo2/stxxl/&quot;&gt;http://algohub.iti.kit.edu/algo2/stxxl/&lt;/a&gt;&lt;/code&gt;&lt;/td&gt; &lt;/tr&gt; &lt;tr&gt; &lt;td colspan=&quot;2&quot;&gt;Customized &lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/stxxl-doxygen/&quot;&gt;STXXL HTML 
documentation&lt;/a&gt;&lt;/td&gt; &lt;/tr&gt;&lt;/table&gt;&lt;p&gt;The algorithm implementations require a special version of the &lt;a class=&quot;exp&quot; href=&quot;http://stxxl.sourceforge.net&quot;&gt;STXXL library&lt;/a&gt;, which is also listed above. For more information about compiling and testing the implementation, please refer to the &lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/eSAIS-DC3-LCP-0.5.2/README.html&quot;&gt;README&lt;/a&gt; included in the source.&lt;/p&gt;&lt;div style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/&quot;&gt;This blog entry continues on the next page ...&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2012/1119-eSAIS-Inducing-Suffix-and-LCP-Arrays-in-External-Memory/index.html</guid><pubDate>Mon, 19 Nov 2012 15:49:00 +0100</pubDate></item><item><title>Finding Roots of Polynomials by Clipping - Report and Implementation from my Lab Course in Numerical Mathematics</title><link>http://panthema.net/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/index.html</link><description>&lt;h1&gt;&lt;span style=&quot;float: right; clear: right; margin: 0em 0em 1.5em 1.5em; font-size: 10pt; text-align: center&quot;&gt;&lt;a href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/&quot;&gt;&lt;img src=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/thumb.jpg&quot; width=&quot;300&quot; height=&quot;226&quot; alt=&quot;The QuadClip Algorithm&quot; title=&quot;The QuadClip Algorithm&quot; /&gt;&lt;/a&gt;&lt;/span&gt;Finding Roots of Polynomials by Clipping - Report and Implementation from my Lab Course in Numerical Mathematics&lt;/h1&gt;&lt;p class=&quot;info&quot;&gt;Posted on 2012-03-20 22:29 by &lt;a href=&quot;/about/timo.html&quot;&gt;Timo&lt;/a&gt; at &lt;a href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/&quot;&gt;Permlink&lt;/a&gt; with &lt;a 
href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/#notes&quot;&gt;0 Comments&lt;/a&gt;. Tags: &lt;a href=&quot;/tags/maths.html&quot;&gt;maths&lt;/a&gt; &lt;a href=&quot;/tags/university.html&quot;&gt;university&lt;/a&gt; &lt;a href=&quot;/tags/c++.html&quot;&gt;c++&lt;/a&gt; &lt;/p&gt;&lt;div class=&quot;textcontent&quot;&gt;&lt;p&gt;This semester I had the pleasure to take part in a lab exercise course supervised by Prof. Thomas Lin&amp;szlig; at the &lt;a class=&quot;exp&quot; href=&quot;http://www.fernuni-hagen.de/english/&quot;&gt;FernUniversity of Hagen&lt;/a&gt;. The objective was to comprehend, implement and evaluate a particular recent advancement in the field of numerical mathematics. My topic was finding the roots of a polynomial by clipping in B&amp;eacute;zier representation using two new methods, one devised by Michael Barto&amp;#328; and Bert J&amp;uuml;ttler [1], the other extended from the first by Ligang Liu, Lei Zhang, Binbin Lin and Guojin Wang [2].&lt;/p&gt;&lt;p&gt;My implementation of this topic was done for the lab course in C++ and contains many in themselves interesting sub-algorithms, which are combined into the clipping algorithms for finding roots. These sub-algorithms may prove useful for other purposes, which is the main reason for publishing this website. 
Among these are:&lt;/p&gt;&lt;ul&gt; &lt;li&gt;Polynomial classes for monomial and B&amp;eacute;zier representations: &lt;a href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/doxygen-html/classPolynomialStandard.html&quot;&gt;PolynomialStandard&lt;/a&gt; and &lt;a href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/doxygen-html/classPolynomialBezier.html&quot;&gt;PolynomialBezier&lt;/a&gt;.&lt;/li&gt; &lt;li&gt;Algorithms to convert from monomial to B&amp;eacute;zier representation and vice versa: PolynomialStandard::toBezier() and PolynomialBezier::toStandard().&lt;/li&gt; &lt;li&gt;Evaluation algorithms for both representations: Horner&amp;apos;s Schema and the Algorithm of de Casteljau.&lt;/li&gt; &lt;li&gt;Another version of de Casteljau&amp;apos;s Algorithm to split a polynomial in B&amp;eacute;zier representation into two parts.&lt;/li&gt; &lt;li&gt;Jarvis&amp;apos; March aka gift wrapping (run time O(hn)) to calculate the convex hull of the B&amp;eacute;zier polygon: PolynomialBezier::getConvexHull().&lt;/li&gt; &lt;li&gt;Cardano&amp;apos;s formulas to find all real roots of any cubic polynomial: PolynomialStandard::findRoots().&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;For the lab course I wrote two documents, both in German: one is an abstract &lt;a title=&quot;Download Kurzfassung.pdf (84.9 KiB)&quot; href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/Kurzfassung.pdf&quot;&gt;Kurzfassung.pdf &lt;img width=&quot;13&quot; height=&quot;17&quot; style=&quot;margin-bottom: -2px&quot; src=&quot;/img/filelink-pdf.png&quot; alt=&quot;Kurzfassung.pdf&quot;/&gt;&lt;/a&gt; (1 page), which is translated into English below, and the other a short report &lt;a title=&quot;Download Ausarbeitung.pdf (277 KiB)&quot; href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/Ausarbeitung.pdf&quot;&gt;Ausarbeitung.pdf &lt;img width=&quot;13&quot; height=&quot;17&quot; style=&quot;margin-bottom: -2px&quot; src=&quot;/img/filelink-pdf.png&quot; 
alt=&quot;Ausarbeitung.pdf&quot;/&gt;&lt;/a&gt; (6 pages). The report contains a short description of the algorithms together with execution and convergence speed measurements, which verify the original authors&amp;apos; experiments. For presenting the lab work I created these &lt;a title=&quot;Download Slides.pdf (468 KiB)&quot; href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/Slides.pdf&quot;&gt;Slides.pdf &lt;img width=&quot;13&quot; height=&quot;17&quot; style=&quot;margin-bottom: -2px&quot; src=&quot;/img/filelink-pdf.png&quot; alt=&quot;Slides.pdf&quot;/&gt;&lt;/a&gt;, which however are not self-explanatory due to my minimum-text presentation style.&lt;/p&gt;&lt;div style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/&quot;&gt;This blog entry continues on the next page ...&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;</description><guid isPermaLink="true">http://panthema.net/2012/0320-Finding-Roots-of-Polynomials-by-Clipping/index.html</guid><pubDate>Tue, 20 Mar 2012 22:29:00 +0100</pubDate></item></channel></rss>
