Just a quick update. I have moved the repository for PyMollom over to GitHub. If you are unfamiliar with git, you can check out the library using git clone git://github.com/itkovian/PyMollom.git. At this point, the library is still the same, but I do plan on integrating the new language detection ASAP.
PyMollom moved to GitHub
February 8th, 2010New door opens, old door closes
February 1st, 2010For the past three years, I guess, Veerle has been trying to get me to agree to close up the living room door and make a new door in the dining room. Last fall, I finally caved. I saw the light, she would put it
We discussed our plans with the guy who will be taking charge in redoing our attic, and signed the agreement last November. Half December he gave us a call, and we agreed upon having the job done by the end of January. On the 29th, the demolition man came by and tore a new hole in our wall. Using a water to limit the amount of dust the wall was split apart at the right spot. Three hours later, everything was cleared and the big cleanup could begin. Depsite the water, dust had spread throughout the living room, partially spoiling my plans for getting some work done. With spurious amounts of water and soap, the living room was cleaned, and along with it, all the toys, books, and decoration.
I tore down the old door, with Nathan’s help — he’s scared of loud noise, so I figured that if I could get him to help me out, he would at least stop crying. And he did. with combined forces, we yanked the door from the wall, and replaced it with a brand new wooden framework, against which I put 18mm of multiplex, covered with 9.5 mm of plaster plate (gyproc). A bucket of Knauf Goldband later and the doorway was effectively sealed shut.
I plastered the new door hole, using metal corners to work against — my plastering skillz are pretty good, but only for small areas, i.e., no wider than my tool is. So, without further ado, behold the new door and the sealed old door.
Needless to say, we had to move some furniture around
Mind, a new glass sliding door will be installed shortly. when the money for it arrives on my bank account.
Flemish programming contest
January 28th, 2010On March 17th, 2010, Ghent University organises the second Flemish Programming Contest, in collaboration with the universities of Leuven, Hasselt, Brussels, and Antwerp.
For three different categories of participating teams (of at most three participants), we offer five problems that need to be solved in three hours. The goal is to have at least one fairly easy and one quite hard problem for each category, but as we all know, a problem is only hard when you do not see the solution. So YMMV. Still, we aim to have each team solve at least one problem. For this, teams can use any one of twelve programming languages: C, C++, C#, Haskell, Java, Pascal, PHP, Prolog, Python, Ruby, Scheme, and VB.NET.
Programming should be fun, but to add an extra incentive, we have up to €2500 in prizes. So, without further ado, head over to the contest website and register.
Automated Just-in-time Compiler Tuning
January 25th, 2010I’m pretty excited about our paper to get accepted at CGO 2010, which takes place in Toronto, Canada.
Automated Just-in-time Compiler tuning, Kenneth Hoste, Andy Georges, and Lieven Eeckhout.
The abstract of the paper reads as follows.
Managed runtime systems, such as a Java virtual machine (JVM), are complex pieces of software with many interacting components. The Just-In-Time (JIT) compiler is at the core of the virtual machine, however, tuning the compiler for optimum performance is a challenging task. There are (i) many compiler optimizations and options, (ii) there may be multiple optimization levels (e.g., -O0, -O1, -O2), each with a specific optimization plan consisting of a collection of optimizations, (iii) the Adaptive Optimization System (AOS) that decides which method to optimize to which optimization level requires fine-tuning, and (iv) the effectiveness of the optimizations depends on the application as well as on the hardware platform. Current practice is to manually tune the JIT compiler which is both tedious and very time-consuming, and in addition may lead to suboptimal performance.
This paper proposes automated tuning of the JIT compiler through multi-objective evolutionary search. The proposed framework (i) identifies optimization plans that are Pareto-optimal in terms of compilation time and code quality, (ii) assigns these plans to optimization levels, and (iii) fine-tunes the AOS accordingly. The key benefit of our framework is that it automates the entire exploration process, which enables tuning the JIT compiler for a given hardware platform and/or application at very low cost.
By automatically tuning Jikes RVM using our framework for average performance across the DaCapo and SPECjvm98 benchmark suites, we achieve similar performance to the hand-tuned default Jikes RVM. When optimizing the JIT compiler for individual benchmarks, we achieve statistically significant speedups for most benchmarks, up to 40% for start-up and up to 19% for steady-state performance. We also show that tuning the JIT compiler for a new hardware platform can yield significantly better performance compared to using a JIT compiler that was tuned for another platform.
You can get a preprint of the paper. We also plan to make our tool available, so it can be used to automagically tune other VMs *cough* J9 *cough*.
PTLSim/KVM-QEMU on Ubuntu Karmic Koala
September 28th, 2009Ubuntu Karmic Koala comes with GCC 4.4.1, and the set of include files clashes a bit with what the KVM/QEMU port of PTLSim expects for its string library. Mostly, the issues involve missing ‘const’ modifiers to pointers. Here’s a diff though.
index a0c25f4..87d70d6 100644
— a/Makefile
+++ b/Makefile
@@ -56,7 +56,7 @@ BUILDHOST = $(shell hostname -f)
BUILDUSER = $(USER)
CC = g++
-CFLAGS = -O99 -g3 -fpic -march=core2 -falign-functions=16 -funit-at-a-time -minline-all-stringops -fno-trapping-math -fno-exceptions -fno-rtti -mpreferred-stack-boundary=4 -fno-strict-aliasing -Wreturn-type -mno-red-zone
+CFLAGS = -O99 -g3 -fpic -march=core2 -falign-functions=16 -funit-at-a-time -minline-all-stringops -fno-trapping-math -fno-exceptions -fno-rtti -mpreferred-stack-boundary=4 -fno-strict-aliasing -Wreturn-type -mno-red-zone -fno-stack-protector
KVMVER = 86
QEMUDIR = ../qemu-kvm-ptlsim
INCFLAGS = -Iinclude -I. -I$(QEMUDIR)/kvm/include -I$(QEMUDIR)/kvm/include/x86 -DPTLSIM_HYPERVISOR -DPTLSIM_KVM -D__INSIDE_PTLSIM__ \
diff –git a/lib/klibc.cpp b/lib/klibc.cpp
index ae0f75b..3b5c5c0 100644
— a/lib/klibc.cpp
+++ b/lib/klibc.cpp
@@ -842,7 +842,7 @@ int strncmp(const char *cs, const char *ct, size_t count)
* @s: The string to be searched
* @c: The character to search for
*/
-char *strchr(const char *s, int c)
+const char *strchr(const char *s, int c)
{
for (; *s != (char)c; ++s)
if (*s == ‘\0′)
@@ -858,7 +858,7 @@ char *strchr(const char *s, int c)
* @s: The string to be searched
* @c: The character to search for
*/
-char *strrchr(const char *s, int c)
+const char *strrchr(const char *s, int c)
{
const char *p = s + strlen(s);
do {
@@ -877,7 +877,7 @@ char *strrchr(const char *s, int c)
* @count: The number of characters to be searched
* @c: The character to search for
*/
-char *strnchr(const char *s, size_t count, int c)
+const char *strnchr(const char *s, size_t count, int c)
{
for (; count– && *s != ‘\0′; ++s)
if (*s == (char)c)
@@ -976,7 +976,7 @@ size_t strcspn(const char *s, const char *reject)
* @cs: The string to be searched
* @ct: The characters to search for
*/
-char *strpbrk(const char *cs, const char *ct)
+const char *strpbrk(const char *cs, const char *ct)
{
const char *sc1, *sc2;
@@ -1146,7 +1146,7 @@ void *memscan(void *addr, int c, size_t size)
* @s1: The string to be searched
* @s2: The string to search for
*/
-char *strstr(const char *s1, const char *s2)
+const char *strstr(const char *s1, const char *s2)
{
int l1, l2;
@@ -1175,7 +1175,7 @@ char *strstr(const char *s1, const char *s2)
* returns the address of the first occurrence of @c, or %NULL
* if @c is not found
*/
-void *memchr(const void *s, int c, size_t n)
+const void *memchr(const void *s, int c, size_t n)
{
const unsigned char *p = (unsigned char*)s;
while (n– != 0) {
As you can tell, the CFLAGS definition in the Makefile was also extended with “-fno-stack-protector”, as the build with otherwise fail with a few undefined reference to `__stack_chk_fail’ errors, see also this lkml entry for more information.
The Xen hypervisor with Ubuntu Karmic Koala
September 26th, 2009Setting up Xen should be pretty trivial these days, however if you want to get on the edge of development, there might be some trickery involved, which is not documented very well. Overall, the PvOps wiki post on xensource does a fine job, yet several things seemed to be missing. A lot of the flotsam on blogs and mailings lists that can be googled is too old to be remotely useful.
Here is a more or less detailed report on how to set up Xen op Ubuntu Karmic Koala (9.10) on a corei7 system (amd64). Obviously the first thing to do is upgrade the Ubuntu installation to the latest version, or deploy a fresh Karmic installation.
The Linux kernel source code to use resides in a git repository, so git should be installed, using apt-get install git. You should not use the plain kernel sources, as they do not contain the Xen code for dom0, at least. Instead, use the following steps to obtain the source code
$ cd /usr/src
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git linux-2.6-xen
$ cd linux-2.6-xen
$ git checkout origin/xen/master -b xen/master
$ git pull
After you checked out the kernel, you might want to copy an existing configuration file so you can re-use the result of your hard work you did on previous installations. You should also check the file at include/linux/swiotlb.h and if necessary apply the following patch (it may already be in the git tip, so check first before applying
index cb1a663..f4ebffb 100644
— a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -2,6 +2,7 @@
#define __LINUX_SWIOTLB_H
#include <linux/types.h>
+#include <linux/dma-mapping.h>
struct device;
struct dma_attrs;
Configure the kernel, e.g., make menuconfig. Do make sure that you switch ACPI on, because the Xen dom0 config option has a dependency on ACPI and will not show up in the configuration menu’s otherwise. In the end you should have the following configuration options set:
CONFIG_XEN_MAX_DOMAIN_MEMORY=32
CONFIG_XEN_SAVE_RESTORE=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PRIVILEGED_GUEST=y
CONFIG_XEN_DOM0_PCI=y
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_XEN_NETDEV_FRONTEND=y
CONFIG_XEN_KBDDEV_FRONTEND=m
CONFIG_XEN_FBDEV_FRONTEND=m
CONFIG_XEN_BALLOON=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=y
CONFIG_XEN_BACKEND=y
CONFIG_XEN_BLKDEV_BACKEND=y
CONFIG_XEN_NETDEV_BACKEND=y
CONFIG_XENFS=y
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=y
CONFIG_XEN_S3=y
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_XEN=y
Build the kernel as usual using make bzImage && make modules && make modules_install && make install. If you have the options above set right, you should be able to use the same kernel for both dom0 and domU VMs, dom0 being the privileged guest referred to in the above configuration options. If you are using grub, the entry for booting in the Xen VMM and creating the dom0 is the following, if the kernel name extension is set to xen-3.4.1:
uuid ac5415fa-73d8-47dc-bf6e-eb6004d0484c
kernel /boot/xen-3.4.2-rc1-pre.gz dom0_mem=1048576
module /boot/vmlinuz-2.6.31-xen-3.4.1 root=UUID=ac5415fa-73d8-47dc-bf6e-eb6004d0484c ro console=tty0 earlyprintk=vga
module /boot/initrd.img-2.6.31-xen-3.4.1
quiet
It is advisable to also make an initramfs image, as follows
The next step is then to get the Xen sources. I have used the Xen 3.4-testing mercurial repository. If you do not have mercurial installed on your machine, you might want to apt-get install mercurial. The Xen repository resides at http://xenbits.xensource.com/xen-3.4-testing.hg:
Before you proceed, make sure that your Python installation is up to par, if necessary apt-get install python. Since several OS releases, neither Debian, nor Ubuntu have been using the site-packages directory to drop Python packages. They instead use dist-packages, residing in the /usr/lib/python2.X directory. Karmic Koala comes with Python 2.6, so you should symlink /usr/lib/python2.6/dist-packages to /usr/lib.python2.6/site-packages. If you do not, installing xen will create the site-packages directory, which will obviously be missing from your sys.path when running Python. It might also be wise, if you have an older Xen version installed to purge the python directories. I noticed that a reinstall with the latest Xen sources failed to overwrite a bunch of files, rather symlinking to file under /usr/shared. I have no idea why this was the case, but other than that, those files contained code that pointed to obsolete file locations, resulting in the xend daemon not working correctly.
Now, you should be all set up. Build the Xen hypervisor, the Xen tools and if necessary the Xen stub domains:
$ make install-tools
$ make install-stubdom
If everything went well, you can now boot into your dom0 domain. After booting the hypervisor, the Linux kernel is started from the location you stated in /boot/grub/menu.lst on the line specifying the module for the kernel.
The final task is setting up a domU. I have chosen to host my guest domain(s) on LVM, so the first task is setting up the LVM itself. This is pretty straighforward. It involves creating at least one physical volume, a volume group, and a logical volume. The first thing to do is assign the correct parition type. In this example I will be using 500GB of a 1TB disk, using the first primary partition, i.e., /dev/sdd1. With fdisk, create a new partition and assign it the type 8e (Linux LVM). The following commands set up a 16GB logical volume. First we create a physical volume, then we set up a volume group named vg0 with physical extends of 8M, and finally, we set up the LV called ubuntu-vm1-disk.
$ vgcreate -s 8M vg0 /dev/sdd1
$ lvcreate -L 16G -n ubuntu-vm1-disk
The volume needs a file system, which is created using the normal mke2fs tools.
For the domU, we also use a Ubunut Karnmic Koala, set up by debootstrap. Note that older Ubuntu distributions should also work, though some of the changes that need to be made might differ. Depending on your location, you might want to use a different mirror.
$ mount /dev/vg0/ubuntu-vm1-disk /mnt/ubuntu
$ debootstrap –arch=amd64 karmic /mnt/ubuntu http://ubuntu.mirror.cambrium.nl/ubuntu/
Several files need to be altered or made. etc/fstab should read
/dev/xvda1 / ext3 defaults,errors=remount-ro 0 1
In etc/hosts you should put something like
127.0.0.1 problemchild
Finally, you need to fix the Ubuntu upstart process to allow connection to hvc0, otherwise you will end up with domU stating it cannot find a file descriptor for the console. The way to fix is is by copying /etc/init/tty1.conf to /etc/init/hvc0.conf and change the line to read
stop on runlevel [!2345]
respawn
exec /sbin/getty -8 38400 hvc0
The newly created ubuntu installation can now be unmounted, and the domU guest can be set up. This is an example of the domU config for the ProblemChild domU domain, saved to /etc/xen/problem_child.
ramdisk = “/boot/initrd.img-2.6.31-xen-3.4.1″
memory = 1024
maxmem = 1024
name = “ProblemChild”
disk = [ 'phy:vg0/ubuntu-vm1-disk,xvda1,w' ]
#dhcp=”dhcp”
ip = “192.168.0.101″
netmask = “255.255.255.0″
gateway = “192.168.0.1″
hostname = “problemchild”
root = “/dev/xvda1 ro”
You should now be able to start the domain by executing
The result should be something like this at the end of the domU boot process:
* Setting up console font and keymap… [ OK ]
* Starting OpenBSD Secure Shell server sshd [ OK ]
Ubuntu karmic (development branch) assail hvc0
problemchild login:
Performance, scalability, and real-time response from the Linux kernel
July 20th, 2009The most interesting course, as well as the one I enjoyed most, was on the
performance, scalability and real-time response of the Linux kernel.
He opened the course by dropping a question into the unsuspecting audience:
what decades old tech can keep a multi-core busy and yet be easy to program
against. I thought Paul had the idea of a time-sharing machine in his mind, but
the solution was far easier than that: SQL. Given that the frequency increase
of the CPUs is stabilising at naught, we need to find a good way to easily
program against multi-core architectures. Something that rivals the ease of
SQL, where under the hood a lot of stuff is going on, but to the user, it
remains fairly simple. Unlike most of the other people talking about
parallelism, Paul stressed multiple times that if one does not need it, it’s
best to run single threaded. I wholeheartedly agree! On the other hand, if we
parallelise, we should be considering high-level approaches prior to trying to
get the nitty-gritty details right. So, first: get your algorithm in shape. I
think that’s a very good point, given the fact that research papers publishing
tweaks, rather than new algorithms seldom succeed in increasing the performance
with a factor or even a large percentage. Conversely, any performance lost at
the base OS level, cannot be made up by the higher levels, no matter the
algorithm. Context-switches, locks, etc. take a (more-or-less) fixed amount of
time, and that time will be spent anyhow.
The major issue with RT-processes seems that they need to interact with non-RT
processes, I/O (disk, network, etc.). As such, the RT approach has to be
applied across the entire execution stack, if we want to gdet it right.
However, we still need to keep a fair responsiveness for non-RT processes.
Essentially, Paul argues for making tradeoffs, rather that going for the
best-for-a-single-goal apporoach and ignore the rest.
The question raised was why we need to enhance performance. The answer is that
people time is much more costly than machine time these days. So it does no
lomnger pay off to get an engineer trying to enhance the solution. It should be
done automnagically as much as possible. Moreover, general solutions help to
spread the cost over multiple users.
One of the major problem when parallelising programs is that people either do
not grok the issues fully, or try to tackle the problems in the wrong order.
Paul argued that we first need to understand how we can split up the problem
into parts where there is little interaction between the data (as to avoid
excess locking). Only then can we partition the work that is done on that data.
The final step then is to determine which parts can have actual access to the
data, i.e., assign the locks. The matra that was repeated here was that
low-level details really do matter, and that it is important to get them right.
Building on this, the argument was raised what we rely on people who implement
things to have detailed knowledge of the underlying hardware. Unfortunately,
this is not always the case.
The takeaway lesson from the first lecture was this: parallel programming is
bloody hard, because it was designed that way.
Lesson 2 discussed Linux kernel programming environments dealing with: response
times, preemption inside the kernel, non-maskable interrupts, etc. Point made:
if an algorithm runs at a low level, you need interruptible locks. The kernel
comes with a broad aaray of synchronisation primitives, so it is important to
use the right primitives for the right job. For example, use locks that allow
looping in the reader if there are potentially (multiple) writers. Once more,
Paul stressed that synchronisation primitivies are not the first thing to
decide on. We should associate locks and other primitives with each data
partition (that was agreed upon earlier in the design stage). Clearly, it is
not good to have too many data partitions, as that means more locks, and a
higher risk of lock contention. The example used throughout this lesson was
that of a linked list. Should we lock the header? Lock each node? Keep the
locks in the data structure or in some hash array of locks? Key point: provide
protection for each way in which the data can be accessed! A per-cpu locking
mechanism can be used; if done right it scales pretty well.
In lesson 3, Paul tackled the performance and scalability of Linux
applications. Most frameworks (200+) that we once in use have now either faded,
merged, or discontinued. Advice is given not to use or rely on signal handlers.
POSIX primitives were discussed, as were per-thread variables, spinlocks, etc.
Important point was that the use of per-cpu state to lock onto, does not
translate well from kernel to user space. Some remarks were made about the RT
aspects of user-space applications. Should this be enforced? The issue here is
that opening RT behaviour to user-space clears the way for abuse. During the class, he used the (adapted) illustration of the blind philosophers and the elephant:
Lesson 4 was fully dedicated to real time systems, discussing some of the
implementations in the Linux kernel for dealing with this. Main topics of the
day were timers, high-resolution and others, interrupt handlers that can be
threaded, etc. It was stressed that real time has a broad range of meanings,
going from a few nanoseconds up to 10ms, the latter amounting basically to the
context switch time. Apparently, as a first step, some parts of the kernel can
be preempted, some cannot. The consequence is a reduction of schedular latency,
but nowhere near enough for a RT system at the hiogh end of the scale. Timer
wheels were added to improve locality and queueing, but still certain cascading
operations on this data structure can take a long time. Long enough to warrant
implementing high-resolution timers using RB trees, along with preemptible
spinlocks. Still: greater power means greater responsibility, so care must be
taken. Priority inversion was discussed, and adequaltely illustrated using the
dancing processes.
RCU once more came to the rescue, and it was shown how this can be used in the RT scenario, with priority inversion.
In the final lesson, Paul discussed RT Linux applications.
I guess the above illustration really says it all. The class discussed the
meaning of a hard RT system, and most of use were proven wrong. In some cases
knowing that failure is imminent is more important that guaranteed making the
deadline. (This eems to have some resemblance to writing research papers.) A
combination of an accurate system that is allowed to fail and can indicate it
with a less accurate systemn that guarentees deadline meeting seems to be the
way to go. In any case, maths cannot describe RT systems in practive, and QoS
is more important that hard/soft RT distinction.
RT applications can be divided into three classes: search for life
(medical/industrial control systems), search for death (military) and search
for money (financial). In todays interconnected machine web, the slowest
machine determines the RT nature of the complete system. Multiple serialised
machines have a large impact on this fact. Funny fact: in the Linux kernel,
real time used to mean real life time, rather than deadline meeting. So beware
of the code you rely on!
Spamming for patent journals
March 31st, 2009I have just received a rather interesting email — interesting, not for its contents, but for the idiocy it spells out. I shall refrain from posting it in full, instead focusing on the small interesting part.
An exciting journal entitled “Recent Patents on Computer Science (CSENG)” was launched in January 2008. This journal publishes review articles written by experts on recent patents in the field of Computer Science. Please visit the journal‘s website at www dot compscieng dot org for the Editorial Board, first journal issue, abstracts of recent issues and other details.
Recent Patents on Computer Science (CSENG) is indexed in Genamics JournalSeek, Compendex.
If you would like to submit a review article to the journal on an important patent area in Computer Science, then please provide us the title of your proposed article and a tentative date of submission at editorial@compscieng.org. Moreover in your reply, could you please suggest some specific keywords, keyword phrases related to your topic, so that detailed patents may be sent to you for the preparation of your manuscript.
Please do not hesitate to contact me if you have any queries. I look forward to receiving your positive response.
I went to their website (if you really wish to go there, you’ll have to manually copy and paste the url; I do not link to CS patent promoting people), and skimmed the current issue and found the following.
Recent Patents on Genetic Programming
Michael O’Neill and Anthony Brabazon
Genetic Programming is a form of Natural Computing which adopts principles from neo-Darwinian evolution to automatically solve problems. It is a model induction method in that both the structure and parameters of the solution are explored simultaneously. Genetic Programming is a particularly interesting method as it is claimed to be an invention machine, producing solutions to problems that are competitive and in some cases superior to those produced by human experts. Its best solutions have become patentable inventions in their own right. In this article, we overview some of the recent patents relating to Genetic Programming over the past three years. In light of the number and diversity of patent applications during this period, it is clear that Genetic Programming is a vibrant field of research, which is having a significant impact on real-world applications, and is demonstrating clear commercial potential.
I kid you not. A vibrant field of research these people are trying to kill off, by patenting solutions produced by a bloody search algorithm.
Needless to say, my response was very positive: http://endsoftpatents.org/. I hope they get the bloody clue.














