PyMollom moved to GitHub

February 8th, 2010

Just a quick update. I have moved the repository for PyMollom over to GitHub. If you are unfamiliar with git, you can check out the library using git clone git://github.com/itkovian/PyMollom.git. At this point, the library is still the same, but I do plan on integrating the new language detection ASAP.

New door opens, old door closes

February 1st, 2010

For the past three years, I guess, Veerle has been trying to get me to agree to close up the living room door and make a new door in the dining room. Last fall, I finally caved. I saw the light, she would put it :-) We discussed our plans with the guy who will be taking charge in redoing our attic, and signed the agreement last November. Half December he gave us a call, and we agreed upon having the job done by the end of January. On the 29th, the demolition man came by and tore a new hole in our wall. Using a water to limit the amount of dust the wall was split apart at the right spot. Three hours later, everything was cleared and the big cleanup could begin. Depsite the water, dust had spread throughout the living room, partially spoiling my plans for getting some work done. With spurious amounts of water and soap, the living room was cleaned, and along with it, all the toys, books, and decoration.

I tore down the old door, with Nathan’s help — he’s scared of loud noise, so I figured that if I could get him to help me out, he would at least stop crying. And he did. with combined forces, we yanked the door from the wall, and replaced it with a brand new wooden framework, against which I put 18mm of multiplex, covered with 9.5 mm of plaster plate (gyproc). A bucket of Knauf Goldband later and the doorway was effectively sealed shut.

I plastered the new door hole, using metal corners to work against — my plastering skillz are pretty good, but only for small areas, i.e., no wider than my tool is. So, without further ado, behold the new door and the sealed old door.

Changing doors
Changing doors

Needless to say, we had to move some furniture around :-)

Changing doors

Mind, a new glass sliding door will be installed shortly. when the money for it arrives on my bank account.

Flemish programming contest

January 28th, 2010

On March 17th, 2010, Ghent University organises the second Flemish Programming Contest, in collaboration with the universities of Leuven, Hasselt, Brussels, and Antwerp.

For three different categories of participating teams (of at most three participants), we offer five problems that need to be solved in three hours. The goal is to have at least one fairly easy and one quite hard problem for each category, but as we all know, a problem is only hard when you do not see the solution. So YMMV. Still, we aim to have each team solve at least one problem. For this, teams can use any one of twelve programming languages: C, C++, C#, Haskell, Java, Pascal, PHP, Prolog, Python, Ruby, Scheme, and VB.NET.

Programming should be fun, but to add an extra incentive, we have up to €2500 in prizes. So, without further ado, head over to the contest website and register.

Automated Just-in-time Compiler Tuning

January 25th, 2010

I’m pretty excited about our paper to get accepted at CGO 2010, which takes place in Toronto, Canada.

Automated Just-in-time Compiler tuning, Kenneth Hoste, Andy Georges, and Lieven Eeckhout.

The abstract of the paper reads as follows.

Managed runtime systems, such as a Java virtual machine (JVM), are complex pieces of software with many interacting components. The Just-In-Time (JIT) compiler is at the core of the virtual machine, however, tuning the compiler for optimum performance is a challenging task. There are (i) many compiler optimizations and options, (ii) there may be multiple optimization levels (e.g., -O0, -O1, -O2), each with a specific optimization plan consisting of a collection of optimizations, (iii) the Adaptive Optimization System (AOS) that decides which method to optimize to which optimization level requires fine-tuning, and (iv) the effectiveness of the optimizations depends on the application as well as on the hardware platform. Current practice is to manually tune the JIT compiler which is both tedious and very time-consuming, and in addition may lead to suboptimal performance.

This paper proposes automated tuning of the JIT compiler through multi-objective evolutionary search. The proposed framework (i) identifies optimization plans that are Pareto-optimal in terms of compilation time and code quality, (ii) assigns these plans to optimization levels, and (iii) fine-tunes the AOS accordingly. The key benefit of our framework is that it automates the entire exploration process, which enables tuning the JIT compiler for a given hardware platform and/or application at very low cost.

By automatically tuning Jikes RVM using our framework for average performance across the DaCapo and SPECjvm98 benchmark suites, we achieve similar performance to the hand-tuned default Jikes RVM. When optimizing the JIT compiler for individual benchmarks, we achieve statistically significant speedups for most benchmarks, up to 40% for start-up and up to 19% for steady-state performance. We also show that tuning the JIT compiler for a new hardware platform can yield significantly better performance compared to using a JIT compiler that was tuned for another platform.

You can get a preprint of the paper. We also plan to make our tool available, so it can be used to automagically tune other VMs *cough* J9 *cough*.

PTLSim/KVM-QEMU on Ubuntu Karmic Koala

September 28th, 2009

Ubuntu Karmic Koala comes with GCC 4.4.1, and the set of include files clashes a bit with what the KVM/QEMU port of PTLSim expects for its string library. Mostly, the issues involve missing ‘const’ modifiers to pointers. Here’s a diff though.

diff –git a/Makefile b/Makefile
index a0c25f4..87d70d6 100644
— a/Makefile
+++ b/Makefile
@@ -56,7 +56,7 @@ BUILDHOST = $(shell hostname -f)
BUILDUSER = $(USER)

CC = g++
-CFLAGS = -O99 -g3 -fpic -march=core2 -falign-functions=16 -funit-at-a-time -minline-all-stringops -fno-trapping-math -fno-exceptions -fno-rtti -mpreferred-stack-boundary=4 -fno-strict-aliasing -Wreturn-type -mno-red-zone
+CFLAGS = -O99 -g3 -fpic -march=core2 -falign-functions=16 -funit-at-a-time -minline-all-stringops -fno-trapping-math -fno-exceptions -fno-rtti -mpreferred-stack-boundary=4 -fno-strict-aliasing -Wreturn-type -mno-red-zone -fno-stack-protector
KVMVER = 86
QEMUDIR = ../qemu-kvm-ptlsim
INCFLAGS = -Iinclude -I. -I$(QEMUDIR)/kvm/include -I$(QEMUDIR)/kvm/include/x86 -DPTLSIM_HYPERVISOR -DPTLSIM_KVM -D__INSIDE_PTLSIM__ \
diff –git a/lib/klibc.cpp b/lib/klibc.cpp
index ae0f75b..3b5c5c0 100644
— a/lib/klibc.cpp
+++ b/lib/klibc.cpp
@@ -842,7 +842,7 @@ int strncmp(const char *cs, const char *ct, size_t count)
* @s: The string to be searched
* @c: The character to search for
*/
-char *strchr(const char *s, int c)
+const char *strchr(const char *s, int c)
{
for (; *s != (char)c; ++s)
if (*s == ‘\0′)
@@ -858,7 +858,7 @@ char *strchr(const char *s, int c)
* @s: The string to be searched
* @c: The character to search for
*/
-char *strrchr(const char *s, int c)
+const char *strrchr(const char *s, int c)
{
const char *p = s + strlen(s);
do {
@@ -877,7 +877,7 @@ char *strrchr(const char *s, int c)
* @count: The number of characters to be searched
* @c: The character to search for
*/
-char *strnchr(const char *s, size_t count, int c)
+const char *strnchr(const char *s, size_t count, int c)
{
for (; count– && *s != ‘\0′; ++s)
if (*s == (char)c)
@@ -976,7 +976,7 @@ size_t strcspn(const char *s, const char *reject)
* @cs: The string to be searched
* @ct: The characters to search for
*/
-char *strpbrk(const char *cs, const char *ct)
+const char *strpbrk(const char *cs, const char *ct)
{
const char *sc1, *sc2;

@@ -1146,7 +1146,7 @@ void *memscan(void *addr, int c, size_t size)
* @s1: The string to be searched
* @s2: The string to search for
*/
-char *strstr(const char *s1, const char *s2)
+const char *strstr(const char *s1, const char *s2)
{
int l1, l2;

@@ -1175,7 +1175,7 @@ char *strstr(const char *s1, const char *s2)
* returns the address of the first occurrence of @c, or %NULL
* if @c is not found
*/
-void *memchr(const void *s, int c, size_t n)
+const void *memchr(const void *s, int c, size_t n)
{
const unsigned char *p = (unsigned char*)s;
while (n– != 0) {

As you can tell, the CFLAGS definition in the Makefile was also extended with “-fno-stack-protector”, as the build with otherwise fail with a few undefined reference to `__stack_chk_fail’ errors, see also this lkml entry for more information.

The Xen hypervisor with Ubuntu Karmic Koala

September 26th, 2009

Setting up Xen should be pretty trivial these days, however if you want to get on the edge of development, there might be some trickery involved, which is not documented very well. Overall, the PvOps wiki post on xensource does a fine job, yet several things seemed to be missing. A lot of the flotsam on blogs and mailings lists that can be googled is too old to be remotely useful.

Here is a more or less detailed report on how to set up Xen op Ubuntu Karmic Koala (9.10) on a corei7 system (amd64). Obviously the first thing to do is upgrade the Ubuntu installation to the latest version, or deploy a fresh Karmic installation.

The Linux kernel source code to use resides in a git repository, so git should be installed, using apt-get install git. You should not use the plain kernel sources, as they do not contain the Xen code for dom0, at least. Instead, use the following steps to obtain the source code

$ sudo su -
$ cd /usr/src
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git linux-2.6-xen
$ cd linux-2.6-xen
$ git checkout origin/xen/master -b xen/master
$ git pull

After you checked out the kernel, you might want to copy an existing configuration file so you can re-use the result of your hard work you did on previous installations. You should also check the file at include/linux/swiotlb.h and if necessary apply the following patch (it may already be in the git tip, so check first before applying :-)

diff –git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index cb1a663..f4ebffb 100644
— a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -2,6 +2,7 @@
#define __LINUX_SWIOTLB_H

#include <linux/types.h>
+#include <linux/dma-mapping.h>

struct device;
struct dma_attrs;

Configure the kernel, e.g., make menuconfig. Do make sure that you switch ACPI on, because the Xen dom0 config option has a dependency on ACPI and will not show up in the configuration menu’s otherwise. In the end you should have the following configuration options set:

CONFIG_XEN=y
CONFIG_XEN_MAX_DOMAIN_MEMORY=32
CONFIG_XEN_SAVE_RESTORE=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PRIVILEGED_GUEST=y
CONFIG_XEN_DOM0_PCI=y
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_XEN_NETDEV_FRONTEND=y
CONFIG_XEN_KBDDEV_FRONTEND=m
CONFIG_XEN_FBDEV_FRONTEND=m
CONFIG_XEN_BALLOON=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=y
CONFIG_XEN_BACKEND=y
CONFIG_XEN_BLKDEV_BACKEND=y
CONFIG_XEN_NETDEV_BACKEND=y
CONFIG_XENFS=y
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=y
CONFIG_XEN_S3=y
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_XEN=y

Build the kernel as usual using make bzImage && make modules && make modules_install && make install. If you have the options above set right, you should be able to use the same kernel for both dom0 and domU VMs, dom0 being the privileged guest referred to in the above configuration options. If you are using grub, the entry for booting in the Xen VMM and creating the dom0 is the following, if the kernel name extension is set to xen-3.4.1:

title Xen 3.4.2-rc2 / Ubuntu karmic (development branch), kernel 2.6.31
uuid ac5415fa-73d8-47dc-bf6e-eb6004d0484c
kernel /boot/xen-3.4.2-rc1-pre.gz dom0_mem=1048576
module /boot/vmlinuz-2.6.31-xen-3.4.1 root=UUID=ac5415fa-73d8-47dc-bf6e-eb6004d0484c ro console=tty0 earlyprintk=vga
module /boot/initrd.img-2.6.31-xen-3.4.1
quiet

It is advisable to also make an initramfs image, as follows

$ mkinitramfs -o /boot/initrd.img-2.6.31-xen-3.4.1 2.6.31-xen-3.4.1

The next step is then to get the Xen sources. I have used the Xen 3.4-testing mercurial repository. If you do not have mercurial installed on your machine, you might want to apt-get install mercurial. The Xen repository resides at http://xenbits.xensource.com/xen-3.4-testing.hg:

$ hg clone http://xenbits.xensource.com/xen-3.4-testing.hg

Before you proceed, make sure that your Python installation is up to par, if necessary apt-get install python. Since several OS releases, neither Debian, nor Ubuntu have been using the site-packages directory to drop Python packages. They instead use dist-packages, residing in the /usr/lib/python2.X directory. Karmic Koala comes with Python 2.6, so you should symlink /usr/lib/python2.6/dist-packages to /usr/lib.python2.6/site-packages. If you do not, installing xen will create the site-packages directory, which will obviously be missing from your sys.path when running Python. It might also be wise, if you have an older Xen version installed to purge the python directories. I noticed that a reinstall with the latest Xen sources failed to overwrite a bunch of files, rather symlinking to file under /usr/shared. I have no idea why this was the case, but other than that, those files contained code that pointed to obsolete file locations, resulting in the xend daemon not working correctly.

Now, you should be all set up. Build the Xen hypervisor, the Xen tools and if necessary the Xen stub domains:

$ make install-xen
$ make install-tools
$ make install-stubdom

If everything went well, you can now boot into your dom0 domain. After booting the hypervisor, the Linux kernel is started from the location you stated in /boot/grub/menu.lst on the line specifying the module for the kernel.

The final task is setting up a domU. I have chosen to host my guest domain(s) on LVM, so the first task is setting up the LVM itself. This is pretty straighforward. It involves creating at least one physical volume, a volume group, and a logical volume. The first thing to do is assign the correct parition type. In this example I will be using 500GB of a 1TB disk, using the first primary partition, i.e., /dev/sdd1. With fdisk, create a new partition and assign it the type 8e (Linux LVM). The following commands set up a 16GB logical volume. First we create a physical volume, then we set up a volume group named vg0 with physical extends of 8M, and finally, we set up the LV called ubuntu-vm1-disk.

$ pvcreate /dev/sdd1
$ vgcreate -s 8M vg0 /dev/sdd1
$ lvcreate -L 16G -n ubuntu-vm1-disk

The volume needs a file system, which is created using the normal mke2fs tools.

$ mke2fs -t ext3 -m 2 -v /dev/vg0/ubuntu-vm1-disk

For the domU, we also use a Ubunut Karnmic Koala, set up by debootstrap. Note that older Ubuntu distributions should also work, though some of the changes that need to be made might differ. Depending on your location, you might want to use a different mirror.

$ mkdir /mnt/ubuntu
$ mount /dev/vg0/ubuntu-vm1-disk /mnt/ubuntu
$ debootstrap –arch=amd64 karmic /mnt/ubuntu http://ubuntu.mirror.cambrium.nl/ubuntu/

Several files need to be altered or made. etc/fstab should read

proc /proc proc defaults 0 0
/dev/xvda1 / ext3 defaults,errors=remount-ro 0 1

In etc/hosts you should put something like

127.0.0.1 localhost localhost.localdomain
127.0.0.1 problemchild

Finally, you need to fix the Ubuntu upstart process to allow connection to hvc0, otherwise you will end up with domU stating it cannot find a file descriptor for the console. The way to fix is is by copying /etc/init/tty1.conf to /etc/init/hvc0.conf and change the line to read

start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]

respawn
exec /sbin/getty -8 38400 hvc0

The newly created ubuntu installation can now be unmounted, and the domU guest can be set up. This is an example of the domU config for the ProblemChild domU domain, saved to /etc/xen/problem_child.

kernel = “/boot/vmlinuz-2.6.31-xen-3.4.1″
ramdisk = “/boot/initrd.img-2.6.31-xen-3.4.1″
memory = 1024
maxmem = 1024
name = “ProblemChild”
disk = [ 'phy:vg0/ubuntu-vm1-disk,xvda1,w' ]
#dhcp=”dhcp”
ip = “192.168.0.101″
netmask = “255.255.255.0″
gateway = “192.168.0.1″
hostname = “problemchild”
root = “/dev/xvda1 ro”

You should now be able to start the domain by executing

$ xm create problem_child -c

The result should be something like this at the end of the domU boot process:

* Setting preliminary keymap… [ OK ]
* Setting up console font and keymap… [ OK ]
* Starting OpenBSD Secure Shell server sshd [ OK ]

Ubuntu karmic (development branch) assail hvc0

problemchild login:

Performance, scalability, and real-time response from the Linux kernel

July 20th, 2009

The most interesting course, as well as the one I enjoyed most, was on the
performance, scalability and real-time response of the Linux kernel.

Paul McKenney

He opened the course by dropping a question into the unsuspecting audience:
what decades old tech can keep a multi-core busy and yet be easy to program
against. I thought Paul had the idea of a time-sharing machine in his mind, but
the solution was far easier than that: SQL. Given that the frequency increase
of the CPUs is stabilising at naught, we need to find a good way to easily
program against multi-core architectures. Something that rivals the ease of
SQL, where under the hood a lot of stuff is going on, but to the user, it
remains fairly simple. Unlike most of the other people talking about
parallelism, Paul stressed multiple times that if one does not need it, it’s
best to run single threaded. I wholeheartedly agree! On the other hand, if we
parallelise, we should be considering high-level approaches prior to trying to
get the nitty-gritty details right. So, first: get your algorithm in shape. I
think that’s a very good point, given the fact that research papers publishing
tweaks, rather than new algorithms seldom succeed in increasing the performance
with a factor or even a large percentage. Conversely, any performance lost at
the base OS level, cannot be made up by the higher levels, no matter the
algorithm. Context-switches, locks, etc. take a (more-or-less) fixed amount of
time, and that time will be spent anyhow.

The major issue with RT-processes seems that they need to interact with non-RT
processes, I/O (disk, network, etc.). As such, the RT approach has to be
applied across the entire execution stack, if we want to gdet it right.
However, we still need to keep a fair responsiveness for non-RT processes.
Essentially, Paul argues for making tradeoffs, rather that going for the
best-for-a-single-goal apporoach and ignore the rest.

The question raised was why we need to enhance performance. The answer is that
people time is much more costly than machine time these days. So it does no
lomnger pay off to get an engineer trying to enhance the solution. It should be
done automnagically as much as possible. Moreover, general solutions help to
spread the cost over multiple users.

One of the major problem when parallelising programs is that people either do
not grok the issues fully, or try to tackle the problems in the wrong order.
Paul argued that we first need to understand how we can split up the problem
into parts where there is little interaction between the data (as to avoid
excess locking). Only then can we partition the work that is done on that data.
The final step then is to determine which parts can have actual access to the
data, i.e., assign the locks. The matra that was repeated here was that
low-level details really do matter, and that it is important to get them right.
Building on this, the argument was raised what we rely on people who implement
things to have detailed knowledge of the underlying hardware. Unfortunately,
this is not always the case.

The takeaway lesson from the first lecture was this: parallel programming is
bloody hard, because it was designed that way.

Lesson 2 discussed Linux kernel programming environments dealing with: response
times, preemption inside the kernel, non-maskable interrupts, etc. Point made:
if an algorithm runs at a low level, you need interruptible locks. The kernel
comes with a broad aaray of synchronisation primitives, so it is important to
use the right primitives for the right job. For example, use locks that allow
looping in the reader if there are potentially (multiple) writers. Once more,
Paul stressed that synchronisation primitivies are not the first thing to
decide on. We should associate locks and other primitives with each data
partition (that was agreed upon earlier in the design stage). Clearly, it is
not good to have too many data partitions, as that means more locks, and a
higher risk of lock contention. The example used throughout this lesson was
that of a linked list. Should we lock the header? Lock each node? Keep the
locks in the data structure or in some hash array of locks? Key point: provide
protection for each way in which the data can be accessed! A per-cpu locking
mechanism can be used; if done right it scales pretty well.

In lesson 3, Paul tackled the performance and scalability of Linux
applications. Most frameworks (200+) that we once in use have now either faded,
merged, or discontinued. Advice is given not to use or rely on signal handlers.
POSIX primitives were discussed, as were per-thread variables, spinlocks, etc.
Important point was that the use of per-cpu state to lock onto, does not
translate well from kernel to user space. Some remarks were made about the RT
aspects of user-space applications. Should this be enforced? The issue here is
that opening RT behaviour to user-space clears the way for abuse. During the class, he used the (adapted) illustration of the blind philosophers and the elephant:

The five blind penguins and the elephant

Lesson 4 was fully dedicated to real time systems, discussing some of the
implementations in the Linux kernel for dealing with this. Main topics of the
day were timers, high-resolution and others, interrupt handlers that can be
threaded, etc. It was stressed that real time has a broad range of meanings,
going from a few nanoseconds up to 10ms, the latter amounting basically to the
context switch time. Apparently, as a first step, some parts of the kernel can
be preempted, some cannot. The consequence is a reduction of schedular latency,
but nowhere near enough for a RT system at the hiogh end of the scale. Timer
wheels were added to improve locality and queueing, but still certain cascading
operations on this data structure can take a long time. Long enough to warrant
implementing high-resolution timers using RB trees, along with preemptible
spinlocks. Still: greater power means greater responsibility, so care must be
taken. Priority inversion was discussed, and adequaltely illustrated using the
dancing processes.

Real time processes

RCU once more came to the rescue, and it was shown how this can be used in the RT scenario, with priority inversion.

In the final lesson, Paul discussed RT Linux applications.

Real time vs. the Hammer

I guess the above illustration really says it all. The class discussed the
meaning of a hard RT system, and most of use were proven wrong. In some cases
knowing that failure is imminent is more important that guaranteed making the
deadline. (This eems to have some resemblance to writing research papers.) A
combination of an accurate system that is allowed to fail and can indicate it
with a less accurate systemn that guarentees deadline meeting seems to be the
way to go. In any case, maths cannot describe RT systems in practive, and QoS
is more important that hard/soft RT distinction.

RT applications can be divided into three classes: search for life
(medical/industrial control systems), search for death (military) and search
for money (financial). In todays interconnected machine web, the slowest
machine determines the RT nature of the complete system. Multiple serialised
machines have a large impact on this fact. Funny fact: in the Linux kernel,
real time used to mean real life time, rather than deadline meeting. So beware
of the code you rely on!

WCET analysis

July 14th, 2009

One of the courses I am attending deals with worst case execution time analysis or WCET analysis for short.

Peter Puschner

I think that the following are the main points of interest href="http://www.vmars.tuwien.ac.at/people/puschner.html">Peter Puschner
tried to get drilled into our skulls. The course focuses on the worst case
analysis, and for that there are three major items one needs to take into
account: the piece of code to analyse, the input it consumes and the hardware
it runs on. Giving the fact the most code consumes input means that there is
some state associated with it, that determines the paths takes in the static
code graph. Peter clearly stated that the input should be part of the problem
specification, which often complicates the analysis. Additionally, the hardware
has some associated state as well (think caches, branch prediciton, etc.) that
influences the timing of each task we want to analyse. Different instances of
actions in the task have different durations, and as such the sequence of the
actions carries out during each task becomes the important factor. It was
clearly illustrated that measurements are not adequatefor dealing with worst
case scenario’s. I think the lesson to take away from the first class is that
there is no way to measure WCET, one must therefore analyse the program.

The second lesson dealt with a number of approaches for determining the WCET:
tree-based (from the program graph), path-based (potential dynamic execution
paths), and IPET based. Tree-based WCET does not scale well, path=based does,
but that has the disadavantage of getting quite complex as the program size
increases. To deal with this, the IPET technqiue can be used, in which we
desribe the flow of control by using a number of equations that constrain the
possibilities: program flow going ionto a node, must come out of it, so this
yields a number of equations that state how many times an edge can be taken.
The drawback here is that solving the equation using ILP is NP hard in general.
Another technique discussed was the modelling of execution time by mapping a
sequence of instructions to an execution time. This however requires
information on the hardware timing to deal with the various hardware
enhancements to the processor (think pipelines, caches, out=of-order execution,
…). I think this seems to have quite some non-deterministic aspects. Peter
then moved on to show how one can model pipelines, to be used in the WCET
analysis, as well as how one can model caches within the IPET framework. The
takeway lesson here is that there is no straighforward technique to estimate
the WCET, and the most powerful techniques must use hardware timing
information, which is a hard problem in itself.

The third lesson dealt almost exclusively with the modeling of the caches,
moving from a concrete state to an abstract state, which can be reasoned about.
Peter identified 4 essential categories for the analysis in a cache model:
always hit, always miss, globally persistent, locally persistent, plus one
leftover category. For each of these categories, there is a different semantics
for the start state, the update function and the join function on the abstarct
cache model. Finally, we received an introduction to potential timing anolalies
that can occur and how to deal with those.

In the fourth lesson, the focus moved to measuring execution times. Here the
key takeway was that it is very unlikely that the hardware will be set to the
worst case state for executing the application, leading to (at times gross)
underestimates of the WCET. For a WCET bound, more systematic techniques are
required. One of the best performing WCET measurements tools relies on genetic
algorithms for generating the input that drives the WC scenario. Still, for the
benchmarks shown, there can be quite a large discrepancy between the
analytically obtained WCET and the WCET obtained through use of a GA. Other
potential techniques include a probabilisitc approach and program segmentation
with a path based analysis.

In the final class on Friday, Peter discussed some issues with time-predicatble
sofware and hardware. Predicates, transforming execution paths, etc. can be
used to get more reliable WCET bounds. The major takeaway here was that the
classical analysis where one focuses on the mean execution time (with its
distribution, like we did in href="http://www.itkovian.net/base/statistically-rigorous-java-performance-evaluation">our
Java stats paper at OOPSLA) is not
going to work at all for a WCET analysis. Peter also argued for better WCET
bounded hardware, or at least hardware that could make the WCET analysis
easier.

Final words

ACACES 2009 Summer School in Barcelona

July 14th, 2009

Like the previous years, our research group, which is part of the HiPEAC network, organises the ACACES Summer School. Due to the earthquake hitting L’Aquila a few months ago, we’ve relocated to Terrassa, near Barcelona, Spain.


View Larger Map

The facilities are sweet, the food is good (so far), and there is a neat swimming pool. And free WiFi.

Case in point: my room. It had WiFi, and UTP jacks at the desk and the toilet :-)

Room

The building had a very neat design, using rusty metal and metal moveable shutters.

Construction
Back side
Design

The pool was really enjoyable, even at 1:00 AM, though people tended to dive in with their clothes on at that point.

Pool

Next to the main conference hotel, there was an older building, where we held our poster session. And where the wedding party we crashed on Friday was held.

Poster session location

Over the next few days, I’m going to post the notes I have taken in the courses I am attending. Nothing fancy, just what I jotted together.

If you are on twitter, you can also grab the posts tagged with #acaces09, since Alasdair Hawsthorne (previously with Transitive) asked us to use them during his invited talk on Monday evening.

As I have been appointed as the official photographer of the event, pictures will be available in large quantities on flickr. There is also a flickr ACACES group pool where you can add notes, tag the pictures and add your own.

Spamming for patent journals

March 31st, 2009

I have just received a rather interesting email — interesting, not for its contents, but for the idiocy it spells out. I shall refrain from posting it in full, instead focusing on the small interesting part.


An exciting journal entitled “Recent Patents on Computer Science (CSENG)” was launched in January 2008. This journal publishes review articles written by experts on recent patents in the field of Computer Science. Please visit the journal‘s website at www dot compscieng dot org for the Editorial Board, first journal issue, abstracts of recent issues and other details.

Recent Patents on Computer Science (CSENG) is indexed in Genamics JournalSeek, Compendex.

If you would like to submit a review article to the journal on an important patent area in Computer Science, then please provide us the title of your proposed article and a tentative date of submission at editorial@compscieng.org. Moreover in your reply, could you please suggest some specific keywords, keyword phrases related to your topic, so that detailed patents may be sent to you for the preparation of your manuscript.

Please do not hesitate to contact me if you have any queries. I look forward to receiving your positive response.

I went to their website (if you really wish to go there, you’ll have to manually copy and paste the url; I do not link to CS patent promoting people), and skimmed the current issue and found the following.


Recent Patents on Genetic Programming
Michael O’Neill and Anthony Brabazon

Genetic Programming is a form of Natural Computing which adopts principles from neo-Darwinian evolution to automatically solve problems. It is a model induction method in that both the structure and parameters of the solution are explored simultaneously. Genetic Programming is a particularly interesting method as it is claimed to be an invention machine, producing solutions to problems that are competitive and in some cases superior to those produced by human experts. Its best solutions have become patentable inventions in their own right. In this article, we overview some of the recent patents relating to Genetic Programming over the past three years. In light of the number and diversity of patent applications during this period, it is clear that Genetic Programming is a vibrant field of research, which is having a significant impact on real-world applications, and is demonstrating clear commercial potential.

I kid you not. A vibrant field of research these people are trying to kill off, by patenting solutions produced by a bloody search algorithm.

Needless to say, my response was very positive: http://endsoftpatents.org/. I hope they get the bloody clue.