On April 30, 2008 I successfully held my public PhD defense. You can find the PDF of my dissertation here. I have also put the presentation slides online in PDF format.
This is probably the most important slide I showed:
On April 30th, I am holding my public PhD defense.
Three Pitfalls in Java Performance Evaluation.
Executing a Java application is a complex matter when looking at what happens under the hood. The virtual machine (VM) runs the operations and takes care of class loading, compiling and optimising code, and garbage collection. Due to the interaction between the VM and the application and the non-determinism (e.g., due to time-based sampling) within the VM, no two executions will ever behave exactly alike. The difficulty of performance analysis in this context should not be underestimated. In this dissertation, we uncover three pitfalls that have not been taken into consideration prior to this research.
First, we show that one should not extrapolate performance results from one VM to another, and that small input sets do not necessarily yield behaviour that is representative of large(r) input sets. Second, we demonstrate that prevalent data analysis falls short of the mark in many cases and can result in erroneous conclusions when making performance comparisons. We propose a rigorous statistical approach that deals with the problems posed by non-determinism. We also add rigour to one particular experimental design, namely, replay compilation. Finally, we illustrate that Java applications exhibit phase behaviour at the method level. We exploit this feature to allow a programmer to gain insight into the performance of his application by allowing him to locate bottlenecks and thus optimise his program by removing them.
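To give a flavour of what dealing with non-determinism statistically can look like, here is a minimal sketch that reports a confidence interval for the mean execution time over repeated runs and checks whether two intervals overlap. It is only an illustration under stated assumptions, not necessarily the procedure from the dissertation: the function names are made up for this post, and the normal approximation is only reasonable with a fairly large number of runs (roughly 30 or more).

```python
import math
import statistics


def confidence_interval(samples, confidence=0.95):
    """Return (low, high) bounds for the mean of `samples`.

    Assumption: a normal approximation is used, which needs a
    reasonably large number of measurements to be trustworthy.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.mean(samples)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


def intervals_overlap(a, b):
    """True if the two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

If the intervals for two alternatives overlap, the measurements give no grounds to claim that one is faster than the other; comparing only the two means would hide that.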
As the examination is a public event, everybody can attend. I do ask that you drop a note or leave a comment if you will be attending. Otherwise, you and several other people might find yourselves without drinks at the reception afterward. The event takes place in the Jozef Plateauzaal of the Faculty of Engineering, J. Plateaustraat 22, 9000 Gent, and it starts at 14:00.
While I think Michiel did a great job creating a flyer for a PhD public defense announcement/invitation in Microsoft Word, I think the world can use a Pages’08 template as well. So, based on Michiel’s work, I present the flyer template.
This is the first template I ever built in Pages, and I am much indebted to these nice guidelines on macworld.com. Of course, the template probably needs some polishing, so updated versions might become available in the near future.
I have submitted my PhD dissertation, titled ‘Three pitfalls in Java performance evaluation’, to the Faculty of Engineering, which convened on February 20th, 2008 and decided to admit me to the first (internal) defense. I have sent my dissertation to my jury members, who are (alphabetically ordered):
Basically, the internal defense amounts to a two-hour session in which I have to answer a barrage of questions from my jury. If I pass this test, I will be admitted to the second, public defense. For now, April 11 is D-day.
I am a co-author for a paper accepted at PACT 2006, titled “Performance Prediction based on Inherent Program Similarity”, by K. Hoste, A. Phansalkar, L. Eeckhout, A. Georges, L.K. John and K. De Bosschere. Kenneth Hoste will present it tomorrow.
The paper abstract reads as follows.
A key challenge in benchmarking is to predict the performance of an application of interest on a number of platforms in order to determine which platform yields the best performance. This paper proposes an approach for doing this.
We measure a number of microarchitecture-independent characteristics from the application of interest, and relate these characteristics to the characteristics of the programs from a previously profiled benchmark suite. Based on the similarity of the application of interest with programs in the benchmark suite, we make a performance prediction of the application of interest.
We propose and evaluate three approaches (normalization, principal components analysis and genetic algorithm) to transform the raw data set of microarchitecture-independent characteristics into a benchmark space in which the relative distance is a measure for the relative performance differences.
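To make the idea of a benchmark space a little more concrete, here is a minimal sketch of the simplest of the three transformations, normalization, followed by a nearest-neighbour lookup. The abstract does not spell out the exact prediction procedure, so treat the nearest-neighbour step, the z-score scaling and all function names as assumptions made for illustration only.

```python
import math
import statistics


def fit_normaliser(rows):
    """Build a function that z-scores a row of characteristics.

    The mean and standard deviation of each characteristic (column)
    are taken from the profiled benchmarks in `rows`.
    """
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid
    # dividing by zero.
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]

    def normalise(row):
        return [(v - m) / s for v, m, s in zip(row, means, stdevs)]

    return normalise


def nearest_benchmark(app, benchmarks):
    """Return the name of the benchmark closest to `app`.

    `benchmarks` maps a benchmark name to its list of characteristics;
    `app` is the same kind of list for the application of interest.
    Distance is Euclidean in the normalised space.
    """
    normalise = fit_normaliser(list(benchmarks.values()))
    app_point = normalise(app)
    distances = {
        name: math.dist(app_point, normalise(row))
        for name, row in benchmarks.items()
    }
    return min(distances, key=distances.get)
```

Without the normalization step, a characteristic with large raw values would dominate the distance, which is why the raw data set is transformed before distances are compared.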
We evaluate our approach using all of the SPEC CPU2000 benchmarks and real hardware performance numbers from the SPEC website. Our framework estimates per-benchmark machine ranks with a 0.89 average and a 0.80 worst case rank correlation coefficient.
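For readers unfamiliar with rank correlation, the sketch below computes Spearman's coefficient between a predicted and a measured machine ranking. The abstract does not say which rank correlation coefficient is used, so Spearman's is an assumption here, as are the function names; the shortcut formula also assumes there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(predicted, measured):
    """Spearman rank correlation between two equally long sequences.

    Uses the shortcut 1 - 6 * sum(d^2) / (n * (n^2 - 1)), which is
    only valid when neither sequence contains ties.
    """
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    d_squared = sum(
        (p - m) ** 2 for p, m in zip(ranks(predicted), ranks(measured))
    )
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

A value of 1 means the predicted ordering of the machines matches the measured ordering exactly, and a value of -1 means it is exactly reversed.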
We show that it is possible to make a very good prediction of which machine will be the best performing machine for an application of which we know only the microarchitecture-independent characteristics. You can find a PDF of the paper here.