Quantifying Student Projects

Posted on February 3, 2024 by Troels Henriksen

Futhark’s main developers work as researchers and teachers at the University of Copenhagen. One of the perks of working at a university is a steady supply of free labour, or rather, students looking for projects. Each undergraduate student must at least write a bachelor’s thesis, and each master’s student must at least write a master’s thesis. Beyond these, students can also do elective projects if they wish. When I was a student, I always found it very interesting to participate in real research projects, and I would like current generations of students to have that experience as well. Allowing students to participate in Futhark development benefits all: the students get to spend their time building stuff that (potentially) actually matters, and Futhark improves in ways we wouldn’t have time for otherwise. I wrote a little bit about how we make use of student projects when I wrote my PhD reflections, and again later, but I thought maybe people would be interested in some numbers about how much student work has gone into the Futhark compiler.

Anyone who has worked in academia has experience with software built through student-based development. Such software is maintained by generations of students, each of whom contributes a few pieces that are integrated in whatever way is possible, without any major thought towards long-term maintainability or coherent design. The quality is usually poor, documentation often absent, and problems fixed only by throwing more students at it. This is a risk that I was acutely aware of when we began inviting students to work on the compiler, and we have been quite picky regarding what we ultimately merge.

There are a few reasons why a project may not result in merged code. First, there are of course students who simply do not manage to produce a contribution of the required quality. That is perfectly fine and expected: they are still likely to have passed their project, which is evaluated on their completion of the learning goals, not on how much we were able to exploit their labour.

Second, some exploratory projects may produce a contribution that does work, but which is simply too complex compared to what it offers. The best example is the work by Steffen Holst Larsen on Multi-GPU Execution and a Vulkan backend. Absolutely top notch work, but neither the compiler infrastructure nor the surrounding software environment (in the case of Vulkan) was ready at the time. Merging these contributions would have imposed a nontrivial maintenance burden on us, without truly benefiting users. These were both successful projects, in that we learned things we have made use of since, and Steffen went on to work on compilers at Codeplay.

Let’s talk numbers. First, some caveats. I only have numbers for the projects I have supervised or co-supervised myself. Cosmin Oancea, who founded the Futhark project, has supervised several compiler-related projects, which are not included here. Martin Elsman has also supervised many projects, but mostly about data parallel programming, and less about the compiler itself. I am also leaving out PhD students, as their contributions are on a very different scale.

But as for me, I have (co-)supervised a total of 52 projects: 15 MSc theses, 31 BSc theses, and 6 auxiliary projects. Of these, 15 were not directly related to the compiler or its tooling, but involved such things as implementing parallel algorithms or porting benchmarks. I will not be considering these further, but several of them resulted in work that we still use, or intend to use.

That leaves 37 projects that directly worked on the compiler or its tooling. Of these, 22 resulted in contributions that were integrated in the main compiler code base. I can list a few of the more noteworthy ones:

These range from significant new parts of the compiler (such as new backends), to rewrites of pre-existing older passes (the fusion engine, locality optimisations). The latter are actually the hardest to integrate, as we want to avoid performance regressions, and the current behaviour is often a mix of hacks and ad-hoc implementation quirks.
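
As a quick check of the arithmetic, here is a small Python sketch that tallies the counts given above. The variable names are made up for illustration, and the only inputs are the numbers stated in this post.

```python
# Project counts as stated in the post.
projects = {"MSc theses": 15, "BSc theses": 31, "auxiliary projects": 6}

total = sum(projects.values())            # all (co-)supervised projects
not_compiler = 15                         # parallel algorithms, benchmark ports, etc.
compiler_related = total - not_compiler   # worked on the compiler or its tooling
merged = 22                               # integrated into the main code base

# Fraction of compiler-related projects whose work was merged.
merge_rate = merged / compiler_related

print(f"total projects:      {total}")
print(f"compiler or tooling: {compiler_related}")
print(f"merged:              {merged} ({merge_rate:.0%})")
```

In other words, a bit under 60% of the compiler-related projects ended up contributing code to the main code base.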

Futhark has definitely benefited from student work in the past, and it is certain that we will continue to do so in the future.