Last weekend I competed with a team of VUW students in the ICFP programming contest. This is an annual, international contest lasting 72 hours. It is run by a different university each year, so the format tends to vary from year to year. There are no restrictions on team size, programming languages or computing resources.
Our team (called ‘interfacers’, for want of a better name) consisted in the end (after several people did not end up participating for various reasons, and one person who happened to be in the lab joined us) of Andrew Childs (hereafter referred to as lorne), Timothy Goddard, Clinton Scott, Samuel Hegarty, Michael Welsh (yomcat) and me.
This year, it ran from 10:00 pm on Friday 20th July until 10:00 pm on Monday 23th July. The story behind this year’s task was that an alien (of the Funn species) named Endo had been dumped on Earth by an Interstellar Garbage Collector and then hit by a cargo container. Endo was unconscious, and could not survive on Earth in his then-current form. Therefore our help was urgently needed to provide the necessary modifications to his DNA to adapt him to Earth’s environment. Our proposed modifications were to be evaluated by his spaceship Arrow and the proposal most likely to ensure Endo’s survival would be performed by Arrow.
We were given a copy of Endo’s DNA (a 7.2 MiB string of the letters I, C, F and P), and a specification of how Funn DNA works. The DNA works by repeatedly modifying itself through a long series of matching and replacing according to certain rules (which we were given) until it is all consumed, and in the process producing RNA. We were also provided with a specification of how to transform the RNA into a 600×600 px image. We were given a source image (which is produced by running Endo’s DNA as provided), and a target image (which we were to attempt to reproduce by constructing a prefix to be prepended to Endo’s DNA):
We started by looking at the specification on Friday night when it was released, and talking about it on IRC for a short time. We then started work the next day in one of the computer labs at VUW, with various people turning up between 9:00 am and about 2:30 pm. I started by writing a Java program to render RNA to am image, while Timothy started writing a Haskell program to run DNA and output RNA. When lorne turned up he took over writing the DNA executer, while Timothy wrote various Ruby scripts to generate test cases. yomcat was also writing some test RNA at around this point, as my RNA renderer was mostly finished (apart from a few minor bugfixes and improvements).
This left us mostly waiting on lorne to get the DNA executer working (so that we could start working on creating a prefix), which proved difficult to do efficiently. The problem was that executing the DNA requires around 2 million iterations, each of which would typically involve searching the whole (several million base) string, cutting it up and then joining it together again in a different order and with some parts added and removed. I learnt that I did not know Haskell as well as I had thought, and unfortunately our other team member who knew Haskell well did not end up participating. This meant the rest of us were stuck with little useful to do. I ended up leaving at about 10:00 pm, after eating pizza.
That night I considered starting an alternative implementation of the DNA executer, in C++ or Java. By the time I got in the next day, Timothy, Clinton and Samuel had already started doing so in Java. I joined them, among other things spending quite a bit of time trying to create a Rope-based datastructure to store the DNA string and effeciently perform the operations needed (it should have had θ(log n) time seek, split and concatenation, I believe). Unfortunately I never got this finished in time, and ended up giving up on it on Sunday night (or was it Monday morning?). We ended up using a linked list of arrays, with each list node keeping an offset and length for the part of the array it was actually using. From my understanding, this would have had θ(1) split and append but θ(n) seek (though with a fairly small constant factor).
By this point lorne had his Haskell DNA executer working, but it was very slow and tended to run out of memory. By Monday morning he had it running better, but still taking almost an hour to run on his laptop or several hours on the lab computers.
With this we managed to access various pages of a ‘Funn Field Repair Guide’, the first page of which we found by using a prefix which was written out but then erased when the DNA ran normally. The first page of this gave us a prefix for another image explaining (somewhat cryptically) how to find more pages, and also one which made the Endo image output sunny as a first step towards the target output. We submitted this ‘make-it-sunny’ prefix, which ended up being the furthest we ever got towards the target. I also attempted to work out how to generate prefixes to call gene ‘functions’ in the provided DNA based on some information provided in the repair guide, but unfortunately never got this working.
We (well, Timothy) eventually got our Java DNA executer working on Monday night, but it was not much faster than our Haskell implementation. I ended up leaving for home (and dinner) at around 9:30 pm, as there was no way I could make any progress towards producing a better prefix in the remaining half hour.
We ended up coming 91st-equal, along with 74 other teams who found the same ‘make-it-sunny’ prefix. A large number of teams acheived a slightly better result by reducing the ‘sunny’ prefix from 28 to 27 bases, while still having exactly the same effect.
Shortly after the end of the contest, lorne found some problems which were causing his Haskell to be so slow and managed to get it running in about 3 minutes.
After the contest ended, and after some hints on IRC, I was able to extract a PNG and MP3 from the DNA, which were encoded with a base for each bit after some ASCII-art text visible when viewing the DNA in a 16-character wide column (e.g. in a hex editor).
A lot of other teams also had similar problems getting their DNA executer program working in reasonable time, though some did create some very fast implementations. One team had a C implementation which took about 10 seconds to run.
All in all, our result was a little dissapointing, but it was still fun and I think that I learned something from the experience (at least, that writing a key part of a system in a language that only one member of the team knows well is a bad idea).
In case anybody is interested, our code is available from our git repository.