Thursday, February 21, 2008

Topping the charts

Long time no blog, sorry!

So what have I been doing? Well, what I'd like to share with you is some hacking of the charting facility in ECJ.

ECJ's distribution contains some references to the inclusion of JFreeChart and iText, which will enable the production of "publication quality charts". OK, I'm all for it, but then come the caveats. "We don't use the Console app much." "We don't really use that."

So what we've got is the linkage to a good free package for charting, but only the sketchiest pointers on how to integrate it into your ECJ experiments.

There were two things about "publication quality" charts that I wanted to do right away. First, charts in journal articles often show multiple data series on the same chart. Second, data published in journal articles is usually averaged over a significant number of GA runs. ECJ doesn't contain an example of either of these things out of the box.

I've hacked together an example of the first functionality. Now I can chart the best of generation, worst, mean, and standard deviation all on the same chart.

I ran my BinInt experiment with the new chart to show that it worked. Wow! So much nicer than looking at the simple statistics!

One thing that jumps out immediately is that the experiment is wasting more than half its function evaluations because the population has converged. While we can guess this is happening when we see the best of generation fitness stall at a particular value, charting the worst, mean and standard deviation makes it completely obvious that before generation 45 (of 100) the entire population is a single genotype.

(Remember, this is a selectorecombinative GA experiment, so there is no mutation to rescue the population. We expect the population to converge well short of the optimum due to domino convergence and drift, and it does.)

Function evaluations are the currency of evolutionary algorithms. While many systems (including ECJ) let you stop a run if a known optimum is reached, stopping on a convergence criterion seems to be just as useful. I'm not sure why this is not more common, since the bookkeeping necessary is not that great an overhead. In general, you can trust the computational cost of the function evaluations to swamp the cost of the algorithm itself.

Now I'll start tinkering with the second issue, averaging over several runs. I'm going to hack this using the subpopulation mechanism of ECJ. Each subpopulation will be an independent run with a different random seed. However, the statistics gathering (and charting) mechanism can look across the values of each subpopulation and collect the statistics and average them. Is this the right way to do this? I'm not sure, but I'll try it!

1 comment:

Anonymous said...

Great stuff. Any chance of including code samples, or attaching source to your post? Would be fantastic if we could see how you've done this.