Thursday, February 21, 2008

Topping the charts

Long time no blog, sorry!

So what have I been doing? Well, what I'd like to share with you is some hacking of the charting facility in ECJ.

ECJ's distribution contains some references to the inclusion of JFreeChart and iText, which will enable the production of "publication quality charts". OK, I'm all for it, but then come the caveats. "We don't use the Console app much." "We don't really use that."

So what we've got is the linkage to a good free package for charting, but only the sketchiest pointers on how to integrate it into your ECJ experiments.

There were two things about "publication quality" charts that I wanted to do right away. First, charts in journal articles often show multiple data series on the same chart. Second, data published in journal articles is usually averaged over a significant number of GA runs. ECJ doesn't contain an example of either of these things out of the box.

I've hacked together an example of the first functionality. Now I can chart the best of generation, worst, mean, and standard deviation all on the same chart.

I ran my BinInt experiment with the new chart to show that it worked. Wow! So much nicer than looking at the simple statistics!

One thing that jumps out immediately is that the experiment is wasting more than half its function evaluations because the population has converged. While we can guess this is happening when we see the best of generation fitness stall at a particular value, charting the worst, mean and standard deviation makes it completely obvious that before generation 45 (of 100) the entire population is a single genotype.

(Remember, this is a selectorecombinative GA experiment, so there is no mutation to rescue the population. We expect the population to converge well short of the optimum due to domino convergence and drift, and it does.)

Function evaluations are the currency of evolutionary algorithms. While many systems (including ECJ) let you stop a run if a known optimum is reached, stopping on a convergence criterion seems to be just as useful. I'm not sure why this is not more common, since the bookkeeping necessary is not that great an overhead. In general, you can trust the computational cost of the function evaluations to swamp the cost of the algorithm itself.

Now I'll start tinkering with the second issue, averaging over several runs. I'm going to hack this using the subpopulation mechanism of ECJ. Each subpopulation will be an independent run with a different random seed. However, the statistics gathering (and charting) mechanism can look across the values of each subpopulation and collect the statistics and average them. Is this the right way to do this? I'm not sure, but I'll try it!

Tuesday, February 12, 2008

The Illusion of Life

The Illusion of Life is actually the name of a beautiful book about the classic Disney style of animation and how it was acheived.

I'm going to use the phrase as a jumping off point to talk about GAs. Genetic Algorithms are explicitly designed to imitate the successful process of optimization seen all around us.

Simple GAs try to tear down the bells and wistles of real biology and use just the core ideas of evolution. Take a finite population, evaluate the fitness of each member. Preferentially select higher fitness members to reproduce or be copied forward. Allow some source of variation to create new population members during reproduction. The most common reproductive operators are mutation and crossover.

A choice of population size, selection algorithm, operators and representation that all work together will take even a simple GA a long way. But GAs are sometimes criticised as being unable (even in principle) to replicate the diversity we see in the natural world.

Most GAs working on an optimization problem are like a microbe evolving drug resistance in a test tube environment. No one is even attempting to recreate the variety of species and niches of real life.

Here are some things that are sometimes brought up as examples of how GAs can be made more "biological", in the belief that this will somehow make them better:
  • haploid genomes
  • inversion operator
  • introns

Now it may be that there are specific optimization problems where each of these tricks might be helpful, but none of them has been shown to be extremely useful.

Notice that these ideas focus on the nitty-gritty of DNA. I have my own list of things I think might be important to bring in to our GA models from biology, but these aren't on my list. Instead, here is my list:
  • embodiment - in time and space
  • coevolution - a fitness measure in which the other members of the population are explicitly considered, including the possibility of parasitism
  • development - genetic cascades, and some form of lifetime fitness measure
  • self regulation - in the form of hormones or neurons
  • sexual selection

I'm much more interested in the possibilities of a boost from ecology than biology. It may be that one day, very powerful GAs will drop some things that are now considered "essential" such as the population itself! Estimation of distribution algorithms are a step in that direction. Until then, I think we do have a lot to learn from Mother Nature, but I think we need to look in some different directions.

Monday, February 11, 2008

Sex, Lies, and Genetic Algorithms

Moving on from the BinInt problem, the next basic research problem to code in ECJ is the classic "deceptive" problem. In this kind of problem, the optimum point is surrounded by very low fitness points, while the points furthest from the global optimum have the second highest fitness. A hill climbing algorithm is tempted to follow the slope away from the global best towards the suboptimal points.
However, a selectorecombinative GA is not a hill climber. Its ability to find the optimum point is based on shuffling combinations of alleles, trying to find good sets.

If the deceptive subproblem is too big, the GA has little better than random chance of finding the optimum, so here also we will rely on building up a problem by concatenating several subproblems.

Based on calculations shown in The Design of Innovation, p. 160, I verified that my code for the trap function was in fact deceptive. Then I looked at a few other sources, including Franz Rothlauf's Representations for Genetic and Evolutionary Algorithms, and I saw that my code for the trap was basically the same as his.

Tuesday, February 5, 2008

Testing my BinInt implementation

As a sanity check, I scaled the size of my subproblems down to 1, and compared the performance of the BinInt and MaxOnes problems with identical parameters. Ta dah! Exactly the same results, as you would expect.

Now I'm dialing up the BinInt size gradually. The effect of scaling is obvious, as many bits converge too early, carried along on the coattails of the high value bits. This series of experiments is with a 0.0 mutation rate, to test the powers of a selectorecombinative GA in isolation. In a problem l bits in size, with population and number of generations also held at l for l^2 function evaluations in total, the best fitness over a run is not even reaching half the known optimum.

I'm also still learning lot's of basic things about ECJ, such as how to chart statistics using the gui console. Woot!