Thursday, January 31, 2008

ECJ - building a BinInt problem

I've gotten a little farther with my learning of ECJ.

The tutorial 1 provided with ECJ uses the MaxOnes as a sample problem. MaxOnes simply adds up the number of "1" bits in the genome of an individual to create a fitness for that individual.

BinInt changes the problem by reinterpreting the genome as a binary integer. In my version, I've written it so that you can cut the genome up into several integers. For example, a genome of 150 bits can be interpreted as 30 5-bit integers, 5 30-bit integers, or even 16 9-bit integers with one 6-bit integer.

BinInt is a class of problems that imposes two new kinds of difficulty on the GA. First, BinInt shows scaling. Different bits are worth exponentially more than other bits. Secondly, bits are related to each other to form sub-problems. All the bits of a particular sub-problem contribute to the solution of that part of the overall problem.

Scaling in particular challenges a GA. The large fitness contribution from just a few bits can cause individuals lucky enough to have those bits turned on early in the run to quickly dominate the population, even before the population has found the best arrangement for the other bits. This problem is called premature convergence.

I had fun building the BinInt problem in ECJ. It was straightforward to modify the MaxOnes problem fitness evaluation function. The learning came in when I wanted to create a custom parameter to control the sizes of the integers. I invoked the OTSOG (On The Shoulders Of Giants) principle, and asked for help from the ECJ discussion list, with very helpful and immediate answers. Thanks guys!

Thursday, January 24, 2008

Sketch of comics hero team

Oh yeah, I like comics and manga, too. Sorry I forgot to tell you earlier.

I visited an Oscar site yesterday, and was happy to see that Cate Blanchett was nominated for Elizabeth: The Golden Age. I feel really stupid for not realizing that this movie is a sequel.

Anyway, it got me thinking about a hero team set in the Elizabethean Age.

In the backstory, Elizabeth really is the Faerie Queen. The last test of a new Queen of Faerie is to seduce a mortal lover, but Elizabeth instead falls for the mortal and renounces her throne in Faerie. This choice angers a large number of powerful spirits, who were planning on using Elizabeth's control of two thrones (Faerie and England) to transform England into the New Atlantis. (Not the New Atlantis of Bacon, sorry.)

Our heroes:
Gloriana - Queen Elizabeth in disguise. Or is Elizabeth the disguise, and Gloriana her real identity?
Sir Francis Walsingham - the spymaster of Elizabeth in real life. Here, a bit more "hands on".
Dr. John Dee - a link between the scientific and the supernatural. Capable of a bit of wizardry with the help of the right artifacts.
The Golem of Prague - on loan to Dee from the Maharal of Prague, who made him. Think Ben Grimm with a Yiddish accent.
Thomas Phelippes - asistant of Walsingham. Inventor, cryptographer, etc. The Q to Walsingham's M. Deciphered the Book of Soyga for Dr Dee.

Our villains:
Phillip II of Spain - wants to conquer England and create his own version of the New Atlantis.
Duetia - beautiful faery, would be Queen of Faery if she could eliminate Gloriana.
John Kelley/Talbot - worked with Dr Dee but has fallen under dark influences.
Quetztlcoatl - the Aztec god, making a deal to stay alive in the New World's New World Order.
Talus - a mechanical man/animated suit of armor. Sometimes comic relief for being too literal.

I know you're asking already, who would win a Golem vs. Talus smackdown. Is the occult science of Dr Dee any match for the pure pagan fury of a God? Can Walsingham outwit Phillip? Can Gloriana save England from drowning beneath the waves if Duetia has her way? And what about Naomi?

Find out in our next issue!

Tuesday, January 22, 2008

ECJ - tutorial 1

I set up the Evolutionary Computation in Java (ECJ) system to run under Eclipse. Here is a brief report on using it to run the first tutorial provided with the system. This is the same tutorial covered in the YouTube videos mentioned previously on this blog.



The tutorial description assumes that you are typing in every line of the tutorial1.params file, but since it is provided I went straight to tweaking the different parameters.



The optimization problem in this tutorial is trivial to state. The genome is a set of unrelated bits which are counted, so maximum fitness occurs when all bits are 1, hence the name MaxOnes. As trivial as that sounds, a lot of basic research in GAs is conducted on this and similarly simple functions.



Out of the box, the parameters of interest are


  • crossover probability = 1.0 (we will always cross to get a new individual)

  • mutation probability = 0.01 (every bit has a 1% chance of changing)

  • genome size = 40 bits

  • population size = 40

  • generations = 100



With this setup, we get





Generation 32
Found Ideal Individual


It took 32*40=1280 function evaluations to find the ideal individual of all ones.


Upping the ante, if we double the genome size to 80


Generation 87
Found Ideal Individual

We found it, but doubling the size of the problem more than doubled the effort to solve it. It almost tripled it. That is disturbing.


Doubling once again to a genome of 160 bits, and the system cannot find the ideal individual, even in 1000 generations. Even if the population size is quadrupled to 160, the system fails.


What has happened? Have we reached the edge of evolution already? Is this all that GAs are capable of, solving trivial problems?


Happily, the answer is no. We are very far from the edge of evolution (if there is such a thing), but we are at the beginning of understanding how GAs work.


One of the great themes of search and optimization algorithms is the tradeoff between exploration and exploitation. When do we look in new areas for an optimum, and when do we try to find the best value in the local area of a solution we already know is "pretty good"?


In GA work, the operators of crossover and mutation are often characterized as typifying exploration and exploitation, respectively. I think this is an acceptable characterization wrt mutation, though it isn't the greatest way to think about genetic operators. Another way to think about mutation is to look at it as a kind of heat source, as in simulated annealing.

What happens if we scale back the mutation rate to 0.001 and go back to our original population size of 40?


Generation 219
Found Ideal Individual

Very interesting! What was happening in our earlier runs? The mutation rate was 1 out of 100. When the genome size was less than 100 bits, individuals could get crossed and then never suffer mutation on their way into the new population. But when the genome size went to 160, we were almost guaranteeing that every individual would suffer, on average, 1.6 bits changed. So the population was constantly dancing around the optimum, but never hitting it exactly.

If there is a final lesson to be drawn from this small example, it is that GAs are not about parameter tweaking. There is always a coupling between the probem and the parameters that needs to be understood.

Chipping Away at the Wierdness

Over at the Mindful Hack, Denyse O'Leary defends the study of prayer thusly

Anyone familiar with the placebo effect would consider $2.3 million to
study the power of prayer money well spent.

How should we understand this? Does Denyse really think God is nothing more than the placebo effect? If intercessionary prayer "works", there should be some measurable difference between the placebo effect and people who are prayed for.

I think Denyse would like to help people believe in the objective reality of God. She is, after all, not a journalist but an apologist. This kind of post does not advance that agenda.

However, it is interesting when looked at in the context of many of Denyse's posts on Mindful Hack, Uncommon Descent, and her Design of Life blog. In posts where Denyse tries to channel Carl Zimmer to actually write about science, she winds up writing sentences that carry assumptions such as the reality of deep time, an old earth, and common descent. Is she with Michael Behe on this, and against most of her crank readership?

Also more out of the closet recently is DaveScot, Uncommon Descent's resident dominatrix. Dave fessed up to the same trio in a few posts over there recently, but has saved his crank street credibility by backing some form of front loaded evolution.

Are we seeing some of the loudest cheerleaders of ID talking themselves down? One can only hope.

A Model of Error

Over at UD, math (sic) professor Granville Sewell resurrects one of the silliest posts of 2006, Gil Dodgen's misapprehension of what makes a model. No, the errors in our OS are not part of the model, any more than they are part of your PDE solver or Dr Dembski's MESA or Dr Marks' EIL work.

It is however true that if you are running intense numerical calculations, you need to take into account the quality of your (pseudo)random number generator, your floating point hardware's aptitude for error due to heat, cosmic rays, bad programming by Intel, etc. Taking these things into account is very different from intending them to be part of the model system. No one is going to be impressed by results that need to be run on Windows ME with a lump of uranium lying on the desk next to the CPU.

But it is from this support the Dr Sewell makes a sweeping statement, "Unintelligent forces simply can’t do intelligent things..." Sorry, Dr Sewell. In the words of the prophet, pwned! May I be the first to introduce you to the Humies.

If you're not following the link, I'll just summarize that the Humies are cash based awards for achievement equal or better than human at some intellectual task, by an EA based system. That we can give away awards like this is a testament to the power that GA, GP, etc have already achieved.

Sunday, January 20, 2008

Paprika

I'm not a complete otaku, which is probably why it has taken me until now to notice this movie.

If you appreciated the way The Matrix played with "is this reality?" than you will love++ this anime. The mix of traditional cel animation and cgi was well done and the music was excellent. The playfullness of the OP (opening credits) was amazing and set the tone for the whole movie. Buy it on DVD!

Can't wait for Spring!

My recent look into science standards here in New Jersey popped up the factoid that the Garden Sate has a state dinosaur (as do several other states). Now I've found that the best place to look for fossils here is Poricy Park. Thank you NJGS!

Since you have to wade out into the stream to seive for fossils along the sandbanks, I'll have to wait for the spring before dragging the family out for the fun. This is gonna be great!