Sunday, October 28, 2012

Nice Work If You Can Get It - Review

Nice Work If You Can Get It is a frothy bit of Broadway magic. The show features songs by George Gershwin, and a plot loosely (very loosely) based on his 1926 musical, Oh, Kay! Other material comes from the film Delicious, including the immortal love song Blah, Blah, Blah.

The show stars Matthew Broderick, Kelli O'Hara and a great cast. I saw a matinee last Saturday with my wife, my son Daniel, and a friend of his. We all laughed and clapped up a storm.

The thin plot involves Broderick as Jimmy Winter, a dissolute young rich boy who prefers chorus girls to nice girls. But he's marrying a nice girl to secure his inheritance. Until he meets a bootlegging dame, Billie Bendix. Hilarity ensues.

Après-theater dinner at Barbetta was excellent.

Thursday, October 18, 2012

WAMP Step 2: Install PHP

As detailed in our previous post, we are building towards a WordPress installation on a desktop PC. In step 1, we successfully installed and tested the Apache web server.

As mentioned last time, some documents that are served by the web server to your web browser are static documents - just files sitting on the hard drive. Other documents are actually queries into a database, where the data has been dressed up and presented as a web page. Still other documents are mostly static, but with some customization by a script running on the server.

To support WordPress, we are going to need both a database and a script engine. A very popular scripting language for web pages is PHP, and that is what WordPress requires. The rest of this post will walk through installing PHP.

Wednesday, October 17, 2012

WAMP step 1: the Apache httpd server

Our goal in this process is to set up WordPress to serve the test content of a website and blog. Therefore the very first thing we are going to need is a web server.

I should say at the beginning that you don't need a web server to test basic web pages. You can write HTML files with any text editor, and then point your web browser at the file. You can teach yourself a lot of HTML and client-side JavaScript programming in this way. I think it is better to learn the fundamentals of how web pages work by typing them out for yourself instead of starting with a web development environment - you really understand what is going on.

Web servers deliver content to browsers. This content (web pages, music, images, etc.) is either a file on your machine's hard drive, or data stored in a database (on the hard drive). If it is data from the database, then a program (a script) must be run first to turn it into a document that is served out by the web server.
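To make the static/dynamic distinction concrete, here is a minimal sketch in Python. It is not how Apache works internally, just the decision a server has to make for each request. The `serve` function, the `/posts/` URL scheme, and the `posts` table are all my own inventions for illustration, and I'm using Python's built-in SQLite as a stand-in for a real database like MySQL.

```python
import sqlite3
import tempfile
from pathlib import Path


def serve(path, docroot, db):
    """Return the bytes a web server might send back for a requested path."""
    if path.startswith("/posts/"):
        # Dynamic content: a script queries the database and builds the page.
        try:
            post_id = int(path.rsplit("/", 1)[1])
        except ValueError:
            return b"404 Not Found"
        row = db.execute(
            "SELECT title, body FROM posts WHERE id = ?", (post_id,)
        ).fetchone()
        if row is None:
            return b"404 Not Found"
        title, body = row
        return f"<html><h1>{title}</h1><p>{body}</p></html>".encode()
    # Static content: the file on disk is sent back unchanged.
    file_path = Path(docroot) / path.lstrip("/")
    if file_path.is_file():
        return file_path.read_bytes()
    return b"404 Not Found"


# Hypothetical content: one static file and one database row.
docroot = tempfile.mkdtemp()
(Path(docroot) / "about.html").write_text("<html><p>About us</p></html>")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
db.execute("INSERT INTO posts VALUES (1, 'Hello', 'First post')")

static_page = serve("/about.html", docroot, db)   # read straight from disk
dynamic_page = serve("/posts/1", docroot, db)     # built from a database row
```

A real server does much more (security checks on the path, headers, caching), but the fork in the road is the same: either hand back a file, or run a script that produces a document.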

So even before we set up our Apache web server, we need to set up some space on our hard drive where this content will live. Importantly, we want this space to be independent of the Apache server file structure, so that we can back it up separately, and update Apache if necessary without moving our content. We could even change web servers entirely and our content would still be in the same place.

Setting Up WordPress on a Desktop

My wife is setting up her business and wants to have a website with blog, etc. Great! She'll be getting her domain name and contracting with a web hosting company - we haven't decided which at the moment.

For the blog support, I'm going to test WordPress, since it is very popular. Being the geek that I am, this means setting up WordPress on our home desktop machine.

WordPress (WP) runs on top of several other popular Web tools. Apache is a free web server, PHP is a free scripting engine, and MySQL is a free database. If these tools were running together on a machine using the Linux operating system the whole stack would go by the acronym LAMP. Since I'm using a Windows PC, it will be a WAMP stack for me.

While there are products that claim to install WP and all of these prerequisite pieces of software, I'm going to do it myself and write a series of blog posts about the process.

Wish me luck!

ps - The process I will follow is based on Jessie Forrest's blog. Thank you Jessie for putting so much in one place! My posts will include a little background info and changes to the process Jessie posted a few years ago, based on the problems I encountered and how I solved them.

Friday, October 5, 2012

How can a virus see?

Proteorhodopsin genes in giant viruses

This is teh awesome! Rhodopsin is a key protein used in the eye, the first step in capturing light and turning it into sensory information. It can be found even in single celled creatures that swim towards or away from light as a feeding signal. Here, the virus is modifying the behavior of the infected cell with its own rhodopsin gene.

There are many examples of parasites modifying the behavior of their host. A creepy example (aren't all parasites creepy?) is the fungus that makes zombie ants leave their nest and climb up where they will get eaten, which helps the fungus spread.

So how do these giant viruses see with their rhodopsin gene? By making their host see for them. Giant viruses are very different from very small viruses, like the famous Ebola virus. Ebola is less than 20,000 nucleotides long (single stranded RNA) and makes only eight proteins, according to Wikipedia. In contrast, mimivirus has a genome of almost 1.2 million base pairs of DNA. While Ebola just replicates as fast as possible before destroying the host cell, giant viruses are more long lived parasites.

Thursday, October 4, 2012

Evolution: A view from the 21st Century by James A. Shapiro, reviewed by David vun Kannon

In this slim book, eminent bacteriologist James Shapiro attempts to communicate his view of the most important drivers of evolution for a non-specialist readership. The material is dense, mostly because the text does not rise much above an outline of all discovered processes of genetic variation, no matter how obscure.

Shapiro succeeds in conveying the idea that variations arising from genetic change other than uniformly distributed single nucleotide changes are the most important to understanding the diversity of life today. However, his other agendas, such as displacing Crick's Central Dogma of Biology, are not successful.

Let's deal with some of the positives of the book first.

Shapiro is writing for a wide audience, and does not shy away from addressing some issues related to the "Intelligent Design" controversy. Some in the ID community initially took Shapiro to be their friend, in the "enemy of my enemy" sense. However, Shapiro takes the age of the planet and the evolution of life as ground facts.

The book makes extensive use of online appendices and additional reference material. I read the book on the Nook e-reader from Barnes & Noble, and opening the book using the PC version of the Nook reader application made these materials easy to access. Much of the online reference material is linked directly to Pubmed. Online additional readings link mostly to articles from Scientific American - not the primary literature, but an accessible source for the expected audience. These articles span 60 years of publication and many are of historical interest only.

The book is very complete in its coverage of genomic change processes. And while Shapiro's main point is to focus on sources of variation other than point mutation, when it comes to discussing mutation, the book includes intriguing sources such as viruses (in which mutation can happen at much higher frequencies).

Most readers will probably be quite surprised by the importance of genomic processes other than mutation in shaping the course of evolution. Even if you've been following the developments as an interested non-professional, as I have, the variety of processes discussed is sure to teach you something new.

One insight I got was that the machinery that is used to guide protein production inevitably interacts with the machinery of cell duplication (in single celled organisms) and germ line continuation in metazoa. I failed to see previously how often the genome is opened up and read, and how that necessary process creates the chances for things to break and be repaired differently.

However, it must also be said that the book is far from perfect. Indeed, there are many irritations that spoil the enjoyment of learning.

Shapiro seems to feel that it is his job to carry forward the mantle of "unorthodox biologist" worn by Lynn Margulis and Barbara McClintock, among others. He takes several shots at "evolutionists" for missing the importance of jumping genes and symbiosis, while focusing on the population genetics of single random changes.

With all due credit to McClintock and Margulis, Shapiro's rhetorical stance is unhelpful. He does play into the hands of those that would willfully misrepresent his position by using loaded terms such as Darwinism and evolutionist, without defining them and apparently without concern for how these terms have been used in the popular press. If Shapiro means "evolutionary biologist" when he says evolutionist, he should use the less charged term.

Shapiro repeatedly uses a "microprocessor" metaphor that is painfully inappropriate. A computer CPU is a piece of hardware that can execute any series of instructions stored in memory, given the address from which to fetch the first one. The information of the genome is the data, not the hardware that reads or acts on the data. The closest thing in the cell to a CPU is the set of molecules that read, transcribe, and translate DNA into protein - the ribosome.

There are many ribosomes working in parallel in the cell, one of several failures of the analogy. As a system, the protein production and genetic machinery are more like a "production system" - a set of if-then rules that work in parallel. This is software, not hardware, but it fits better.
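For readers who haven't met the term, here is a toy sketch of what I mean by a production system. The rule names are made up and bear only a cartoon resemblance to real cell biology. The point is the shape: a set of if-then rules that all look at the current state and fire together, with no single program counter stepping through instructions.

```python
def run_production_system(rules, state, max_steps=10):
    """Fire every rule whose condition is met, step after step, until nothing new appears."""
    state = set(state)
    for _ in range(max_steps):
        # All rules look at the same snapshot of the state, as if in parallel.
        produced = set()
        for condition, product in rules:
            if condition <= state:  # every required item is present
                produced.add(product)
        if produced <= state:  # nothing new was made, so the system has settled
            break
        state |= produced
    return state


# Hypothetical rules: (things that must be present, thing that gets produced).
rules = [
    (frozenset({"signal"}), "activator"),
    (frozenset({"activator"}), "mRNA"),
    (frozenset({"mRNA", "ribosome"}), "protein"),
]

with_signal = run_production_system(rules, {"signal", "ribosome"})
without_signal = run_production_system(rules, {"ribosome"})
```

With the signal present the rules cascade until protein appears. Without it, nothing fires and the state is unchanged.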

Scientists frequently stretch to find an analogy which will work for a lay reader, to help the reader understand their work. If that is what Shapiro was trying to do, it doesn't work. If he actually thinks the genome instantiated in a cell is a microcircuit, he is sadly mistaken about microelectronics.

Few, if any, people would call a microprocessor "aware", "intelligent", or capable of cognition, yet this book does use such aggressively telic language with respect to the cell and the genome. However, we should only be willing to talk about "cell cognition" if we are also willing to talk about "thermostat cognition". The feedback loops elaborated in the cell are only marginally more complex than your friendly household appliance.

Darwin comes in for some criticism that seems unnecessary, sort of like criticising Newton for not discussing relativity. Yes, Darwin's uniformitarianism was/is a simplification of what we know today, and did reflect philosophical debates of his time. So what? Does this need to be criticised or simply acknowledged?

Repeatedly when dismissing the random mutation of single nucleotides, Shapiro seems to confuse random with 'uniformly distributed' - or read that confusion onto others. We know that SNPs (single nucleotide polymorphisms) are not uniformly distributed. They are more likely in some parts of the DNA string than in others. However, they are random, in the sense that we don't know in advance where a change will take place, even if we know they take place at different frequencies. As an analogy, we know that a sample of a radioactive element has a half-life, but we don't know which atom will decay next.
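The difference between "random" and "uniform" is easy to show with a few lines of Python. This is only a sketch with invented numbers: a ten-position sequence where position 3 is a hotspot, twenty times more likely to change than its neighbors. Every individual draw is unpredictable, yet the totals are anything but uniform.

```python
import random
from collections import Counter


def sample_mutation_sites(weights, n_mutations, seed=None):
    """Draw mutation positions at random, with a different likelihood per position."""
    rng = random.Random(seed)
    positions = list(range(len(weights)))
    return rng.choices(positions, weights=weights, k=n_mutations)


# Hypothetical weights: position 3 is a hotspot.
weights = [1, 1, 1, 20, 1, 1, 1, 1, 1, 1]

sites = sample_mutation_sites(weights, 10_000, seed=1)
counts = Counter(sites)  # how often each position was hit
```

Looking at `counts`, the hotspot collects most of the hits, but nothing in the code tells you which position the next draw will land on.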

Shapiro seems unconcerned with the Darwinian distinction between selection and sources of variation, giving all the credit to the multiple sources of variation, and little or none to the various forms of selection that can act on an organism. There is also no discussion of the "evolution of evolvability" as a framework for understanding the many mechanisms that are cataloged in the book.

A weakness in the writing is to describe the genetic machinery as "indescribably complex" before launching into a description of it! Phrases like 'indescribably complex' are just more fodder for quote mining by creationists. Similarly, Shapiro's overall anti-reductionist stance obscures the fact that all of the data and research are based on a reductionist paradigm - the genetic machinery of the cell is entirely the arrangement of atoms and the forces acting upon them, as is everything else in the cell. There is no vital elixir or special sauce that defies reduction to these terms.

While referring to it several times, Shapiro never successfully attacks the Central Dogma of Biology, that information flows in the direction of DNA to RNA to protein, but not in reverse. There are ample examples given of proteins attaching to and regulating the genome, but those proteins are always created by the genome.

Relevance to the ID debate

The book does mention Intelligent Design. However, it also treats evolution as a fact. Natural genetic engineering, as Shapiro calls it, is the source of variation used by evolution. These large scale additions and rearrangements are the driver of metazoan evolution - not new sequences.

It has been the mistake of ID supporters to try to find an ally in Shapiro. Obviously, they did not read the whole book, or if they did their memory is quite selective as to its contents.

It should be mentioned that Shapiro published two papers with controversial ID figure Dr. Richard von Sternberg in 2005. Sternberg is thanked in the acknowledgements.

Shapiro also indulges in some 'bignum' argumentation. This is the sort of handwaving probability calculation that concludes that "there is not enough time" for base-by-base change to create the evolutionary results we see. This kind of reasoning has often been proved wrong, usually by pointing out that sexual reproduction allows many changes to be selected in parallel throughout a population and combined, and that high rates of reproduction and HGT (horizontal gene transfer) accomplish the same thing in bacteria.

Indeed, Shapiro's own discussion of viruses as sources of variation for other life does not examine the sources of variation in viruses - uncorrected random mutation.

The telic language, anti-"Darwinism", and "gee, it's complicated" attitude are all ID friendly, but in the end Shapiro has a clear vision of who the Intelligent Designer is, and it is the cell itself.

Relation to the GA software paradigm

Much of conventional Genetic Algorithm software is explicitly point mutation based. Mutation and crossover are often the only operators used, and usually mutation is uniform along the genome. This is the strawman view of genetics that Shapiro criticizes most sharply.
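For anyone who hasn't seen one, a conventional GA of the kind I'm describing fits in a couple of dozen lines. This is a minimal sketch, not any particular library's implementation. The function names, the truncation selection scheme, and the parameter values are my own choices, and the fitness function is the classic toy problem of counting ones in a bit string.

```python
import random


def mutate(genome, rate, rng):
    """Flip each bit independently, with the same probability at every position."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def crossover(a, b, rng):
    """Single-point crossover: a prefix from one parent, the rest from the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(fitness, genome_len=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survive and become the parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)


best = evolve(fitness=sum)  # fitness = number of 1 bits in the genome
```

Note that `mutate` treats every position identically and the genome is the phenotype. There is no regulation, no development, and no way for the genome to rearrange itself.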

I think the book can be read as a set of suggestions for improving our GA design if we want to achieve more than numerical optimization with GAs. Here are some jumping-off points that I see:

  • exon/intron distinctions, and redundant representations
  • germ/soma distinctions
  • both of the above presuppose a more robust genotype/phenotype distinction
  • development of that phenotype aka evo-devo GAs
  • fitness testing at multiple points during a phenotype's lifetime
  • gene regulatory networks - genetic operators for regulation of the genome
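As a rough sketch of where a couple of these ideas might lead, here is a genome in which each gene carries an on/off switch, a `develop` step that builds the phenotype from the expressed genes only, and a genetic operator that changes regulation rather than content. All of the names and the representation are hypothetical, and this is one possible reading of the list above rather than an established design.

```python
def develop(genotype):
    """Map genotype to phenotype: only expressed genes contribute."""
    return [value for value, expressed in genotype if expressed]


def toggle_regulation(genotype, index):
    """A regulatory operator: switch one gene on or off without changing its content."""
    return [
        (value, not expressed) if i == index else (value, expressed)
        for i, (value, expressed) in enumerate(genotype)
    ]


# Each gene is a (value, expressed) pair. The silent gene rides along unseen by selection.
genotype = [(1.0, True), (5.0, False), (2.0, True)]

phenotype = develop(genotype)
awakened = develop(toggle_regulation(genotype, 1))
```

The silenced gene behaves like redundant or intron-like material. It is inherited and can accumulate changes without affecting fitness, until a single regulatory event exposes it to selection all at once.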

In conclusion, a book well worth reading and thinking about, even with the annoyances and idiosyncrasies of the author.

Monday, October 1, 2012


I wrote previously that my wife and I were attending the Arthur Murray "Freestyles" event that was held yesterday, so this post is a report on that event.

Wow! It was a lot of fun, a whole day of dancing and socializing with our friends from the Arthur Murray dance studio in Montclair. At a Freestyles event, several Arthur Murray studios get together to give students a venue to perform for a judge, in order to receive impartial feedback. It is not a competitive event, and the emphasis for beginning students (like us) is on performing the syllabus figures for your level, either without additional choreography (closed) or with (open).

Yesterday's Freestyles was held at the Bridgewater Marriott, a centrally located venue for the Arthur Murray studios of Montclair, Kenilworth, Whitehouse Station, and Princeton. The event was well organized. The dancing was by 'heats' of several couples at once on the floor, all dancing the same dance, though each couple was dancing their own choice of steps. Each heat also had a variety of skill levels, and while there were no collisions it was clearly the responsibility of the better dancers and instructors to avoid the less skilled.

Blanka and I had chosen to dance four dances (Rumba, Cha-Cha, Foxtrot, and Tango) both together and with our instructors. As a result, we were each on the floor eight times, across twelve different heats. It was very nice that we were not in the same heat while dancing with our instructors, because we got to enjoy watching each other dance.

The actual dancing was wonderful, without much anxiety about forgetting steps, losing timing, or stepping on the dress of the lady next to me. We had practiced so much leading up to the event that it was almost second nature to dance the dances that we had chosen, and we did dance them 'freestyle', without repeating a set sequence.

As the day progressed, it was obvious that we had taken a very cautious approach to participating in the event. Many students danced multiple heats of the same dance, open and closed, for many more than four dances! There were 166 (!!) heats during the day, so we were sitting and cheering our friends on for most of the day. Next time, we'll know to push ourselves to dance more.

The day included lunch, dinner, a pro show by the instructors from each studio, and casual dancing after dinner. We were quite tired by the end of the evening, and I can't imagine where some of the instructors got their energy, dancing heat after heat with their students.

We had a great time.