Laws of engineering are meant to be broken

A reader pointed out a very interesting passage in Richard Dawkins’ The Greatest Show on Earth: The Evidence for Evolution on the future possibilities of genome sequencing. Since the book was published in the middle of 2009, it is quite possible the passage was written in 2008, or even earlier.

Unfortunately for Dawkins’ prognostication track-record, but fortunately for science, he was writing at the worst time to make a prediction:

…the doubling time [data produced for a given fixed input] is a bit more than two years, where the Moore’s Law doubling time is a bit less than two years. DNA technology is intensely dependent on computers, so it’s a good guess that Hodgkin’s Law is at least partly dependent on Moore’s Law. The arrows on the right indicate the genome sizes of various creatures. If you follow the arrow towards the left until it hits the sloping line of Hodgkin’s Law, you can read off an estimate of when it will be possible to sequence a gnome the same size as the creature concerned for only £1,000 (of today’s money). For the genome the size of yeast’s, we need to wait only till about 2020. For a new mammal genome…the estimated date is just this side of 2040

Obsolete plot from The Greatest Show on Earth

The cost for a sequence here is somewhat fuzzy. The first assembly of a genome sequence of an organism is much more difficult than subsequent alignments of later organisms (though more in computation than in the sequencing). But, the upshot is that Dawkins was writing when “Hodgkin’s Law” was collapsing. From 2008 to 2011 Moore’s Law was destroyed by the sequencing revolution pushed forward by Illumina.

Though you can get a $1,000 consumer human sequence today, the reality is that this is for 30× coverage. For lower coverage, which means you aren’t as sure of the validity of any given variant, the price drops rapidly. And for the type of evolutionary questions Dawkins is interested in, the coverage needed is far lower than 30× (you probably want to get a larger number of samples than a single high-quality sample).