Molecular Machines

Editor's Note: This article presents an overview of the key ideas in biochemist Michael Behe's book Darwin's Black Box: The Biochemical Challenge to Evolution. A more detailed discussion of these ideas can be found in the book itself. Those interested in the debate over intelligent design in biology should also check out Michael Behe's extensive responses to various critics.

Darwinism's Prosperity

Within a short time after Charles Darwin published The Origin of Species the explanatory power of the theory of evolution was recognized by the great majority of biologists. The hypothesis readily resolved the problems of homologous resemblance, rudimentary organs, species abundance, extinction, and biogeography. The rival theory of the time, which posited creation of species by a supernatural being, appeared to most reasonable minds to be much less plausible, since it would have a putative Creator attending to details that seemed to be beneath His dignity.

As time went on the theory of evolution obliterated the rival theory of creation, and virtually all working scientists studied the biological world from a Darwinian perspective. Most educated people now lived in a world where the wonder and diversity of the biological kingdom were produced by the simple, elegant principle of natural selection.

However, in science a successful theory is not necessarily a correct theory. In the course of history there have also been other theories which achieved the triumph that Darwinism achieved, which brought many experimental and observational facts into a coherent framework, and which appealed to people's intuitions about how the world should work. Those theories also promised to explain much of the universe with a few simple principles. But, by and large, those other theories are now dead.

A good example of this is the replacement of Newton's mechanical view of the universe by Einstein's relativistic universe. Although Newton's model accounted for the results of many experiments in his time, it failed to explain aspects of gravitation. Einstein solved that problem and others by completely rethinking the structure of the universe.

Similarly, Darwin's theory of evolution prospered by explaining much of the data of his time and the first half of the 20th century, but my article will show that Darwinism has been unable to account for phenomena uncovered by the efforts of modern biochemistry during the second half of this century. I will do this by emphasizing the fact that life at its most fundamental level is irreducibly complex and that such complexity is incompatible with undirected evolution.

A Series of Eyes

How do we see?

In the 19th century the anatomy of the eye was known in great detail and the sophisticated mechanisms it employs to deliver an accurate picture of the outside world astounded everyone who was familiar with them. Scientists of the 19th century correctly observed that if a person were so unfortunate as to be missing one of the eye's many integrated features, such as the lens, or iris, or ocular muscles, the inevitable result would be a severe loss of vision or outright blindness. Thus it was concluded that the eye could only function if it were nearly intact.

As Charles Darwin was considering possible objections to his theory of evolution by natural selection in The Origin of Species he discussed the problem of the eye in a section of the book appropriately entitled "Organs of extreme perfection and complication." He realized that if in one generation an organ of the complexity of the eye suddenly appeared, the event would be tantamount to a miracle. Somehow, for Darwinian evolution to be believable, the difficulty that the public had in envisioning the gradual formation of complex organs had to be removed.

Darwin succeeded brilliantly, not by actually describing a real pathway that evolution might have used in constructing the eye, but rather by pointing to a variety of animals that were known to have eyes of various constructions, ranging from a simple light sensitive spot to the complex vertebrate camera eye, and suggesting that the evolution of the human eye might have involved similar organs as intermediates.

But the question remains, how do we see? Although Darwin was able to persuade much of the world that a modern eye could be produced gradually from a much simpler structure, he did not even attempt to explain how the simple light sensitive spot that was his starting point actually worked. When discussing the eye Darwin dismissed the question of its ultimate mechanism by stating: "How a nerve comes to be sensitive to light hardly concerns us more than how life itself originated."

He had an excellent reason for declining to answer the question: 19th-century science had not progressed to the point where the matter could even be approached. The question of how the eye works (that is, what happens when a photon of light first impinges on the retina) simply could not be answered at that time. As a matter of fact, no question about the underlying mechanism of life could be answered at that time. How do animal muscles cause movement? How does photosynthesis work? How is energy extracted from food? How does the body fight infection? All such questions were unanswerable.

The Calvin and Hobbes Approach

Now, it appears to be a characteristic of the human mind that when it lacks understanding of a process, it seems easy to imagine simple steps leading from nonfunction to function. A happy example of this is seen in the popular comic strip Calvin and Hobbes. Little boy Calvin is always having adventures in the company of his tiger Hobbes by jumping in a box and traveling back in time, or grabbing a toy ray gun and "transmogrifying" himself into various animal shapes, or again using a box as a duplicator and making copies of himself to deal with worldly powers such as his mom and his teachers. A small child such as Calvin finds it easy to imagine that a box just might be able to fly like an airplane (or something), because Calvin doesn't know how airplanes work.

A good example from the biological world of complex changes appearing to be simple is the belief in spontaneous generation. One of the chief proponents of the theory of spontaneous generation during the middle of the 19th century was Ernst Haeckel, a great admirer of Darwin and an eager popularizer of Darwin's theory. From the limited view of cells that 19th century microscopes provided, Haeckel believed that a cell was a "simple little lump of albuminous combination of carbon", not much different from a piece of microscopic Jell-O. Thus it seemed to Haeckel that such simple life could easily be produced from inanimate material.

In 1859, the year of the publication of The Origin of Species, an exploratory vessel, the H.M.S. Cyclops, dredged up some curious-looking mud from the sea bottom. Eventually Haeckel came to observe the mud and thought that it closely resembled some cells he had seen under a microscope. Excitedly he brought this to the attention of no less a personage than Thomas Henry Huxley, Darwin's great friend and defender, who observed the mud for himself. Huxley, too, became convinced that it was Urschleim (that is, protoplasm), the progenitor of life itself, and Huxley named the mud Bathybius haeckelii after the eminent proponent of abiogenesis.

The mud failed to grow. In later years, with the development of new biochemical techniques and improved microscopes, the complexity of the cell was revealed. The "simple lumps" were shown to contain thousands of different types of organic molecules, proteins, and nucleic acids, many discrete subcellular structures, specialized compartments for specialized processes, and an extremely complicated architecture. Looking back from the perspective of our time, the episode of Bathybius haeckelii seems silly or downright embarrassing, but it shouldn't. Haeckel and Huxley were behaving naturally, like Calvin: since they were unaware of the complexity of cells, they found it easy to believe that cells could originate from simple mud.

Throughout history there have been many other examples, similar to that of Haeckel, Huxley, and the cell, where a key piece of a particular scientific puzzle was beyond the understanding of the age. In science there is even a whimsical term for a machine or structure or process that does something, but the actual mechanism by which it accomplishes its task is unknown: it is called a "black box." In Darwin's time all of biology was a black box: not only the cell, or the eye, or digestion, or immunity, but every biological structure and function because, ultimately, no one could explain how biological processes occurred.

Biology has progressed tremendously due to the model that Darwin put forth. But the black boxes Darwin accepted are now being opened, and our view of the world is again being shaken.

Take our modern understanding of proteins, for example.


In order to understand the molecular basis of life it is necessary to understand how things called "proteins" work. Proteins are the machinery of living tissue that builds the structures and carries out the chemical reactions necessary for life. For example, the first of many steps necessary for the conversion of sugar to biologically-usable forms of energy is carried out by a protein called hexokinase. Skin is made in large measure of a protein called collagen. When light impinges on your retina it interacts first with a protein called rhodopsin. A typical cell contains thousands and thousands of different types of proteins to perform the many tasks necessary for life, much like a carpenter's workshop might contain many different kinds of tools for various carpentry tasks.

What do these versatile tools look like? The basic structure of proteins is quite simple: they are formed by hooking together in a chain discrete subunits called amino acids. Although the protein chain can consist of anywhere from about 50 to about 1,000 amino acid links, each position can only contain one of 20 different amino acids. In this they are much like words: words can come in various lengths but they are made up from a discrete set of 26 letters.
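
To give a rough sense of scale for the word analogy, the sketch below counts how many distinct chains are possible at a given length, assuming each position is filled independently from the 20 amino acids (or the 26 letters). The function name `sequence_count` is hypothetical and introduced only for illustration, and the count says nothing about how many of those chains actually fold or function.

```python
# Alphabet sizes taken from the text: 20 amino acids, 26 letters.
AMINO_ACIDS = 20
LETTERS = 26


def sequence_count(alphabet_size: int, length: int) -> int:
    """Number of distinct chains of `length` positions drawn from the alphabet.

    Assumption (not from the text): every position is chosen independently,
    so the count is simply alphabet_size ** length.
    """
    if alphabet_size < 1 or length < 0:
        raise ValueError("alphabet_size must be >= 1 and length >= 0")
    return alphabet_size ** length


if __name__ == "__main__":
    # Chain lengths of 50 and 1,000 are the approximate bounds given in the text.
    for n in (5, 50, 1000):
        count = sequence_count(AMINO_ACIDS, n)
        # The digit count minus one gives the order of magnitude.
        print(f"{n} residues: on the order of 10^{len(str(count)) - 1} chains")
```

Python integers are arbitrary precision, so the exact count is available even for the longest chains; only the order of magnitude is printed to keep the output readable.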

Now, a protein in a cell does not float around like a floppy chain; rather, it folds up into a very precise structure which can be quite different for different types of proteins. Two different amino acid sequences (two different proteins) can be folded to structures as specific and different from each other as a three-eighths inch wrench and a jigsaw. And like the household tools, if the shape of the proteins is significantly warped then they fail to do their jobs.

The Eyesight of Man

In general, biological processes on the molecular level are performed by networks of proteins, each member of which carries out a particular task in a chain.

Let us return to the question, how do we see? Although to Darwin the primary event of vision was a black box, through the efforts of many biochemists an answer to the question of sight is at hand. The answer involves a long chain of steps that begins when light strikes the retina and a photon is absorbed by an organic molecule called 11-cis-retinal, causing it to rearrange itself within picoseconds. This causes a corresponding change to the protein, rhodopsin, which is tightly bound to it, so that it can react with another protein called transducin, which in turn causes a molecule called GDP to be exchanged for a molecule called GTP.

To make a long story short, this exchange begins a long series of further bindings between still more specialized molecular machinery. Scientists now understand a great deal about the system of gateways, pumps, ion channels, critical concentrations, and attenuated signals that finally results in a current being transmitted down the optic nerve to the brain, where it is interpreted as vision. Biochemists also understand the many chemical reactions involved in restoring all these changed or depleted parts to make a new cycle possible.

To Explain Life

Although space doesn't permit me to give the details of the biochemistry of vision here, I have given the steps in my talks. Biochemists know what it means to "explain" vision. They know the level of explanation that biological science eventually must aim for. In order to say that some function is understood, every relevant step in the process must be elucidated. The relevant steps in biological processes occur ultimately at the molecular level, so a satisfactory explanation of a biological phenomenon such as sight, or digestion, or immunity, must include a molecular explanation.

It is no longer sufficient, now that the black box of vision has been opened, for an "evolutionary explanation" of that power to invoke only the anatomical structures of whole eyes, as Darwin did in the 19th century and as most popularizers of evolution continue to do today. Anatomy is, quite simply, irrelevant. So is the fossil record. It does not matter whether or not the fossil record is consistent with evolutionary theory, any more than it mattered in physics that Newton's theory was consistent with everyday experience. The fossil record has nothing to tell us about, say, whether or how the interactions of 11-cis-retinal with rhodopsin, transducin, and phosphodiesterase could have developed, step by step.

"How a nerve comes to be sensitive to light hardly concerns us more than how life itself originated", said Darwin in the 19th century. But both phenomena have attracted the interest of modern biochemistry in the past few decades. The story of the slow paralysis of research on life's origin is quite interesting, but space precludes its retelling here. Suffice it to say that at present the field of origin-of-life studies has dissolved into a cacophony of conflicting models, each unconvincing, seriously incomplete, and incompatible with competing models. In private even most evolutionary biologists will admit that science has no explanation for the beginning of life.

The same problems which beset origin-of-life research also bedevil efforts to show how virtually any complex biochemical system came about. Biochemistry has revealed a molecular world which stoutly resists explanation by the same theory that has long been applied at the level of the whole organism. Neither of Darwin's black boxes—the origin of life or the origin of vision (or other complex biochemical systems)—has been accounted for by his theory.

Irreducible Complexity

In The Origin of Species Darwin stated: "If it could be demonstrated that any complex organ existed which could not possibly have been formed by numerous, successive, slight modifications, my theory would absolutely break down."

A system which meets Darwin's criterion is one which exhibits irreducible complexity. By irreducible complexity I mean a single system which is composed of several interacting parts that contribute to the basic function, and where the removal of any one of the parts causes the system to effectively cease functioning. An irreducibly complex system cannot be produced directly by slight, successive modifications of a precursor system, since any precursor to an irreducibly complex system is by definition nonfunctional.

Since natural selection requires a function to select, an irreducibly complex biological system, if there is such a thing, would have to arise as an integrated unit for natural selection to have anything to act on. It is almost universally conceded that such a sudden event would be irreconcilable with the gradualism Darwin envisioned. At this point, however, "irreducibly complex" is just a term, whose power resides mostly in its definition. We must now ask if any real thing is in fact irreducibly complex, and, if so, then are any irreducibly complex things also biological systems?

Consider the humble mousetrap (Figure 1). The mousetraps that my family uses in our home to deal with unwelcome rodents consist of a number of parts. There are: 1) a flat wooden platform to act as a base; 2) a metal hammer, which does the actual job of crushing the little mouse; 3) a wire spring with extended ends to press against the platform and the hammer when the trap is charged; 4) a sensitive catch which releases when slight pressure is applied; and 5) a metal bar which holds the hammer back when the trap is charged and connects to the catch. There are also assorted staples and screws to hold the system together.

If any one of the components of the mousetrap (the base, hammer, spring, catch, or holding bar) is removed, then the trap does not function. In other words, the simple little mousetrap has no ability to trap a mouse until several separate parts are all assembled.

Because the mousetrap is necessarily composed of several parts, it is irreducibly complex. Thus, irreducibly complex systems exist.
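
The definition given above can be written down as a simple check: a system qualifies if it works with all of its parts and stops working when any single part is removed. The sketch below is a toy model of that definition only, not a model of any biological system. The names `trap_works` and `is_irreducibly_complex` are hypothetical, and `trap_works` simply encodes the text's claim that all five listed parts are required.

```python
from typing import Callable, FrozenSet

# The five components named in the text.
MOUSETRAP_PARTS: FrozenSet[str] = frozenset(
    {"base", "hammer", "spring", "catch", "holding bar"}
)


def trap_works(parts: FrozenSet[str]) -> bool:
    """Assumed behaviour: the trap functions only if all five parts are present."""
    return MOUSETRAP_PARTS <= parts


def is_irreducibly_complex(
    parts: FrozenSet[str], works: Callable[[FrozenSet[str]], bool]
) -> bool:
    """True if the full system works and removing any single part breaks it."""
    if not works(parts):
        return False
    # Try every one-part removal; all of them must leave a non-working system.
    return all(not works(parts - {part}) for part in parts)


if __name__ == "__main__":
    print(is_irreducibly_complex(MOUSETRAP_PARTS, trap_works))
    # A system carrying a part that the function does not depend on
    # fails the test, because removing that part leaves it working.
    with_extra = MOUSETRAP_PARTS | {"decorative paint"}
    print(is_irreducibly_complex(with_extra, trap_works))
```

The check considers only the removal of one part at a time, which is all the definition asks for. It takes the `works` predicate as given and does not address how such a predicate would be established for a real system.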

Molecular Machines

Now, are any biochemical systems irreducibly complex? Yes, it turns out that many are.

Earlier we discussed proteins. In many biological structures proteins are simply components of larger molecular machines. Like the picture tube, wires, metal bolts and screws that comprise a television set, many proteins are part of structures that only function when virtually all of the components have been assembled.

A good example of this is a cilium. Cilia are hairlike organelles on the surfaces of many animal and lower plant cells that serve to move fluid over the cell's surface or to "row" single cells through a fluid. In humans, for example, epithelial cells lining the respiratory tract each have about 200 cilia that beat in synchrony to sweep mucus towards the throat for elimination.

A cilium consists of a membrane-coated bundle of fibers called an axoneme. An axoneme contains a ring of 9 double microtubules surrounding two central single microtubules. Each outer doublet consists of a ring of 13 filaments (subfiber A) fused to an assembly of 10 filaments (subfiber B). The filaments of the microtubules are composed of two proteins called alpha and beta tubulin. The 11 microtubules forming an axoneme are held together by three types of connectors: subfibers A are joined to the central microtubules by radial spokes; adjacent outer doublets are joined by linkers that consist of a highly elastic protein called nexin; and the central microtubules are joined by a connecting bridge. Finally, every subfiber A bears two arms, an inner arm and an outer arm, both containing the protein dynein.

But how does a cilium work? Experiments have indicated that ciliary motion results from the chemically-powered "walking" of the dynein arms on one microtubule up the neighboring subfiber B of a second microtubule so that the two microtubules slide past each other (Figure 2). However, the protein cross-links between microtubules in an intact cilium prevent neighboring microtubules from sliding past each other by more than a short distance. These cross-links, therefore, convert the dynein-induced sliding motion to a bending motion of the entire axoneme.

Now, let us sit back, review the workings of the cilium, and consider what it implies. Cilia are composed of at least a half dozen proteins: alpha-tubulin, beta-tubulin, dynein, nexin, spoke protein, and a central bridge protein. These combine to perform one task, ciliary motion, and all of these proteins must be present for the cilium to function. If the tubulins are absent, then there are no filaments to slide; if the dynein is missing, then the cilium remains rigid and motionless; if nexin or the other connecting proteins are missing, then the axoneme falls apart when the filaments slide.

What we see in the cilium, then, is not just profound complexity, but it is also irreducible complexity on the molecular scale. Recall that by "irreducible complexity" we mean an apparatus that requires several distinct components for the whole to work. My mousetrap must have a base, hammer, spring, catch, and holding bar, all working together, in order to function. Similarly, the cilium, as it is constituted, must have the sliding filaments, connecting proteins, and motor proteins for function to occur. In the absence of any one of those components, the apparatus is useless.

The components of cilia are single molecules. This means that there are no more black boxes to invoke; the complexity of the cilium is final, fundamental. And just as scientists, when they began to learn the complexities of the cell, realized how silly it was to think that life arose spontaneously in a single step or a few steps from ocean mud, so too we now realize that the complex cilium cannot be reached in a single step or a few steps.

But since the complexity of the cilium is irreducible, it cannot have functional precursors. Since the irreducibly complex cilium cannot have functional precursors, it cannot be produced by natural selection, which requires a continuum of function to work. Natural selection is powerless when there is no function to select. We can go further and say that, if the cilium cannot be produced by natural selection, then the cilium was designed.

A Non-Mechanical Example

A non-mechanical example of irreducible complexity can be seen in the system that targets proteins for delivery to subcellular compartments. In order to find their way to the compartments where they are needed to perform specialized tasks, certain proteins contain a special amino acid sequence near the beginning called a "signal sequence."

As the proteins are being synthesized by ribosomes, a complex molecular assemblage called the signal recognition particle, or SRP, binds to the signal sequence. This causes synthesis of the protein to halt temporarily. During the pause in protein synthesis the SRP is bound by the trans-membrane SRP receptor, which causes protein synthesis to resume and which allows passage of the protein into the interior of the endoplasmic reticulum (ER). As the protein passes into the ER the signal sequence is cut off.

For many proteins the ER is just a way station on their travels to their final destinations. Proteins which will end up in a lysosome are enzymatically "tagged" with a carbohydrate residue called mannose-6-phosphate while still in the ER. An area of the ER membrane then begins to concentrate several proteins; one protein, clathrin, forms a sort of geodesic dome called a coated vesicle which buds off from the ER. In the dome there is also a receptor protein which binds to both the clathrin and to the mannose-6-phosphate group of the protein which is being transported. The coated vesicle then leaves the ER, travels through the cytoplasm, and binds to the lysosome through another specific receptor protein. Finally, in a maneuver involving several more proteins, the vesicle fuses with the lysosome and the protein arrives at its destination.

During its travels our protein interacted with dozens of macromolecules to achieve one purpose: its arrival in the lysosome. Virtually all components of the transport system are necessary for the system to operate, and therefore the system is irreducible. And since all of the components of the system are composed of single or several molecules, there are no black boxes to invoke. The consequences of even a single gap in the transport chain can be seen in the hereditary defect known as I-cell disease. It results from a deficiency of the enzyme that places the mannose-6-phosphate on proteins to be targeted to the lysosomes. I-cell disease is characterized by progressive retardation, skeletal deformities, and early death.

The Study of "Molecular Evolution"

Other examples of irreducible complexity abound, including aspects of protein transport, blood clotting, closed circular DNA, electron transport, the bacterial flagellum, telomeres, photosynthesis, transcription regulation, and much more. Examples of irreducible complexity can be found on virtually every page of a biochemistry textbook. But if these things cannot be explained by Darwinian evolution, how has the scientific community regarded these phenomena of the past forty years?

A good place to look for an answer to that question is in the Journal of Molecular Evolution. JME is a journal that was begun specifically to deal with the topic of how evolution occurs on the molecular level. It has high scientific standards, and is edited by prominent figures in the field. In a recent issue of JME eleven articles were published; of these, all eleven were concerned simply with the analysis of protein or DNA sequences. None of the papers discussed detailed models for intermediates in the development of complex biomolecular structures.

In the past ten years JME has published 886 papers. Of these, 95 discussed the chemical synthesis of molecules thought to be necessary for the origin of life, 44 proposed mathematical models to improve sequence analysis, 20 concerned the evolutionary implications of current structures, and 719 were analyses of protein or polynucleotide sequences. However, none of the papers discussed detailed models for intermediates in the development of complex biomolecular structures. This is not a peculiarity of JME. No papers are to be found that discuss detailed models for intermediates in the development of complex biomolecular structures in the Proceedings of the National Academy of Sciences, Nature, Science, the Journal of Molecular Biology or, to my knowledge, any journal whatsoever.
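
As a rough illustration of the proportions involved, the sketch below works out each category's share of the 886 papers. The counts are taken directly from the paragraph above; the short category labels and the helper name `shares` are hypothetical. The four listed counts add up to 878, so 8 of the 886 papers are not assigned to a category in this summary.

```python
# Counts as stated in the text; labels are shortened for display.
TOTAL_PAPERS = 886
CATEGORIES = {
    "chemical synthesis of origin-of-life molecules": 95,
    "mathematical models for sequence analysis": 44,
    "evolutionary implications of current structures": 20,
    "protein or polynucleotide sequence analyses": 719,
}


def shares(counts: dict, total: int) -> dict:
    """Return each category's percentage of the stated total."""
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in shares(CATEGORIES, TOTAL_PAPERS).items():
        print(f"{name}: {pct:.1f}%")
    listed = sum(CATEGORIES.values())
    # Papers in the stated total that the four categories do not cover.
    print(f"listed in a category: {listed} of {TOTAL_PAPERS}")
    print(f"not assigned to a listed category: {TOTAL_PAPERS - listed}")
```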

Sequence comparisons overwhelmingly dominate the literature of molecular evolution. But sequence comparisons simply can't account for the development of complex biochemical systems any more than Darwin's comparison of simple and complex eyes told him how vision worked. Thus in this area science is mute.

Detection of Design

What's going on? Imagine a room in which a body lies crushed, flat as a pancake. A dozen detectives crawl around, examining the floor with magnifying glasses for any clue to the identity of the perpetrator. In the middle of the room next to the body stands a large, gray elephant. The detectives carefully avoid bumping into the pachyderm's legs as they crawl, and never even glance at it. Over time the detectives get frustrated with their lack of progress but resolutely press on, looking even more closely at the floor. You see, textbooks say detectives must "get their man," so they never consider elephants.

There is an elephant in the roomful of scientists who are trying to explain the development of life. The elephant is labeled "intelligent design." To a person who does not feel obliged to restrict his search to unintelligent causes, the straightforward conclusion is that many biochemical systems were designed. They were designed not by the laws of nature, not by chance and necessity. Rather, they were planned. The designer knew what the systems would look like when they were completed; the designer took steps to bring the systems about. Life on earth at its most fundamental level, in its most critical components, is the product of intelligent activity.

The conclusion of intelligent design flows naturally from the data itself, not from sacred books or sectarian beliefs. Inferring that biochemical systems were designed by an intelligent agent is a humdrum process that requires no new principles of logic or science. It comes simply from the hard work that biochemistry has done over the past forty years, combined with consideration of the way in which we reach conclusions of design every day.

What is "design"? Design is simply the purposeful arrangement of parts. The scientific question is how we detect design. This can be done in various ways, but design can most easily be inferred for mechanical objects.

Systems made entirely from natural components can also evince design. For example, suppose you are walking with a friend in the woods. All of a sudden your friend is pulled high in the air and left dangling by his foot from a vine attached to a tree branch.

After cutting him down you reconstruct the trap. You see that the vine was wrapped around the tree branch, and the end pulled tightly down to the ground. It was securely anchored to the ground by a forked branch. The branch was attached to another vine, hidden by leaves, so that, when the trigger-vine was disturbed, it would pull down the forked stick, releasing the spring-vine. The end of the vine formed a loop with a slipknot to grab an appendage and snap it up into the air. Even though the trap was made completely of natural materials you would quickly conclude that it was the product of intelligent design.

Intelligent design is a good explanation for a number of biochemical systems, but I should insert a word of caution. Intelligent design theory has to be seen in context: it does not try to explain everything. We live in a complex world where lots of different things can happen. When deciding how various rocks came to be shaped the way they are a geologist might consider a whole range of factors: rain, wind, the movement of glaciers, the activity of moss and lichens, volcanic action, nuclear explosions, asteroid impact, or the hand of a sculptor. The shape of one rock might have been determined primarily by one mechanism, the shape of another rock by another mechanism.

Similarly, evolutionary biologists have recognized that a number of factors might have affected the development of life: common descent, natural selection, migration, population size, founder effects (effects that may be due to the limited number of organisms that begin a new species), genetic drift (spread of "neutral," nonselective mutations), gene flow (the incorporation of genes into a population from a separate population), linkage (occurrence of two genes on the same chromosome), and much more. The fact that some biochemical systems were designed by an intelligent agent does not mean that any of the other factors are not operative, common, or important.


It is often said that science must avoid any conclusions which smack of the supernatural. But this seems to me to be both bad logic and bad science. Science is not a game in which arbitrary rules are used to decide what explanations are to be permitted. Rather, it is an effort to make true statements about physical reality. It was only about sixty years ago that the expansion of the universe was first observed. This fact immediately suggested a singular event: that at some time in the distant past the universe began expanding from an extremely small size.

To many people this inference was loaded with overtones of a supernatural event: the creation, the beginning of the universe. The prominent physicist A.S. Eddington probably spoke for many physicists in voicing his disgust with such a notion:

Philosophically, the notion of an abrupt beginning to the present order of Nature is repugnant to me, as I think it must be to most; and even those who would welcome a proof of the intervention of a Creator will probably consider that a single winding-up at some remote epoch is not really the kind of relation between God and his world that brings satisfaction to the mind.

Nonetheless, the big bang hypothesis was embraced by physics and over the years has proven to be a very fruitful paradigm. The point here is that physics followed the data where it seemed to lead, even though some thought the model gave aid and comfort to religion. In the present day, as biochemistry multiplies examples of fantastically complex molecular systems, systems which discourage even an attempt to explain how they may have arisen, we should take a lesson from physics. The conclusion of design flows naturally from the data; we should not shrink from it; we should embrace it and build on it.

In concluding, it is important to realize that we are not inferring design from what we do not know, but from what we do know. We are not inferring design to account for a black box, but to account for an open box. A man from a primitive culture who sees an automobile might guess that it was powered by the wind or by an antelope hidden under the car, but when he opens up the hood and sees the engine he immediately realizes that it was designed. In the same way biochemistry has opened up the cell to examine what makes it run and we see that it, too, was designed.

It was a shock to the people of the nineteenth century when they discovered, from observations science had made, that many features of the biological world could be ascribed to the elegant principle of natural selection. It is a shock to us in the twentieth century to discover, from observations science has made, that the fundamental mechanisms of life cannot be ascribed to natural selection, and therefore were designed. But we must deal with our shock as best we can and go on. The theory of undirected evolution is already dead, but the work of science continues.

Abstract: "DNA and the Origin of Life" appears in the peer-reviewed* volume Darwinism, Design, and Public Education published with Michigan State University Press. Stephen C. Meyer contends that intelligent design provides a better explanation than competing chemical evolutionary models for the origin of the information present in large biomacromolecules such as DNA, RNA, and proteins. Meyer shows that the term information as applied to DNA connotes not only improbability or complexity but also specificity of function. He then argues that neither chance nor necessity, nor the combination of the two, can explain the origin of information starting from purely physical-chemical antecedents. Instead, he argues that our knowledge of the causal powers of both natural entities and intelligent agency suggests intelligent design as the best explanation for the origin of the information necessary to build a cell in the first place.

*Darwinism, Design, and Public Education is an interdisciplinary volume that was peer-reviewed by a professor of biological sciences, a professor of philosophy of science and a professor of rhetoric of science.

Intelligent design begins with a seemingly innocuous question: Can objects, even if nothing is known about how they arose, exhibit features that reliably signal the action of an intelligent cause? To see what’s at stake, consider Mount Rushmore. The evidence for Mount Rushmore’s design is direct: eyewitnesses saw the sculptor Gutzon Borglum spend the better part of his life designing and building this structure. But what if there were no direct evidence for Mount Rushmore’s design? What if humans went extinct and aliens, visiting the earth, discovered Mount Rushmore in substantially the same condition as it is now? In that case, what about this rock formation would provide convincing circumstantial evidence that it was due to a designing intelligence and not merely to wind and erosion? Designed objects like Mount Rushmore exhibit characteristic features or patterns that point to an intelligence. Such features or patterns constitute signs of intelligence. Proponents of intelligent design, known as design theorists, purport to study such signs formally, rigorously, and scientifically. Intelligent design may therefore be defined as the science that studies signs of intelligence.

Because a sign is not the thing signified, intelligent design does not presume to identify the purposes of a designer. Intelligent design focuses not on the designer’s purposes (the thing signified) but on the artifacts resulting from a designer’s purposes (the sign). What a designer intends or purposes is, to be sure, an interesting question, and one may be able to infer something about a designer’s purposes from the designed objects that a designer produces. Nevertheless, the purposes of a designer lie outside the scope of intelligent design. As a scientific research program, intelligent design investigates the effects of intelligence and not intelligence as such.

Intelligent design is controversial because it purports to find signs of intelligence in nature, and specifically in biological systems. According to the evolutionary biologist Francisco Ayala, Darwin’s greatest achievement was to show how the organized complexity of organisms could be attained apart from a designing intelligence. Intelligent design therefore directly challenges Darwinism and other naturalistic approaches to the origin and evolution of life. The idea that an intrinsic intelligence or teleology inheres in and is expressed through nature has a long history and is embraced by many religious traditions. The main difficulty with this idea since Darwin’s day, however, has been to discover a conceptually powerful formulation of design that can fruitfully advance science.

What has kept design outside the scientific mainstream since the rise of Darwinism has been the lack of precise methods for distinguishing intelligently caused objects from unintelligently caused ones. For design to be a fruitful scientific concept, scientists have to be sure that they can reliably determine whether something is designed. Johannes Kepler, for instance, thought the craters on the moon were intelligently designed by moon dwellers. We now know that the craters were formed by purely material factors (like meteor impacts). This fear of falsely attributing something to design, only to have it overturned later, has hindered design from entering the scientific mainstream. But design theorists argue that they now have formulated precise methods for discriminating designed from undesigned objects. These methods, they contend, enable them to avoid Kepler’s mistake and reliably locate design in biological systems. As a theory of biological origins and development, intelligent design’s central claim is that only intelligent causes adequately explain the complex, information-rich structures of biology and that these causes are empirically detectable.

To say intelligent causes are empirically detectable is to say there exist well-defined methods that, based on observable features of the world, can reliably distinguish intelligent causes from undirected natural causes. Many special sciences have already developed such methods for drawing this distinction, notably forensic science, cryptography, archeology, and the search for extraterrestrial intelligence (SETI). Essential to all these methods is the ability to eliminate chance and necessity.

Astronomer Carl Sagan wrote a novel about SETI called Contact, which was later made into a movie. The plot and the extraterrestrials were fictional, but Sagan based the SETI astronomers’ methods of design detection squarely on scientific practice. Real-life SETI researchers have thus far failed to conclusively detect designed signals from distant space, but if they encountered such a signal, as the film’s astronomers did, they too would infer design.

Why did the radio astronomers in Contact draw such a design inference from the signals they monitored from space? SETI researchers run signals collected from distant space through computers programmed to recognize preset patterns. These patterns serve as a sieve. Signals that do not match any of the patterns pass through the sieve and are classified as random. After years of receiving apparently meaningless, random signals, the Contact researchers discovered a pattern of beats and pauses that corresponded to the sequence of all the prime numbers between two and one hundred and one. (Prime numbers are divisible only by themselves and by one.) That startled the astronomers, and they immediately inferred an intelligent cause. When a sequence begins with two beats and then a pause, three beats and then a pause, and continues through each prime number all the way to one hundred and one beats, researchers must infer the presence of an extraterrestrial intelligence.

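The pattern sieve described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding of a beat as `1` and a pause as `0`, and the names `primes_up_to`, `encode_beats`, and `matches_preset`, are hypothetical and are not taken from the film or from any real SETI software.

```python
def primes_up_to(limit):
    """Return all primes <= limit, found by trial division."""
    primes = []
    for n in range(2, limit + 1):
        # n is prime if no smaller prime up to sqrt(n) divides it
        if all(n % p != 0 for p in primes if p * p <= n):
            primes.append(n)
    return primes


def encode_beats(counts, beat="1", pause="0"):
    """Encode each count as that many beats followed by one pause (assumed encoding)."""
    return "".join(beat * c + pause for c in counts)


def matches_preset(signal, patterns):
    """Return the name of the preset pattern the signal equals, or None.

    None corresponds to a signal that passes through the sieve
    and is classified as random.
    """
    for name, pattern in patterns.items():
        if signal == pattern:
            return name
    return None


PRIMES = primes_up_to(101)
PRESETS = {"primes 2..101": encode_beats(PRIMES)}

signal = encode_beats(PRIMES)
print(matches_preset(signal, PRESETS))    # matches the preset prime pattern
print(matches_preset("110100", PRESETS))  # no preset matches, so None
```

A real sieve would presumably tolerate noise and partial matches; exact string equality is used here only to keep the idea visible.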
Here’s the rationale for this inference: Nothing in the laws of physics requires radio signals to take one form or another. The prime sequence is therefore contingent rather than necessary. Also, the prime sequence is long and hence complex. Note that if the sequence were extremely short and therefore lacked complexity, it could easily have happened by chance. Finally, the sequence was not merely complex but also exhibited an independently given pattern or specification (it was not just any old sequence of numbers but a mathematically significant one: the prime numbers).

Intelligence leaves behind a characteristic trademark or signature, what within the intelligent design community is now called specified complexity. An event exhibits specified complexity if it is contingent and therefore not necessary; if it is complex and therefore not readily repeatable by chance; and if it is specified in the sense of exhibiting an independently given pattern. Note that a merely improbable event is not sufficient to eliminate chance: by flipping a coin long enough, one will witness a highly complex or improbable event. Even so, one will have no reason to attribute it to anything other than chance.

The important thing about specifications is that they be objectively given and not arbitrarily imposed on events after the fact. For instance, if an archer fires arrows at a wall and then paints bull’s-eyes around them, the archer imposes a pattern after the fact. On the other hand, if the targets are set up in advance (“specified”), and then the archer hits them accurately, one legitimately concludes that it was by design. The combination of complexity and specification convincingly pointed the radio astronomers in the movie Contact to an extraterrestrial intelligence. Note that the evidence was purely circumstantial: the radio astronomers knew nothing about the aliens responsible for the signal or how they transmitted it.

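The remark about coin flipping can be made concrete with a small calculation. The sketch below assumes a fair coin and independent flips; the function name `sequence_probability` is hypothetical. It shows that every particular sequence of a given length is equally improbable, which is why improbability alone does not single out any one outcome.

```python
from fractions import Fraction


def sequence_probability(n_flips):
    """Probability of one particular sequence of n fair, independent coin flips."""
    return Fraction(1, 2 ** n_flips)


# Any specific run of 10 flips, patterned or not, has the same probability.
print(sequence_probability(10))   # 1/1024
# Longer runs become improbable very quickly.
print(float(sequence_probability(100)))
```

Because the probability depends only on the length of the sequence, an all-heads run and an irregular-looking run of the same length are equally unlikely; what distinguishes them, on the argument in the text, is whether they match an independently given pattern.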
Design theorists contend that specified complexity provides compelling circumstantial evidence for intelligence. Accordingly, specified complexity is a reliable empirical marker of intelligence in the same way that fingerprints are a reliable empirical marker of an individual’s presence. Moreover, design theorists argue that purely material factors cannot adequately account for specified complexity. In determining whether biological organisms exhibit specified complexity, design theorists focus on identifiable systems (e.g., individual enzymes, metabolic pathways, and molecular machines). These systems are not only specified by their independent functional requirements but also exhibit a high degree of complexity.

In Darwin’s Black Box, biochemist Michael Behe connects specified complexity to biological design through his concept of irreducible complexity. Behe defines a system as irreducibly complex if it consists of several interrelated parts for which removing even one part renders the system’s basic function unrecoverable. For Behe, irreducible complexity is a sure indicator of design. One irreducibly complex biochemical system that Behe considers is the bacterial flagellum. The flagellum is an acid-powered rotary motor with a whip-like tail that spins at twenty thousand revolutions per minute and whose rotating motion enables a bacterium to navigate through its watery environment. Behe shows that the intricate machinery in this molecular motor (including a rotor, a stator, O-rings, bushings, and a drive shaft) requires the coordinated interaction of approximately forty complex proteins and that the absence of any one of these proteins would result in the complete loss of motor function. Behe argues that the Darwinian mechanism faces grave obstacles in trying to account for such irreducibly complex systems. In No Free Lunch, William Dembski shows how Behe’s notion of irreducible complexity constitutes a particular instance of specified complexity.

Once an essential constituent of an organism exhibits specified complexity, any design attributable to that constituent carries over to the organism as a whole. To attribute design to an organism one need not demonstrate that every aspect of the organism was designed. Organisms, like all material objects, are products of history and thus subject to the buffeting of purely material factors. Automobiles, for instance, get old and exhibit the effects of corrosion, hail, and frictional forces. But that doesn’t make them any less designed. Likewise design theorists argue that organisms, though exhibiting the effects of history (and that includes Darwinian factors such as genetic mutations and natural selection), also include an ineliminable core that is designed.

Intelligent design’s main tie to religion is through the design argument. Perhaps the best-known design argument is William Paley’s. Paley published his argument in 1802 in a book titled Natural Theology. The subtitle of that book is revealing: Evidences of the Existence and Attributes of the Deity, Collected from the Appearances of Nature. Paley’s project was to examine features of the natural world (what he called “appearances of nature”) and from there draw conclusions about the existence and attributes of a designing intelligence responsible for those features (whom Paley identified with the God of Christianity). According to Paley, if one finds a watch in a field (and thus lacks all knowledge of how the watch arose), the adaptation of the watch’s parts to telling time ensures that it is the product of an intelligence. So too, according to Paley, the marvelous adaptations of means to ends in organisms (like the intricacy of the human eye with its capacity for vision) ensure that organisms are the product of an intelligence.

The theory of intelligent design updates Paley’s watchmaker argument in light of contemporary information theory and molecular biology, purporting to bring this argument squarely within science. In arguing for the design of natural systems, intelligent design is more modest than the design arguments of natural theology. For natural theologians like Paley, the validity of the design argument did not depend on the fruitfulness of design-theoretic ideas for science but on the metaphysical and theological mileage one could get out of design. A natural theologian might point to nature and say, “Clearly, the designer of this ecosystem prized variety over neatness.” A design theorist attempting to do actual design-theoretic research on that ecosystem might reply, “Although that’s an intriguing theological possibility, as a design theorist I need to keep focused on the informational pathways capable of producing that variety.”

In his Critique of Pure Reason, Immanuel Kant claimed that the most the design argument can establish is “an architect of the world who is constrained by the adaptability of the material in which he works, not a creator of the world to whose idea everything is subject.” Far from rejecting the design argument, Kant objected to overextending it. For Kant, the design argument legitimately establishes an architect (that is, an intelligent cause whose contrivances are constrained by the materials that make up the world), but it can never establish a creator who originates the very materials that the architect then fashions. Intelligent design is entirely consonant with this observation by Kant. Creation is always about the source of being of the world. Intelligent design, as the science that studies signs of intelligence, is about arrangements of preexisting materials that point to a designing intelligence. Creation and intelligent design are therefore quite different.

One can have creation without intelligent design and intelligent design without creation. For instance, one can have a doctrine of creation in which God creates the world in such a way that nothing about the world points to design. The evolutionary biologist Richard Dawkins wrote a book titled The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design. Even if Dawkins is right about the universe revealing no evidence of design, it would not logically follow that it was not created. It is logically possible that God created a world that provides no evidence of design. On the other hand, it is logically possible that the world is full of signs of intelligence but was not created. This was the ancient Stoic view, in which the world was eternal and uncreated, and yet a rational principle pervaded the world and produced marks of intelligence in it.

The implications of intelligent design for religious belief are profound. The rise of modern science led to a vigorous attack on all religions that treat purpose, intelligence, and wisdom as fundamental and irreducible features of reality. The high point of this attack came with Darwin’s theory of evolution. The central claim of Darwin’s theory is that an unguided material process (random variation and natural selection) could account for the emergence of all biological complexity and order. In other words, Darwin appeared to show that the design in biology (and, by implication, in nature generally) was dispensable. By showing that design is indispensable to the scientific understanding of the natural world, intelligent design is reinvigorating the design argument and at the same time overturning the widespread misconception that the only tenable form of religious belief is one that treats purpose, intelligence, and wisdom as byproducts of unintelligent material processes.


  • Beckwith, Francis J. Law, Darwinism, and Public Education: The Establishment Clause and the Challenge of Intelligent Design. Lanham, Md., 2003.
  • Behe, Michael J. Darwin’s Black Box: The Biochemical Challenge to Evolution. New York, 1996.
  • Dawkins, Richard. The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design. New York, 1986.
  • Dembski, William A. No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence. Lanham, Md., 2002.
  • Forrest, Barbara. “The Wedge at Work: How Intelligent Design Creationism Is Wedging Its Way into the Cultural and Academic Mainstream.” In Intelligent Design Creationism and Its Critics: Philosophical, Theological, and Scientific Perspectives, edited by Robert T. Pennock, pp. 5–53, Cambridge, Mass., 2001.
  • Giberson, Karl W. and Donald A. Yerxa. Species of Origins: America’s Search for a Creation Story. Lanham, Md., 2002.
  • Hunter, Cornelius G. Darwin’s God: Evolution and the Problem of Evil. Grand Rapids, Mich., 2002.
  • Manson, Neil A., ed. God and Design: The Teleological Argument and Modern Science. London, 2003.
  • Miller, Kenneth R. Finding Darwin’s God: A Scientist’s Search for Common Ground between God and Evolution. San Francisco, 1999.
  • Rea, Michael C. World without Design: The Ontological Consequences of Naturalism. Oxford, 2002.
  • Witham, Larry. By Design: Science and the Search for God. San Francisco, 2003.
  • Woodward, Thomas. Doubts about Darwin: A History of Intelligent Design. Grand Rapids, Mich., 2003.

Bibliographic Essay

Larry Witham provides the best overview of intelligent design, even-handedly treating its scientific, cultural, and religious dimensions. As a journalist, Witham has personally interviewed all the main players in the debate over intelligent design and allows them to tell their story. For intelligent design’s place in the science and religion dialogue, see Giberson and Yerxa. For histories of the intelligent design movement, see Woodward (a supporter) and Forrest (a critic). For an overview of intelligent design’s scientific research program, see Behe and Dembski. For a critique of that program, see Miller. For an impassioned defense of Darwinism against any form of teleology or design, see Dawkins. Manson’s anthology situates intelligent design within broader discussions about teleology. Rea probes intelligent design’s metaphysical underpinnings. Hunter provides an interesting analysis of how intelligent design and Darwinism play off the problem of evil. Beckwith examines whether intelligent design is inherently religious and thus, on account of church-state separation, must be barred from public school science curricula.


For those who are studying aspects of the origin of life, the question no longer seems to be whether life could have originated by chemical processes involving non-biological components but, rather, what pathway might have been followed.

— National Academy of Sciences (1996)

It is 1828, a year that encompassed the death of Shaka, the Zulu king, the passage in the United States of the Tariff of Abominations, and the battle of Las Piedras in South America. It is, as well, the year in which the German chemist Friedrich Wöhler announced the synthesis of urea from cyanic acid and ammonia.

Discovered by H.M. Rouelle in 1773, urea is the chief constituent of urine. Until 1828, chemists had assumed that urea could be produced only by a living organism. Wöhler provided the most convincing refutation imaginable of this thesis. His synthesis of urea was noteworthy, he observed with some understatement, because "it furnishes an example of the artificial production of an organic, indeed a so-called animal substance, from inorganic materials."

Wöhler's work initiated a revolution in chemistry; but it also initiated a revolution in thought. To the extent that living systems are chemical in their nature, it became possible to imagine that they might be chemical in their origin; and if chemical in their origin, then plainly physical in their nature, and hence a part of the universe that can be explained in terms of "the model for what science should be."*

In a letter written to his friend, Sir Joseph Hooker, several decades after Wöhler's announcement, Charles Darwin allowed himself to speculate. Invoking "a warm little pond" bubbling up in the dim inaccessible past, Darwin imagined that given "ammonia and phosphoric salts, light, heat, electricity, etc. present," the spontaneous generation of a "protein compound" might follow, with this compound "ready to undergo still more complex changes" and so begin Darwinian evolution itself.

Time must now be allowed to pass. Shall we say 60 years or so? Working independently, J.B.S. Haldane in England and A.I. Oparin in the Soviet Union published influential studies concerning the origin of life. Before the era of biological evolution, they conjectured, there must have been an era of chemical evolution taking place in something like a pre-biotic soup. A reducing atmosphere prevailed, dominated by methane and ammonia, in which hydrogen atoms, by donating their electrons (and so "reducing" their number), promoted various chemical reactions. Energy was at hand in the form of electrical discharges, and thereafter complex hydrocarbons appeared on the surface of the sea.

The publication of Stanley Miller's paper, "A Production of Amino Acids Under Possible Primitive Earth Conditions," in the May 1953 issue of Science completed the inferential arc initiated by Friedrich Wöhler 125 years earlier. Miller, a graduate student, did his work at the instruction of Harold Urey. Because he did not contribute directly to the experiment, Urey insisted that his name not be listed on the paper itself. But their work is now universally known as the Miller-Urey experiment, providing evidence that a good deed can be its own reward.

By drawing inferences about pre-biotic evolution from ordinary chemistry, Haldane and Oparin had opened an imaginary door. Miller and Urey barged right through. Within the confines of two beakers, they re-created a simple pre-biotic environment. One beaker held water; the other, connected to the first by a closed system of glass tubes, held hydrogen, water, methane, and ammonia. The two beakers were thus assumed to simulate the pre-biotic ocean and its atmosphere. Water in the first could pass by evaporation to the gases in the second, with vapor returning to the original alembic by means of condensation.

Then Miller and Urey allowed an electrical spark to pass continually through the mixture of gases in the second beaker, the gods of chemistry controlling the reactions that followed with very little or no human help. A week after they had begun their experiment, Miller and Urey discovered that in addition to a tarry residue (its most notable product), their potent little planet had yielded a number of the amino acids found in living systems.

The effect among biologists (and the public) was electrifying, all the more so because of the experiment's methodological genius. Miller and Urey had done nothing. Nature had done everything. The experiment alone had parted the cloud of unknowing.

The Double Helix

In April 1953, just four weeks before Miller and Urey would report their results in Science, James Watson and Francis Crick published a short letter in Nature entitled "A Structure for Deoxyribose Nucleic Acid." The letter is now famous, if only because the exuberant Crick, at least, was persuaded that he and Watson had discovered the secret of life. In this he was mistaken: the secret of life, along with its meaning, remains hidden. But in deducing the structure of deoxyribose nucleic acid (DNA) from X-ray diffraction patterns and various chemical details, Watson and Crick had discovered the way in which life at the molecular level replicates itself.

Formed as a double helix, DNA, Watson and Crick argued, consists of two twisted strings facing each other and bound together by struts. Each string comprises a series of four nitrogenous bases: adenine (A), guanine (G), thymine (T), and cytosine (C). The bases are nitrogenous because their chemical activity is determined by the electrons of the nitrogen atom, and they are bases because they are one of two great chemical clans, the other being the acids, with which they combine to form salts.

Within each strand of DNA, the nitrogenous bases are bound to a sugar, deoxyribose. Sugar molecules are in turn linked to each other by a phosphate group. When nucleotides (A, G, T, or C) are connected in a sugar-phosphate chain, they form a polynucleotide. In living DNA, two such chains face each other, their bases touching fingers, A matched to T and C to G. The coincidence between bases is known now as Watson-Crick base pairing.

"It has not escaped our notice," Watson and Crick observed, "that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material" (emphasis added). Replication proceeds, that is, when a molecule of DNA is unzipped along its internal axis, dividing the hydrogen bonds between the bases. Base pairing then works to prompt both strands of a separated double helix to form a double helix anew.

So Watson and Crick conjectured, and so it has proved.
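The pairing rule and the copying mechanism described in this section can be sketched as a toy model in Python. It treats a strand as a plain string of base letters and ignores strand direction, enzymes, and all chemistry; the names `PAIR`, `complement`, and `replicate` are hypothetical and chosen only for illustration.

```python
# Watson-Crick base pairing: A matches T, and C matches G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the partner base for every position of a DNA strand."""
    return "".join(PAIR[base] for base in strand)


def replicate(helix):
    """Unzip a (strand, partner) pair and rebuild a partner for each half."""
    first, second = helix
    return (first, complement(first)), (second, complement(second))


original = ("ATGCCGTA", complement("ATGCCGTA"))
copy_a, copy_b = replicate(original)
print(original)  # ('ATGCCGTA', 'TACGGCAT')
print(copy_a)    # same pair of strands as the original
print(copy_b)    # the same two strands, listed in the opposite order
```

Each separated strand carries enough information to rebuild its partner, so one double helix yields two with the same content.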

The Synthesis of Protein

Together with Francis Crick and Maurice Wilkins, James Watson received the Nobel Prize for medicine in 1962. In his acceptance speech in Stockholm before the king of Sweden, Watson had occasion to explain his original research goals. The first was to account for genetic replication. This, he and Crick had done. The second was to describe the "way in which genes control protein synthesis." This, he was in the course of doing.

DNA is a large, long, and stable molecule. As molecules go, it is relatively inert. It is the proteins, rather, that handle the day-to-day affairs of the cell. Acting as enzymes, and so as agents of change, proteins make possible the rapid metabolism characteristic of modern organisms.

Proteins are formed from the alpha-amino acids, of which there are twenty in living systems. The prefix "alpha" designates the position of the crucial carbon atom in the amino acid, indicating that it lies adjacent to (and is bound up with) a carboxyl group comprising carbon, oxygen, again oxygen, and hydrogen. And the proteins are polymers: like DNA, their amino-acid constituents are formed into molecular chains.

But just how does the cell manage to link amino acids to form specific proteins? This was the problem to which Watson alluded as the king of Sweden, lost in a fog of admiration, nodded amiably.

The success of Watson-Crick base pairing had persuaded a number of molecular biologists that DNA undertook protein synthesis by the same process, the formation of symmetrical patterns or "templates" that governed its replication. After all, molecular replication proceeded by the divinely simple separation-and-recombination of matching (or symmetrical) molecules, with each strand of DNA serving as the template for another. So it seemed altogether plausible that DNA would likewise serve a template function for the amino acids.

It was Francis Crick who in 1957 first observed that this was most unlikely. In a note circulated privately, Crick wrote that "if one considers the physico-chemical nature of the amino-acid side chains, we do not find complementary features on the nucleic acids. Where are the knobby hydrophobic . . . surfaces to distinguish valine from leucine and isoleucine? Where are the charged groups, in specific positions, to go with acidic and basic amino acids?"

Should anyone have missed his point, Crick made it again: "I don't think that anyone looking at DNA or RNA [ribonucleic acid] would think of them as templates for amino acids."

Had these observations been made by anyone but Francis Crick, they might have been regarded as the work of a lunatic; but in looking at any textbook in molecular biology today, it is clear that Crick was simply noticing what was under his nose. Just where are those "knobby hydrophobic surfaces"? To imagine that the nucleic acids form a template or pattern for the amino acids is a little like trying to imagine a glove fitting over a centipede. But if the nucleic acids did not form a template for the amino acids, then the information they contained (all of the ancient wisdom of the species, after all) could only be expressed by an indirect form of transmission: a code of some sort.

The idea was hardly new. The physicist Erwin Schrödinger had predicted in 1945 that living systems would contain what he called a "code script"; and his short, elegant book, What Is Life?, had exerted a compelling influence on every molecular biologist who read it. Ten years later, the ubiquitous Crick invoked the phrase "sequence hypothesis" to characterize the double idea that DNA sequences spell a message and that a code is required to express it. What remained obscure was both the spelling of the message and the mechanism by which it was conveyed.

The mechanism emerged first. During the late 1950's, François Jacob and Jacques Monod advanced the thesis that RNA acts as the first in a chain of intermediates leading from DNA to the amino acids.

Single- rather than double-stranded, RNA is a nucleic acid: a chip from the original DNA block. Instead of thymine (T), it contains the base uracil (U), and the sugar that it employs along its backbone features an atom of oxygen missing from deoxyribose. But RNA, Jacob and Monod argued, was more than a mere molecule: it was a messenger, an instrument of conveyance, "transcribing" in one medium a message first expressed in another. Among the many forms of RNA loitering in the modern cell, the RNA bound for duties of transcription became known, for obvious reasons, as "messenger" RNA.

In transcription, molecular biologists had discovered a second fundamental process, a companion in arms to replication. Almost immediately thereafter, details of the code employed by the messenger appeared. In 1961, Marshall Nirenberg and J. Heinrich Matthaei announced that they had discovered a specific point of contact between RNA and the amino acids. And then, in short order, the full genetic code emerged. RNA (like DNA) is organized into triplets, so that adjacent sequences of three bases are mapped to a single amino acid. Sixty-four triplets (or codons) govern twenty amino acids. The scheme is universal, or almost so.
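The count of sixty-four is simple arithmetic: four bases at each of three positions give 4 × 4 × 4 triplets. A minimal sketch in Python, with illustrative names that are not drawn from the source, enumerates them:

```python
from itertools import product

RNA_BASES = "ACGU"  # the four bases found in RNA

# Every ordered triplet of bases is one codon.
codons = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]

print(len(codons))   # number of distinct triplets
print(codons[:4])    # the first few codons in enumeration order
```

Since there are more triplets than amino acids, several triplets must share an amino acid; the code is redundant rather than one-to-one.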

The elaboration of the genetic code made possible a remarkably elegant model of the modern cell as a system in which sequences of codons within the nucleic acids act at a distance to determine sequences of amino acids within the proteins: commands issued, responses undertaken. A third fundamental biological process thus acquired molecular incarnation. If replication served to divide and then to duplicate the cell's ancestral message, and transcription to re-express it in messenger RNA, "translation" acted to convey that message from messenger RNA to the amino acids.

For all the boldness and power of this thesis, the details remained on the level of what bookkeepers call general accounting procedures. No one had established a direct, a physical, connection between RNA and the amino acids.

Having noted the problem, Crick also indicated the shape of its solution. "I therefore proposed a theory," he would write retrospectively, "in which there were twenty adaptors (one for each amino acid), together with twenty special enzymes. Each enzyme would join one particular amino acid to its own special adaptor."

In early 1969, at roughly the same time that a somber Lyndon Johnson was departing the White House to return to the Pedernales, the adaptors whose existence Crick had predicted came into view. There were twenty, just as he had suggested. They were short in length; they were specific in their action; and they were nucleic acids. Collectively, they are now designated "transfer" RNA (tRNA).

Folded like a cloverleaf, transfer RNA serves physically as a bridge between messenger RNA and an amino acid. One arm of the cloverleaf is called the anti-coding region. The three nucleotide bases that it contains are curved around the arm's bulb-end; they are matched by Watson-Crick base pairing to bases on the messenger RNA. The other end of the cloverleaf is an acceptor region. It is here that an amino acid must go, with the structure of tRNA suggesting a complicated female socket waiting to be charged by an appropriate male amino acid.

The adaptors whose existence Crick had predicted served dramatically to confirm his hypothesis that such adaptors were needed. But although they brought about a physical connection between the nucleic and the amino acids, the fact that they were themselves nucleic acids raised a question: in the unfolding molecular chain, just what acted to adapt the adaptors to the amino acids? And this, too, was a problem Crick both envisaged and solved: his original suggestion mentioned both adaptors (nucleic acids) and their enzymes (proteins).

And so again it proved. The act of matching adaptors to amino acids is carried out by a family of enzymes, and thus by a family of proteins: the aminoacyl-tRNA synthetases. There are as many such enzymes as there are adaptors. The prefix "aminoacyl" indicates a class of chemical reactions, and it is in aminoacylation that the cargo of a carboxyl group is bonded to a molecule of transfer RNA.

Collectively, the enzymes known as synthetases have the power both to recognize specific codons and to select their appropriate amino acid under the universal genetic code. Recognition and selection are ordinarily thought to be cognitive acts. In psychology, they are poorly understood, but within the cell they have been accounted for in chemical terms and so in terms of "the model for what science should be."

With tRNA appropriately charged, the molecule is conveyed to the ribosome, where the task of assembling sequences of amino acids is then undertaken by still another nucleic acid, ribosomal RNA (rRNA). By these means, the modern cell is at last subordinated to a rich narrative drama. To repeat:

  • Replication duplicates the genetic message in DNA.
  • Transcription copies the genetic message from DNA to RNA.
  • Translation conveys the genetic message from RNA to the amino acids - whereupon, in a fourth and final step, the amino acids are assembled into proteins.
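The three steps just listed can be sketched as string operations. This is only a toy illustration under loud assumptions: the template strand is hypothetical, the codon table is a small fragment of the standard code, strand direction is ignored, and none of the enzymes on which the real processes depend appears anywhere in it.

```python
# Watson-Crick pairing for DNA, and the DNA-to-RNA pairing used in transcription.
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}

# A deliberately partial codon table; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "UAA": "*",
}

def replicate(template):
    """Return the complementary DNA strand, base by base."""
    return "".join(DNA_PAIR[base] for base in template)

def transcribe(template):
    """Return the messenger RNA complementary to a DNA template strand."""
    return "".join(RNA_PAIR[base] for base in template)

def translate(mrna):
    """Map successive codons to amino acids until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return peptide

template = "TACAAACCGATT"        # hypothetical DNA template strand
mrna = transcribe(template)      # "AUGUUUGGCUAA"
print(replicate(template), mrna, translate(mrna))
```

The lookup in `CODON_TABLE` is the only step that is not a matter of base pairing, which is why translation needs adaptors where replication and transcription do not.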

The Central Dogma

It was once again Francis Crick, with his remarkable gift for impressing his authority over an entire discipline, who elaborated these facts into what he called the central dogma of molecular biology. The cell, Crick affirmed, is a divided kingdom. Acting as the cell's administrators, the nucleic acids embody all of the requisite wisdom - where to go, what to do, how to manage - in the specific sequence of their nucleotide bases. Administration then proceeds by the transmission of information from the nucleic acids to the proteins.

The central dogma thus depicts an arrow moving one way, from the nucleic acids to the proteins, and never the other way around. But is anything ever routinely returned, arrow-like, from its target? This is not a question that Crick considered, although in one sense the answer is plainly no. Given the modern genetic code, which maps four nucleotides onto twenty amino acids, there can be no inverse code going in the opposite direction; an inverse mapping is mathematically impossible.
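The impossibility of an inverse code is a counting matter: with more codons than amino acids, some amino acid must answer to several codons, and so an amino acid does not determine the codon that specified it. A small sketch, using a fragment of the standard code chosen only for illustration:

```python
from collections import defaultdict
from math import ceil

# Pigeonhole bound: if 64 triplets are shared among 20 amino acids,
# at least one amino acid is served by this many codons.
print(ceil(4 ** 3 / 20))

# A fragment of the standard genetic code (illustrative, not complete).
PARTIAL_CODE = {
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "UUU": "Phe", "UUC": "Phe", "AUG": "Met",
}

# Attempting to invert the mapping yields lists, not single codons.
inverse = defaultdict(list)
for codon, amino_acid in PARTIAL_CODE.items():
    inverse[amino_acid].append(codon)

print(dict(inverse))
```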

But there is another sense in which Crick's central dogma does engender its own reversal. If the nucleic acids are the cell's administrators, the proteins are its chemical executives: both the staff and the stuff of life. The molecular arrow goes one way with respect to information, but it goes the other way with respect to chemistry.

Replication, transcription, and translation represent the grand unfolding of the central dogma as it proceeds in one direction. The chemical activities initiated by the enzymes represent the grand unfolding of the central dogma as it goes in the other. Within the cell, the two halves of the central dogma combine to reveal a system of coded chemistry, an exquisitely intricate but remarkably coherent temporal tableau suggesting a great army in action.

From these considerations a familiar figure now emerges: the figure of a chicken and its egg. Replication, transcription, and translation are all under the control of various enzymes. But enzymes are proteins, and these particular proteins are specified by the cell's nucleic acids. DNA requires the enzymes in order to undertake the work of replication, transcription, and translation; the enzymes require DNA in order to initiate it. The nucleic acids and the proteins are thus profoundly coordinated, each depending upon the other. Without aminoacyl-tRNA synthetase, there is no translation from RNA; but without DNA, there is no synthesis of aminoacyl-tRNA synthetase.

If the nucleic acids and their enzymes simply chased each other forever around the same cell, the result would be a vicious circle. But life has elegantly resolved the circle in the form of a spiral. The aminoacyl-tRNA synthetase that is required to complete molecular translation enters a given cell from its progenitor or "maternal" cell, where it is specified by that cell's DNA. The enzymes required to make the maternal cell's DNA do its work enter that cell from its maternal line. And so forth.

On the level of intuition and experience, these facts suggest nothing more mysterious than the longstanding truism that life comes only from life. Omnia viva ex vivo, as Latin writers said. It is only when they are embedded in various theories about the origins of life that the facts engender a paradox, or at least a question: in the receding molecular spiral, which came first - the chicken in the form of DNA, or its egg in the form of various proteins? And if neither came first, how could life have begun?

The RNA World

It is 1967, the year of the Six-Day War in the Middle East, the discovery of the electroweak forces in particle physics, and the completion of a twenty-year research program devoted to the effects of fluoridation on dental caries in Evanston, Illinois. It is also the year in which Carl Woese, Leslie Orgel, and Francis Crick introduced the hypothesis that "evolution based on RNA replication preceded the appearance of protein synthesis" (emphasis added).

By this time, it had become abundantly clear that the structure of the modern cell was not only more complex than other physical structures but complex in poorly understood ways. And yet no matter how far back biologists traveled into the tunnel of time, certain features of the modern cell were still there, a message sent into the future by the last universal common ancestor. Summarizing his own perplexity in retrospect, Crick would later observe that "an honest man, armed with all the knowledge available to us now, could only state that, in some sense, the origin of life appears at the moment to be almost a miracle." Very wisely, Crick would thereupon determine never to write another paper on the subject, although he did affirm his commitment to the theory of "directed panspermia," according to which life originated in some other portion of the universe and, for reasons that Crick could never specify, was simply sent here.

But that was later. In 1967, the argument presented by Woese, Orgel, and Crick was simple. Given those chickens and their eggs, something must have come first. Two possibilities were struck off by a process of elimination. DNA? Too stable and, in some odd sense, too perfect. The proteins? Incapable of dividing themselves, and so, like molecular eunuchs, useful without being fecund. That left RNA. While it was not obviously the right choice for a primordial molecule, it was not obviously the wrong choice, either.

The hypothesis having been advanced, if with no very great sense of intellectual confidence, biologists differed in its interpretation. But they did concur on three general principles. First: that at some time in the distant past, RNA rather than DNA controlled genetic replication. Second: that Watson-Crick base pairing governed ancestral RNA. And third: that RNA once carried on chemical activities of the sort that are now entrusted to the proteins. The paradox of the chicken and the egg was thus resolved by the hypothesis that the chicken was the egg.

The independent discovery in 1981 of the ribozyme, a ribonucleic enzyme, by Thomas Cech and Sidney Altman endowed the RNA hypothesis with the force of a scientific conjecture. Studying the ciliated protozoan Tetrahymena thermophila, Cech discovered to his astonishment a form of RNA capable of inducing cleavage. Where an enzyme might have been busy pulling a strand of RNA apart, there was a ribozyme doing the work instead. That busy little molecule served not only to give instructions: apparently it took them as well, and in any case it did what biochemists had since the 1920's assumed could only be done by an enzyme and hence by a protein.

In 1986, the biochemist Walter Gilbert was moved to assert the existence of an entire RNA "world," an ancestral state promoted by the magic of this designation to what a great many biologists would affirm as fact. Thus, when the molecular biologist Harry Noller discovered that protein synthesis within the contemporary ribosome is catalyzed by ribosomal RNA (rRNA), and not by any of the familiar, old-fashioned enzymes, it appeared "almost certain" to Leslie Orgel that "there once was an RNA world" (emphasis added).

From Molecular Biology to the Origins of Life

It is perfectly true that every part of the modern cell carries some faint traces of the past. But these molecular traces are only hints. By contrast, to everyone who has studied it, the ribozyme has appeared to be an authentic relic, a solid and palpable souvenir from the pre-biotic past. Its discovery prompted even Francis Crick to the admission that he, too, wished he had been clever enough to look for such relics before they became known.

Thanks to the ribozyme, a great many scientists have become convinced that the "model for what science should be" is achingly close to encompassing the origins of life itself. "My expectation," remarks David Liu, professor of chemistry and chemical biology at Harvard, "is that we will be able to reduce this to a very simple series of logical events." Although often overstated, this optimism is by no means irrational. Looking at the modern cell, biologists propose to reconstruct in time the structures that are now plainly there in space.

Research into the origins of life has thus been subordinated to a rational three-part sequence, beginning in the very distant past. First, the constituents of the cell were formed and assembled. These included the nucleotide bases, the amino acids, and the sugars. There followed next the emergence of the ribozyme, endowed somehow with powers of self-replication. With the stage set, a system of coded chemistry then emerged, making possible what the molecular biologist Paul Schimmel has called "the theater of the proteins." Thus did matters proceed from the pre-biotic past to the very threshold of the last universal common ancestor, whereupon, with inimitable gusto, life began to diversify itself by means of Darwinian principles.

This account is no longer fantasy. But it is not yet fact. That is one reason why retracing its steps is such an interesting exercise, to which we now turn.

Miller Time

It is perhaps four billion years ago. The first of the great eras in the formation of life has commenced. The laws of chemistry are completely in control of things - what else is there? It is Miller Time, the period marking the transition from inorganic to organic chemistry.

According to the impression generally conveyed in both the popular and the scientific literature, the success of the original Miller-Urey experiment was both absolute and unqualified. This, however, is something of an exaggeration. Shortly after Miller and Urey published their results, a number of experienced geochemists expressed reservations. Miller and Urey had assumed that the pre-biotic atmosphere was one in which hydrogen atoms gave up (reduced) their electrons in order to promote chemical activity. Not so, the geochemists contended. The pre-biotic atmosphere was far more nearly neutral than reductive, with little or no methane and a good deal of carbon dioxide.

Nothing in the intervening years has suggested that these sour geochemists were far wrong. Writing in the 1999 issue of Peptides, B.M. Rode observed blandly that "modern geochemistry assumes that the secondary atmosphere of the primitive earth (i.e., after diffusion of hydrogen and helium into space) . . . consisted mainly of carbon dioxide, nitrogen, water, sulfur dioxide, and even small amounts of oxygen." This is not an environment calculated to induce excitement.

Until recently, the chemically unforthcoming nature of the early atmosphere remained an embarrassing secret among evolutionary biologists, like an uncle known privately to dress in women's underwear; if biologists were disposed in public to acknowledge the facts, they did so by remarking that every family has one. This has now changed. The issue has come to seem troubling. A recent paper in Science has suggested that previous conjectures about the pre-biotic atmosphere were seriously in error. A few researchers have argued that a reducing atmosphere is not, after all, quite so important to pre-biotic synthesis as previously imagined.

In all this, Miller himself has maintained a far more unyielding and honest perspective. "Either you have a reducing atmosphere," he has written bluntly, "or you're not going to have the organic compounds required for life."

If the composition of the pre-biotic atmosphere remains a matter of controversy, this can hardly be considered surprising: geochemists are attempting to revisit an era that lies four billion years in the past. The synthesis of pre-biotic chemicals is another matter. Questions about them come under the discipline of laboratory experiments.

Among the questions is one concerning the nitrogenous base cytosine (C). Not a trace of the stuff has been found in any meteor. Nothing in comets, either, so far as anyone can tell. It is not buried in the Antarctic. Nor can it be produced by any of the common experiments in pre-biotic chemistry. Beyond the living cell, it has not been found at all.

When, therefore, M.P. Robertson and Stanley Miller announced in Nature in 1995 that they had specified a plausible route for the pre-biotic synthesis of cytosine from cyanoacetaldehyde and urea, the feeling of gratification was very considerable. But it has also been short-lived. In a lengthy and influential review published in the 1999 Proceedings of the National Academy of Sciences, the New York University chemist Robert Shapiro observed that the reaction on which Robertson and Miller had pinned their hopes, although active enough, ultimately went nowhere. All too quickly, the cytosine that they had synthesized transformed itself into the RNA base uracil (U) by a chemical reaction known as deamination, which is nothing more mysterious than the process of getting rid of one molecule by sending it somewhere else.

The difficulty, as Shapiro wrote, was that "the formation of cytosine and the subsequent deamination of the product to uracil occur[ed] at about the same rate." Robertson and Miller had themselves reported that after 120 hours, half of their precious cytosine was gone, and it went faster when their reactions took place in saturated urea. In Shapiro's words, "It is clear that the yield of cytosine would fall to 0 percent if the reaction were extended."
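Why equal rates are self-defeating can be seen in a toy kinetic model. The sketch below assumes, purely for illustration, that both steps (precursor to cytosine, cytosine to uracil) are first-order reactions in a closed vessel, and it borrows the 120-hour half-life mentioned above to set the rate constant; neither Shapiro nor Robertson and Miller are quoted here as giving this rate law.

```python
import math

HALF_LIFE_H = 120.0               # from the text: half the cytosine gone after 120 hours
k = math.log(2) / HALF_LIFE_H     # assumed first-order rate constant, per hour

def cytosine_fraction(t_hours, k_form=k, k_deam=k):
    """Fraction of the starting material present as cytosine at time t,
    for precursor -> cytosine -> uracil with both steps assumed first order."""
    if math.isclose(k_form, k_deam):
        # Equal rate constants: the intermediate follows k * t * exp(-k * t).
        return k_form * t_hours * math.exp(-k_form * t_hours)
    return (k_form / (k_deam - k_form)) * (
        math.exp(-k_form * t_hours) - math.exp(-k_deam * t_hours)
    )

for t in (0, 120, 1 / k, 1000, 5000):
    print(f"{t:8.0f} h   cytosine fraction = {cytosine_fraction(t):.4f}")
```

Under these assumptions the cytosine level rises, peaks at a fraction of 1/e of the starting material when t = 1/k, and then decays toward zero, which is the behavior Shapiro describes when the reaction is extended.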

If the central chemical reaction favored by Robertson and Miller was self-defeating, it was also contingent on circumstances that were unlikely. Concentrated urea was needed to prompt their reaction; an outhouse whiff would not do. For this same reason, however, the pre-biotic sea, where concentrates disappear too quickly, was hardly the place to begin - as anyone who has safely relieved himself in a swimming pool might confirm with guilty satisfaction. Aware of this, Robertson and Miller posited a different set of circumstances: in place of the pre-biotic soup, drying lagoons. In a fine polemical passage, their critic Shapiro stipulated what would thereby be required:

An isolated lagoon or other body of seawater would have to undergo extreme concentration. . . .

  • It would further be necessary that the residual liquid be held in an impermeable vessel [in order to prevent cross-reactions].
  • The concentration process would have to be interrupted for some decades . . . to allow the reaction to occur.
  • At this point, the reaction would require quenching (perhaps by evaporation to dryness) to prevent loss by deamination.

At the end, one would have a batch of urea in solid form, containing some cytosine (and uracil).

Such a scenario, Shapiro remarked, "cannot be excluded as a rare event on early earth, but it cannot be termed plausible."

Like cytosine, sugar must also make an appearance in Miller Time, and, like cytosine, it too is difficult to synthesize under plausible pre-biotic conditions.

In 1861, the Russian chemist Alexander Butlerov created a sugar-like substance from a mixture of formaldehyde and lime. Subsequently refined by a long line of organic chemists, Butlerov's so-called formose reaction has been an inspiration to origins-of-life researchers ever since.

The reaction is today initiated by an alkalizing agent, such as thallium or lead hydroxide. There follows a long induction period, with a number of intermediates bubbling up. The formose reaction is auto-catalytic in the sense that it keeps on going: the carbohydrates that it generates serve to prime the reaction in an exponentially growing feedback loop until the initial stock of formaldehyde is exhausted. With the induction over, the formose reaction yields a number of complex sugars.

Nonetheless, it is not sugars in general that are wanted from Miller Time but a particular form of sugar, namely, ribose, and not simply ribose but dextro ribose. Compounds of carbon are naturally right-handed or left-handed, depending on how they polarize light. The ribose in living systems is right-handed, hence the prefix "dextro." But the sugars exiting the formose reaction are racemic, that is, both left- and right-handed, and the yield of usable ribose is negligible.

While nothing has as yet changed the fundamental fact that it is very hard to get the right kind of sugar from any sort of experiment, in 1990 the Swiss chemist Albert Eschenmoser was able to change substantially the way in which the sugars appeared. Reaching with the hand of a master into the formose reaction itself, Eschenmoser altered two molecules by adding a phosphate group to them. This slight change prevented the formation of the alien sugars that cluttered the classical formose reaction. The products, Eschenmoser reported, included among other things a mixture of ribose-2,4-diphosphate. Although the mixture was racemic, it did contain a molecule close to the ribose needed by living systems. With a few chemical adjustments, Eschenmoser could plausibly claim, the pre-biotic route to the synthesis of sugar would lie open.

It remained for skeptics to observe that Eschenmoser's ribose reactions were critically contingent on Eschenmoser himself, and at two points: the first when he attached phosphate groups to a number of intermediates in the formose reaction, and the second when he removed them.

What had given the original Miller-Urey experiment its power to excite the imagination was the sense that, having set the stage, Miller and Urey exited the theater. By contrast, Eschenmoser remained at center stage, giving directions and in general proving himself indispensable to the whole scene.

Events occurring in Miller Time would thus appear to depend on the large assumption, still unproved, that the early atmosphere was reductive, while two of the era's chemical triumphs, cytosine and sugar, remain for the moment beyond the powers of contemporary pre-biotic chemistry.

From Miller Time to Self-Replicating RNA

In the grand progression by which life arose from inorganic matter, Miller Time has been concluded. It is now 3.8 billion years ago. The chemical precursors to life have been formed. A limpid pool of nucleotides is somewhere in existence. A new era is about to commence.

The historical task assigned to this era is a double one: forming chains of nucleic acids from nucleotides, and discovering among them those capable of reproducing themselves. Without the first, there is no RNA; and without the second, there is no life.

In living systems, polymerization or chain-formation proceeds by means of the cell's invaluable enzymes. But in the grim inhospitable pre-biotic, no enzymes were available. And so chemists have assigned their task to various inorganic catalysts. J.P. Ferris and G. Ertem, for instance, have reported that activated nucleotides bond covalently when embedded on the surface of montmorillonite, a kind of clay. This example, combining technical complexity with general inconclusiveness, may stand for many others.

In any event, polymerization having been concluded, by whatever means, the result was (in the words of Gerald Joyce and Leslie Orgel) "a random ensemble of polynucleotide sequences": long molecules emerging from short ones, like fronds on the surface of a pond. Among these fronds, nature is said to have discovered a self-replicating molecule. But how?

Darwinian evolution is plainly unavailing in this exercise or that era, since Darwinian evolution begins with self-replication, and self-replication is precisely what needs to be explained. But if Darwinian evolution is unavailing, so, too, is chemistry. The fronds comprise "a random ensemble of polynucleotide sequences" (emphasis added); but no principle of organic chemistry suggests that aimless encounters among nucleic acids must lead to a chain capable of self-replication.

If chemistry is unavailing and Darwin indisposed, what is left as a mechanism? The evolutionary biologist's finest friend: sheer dumb luck.

Was nature lucky? It depends on the payoff and the odds. The payoff is clear: an ancestral form of RNA capable of replication. Without that payoff, there is no life, and obviously, at some point, the payoff paid off. The question is the odds.

For the moment, no one knows how precisely to compute those odds, if only because within the laboratory, no one has conducted an experiment leading to a self-replicating ribozyme. But the minimum length or "sequence" that is needed for a contemporary ribozyme to undertake what the distinguished geochemist Gustaf Arrhenius calls "demonstrated ligase activity" is known. It is roughly 100 nucleotides.

Whereupon, just as one might expect, things blow up very quickly. As Arrhenius notes, there are 4¹⁰⁰ or roughly 10⁶⁰ nucleotide sequences that are 100 nucleotides in length. This is an unfathomably large number. It exceeds the number of atoms contained in the universe, as well as the age of the universe in seconds. If the odds in favor of self-replication are 1 in 10⁶⁰, no betting man would take them, no matter how attractive the payoff, and neither presumably would nature.
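The arithmetic is easy to check. Four possible bases at each of 100 positions give 4 raised to the 100th power, which the sketch below computes exactly and compares with the age of the universe in seconds. The age of roughly 13.8 billion years is an assumption supplied here for the comparison, not a figure taken from the text.

```python
import math

SEQUENCE_LENGTH = 100
n_sequences = 4 ** SEQUENCE_LENGTH      # exact integer arithmetic

# Assumed age of the universe: about 13.8 billion years, converted to seconds.
AGE_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600

print(f"4^100 is about 10^{math.log10(n_sequences):.1f}")
print(f"the age of the universe is about 10^{math.log10(AGE_UNIVERSE_S):.1f} seconds")
```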

"Solace from the tyranny of nucleotide combinatorials," Arrhenius remarks in discussing this very point, "is sought in the feeling that strict sequence specificity may not be required through all the domains of a functional oligomer, thus making a large number of library items eligible for participation in the construction of the ultimate functional entity." Allow me to translate: why assume that self-replicating sequences are apt to be rare just because they are long? They might have been quite common.

They might well have been. And yet all experience is against it. Why should self-replicating RNA molecules have been common 3.6 billion years ago when they are impossible to discern under laboratory conditions today? No one, for that matter, has ever seen a ribozyme capable of any form of catalytic action that is not very specific in its sequence and thus unlike even closely related sequences. No one has ever seen a ribozyme able to undertake chemical action without a suite of enzymes in attendance. No one has ever seen anything like it.

The odds, then, are daunting; and when considered realistically, they are even worse than this already alarming account might suggest. The discovery of a single molecule with the power to initiate replication would hardly be sufficient to establish replication. What template would it replicate against? We need, in other words, at least two, causing the odds of their joint discovery to increase from 1 in 10⁶⁰ to 1 in 10¹²⁰. Those two sequences would have been needed in roughly the same place. And at the same time. And organized in such a way as to favor base pairing. And somehow held in place. And buffered against competing reactions. And productive enough so that their duplicates would not at once vanish in the soundless sea.

In contemplating the discovery by chance of two RNA sequences a mere 40 nucleotides in length, Joyce and Orgel concluded that the requisite "library" would require 10⁴⁸ possible sequences. Given the weight of RNA, they observed gloomily, the relevant sample space would exceed the mass of the earth. And this is the same Leslie Orgel, it will be remembered, who observed that "it was almost certain that there once was an RNA world."
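The Joyce and Orgel figure can be roughly reconstructed. The sketch below assumes that two 40-nucleotide sequences amount to 80 specified positions, that the library holds a single molecule of each combination, and that an RNA nucleotide weighs about 330 grams per mole on average. All three are assumptions made here for illustration and not details given in the text.

```python
AVOGADRO = 6.022e23             # molecules per mole
NUCLEOTIDE_G_PER_MOL = 330.0    # assumed average mass of one RNA nucleotide
EARTH_MASS_G = 5.97e27          # mass of the earth, in grams

length = 40
n_pairs = 4 ** (2 * length)     # combinations of two 40-mers: 4^80

# One molecule of each 80-nucleotide combination (assumption).
grams_per_combination = 2 * length * NUCLEOTIDE_G_PER_MOL / AVOGADRO
library_mass_g = n_pairs * grams_per_combination

print(f"{n_pairs:.2e} combinations")
print(f"library mass / earth mass = {library_mass_g / EARTH_MASS_G:.1f}")
```

On these assumptions the count comes out near 10 to the 48th power and the library outweighs the earth several times over, in keeping with the gloomy observation quoted above.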

To the accumulating agenda of assumptions, then, let us add two more: that without enzymes, nucleotides were somehow formed into chains, and that by means we cannot duplicate in the laboratory, a pre-biotic molecule discovered how to reproduce itself.

From Self-Replicating RNA to Coded Chemistry

A new era is now in prospect, one that begins with a self-replicating form of RNA and ends with the system of coded chemistry characteristic of the modern cell. The modern cell, meaning one that divides its labors by assigning to the nucleic acids the management of information and to the proteins the execution of chemical activity. It is 3.6 billion years ago.

It is with the advent of this era that distinctively conceptual problems emerge. The gods of chemistry may now be seen receding into the distance. The cell's system of coded chemistry is determined by two discrete combinatorial objects: the nucleic acids and the amino acids. These objects are discrete because, just as there are no fractional sentences containing three-and-a-half words, there are no fractional nucleotide sequences containing three-and-a-half nucleotides, or fractional proteins containing three-and-a-half amino acids. They are combinatorial because both the nucleic acids and the amino acids are combined by the cell into larger structures.

But if information management and its administration within the modern cell are determined by a discrete combinatorial system, the work of the cell is part of a markedly different enterprise. The periodic table notwithstanding, chemical reactions are not combinatorial, and they are not discrete. The chemical bond, as Linus Pauling demonstrated in the 1930's, is based squarely on quantum mechanics. And to the extent that chemistry is explained in terms of physics, it is encompassed not only by "the model for what science should be" but by the system of differential equations that play so conspicuous a role in every one of the great theories of mathematical physics.

What serves to coordinate the cell's two big shots of information management and chemical activity, and so to coordinate two fundamentally different structures, is the universal genetic code. To capture the remarkable nature of the facts in play here, it is useful to stress the word code.

By itself, a code is familiar enough: an arbitrary mapping or a system of linkages between two discrete combinatorial objects. The Morse code, to take a familiar example, coordinates dashes and dots with letters of the alphabet. To note that codes are arbitrary is to note the distinction between a code and a purely physical connection between two objects. To note that codes embody mappings is to embed the concept of a code in mathematical language. To note that codes reflect a linkage of some sort is to return the concept of a code to its human uses.

In every normal circumstance, the linkage comes first and represents a human achievement, something arising from a point beyond the coding system. (The coordination of dot-dot-dot-dash-dash-dash-dot-dot-dot with the distress signal S-O-S is again a familiar example.) Just as no word explains its own meaning, no code establishes its own nature.
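The sense in which a code is an arbitrary mapping can be made concrete with the S-O-S example. In the sketch below the table is a two-letter fragment of Morse code; nothing about three dots physically resembles the letter S, and the pairing exists only because it was stipulated.

```python
# A fragment of the Morse table: the pairing of signals with letters is a convention.
MORSE = {"S": "...", "O": "---"}

# Morse code is one-to-one, so the table can be inverted for decoding.
DECODE = {signal: letter for letter, signal in MORSE.items()}

def encode(message):
    """Replace each letter with its Morse signal, separated by spaces."""
    return " ".join(MORSE[letter] for letter in message)

def decode(signals):
    """Recover the letters from a space-separated run of Morse signals."""
    return "".join(DECODE[signal] for signal in signals.split())

print(encode("SOS"))
print(decode(encode("SOS")))
```

Swapping the two entries in `MORSE` would yield a different but equally workable code, which is what it means to say that the linkage comes from outside the coding system.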

The conceptual question now follows. Can the origins of a system of coded chemistry be explained in a way that makes no appeal whatsoever to the kinds of facts that we otherwise invoke to explain codes and languages, systems of communication, the impress of ordinary words on the world of matter?

In this regard, it is worth recalling that, as Hubert Yockey observes in Information Theory, Evolution, and the Origin of Life (2005), "there is no trace in physics or chemistry of the control of chemical reactions by a sequence of any sort or of a code between sequences."

Writing in a 2001 issue of the journal RNA, the microbiologist Carl Woese referred ominously to the "dark side of molecular biology." DNA replication, Woese wrote, is the extraordinarily elegant expression of the structural properties of a single molecule: zip down, divide, zip up. The transcription into RNA follows suit: copy and conserve. In each of these two cases, structure leads to function. But where is the coordinating link between the chemical structure of DNA and the third step, namely, translation? When it comes to translation, the apparatus is baroque: it is incredibly elaborate, and it does not reflect the structure of any molecule.

These reflections prompted Woese to a somber conclusion: if "the nucleic acids cannot in any way recognize the amino acids," then there is no "fundamental physical principle" at work in translation (emphasis added).

But Woese's diagnosis of disorder is far too partial; the symptoms he regards as singular are in fact widespread. What holds for translation holds as well for replication and transcription. The nucleic acids cannot directly recognize the amino acids (and vice versa), but they cannot directly replicate or transcribe themselves, either. Both replication and transcription are enzymatically driven, and without those enzymes, a molecule of DNA or RNA would do nothing whatsoever. Contrary to what Woese imagines, no fundamental physical principles appear directly at work anywhere in the modern cell.

The most difficult and challenging problem associated with the origins of life is now in view. One half of the modern system of coded chemistry, the genetic code and the sequences it conveys, is, from a chemical perspective, arbitrary. The other half of the system of coded chemistry, the activity of the proteins, is, from a chemical perspective, necessary. In life, the two halves are coordinated. The problem follows: how did that, the whole system, get here?

The prevailing opinion among molecular biologists is that questions about molecular-biological systems can only be answered by molecular-biological experiments. The distinguished molecular biologist Hiroaki Suga has recently demonstrated the strengths and the limitations of the experimental method when confronted by difficult conceptual questions like the one I have just posed.

The goal of Suga's experiment was to show that a set of RNA catalysts (or ribozymes) could well have played the role now played in the modern cell by the protein family of aminoacyl synthetases. Until his work, Suga reports, there had been no convincing demonstration that a ribozyme was able to perform the double function of a synthetase - that is, recognizing both a form of transfer RNA and an amino acid. But in Suga's laboratory, just such a molecule made a now-celebrated appearance. With an amino acid attached to its tail, the ribozyme managed to cleave itself and, like a snake, affix its amino-acid cargo onto its head. What is more, it could conduct this exercise backward, shifting the amino acid from its head to its tail again. The chemical reactions involved acylation: precisely the reactions undertaken by synthetases in the modern cell.

Hiroaki Suga's experiment was both interesting and ingenious, prompting a reaction perhaps best expressed as, "Well, would you look at that!" It has altered the terms of debate by placing a number of new facts on the table. And yet, as so often happens in experimental pre-biotic chemistry, it is by no means clear what interpretation the facts will sustain.

Do Suga's results really establish the existence of a primitive form of coded chemistry? Although unexpected in context, the coordination he achieved between an amino acid and a form of transfer RNA was never at issue in principle. The question is whether what was accomplished in establishing a chemical connection between these two molecules was anything like establishing the existence of a code. If so, then organic chemistry itself could properly be described as the study of codes, thereby erasing the meaning of a code as an arbitrary mapping between discrete combinatorial objects.

Suga, in summarizing the results of his research, captures rhetorically the inconclusiveness of his achievement. "Our demonstration indicates," he writes, "that catalytic precursor tRNA's could have provided the foundation of the genetic coding system." But if the association at issue is not a code, however primitive, it could no more be the "foundation" of a code than a feather could be the foundation of a building. And if it is the foundation of a code, then what has been accomplished has been accomplished by the wrong agent.

In Suga's experiment, there was no sign that the execution of chemical routines fell under the control of a molecular administration, and no sign, either, that the missing molecular administration had anything to do with executive chemical routines. The missing molecular administrator was, in fact, Suga himself, as his own account reveals. The relevant features of the experiment, he writes, "allow[ed] us to select active RNA molecules with selectivity toward a desired amino acid" (emphasis added). Thereafter, it was Suga and his collaborators who "applied stringent conditions" to the experiment, undertook "selective amplification of the self-modifying RNA molecules," and "screened" vigorously for "self-aminoacylation activity" (emphasis added throughout).

If nothing else, the advent of a system of coded chemistry satisfied the most urgent of imperatives: it was needed and it was found. It was needed because once a system of chemical reactions reaches a certain threshold of complexity, nothing less than a system of coded chemistry can possibly master the ensuing chaos. It was found because, after all, we are here.

Precisely these circumstances have persuaded many molecular biologists that the explanation for the emergence of a system of coded chemistry must in the end lie with Darwin's theory of evolution. As one critic has observed in commenting on Suga's experiments, "If a certain result can be achieved by direction in a laboratory by a Suga, surely it can also be achieved by chance in a vast universe."

A self-replicating ribozyme meets the first condition required for Darwinian evolution to gain purchase. It is by definition capable of replication. And it meets the second condition as well, for, by means of mistakes in replication, it introduces the possibility of variety into the biological world. On the assumption that subsequent changes to the system follow a law of increasing marginal utility, one can then envisage the eventual emergence of a system of coded chemistry - a system that can be explained in terms of "the model for what science should be."

It was no doubt out of considerations like these that, in coming up against what he called the "dark side of molecular biology," Carl Woese was concerned to urge upon the biological community the benefits of "an all-out Darwinian perspective." But the difficulty with "an all-out Darwinian perspective" is that it entails an all-out Darwinian impediment: notably, the assignment of a degree of foresight to a Darwinian process that the process could not possibly possess.

The hypothesis of an RNA world trades brilliantly on the idea that a divided modern system had its roots in some form of molecular symmetry that was then broken by the contingencies of life. At some point in the transition to the modern system, an ancestral form of RNA must have assigned some of its catalytic properties to an emerging family of proteins. This would have taken place at a given historical moment; it is not an artifact of the imagination. Similarly, at some point in the transition to a modern system, an ancestral form of RNA must have acquired the ability to code for the catalytic powers it was discarding. And this, too, must have taken place at a particular historical moment.

The question, of course, is which of the two steps came first. Without life acquiring some degree of foresight, neither step can be plausibly fixed in place by means of any schedule of selective advantages. How could an ancestral form of RNA have acquired the ability to code for various amino acids before coding was useful? But then again, why should "ribozymes in an RNA world," as the molecular biologists Paul Schimmel and Shana O. Kelley ask, "have expedited their own obsolescence?"

Could the two steps have taken place simultaneously? If so, there would appear to be very little difference between a Darwinian explanation and the frank admission that a miracle was at work. If no miracles are at work, we are returned to the place from which we started, with the chicken-and-egg pattern that is visible when life is traced backward now appearing when it is traced forward.

It is thus unsurprising that writings embodying Woese's "all-out Darwinian perspective" are dominated by references to a number of unspecified but mysteriously potent forces and obscure conditional circumstances. I quote without attribution because the citations are almost generic (emphasis added throughout):

  • The aminoacylation of RNA initially must have provided some selective advantage.
  • The products of this reaction must have conferred some selective advantage.
  • However, the development of a crude mechanism for controlling the diversity of possible peptides would have been advantageous.
  • [P]rogressive refinement of that mechanism would have provided further selective advantage.

And so forth - ending, one imagines, in reduction to the all-purpose imperative of Darwinian theory, which is simply that what was must have been.

Now It Is Now

At the conclusion of a long essay, it is customary to summarize what has been learned. In the present case, I suspect it would be more prudent to recall how much has been assumed:

First, that the pre-biotic atmosphere was chemically reductive; second, that nature found a way to synthesize cytosine; third, that nature also found a way to synthesize ribose; fourth, that nature found the means to assemble nucleotides into polynucleotides; fifth, that nature discovered a self-replicating molecule; and sixth, that having done all that, nature promoted a self-replicating molecule into a full system of coded chemistry.

These assumptions are not only vexing but progressively so, ending in a serious impediment to thought. That, indeed, may be why a number of biologists have lately reported a weakening of their commitment to the RNA world altogether, and a desire to look elsewhere for an explanation of the emergence of life on earth. "It's part of a quiet paradigm revolution going on in biology," the biophysicist Harold Morowitz put it in an interview in New Scientist, "in which the radical randomness of Darwinism is being replaced by a much more scientific law-regulated emergence of life."

Morowitz is not a man inclined to wait for the details to accumulate before reorganizing the vista of modern biology. In a series of articles, he has argued for a global vision based on the biochemistry of living systems rather than on their molecular biology or on Darwinian adaptations. His vision treats the living system as more fundamental than its particular species, claiming to represent the "universal and deterministic features of any system of chemical interactions based on a water-covered but rocky planet such as ours."

This view of things - metabolism first, as it is often called - is not only intriguing in itself but is enhanced by a firm commitment to chemistry and to "the model for what science should be." It has been argued with great vigor by Morowitz and others. It represents an alternative to the RNA world. It is a work in progress, and it may well be right. Nonetheless, it suffers from one outstanding defect. There is as yet no evidence that it is true.

It is now more than 175 years since Friedrich Wöhler announced the synthesis of urea. It would be the height of folly to doubt that our understanding of life's origins has been immeasurably improved. But whether it has been immeasurably improved in a way that vigorously confirms the daring idea that living systems are chemical in their origin and so physical in their nature, that is another question entirely.

In "On the Origins of the Mind," I tried to show that much can be learned by studying the issue from a computational perspective. Analogously, in contemplating the origins of life, much - in fact, more - can be learned by studying the issue from the perspective of coded chemistry. In both cases, however, what seems to lie beyond the reach of "the model for what science should be" is any success beyond the local. All questions about the global origins of these strange and baffling systems seem to demand answers that the model itself cannot by its nature provide.

It goes without saying that this is a tentative judgment, perhaps only a hunch. But let us suppose that questions about the origins of the mind and the origins of life do lie beyond the grasp of "the model for what science should be." In that case, we must either content ourselves with its limitations or revise the model. If a revision also lies beyond our powers, then we may well have to say that the mind and life have appeared in the universe for no very good reason that we can discern.

Worse things have happened. In the end, these are matters that can only be resolved in the way that all such questions are resolved. We must wait and see.

Searching for small targets in large spaces is a common problem in the sciences. Because blind search is inadequate for such searches, it needs to be supplemented with additional information, thereby transforming a blind search into an assisted search. This additional information can be quantified and indicates that assisted searches themselves result from searching higher-level search spaces - by conducting, as it were, a search for a search. Thus, the original search gets displaced to a higher-level search. The key result in this paper is a displacement theorem, which shows that successfully resolving such a higher-level search is exponentially more difficult than successfully resolving the original search. Leading up to this result, a measure-theoretic version of the No Free Lunch theorems is formulated and proven. The paper shows that stochastic mechanisms, though able to explain the success of assisted searches in locating targets, cannot, in turn, explain the source of assisted searches.

On August 4th, 2004, an extensive review essay by Dr. Stephen C. Meyer, Director of Discovery Institute's Center for Science & Culture, appeared in the Proceedings of the Biological Society of Washington (volume 117, no. 2, pp. 213-239). The Proceedings is a peer-reviewed biology journal published at the National Museum of Natural History at the Smithsonian Institution in Washington D.C.

In the article, entitled "The Origin of Biological Information and the Higher Taxonomic Categories", Dr. Meyer argues that no current materialistic theory of evolution can account for the origin of the information necessary to build novel animal forms. He proposes intelligent design as an alternative explanation for the origin of biological information and the higher taxa.

Due to an unusual number of inquiries about the article, Dr. Meyer, the copyright holder, has decided to make the article available now in HTML format on this website. (Off prints are also available from Discovery Institute by writing to Rob Crowther at: Please provide your mailing address and we will dispatch a copy).


In a recent volume of the Vienna Series in Theoretical Biology (2003), Gerd B. Muller and Stuart Newman argue that what they call the "origination of organismal form" remains an unsolved problem. In making this claim, Muller and Newman (2003:3-10) distinguish two distinct issues, namely, (1) the causes of form generation in the individual organism during embryological development and (2) the causes responsible for the production of novel organismal forms in the first place during the history of life. To distinguish the latter case (phylogeny) from the former (ontogeny), Muller and Newman use the term "origination" to designate the causal processes by which biological form first arose during the evolution of life. They insist that "the molecular mechanisms that bring about biological form in modern day embryos should not be confused" with the causes responsible for the origin (or "origination") of novel biological forms during the history of life (p.3). They further argue that we know more about the causes of ontogenesis, due to advances in molecular biology, molecular genetics and developmental biology, than we do about the causes of phylogenesis--the ultimate origination of new biological forms during the remote past.

In making this claim, Muller and Newman are careful to affirm that evolutionary biology has succeeded in explaining how preexisting forms diversify under the twin influences of natural selection and variation of genetic traits. Sophisticated mathematically-based models of population genetics have proven adequate for mapping and understanding quantitative variability and populational changes in organisms. Yet Muller and Newman insist that population genetics, and thus evolutionary biology, has not identified a specifically causal explanation for the origin of true morphological novelty during the history of life. Central to their concern is what they see as the inadequacy of the variation of genetic traits as a source of new form and structure. They note, following Darwin himself, that the sources of new form and structure must precede the action of natural selection (2003:3)--that selection must act on what already exists. Yet, in their view, the "genocentricity" and "incrementalism" of the neo-Darwinian mechanism has meant that an adequate source of new form and structure has yet to be identified by theoretical biologists. Instead, Muller and Newman see the need to identify epigenetic sources of morphological innovation during the evolution of life. In the meantime, however, they insist neo-Darwinism lacks any "theory of the generative" (p. 7).

As it happens, Muller and Newman are not alone in this judgment. In the last decade or so a host of scientific essays and books have questioned the efficacy of selection and mutation as a mechanism for generating morphological novelty, as even a brief literature survey will establish. Thomson (1992:107) expressed doubt that large-scale morphological changes could accumulate via minor phenotypic changes at the population genetic level. Miklos (1993:29) argued that neo-Darwinism fails to provide a mechanism that can produce large-scale innovations in form and complexity. Gilbert et al. (1996) attempted to develop a new theory of evolutionary mechanisms to supplement classical neo-Darwinism, which, they argued, could not adequately explain macroevolution. As they put it in a memorable summary of the situation: "starting in the 1970s, many biologists began questioning its (neo-Darwinism's) adequacy in explaining evolution. Genetics might be adequate for explaining microevolution, but microevolutionary changes in gene frequency were not seen as able to turn a reptile into a mammal or to convert a fish into an amphibian. Microevolution looks at adaptations that concern the survival of the fittest, not the arrival of the fittest. As Goodwin (1995) points out, 'the origin of species--Darwin's problem--remains unsolved'" (p. 361). Though Gilbert et al. (1996) attempted to solve the problem of the origin of form by proposing a greater role for developmental genetics within an otherwise neo-Darwinian framework,1 numerous recent authors have continued to raise questions about the adequacy of that framework itself or about the problem of the origination of form generally (Webster & Goodwin 1996; Shubin & Marshall 2000; Erwin 2000; Conway Morris 2000, 2003b; Carroll 2000; Wagner 2001; Becker & Lonnig 2001; Stadler et al. 2001; Lonnig & Saedler 2002; Wagner & Stadler 2003; Valentine 2004:189-194).

What lies behind this skepticism? Is it warranted? Is a new and specifically causal theory needed to explain the origination of biological form?

This review will address these questions. It will do so by analyzing the problem of the origination of organismal form (and the corresponding emergence of higher taxa) from a particular theoretical standpoint. Specifically, it will treat the problem of the origination of the higher taxonomic groups as a manifestation of a deeper problem, namely, the problem of the origin of the information (whether genetic or epigenetic) that, as it will be argued, is necessary to generate morphological novelty.

In order to perform this analysis, and to make it relevant and tractable to systematists and paleontologists, this paper will examine a paradigmatic example of the origin of biological form and information during the history of life: the Cambrian explosion. During the Cambrian, many novel animal forms and body plans (representing new phyla, subphyla and classes) arose in a geologically brief period of time. The following information-based analysis of the Cambrian explosion will support the claim of recent authors such as Muller and Newman that the mechanism of selection and genetic mutation does not constitute an adequate causal explanation of the origination of biological form in the higher taxonomic groups. It will also suggest the need to explore other possible causal factors for the origin of form and information during the evolution of life and will examine some other possibilities that have been proposed.

The Cambrian Explosion

The "Cambrian explosion" refers to the geologically sudden appearance of many new animal body plans about 530 million years ago. At this time, at least nineteen, and perhaps as many as thirty-five phyla of forty total (Meyer et al. 2003), made their first appearance on earth within a narrow five- to ten-million-year window of geologic time (Bowring et al. 1993, 1998a:1, 1998b:40; Kerr 1993; Monastersky 1993; Aris-Brosou & Yang 2003). Many new subphyla, between 32 and 48 of 56 total (Meyer et al. 2003), and classes of animals also arose at this time with representatives of these new higher taxa manifesting significant morphological innovations. The Cambrian explosion thus marked a major episode of morphogenesis in which many new and disparate organismal forms arose in a geologically brief period of time.

To say that the fauna of the Cambrian period appeared in a geologically sudden manner also implies the absence of clear transitional intermediate forms connecting Cambrian animals with simpler pre-Cambrian forms. And, indeed, in almost all cases, the Cambrian animals have no clear morphological antecedents in earlier Vendian or Precambrian fauna (Miklos 1993, Erwin et al. 1997:132, Steiner & Reitner 2001, Conway Morris 2003b:510, Valentine et al. 2003:519-520). Further, several recent discoveries and analyses suggest that these morphological gaps may not be merely an artifact of incomplete sampling of the fossil record (Foote 1997, Foote et al. 1999, Benton & Ayala 2003, Meyer et al. 2003), suggesting that the fossil record is at least approximately reliable (Conway Morris 2003b:505).

As a result, debate now exists about the extent to which this pattern of evidence comports with a strictly monophyletic view of evolution (Conway Morris 1998a, 2003a, 2003b:510; Willmer 1990, 2003). Further, among those who accept a monophyletic view of the history of life, debate exists about whether to privilege fossil or molecular data and analyses. Those who think the fossil data provide a more reliable picture of the origin of the Metazoa tend to think these animals arose relatively quickly--that the Cambrian explosion had a "short fuse" (Conway Morris 2003b:505-506, Valentine & Jablonski 2003). Some (Wray et al. 1996), but not all (Ayala et al. 1998), who think that molecular phylogenies establish reliable divergence times from pre-Cambrian ancestors think that the Cambrian animals evolved over a very long period of time--that the Cambrian explosion had a "long fuse." This review will not address these questions of historical pattern. Instead, it will analyze whether the neo-Darwinian process of mutation and selection, or other processes of evolutionary change, can generate the form and information necessary to produce the animals that arise in the Cambrian. This analysis will, for the most part,2 therefore, not depend upon assumptions of either a long or short fuse for the Cambrian explosion, or upon a monophyletic or polyphyletic view of the early history of life.

Defining Biological Form and Information

Form, like life itself, is easy to recognize but often hard to define precisely. Yet, a reasonable working definition of form will suffice for our present purposes. Form can be defined as the four-dimensional topological relations of anatomical parts. This means that one can understand form as a unified arrangement of body parts or material components in a distinct shape or pattern (topology)--one that exists in three spatial dimensions and which arises in time during ontogeny.

Insofar as any particular biological form constitutes something like a distinct arrangement of constituent body parts, form can be seen as arising from constraints that limit the possible arrangements of matter. Specifically, organismal form arises (both in phylogeny and ontogeny) as possible arrangements of material parts are constrained to establish a specific or particular arrangement with an identifiable three dimensional topography--one that we would recognize as a particular protein, cell type, organ, body plan or organism. A particular "form," therefore, represents a highly specific and constrained arrangement of material components (among a much larger set of possible arrangements).

Understanding form in this way suggests a connection to the notion of information in its most theoretically general sense. When Shannon (1948) first developed a mathematical theory of information he equated the amount of information transmitted with the amount of uncertainty reduced or eliminated in a series of symbols or characters. Information, in Shannon's theory, is thus imparted as some options are excluded and others are actualized. The greater the number of options excluded, the greater the amount of information conveyed. Further, constraining a set of possible material arrangements by whatever process or means involves excluding some options and actualizing others. Thus, to constrain a set of possible material states is to generate information in Shannon's sense. It follows that the constraints that produce biological form also impart information. Or conversely, one might say that producing organismal form by definition requires the generation of information.

In classical Shannon information theory, the amount of information in a system is also inversely related to the probability of the arrangement of constituents in a system or the characters along a communication channel (Shannon 1948). The more improbable (or complex) the arrangement, the more Shannon information, or information-carrying capacity, a string or system possesses.

Since the 1960s, mathematical biologists have realized that Shannon's theory could be applied to the analysis of DNA and proteins to measure the information-carrying capacity of these macromolecules. Since DNA contains the assembly instructions for building proteins, the information-processing system in the cell represents a kind of communication channel (Yockey 1992:110). Further, DNA conveys information via specifically arranged sequences of nucleotide bases. Since each of the four bases has a roughly equal chance of occurring at each site along the spine of the DNA molecule, biologists can calculate the probability, and thus the information-carrying capacity, of any particular sequence n bases long.
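
The calculation described here can be sketched in a few lines of Python. This is a minimal illustration under the stated assumption that each of the four bases is equally likely at every site; the function names `sequence_probability` and `carrying_capacity_bits` are hypothetical labels introduced for this example. The probability of one particular sequence n bases long is (1/4)^n, and its Shannon information-carrying capacity is the negative base-2 logarithm of that probability, or 2n bits.

```python
import math


def sequence_probability(n_bases, alphabet_size=4):
    """Probability of one specific sequence when every symbol is equally likely."""
    return (1.0 / alphabet_size) ** n_bases


def carrying_capacity_bits(n_bases, alphabet_size=4):
    """Shannon measure in bits: -log2(p), which equals n * log2(alphabet_size)."""
    return n_bases * math.log2(alphabet_size)


# Example: a stretch of DNA ten bases long.
probability_10 = sequence_probability(10)    # (1/4) ** 10
capacity_10 = carrying_capacity_bits(10)     # 2 bits per base * 10 bases
```

The `alphabet_size` parameter is included only so the same sketch could be applied to other equiprobable alphabets. The figure measures carrying capacity alone; it says nothing about whether a given sequence performs any function.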

The ease with which information theory applies to molecular biology has created confusion about the type of information that DNA and proteins possess. Sequences of nucleotide bases in DNA, or amino acids in a protein, are highly improbable and thus have large information-carrying capacities. But, like meaningful sentences or lines of computer code, genes and proteins are also specified with respect to function. Just as the meaning of a sentence depends upon the specific arrangement of the letters in a sentence, so too does the function of a gene sequence depend upon the specific arrangement of the nucleotide bases in a gene. Thus, molecular biologists beginning with Crick equated information not only with complexity but also with "specificity," where "specificity" or "specified" has meant "necessary to function" (Crick 1958:144, 153; Sarkar, 1996:191).3 Molecular biologists such as Monod and Crick understood biological information--the information stored in DNA and proteins--as something more than mere complexity (or improbability). Their notion of information associated both biochemical contingency and combinatorial complexity with DNA sequences (allowing DNA's carrying capacity to be calculated), but it also affirmed that sequences of nucleotides and amino acids in functioning macromolecules possessed a high degree of specificity relative to the maintenance of cellular function.

The ease with which information theory applies to molecular biology has also created confusion about the location of information in organisms. Perhaps because the information carrying capacity of the gene could be so easily measured, it has been easy to treat DNA, RNA and proteins as the sole repositories of biological information. Neo-Darwinists in particular have assumed that the origination of biological form could be explained by recourse to processes of genetic variation and mutation alone (Levinton 1988:485). Yet if one understands organismal form as resulting from constraints on the possible arrangements of matter at many levels in the biological hierarchy--from genes and proteins to cell types and tissues to organs and body plans--then clearly biological organisms exhibit many levels of information-rich structure.

Thus, we can pose a question, not only about the origin of genetic information, but also about the origin of the information necessary to generate form and structure at levels higher than that present in individual proteins. We must also ask about the origin of the "specified complexity," as opposed to mere complexity, that characterizes the new genes, proteins, cell types and body plans that arose in the Cambrian explosion. Dembski (2002) has used the term "complex specified information" (CSI) as a synonym for "specified complexity" to help distinguish functional biological information from mere Shannon information--that is, specified complexity from mere complexity. This review will use this term as well.

The Cambrian Information Explosion

The Cambrian explosion represents a remarkable jump in the specified complexity or "complex specified information" (CSI) of the biological world. For over three billion years, the biological realm included little more than bacteria and algae (Brocks et al. 1999). Then, beginning about 570-565 million years ago (mya), the first complex multicellular organisms appeared in the rock strata, including sponges, cnidarians, and the peculiar Ediacaran biota (Grotzinger et al. 1995). Forty million years later, the Cambrian explosion occurred (Bowring et al. 1993). The emergence of the Ediacaran biota (570 mya), and then to a much greater extent the Cambrian explosion (530 mya), represented steep climbs up the biological complexity gradient.

One way to estimate the amount of new CSI that appeared with the Cambrian animals is to count the number of new cell types that emerged with them (Valentine 1995:91-93). Studies of modern animals suggest that the sponges that appeared in the late Precambrian, for example, would have required five cell types, whereas the more complex animals that appeared in the Cambrian (e.g., arthropods) would have required fifty or more cell types. Functionally more complex animals require more cell types to perform their more diverse functions. New cell types require many new and specialized proteins. New proteins, in turn, require new genetic information. Thus an increase in the number of cell types implies (at a minimum) a considerable increase in the amount of specified genetic information. Molecular biologists have recently estimated that a minimally complex single-celled organism would require between 318 and 562 kilobase pairs of DNA to produce the proteins necessary to maintain life (Koonin 2000). More complex single cells might require upward of a million base pairs. Yet to build the proteins necessary to sustain a complex arthropod such as a trilobite would require orders of magnitude more coding instructions. The genome size of a modern arthropod, the fruitfly Drosophila melanogaster, is approximately 180 million base pairs (Gerhart & Kirschner 1997:121, Adams et al. 2000). Transitions from a single cell to colonies of cells to complex animals represent significant (and, in principle, measurable) increases in CSI.

Building a new animal from a single-celled organism requires a vast amount of new genetic information. It also requires a way of arranging gene products--proteins--into higher levels of organization. New proteins are required to service new cell types. But new proteins must be organized into new systems within the cell; new cell types must be organized into new tissues, organs, and body parts. These, in turn, must be organized to form body plans. New animals, therefore, embody hierarchically organized systems of lower-level parts within a functional whole. Such hierarchical organization itself represents a type of information, since body plans comprise both highly improbable and functionally specified arrangements of lower-level parts. The specified complexity of new body plans requires explanation in any account of the Cambrian explosion.

Can neo-Darwinism explain the discontinuous increase in CSI that appears in the Cambrian explosion--either in the form of new genetic information or in the form of hierarchically organized systems of parts? We will now examine the two parts of this question.

Novel Genes and Proteins

Many scientists and mathematicians have questioned the ability of mutation and selection to generate information in the form of novel genes and proteins. Such skepticism often derives from consideration of the extreme improbability (and specificity) of functional genes and proteins.

A typical gene contains over one thousand precisely arranged bases. For any sequence of length n built from the four nucleotide bases, there are 4^n possible arrangements of bases. For any protein of n residues, there are 20^n possible arrangements of protein-forming amino acids. A gene 999 bases in length represents one of 4^999 possible nucleotide sequences; a protein of 333 amino acids is one of 20^333 possibilities.
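
The sizes of these sequence spaces are easier to compare as powers of ten. The following is a minimal sketch of that arithmetic; the function name `log10_sequence_space` is a hypothetical helper introduced here for illustration and is not part of the original discussion.

```python
import math

def log10_sequence_space(alphabet_size: int, length: int) -> float:
    """Return log10 of alphabet_size ** length (the count of possible sequences)."""
    return length * math.log10(alphabet_size)

# A gene of 999 bases drawn from 4 nucleotides.
gene_log10 = log10_sequence_space(4, 999)
# A protein of 333 residues drawn from 20 amino acids.
protein_log10 = log10_sequence_space(20, 333)

print(f"4^999  is roughly 10^{gene_log10:.0f}")
print(f"20^333 is roughly 10^{protein_log10:.0f}")
```

Working in logarithms keeps the numbers readable. Python integers could represent 4^999 exactly, but the exponent is what matters for the comparison.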

Since the 1960s, some biologists have thought functional proteins to be rare among the set of possible amino acid sequences. Some have used an analogy with human language to illustrate why this should be the case. Denton (1986:309-311), for example, has shown that meaningful words and sentences are extremely rare among the set of possible combinations of English letters, especially as sequence length grows. (The ratio of meaningful 12-letter words to 12-letter sequences is 1/10^14; the ratio of 100-letter sentences to possible 100-letter strings is 1/10^100.) Further, Denton shows that most meaningful sentences are highly isolated from one another in the space of possible combinations, so that random substitutions of letters will, after a very few changes, inevitably degrade meaning. Apart from a few closely clustered sentences accessible by random substitution, the overwhelming majority of meaningful sentences lie, probabilistically speaking, beyond the reach of random search.

Denton (1986:301-324) and others have argued that similar constraints apply to genes and proteins. They have questioned whether an undirected search via mutation and selection would have a reasonable chance of locating new islands of function--representing fundamentally new genes or proteins--within the time available (Eden 1967, Shutzenberger 1967, Lovtrup 1979). Some have also argued that alterations in sequencing would likely result in loss of protein function before fundamentally new function could arise (Eden 1967, Denton 1986). Nevertheless, neither the extent to which genes and proteins are sensitive to functional loss as a result of sequence change, nor the extent to which functional proteins are isolated within sequence space, has been fully known.

Recently, experiments in molecular biology have shed light on these questions. A variety of mutagenesis techniques have shown that proteins (and thus the genes that produce them) are indeed highly specified relative to biological function (Bowie & Sauer 1989, Reidhaar-Olson & Sauer 1990, Taylor et al. 2001). Mutagenesis research tests the sensitivity of proteins (and, by implication, DNA) to functional loss as a result of alterations in sequencing. Studies of proteins have long shown that amino acid residues at many active positions cannot vary without functional loss (Perutz & Lehmann 1968). More recent protein studies (often using mutagenesis experiments) have shown that functional requirements place significant constraints on sequencing even at non-active site positions (Bowie & Sauer 1989, Reidhaar-Olson & Sauer 1990, Chothia et al. 1998, Axe 2000, Taylor et al. 2001). In particular, Axe (2000) has shown that multiple as opposed to single position amino acid substitutions inevitably result in loss of protein function, even when these changes occur at sites that allow variation when altered in isolation. Cumulatively, these constraints imply that proteins are highly sensitive to functional loss as a result of alterations in sequencing, and that functional proteins represent highly isolated and improbable arrangements of amino acids--arrangements that are far more improbable, in fact, than would be likely to arise by chance alone in the time available (Reidhaar-Olson & Sauer 1990; Behe 1992; Kauffman 1995:44; Dembski 1998:175-223; Axe 2000, 2004). (See below the discussion of the neutral theory of evolution for a precise quantitative assessment.)

Of course, neo-Darwinists do not envision a completely random search through the set of all possible nucleotide sequences--so-called "sequence space." They envision natural selection acting to preserve small advantageous variations in genetic sequences and their corresponding protein products. Dawkins (1996), for example, likens an organism to a high mountain peak. He compares climbing the sheer precipice up the front side of the mountain to building a new organism by chance. He acknowledges that his approach up "Mount Improbable" will not succeed. Nevertheless, he suggests that there is a gradual slope up the backside of the mountain that could be climbed in small incremental steps. In his analogy, the backside climb up "Mount Improbable" corresponds to the process of natural selection acting on random changes in the genetic text. What chance alone cannot accomplish blindly or in one leap, selection (acting on mutations) can accomplish through the cumulative effect of many slight successive steps.

Yet the extreme specificity and complexity of proteins presents a difficulty, not only for the chance origin of specified biological information (i.e., for random mutations acting alone), but also for selection and mutation acting in concert. Indeed, mutagenesis experiments cast doubt on each of the two scenarios by which neo-Darwinists envisioned new information arising from the mutation/selection mechanism (for review, see Lonnig 2001). For neo-Darwinism, new functional genes either arise from non-coding sections in the genome or from preexisting genes. Both scenarios are problematic.

In the first scenario, neo-Darwinists envision new genetic information arising from those sections of the genetic text that can presumably vary freely without consequence to the organism. According to this scenario, non-coding sections of the genome, or duplicated sections of coding regions, can experience a protracted period of "neutral evolution" (Kimura 1983) during which alterations in nucleotide sequences have no discernible effect on the function of the organism. Eventually, however, a new gene sequence will arise that can code for a novel protein. At that point, natural selection can favor the new gene and its functional protein product, thus securing the preservation and heritability of both.

This scenario has the advantage of allowing the genome to vary through many generations, as mutations "search" the space of possible base sequences. The scenario has an overriding problem, however: the size of the combinatorial space (i.e., the number of possible amino acid sequences) and the extreme rarity and isolation of the functional sequences within that space of possibilities. Since natural selection can do nothing to help generate new functional sequences, but rather can only preserve such sequences once they have arisen, chance alone--random variation--must do the work of information generation--that is, of finding the exceedingly rare functional sequences within the set of combinatorial possibilities. Yet the probability of randomly assembling (or "finding," in the previous sense) a functional sequence is extremely small.

Cassette mutagenesis experiments performed during the early 1990s suggest that the probability of attaining (at random) the correct sequencing for a short protein 100 amino acids long is about 1 in 10^65 (Reidhaar-Olson & Sauer 1990, Behe 1992:65-69). This result agreed closely with earlier calculations that Yockey (1978) had performed based upon the known sequence variability of cytochrome c in different species and other theoretical considerations. More recent mutagenesis research has provided additional support for the conclusion that functional proteins are exceedingly rare among possible amino acid sequences (Axe 2000, 2004). Axe (2004) has performed site-directed mutagenesis experiments on a 150-residue protein-folding domain within a β-lactamase enzyme. His experimental method improves upon earlier mutagenesis techniques and corrects for several sources of possible estimation error inherent in them. On the basis of these experiments, Axe has estimated the ratio of (a) proteins of typical size (150 residues) that perform a specified function via any folded structure to (b) the whole set of possible amino acid sequences of that size. He has estimated this ratio to be 1 to 10^77. Thus, the probability of finding a functional protein among the possible amino acid sequences corresponding to a 150-residue protein is similarly 1 in 10^77.

Other considerations imply additional improbabilities. First, new Cambrian animals would require proteins much longer than 100 residues to perform many necessary specialized functions. Ohno (1996) has noted that Cambrian animals would have required complex proteins such as lysyl oxidase in order to support their stout body structures. Lysyl oxidase molecules in extant organisms comprise over 400 amino acids. These molecules are both highly complex (non-repetitive) and functionally specified. Reasonable extrapolation from mutagenesis experiments done on shorter protein molecules suggests that the probability of producing functionally sequenced proteins of this length at random is so small as to make appeals to chance absurd, even granting the duration of the entire universe. (See Dembski 1998:175-223 for a rigorous calculation of this "Universal Probability Bound"; see also Axe 2004.) Yet, second, fossil data (Bowring et al. 1993, 1998a:1, 1998b:40; Kerr 1993; Monastersky 1993), and even molecular analyses supporting deep divergence (Wray et al. 1996), suggest that the duration of the Cambrian explosion (between 5-10 x 10^6 and, at most, 7 x 10^7 years) is far smaller than that of the entire universe (1.3-2 x 10^10 years). Third, DNA mutation rates are far too low to generate the novel genes and proteins necessary to build the Cambrian animals, given the most probable duration of the explosion as determined by fossil studies (Conway Morris 1998b). As Ohno (1996:8475) notes, even a mutation rate of 10^-9 per base pair per year results in only a 1% change in the sequence of a given section of DNA in 10 million years. Thus, he argues that mutational divergence of preexisting genes cannot explain the origin of the Cambrian forms in that time.4
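
The 1% figure attributed to Ohno follows from multiplying the rate by the elapsed time. The sketch below reproduces that arithmetic under a simple linear assumption: each site accumulates changes independently, and repeated hits at the same site are ignored. The function name `expected_fraction_changed` is hypothetical and introduced only for illustration.

```python
def expected_fraction_changed(rate_per_site_per_year: float, years: float) -> float:
    """First-order estimate of the fraction of sites changed: rate * time.

    Assumption (not from the source): repeated changes at one site are ignored,
    which is reasonable only while the product stays well below 1.
    """
    return rate_per_site_per_year * years

# Values quoted in the text: 10^-9 per base pair per year over 10 million years.
fraction = expected_fraction_changed(1e-9, 10e6)
print(f"Expected sequence change: {fraction:.1%}")
```

The same function can be rerun with the longer duration quoted above (7 x 10^7 years) to see how the estimate scales with the assumed length of the explosion.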

The selection/mutation mechanism faces another probabilistic obstacle. The animals that arise in the Cambrian exhibit structures that would have required many new types of cells, each of which would have required many novel proteins to perform their specialized functions. Further, new cell types require systems of proteins that must, as a condition of functioning, act in close coordination with one another. The unit of selection in such systems ascends to the system as a whole. Natural selection selects for functional advantage. But new cell types require whole systems of proteins to perform their distinctive functions. In such cases, natural selection cannot contribute to the process of information generation until after the information necessary to build the requisite system of proteins has arisen. Thus random variations must, again, do the work of information generation--and now not simply for one protein, but for many proteins arising at nearly the same time. Yet the odds of this occurring by chance alone are, of course, far smaller than the odds of the chance origin of a single gene or protein--so small in fact as to render the chance origin of the genetic information necessary to build a new cell type (a necessary but not sufficient condition of building a new body plan) problematic given even the most optimistic estimates for the duration of the Cambrian explosion.

Dawkins (1986:139) has noted that scientific theories can rely on only so much "luck" before they cease to be credible. The neutral theory of evolution, which, by its own logic, prevents natural selection from playing a role in generating genetic information until after the fact, relies on entirely too much luck. The sensitivity of proteins to functional loss, the need for long proteins to build new cell types and animals, the need for whole new systems of proteins to service new cell types, the probable brevity of the Cambrian explosion relative to mutation rates--all suggest the immense improbability (and implausibility) of any scenario for the origination of Cambrian genetic information that relies upon random variation alone unassisted by natural selection.

Yet the neutral theory requires novel genes and proteins to arise--essentially--by random mutation alone. Adaptive advantage accrues after the generation of new functional genes and proteins. Thus, natural selection cannot play a role until new information-bearing molecules have independently arisen. Thus neutral theorists envisioned the need to scale the steep face of a Dawkins-style precipice of which there is no gradually sloping backside--a situation that, by Dawkins' own logic, is probabilistically untenable.

In the second scenario, neo-Darwinists envisioned novel genes and proteins arising by numerous successive mutations in the preexisting genetic text that codes for proteins. To adapt Dawkins's metaphor, this scenario envisions gradually climbing down one functional peak and then ascending another. Yet mutagenesis experiments again suggest a difficulty. Recent experiments show that, even when exploring a region of sequence space populated by proteins of a single fold and function, most multiple-position changes quickly lead to loss of function (Axe 2000). Yet to turn one protein into another with a completely novel structure and function requires specified changes at many sites. Indeed, the number of changes necessary to produce a new protein greatly exceeds the number of changes that will typically produce functional losses. Given this, the probability of escaping total functional loss during a random search for the changes needed to produce a new function is extremely small--and this probability diminishes exponentially with each additional requisite change (Axe 2000). Thus, Axe's results imply that, in all probability, random searches for novel proteins (through sequence space) will result in functional loss long before any novel functional protein will emerge.

Blanco et al. (1999) have come to a similar conclusion. Using directed mutagenesis, they have determined that residues both in the hydrophobic core and on the surface of the protein play essential roles in determining protein structure. By sampling intermediate sequences between two naturally occurring sequences that adopt different folds, they found that the intermediate sequences "lack a well defined three-dimensional structure." Thus, they conclude that it is unlikely that a new protein fold could arise via a series of folded intermediate sequences (Blanco et al. 1999:741).

Thus, although this second neo-Darwinian scenario has the advantage of starting with functional genes and proteins, it also has a lethal disadvantage: any process of random mutation or rearrangement in the genome would in all probability generate nonfunctional intermediate sequences before fundamentally new functional genes or proteins would arise. Clearly, nonfunctional intermediate sequences confer no survival advantage on their host organisms. Natural selection favors only functional advantage. It cannot select or favor nucleotide sequences or polypeptide chains that do not yet perform biological functions, and still less will it favor sequences that efface or destroy preexisting function.

Evolving genes and proteins will range through a series of nonfunctional intermediate sequences that natural selection will not favor or preserve but will, in all probability, eliminate (Blanco et al. 1999, Axe 2000). When this happens, selection-driven evolution will cease. At this point, neutral evolution of the genome (unhinged from selective pressure) may ensue, but, as we have seen, such a process must overcome immense probabilistic hurdles, even granting cosmic time.

Thus, whether one envisions the evolutionary process beginning with a noncoding region of the genome or a preexisting functional gene, the functional specificity and complexity of proteins impose very stringent limitations on the efficacy of mutation and selection. In the first case, function must arise first, before natural selection can act to favor a novel variation. In the second case, function must be continuously maintained in order to prevent deleterious (or lethal) consequences to the organism and to allow further evolution. Yet the complexity and functional specificity of proteins implies that both these conditions will be extremely difficult to meet. Therefore, the neo-Darwinian mechanism appears to be inadequate to generate the new information present in the novel genes and proteins that arise with the Cambrian animals.

Novel Body Plans

The problems with the neo-Darwinian mechanism run deeper still. In order to explain the origin of the Cambrian animals, one must account not only for new proteins and cell types, but also for the origin of new body plans. Within the past decade, developmental biology has dramatically advanced our understanding of how body plans are built during ontogeny. In the process, it has also uncovered a profound difficulty for neo-Darwinism.

Significant morphological change in organisms requires attention to timing. Mutations in genes that are expressed late in the development of an organism will not affect the body plan. Mutations expressed early in development, however, could conceivably produce significant morphological change (Arthur 1997:21). Thus, events expressed early in the development of organisms have the only realistic chance of producing large-scale macroevolutionary change (Thomson 1992). As John and Miklos (1988:309) explain, macroevolutionary change requires alterations in the very early stages of ontogenesis.

Yet recent studies in developmental biology make clear that mutations expressed early in development typically have deleterious effects (Arthur 1997:21). For example, when early-acting body plan molecules, or morphogens such as bicoid (which helps to set up the anterior-posterior head-to-tail axis in Drosophila), are perturbed, development shuts down (Nusslein-Volhard & Wieschaus 1980, Lawrence & Struhl 1996, Muller & Newman 2003).5 The resulting embryos die. Moreover, there is a good reason for this. If an engineer modifies the length of the piston rods in an internal combustion engine without modifying the crankshaft accordingly, the engine won't start. Similarly, processes of development are tightly integrated spatially and temporally such that changes early in development will require a host of other coordinated changes in separate but functionally interrelated developmental processes downstream. For this reason, mutations will be much more likely to be deadly if they disrupt a functionally deeply-embedded structure such as a spinal column than if they affect more isolated anatomical features such as fingers (Kauffman 1995:200).

This problem has led to what McDonald (1983) has called "a great Darwinian paradox" (p. 93). McDonald notes that genes that are observed to vary within natural populations do not lead to major adaptive changes, while genes that could cause major changes--the very stuff of macroevolution--apparently do not vary. In other words, mutations of the kind that macroevolution doesn't need (namely, viable genetic mutations in DNA expressed late in development) do occur, but those that it does need (namely, beneficial body plan mutations expressed early in development) apparently don't occur.6 According to Darwin (1859:108) natural selection cannot act until favorable variations arise in a population. Yet there is no evidence from developmental genetics that the kind of variations required by neo-Darwinism--namely, favorable body plan mutations--ever occur.

Developmental biology has raised another formidable problem for the mutation/selection mechanism. Embryological evidence has long shown that DNA does not wholly determine morphological form (Goodwin 1985, Nijhout 1990, Sapp 1987, Muller & Newman 2003), suggesting that mutations in DNA alone cannot account for the morphological changes required to build a new body plan.

DNA helps direct protein synthesis.7 It also helps to regulate the timing and expression of the synthesis of various proteins within cells. Yet, DNA alone does not determine how individual proteins assemble themselves into larger systems of proteins; still less does it solely determine how cell types, tissue types, and organs arrange themselves into body plans (Harold 1995:2774, Moss 2004). Instead, other factors--such as the three-dimensional structure and organization of the cell membrane and cytoskeleton and the spatial architecture of the fertilized egg--play important roles in determining body plan formation during embryogenesis.

For example, the structure and location of the cytoskeleton influence the patterning of embryos. Arrays of microtubules help to distribute the essential proteins used during development to their correct locations in the cell. Of course, microtubules themselves are made of many protein subunits. Nevertheless, like bricks that can be used to assemble many different structures, the tubulin subunits in the cell's microtubules are identical to one another. Thus, neither the tubulin subunits nor the genes that produce them account for the different shape of microtubule arrays that distinguish different kinds of embryos and developmental pathways. Instead, the structure of the microtubule array itself is determined by the location and arrangement of its subunits, not the properties of the subunits themselves. For this reason, it is not possible to predict the structure of the cytoskeleton of the cell from the characteristics of the protein constituents that form that structure (Harold 2001:125).

Two analogies may help further clarify the point. At a building site, builders will make use of many materials: lumber, wires, nails, drywall, piping, and windows. Yet building materials do not determine the floor plan of the house, or the arrangement of houses in a neighborhood. Similarly, electronic circuits are composed of many components, such as resistors, capacitors, and transistors. But such lower-level components do not determine their own arrangement in an integrated circuit. Biological systems also depend on hierarchical arrangements of parts. Genes and proteins are made from simple building blocks--nucleotide bases and amino acids--arranged in specific ways. Cell types are made of, among other things, systems of specialized proteins. Organs are made of specialized arrangements of cell types and tissues. And body plans comprise specific arrangements of specialized organs. Yet, clearly, the properties of individual proteins (or, indeed, the lower-level parts in the hierarchy generally) do not fully determine the organization of the higher-level structures and organizational patterns (Harold 2001:125). It follows that the genetic information that codes for proteins does not determine these higher-level structures either.

These considerations pose another challenge to the sufficiency of the neo-Darwinian mechanism. Neo-Darwinism seeks to explain the origin of new information, form, and structure as a result of selection acting on randomly arising variation at a very low level within the biological hierarchy, namely, within the genetic text. Yet major morphological innovations depend on a specificity of arrangement at a much higher level of the organizational hierarchy, a level that DNA alone does not determine. Yet if DNA is not wholly responsible for body plan morphogenesis, then DNA sequences can mutate indefinitely, without regard to realistic probabilistic limits, and still not produce a new body plan. Thus, the mechanism of natural selection acting on random mutations in DNA cannot in principle generate novel body plans, including those that first arose in the Cambrian explosion.

Of course, it could be argued that, while many single proteins do not by themselves determine cellular structures and/or body plans, proteins acting in concert with other proteins or suites of proteins could determine such higher-level form. For example, it might be pointed out that the tubulin subunits (cited above) are assembled by other helper proteins--gene products--called Microtubule Associated Proteins (MAPS). This might seem to suggest that genes and gene products alone do suffice to determine the development of the three-dimensional structure of the cytoskeleton.

Yet MAPS, and indeed many other necessary proteins, are only part of the story. The location of specified target sites on the interior of the cell membrane also helps to determine the shape of the cytoskeleton. So do the position and structure of the centrosome, which nucleates the microtubules that form the cytoskeleton. While both the membrane targets and the centrosomes are made of proteins, the location and form of these structures are not wholly determined by the proteins that form them. Indeed, centrosome structure and membrane patterns as a whole convey three-dimensional structural information that helps determine the structure of the cytoskeleton and the location of its subunits (McNiven & Porter 1992:313-329). Moreover, the centrioles that compose the centrosomes replicate independently of DNA replication (Lange et al. 2000:235-249, Marshall & Rosenbaum 2000:187-205). The daughter centriole receives its form from the overall structure of the mother centriole, not from the individual gene products that constitute it (Lange et al. 2000). In ciliates, microsurgery on cell membranes can produce heritable changes in membrane patterns, even though the DNA of the ciliates has not been altered (Sonneborn 1970:1-13, Frankel 1980:607-623; Nanney 1983:163-170). This suggests that membrane patterns (as opposed to membrane constituents) are impressed directly on daughter cells. In both cases, form is transmitted from parent three-dimensional structures to daughter three-dimensional structures directly and is not wholly contained in constituent proteins or genetic information (Moss 2004).

Thus, in each new generation, the form and structure of the cell arise as the result of both gene products and preexisting three-dimensional structure and organization. Cellular structures are built from proteins, but proteins find their way to correct locations in part because of preexisting three-dimensional patterns and organization inherent in cellular structures. Preexisting three-dimensional form present in the preceding generation (whether inherent in the cell membrane, the centrosomes, the cytoskeleton or other features of the fertilized egg) contributes to the production of form in the next generation. Neither structural proteins alone, nor the genes that code for them, are sufficient to determine the three-dimensional shape and structure of the entities they form. Gene products provide necessary, but not sufficient, conditions for the development of three-dimensional structure within cells, organs and body plans (Harold 1995:2767). But if this is so, then natural selection acting on genetic variation alone cannot produce the new forms that arise in the history of life.

Self-Organizational Models

Of course, neo-Darwinism is not the only evolutionary theory for explaining the origin of novel biological form. Kauffman (1995) doubts the efficacy of the mutation/selection mechanism. Nevertheless, he has advanced a self-organizational theory to account for the emergence of new form, and presumably the information necessary to generate it. Whereas neo-Darwinism attempts to explain new form as the consequence of selection acting on random mutation, Kauffman suggests that selection acts, not mainly on random variations, but on emergent patterns of order that self-organize via the laws of nature.

Kauffman (1995:47-92) illustrates how this might work with various model systems in a computer environment. In one, he conceives a system of buttons connected by strings. Buttons represent novel genes or gene products; strings represent the law-like forces of interaction that obtain between gene products--i.e., proteins. Kauffman suggests that when the complexity of the system (as represented by the number of buttons and strings) reaches a critical threshold, new modes of organization can arise in the system "for free"--that is, naturally and spontaneously--after the manner of a phase transition in chemistry.
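
The threshold behavior described here can be illustrated with a generic random-graph simulation. The sketch below is not Kauffman's own code. It assumes the simplest reading of the model, in which each string ties together two buttons chosen uniformly at random, and it tracks the largest connected cluster with a union-find structure. The function name, parameters, and seed are hypothetical.

```python
import random

def largest_cluster_fraction(n_buttons: int, n_strings: int, seed: int = 0) -> float:
    """Tie random pairs of buttons together; return the largest cluster's share."""
    rng = random.Random(seed)
    parent = list(range(n_buttons))   # union-find parent pointers
    size = [1] * n_buttons            # cluster size, valid at each root

    def find(x: int) -> int:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(n_strings):
        a, b = rng.sample(range(n_buttons), 2)  # two distinct buttons
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra                     # merge smaller into larger
            size[ra] += size[rb]

    return max(size[find(i)] for i in range(n_buttons)) / n_buttons

if __name__ == "__main__":
    n = 2000
    for ratio in (0.1, 0.3, 0.5, 0.7, 1.0, 2.0):
        share = largest_cluster_fraction(n, int(ratio * n))
        print(f"strings/buttons = {ratio:.1f}: largest cluster = {share:.2%}")
```

In runs of this kind, the largest cluster stays small while strings are scarce and grows to span most of the buttons once the ratio of strings to buttons passes roughly one half. That is the standard random-graph result the "phase transition" language refers to.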

Another model that Kauffman develops is a system of interconnected lights. Each light can flash in a variety of states--on, off, twinkling, etc. Since there is more than one possible state for each light, and many lights, there are a vast number of possible states that the system can adopt. Further, in his system, rules determine how past states will influence future states. Kauffman asserts that, as a result of these rules, the system will, if properly tuned, eventually produce a kind of order in which a few basic patterns of light activity recur with greater-than-random frequency. Since these actual patterns of light activity represent a small portion of the total number of possible states in which the system can reside, Kauffman seems to imply that self-organizational laws might similarly result in highly improbable biological outcomes--perhaps even sequences (of bases or amino acids) within a much larger sequence space of possibilities.
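
A toy version of such a light system can be written as a random Boolean network. The sketch below is a simplification made for illustration and is not Kauffman's implementation: it assumes each light has only two states (on or off), reads a fixed number of randomly chosen lights, and updates by a randomly generated rule table. All names and parameter values are hypothetical.

```python
import random

def make_network(n_lights: int, k_inputs: int, seed: int = 0):
    """Give each light k random input lights and a random on/off rule table."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_lights), k_inputs) for _ in range(n_lights)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)]
              for _ in range(n_lights)]
    return inputs, tables

def step(state, inputs, tables):
    """Update every light at once from the current state of its inputs."""
    new_state = []
    for wired, table in zip(inputs, tables):
        index = 0
        for j in wired:
            index = index * 2 + state[j]   # pack input bits into a table index
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return seen[state], t - seen[state]

inputs, tables = make_network(n_lights=12, k_inputs=2, seed=1)
start = tuple(random.Random(2).randint(0, 1) for _ in range(12))
transient, cycle = find_attractor(start, inputs, tables)
print(f"{2 ** 12} possible states; settled into a cycle of length {cycle} "
      f"after {transient} steps")
```

Because the update rule is deterministic and the number of states is finite, every run must eventually fall into a repeating cycle. How short those cycles are depends on the chosen number of inputs and the rule tables, which corresponds to the "tuning" discussed in the text.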

Do these simulations of self-organizational processes accurately model the origin of novel genetic information? It is hard to think so.

First, in both examples, Kauffman presupposes but does not explain significant sources of preexisting information. In his buttons-and-strings system, the buttons represent proteins, themselves packets of CSI, and the result of preexisting genetic information. Where does this information come from? Kauffman (1995) doesn't say, but the origin of such information is an essential part of what needs to be explained in the history of life. Similarly, in his light system, the order that allegedly arises "for free" actually arises only if the programmer of the model system "tunes" it in such a way as to keep it from either (a) generating an excessively rigid order or (b) developing into chaos (pp. 86-88). Yet this necessary tuning involves an intelligent programmer selecting certain parameters and excluding others--that is, inputting information.

Second, Kauffman's model systems are not constrained by functional considerations and thus are not analogous to biological systems. A system of interconnected lights governed by pre-programmed rules may well settle into a small number of patterns within a much larger space of possibilities. But because these patterns have no function, and need not meet any functional requirements, they have no specificity analogous to that present in actual organisms. Instead, examination of Kauffman's (1995) model systems shows that they do not produce sequences or systems characterized by specified complexity, but instead by large amounts of symmetrical order or internal redundancy interspersed with aperiodicity or (mere) complexity (pp. 53, 89, 102). Getting a law-governed system to generate repetitive patterns of flashing lights, even with a certain amount of variation, is clearly interesting, but not biologically relevant. On the other hand, a system of lights flashing the title of a Broadway play would model a biologically relevant self-organizational process, at least if such a meaningful or functionally specified sequence arose without intelligent agents previously programming the system with equivalent amounts of CSI. In any case, Kauffman's systems do not produce specified complexity, and thus do not offer promising models for explaining the new genes and proteins that arose in the Cambrian.

Even so, Kauffman suggests that his self-organizational models can specifically elucidate aspects of the Cambrian explosion. According to Kauffman (1995:199-201), new Cambrian animals emerged as the result of "long jump" mutations that established new body plans in a discrete rather than gradual fashion. He also recognizes that mutations affecting early development are almost inevitably harmful. Thus, he concludes that body plans, once established, will not change, and that any subsequent evolution must occur within an established body plan (Kauffman 1995:201). And indeed, the fossil record does show a curious (from a neo-Darwinian point of view) top-down pattern of appearance, in which higher taxa (and the body plans they represent) appear first, only later to be followed by the multiplication of lower taxa representing variations within those original body designs (Erwin et al. 1987, Lewin 1988, Valentine & Jablonski 2003:518). Further, as Kauffman expects, body plans appear suddenly and persist without significant modification over time.

But here, again, Kauffman begs the most important question, which is: what produces the new Cambrian body plans in the first place? Granted, he invokes "long jump mutations" to explain this, but he identifies no specific self-organizational process that can produce such mutations. Moreover, he concedes a principle that undermines the plausibility of his own proposal. Kauffman acknowledges that mutations that occur early in development are almost inevitably deleterious. Yet developmental biologists know that these are the only kind of mutations that have a realistic chance of producing large-scale evolutionary change--i.e., the big jumps that Kauffman invokes. Though Kauffman repudiates the neo-Darwinian reliance upon random mutations in favor of self-organizing order, in the end, he must invoke the most implausible kind of random mutation in order to provide a self-organizational account of the new Cambrian body plans. Clearly, his model is not sufficient.

Punctuated Equilibrium

Of course, still other causal explanations have been proposed. During the 1970s, the paleontologists Eldredge and Gould (1972) proposed the theory of evolution by punctuated equilibrium in order to account for a pervasive pattern of "sudden appearance" and "stasis" in the fossil record. Though advocates of punctuated equilibrium were mainly seeking to describe the fossil record more accurately than earlier gradualist neo-Darwinian models had done, they did also propose a mechanism--known as species selection--by which the large morphological jumps evident in the fossil record might have been produced. According to punctuationalists, natural selection functions more as a mechanism for selecting the fittest species than as one for selecting the fittest individuals within a species. Accordingly, on this model, morphological change should occur in larger, more discrete intervals than it would given a traditional neo-Darwinian understanding.

Despite its virtues as a descriptive model of the history of life, punctuated equilibrium has been widely criticized for failing to provide a mechanism sufficient to produce the novel form characteristic of higher taxonomic groups. For one thing, critics have noted that the proposed mechanism of punctuated evolutionary change simply lacked the raw material upon which to work. As Valentine and Erwin (1987) note, the fossil record fails to document a large pool of species prior to the Cambrian. Yet the proposed mechanism of species selection requires just such a pool of species upon which to act. Thus, they conclude that the mechanism of species selection probably does not resolve the problem of the origin of the higher taxonomic groups (p. 96).8 Further, punctuated equilibrium has not addressed the more specific and fundamental problem of explaining the origin of the new biological information (whether genetic or epigenetic) necessary to produce novel biological form. Advocates of punctuated equilibrium might assume that the new species (upon which natural selection acts) arise by known microevolutionary processes of speciation (such as founder effect, genetic drift or bottleneck effect) that do not necessarily depend upon mutations to produce adaptive changes. But, in that case, the theory lacks an account of how the specifically higher taxa arise. Species selection will only produce more fit species. On the other hand, if punctuationalists assume that processes of genetic mutation can produce more fundamental morphological changes and variations, then their model becomes subject to the same problems as neo-Darwinism (see above). This dilemma is evident in Gould (2002:710) insofar as his attempts to explain adaptive complexity inevitably employ classical neo-Darwinian modes of explanation.9

Structuralism

Another attempt to explain the origin of form has been proposed by the structuralists such as Gerry Webster and Brian Goodwin (1984, 1996). These biologists, drawing on the earlier work of D'Arcy Thompson (1942), view biological form as the result of structural constraints imposed upon matter by morphogenetic rules or laws. For reasons similar to those discussed above, the structuralists have insisted that these generative or morphogenetic rules do not reside in the lower level building materials of organisms, whether in genes or proteins. Webster and Goodwin (1984:510-511) further envisioned morphogenetic rules or laws operating ahistorically, similar to the way in which gravitational or electromagnetic laws operate. For this reason, structuralists see phylogeny as of secondary importance in understanding the origin of the higher taxa, though they think that transformations of form can occur. For structuralists, constraints on the arrangement of matter arise not mainly as the result of historical contingencies--such as environmental changes or genetic mutations--but instead because of the continuous ahistorical operation of fundamental laws of form--laws that organize or inform matter.

While this approach avoids many of the difficulties currently afflicting neo-Darwinism (in particular those associated with its "genocentricity"), critics (such as Maynard Smith 1986) of structuralism have argued that the structuralist explanation of form lacks specificity. They note that structuralists have been unable to say just where laws of form reside--whether in the universe, or in every possible world, or in organisms as a whole, or in just some part of organisms. Further, according to structuralists, morphogenetic laws are mathematical in character. Structuralists, however, have yet to specify the mathematical formulae that determine biological forms.

Others (Yockey 1992; Polanyi 1967, 1968; Meyer 2003) have questioned whether physical laws could in principle generate the kind of complexity that characterizes biological systems. Structuralists envision the existence of biological laws that produce form in much the same way that physical laws produce form. Yet the forms that physicists regard as manifestations of underlying laws are characterized by large amounts of symmetric or redundant order, by relatively simple patterns such as vortices or gravitational fields or magnetic lines of force. Indeed, physical laws are typically expressed as differential equations (or algorithms) that almost by definition describe recurring phenomena--patterns of compressible "order" not "complexity" as defined by algorithmic information theory (Yockey 1992:77-83). Biological forms, by contrast, manifest greater complexity and derive in ontogeny from highly complex initial conditions--i.e., non-redundant sequences of nucleotide bases in the genome and other forms of information expressed in the complex and irregular three-dimensional topography of the organism or the fertilized egg. Thus, the kind of form that physical laws produce is not analogous to biological form--at least not when compared from the standpoint of (algorithmic) complexity. Further, physical laws lack the information content to specify biological systems. As Polanyi (1967, 1968) and Yockey (1992:290) have shown, the laws of physics and chemistry allow, but do not determine, distinctively biological modes of organization. In other words, living systems are consistent with, but not deducible from, physical-chemical laws (Yockey 1992:290).
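
The distinction between compressible "order" and algorithmic "complexity" can be made concrete with a rough numerical illustration. The sketch below is an assumption-laden proxy, not a measurement of algorithmic information content: it uses the general-purpose compressor in Python's `zlib` module, whose output size gives only an upper bound on how briefly a sequence can be described. The helper names and the sequence length are hypothetical. A strictly repeating sequence of bases shrinks to a tiny fraction of its length, whereas an aperiodic sequence drawn at random from the same four letters cannot be squeezed much below about two bits per letter.

```python
import random
import zlib


def compression_ratio(text):
    """Compressed size divided by original size (smaller = more redundant)."""
    raw = text.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


def repetitive_sequence(length):
    """Highly ordered sequence: the same four bases repeated throughout."""
    return ("ACGT" * (length // 4 + 1))[:length]


def aperiodic_sequence(length, seed=0):
    """Non-repeating sequence: each base drawn independently at random."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    n = 10_000  # illustrative sequence length
    print("repetitive:", round(compression_ratio(repetitive_sequence(n)), 3))
    print("aperiodic: ", round(compression_ratio(aperiodic_sequence(n)), 3))
```

Compressibility separates order from complexity only. It says nothing about whether an aperiodic sequence is also functionally specified, which is the further property at issue in this review.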

Of course, biological systems do manifest some recurring patterns, processes and behaviors. The same type of organism develops repeatedly from similar ontogenetic processes in the same species. Similar processes of cell division recur in many organisms. Thus, one might describe certain biological processes as law-governed. Even so, the existence of such biological regularities does not solve the problem of the origin of form and information, since the recurring processes described by such biological laws (if there be such laws) only occur as the result of preexisting stores of (genetic and/or epigenetic) information and these information-rich initial conditions impose the constraints that produce the recurring behavior in biological systems. (For example, processes of cell division recur with great frequency in organisms, but depend upon information-rich DNA and protein molecules.) In other words, distinctively biological regularities depend upon preexisting biological information. Thus, appeals to higher-level biological laws presuppose, but do not explain, the origination of the information necessary to morphogenesis.

Thus, structuralism faces a difficult in-principle dilemma. On the one hand, physical laws produce very simple redundant patterns that lack the complexity characteristic of biological systems. On the other hand, distinctively biological laws--if there are such laws--depend upon preexisting information-rich structures. In either case, laws are not good candidates for explaining the origination of biological form or the information necessary to produce it.

Cladism: An Artifact of Classification?

Some cladists have advanced another approach to the problem of the origin of form, specifically as it arises in the Cambrian. They have argued that the problem of the origin of the phyla is an artifact of the classification system, and therefore, does not require explanation. Budd and Jensen (2000), for example, argue that the problem of the Cambrian explosion resolves itself if one keeps in mind the cladistic distinction between "stem" and "crown" groups. Since crown groups arise whenever new characters are added to simpler more ancestral stem groups during the evolutionary process, new phyla will inevitably arise once a new stem group has arisen. Thus, for Budd and Jensen what requires explanation is not the crown groups corresponding to the new Cambrian phyla, but the earlier more primitive stem groups that presumably arose deep in the Proterozoic. Yet since these earlier stem groups are by definition less derived, explaining them will be considerably easier than explaining the origin of the Cambrian animals de novo. In any case, for Budd and Jensen the explosion of new phyla in the Cambrian does not require explanation. As they put it, "given that the early branching points of major clades is an inevitable result of clade diversification, the alleged phenomenon of the phyla appearing early and remaining morphologically static is not seen to require particular explanation" (Budd & Jensen 2000:253).

While superficially plausible, perhaps, Budd and Jensen's attempt to explain away the Cambrian explosion begs crucial questions. Granted, as new characters are added to existing forms, novel morphology and greater morphological disparity will likely result. But what causes new characters to arise? And how does the information necessary to produce new characters originate? Budd and Jensen do not specify. Nor can they say how derived the ancestral forms are likely to have been, and what processes might have been sufficient to produce them. Instead, they simply assume the sufficiency of known neo-Darwinian mechanisms (Budd & Jensen 2000:288). Yet, as shown above, this assumption is now problematic. In any case, Budd and Jensen do not explain what causes the origination of biological form and information.

Convergence and Teleological Evolution

More recently, Conway Morris (2000, 2003c) has suggested another possible explanation based on the tendency for evolution to converge on the same structural forms during the history of life. Conway Morris cites numerous examples of organisms that possess very similar forms and structures, even though such structures are often built from different material substrates and arise (in ontogeny) by the expression of very different genes. Given the extreme improbability of the same structures arising by random mutation and selection in disparate phylogenies, Conway Morris argues that the pervasiveness of convergent structures suggests that evolution may be in some way "channeled" toward similar functional and/or structural endpoints. Such an end-directed understanding of evolution, he admits, raises the controversial prospect of a teleological or purposive element in the history of life. For this reason, he argues that the phenomenon of convergence has received less attention than it might have otherwise. Nevertheless, he argues that just as physicists have reopened the question of design in their discussions of anthropic fine-tuning, the ubiquity of convergent structures in the history of life has led some biologists (Denton 1998) to consider extending teleological thinking to biology. And, indeed, Conway Morris himself intimates that the evolutionary process might be "underpinned by a purpose" (2000:8, 2003b:511).

Conway Morris, of course, considers this possibility in relation to a very specific aspect of the problem of organismal form, namely, the problem of explaining why the same forms arise repeatedly in so many disparate lines of descent. But this raises a question. Could a similar approach shed explanatory light on the more general causal question that has been addressed in this review? Could the notion of purposive design help provide a more adequate explanation for the origin of organismal form generally? Are there reasons to consider design as an explanation for the origin of the biological information necessary to produce the higher taxa and their corresponding morphological novelty?

The remainder of this review will suggest that there are such reasons. In so doing, it may also help explain why the issue of teleology or design has reemerged within the scientific discussion of biological origins (Denton 1986, 1998; Thaxton et al. 1992; Kenyon & Mills 1996; Behe 1996, 2004; Dembski 1998, 2002, 2004; Conway Morris 2000, 2003a, 2003b; Lonnig 2001; Lonnig & Saedler 2002; Nelson & Wells 2003; Meyer 2003, 2004; Bradley 2004) and why some scientists and philosophers of science have considered teleological explanations for the origin of form and information despite strong methodological prohibitions against design as a scientific hypothesis (Gillespie 1979, Lenoir 1982:4).

First, the possibility of design as an explanation follows logically from a consideration of the deficiencies of neo-Darwinism and other current theories as explanations for some of the more striking "appearances of design" in biological systems. Neo-Darwinists such as Ayala (1994:5), Dawkins (1986:1), Mayr (1982:xi-xii) and Lewontin (1978) have long acknowledged that organisms appear to have been designed. Of course, neo-Darwinists assert that what Ayala (1994:5) calls the "obvious design" of living things is only apparent since the selection/mutation mechanism can explain the origin of complex form and organization in living systems without an appeal to a designing agent. Indeed, neo-Darwinists affirm that mutation and selection--and perhaps other similarly undirected mechanisms--are fully sufficient to explain the appearance of design in biology. Self-organizational theorists and punctuationalists modify this claim, but affirm its essential tenet. Self-organization theorists argue that natural selection acting on self-organizing order can explain the complexity of living things--again, without any appeal to design. Punctuationalists similarly envision natural selection acting on newly arising species with no actual design involved.

And clearly, the neo-Darwinian mechanism does explain many appearances of design, such as the adaptation of organisms to specialized environments that attracted the interest of 19th century biologists. More specifically, known microevolutionary processes appear quite sufficient to account for changes in the size of Galapagos finch beaks that have occurred in response to variations in annual rainfall and available food supplies (Weiner 1994, Grant 1999).

But does neo-Darwinism, or any other fully materialistic model, explain all appearances of design in biology, including the body plans and information that characterize living systems? Arguably, biological forms--such as the structure of a chambered nautilus, the organization of a trilobite, the functional integration of parts in an eye or molecular machine--attract our attention in part because the organized complexity of such systems seems reminiscent of our own designs. Yet, this review has argued that neo-Darwinism does not adequately account for the origin of all appearances of design, especially if one considers animal body plans, and the information necessary to construct them, as especially striking examples of the appearance of design in living systems. Indeed, Dawkins (1995:11) and Gates (1996:228) have noted that genetic information bears an uncanny resemblance to computer software or machine code. For this reason, the presence of CSI in living organisms, and the discontinuous increases of CSI that occurred during events such as the Cambrian explosion, appear at least suggestive of design.

Does neo-Darwinism or any other purely materialistic model of morphogenesis account for the origin of the genetic and other forms of CSI necessary to produce novel organismal form? If not, as this review has argued, could the emergence of novel information-rich genes, proteins, cell types and body plans have resulted from actual design, rather than a purposeless process that merely mimics the powers of a designing intelligence? The logic of neo-Darwinism, with its specific claim to have accounted for the appearance of design, would itself seem to open the door to this possibility. Indeed, the historical formulation of Darwinism in dialectical opposition to the design hypothesis (Gillespie 1979), coupled with neo-Darwinism's inability to account for many salient appearances of design including the emergence of form and information, would seem logically to reopen the possibility of actual (as opposed to apparent) design in the history of life.

A second reason for considering design as an explanation for these phenomena follows from the importance of explanatory power to scientific theory evaluation and from a consideration of the potential explanatory power of the design hypothesis. Studies in the methodology and philosophy of science have shown that many scientific theories, particularly in the historical sciences, are formulated and justified as inferences to the best explanation (Lipton 1991:32-88, Brush 1989:1124-1129, Sober 2000:44). Historical scientists, in particular, assess or test competing hypotheses by evaluating which hypothesis would, if true, provide the best explanation for some set of relevant data (Meyer 1991, 2002; Cleland 2001:987-989, 2002:474-496).10 Those with greater explanatory power are typically judged to be better, more probably true, theories. Darwin (1896:437) used this method of reasoning in defending his theory of universal common descent. Moreover, contemporary studies on the method of "inference to the best explanation" have shown that determining which among a set of competing possible explanations constitutes the best depends upon judgments about the causal adequacy, or "causal powers," of competing explanatory entities (Lipton 1991:32-88). In the historical sciences, uniformitarian and/or actualistic (Gould 1965, Simpson 1970, Rutten 1971, Hooykaas 1975) canons of method suggest that judgments about causal adequacy should derive from our present knowledge of cause and effect relationships. For historical scientists, "the present is the key to the past" means that present experience-based knowledge of cause and effect relationships typically guides the assessment of the plausibility of proposed causes of past events.

Yet it is precisely for this reason that current advocates of the design hypothesis want to reconsider design as an explanation for the origin of biological form and information. This review, and much of the literature it has surveyed, suggests that four of the most prominent models for explaining the origin of biological form fail to provide adequate causal explanations for the discontinuous increases of CSI that are required to produce novel morphologies. Yet, we have repeated experience of rational and conscious agents--in particular ourselves--generating or causing increases in complex specified information, both in the form of sequence-specific lines of code and in the form of hierarchically arranged systems of parts.

In the first place, intelligent human agents--in virtue of their rationality and consciousness--have demonstrated the power to produce information in the form of linear sequence-specific arrangements of characters. Indeed, experience affirms that information of this type routinely arises from the activity of intelligent agents. A computer user who traces the information on a screen back to its source invariably comes to a mind--that of a software engineer or programmer. The information in a book or inscription ultimately derives from a writer or scribe--from a mental, rather than a strictly material, cause. Our experience-based knowledge of information-flow confirms that systems with large amounts of specified complexity (especially codes and languages) invariably originate from an intelligent source--from a mind or personal agent. As Quastler (1964) put it, the "creation of new information is habitually associated with conscious activity" (p. 16). Experience teaches this obvious truth.

Further, the highly specified hierarchical arrangements of parts in animal body plans also suggest design, again because of our experience of the kinds of features and systems that designers can and do produce. At every level of the biological hierarchy, organisms require specified and highly improbable arrangements of lower-level constituents in order to maintain their form and function. Genes require specified arrangements of nucleotide bases; proteins require specified arrangements of amino acids; new cell types require specified arrangements of systems of proteins; body plans require specialized arrangements of cell types and organs. Organisms not only contain information-rich components (such as proteins and genes), but they comprise information-rich arrangements of those components and the systems that comprise them. Yet we know, based on our present experience of cause and effect relationships, that design engineers--possessing purposive intelligence and rationality--have the ability to produce information-rich hierarchies in which both individual modules and the arrangements of those modules exhibit complexity and specificity--information so defined. Individual transistors, resistors, and capacitors exhibit considerable complexity and specificity of design; at a higher level of organization, their specific arrangement within an integrated circuit represents additional information and reflects further design. Conscious and rational agents have, as part of their powers of purposive intelligence, the capacity to design information-rich parts and to organize those parts into functional information-rich systems and hierarchies. Further, we know of no other causal entity or process that has this capacity. Clearly, we have good reason to doubt that mutation and selection, self-organizational processes or laws of nature, can produce the information-rich components, systems, and body plans necessary to explain the origination of morphological novelty such as that which arises in the Cambrian period.

There is a third reason to consider purpose or design as an explanation for the origin of biological form and information: purposive agents have just those necessary powers that natural selection lacks as a condition of its causal adequacy. At several points in the previous analysis, we saw that natural selection lacked the ability to generate novel information precisely because it can only act after new functional CSI has arisen. Natural selection can favor new proteins, and genes, but only after they perform some function. The job of generating new functional genes, proteins and systems of proteins therefore falls entirely to random mutations. Yet without functional criteria to guide a search through the space of possible sequences, random variation is probabilistically doomed. What is needed is not just a source of variation (i.e., the freedom to search a space of possibilities) or a mode of selection that can operate after the fact of a successful search, but instead a means of selection that (a) operates during a search--before success--and that (b) is guided by information about, or knowledge of, a functional target.

Demonstration of this requirement has come from an unlikely quarter: genetic algorithms. Genetic algorithms are programs that allegedly simulate the creative power of mutation and selection. Dawkins and Kuppers, for example, have developed computer programs that putatively simulate the production of genetic information by mutation and natural selection (Dawkins 1986:47-49, Kuppers 1987:355-369). Nevertheless, as shown elsewhere (Meyer 1998:127-128, 2003:247-248), these programs only succeed by the illicit expedient of providing the computer with a "target sequence" and then treating relatively greater proximity to future function (i.e., the target sequence), not actual present function, as a selection criterion. As Berlinski (2000) has argued, genetic algorithms need something akin to a "forward looking memory" in order to succeed. Yet such foresighted selection has no analogue in nature. In biology, where differential survival depends upon maintaining function, selection cannot occur before new functional sequences arise. Natural selection lacks foresight.
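
The role of the target sequence is easy to see in code. The following is a minimal sketch written in the spirit of Dawkins's well-known example, whose target phrase was "METHINKS IT IS LIKE A WEASEL"; it is not a reproduction of his program, and the brood size, mutation rate, and function names are assumptions made for illustration. The point to notice is the scoring function: each candidate is ranked by how many characters already match the stored target, not by anything the candidate itself can do.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "


def proximity_to_target(candidate, target=TARGET):
    """Selection criterion: number of positions that already match the target.
    The program must be given the target in advance for this to work."""
    return sum(c == t for c, t in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy the parent, replacing each character at random with probability rate."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else c
                   for c in parent)


def run_target_search(offspring=100, rate=0.05, seed=0, max_generations=5000):
    """Repeatedly breed mutated copies and keep whichever string is closest
    to the target. Returns (final string, generations used)."""
    rng = random.Random(seed)
    current = "".join(rng.choice(ALPHABET) for _ in TARGET)
    generation = 0
    while current != TARGET and generation < max_generations:
        brood = [mutate(current, rate, rng) for _ in range(offspring)]
        # Keep the parent in the pool so the score never decreases.
        current = max(brood + [current], key=proximity_to_target)
        generation += 1
    return current, generation


if __name__ == "__main__":
    final, generations = run_target_search()
    print(f"reached {final!r} after {generations} generations")
```

If `proximity_to_target` is removed, the program has no way to prefer one non-matching string over another, and the search reduces to blind sampling of the sequence space.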

What natural selection lacks, intelligent selection--purposive or goal-directed design--provides. Rational agents can arrange both matter and symbols with distant goals in mind. In using language, the human mind routinely "finds" or generates highly improbable linguistic sequences to convey an intended or preconceived idea. In the process of thought, functional objectives precede and constrain the selection of words, sounds and symbols to generate functional (and indeed meaningful) sequences from among a vast ensemble of meaningless alternative combinations of sound or symbol (Denton 1986:309-311). Similarly, the construction of complex technological objects and products, such as bridges, circuit boards, engines and software, results from the application of goal-directed constraints (Polanyi 1967, 1968). Indeed, in all functionally integrated complex systems where the cause is known by experience or observation, design engineers or other intelligent agents applied boundary constraints to limit possibilities in order to produce improbable forms, sequences or structures. Rational agents have repeatedly demonstrated the capacity to constrain the possible to actualize improbable but initially unrealized future functions. Repeated experience affirms that intelligent agents (minds) uniquely possess such causal powers.

Analysis of the problem of the origin of biological information, therefore, exposes a deficiency in the causal powers of natural selection that corresponds precisely to powers that agents are uniquely known to possess. Intelligent agents have foresight. Such agents can select functional goals before they exist. They can devise or select material means to accomplish those ends from among an array of possibilities and then actualize those goals in accord with a preconceived design plan or set of functional requirements. Rational agents can constrain combinatorial space with distant outcomes in mind. The causal powers that natural selection lacks--almost by definition--are associated with the attributes of consciousness and rationality--with purposive intelligence. Thus, by invoking design to explain the origin of new biological information, contemporary design theorists are not positing an arbitrary explanatory element unmotivated by a consideration of the evidence. Instead, they are positing an entity possessing precisely the attributes and causal powers that the phenomenon in question requires as a condition of its production and explanation.


An experience-based analysis of the causal powers of various explanatory hypotheses suggests purposive or intelligent design as a causally adequate--and perhaps the most causally adequate--explanation for the origin of the complex specified information required to build the Cambrian animals and the novel forms they represent. For this reason, recent scientific interest in the design hypothesis is unlikely to abate as biologists continue to wrestle with the problem of the origination of biological form and the higher taxa.

Literature Cited

Adams, M. D., et al. 2000. The genome sequence of Drosophila melanogaster.--Science 287:2185-2195.

Aris-Brosou, S., & Z. Yang. 2003. Bayesian models of episodic evolution support a late Precambrian explosive diversification of the Metazoa.--Molecular Biology and Evolution 20:1947-1954.

Arthur, W. 1997. The origin of animal body plans. Cambridge University Press, Cambridge, United Kingdom.

Axe, D. D. 2000. Extreme functional sensitivity to conservative amino acid changes on enzyme exteriors.--Journal of Molecular Biology 301(3):585-596.

______. 2004. Estimating the prevalence of protein sequences adopting functional enzyme folds.--Journal of Molecular Biology (in press).

Ayala, F. 1994. Darwin's revolution. Pp. 1-17 in J. Campbell and J. Schopf, eds., Creative evolution?! Jones and Bartlett Publishers, Boston, Massachusetts.

______. A. Rzhetsky, & F. J. Ayala. 1998. Origin of the metazoan phyla: molecular clocks confirm paleontological estimates--Proceedings of the National Academy of Sciences USA. 95:606-611.

Becker, H., & W. Lonnig, 2001. Transposons: eukaryotic. Pp. 529-539 in Nature encyclopedia of life sciences, vol. 18. Nature Publishing Group, London, United Kingdom.

Behe, M. 1992. Experimental support for regarding functional classes of proteins to be highly isolated from each other. Pp. 60-71 in J. Buell and V. Hearn, eds., Darwinism: science or philosophy? Foundation for Thought and Ethics, Richardson, Texas.

______. 1996. Darwin's black box. The Free Press, New York.

______. 2004. Irreducible complexity: obstacle to Darwinian evolution. Pp. 352-370 in W. A. Dembski and M. Ruse, eds., Debating design: from Darwin to DNA. Cambridge University Press, Cambridge, United Kingdom.

Benton, M., & F. J. Ayala. 2003. Dating the tree of life--Science 300:1698-1700.

Berlinski, D. 2000. "On assessing genetic algorithms." Public lecture. Conference: Science and evidence of design in the universe. Yale University, November 4, 2000.

Blanco, F., I. Angrand, & L. Serrano. 1999. Exploring the conformational properties of the sequence space between two proteins with different folds: an experimental study.--Journal of Molecular Biology 285:741-753.

Bowie, J., & R. Sauer. 1989. Identifying determinants of folding and activity for a protein of unknown sequences: tolerance to amino acid substitution.--Proceedings of the National Academy of Sciences, U.S.A. 86:2152-2156.

Bowring, S. A., J. P. Grotzinger, C. E. Isachsen, A. H. Knoll, S. M. Pelechaty, & P. Kolosov. 1993. Calibrating rates of early Cambrian evolution.--Science 261:1293-1298.

______. 1998a. A new look at evolutionary rates in deep time: Uniting paleontology and high-precision geochronology.--GSA Today 8:1-8.

______. 1998b. Geochronology comes of age.--Geotimes 43:36-40.

Bradley, W. 2004. Information, entropy and the origin of life. Pp. 331-351 in W. A. Dembski and M. Ruse, eds., Debating design: from Darwin to DNA. Cambridge University Press, Cambridge, United Kingdom.

Brocks, J. J., G. A. Logan, R. Buick, & R. E. Summons. 1999. Archean molecular fossils and the early rise of eukaryotes.--Science 285:1033-1036.

Brush, S. G. 1989. Prediction and theory evaluation: the case of light bending.--Science 246:1124-1129.

Budd, G. E. & S. E. Jensen. 2000. A critical reappraisal of the fossil record of the bilaterian phyla.--Biological Reviews of the Cambridge Philosophical Society 75:253-295.

Carroll, R. L. 2000. Towards a new evolutionary synthesis.--Trends in Ecology and Evolution 15:27-32.

Cleland, C. 2001. Historical science, experimental science, and the scientific method.--Geology 29:987-990.

______. 2002. Methodological and epistemic differences between historical science and experimental science.--Philosophy of Science 69:474-496.

Chothia, C., I. Gelfand, & A. Kister. 1998. Structural determinants in the sequences of immunoglobulin variable domain.--Journal of Molecular Biology 278:457-479.

Conway Morris, S. 1998a. The question of metazoan monophyly and the fossil record.--Progress in Molecular and Subcellular Biology 21:1-9.

______. 1998b. Early Metazoan evolution: Reconciling paleontology and molecular biology.--American Zoologist 38:867-877.

______. 2000. Evolution: bringing molecules into the fold.--Cell 100:1-11.

______. 2003a. The Cambrian "explosion" of metazoans. Pp. 13-32 in G. B. Muller and S. A. Newman, eds., Origination of organismal form: beyond the gene in developmental and evolutionary biology. The M.I.T. Press, Cambridge, Massachusetts.

______. 2003b. Cambrian "explosion" of metazoans and molecular biology: would Darwin be satisfied?--International Journal of Developmental Biology 47(7-8):505-515.

______. 2003c. Life's solution: inevitable humans in a lonely universe. Cambridge University Press, Cambridge, United Kingdom.

Crick, F. 1958. On protein synthesis.--Symposium for the Society of Experimental Biology 12:138-163.

Darwin, C. 1859. On the origin of species. John Murray, London, United Kingdom.

______. 1896. Letter to Asa Gray. P. 437 in F. Darwin, ed., Life and letters of Charles Darwin, vol. 1., D. Appleton, London, United Kingdom.

Davidson, E. 2001. Genomic regulatory systems: development and evolution. Academic Press, New York, New York.

Dawkins, R. 1986. The blind watchmaker. Penguin Books, London, United Kingdom.

______. 1995. River out of Eden. Basic Books, New York.

______. 1996. Climbing Mount Improbable. W. W. Norton & Company, New York.

Dembski, W. A. 1998. The design inference. Cambridge University Press, Cambridge, United Kingdom.

______. 2002. No free lunch: why specified complexity cannot be purchased without intelligence. Rowman & Littlefield, Lanham, Maryland.

______. 2004. The logical underpinnings of intelligent design. Pp. 311-330 in W. A. Dembski and M. Ruse, eds., Debating design: from Darwin to DNA. Cambridge University Press, Cambridge, United Kingdom.

Denton, M. 1986. Evolution: a theory in crisis. Adler & Adler, London, United Kingdom.

______. 1998. Nature's destiny. The Free Press, New York.

Eden, M. 1967. The inadequacies of neo-Darwinian evolution as a scientific theory. Pp. 5-12 in P. S. Morehead and M. M. Kaplan, eds., Mathematical challenges to the Darwinian interpretation of evolution. Wistar Institute Symposium Monograph, Alan R. Liss, New York.

Eldredge, N., & S. J. Gould. 1972. Punctuated equilibria: an alternative to phyletic gradualism. Pp. 82-115 in T. Schopf, ed., Models in paleobiology. W. H. Freeman, San Francisco.

Erwin, D. H. 1994. Early introduction of major morphological innovations.--Acta Palaeontologica Polonica 38:281-294.

______. 2000. Macroevolution is more than repeated rounds of microevolution.--Evolution & Development 2:78-84.

______. 2004. One very long argument.--Biology and Philosophy 19:17-28.

______, J. Valentine, & D. Jablonski. 1997. The origin of animal body plans.--American Scientist 85:126-137.

______, ______, & J. J. Sepkoski. 1987. A comparative study of diversification events: the early Paleozoic versus the Mesozoic.--Evolution 41:1177-1186.

Foote, M. 1997. Sampling, taxonomic description, and our evolving knowledge of morphological diversity.--Paleobiology 23:181-206.

______, J. P. Hunter, C. M. Janis, & J. J. Sepkoski. 1999. Evolutionary and preservational constraints on origins of biologic groups: Divergence times of eutherian mammals.--Science 283:1310-1314.

Frankel, J. 1980. Propagation of cortical differences in tetrahymena.--Genetics 94:607-623.

Gates, B. 1996. The road ahead. Blue Penguin, Boulder, Colorado.

Gerhart, J., & M. Kirschner. 1997. Cells, embryos, and evolution. Blackwell Science, London, United Kingdom.

Gibbs, W. W. 2003. The unseen genome: gems among the junk.--Scientific American. 289:46-53.

Gilbert, S. F., J. M. Opitz, & R. A. Raff. 1996. Resynthesizing evolutionary and developmental biology.--Developmental Biology 173:357-372.

Gillespie, N. C. 1979. Charles Darwin and the problem of creation. University of Chicago Press, Chicago.

Goodwin, B. C. 1985. What are the causes of morphogenesis?--BioEssays 3:32-36.

______. 1995. How the leopard changed its spots: the evolution of complexity. Scribner's, New York, New York.

Gould, S. J. 1965. Is uniformitarianism necessary?--American Journal of Science 263:223-228.

Gould, S. J. 2002. The structure of evolutionary theory. Harvard University Press, Cambridge, Massachusetts.

Grant, P. R. 1999. Ecology and evolution of Darwin's finches. Princeton University Press, Princeton, New Jersey.

Grimes, G. W., & K. J. Aufderheide. 1991. Cellular aspects of pattern formation: the problem of assembly. Monographs in Developmental Biology, vol. 22. Karger, Basel, Switzerland.

Grotzinger, J. P., S. A. Bowring, B. Z. Saylor, & A. J. Kaufman. 1995. Biostratigraphic and geochronologic constraints on early animal evolution.--Science 270:598-604.

Harold, F. M. 1995. From morphogenes to morphogenesis.--Microbiology 141:2765-2778.

______. 2001. The way of the cell: molecules, organisms, and the order of life. Oxford University Press, New York.

Hodge, M. J. S. 1977. The structure and strategy of Darwin's long argument.--British Journal for the History of Science 10:237-245.

Hooykaas, R. 1975. Catastrophism in geology, its scientific character in relation to actualism and uniformitarianism. Pp. 270-316 in C. Albritton, ed., Philosophy of geohistory (1785-1970). Dowden, Hutchinson & Ross, Stroudsburg, Pennsylvania.

John, B., & G. Miklos. 1988. The eukaryote genome in development and evolution. Allen & Unwin, London, United Kingdom.

Kauffman, S. 1995. At home in the universe. Oxford University Press, Oxford, United Kingdom.

Kenyon, D., & G. Mills. 1996. The RNA world: a critique.--Origins & Design 17(1):9-16.

Kerr, R. A. 1993. Evolution's Big Bang gets even more explosive.--Science 261:1274-1275.

Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge, United Kingdom.

Koonin, E. 2000. How many genes can make a cell?: the minimal genome concept.--Annual Review of Genomics and Human Genetics 1:99-116.

Kuppers, B. O. 1987. On the prior probability of the existence of life. Pp. 355-369 in L. Kruger et al., eds., The probabilistic revolution. M.I.T. Press, Cambridge, Massachusetts.

Lange, B. M. H., A. J. Faragher, P. March, & K. Gull. 2000. Centriole duplication and maturation in animal cells. Pp. 235-249 in R. E. Palazzo and G. P. Schatten, eds., The centrosome in cell replication and early development. Current Topics in Developmental Biology, vol. 49. Academic Press, San Diego.

Lawrence, P. A., & G. Struhl. 1996. Morphogens, compartments and pattern: lessons from Drosophila?--Cell 85:951-961.

Lenoir, T. 1982. The strategy of life. University of Chicago Press, Chicago.

Levinton, J. 1988. Genetics, paleontology, and macroevolution. Cambridge University Press, Cambridge, United Kingdom.

______. 1992. The big bang of animal evolution.--Scientific American 267:84-91.

Lewin, R. 1988. A lopsided look at evolution.--Science 241:292.

Lewontin, R. 1978. Adaptation. Pp. 113-125 in Evolution: a Scientific American book. W. H. Freeman & Company, San Francisco.

Lipton, P. 1991. Inference to the best explanation. Routledge, New York.

Lonnig, W. E. 2001. Natural selection. Pp. 1008-1016 in W. E. Craighead and C. B. Nemeroff, eds., The Corsini encyclopedia of psychology and behavioral sciences, 3rd edition, vol. 3. John Wiley & Sons, New York.

______, & H. Saedler. 2002. Chromosome rearrangements and transposable elements.--Annual Review of Genetics 36:389-410.

Lovtrup, S. 1979. Semantics, logic and vulgate neo-darwinism.--Evolutionary Theory 4:157-172.

Marshall, W. F. & J. L. Rosenbaum. 2000. Are there nucleic acids in the centrosome? Pp. 187-205 in R. E. Palazzo and G. P. Schatten, eds., The centrosome in cell replication and early development. Current Topics in Developmental Biology, vol. 49. San Diego, Academic Press.

Maynard Smith, J. 1986. Structuralism versus selection--is Darwinism enough? Pp. 39-46 in S. Rose and L. Appignanesi, eds., Science and Beyond. Basil Blackwell, London, United Kingdom.

Mayr, E. 1982. Foreword. Pp. xi-xii in M. Ruse, Darwinism defended. Pearson Addison Wesley, Boston, Massachusetts.

McDonald, J. F. 1983. The molecular basis of adaptation: a critical review of relevant ideas and observations.--Annual Review of Ecology and Systematics 14:77-102.

McNiven, M. A. & K. R. Porter. 1992. The centrosome: contributions to cell form. Pp. 313-329 in V. I. Kalnins, ed., The centrosome. Academic Press, San Diego.

Meyer, S. C. 1991. Of clues and causes: a methodological interpretation of origin of life studies. Unpublished doctoral dissertation, University of Cambridge, Cambridge, United Kingdom.

______. 1998. DNA by design: an inference to the best explanation for the origin of biological information.--Rhetoric & Public Affairs, 1(4):519-555.

______. The scientific status of intelligent design: The methodological equivalence of naturalistic and non-naturalistic origins theories. Pp. 151-211 in Science and evidence for design in the universe. Proceedings of the Wethersfield Institute. Ignatius Press, San Francisco.

______. 2003. DNA and the origin of life: information, specification and explanation. Pp. 223-285 in J. A. Campbell and S. C. Meyer, eds., Darwinism, design and public education. Michigan State University Press, Lansing, Michigan.

______. 2004. The Cambrian information explosion: evidence for intelligent design. Pp. 371-391 in W. A. Dembski and M. Ruse, eds., Debating design: from Darwin to DNA. Cambridge University Press, Cambridge, United Kingdom.

______, M. Ross, P. Nelson, & P. Chien. 2003. The Cambrian explosion: biology's big bang. Pp. 323-402 in J. A. Campbell & S. C. Meyer, eds., Darwinism, design and public education. Michigan State University Press, Lansing. See also Appendix C: Stratigraphic first appearance of phyla body plans, pp. 593-598.

Miklos, G. L. G. 1993. Emergence of organizational complexities during metazoan evolution: perspectives from molecular biology, palaeontology and neo-Darwinism.--Mem. Ass. Australas. Palaeontols, 15:7-41.

Monastersky, R. 1993. Siberian rocks clock biological big bang.--Science News 144:148.

Moss, L. 2004. What genes can't do. The M.I.T. Press, Cambridge, Massachusetts.

Muller, G. B. & S. A. Newman. 2003. Origination of organismal form: the forgotten cause in evolutionary theory. Pp. 3-12 in G. B. Muller and S. A. Newman, eds., Origination of organismal form: beyond the gene in developmental and evolutionary biology. The M.I.T. Press, Cambridge, Massachusetts.

Nanney, D. L. 1983. The ciliates and the cytoplasm.--Journal of Heredity, 74:163-170.

Nelson, P., & J. Wells. 2003. Homology in biology: problem for naturalistic science and prospect for intelligent design. Pp. 303-322 in J. A. Campbell and S. C. Meyer, eds., Darwinism, design and public education. Michigan State University Press, Lansing.

Nijhout, H. F. 1990. Metaphors and the role of genes in development.--BioEssays 12:441-446.

Nusslein-Volhard, C., & E. Wieschaus. 1980. Mutations affecting segment number and polarity in Drosophila.--Nature 287:795-801.

Ohno, S. 1996. The notion of the Cambrian pananimalia genome.--Proceedings of the National Academy of Sciences, U.S.A. 93:8475-8478.

Orgel, L. E., & F. H. Crick. 1980. Selfish DNA: the ultimate parasite.--Nature 284:604-607.

Perutz, M. F., & H. Lehmann. 1968. Molecular pathology of human hemoglobin.--Nature 219:902-909.

Polanyi, M. 1967. Life transcending physics and chemistry.--Chemical and Engineering News, 45(35):54-66.

______. 1968. Life's irreducible structure.--Science 160:1308-1312, especially p. 1309.

Pourquie, O. 2003. Vertebrate somitogenesis: a novel paradigm for animal segmentation?--International Journal of Developmental Biology 47(7-8):597-603.

Quastler, H. 1964. The emergence of biological organization. Yale University Press, New Haven, Connecticut.

Raff, R. 1999. Larval homologies and radical evolutionary changes in early development, Pp. 110-121 in Homology. Novartis Symposium, vol. 222. John Wiley & Sons, Chichester, United Kingdom.

Reidhaar-Olson, J., & R. Sauer. 1990. Functionally acceptable solutions in two alpha-helical regions of lambda repressor.--Proteins, Structure, Function, and Genetics, 7:306-316.

Rutten, M. G. 1971. The origin of life by natural causes. Elsevier, Amsterdam.

Sapp, J. 1987. Beyond the gene. Oxford University Press, New York.

Sarkar, S. 1996. Biological information: a skeptical look at some central dogmas of molecular biology. Pp. 187-233 in S. Sarkar, ed., The philosophy and history of molecular biology: new perspectives. Kluwer Academic Publishers, Dordrecht.

Schutzenberger, M. 1967. Algorithms and the neo-Darwinian theory of evolution. Pp. 73-75 in P. S. Morehead and M. M. Kaplan, eds., Mathematical challenges to the Darwinian interpretation of evolution. Wistar Institute Symposium Monograph. Alan R. Liss, New York.

Shannon, C. 1948. A mathematical theory of communication.--Bell System Technical Journal 27:379-423, 623-656.

Shu, D. G., H. L. Luo, S. Conway Morris, X. L. Zhang, S. X. Hu, L. Chen, J. Han, M. Zhu, Y. Li, & L. Z. Chen. 1999. Lower Cambrian vertebrates from south China.--Nature 402:42-46.

Shubin, N. H., & C. R. Marshall. 2000. Fossils, genes, and the origin of novelty. Pp. 324-340 in Deep time. The Paleontological Society.

Simpson, G. 1970. Uniformitarianism: an inquiry into principle, theory, and method in geohistory and biohistory. Pp. 43-96 in M. K. Hecht and W. C. Steere, eds., Essays in evolution and genetics in honor of Theodosius Dobzhansky. Appleton-Century-Crofts, New York.

Sober, E. 2000. The philosophy of biology, 2nd edition. Westview Press, San Francisco.

Sonneborn, T. M. 1970. Determination, development, and inheritance of the structure of the cell cortex. In Symposia of the International Society for Cell Biology 9:1-13.

Sole, R. V., P. Fernandez, & S. A. Kauffman. 2003. Adaptive walks in a gene network model of morphogenesis: insight into the Cambrian explosion.--International Journal of Developmental Biology 47(7-8):685-693.

Stadler, B. M. R., P. F. Stadler, G. P. Wagner, & W. Fontana. 2001. The topology of the possible: formal spaces underlying patterns of evolutionary change.--Journal of Theoretical Biology 213:241-274.

Steiner, M., & R. Reitner. 2001. Evidence of organic structures in Ediacara-type fossils and associated microbial mats.--Geology 29(12):1119-1122.

Taylor, S. V., K. U. Walter, P. Kast, & D. Hilvert. 2001. Searching sequence space for protein catalysts.--Proceedings of the National Academy of Sciences, U.S.A. 98:10596-10601.

Thaxton, C. B., W. L. Bradley, & R. L. Olsen. 1992. The mystery of life's origin: reassessing current theories. Lewis and Stanley, Dallas, Texas.

Thompson, D. W. 1942. On growth and form, 2nd edition. Cambridge University Press, Cambridge, United Kingdom.

Thomson, K. S. 1992. Macroevolution: The morphological problem.--American Zoologist 32:106-112.

Valentine, J. W. 1995. Late Precambrian bilaterians: grades and clades. Pp. 87-107 in W. M. Fitch and F. J. Ayala, eds., Tempo and mode in evolution: genetics and paleontology 50 years after Simpson. National Academy Press, Washington, D.C.

______. 2004. On the origin of phyla. University of Chicago Press, Chicago, Illinois.

______, & D. H. Erwin, 1987. Interpreting great developmental experiments: the fossil record. Pp. 71-107 in R. A. Raff and E. C. Raff, eds., Development as an evolutionary process. Alan R. Liss, New York.

______, & D. Jablonski. 2003. Morphological and developmental macroevolution: a paleontological perspective.--International Journal of Developmental Biology 47:517-522.

Wagner, G. P. 2001. What is the promise of developmental evolution? Part II: A causal explanation of evolutionary innovations may be impossible.--Journal of Experimental Zoology (Mol. Dev. Evol.) 291:305-309.

______, & P. F. Stadler. 2003. Quasi-independence, homology and the unity of type: a topological theory of characters.--Journal of Theoretical Biology 220:505-527.

Webster, G., & B. Goodwin. 1984. A structuralist approach to morphology.--Rivista di Biologia 77:503-10.

______, & ______. 1996. Form and transformation: generative and relational principles in biology. Cambridge University Press, Cambridge, United Kingdom.

Weiner, J. 1994. The beak of the finch. Vintage Books, New York.

Willmer, P. 1990. Invertebrate relationships: patterns in animal evolution. Cambridge University Press, Cambridge, United Kingdom.

______. 2003. Convergence and homoplasy in the evolution of organismal form. Pp. 33-50 in G. B. Muller and S. A. Newman, eds., Origination of organismal form: beyond the gene in developmental and evolutionary biology. The M.I.T. Press, Cambridge, Massachusetts.

Woese, C. 1998. The universal ancestor.--Proceedings of the National Academy of Sciences, U.S.A. 95:6854-6859.

Wray, G. A., J. S. Levinton, & L. H. Shapiro. 1996. Molecular evidence for deep Precambrian divergences among metazoan phyla.--Science 274:568-573.

Yockey, H. P. 1978. A calculation of the probability of spontaneous biogenesis by information theory.--Journal of Theoretical Biology 67:377-398.

______. 1992. Information theory and molecular biology. Cambridge University Press, Cambridge, United Kingdom.

End Notes

117(2):213-239. 2004

The origin of biological information and the higher taxonomic categories
Stephen C. Meyer

Intelligent Design theory (ID) can contribute to science on at least two levels. On one level, ID is concerned with inferring from the evidence whether a given feature of the world is designed. This is the level on which William Dembski's explanatory filter and Michael Behe's concept of irreducible complexity operate. It is also the level that has received the most attention in recent years, largely because the existence of even one intelligently designed feature in living things (at least prior to human beings) would overturn the Darwinian theory of evolution that currently dominates Western biology. On another level, ID could function as a "metatheory," providing a conceptual framework for scientific research. By suggesting testable hypotheses about features of the world that have been systematically neglected by older metatheories (such as Darwin's), and by leading to the discovery of new features, ID could indirectly demonstrate its scientific fruitfulness.

In November 2002, Bill Dembski, Paul Nelson and I visited the Detroit headquarters of Ideation, Inc. Ideation is a thriving business based on TRIZ, an acronym for the Russian words meaning "Theory of Inventive Problem Solving." Based on a survey of successful patents, TRIZ provides guidelines for finding solutions to specific engineering or manufacturing problems. When Ideation's president took us out to lunch, he told us that before ID could be taken seriously it would have to solve some real problems.


I was inspired by this to sketch out something I called a Theory of Organismal Problem-Solving (TOPS). Strictly speaking, I suppose the biological equivalent of TRIZ would survey successful experiments for guidelines to solve research problems posed by existing hypotheses. I chose to try a different approach, however: As I formulated it, TOPS suggests how ID could lead to new hypotheses and scientific discoveries.

TOPS begins with the observation that the evidence is sufficient to warrant at least provisional acceptance of two propositions: (1) Darwinian evolution (the theory that new features of living things originate through natural selection acting on random variations) is false, and (2) ID (the theory that many features of living things could only have originated through intelligent agency) is true.

TOPS then explicitly rejects several implications of Darwinian evolution. These include:

(1a) The implication that living things are best understood from the bottom up, in terms of their molecular constituents.

(1b) The implications that DNA mutations are the raw materials of macroevolution, that embryo development is controlled by a genetic program, that cancer is a genetic disease, etc.

(1c) The implication that many features of living things are useless vestiges of random processes, so it is a waste of time to inquire into their functions.

Finally, TOPS assumes as a working hypothesis that various implications of ID are true. These include:

(2a) The implication that living things are best understood from the top down, as irreducibly complex organic wholes.

(2b) The implications that DNA mutations do not lead to macroevolution, that the developmental program of an embryo is not reducible to its DNA, that cancer originates in higher structural features of the cell rather than in its DNA, etc.

(2c) The implication that all features of living things should be presumed to have a function until proven otherwise, and that reverse engineering is the best way to understand them.

It is important to note that "implication" is not the same as "logical deduction." Darwinian evolution does not logically exclude the ID implications listed here, nor does ID logically exclude every implication of Darwinian evolution. A Darwinian may entertain the idea that other features of an embryo besides DNA influence its development, and Darwinians can (and do) use reverse engineering to understand the functions of features in living things. Furthermore, an ID viewpoint does not logically rule out genetic programs or the idea that some features of living things may be useless vestiges of evolution. The differences between Darwinian evolution and ID that form the starting-point for TOPS are not mutually exclusive logical entailments, but differences in emphasis. The goal of TOPS is not to show that Darwinian evolution leads logically to false conclusions, but to explore what happens when ID rather than evolutionary theory is used as a framework to ask research questions.

Take, for example, research on the vast regions of vertebrate genomes that do not code for proteins. From a neo-Darwinian perspective, DNA mutations can provide the raw materials for evolution because DNA encodes proteins that determine the essential features of organisms. Since non-coding regions do not produce proteins, Darwinian biologists have been dismissing them for decades as random evolutionary noise or "junk DNA." From an ID perspective, however, it is extremely unlikely that an organism would expend its resources on preserving and transmitting so much "junk." It is much more likely that non-coding regions have functions that we simply haven't discovered yet. Recent research shows that "junk DNA" does, indeed, have previously unsuspected functions. Although that research was done in a Darwinian framework, its results came as a complete surprise to people trying to ask Darwinian research questions. The fact that "junk DNA" is not junk has emerged not because of evolutionary theory but in spite of it. On the other hand, people asking research questions in an ID framework would presumably have been looking for the functions of non-coding regions of DNA all along, and we might now know considerably more about them.

TOPS and Cancer

In November 2002, I decided to apply TOPS to a specific biomedical problem. Not being one to proceed timidly, I chose to tackle cancer. I quickly learned from reviewing the recent scientific literature that cancer is not correlated with any consistent pattern of DNA mutations, but it is correlated with abnormalities at the chromosomal level -- a phenomenon called "chromosomal instability" (Lengauer et al., 1998). Chromosomal instability, in turn, is correlated with centrosome abnormalities -- particularly the presence of extra or enlarged centrosomes. A growing number of researchers regard cancer not as a DNA disease, but as a "centrosomal disease" (Brinkley and Goepfert, 1998; Pihan et al., 1998; Lingle and Salisbury, 2000).

In 1985, I had published a hypothesis about how centrosomes might produce a force in dividing cells that pushes chromosomes away from the spindle poles (Wells, 1985). Cell biologists have long been aware of this "polar ejection force" or "polar wind" (Rieder et al., 1986; Rieder and Salmon, 1994), but its mechanism remains unknown. The force has been attributed to microtubule elongation and/or microtubule-associated motor proteins, but neither of these explanations fits all the facts (Wells, 2004). In the hypothesis I proposed in 1985, magnetic interactions in the centrosome would cause spindle microtubules to "wobble" like a laboratory vortexer, though at a much higher frequency and much smaller amplitude, producing a centrifugal-like force directed away from spindle poles. I subsequently realized (with help from physicist David Snoke) that the magnetic interactions I had proposed in 1985 would not work. In 2002 it occurred to me, however, that the still-viable vortexer concept might help to explain the link between centrosomes and cancer: Centrosomes that are too numerous or too large would produce too strong a polar ejection force, damaging chromosomes and leading to chromosomal instability.

If the polar ejection force were really the link between centrosomes and cancer, however, and the polar ejection force were due to a vortexer-like motion of spindle microtubules, what could be the mechanism producing this motion? My attention quickly turned to centrioles.

Centrosomes in animal cells contain centrioles, tiny organelles less than a millionth of a meter long. Except for their role in nucleating eukaryotic cilia and flagella, their precise functions remain mysterious (Preble et al., 2000). They have never been a favorite object of study within the framework of Darwinian theory, because even though they replicate every time a cell divides they contain no DNA (Marshall and Rosenbaum, 2000), and they have no evolutionary intermediates from which to reconstruct phylogenies (Fulton, 1971). The cells of higher plants do not contain centrioles (Luykx, 1970; Pickett-Heaps, 1971); nor do they produce a polar ejection force like the one observed in animal cells (Khodjakov et al., 1996). It occurred to me that the correlation might not be accidental. Centrioles might be the source of the polar ejection force, and they might hold the clue to understanding cancer.

In the electron microscope, centrioles look like tiny turbines. Using TOPS as my guide, I concluded that if centrioles look like turbines they might actually be turbines. I then used reverse engineering to formulate a testable, quantitative hypothesis linking centrioles, polar ejection forces, and cancer. That hypothesis is summarized below, and the detailed technical version (Wells, 2004) has been submitted for publication in a biology journal.

Centrioles as tiny turbines

Centrioles are roughly cylindrical in shape, and when mature they typically have a diameter of about 0.2 μm and a length of about 0.4 μm. The end of a centriole closest to the center of the cell is called "proximal," and the other end is called "distal." The organelle is composed of nine clusters of microtubules. These are organized as triplets in the proximal half, but the outermost microtubule in each triplet terminates about halfway toward the distal end, which consists of doublet microtubules (Stubblefield and Brinkley, 1967; De Harven, 1968; Wheatley, 1982; Bornens, et al., 1987). The triplet microtubules making up the proximal half of a centriole form blades that are tilted about 45 degrees relative to the circumference. Various authors have noted that the triplet microtubules have a turbine-like disposition. If the centriole were actually a tiny turbine, fluid exiting through the blades would cause the organelle to rotate clockwise when viewed from the proximal end. In order for the centriolar turbine to turn, there must be a mechanism to pump fluid through the blades. Helical structures have been observed in the lumens of centrioles (Stubblefield and Brinkley, 1967; Paintrand et al., 1992). Helical structures have also been observed associated with the central pair apparatus that rotates inside a ciliary or flagellar axoneme (Goodenough and Heuser, 1985; Mitchell, 2003), and axonemes are nucleated by basal bodies that are interconvertible with centrioles (Preble et al., 2000). If the helix inside a centriole rotates like the central apparatus of an axoneme, it could function as an "Archimedes' screw," a corkscrew-action pump that would draw fluid in through the proximal end and force it out through the triplet-microtubule turbine blades. The helical pump could be powered by dynein. Dynein produces microtubule-mediated movements in the axonemes of cilia and flagella, though its mode of action in centrioles would have to be different from the former. 
Cilia and flagella move because of dynein-based sliding between doublet microtubules (Brokaw, 1994; Porter and Sale, 2000). In centrioles, however, the only dynein-like structures appear to be associated with internal columns in the lumen (Paintrand et al., 1992). Dynein molecules in those columns could drive an internal Archimedes' screw pump by interacting with its helical blades. By analogy with the central pair apparatus in axonemes, the helix inside a centriole would presumably rotate at about 100 Hz.

Dynamics of a centriole pair

Most centrosomes contain a pair of centrioles connected near their proximal ends and oriented at right angles to each other (Bornens et al., 1987; Paintrand et al., 1992; Bornens, 2002). The older member ("mother") of a centriole pair is distinguished from the younger ("daughter") by various structures, including "distal appendages" that project at an angle from the distal-most edges of the doublet microtubules, and "subdistal appendages" that form a thick collar around most of the distal half of the mother centriole and serve as an anchor for microtubules that extend into the spindle (Paintrand et al., 1992; Piel et al., 2000). When centrioles are isolated under low calcium conditions, the subdistal appendages dissociate from the wall of the mother centriole while the distal appendages remain connected to it (Paintrand et al., 1992). These characteristics are consistent with a model in which the subdistal appendages form a bearing connected to the cell's cytoskeleton, and the distal appendages form a flange holding the mother centriole in its bearing (Figure 1).

Figure 1. Cross-section of a centriole pair. (M) Mother centriole. (D) Daughter centriole. Note the internal helices in each. (a) Subdistal appendages. (b) Spindle microtubules (which are anchored to the subdistal appendages). (c) Distal appendages. In the hypothesis presented here, the subdistal appendages function as a bearing and the distal appendages function as a flange. The large ellipse is the centromatrix capsule enclosing the centriole pair.

The daughter centriole, constrained by its proximal connection to the mother, would not rotate on its own axis; instead, it would swing bodily around the long axis of the mother centriole. Nevertheless, the daughter would still function as a turbine, producing a torque that would press the mother centriole laterally against the inner wall of its bearing.
The daughter's torque would thereby cause the centriole pair to revolve eccentrically, producing a wobble resembling the motion of a laboratory vortexer. The centriole pair is surrounded by a structural network of 12- to 15-nm diameter filaments called the "centromatrix" (Schnackenberg et al., 1998). The fluid inside the centromatrix capsule would not remain stationary, but would be stirred in a circle by the revolving daughter centriole. It might seem that friction against the inner wall of the centromatrix would offer enormous resistance to such movement; surprisingly, however, the resistance could be quite low because of "nanobubbles" (Tyrrell and Attard, 2001; Steitz et al., 2003; Ball, 2003). Nanobubbles 200 nm in diameter and 20 nm thick could render a surface composed of hydrophobic 12- to 15-nm filaments almost frictionless. With power being continually supplied by the helical pump inside the mother centriole, calculations show that the centriole pair could reach an angular velocity of more than 10 kHz midway through cell division (see Mathematical Appendix, below).

Centrioles and the polar ejection force

The subdistal appendages that form the bearing for the revolving centriole pair also anchor microtubules that extend into the spindle (Paintrand et al., 1992; Piel et al., 2000). Other microtubules are anchored in the pericentriolar material surrounding the centromatrix. Just as a vortexer imparts its wobble to a test tube placed in it, so the centrosome would impart its wobble to the microtubules emanating from it. Spindle microtubules would presumably not transmit this motion as uniformly as the rigid glass walls of a test tube, but they may be rigid enough to induce objects within the spindle to undergo movements not unlike the contents of a test tube in a vortexer. It is worth noting in this regard that microtubules in ordered arrays exhibit more stiffness than would be expected from non-interacting rigid rods (Sato et al., 1988). Objects within the spindle would then undergo high frequency, small amplitude circular movements perpendicular to polar microtubules, as originally proposed by Wells (1985). Objects in the middle of a bipolar spindle would thus experience a centrifugal force laterally outward from the long axis of the spindle. Calculations (see Mathematical Appendix, below) show that this force could be more than five times as strong as the force of gravity. The conical arrangement of the microtubules would convert part of this to a component parallel to the spindle axis, producing a smaller force tending to move objects radially away from the pole. The wobble produced by a revolving centriole pair could thereby generate a polar ejection force.

Implications for cancer

If centrioles generate a polar ejection force, the presence of too many centriole pairs at either pole could result in an excessive polar ejection force that subjects chromosomes to unusual stresses that cause breaks and translocations. Even more serious than the presence of extra centrioles would be a failure of the control mechanisms that normally shut down centriolar turbines at the beginning of anaphase, since centriole pairs would then continue to accelerate and generate polar ejection forces far greater than normal. A centriole-generated polar ejection force could be regulated in part by intracellular calcium levels. In dividing animal cells, the onset of anaphase is normally accompanied by a transient rise in intracellular Ca2+ concentration (Poenie et al., 1986). Elevated Ca2+ concentrations can lead to asymmetrical bending or quiescence in sea urchin sperm flagella axonemes (Brokaw, 1987). This may be due to a Ca2+-induced change in the direction of the power stroke of dynein arms (Ishijima et al., 1996), or to an effect on the central pair apparatus (Bannai, et al., 2000). If the helical pump inside a centriole is driven by dynein, then a rise in intracellular calcium concentration could shut it down. It is worth noting in this regard that a number of recent studies have reported a link between calcium and vitamin D deficiency and various types of cancer. Dietary calcium supplements can modestly reduce the risk of colorectal cancer (McCullough et al., 2003), and there appears to be an inverse correlation between vitamin D levels and prostate cancer (Konety et al., 1999). Analogs and metabolites of vitamin D inhibit the growth of prostate cancer cells in vitro (Krishnan et al., 2003) and in vivo (Vegesna et al., 2003), and they have similar inhibitory effects on breast cancer cells (Flanagan et al., 2003). 
If centrioles generate a polar ejection force, the correlation between calcium and vitamin D levels and cancer could be a consequence -- at least in part -- of the role of calcium in turning off centriolar turbines at the onset of anaphase.


Stubblefield and Brinkley (1967) proposed that sequential movements of the centriole's triplet microtubules turn an internal helix, which they believed to be DNA, in order to facilitate microtubule assembly. It has since become clear, however, that centrioles do not contain DNA (Marshall and Rosenbaum, 2000). In the hypothesis proposed here, a centriole is a tiny turbine composed of triplet microtubule blades and powered by an internal helical pump. This is the reverse of Stubblefield and Brinkley's idea that the triplet microtubules turn the internal helix. Bornens (1979) suggested that rapidly rotating centrioles, powered by an ATPase in cartwheel structures at their proximal ends, function like gyroscopes to provide an inertial reference system for the cell and generate electrical oscillations that coordinate cellular processes. In the hypothesis proposed here, rapidly rotating centrioles would produce small-amplitude, high-frequency oscillations in spindle microtubules that are mechanical, not electrical as Bornens proposed. There are several ways to test this hypothesis. Two of them are:
  1. It should be possible to detect oscillations in spindle microtubules early in prometaphase by immunofluorescence microscopy and high-speed camera technology.
  2. It should be possible to regulate the polar ejection force by raising the concentration of intracellular calcium during prometaphase or blocking its rise at the beginning of anaphase.
If the hypothesis presented here withstands these and other experimental tests, then it may contribute to a better understanding not only of cell division, but also of cancer.


The author gratefully acknowledges helpful suggestions from David W. Snoke, Keith Pennock, and Lucy P. Wells. The author also thanks Joel Shoop for producing the illustration and Peter L. Maricich for assisting with the mathematical analysis.


  • Ball, P., 2003. How to keep dry in water. Nature 423, 25-26.
  • Bannai, H., Yoshimura, M., Takahashi, K., Shingyoji, C., 2000. Calcium regulation of microtubule sliding in reactivated sea urchin sperm flagella. J. Cell Sci. 113, 831-839.
  • Bornens, M., 1979. The centriole as a gyroscopic oscillator: implications for cell organization and some other consequences. Biol. Cell. 35, 115-132.
  • Bornens, M., 2002. Centrosome composition and microtubule anchoring mechanisms. Curr. Opin. Cell Biol. 14, 25-34.
  • Bornens, M., Paintrand, M., Berges, J., Marty, M-C., Karsenti, E., 1987. Structural and chemical characterization of isolated centrosomes. Cell Motil. Cytoskeleton 8, 238-249.
  • Brinkley, B.R., Goepfert, T.M., 1998. Supernumerary centrosomes and cancer: Boveri's hypothesis resurrected. Cell Motil. Cytoskeleton 41, 281-288.
  • Brokaw, C.J., 1987. Regulation of sperm flagellar motility by calcium and cAMP-dependent phosphorylation. J. Cell. Biochem. 35, 175-184.
  • Brokaw, C.J., 1994. Control of flagellar bending: a new agenda based on dynein diversity. Cell Motil. Cytoskeleton 28, 199-204.
  • De Harven, E., 1968. The centriole and the mitotic spindle, in: Dalton, A.J., Haguenau, F. (Eds.), Ultrastructure in Biological Systems, v. 3: The Nucleus, Academic Press, New York, pp. 197-227.
  • Flanagan, L., Packman, K., Juba, B., O'Neill, S., Tenniswood, M., Welsh, J., 2003. Efficacy of vitamin D compounds to modulate estrogen receptor negative breast cancer growth and invasion. J. Steroid Biochem. Mol. Biol. 84, 181-192.
  • Fulton, C., 1971. Centrioles, in: Reinert, A.J., Ursprung, H., (Eds.), Origin and Continuity of Cell Organelles, Springer-Verlag, New York, pp. 170-221.
  • Goodenough, U.W., Heuser, J.E., 1985. Substructure of inner dynein arms, radial spokes, and the central pair/projection complex of cilia and flagella. J. Cell Biol. 100, 2008-2018.
  • Ishijima, S., Kubo-Irie, M., Mohri, H., Hamaguchi, Y., 1996. Calcium-dependent bidirectional power stroke of the dynein arms in sea urchin sperm axonemes. J. Cell Sci. 109, 2833-2842.
  • Khodjakov, A., Cole, R.W., Bajer, A.S., Rieder, C.L., 1996. The force for poleward chromosome motion in Haemanthus cells acts along the length of the chromosome during metaphase but only at the kinetochore during anaphase. J. Cell Biol. 132, 1093-1104.
  • Konety, B.R., Johnson, C.S., Trump, D.L., Getzenberg, R.H., 1999. Vitamin D in the prevention and treatment of prostate cancer. Semin. Urol. Oncol. 17, 77-84.
  • Krishnan, A.V., Peehl, D.M., Feldman, D., 2003. Inhibition of prostate cancer growth by vitamin D: regulation of target gene expression. J. Cell. Biochem. 88, 363-371.
  • Lengauer, C., Kinzler, K.W., Vogelstein, B., 1998. Genetic instabilities in human cancers. Nature 396, 643-649.
  • Lingle, W.L., Salisbury, J.L., 2000. The role of the centrosome in the development of malignant tumors. Curr. Top. Dev. Biol. 49, 313-329.
  • Luykx, P., 1970. Cellular mechanisms of chromosome distribution. Int. Rev. Cytol. Suppl. 2, 1-173.
  • Marshall, W.F., Rosenbaum, J.L., 2000. Are there nucleic acids in the centrosome? Curr. Top. Dev. Biol. 49, 187-205.
  • McCullough, M.L., Robertson, A.S., Rodriguez, C., Jacobs, E.J., Chao, A., Jonas, C., Calle, E.E., Willett, W.C., Thun, M.J., 2003. Calcium, vitamin D, dairy products, and risk of colorectal cancer in the Cancer Prevention Study II Nutrition Cohort (United States). Cancer Causes Control 14, 1-12.
  • Mitchell, D.R., 2003. Reconstruction of the projection periodicity and surface architecture of the flagellar central pair complex. Cell Motil. Cytoskeleton 55, 188-199.
  • Paintrand, M., Moudjou, M., Delacroix, H., Bornens, M., 1992. Centrosome organization and centriole architecture: their sensitivity to divalent cations. J. Struct. Biol. 108, 107-128.
  • Pickett-Heaps, J., 1971. The autonomy of the centriole: fact or fallacy? Cytobios 3, 205-214.
  • Piel, M., Meyer, P., Khodjakov, A., Rieder, C.L., Bornens, M., 2000. The respective contributions of the mother and daughter centrioles to centrosome activity and behavior in vertebrate cells. J. Cell Biol. 149, 317-329.
  • Pihan, G.A., Purohit, A., Wallace, J., Knecht, H., Woda, B., Quesenberry, P., Doxsey, S.J., 1998. Centrosome defects and genetic instability in malignant tumors. Cancer Res. 58, 3974-3985.
  • Poenie, M., Alderton, J., Steinhardt, R., Tsien, R., 1986. Calcium rises abruptly and briefly throughout the cell at the onset of anaphase. Science 233, 886-889.
  • Porter, M.E., Sale, W.S., 2000. The 9 + 2 axoneme anchors multiple inner arm dyneins and a network of kinases and phosphatases that control motility. J. Cell Biol. 151, F37-F42.
  • Preble, A.M., Giddings, T.M. Jr., Dutcher, S.K., 2000. Basal bodies and centrioles: their function and structure. Curr. Top. Dev. Biol. 49, 207-233.
  • Rieder, C.L., Davison, A.E., Jensen, L.C.W., Cassimeris, L., Salmon, E.D., 1986. Oscillatory movements of monooriented chromosomes and their position relative to the spindle pole result from the ejection properties of the aster and half-spindle. J. Cell Biol. 103, 581-591.
  • Rieder, C.L., Salmon, E.D., 1994. Motile kinetochores and polar ejection forces dictate chromosome position on the vertebrate mitotic spindle. J. Cell Biol. 124, 223-233.
  • Sato, M., Schwartz, W.H., Selden, S.C., Pollard, T.D., 1988. Mechanical properties of brain tubulin and microtubules. J. Cell Biol. 106, 1205-1211.
  • Schnackenberg, B.J., Khodjakov, A., Rieder, C.L., Palazzo, R.E., 1998. The disassembly and reassembly of functional centrosomes in vitro. Proc. Natl. Acad. Sci. U.S.A. 95, 9295-9300.
  • Steitz, R., Gutberlet, T., Hauss, T., Klösgen, B., Krastev, R., Schemmel, S., Simonsen, A.C., Findenegg, G.H., 2003. Nanobubbles and their precursor layer at the interface of water against a hydrophobic substrate. Langmuir 19, 2409-2418.
  • Stubblefield, E., Brinkley, B.R., 1967. Architecture and function of the mammalian centriole, in: Warren, K.B. (Ed.), Formation and Fate of Cell Organelles, Academic Press, New York, pp. 175-218.
  • Tyrrell, J.W.G., Attard, P., 2001. Images of nanobubbles on hydrophobic surfaces and their interactions. Phys. Rev. Lett. 87, 176104/1-176104/4.
  • Vegesna, V., O'Kelly, J., Said, J., Uskokovic, M., Binderup, L., Koeffler, H.P., 2003. Ability of potent vitamin D3 analogs to inhibit growth of prostate cancer cells in vivo. Anticancer Res. 23, 283-290.
  • Wheatley, D.N., 1982. The Centriole: A Central Enigma of Cell Biology, Elsevier, Amsterdam.
  • Wells, J., 1985. Inertial force as a possible factor in mitosis. BioSystems 17, 301-315.
  • Wells, J., 2004. A hypothesis linking centrioles, polar ejection forces, and cancer. Submitted for publication.

Mathematical Appendix

This is only a summary; for details see Wells (2004). A rotating helical pump would cause a fluid flow U into the proximal end of the centriole of

$$U = 4\pi\varphi R_o \tan\theta \, (R_o^2 - R_i^2) \tag{A1}$$

in which φ and θ are the angular velocity and pitch of the helix, respectively; $R_o$ is the radius of the centriolar lumen (and thus the outer radius of the helix blades); and $R_i$ is the radius of the central column around which the blades wind. Neglecting the thickness of the blades, and using values derived from electron micrographs of centrioles and measurements of central pair rotations in cilia, the fluid flow can be calculated to be of the order of U ≈ $10^{-19}$ m$^3$ sec$^{-1}$. The torque τ produced by the centriolar turbine would be the tangential component of the product of the velocity and the mass of fluid moving through the slits per second, multiplied by the distance of the turbine blades from the axis of rotation (approximately the outer radius of the centriole). The velocity and mass flow can be calculated from U, the approximate area of the slits between the turbine blades, and the density of the fluid being pumped through them. Since the outer radius of a centriole is approximately 0.1 μm, the resulting torque would be of the order of τ ≈ $10^{-28}$ kg m$^2$ sec$^{-2}$. In the rotational equivalent of Newton's force law, the angular acceleration α would be

$$\alpha = \tau / I \tag{A2}$$

in which I is the effective moment of inertia of the revolving centriole pair. This would be of the order of $10^{-29}$ kg m$^2$ (for derivation see Wells, 2004), so the angular acceleration produced by the torque of the mother centriole would be of the order of α ≈ 10 sec$^{-2}$. Assuming negligible friction (because of nanobubbles), this torque would cause the angular velocity of the centriole pair to increase by about 10 Hz every second. One minute after start-up, the centriole pair would be revolving at about 600 Hz; after twenty minutes (i.e., about halfway through cell division), the pair would be revolving at about 12,000 Hz. Orthogonally oriented centrioles would impart a wobble to the spindle microtubules and produce a centrifugal acceleration β given by

$$\beta = (\alpha t)^2 \, d \tan\varepsilon \tag{A3}$$

in which t is the number of seconds that have elapsed since the turbines started, d is an object's distance from the centrosome, and ε is the eccentricity of the wobble. If the eccentricity of the wobble is 1°, then twenty minutes after start-up an object 20 μm from the spindle pole would be subjected to a centrifugal acceleration β of approximately 50 m sec$^{-2}$, about five times greater than the acceleration due to gravity.
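The arithmetic in equations (A2) and (A3) can be checked with a short script. This is only a sketch: it takes the order-of-magnitude torque and moment of inertia quoted above as given rather than deriving them, it follows the appendix in using the product αt both as the revolution rate and as the angular velocity in (A3), and the function and variable names are illustrative, not taken from Wells (2004).

```python
import math

# Order-of-magnitude values quoted in the appendix (taken as given, not derived here).
TORQUE = 1e-28             # tau, in kg m^2 sec^-2
MOMENT_OF_INERTIA = 1e-29  # I, in kg m^2


def angular_acceleration(torque, inertia):
    """Equation (A2): alpha = tau / I."""
    return torque / inertia


def revolution_rate(alpha, seconds):
    """Rate reached after `seconds` of constant acceleration (appendix convention: alpha * t)."""
    return alpha * seconds


def centrifugal_acceleration(alpha, seconds, distance_m, eccentricity_deg):
    """Equation (A3): beta = (alpha * t)^2 * d * tan(epsilon)."""
    return (alpha * seconds) ** 2 * distance_m * math.tan(math.radians(eccentricity_deg))


alpha = angular_acceleration(TORQUE, MOMENT_OF_INERTIA)
rate_1min = revolution_rate(alpha, 60)
rate_20min = revolution_rate(alpha, 20 * 60)
# Object 20 micrometres from the pole, wobble eccentricity of 1 degree, 20 minutes after start-up.
beta = centrifugal_acceleration(alpha, 20 * 60, 20e-6, 1.0)
ratio_to_gravity = beta / 9.81

print(f"alpha = {alpha:.1f} sec^-2")
print(f"rate after 1 min = {rate_1min:.0f}, after 20 min = {rate_20min:.0f}")
print(f"beta = {beta:.1f} m sec^-2, about {ratio_to_gravity:.1f} times gravity")
```

With these inputs the script reproduces the figures stated in the text: roughly 600 and 12,000 for the two revolution rates, and a centrifugal acceleration of about 50 m sec^-2.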

Both Charles Darwin himself and contemporary neo-Darwinists such as Francisco Ayala, Richard Dawkins, and Richard Lewontin acknowledge that biological organisms appear to have been designed by an intelligence. Yet classical Darwinists and contemporary Darwinists alike have argued that what Francisco Ayala calls the “obvious design” of living things is only apparent. As Ayala, a former president of the American Association for the Advancement of Science, has explained: “The functional design of organisms and their features would therefore seem to argue for the existence of a designer. It was Darwin’s greatest accomplishment to show that the directive organization of living beings can be explained as the result of a natural process, natural selection, without any need to resort to a Creator or other external agent.”

According to Darwin and his contemporary followers, the mechanism of natural selection acting on random variation is sufficient to explain the origin of those features of life that once seemed to require explanation by reference to an intelligent or purposeful designer. Thus, according to Darwinists, the design hypothesis now represents an unnecessary and un-parsimonious explanation for the complexity and apparent design of living organisms. On these as well as methodological grounds contemporary biologists have generally excluded the design hypothesis from consideration as an explanation for the origin of biological form.

Yet does Darwinism, in either its classical or contemporary versions, fully succeed in explaining the origin of biological form? Can it explain all evidence of apparent design? Most biologists now acknowledge that the Darwinian mechanism of natural selection acting on random variations can explain small-scale microevolutionary changes, such as cyclical variations in the size of the beaks of Galapagos finches or reversible changes in the expression of genes controlling color in English peppered moths.2 But what about the large-scale innovations in the history of life? What about the origin of completely new organs, body plans, and structures — the macroevolutionary innovation to which the fossil record attests? Does Darwinism, or neo-Darwinism, or any other strictly materialistic model of evolutionary change explain the origin of the basic body plans or structural “designs” of animal life, without invoking actual (that is, purposive or intelligent) design?

In this essay, we will test the claims of neo-Darwinism and two other materialistic models of evolutionary theory: punctuated equilibrium and self-organization. We will do so by assessing how well these theories explain the main features of the Cambrian explosion, a term that refers to the geologically sudden appearance of numerous new animal forms (and their distinctive body plans) 530 million years ago. We shall show that the Cambrian fossil record contradicts the empirical expectations of both neo-Darwinism and punctuated equilibrium in several significant respects. We further show that neither neo-Darwinism’s selection/mutation mechanism nor more recent self-organizational models can account for the origin of the biological information necessary to produce the Cambrian animals and their distinctive body plans. Instead, we will argue that intelligent design explains both the pattern of the fossil record and the origin of new biological form and information better than the competing models of purposeless and undirected evolutionary change.


Abstract: For the scientific community intelligent design represents creationism's latest grasp at scientific legitimacy. Accordingly, intelligent design is viewed as yet another ill-conceived attempt by creationists to straightjacket science within a religious ideology. But in fact intelligent design can be formulated as a scientific theory having empirical consequences and devoid of religious commitments. Intelligent design can be unpacked as a theory of information. Within such a theory, information becomes a reliable indicator of design as well as a proper object for scientific investigation. In my paper I shall (1) show how information can be reliably detected and measured, and (2) formulate a conservation law that governs the origin and flow of information. My broad conclusion is that information is not reducible to natural causes, and that the origin of information is best sought in intelligent causes. Intelligent design thereby becomes a theory for detecting and measuring information, explaining its origin, and tracing its flow.

1. Information

In Steps Towards Life Manfred Eigen (1992, p. 12) identifies what he regards as the central problem facing origins-of-life research: "Our task is to find an algorithm, a natural law that leads to the origin of information." Eigen is only half right. To determine how life began, it is indeed necessary to understand the origin of information. Even so, neither algorithms nor natural laws are capable of producing information. The great myth of modern evolutionary biology is that information can be gotten on the cheap without recourse to intelligence. It is this myth I seek to dispel, but to do so I shall need to give an account of information. No one disputes that there is such a thing as information. As Keith Devlin (1991, p. 1) remarks, "Our very lives depend upon it, upon its gathering, storage, manipulation, transmission, security, and so on. Huge amounts of money change hands in exchange for information. People talk about it all the time. Lives are lost in its pursuit. Vast commercial empires are created in order to manufacture equipment to handle it." But what exactly is information? The burden of this paper is to answer this question, presenting an account of information that is relevant to biology.

What then is information? The fundamental intuition underlying information is not, as is sometimes thought, the transmission of signals across a communication channel, but rather, the actualization of one possibility to the exclusion of others. As Fred Dretske (1981, p. 4) puts it, "Information theory identifies the amount of information associated with, or generated by, the occurrence of an event (or the realization of a state of affairs) with the reduction in uncertainty, the elimination of possibilities, represented by that event or state of affairs." To be sure, whenever signals are transmitted across a communication channel, one possibility is actualized to the exclusion of others, namely, the signal that was transmitted to the exclusion of those that weren't. But this is only a special case. Information in the first instance presupposes not some medium of communication, but contingency. Robert Stalnaker (1984, p. 85) makes this point clearly: "Content requires contingency. To learn something, to acquire information, is to rule out possibilities. To understand the information conveyed in a communication is to know what possibilities would be excluded by its truth." For there to be information, there must be a multiplicity of distinct possibilities any one of which might happen. When one of these possibilities does happen and the others are ruled out, information becomes actualized. Indeed, information in its most general sense can be defined as the actualization of one possibility to the exclusion of others (observe that this definition encompasses both syntactic and semantic information).

This way of defining information may seem counterintuitive since we often speak of the information inherent in possibilities that are never actualized. Thus we may speak of the information inherent in flipping one hundred heads in a row with a fair coin even if this event never happens. There is no difficulty here. In counterfactual situations the definition of information needs to be applied counterfactually. Thus to consider the information inherent in flipping one hundred heads in a row with a fair coin, we treat this event/possibility as though it were actualized. Information needs to be referenced not just to the actual world, but also cross-referenced with all possible worlds.

2. Complex Information

How does our definition of information apply to biology, and to science more generally? To render information a useful concept for science we need to do two things: first, show how to measure information; second, introduce a crucial distinction--the distinction between specified and unspecified information. First, let us show how to measure information. In measuring information it is not enough to count the number of possibilities that were excluded, and offer this number as the relevant measure of information. The problem is that a simple enumeration of excluded possibilities tells us nothing about how those possibilities were individuated in the first place. Consider, for instance, the following individuation of poker hands:

  • (i) A royal flush.
  • (ii) Everything else.

To learn that something other than a royal flush was dealt (i.e., possibility (ii)) is clearly to acquire less information than to learn that a royal flush was dealt (i.e., possibility (i)). Yet if our measure of information is simply an enumeration of excluded possibilities, the same numerical value must be assigned in both instances since in both instances a single possibility is excluded.

It follows, therefore, that how we measure information needs to be independent of whatever procedure we use to individuate the possibilities under consideration. And the way to do this is not simply to count possibilities, but to assign probabilities to these possibilities. For a thoroughly shuffled deck of cards, the probability of being dealt a royal flush (i.e., possibility (i)) is approximately .000002 whereas the probability of being dealt anything other than a royal flush (i.e., possibility (ii)) is approximately .999998. Probabilities by themselves, however, are not information measures. Although probabilities properly distinguish possibilities according to the information they contain, nonetheless probabilities remain an inconvenient way of measuring information. There are two reasons for this. First, the scaling and directionality of the numbers assigned by probabilities needs to be recalibrated. We are clearly acquiring more information when we learn someone was dealt a royal flush than when we learn someone wasn't dealt a royal flush. And yet the probability of being dealt a royal flush (i.e., .000002) is minuscule compared to the probability of being dealt something other than a royal flush (i.e., .999998). Smaller probabilities signify more information, not less.

The second reason probabilities are inconvenient for measuring information is that they are multiplicative rather than additive. If I learn that Alice was dealt a royal flush playing poker at Caesar's Palace and that Bob was dealt a royal flush playing poker at the Mirage, the probability that both Alice and Bob were dealt royal flushes is the product of the individual probabilities. Nonetheless, it is convenient for information to be measured additively so that the measure of information assigned to Alice and Bob jointly being dealt royal flushes equals the measure of information assigned to Alice being dealt a royal flush plus the measure of information assigned to Bob being dealt a royal flush.

Now there is an obvious way to transform probabilities which circumvents both these difficulties, and that is to apply a negative logarithm to the probabilities. Applying a negative logarithm assigns more information to lower probabilities and, because the logarithm of a product is the sum of the logarithms, transforms multiplicative probability measures into additive information measures. What's more, in deference to communication theorists, it is customary to use the logarithm to the base 2. The rationale for this choice of logarithmic base is as follows. The most convenient way for communication theorists to measure information is in bits. Any message sent across a communication channel can be viewed as a string of 0's and 1's. For instance, the ASCII code uses strings of eight 0's and 1's to represent the characters on a typewriter, with whole words and sentences in turn represented as strings of such character strings. In like manner all communication may be reduced to the transmission of sequences of 0's and 1's. Given this reduction, the obvious way for communication theorists to measure information is in number of bits transmitted across a communication channel. And since the negative logarithm to the base 2 of a probability corresponds to the average number of bits needed to identify an event of that probability, the logarithm to the base 2 is the canonical logarithm for communication theorists. Thus we define the measure of information in an event of probability p as -log₂p (see Shannon and Weaver, 1949, p. 32; Hamming, 1986; or indeed any mathematical introduction to information theory).

What about the additivity of this information measure? Recall the example of Alice being dealt a royal flush playing poker at Caesar's Palace and Bob being dealt a royal flush playing poker at the Mirage. Let's call the first event A and the second B. Since randomly dealt poker hands are probabilistically independent, the probability of A and B taken jointly equals the product of the probabilities of A and B taken individually. Symbolically, P(A&B) = P(A)×P(B). Given our logarithmic definition of information, we define the amount of information in an event E as I(E) =def -log₂P(E). It then follows that P(A&B) = P(A)×P(B) if and only if I(A&B) = I(A)+I(B). Since in the example of Alice and Bob P(A) = P(B) = .000002, I(A) = I(B) ≈ 19, and I(A&B) = I(A)+I(B) ≈ 19 + 19 = 38. Thus the amount of information inherent in Alice and Bob jointly obtaining royal flushes is approximately 38 bits.
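The additivity claim can be checked numerically. The sketch below uses the rounded probability .000002 from the text; the exact figure, 4 royal flushes out of C(52, 5) possible five-card hands, is included only for comparison and is not part of the original argument. The function name is illustrative.

```python
import math


def information_bits(p):
    """Information measure from the text: I(E) = -log2 P(E), in bits."""
    return -math.log2(p)


# Rounded probability of a royal flush, as used in the text.
p_royal_flush = 0.000002
# Exact probability for comparison: 4 royal flushes among C(52, 5) five-card hands.
p_exact = 4 / math.comb(52, 5)

i_a = information_bits(p_royal_flush)                      # Alice's hand alone
i_b = information_bits(p_royal_flush)                      # Bob's hand alone
i_joint = information_bits(p_royal_flush * p_royal_flush)  # independent events: P(A&B) = P(A) * P(B)

print(f"I(A) = {i_a:.2f} bits, I(B) = {i_b:.2f} bits")
print(f"I(A&B) = {i_joint:.2f} bits, I(A) + I(B) = {i_a + i_b:.2f} bits")
print(f"I(A) with the exact probability = {information_bits(p_exact):.2f} bits")
```

Because the two hands are independent, the joint information equals the sum of the individual amounts, and each rounds to the 19 bits quoted in the text.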

Since lots of events are probabilistically independent, information measures exhibit lots of additivity. But since lots of events are also correlated, information measures exhibit lots of non-additivity as well. In the case of Alice and Bob, Alice being dealt a royal flush is probabilistically independent of Bob being dealt a royal flush, and so the amount of information in Alice and Bob both being dealt royal flushes equals the sum of the individual amounts of information. But consider now a different example. Alice and Bob together toss a coin five times. Alice observes the first four tosses but is distracted, and so misses the fifth toss. On the other hand, Bob misses the first toss, but observes the last four tosses. Let's say the actual sequence of tosses is 11001 (1 = heads, 0 = tails). Thus Alice observes 1100* and Bob observes *1001. Let A denote the first observation, B the second. It follows that the amount of information in A&B is the amount of information in the completed sequence 11001, namely, 5 bits. On the other hand, the amount of information in A alone is the amount of information in the incomplete sequence 1100*, namely 4 bits. Similarly, the amount of information in B alone is the amount of information in the incomplete sequence *1001, also 4 bits. This time information doesn't add up: 5 = I(A&B) ≠ I(A)+I(B) = 4+4 = 8.

Here A and B are correlated. Alice knows all but the last bit of information in the completed sequence 11001. Thus when Bob gives her the incomplete sequence *1001, all Alice really learns is the last bit in this sequence. Similarly, Bob knows all but the first bit of information in the completed sequence 11001. Thus when Alice gives him the incomplete sequence 1100*, all Bob really learns is the first bit in this sequence. What appears to be four bits of information actually ends up being only one bit of information once Alice and Bob factor in the prior information they possess about the completed sequence 11001. If we introduce the idea of conditional information, this is just to say that 5 = I(A&B) = I(A)+I(B|A) = 4+1. I(B|A), the conditional information of B given A, is the amount of information in Bob's observation once Alice's observation is taken into account. And this, as we just saw, is 1 bit.
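
The coin-toss figures can be reproduced by enumerating all five-toss sequences. The following Python sketch assumes a fair coin, so that every sequence is equally likely; the helper names prob, info, alice_sees, and bob_sees are illustrative only.

```python
import math
from fractions import Fraction
from itertools import product

# All 32 equally likely outcomes of five tosses (assumes a fair coin).
outcomes = ["".join(bits) for bits in product("01", repeat=5)]

def prob(event) -> Fraction:
    """Probability of an event given as a predicate on outcome strings."""
    hits = sum(1 for s in outcomes if event(s))
    return Fraction(hits, len(outcomes))

def info(p: Fraction) -> float:
    """Information in bits of an event with probability p."""
    return -math.log2(float(p))

def alice_sees(s: str) -> bool:
    return s.startswith("1100")   # Alice's observation 1100*

def bob_sees(s: str) -> bool:
    return s.endswith("1001")     # Bob's observation *1001

p_a = prob(alice_sees)
p_b = prob(bob_sees)
p_ab = prob(lambda s: alice_sees(s) and bob_sees(s))
p_b_given_a = p_ab / p_a          # conditional probability P(B|A)

i_a, i_b, i_ab = info(p_a), info(p_b), info(p_ab)
i_b_given_a = info(p_b_given_a)

print(i_a, i_b, i_ab, i_b_given_a)
```

The enumeration gives 4 bits for each observation taken alone, 5 bits for the two taken jointly, and 1 bit for Bob's observation once Alice's is taken into account, so that I(A&B) = I(A)+I(B|A) while I(A)+I(B) overcounts.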

I(B|A), like I(A&B), I(A), and I(B), can be represented as the negative logarithm to the base two of a probability, only this time the probability under the logarithm is a conditional as opposed to an unconditional probability. By definition I(B|A) =def -log2P(B|A), where P(B|A) is the conditional probability of B given A. But since P(B|A) =def P(A&B)/P(A), and since the logarithm of a quotient is the difference of the logarithms, log2P(B|A) = log2P(A&B) - log2P(A), and so -log2P(B|A) = -log2P(A&B) + log2P(A), which is just I(B|A) = I(A&B) - I(A). This last equation is equivalent to

(*) I(A&B) = I(A)+I(B|A)

Formula (*) holds with full generality, reducing to I(A&B) = I(A)+I(B) when A and B are probabilistically independent (in which case P(B|A) = P(B) and thus I(B|A) = I(B)).

Formula (*) asserts that the information in both A and B jointly is the information in A plus the information in B that is not in A. Its point, therefore, is to spell out how much additional information B contributes to A. As such, this formula places tight constraints on the generation of new information. Does, for instance, a computer program, call it A, by outputting some data, call the data B, generate new information? Computer programs are fully deterministic, and so B is fully determined by A. It follows that P(B|A) = 1, and thus I(B|A) = 0 (the logarithm of 1 is always 0). From Formula (*) it therefore follows that I(A&B) = I(A), and therefore that the amount of information in A and B jointly is no more than the amount of information in A by itself.

For an example in the same spirit consider that there is no more information in two copies of Shakespeare's Hamlet than in a single copy. This is of course patently obvious, and any formal account of information had better agree. To see that our formal account does indeed agree, let A denote the printing of the first copy of Hamlet, and B the printing of the second copy. Once A is given, B is entirely determined. Indeed, the correlation between A and B is perfect. Probabilistically this is expressed by saying the conditional probability of B given A is 1, namely, P(B|A) = 1. In information-theoretic terms this is to say that I(B|A) = 0. As a result I(B|A) drops out of Formula (*), and so I(A&B) = I(A). Our information-theoretic formalism therefore agrees with our intuition that two copies of Hamlet contain no more information than a single copy.
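
The Hamlet case can be put in numbers by applying Formula (*) directly. In the Python sketch below, the probability assigned to A is a hypothetical placeholder chosen only for illustration; the one fact taken from the argument is that P(B|A) = 1.

```python
import math

def info(p: float) -> float:
    """Information in bits of an event with probability p."""
    return -math.log2(p)

def joint_information(p_a: float, p_b_given_a: float) -> float:
    """Formula (*): I(A&B) = I(A) + I(B|A)."""
    return info(p_a) + info(p_b_given_a)

# Hypothetical probability for A (the first copy); any value in (0, 1] works.
p_a = 2.0 ** -20

# B (the second copy) is fully determined by A, so P(B|A) = 1.
i_joint = joint_information(p_a, 1.0)

print(i_joint == info(p_a))
```

Whatever value is chosen for P(A), the conditional term contributes nothing, and the joint information equals the information in A alone.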

Information is a complexity-theoretic notion. Indeed, as a purely formal object, the information measure described here is a complexity measure (cf. Dembski, 1998, ch. 4). Complexity measures arise whenever we assign numbers to degrees of complication. A set of possibilities will often admit varying degrees of complication, ranging from extremely simple to extremely complicated. Complexity measures assign non-negative numbers to these possibilities so that 0 corresponds to the most simple and ∞ to the most complicated. For instance, computational complexity is always measured in terms of either time (i.e., number of computational steps) or space (i.e., size of memory, usually measured in bits or bytes) or some combination of the two. The more difficult a computational problem, the more time and space are required to run the algorithm that solves the problem. For information measures, degree of complication is measured in bits. Given an event A of probability P(A), I(A) = -log2P(A) measures the number of bits associated with the probability P(A). We therefore speak of the "complexity of information" and say that the complexity of information increases as I(A) increases (or, correspondingly, as P(A) decreases). We also speak of "simple" and "complex" information according to whether I(A) signifies few or many bits of information. This notion of complexity is important to biology since not just the origin of information stands in question, but the origin of complex information.

3. Complex Specified Information

Given a means of measuring information and determining its complexity, we turn now to the distinction between specified and unspecified information. This is a vast topic whose full elucidation is beyond the scope of this paper (the details can be found in my monograph The Design Inference). Nonetheless, in what follows I shall try to make this distinction intelligible, and offer some hints on how to make it rigorous. For an intuitive grasp of the difference between specified and unspecified information, consider the following example. Suppose an archer stands 50 meters from a large blank wall with bow and arrow in hand. The wall, let us say, is sufficiently large that the archer cannot help but hit it. Consider now two alternative scenarios. In the first scenario the archer simply shoots at the wall. In the second scenario the archer first paints a target on the wall, and then shoots at the wall, squarely hitting the target's bull's-eye. Let us suppose that in both scenarios where the arrow lands is identical. In both scenarios the arrow might have landed anywhere on the wall. What's more, any place where it might land is highly improbable. It follows that in both scenarios highly complex information is actualized. Yet the conclusions we draw from these scenarios are very different. In the first scenario we can conclude absolutely nothing about the archer's ability as an archer, whereas in the second scenario we have evidence of the archer's skill.

The obvious difference between the two scenarios is of course that in the first the information follows no pattern whereas in the second it does. Now the information that tends to interest us as rational inquirers generally, and scientists in particular, is not the actualization of arbitrary possibilities which correspond to no patterns, but rather the actualization of circumscribed possibilities which do correspond to patterns. There's more. Patterned information, though a step in the right direction, still doesn't quite get us specified information. The problem is that patterns can be concocted after the fact so that instead of helping elucidate information, the patterns are merely read off already actualized information.

To see this, consider a third scenario in which an archer shoots at a wall. As before, we suppose the archer stands 50 meters from a large blank wall with bow and arrow in hand, the wall being so large that the archer cannot help but hit it. And as in the first scenario, the archer shoots at the wall while it is still blank. But this time suppose that after having shot the arrow, and finding the arrow stuck in the wall, the archer paints a target around the arrow so that the arrow sticks squarely in the bull's-eye. Let us suppose further that the precise place where the arrow lands in this scenario is identical with where it landed in the first two scenarios. Since any place where the arrow might land is highly improbable, in this as in the other scenarios highly complex information has been actualized. What's more, since the information corresponds to a pattern, we can even say that in this third scenario highly complex patterned information has been actualized. Nevertheless, it would be wrong to say that highly complex specified information has been actualized. Of the three scenarios, only the information in the second scenario is specified. In that scenario, by first painting the target and then shooting the arrow, the pattern is given independently of the information. On the other hand, in this, the third scenario, by first shooting the arrow and then painting the target around it, the pattern is merely read off the information.

Specified information is always patterned information, but patterned information is not always specified information. For specified information not just any pattern will do. We therefore distinguish between the "good" patterns and the "bad" patterns. The "good" patterns will henceforth be called specifications. Specifications are the independently given patterns that are not simply read off information. By contrast, the "bad" patterns will be called fabrications. Fabrications are the post hoc patterns that are simply read off already existing information.

Unlike specifications, fabrications are wholly unenlightening. We are no better off with a fabrication than without one. This is clear from comparing the first and third scenarios. Whether an arrow lands on a blank wall and the wall stays blank (as in the first scenario), or an arrow lands on a blank wall and a target is then painted around the arrow (as in the third scenario), any conclusions we draw about the arrow's flight remain the same. In either case chance is as good an explanation as any for the arrow's flight. The fact that the target in the third scenario constitutes a pattern makes no difference since the pattern is constructed entirely in response to where the arrow lands. Only when the pattern is given independently of the arrow's flight does a hypothesis other than chance come into play. Thus only in the second scenario does it make sense to ask whether we are dealing with a skilled archer. Only in the second scenario does the pattern constitute a specification. In the third scenario the pattern constitutes a mere fabrication.

The distinction between specified and unspecified information may now be defined as follows: the actualization of a possibility (i.e., information) is specified if independently of the possibility's actualization, the possibility is identifiable by means of a pattern. If not, then the information is unspecified. Note that this definition implies an asymmetry between specified and unspecified information: specified information cannot become unspecified information, though unspecified information may become specified information. Unspecified information need not remain unspecified, but can become specified as our background knowledge increases. For instance, a cryptographic transmission whose cryptosystem we have yet to break will constitute unspecified information. Yet as soon as we break the cryptosystem, the cryptographic transmission becomes specified information.

What is it for a possibility to be identifiable by means of an independently given pattern? A full exposition of specification requires a detailed answer to this question. Unfortunately, such an exposition is beyond the scope of this paper. The key conceptual difficulty here is to characterize the independence condition between patterns and information. This independence condition breaks into two subsidiary conditions: (1) a condition of stochastic conditional independence between the information in question and certain relevant background knowledge; and (2) a tractability condition whereby the pattern in question can be constructed from the aforementioned background knowledge. Although these conditions make good intuitive sense, they are not easily formalized. For the details refer to my monograph The Design Inference.

If formalizing what it means for a pattern to be given independently of a possibility is difficult, determining in practice whether a pattern is given independently of a possibility is much easier. If the pattern is given prior to the possibility being actualized--as in the second scenario above where the target was painted before the arrow was shot--then the pattern is automatically independent of the possibility, and we are dealing with specified information. Patterns given prior to the actualization of a possibility are just the rejection regions of statistics. There is a well-established statistical theory that describes such patterns and their use in probabilistic reasoning. These are clearly specifications since having been given prior to the actualization of some possibility, they have already been identified, and thus are identifiable independently of the possibility being actualized (cf. Hacking, 1965).

Many of the interesting cases of specified information, however, are those in which the pattern is given after a possibility has been actualized. This is certainly the case with the origin of life: life originates first and only afterwards do pattern-forming rational agents (like ourselves) enter the scene. It remains the case, however, that a pattern corresponding to a possibility, though formulated after the possibility has been actualized, can constitute a specification. Certainly this was not the case in the third scenario above where the target was painted around the arrow only after it hit the wall. But consider the following example. Alice and Bob are celebrating their fiftieth wedding anniversary. Their six children all show up bearing gifts. Each gift is part of a matching set of china. There is no duplication of gifts, and together the gifts constitute a complete set of china. Suppose Alice and Bob were satisfied with their old set of china, and had no inkling prior to opening their gifts that they might expect a new set of china. Alice and Bob are therefore without a relevant pattern whither to refer their gifts prior to actually receiving the gifts from their children. Nevertheless, the pattern they explicitly formulate only after receiving the gifts could be formed independently of receiving the gifts--indeed, we all know about matching sets of china and how to distinguish them from unmatched sets. This pattern therefore constitutes a specification. What's more, there is an obvious inference connected with this specification: Alice and Bob's children were in collusion, and did not present their gifts as random acts of kindness.

But what about the origin of life? Is life specified? If so, to what patterns does life correspond, and how are these patterns given independently of life's origin? Obviously, pattern-forming rational agents like ourselves don't enter the scene till after life originates. Nonetheless, there are functional patterns to which life corresponds, and which are given independently of the actual living systems. An organism is a functional system comprising many functional subsystems. The functionality of organisms can be cashed out in any number of ways. Arno Wouters (1995) cashes it out globally in terms of viability of whole organisms. Michael Behe (1996) cashes it out in terms of the irreducible complexity and minimal function of biochemical systems. Even the staunch Darwinist Richard Dawkins will admit that life is specified functionally, cashing out the functionality of organisms in terms of reproduction of genes. Thus Dawkins (1987, p. 9) will write: "Complicated things have some quality, specifiable in advance, that is highly unlikely to have been acquired by random chance alone. In the case of living things, the quality that is specified in advance is . . . the ability to propagate genes in reproduction."

Information can be specified. Information can be complex. Information can be both complex and specified. Information that is both complex and specified I call "complex specified information," or CSI for short. CSI is what all the fuss over information has been about in recent years, not just in biology, but in science generally. It is CSI that for Manfred Eigen constitutes the great mystery of biology, and one he hopes eventually to unravel in terms of algorithms and natural laws. It is CSI that for cosmologists underlies the fine-tuning of the universe, and which the various anthropic principles attempt to understand (cf. Barrow and Tipler, 1986). It is CSI that David Bohm's quantum potentials are extracting when they scour the microworld for what Bohm calls "active information" (cf. Bohm, 1993, pp. 35-38). It is CSI that enables Maxwell's demon to outsmart a thermodynamic system tending towards thermal equilibrium (cf. Landauer, 1991, p. 26). It is CSI on which David Chalmers hopes to base a comprehensive theory of human consciousness (cf. Chalmers, 1996, ch. 8). It is CSI that within the Kolmogorov-Chaitin theory of algorithmic information takes the form of highly compressible, non-random strings of digits (cf. Kolmogorov, 1965; Chaitin, 1966).

Nor is CSI confined to science. CSI is indispensable in our everyday lives. The 16-digit number on your VISA card is an example of CSI. The complexity of this number ensures that a would-be thief cannot randomly pick a number and have it turn out to be a valid VISA card number. What's more, the specification of this number ensures that it is your number, and not anyone else's. Even your phone number constitutes CSI. As with the VISA card number, the complexity ensures that this number won't be dialed randomly (at least not too often), and the specification ensures that this number is yours and yours only. All the numbers on our bills, credit slips, and purchase orders represent CSI. CSI makes the world go round. It follows that CSI is a rife field for criminality. CSI is what motivated the greedy Michael Douglas character in the movie Wall Street to lie, cheat, and steal. CSI's total and absolute control was the objective of the monomaniacal Ben Kingsley character in the movie Sneakers. CSI is the artifact of interest in most techno-thrillers. Ours is an information age, and the information that captivates us is CSI.

4. Intelligent Design

Whence the origin of complex specified information? In this section I shall argue that intelligent causation, or equivalently design, accounts for the origin of complex specified information. My argument focuses on the nature of intelligent causation, and specifically, on what it is about intelligent causes that makes them detectable. To see why CSI is a reliable indicator of design, we need to examine the nature of intelligent causation. The principal characteristic of intelligent causation is directed contingency, or what we call choice. Whenever an intelligent cause acts, it chooses from a range of competing possibilities. This is true not just of humans, but of animals as well as extra-terrestrial intelligences. A rat navigating a maze must choose whether to go right or left at various points in the maze. When SETI (Search for Extra-Terrestrial Intelligence) researchers attempt to discover intelligence in the extra-terrestrial radio transmissions they are monitoring, they assume an extra-terrestrial intelligence could have chosen any number of possible radio transmissions, and then attempt to match the transmissions they observe with certain patterns as opposed to others (patterns that presumably are markers of intelligence). Whenever a human being utters meaningful speech, a choice is made from a range of possible sound-combinations that might have been uttered. Intelligent causation always entails discrimination, choosing certain things, ruling out others.

Given this characterization of intelligent causes, the crucial question is how to recognize their operation. Intelligent causes act by making a choice. How then do we recognize that an intelligent cause has made a choice? A bottle of ink spills accidentally onto a sheet of paper; someone takes a fountain pen and writes a message on a sheet of paper. In both instances ink is applied to paper. In both instances one among an almost infinite set of possibilities is realized. In both instances a contingency is actualized and others are ruled out. Yet in one instance we infer design, in the other chance. What is the relevant difference? Not only do we need to observe that a contingency was actualized, but we ourselves need also to be able to specify that contingency. The contingency must conform to an independently given pattern, and we must be able independently to formulate that pattern. A random ink blot is unspecifiable; a message written with ink on paper is specifiable. Wittgenstein (1980, p. 1e) made the same point as follows: "We tend to take the speech of a Chinese for inarticulate gurgling. Someone who understands Chinese will recognize language in what he hears. Similarly I often cannot discern the humanity in man."

In hearing a Chinese utterance, someone who understands Chinese not only recognizes that one from a range of all possible utterances was actualized, but is also able to specify the utterance as coherent Chinese speech. Contrast this with someone who does not understand Chinese. In hearing a Chinese utterance, someone who does not understand Chinese also recognizes that one from a range of possible utterances was actualized, but this time, lacking the ability to understand Chinese, is unable to specify the utterance as coherent speech. To someone who does not understand Chinese, the utterance will appear gibberish. Gibberish--the utterance of nonsense syllables uninterpretable within any natural language--always actualizes one utterance from the range of possible utterances. Nevertheless, gibberish, by corresponding to nothing we can understand in any language, also cannot be specified. As a result, gibberish is never taken for intelligent communication, but always for what Wittgenstein calls "inarticulate gurgling."

The actualization of one among several competing possibilities, the exclusion of the rest, and the specification of the possibility that was actualized encapsulates how we recognize intelligent causes, or equivalently, how we detect design. Actualization-Exclusion-Specification, this triad constitutes a general criterion for detecting intelligence, be it animal, human, or extra-terrestrial. Actualization establishes that the possibility in question is the one that actually occurred. Exclusion establishes that there was genuine contingency (i.e., that there were other live possibilities, and that these were ruled out). Specification establishes that the actualized possibility conforms to a pattern given independently of its actualization.

Now where does choice, which we've cited as the principal characteristic of intelligent causation, figure into this criterion? The problem is that we never witness choice directly. Instead, we witness actualizations of contingency which might be the result of choice (i.e., directed contingency), but which also might be the result of chance (i.e., blind contingency). Now there is only one way to tell the difference--specification. Specification is the only means available to us for distinguishing choice from chance, directed contingency from blind contingency. Actualization and exclusion together guarantee we are dealing with contingency. Specification guarantees we are dealing with a directed contingency. The Actualization-Exclusion-Specification triad is therefore precisely what we need to identify choice and therewith intelligent causation.

Psychologists who study animal learning and behavior have known of the Actualization-Exclusion-Specification triad all along, albeit implicitly. For these psychologists--known as learning theorists--learning is discrimination (cf. Mazur, 1990; Schwartz, 1984). To learn a task an animal must acquire the ability to actualize behaviors suitable for the task as well as the ability to exclude behaviors unsuitable for the task. Moreover, for a psychologist to recognize that an animal has learned a task, it is necessary not only to observe the animal making the appropriate behavior, but also to specify this behavior. Thus to recognize whether a rat has successfully learned how to traverse a maze, a psychologist must first specify the sequence of right and left turns that conducts the rat out of the maze. No doubt, a rat randomly wandering a maze also discriminates a sequence of right and left turns. But by randomly wandering the maze, the rat gives no indication that it can discriminate the appropriate sequence of right and left turns for exiting the maze. Consequently, the psychologist studying the rat will have no reason to think the rat has learned how to traverse the maze. Only if the rat executes the sequence of right and left turns specified by the psychologist will the psychologist recognize that the rat has learned how to traverse the maze. Now it is precisely the learned behaviors we regard as intelligent in animals. Hence it is no surprise that the same scheme for recognizing animal learning recurs for recognizing intelligent causes generally, to wit, actualization, exclusion, and specification.

Now this general scheme for recognizing intelligent causes coincides precisely with how we recognize complex specified information: First, the basic precondition for information to exist must hold, namely, contingency. Thus one must establish that any one of a multiplicity of distinct possibilities might obtain. Next, one must establish that the possibility which was actualized after the others were excluded was also specified. So far the match between this general scheme for recognizing intelligent causation and how we recognize complex specified information is exact. Only one loose end remains--complexity. Although complexity is essential to CSI (corresponding to the first letter of the acronym), its role in this general scheme for recognizing intelligent causation is not immediately evident. In this scheme one among several competing possibilities is actualized, the rest are excluded, and the possibility which was actualized is specified. Where in this scheme does complexity figure in?

The answer is that it is there implicitly. To see this, consider again a rat traversing a maze, but now take a very simple maze in which two right turns conduct the rat out of the maze. How will a psychologist studying the rat determine whether it has learned to exit the maze? Just putting the rat in the maze will not be enough. Because the maze is so simple, the rat could by chance just happen to take two right turns, and thereby exit the maze. The psychologist will therefore be uncertain whether the rat actually learned to exit this maze, or whether the rat just got lucky. But contrast this now with a complicated maze in which a rat must take just the right sequence of left and right turns to exit the maze. Suppose the rat must take one hundred appropriate right and left turns, and that any mistake will prevent the rat from exiting the maze. A psychologist who sees the rat take no erroneous turns and in short order exit the maze will be convinced that the rat has indeed learned how to exit the maze, and that this was not dumb luck. With the simple maze there is a substantial probability that the rat will exit the maze by chance; with the complicated maze this is exceedingly improbable. The role of complexity in detecting design is now clear since improbability is precisely what we mean by complexity (cf. section 2).
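
The contrast between the two mazes can be made quantitative under a simplifying assumption that goes beyond the example itself: that a randomly wandering rat makes each turn as an independent, equally likely choice between left and right. The Python sketch below uses illustrative function names.

```python
import math
from fractions import Fraction

def chance_exit_probability(n_turns: int) -> Fraction:
    """Probability that n_turns independent 50/50 choices are all correct.

    The 50/50 model of a wandering rat is an illustrative assumption.
    """
    return Fraction(1, 2) ** n_turns

def information_bits(p: Fraction) -> float:
    """Information in bits of an event with probability p."""
    return -math.log2(float(p))

simple_maze = chance_exit_probability(2)     # two right turns
complex_maze = chance_exit_probability(100)  # one hundred correct turns

print(simple_maze, information_bits(simple_maze))
print(float(complex_maze), information_bits(complex_maze))
```

On this model the simple maze is exited by chance one time in four (2 bits), whereas the complicated maze is exited by chance with probability 2 to the power -100 (100 bits), which is the sense in which greater improbability means greater complexity.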

This argument for showing that CSI is a reliable indicator of design may now be summarized as follows: CSI is a reliable indicator of design because its recognition coincides with how we recognize intelligent causation generally. In general, to recognize intelligent causation we must establish that one from a range of competing possibilities was actualized, determine which possibilities were excluded, and then specify the possibility that was actualized. What's more, the competing possibilities that were excluded must be live possibilities, sufficiently numerous so that specifying the possibility that was actualized cannot be attributed to chance. In terms of probability, this means that the possibility that was specified is highly improbable. In terms of complexity, this means that the possibility that was specified is highly complex. All the elements in the general scheme for recognizing intelligent causation (i.e., Actualization-Exclusion-Specification) find their counterpart in complex specified information--CSI. CSI pinpoints what we need to be looking for when we detect design.

As a postscript, I call the reader's attention to the etymology of the word "intelligent." The word "intelligent" derives from two Latin words, the preposition inter, meaning between, and the verb lego, meaning to choose or select. Thus according to its etymology, intelligence consists in choosing between. It follows that the etymology of the word "intelligent" parallels the formal analysis of intelligent causation just given. "Intelligent design" is therefore a thoroughly apt phrase, signifying that design is inferred precisely because an intelligent cause has done what only an intelligent cause can do--make a choice.

5. The Law of the Conservation of Information

Evolutionary biology has steadfastly resisted attributing CSI to intelligent causation. Although Manfred Eigen recognizes that the central problem of evolutionary biology is the origin of CSI, he has no thought of attributing CSI to intelligent causation. According to Eigen natural causes are adequate to explain the origin of CSI. The only question for Eigen is which natural causes explain the origin of CSI. The logically prior question of whether natural causes are even in-principle capable of explaining the origin of CSI he ignores. And yet it is a question that undermines Eigen's entire project. Natural causes are in-principle incapable of explaining the origin of CSI. To be sure, natural causes can explain the flow of CSI, being ideally suited for transmitting already existing CSI. What natural causes cannot do, however, is originate CSI. This strong proscriptive claim, that natural causes can only transmit CSI but never originate it, I call the Law of Conservation of Information. It is this law that gives definite scientific content to the claim that CSI is intelligently caused. The aim of this last section is briefly to sketch the Law of Conservation of Information (a full treatment will be given in Uncommon Descent, a book I am jointly authoring with Stephen Meyer and Paul Nelson).

To see that natural causes cannot account for CSI is straightforward. Natural causes comprise chance and necessity (cf. Jacques Monod's book by that title). Because information presupposes contingency, necessity is by definition incapable of producing information, much less complex specified information. For there to be information there must be a multiplicity of live possibilities, one of which is actualized, and the rest of which are excluded. This is contingency. But if some outcome B is necessary given antecedent conditions A, then the probability of B given A is one, and the information in B given A is zero. If B is necessary given A, Formula (*) reduces to I(A&B) = I(A), which is to say that B contributes no new information to A. It follows that necessity is incapable of generating new information. Observe that what Eigen calls "algorithms" and "natural laws" fall under necessity.

Since information presupposes contingency, let us take a closer look at contingency. Contingency can assume only one of two forms. Either the contingency is a blind, purposeless contingency--which is chance; or it is a guided, purposeful contingency--which is intelligent causation. Since we already know that intelligent causation is capable of generating CSI (cf. section 4), let us next consider whether chance might also be capable of generating CSI. First notice that pure chance, entirely unsupplemented and left to its own devices, is incapable of generating CSI. Chance can generate complex unspecified information, and chance can generate non-complex specified information. What chance cannot generate is information that is jointly complex and specified.

Biologists by and large do not dispute this claim. Most agree that pure chance--what Hume called the Epicurean hypothesis--does not adequately explain CSI. Jacques Monod (1972) is one of the few exceptions, arguing that the origin of life, though vastly improbable, can nonetheless be attributed to chance because of a selection effect. Just as the winner of a lottery is shocked at winning, so we are shocked to have evolved. But the lottery was bound to have a winner, and so too something was bound to have evolved. Something vastly improbable was bound to happen, and so, the fact that it happened to us (i.e., that we were selected--hence the name selection effect) does not preclude chance. This is Monod's argument and it is fallacious. It fails utterly to come to grips with specification. Moreover, it confuses a necessary condition for life's existence with its explanation. Monod's argument has been refuted by the philosophers John Leslie (1989), John Earman (1987), and Richard Swinburne (1979). It has also been refuted by the biologists Francis Crick (1981, ch. 7), Bernd-Olaf Küppers (1990, ch. 6), and Hubert Yockey (1992, ch. 9). Selection effects do nothing to render chance an adequate explanation of CSI.

Most biologists therefore reject pure chance as an adequate explanation of CSI. The problem here is not simply one of faulty statistical reasoning. Pure chance is also scientifically unsatisfying as an explanation of CSI. To explain CSI in terms of pure chance is no more instructive than pleading ignorance or proclaiming CSI a mystery. It is one thing to explain the occurrence of heads on a single coin toss by appealing to chance. It is quite another, as Küppers (1990, p. 59) points out, to follow Monod and take the view that "the specific sequence of the nucleotides in the DNA molecule of the first organism came about by a purely random process in the early history of the earth." CSI cries out for explanation, and pure chance won't do. As Richard Dawkins (1987, p. 139) correctly notes, "We can accept a certain amount of luck in our [scientific] explanations, but not too much."

If chance and necessity left to themselves cannot generate CSI, is it possible that chance and necessity working together might generate CSI? The answer is No. Whenever chance and necessity work together, the respective contributions of chance and necessity can be arranged sequentially. But by arranging the respective contributions of chance and necessity sequentially, it becomes clear that at no point in the sequence is CSI generated. Consider the case of trial-and-error (trial corresponds to necessity and error to chance). Once considered a crude method of problem solving, trial-and-error has so risen in the estimation of scientists that it is now regarded as the ultimate source of wisdom and creativity in nature. The probabilistic algorithms of computer science (e.g., genetic algorithms--see Forrest, 1993) all depend on trial-and-error. So too, the Darwinian mechanism of mutation and natural selection is a trial-and-error combination in which mutation supplies the error and selection the trial. An error is committed after which a trial is made. But at no point is CSI generated.

Natural causes are therefore incapable of generating CSI. This broad conclusion I call the Law of Conservation of Information, or LCI for short. LCI has profound implications for science. Among its corollaries are the following: (1) The CSI in a closed system of natural causes remains constant or decreases. (2) CSI cannot be generated spontaneously, originate endogenously, or organize itself (as these terms are used in origins-of-life research). (3) The CSI in a closed system of natural causes either has been in the system eternally or was at some point added exogenously (implying that the system though now closed was not always closed). (4) In particular, any closed system of natural causes that is also of finite duration received whatever CSI it contains before it became a closed system.

This last corollary is especially pertinent to the nature of science for it shows that scientific explanation is not coextensive with reductive explanation. Richard Dawkins, Daniel Dennett, and many scientists are convinced that proper scientific explanations must be reductive, moving from the complex to the simple. Thus Dawkins (1987, p. 316) will write, "The one thing that makes evolution such a neat theory is that it explains how organized complexity can arise out of primeval simplicity." Thus Dennett (1995, p. 153) will view any scientific explanation that moves from simple to complex as "question-begging." Thus Dawkins (1987, p. 13) will explicitly equate proper scientific explanation with what he calls "hierarchical reductionism," according to which "a complex entity at any particular level in the hierarchy of organization" must properly be explained "in terms of entities only one level down the hierarchy." While no one will deny that reductive explanation is extremely effective within science, it is hardly the only type of explanation available to science. The divide-and-conquer mode of analysis behind reductive explanation has strictly limited applicability within science. In particular, this mode of analysis is utterly incapable of making headway with CSI. CSI demands an intelligent cause. Natural causes will not do.

William A. Dembski, presented at Naturalism, Theism and the Scientific Enterprise: An Interdisciplinary Conference at the University of Texas, Feb. 20-23, 1997.


Barrow, John D. and Frank J. Tipler. 1986. The Anthropic Cosmological Principle. Oxford: Oxford University Press.
Behe, Michael. 1996. Darwin's Black Box: The Biochemical Challenge to Evolution. New York: The Free Press.
Bohm, David. 1993. The Undivided Universe: An Ontological Interpretation of Quantum Theory. London: Routledge.
Chaitin, Gregory J. 1966. On the Length of Programs for Computing Finite Binary Sequences. Journal of the ACM, 13:547-569.
Chalmers, David J. 1996. The Conscious Mind: In Search of a Fundamental Theory. New York: Oxford University Press.
Crick, Francis. 1981. Life Itself: Its Origin and Nature. New York: Simon and Schuster.
Dawkins, Richard. 1987. The Blind Watchmaker. New York: Norton.
Dembski, William A. 1998. The Design Inference: Eliminating Chance through Small Probabilities. Forthcoming, Cambridge University Press.
Dennett, Daniel C. 1995. Darwin's Dangerous Idea: Evolution and the Meanings of Life. New York: Simon & Schuster.
Devlin, Keith J. 1991. Logic and Information. New York: Cambridge University Press.
Dretske, Fred I. 1981. Knowledge and the Flow of Information. Cambridge, Mass.: MIT Press.
Earman, John. 1987. The Sap Also Rises: A Critical Examination of the Anthropic Principle. American Philosophical Quarterly, 24(4): 307-317.
Eigen, Manfred. 1992. Steps Towards Life: A Perspective on Evolution, translated by Paul Woolley. Oxford: Oxford University Press.
Forrest, Stephanie. 1993. Genetic Algorithms: Principles of Natural Selection Applied to Computation. Science, 261:872-878.
Hacking, Ian. 1965. Logic of Statistical Inference. Cambridge: Cambridge University Press.
Hamming, R. W. 1986. Coding and Information Theory, 2nd edition. Englewood Cliffs, N.J.: Prentice-Hall.
Kolmogorov, Andrei N. 1965. Three Approaches to the Quantitative Definition of Information. Problemy Peredachi Informatsii (in translation), 1(1): 3-11.
Küppers, Bernd-Olaf. 1990. Information and the Origin of Life. Cambridge, Mass.: MIT Press.
Landauer, Rolf. 1991. Information is Physical. Physics Today, May: 23-29.
Leslie, John. 1989. Universes. London: Routledge.
Mazur, James E. 1990. Learning and Behavior, 2nd edition. Englewood Cliffs, N.J.: Prentice Hall.
Monod, Jacques. 1972. Chance and Necessity. New York: Vintage.
Schwartz, Barry. 1984. Psychology of Learning and Behavior, 2nd edition. New York: Norton.
Shannon, Claude E. and W. Weaver. 1949. The Mathematical Theory of Communication. Urbana, Ill.: University of Illinois Press.
Stalnaker, Robert. 1984. Inquiry. Cambridge, Mass.: MIT Press.
Swinburne, Richard. 1979. The Existence of God. Oxford: Oxford University Press.
Wittgenstein, Ludwig. 1980. Culture and Value, edited by G. H. von Wright, translated by P. Winch. Chicago: University of Chicago Press.
Wouters, Arno. 1995. Viability Explanation. Biology and Philosophy, 10:435-457.
Yockey, Hubert P. 1992. Information Theory and Molecular Biology. Cambridge: Cambridge University Press.