Most of the basic concepts in evolutionary theory predate 1953, when virtually nothing was known about DNA12. As early as 1930, it became clear that evolving organisms had to maintain old functions as they gained new ones. One way to achieve this was gene duplication: within a genome, a gene would duplicate, allowing one copy to keep generating the cellular products needed to function while the duplicate took on new functions as mutations accumulated. In those days, the genetic role of DNA was not known. It was only known that each nucleus of a eukaryotic cell contained pairs of chromosomes, the condition called diploidy. These pairs were known to separate during cell division and to duplicate again, generating two diploid daughter cells identical to the parent cell, or else to form haploid gametes (sperm and ovum), which may in turn reunite within a diploid egg. It was not known that genes are physically composed of nucleic acid bases assembled in linear strings of DNA, themselves paired in a double strand and located in the chromosomes.
The first attempt to explain the sudden appearance of a superior level of organization within a phylum was made by Susumu Ohno (died 2000), who worked at the City of Hope Hospital in Los Angeles. Ohno proposed in 1970 that the transition from Amphioxus, a fishlike invertebrate possessing a notochord, to vertebrate fishes (lampreys) was accomplished by a whole genome duplication. By this, he meant the duplication of all the chromosomes of the parent cell, passed on to the two daughter cells. Noting that trisomic children are not generally known to be at an advantage compared to their parents, Ohno argued that organisms could handle extra genomes much better than they could handle extra copies of a particular chromosome: an additional copy of a single chromosome would translate into production of too many of the particular proteins specified by the genes of that chromosome, which could be harmful to the organism. Because chromosomes exist in pairs, the first round produced two copies of each pair, four chromosomes of each kind in total. A parent organism with 3 chromosomes would generate offspring with 6 chromosomes. According to Ohno, a second round of duplication occurred soon thereafter, giving rise to the diversity of the vertebrates, a third duplication allowing the appearance of the amphibians. In the end, one may speculate that the chromosome number of Man (46, or 48 in the other great apes) derives from 4 rounds of duplication, starting with 3 primal chromosomes.
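The doubling arithmetic behind this speculation is easy to check. The short Python sketch below is an illustration only: the function name `chromosome_counts` is hypothetical, and the starting value of 3 is simply the figure quoted in the text, not a measured one.

```python
def chromosome_counts(primal: int, rounds: int) -> list[int]:
    """Return the chromosome count before and after each round of whole genome duplication."""
    counts = [primal]
    for _ in range(rounds):
        # Each round of whole genome duplication doubles the previous count.
        counts.append(counts[-1] * 2)
    return counts


print(chromosome_counts(3, 4))  # [3, 6, 12, 24, 48]
```

Four doublings of 3 give 48, the number on which the speculation rests.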
These speculations about rounds of whole genome duplication are supported by some evidence and contradicted by other evidence. Whole genome duplication is plausible, but it has not yet been proven. What is more likely to have occurred is the duplication of single genes. Since 1970, tremendous progress has been made in identifying the molecular components of genomes and defining their evolutionary relationships. Molecular geneticists operate in a conceptual universe as different from classical genetics and evolutionary theory as quantum physics is from Newtonian mechanics. They have lately come up with a radically different picture of genome organization.
4.3.1 "Hard-wired" organisms
The view developed at the dawn of molecular genetics during the 1960s was based on research on the bacterium Escherichia coli. According to the results obtained with that bacterium, it was proposed that the first self-replicating entities were RNA-based. In these postulated primitive organisms, RNA was used to store, replicate and process genetic information. Later organisms developed DNA as a repository for information. Direct read-out of information from DNA into RNA allows the genome to be faithfully reproduced in RNA. This outcome occurs in what may now be called “hard-wired” organisms.
In a hard-wired genome, DNA directly codes for all proteins. Organisms with hard-wired genomes evolve by mutation of DNA, which permanently alters the sequences that encode proteins. When the fidelity of a polymerase (i.e. the enzyme that synthesizes the nucleic acid chain) is high, the nucleotide sequence is preserved; when the fidelity is low, the sequence tends to change. Over time, genetic information became organized in response units. Information stored in this way tends to be clustered in operons, so that the read-out of a single transcriptional unit initiated in response to a stimulus is rapid and appropriate. Such a form of adaptation is well suited to unicellular organisms. Short replication times and rapid multiplication allow for the generation of large numbers of individuals with divergent DNA genomes that enable survival in changing conditions.
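The link between polymerase fidelity and sequence change can be pictured with a toy simulation. This is a minimal sketch under stated assumptions, not a model of any real polymerase: the function `replicate`, the per-base `error_rate` parameter and the template sequence are all hypothetical, and every error is treated as a substitution by a different base.

```python
import random

BASES = "ACGT"


def replicate(sequence: str, error_rate: float, rng: random.Random) -> str:
    """Copy a sequence, substituting a different base with probability error_rate at each position."""
    copy = []
    for base in sequence:
        if rng.random() < error_rate:
            # Low fidelity: pick one of the three other bases.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            # High fidelity: the base is copied unchanged.
            copy.append(base)
    return "".join(copy)


def differences(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)
template = "ATGGCCATTGTAATGGGCCGC"  # arbitrary illustrative sequence

# Perfect fidelity preserves the sequence; zero fidelity changes every base.
print(differences(template, replicate(template, 0.0, rng)))  # 0
print(differences(template, replicate(template, 1.0, rng)))  # 21
```

Intermediate values of `error_rate` give copies that drift from the template by an amount that varies from run to run, which is the sense in which low fidelity generates divergent descendants.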
Biologists understand hard-wired genomes well. Work on hard-wired genomes reinforced the dogma that RNA was a passive messenger transferring information from DNA to protein.
4.3.2 Adaptation to predictable challenges
Fidelity of replication and repair is necessary to maintain a genome; diversity among a genome’s descendants improves the chance that some will survive, or even flourish, as the environment presents challenges and opportunities. A tendency to change some parts of the genome is riskier than a tendency to change others. Thus, the balance between fidelity and exploration falls under selective pressure and, for each genome, certain genetic changes are orders of magnitude more likely than others.
Which genomes are likely to endure through the survival of descendants? Those for which probable changes provide effective responses to probable challenges. Genomes may evolve stereotyped mechanisms to overcome predictable challenges, for example in host/pathogen battles. In such a battle, the immune system of the host changes and adapts to the surface antigens of the pathogen, while the pathogen varies its surface antigens, seeking to hide from the immune system. To the extent that variation tends to focus on places where experience has indicated that it is most likely to generate a new function, and tends to move away from areas where changes historically have done more harm than good, the genome evolves a “world view” of which types of changes are most likely to yield a new function and least likely to destroy something essential. Genomes have evolved mechanisms that facilitate their own evolution.
The ability to adapt and evolve can be viewed as a skill, which the genome learns as it moves through time and generations.
4.3.3 Meeting the unpredictable challenge: translocation and transposition
Although there are predictable challenges, such as those encountered by a pathogen invading a new host, most environmental challenges are unpredictable. The existence of mechanisms that diversify a genome and increase the probability that its descendants will survive an unpredictable environment began to be suspected as early as the middle of the 1940s, with the publication of the work of Oswald Avery and of Barbara McClintock.
The Avery laboratory endeavored to understand how killed virulent bacteria could transform the descendants of a non-virulent strain of Pneumococcus into a virulent type and demonstrated in 1944 that DNA carried the transforming information. DNA was discovered as something Pneumococcus took up from its environment, which changed its descendants13. Barbara McClintock confirmed this proof of lateral genetic transfer. Working with maize, she stressed that changes in a genome are not limited to uncatalyzed chance. A genome can take up information from the outside, which transforms its behavior in a heritable way. Genetic information also can move within a genome; this movement can be regulated, and induced in time of stress14.
In contrast to “predictable challenges”, Barbara McClintock observed that when a genome is confronted by a difficulty for which it is “unprepared”, it may reorganize itself: the environmental stress can upset the balance of genomic stability and repair, increase genetic variation and change the spectrum of mutations. The favorable mutations that arise are rapidly spread through a population by uptake from the environment. The stress induces DNA double-strand breaks, which are then rejoined in arrangements better suited to meet the stress. The stress may also reactivate dormant genes. Genomes are not a passive list of recipes: mechanisms have emerged within genomes that facilitate their own evolution. McClintock lived long enough to see this remarkable observation, fiercely opposed by her distinguished colleagues not only because it upset preconceived ideas and conventional wisdom but also because she was a woman, rewarded with a Nobel Prize when she was 81.
A genome’s ability to grow and to explore new organizational structures would be severely constrained if its options were limited to simple point mutations. There is a selective advantage in having multiple mechanisms that generate genetic diversity. In many organisms, multiple changes occur simultaneously. One mechanism is lateral transfer, whereby preexisting functional DNA coming from other organisms in the environment may be taken up. This horizontal transfer of genetic information can cross species barriers. Another mechanism is gene duplication. Genomes that evolve mechanisms that duplicate functional DNA have a strong selective advantage over genomes that proceed with random sequences and test every mutation. Duplication of a gene within a nucleus allows a genome to explore variation without losing the function of the original copy. Note that such genomes must, in addition, evolve specific mechanisms to mark duplicates for accelerated diversification and to protect against the risk of massive loss of duplicated sequences.
The idea of regulation of gene expression and of genome maintenance dates back to McClintock’s observations that developmental patterns in maize varied with the insertion and excision of chromosome segments. Far from the motionless double helix of textbook covers, “living” DNA is dynamic. Evolution emerged as a biological function. Evolution is not only point mutation but also translocation and transposition.
4.3.4 Composition of the operon
The first molecular lesson acquired since 1953 is that many different genetic codes exist in addition to the triplet code for amino acids (the codon). These codes affect genome functions such as replication, transcription and recombination, chromatin organization, RNA processing, chromosome localization and chromosome pairing. The second insight is that the basic genetic elements function as systems composed of multiple codons. The systems are called operons. For example, the gene in Escherichia coli that is responsible for the metabolization of lactose is composed not only of protein-coding sequences but also of all the sequences specifying the transcription factor binding sites necessary to promote and enhance the synthesis of the enzymes, to accelerate or decelerate it, to restrict the production of enzymes dedicated to other metabolites, and to stop the whole process when the supply of lactose is exhausted. This mosaic of genetic elements dedicated to the task of processing lactose is an operon. The third perception is that the essential elements of the operon that do not encode proteins are dispersed all over the genome, in different operons. They are repetitive elements because they must be recognized the same way at different loci. These repetitive DNA elements set the system architecture of each species. What makes each species unique is not the nature of its proteins but a distinct, specific organization of the repetitive DNA elements that must be recognized by nuclear replication and transcription functions. Within a phylum, many organisms have interchangeable proteins. About ninety percent of the genome of the zebra fish is similar to the human genome. The two differ critically in how protein synthesis is regulated during development, so as to obtain a zebra fish instead of a human. Two species are most easily distinguished by their repetitive DNAs.
4.3.5 Natural genetic engineering systems
An important question in evolution is: how can new adaptations, such as the eye or the wing, originate? Conventional explanations, in which point mutations randomly generate advantageous changes in complex characters that accumulate one by one, are unconvincing on probabilistic grounds. Darwin addressed the conceptual difficulty of complex adaptations. The dilemma posed by a complex adaptation such as an eye, a placenta, an ear, a kidney or language is that it requires many independent elements, all of which must be present for the organ to be useful. Darwin wrote in On the Origin of Species by Means of Natural Selection: “To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, absurd in the highest possible degree” 15.
A complex eye could have formed within 400,000 years. The placenta is regulated by 50 genes. To these are added the genes that produce protein hormones and hemoglobin adapted to fetal development, the facilitation of gas exchange, the transfer of nutrients, the disposal of waste products, and the suppression of immunological interactions or other forms of conflict between mother and embryo. The placenta of a guppy-like fish genus (Poeciliopsis) evolved three separate times in various species of the genus and took between 0.75 million years and 2.36 million years to form16. How could such complex formations have occurred in such a short time?
The outline of an answer lies in the ability of genetic engineering systems to operate non-randomly at multiple loci. All cells contain genetic engineering capacities. Natural genetic engineering systems can be activated to work at many sites in the genome. The cells are able to carry out recombination between DNA segments, to integrate exogenous DNA (transformation), to transfer DNA from one cell to another (plasmids), to insert and excise operons, to transport defined DNA segments from one location to another (transposons) and to join DNA segments. This genetic engineering is subject to biological regulation controlling the timing and the localization of changes.
It is thereby mechanistically plausible to postulate that major changes can occur rapidly in the repetitive DNA content of the genome during speciation. In its simplest form, this mechanism depends only on the activation of one or more mobile element systems that can rapidly insert regulatory motifs into appropriate sites in multiple genetic loci, leaving to selection the task of picking out variants with new viable functionalities. Rather than being restricted to contemplating a slow process in which random and blind genetic variation induces gradual phenotypic change, we can now envision a rapid genome restructuring guided by biological feedback networks17.
4.3.6 The "soft-wired" organisms
The complexity of genes increases tremendously when we consider the processes of multicellular development. In these complex organisms, genomes have expanded during evolution. Their DNA appears to have been acquired from many sources. Only a fraction of the DNA codes directly for protein. In these organisms, RNA is processed extensively, allowing a number of different messages to be produced from the same gene. In eukaryotes, the regions of genes that encode proteins, called exons, are separated by DNA with no protein-coding function. These are the introns. Splicing of RNA is needed to extract the relevant information from these genomes. In mammals, for example, a triplet of nucleotides for arginine (cytosine-guanine-guanine) may not be specified in the gene but arise through RNA editing, during which a codon for glutamine (i.e. cytosine-adenine-guanine) is changed into one for arginine. As a consequence, the nucleotide sequences present in RNA differ from those present in DNA. The message no longer corresponds exactly to the gene. This type of gene is called “soft-wired” to indicate that the flow of information from DNA to RNA is not fixed but changes according to how the messenger RNA is processed. In soft-wired systems, RNA is programmable, changing as environmental and developmental events impact a cell. In a soft-wired genome, the one-to-one correspondence between DNA and RNA no longer exists. The exact RNA variant produced depends on the manner in which the RNA is spliced, processed or edited. In these “soft-wired” organisms, RNA processing can be thought of as a series of steps, one or more of which have either a “default” outcome or else an “alternative” outcome. Modifications to RNA processing enable species to evolve in the absence of mutation to DNA.
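The glutamine-to-arginine example can be followed step by step in a few lines of Python. This is only a sketch: the helper names `transcribe` and `edit` are hypothetical, the codon table holds just the two codons named in the text, and the edit is simplified to a direct A-to-G substitution (in the cell the adenosine is converted to inosine, which is read as guanosine).

```python
# Minimal codon table: only the two codons mentioned in the text.
CODON_TABLE = {"CAG": "Gln", "CGG": "Arg"}


def transcribe(coding_dna: str) -> str:
    """Hard-wired read-out: the mRNA copies the coding strand, with U in place of T."""
    return coding_dna.replace("T", "U")


def edit(mrna: str, position: int, new_base: str) -> str:
    """Soft-wired step: replace a single base of the message (hypothetical helper)."""
    return mrna[:position] + new_base + mrna[position + 1:]


gene = "CAG"                    # DNA codon for glutamine
message = transcribe(gene)      # "CAG": the message matches the gene
edited = edit(message, 1, "G")  # "CGG": the message no longer matches the gene

print(CODON_TABLE[message], "->", CODON_TABLE[edited])  # Gln -> Arg
```

The DNA string `gene` is never modified; only the message changes, which is the point of the soft-wired description.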
4.3.7 Evolution driven by RNA editing and integration
Messenger RNA is spliced and patched, and the RNA ultimately produced thus depends on how it is processed. During processing, each cell and each organism generates only a small subset of the possible RNAs encoded within the genome. DNA may be called a virtual genome, as it does not directly specify a phenotype but only possible phenotypes. Phenotype is determined by how messenger RNAs are processed. In these organisms, proper processing of messenger RNA requires information provided by another RNA, the guide RNA, which is not protein-coding. The guide RNAs direct the processing of other RNAs and allow integration of information from many different sources. The editing of a large number of different RNA sequences can be tested without ever altering DNA coding sequences, and this is advantageous from an evolutionary point of view. Unedited RNA is produced together with edited RNA, and both original and modified proteins are available to the organism. The phenotype that proves the most successful is later integrated within the genome through reverse transcription. It has been shown that information from RNA can become part of the DNA repository through reverse transcription. The flow of information between DNA and RNA is thus bi-directional.
Evolution driven by RNA editing separates the mechanisms responsible for the generation of diversity, which are intron based, from those needed to ensure the genetic stability of the DNA exons. Reverse transcription of RNA into DNA also helps evolution. New sequences can be explored in RNA, and then passed into DNA genomes through reverse transcription. RNA is not a passive transmitter of information; distinct messages can be produced from a single gene. The RNA can be edited, which allows a genome to “try out” sequences before incorporating them into the genome permanently.
These mechanisms, which modulate the probability of genetic change, complicate the molecular analysis of phylogenetic relationships and rates of evolution. To the extent that RNA editing occurs, sequences obtained from RNA cannot be assumed to be the original genomic sequences. Distinguishing close evolutionary relationships and deriving phylogenies on the basis of DNA sequences is also difficult, owing to the uneven probability of mutations and the horizontal transfer of genetic material.
In conclusion, organisms “learn” from each other: the exchange of genetic information among organisms through horizontal gene transfer demonstrates a potential genetic connection among living things. The importance of diversity for genome survival should lead us to treasure the diversity of our own species.
4.3.8 The genome’s immune system
The central dogma of molecular biology holds that genetic information flows in cells through the replication of DNA from DNA, the transcription of RNA from DNA, and the translation of proteins from RNA. Genomes are sensitive to invasion by viruses. Forty-five percent of the human genome consists of remnants of previous transposons and virus invasions. Eight percent of our genome consists of retroviral sequences. One would expect that organisms need to fight off these invasions to prevent the genome from being taken over by molecular invaders. Most known viruses store and replicate their genomes as RNA, with no natural DNA forms. In recent years, a defense mechanism has been discovered that is conserved among eukaryotes. It has specificity against foreign nucleic acid elements and has the ability to amplify and raise a massive response against an invading nucleic acid. The two problems are, first, how to distinguish self from non-self and, second, how to amplify an initial response in a specific fashion. These problems are also those faced by the immune system.
Highly concentrated double-stranded RNAs are not normally found in cells and are indicative of a viral invasion. These foreign genes are silenced by an RNA-mediated mechanism. The long double-stranded foreign RNA (dsRNA) is sliced by an enzyme (the Dicer enzyme) into many small interfering RNAs (siRNAs). An amplification step is thereby accomplished. Depending on the length of the dsRNA, the amplification could measure 10- to 20-fold. Since each siRNA is used many times, the amplification is further assured. The siRNA attaches to the messenger RNA generated by the invading double-stranded RNA and inactivates it with the help of an endonuclease.
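The amplification figure follows from simple division: one long molecule yields as many siRNAs as there are fragments. The Python sketch below illustrates this under an assumption not stated in the text, namely a fragment length of 21 nucleotides (siRNAs are typically 21 to 23 nucleotides long); the function name `dice` and the placeholder sequences are hypothetical.

```python
SIRNA_LENGTH = 21  # assumed fragment length in nucleotides; not given in the text


def dice(dsrna: str, fragment_length: int = SIRNA_LENGTH) -> list[str]:
    """Cut a long dsRNA sequence into consecutive full-length fragments, discarding any remainder."""
    n_fragments = len(dsrna) // fragment_length
    return [
        dsrna[i * fragment_length:(i + 1) * fragment_length]
        for i in range(n_fragments)
    ]


for length in (210, 420):
    fragments = dice("A" * length)  # placeholder sequence of the given length
    print(length, "nt ->", len(fragments), "siRNAs")
# 210 nt -> 10 siRNAs
# 420 nt -> 20 siRNAs
```

Under this assumption, a 10- to 20-fold amplification corresponds to dsRNA roughly 200 to 400 nucleotides long.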
13. Avery O. T. et al.: Studies on the chemical nature of the substance inducing transformation of pneumococcal types. Induction of transformation by a deoxyribonucleic acid fraction isolated from Pneumococcus type III. J. Exp. Med. 1944: 79, 137-158