2. The Evolution of Molecules

2.3 Viruses

The next step in the evolution of life along the nucleic acid pathway requires the presence of enclosures. Molecules of RNA composed of about 50 bases are the most likely to go through this process because these molecules reproduce with a low error rate while being able to form a stable tertiary structure. Considering the size of the molecules under discussion, one should assume that suitable cavities must have a volume of about 10^-13 cubic centimeters. Such enclosures are readily available between clay particles or can be built through aggregation with other organic material.

The RNA molecules that do not fit perfectly in their entirety within aggregates will diffuse and be destroyed during decay phases while those that fit will be protected against destruction. This is a convergent phase during which all faulty RNA pieces are eliminated. The aggregates come apart during multiplication phases and aggregate again during decay phases. The result of this process is that the RNA within aggregates is reproduced at a low error rate as the faulty reproduced pieces are eliminated (fig. 2.7).

Figure 2.7. I. RNAs replicate within clay enclosures (not shown). II. RNAs endowed with a tertiary structure can form an aggregate that protects them against degradation. The formation of such an aggregate requires time. This is an evolutive convergent phase: all RNA pieces less protected during decay phases are eliminated. III. Aggregates form more readily when the pieces are assembled with a collector strand. Identical units are superior to heterogeneous pieces. The longer the collector strands, the more efficient they are. This is an evolutive divergent phase for the RNA collector strand, which may change at will provided its collector function is fulfilled. IV. Aggregates can be formed from protein subunits. A three-dimensional aggregation is represented by adenovirus (252 subunits). A two-dimensional aggregation type is represented by tobacco mosaic virus. The best protein-synthesizing mechanism has selective advantages. This is a convergent phase that leads to autonomously reproducing bacteria.

The aggregates may have formed in clay particles containing copper, together with proteinic membranes. The first membrane was probably formed from polyglycine. The evolution of the membrane is tied to the evolution of the genetic code, just as the origin of the genetic code is tied to the evolution of the membrane. It is assumed that the earliest genetic code was a guanine (G)-cytosine (C) code. The doublet GG coded for glycine and the doublet CC coded for proline. Guanine is the least water-soluble of the nucleotides under discussion. The most primitive code may well be GG, which codes for the simplest amino acid, glycine.

GC coded for alanine. The RNA code evolved by the addition of adenine, which coded for various other amino acids, and ended with the addition of uracil, which allowed for the coding of all the 21 amino acids known to be coded. As said, the very first membrane, composed of glycine oligomers, was polymerized on mineral surfaces containing copper. In turn, diglycine plus copper is capable of coding for cytosine. The glycine oligomer may have coded for oligocytosine. This poly-C would in turn be a template for the polymerization of poly-G. These sets of polymerizations on the edges of a surface of a copper-containing clay particle set the stage for the further evolution of both the code and the membranes.
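The doublet assignments described above can be written down as a small lookup table. The following is only a minimal sketch: the names PRIMITIVE_CODE and translate_doublets are hypothetical, and the table contains only the three doublets stated in the text (GG, CC, GC). Any other doublet is reported as unassigned rather than guessed.

```python
# Doublet assignments stated in the text; nothing else is assumed.
PRIMITIVE_CODE = {"GG": "glycine", "CC": "proline", "GC": "alanine"}


def translate_doublets(rna):
    """Read an RNA string two bases at a time and look up each doublet.

    A trailing unpaired base is ignored; doublets without an assignment
    in the text are reported as "unassigned".
    """
    rna = rna.upper()
    doublets = [rna[i:i + 2] for i in range(0, len(rna) - 1, 2)]
    return [PRIMITIVE_CODE.get(d, "unassigned") for d in doublets]


if __name__ == "__main__":
    # A poly-G template read in doublets yields only glycine (polyglycine).
    print(translate_doublets("GGGGGG"))
    print(translate_doublets("GGGGCCGC"))
```

Under this reading, a poly-G strand specifies a glycine oligomer, which is the pairing of poly-G and polyglycine that the paragraph describes.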

The RNA world described here was almost certainly not exposed to free hot aqueous solutions because RNA, unlike DNA, is labile in hot water at high pH in the presence of bivalent cations such as calcium and magnesium. The prebiotic milieu was therefore not an organic soup but an organic scum at 100°-200°C. Biosynthesis and polymerization are postulated to have taken place on the surface of iron sulfide (FeS and FeS2). Polymers of both amino acids and nucleic acids are the evolutionary outcome of such surface-contained archaic metabolism.

The next step in evolution may have improved the mode of collecting the aggregates. Indeed, the time required for the formation of aggregates is very long if the pieces are dispersed in three dimensions and one relies on chance for the occurrence of an aggregation. The initial concentration of the pieces in this case is preponderant. If the nucleic acid is provided with a long collector strand, then the pieces that form the aggregate hook on the collector strand and move along it until the point of fixation of the aggregate is reached. This system greatly accelerates the formation of aggregates.

The collector strand fulfills only a physical function. It may change its composing bases during replication. The changes introduced in the composition of the collector do not influence the replication of the aggregates as long as the matching regions of the formed aggregates are still correctly maintained. The exposed surfaces of aggregates are free to vary and will do so. This represents another divergent phase of evolution, with the production of different collector strands. These gain in efficiency with length. Certain animal viruses have stretches composed of cytidylic acid. These stretches may be a souvenir of the collector role fulfilled in earlier times.

The next convergent step is reached when the RNA molecules, now much longer by the addition of the collector strand, produce in some way their own aggregates. I describe at the beginning of chapter 3 the way nucleic acid synthesizes proteins. The production by an RNA of its own proteinic envelope leads to a turning point because such systems are free and independent of naturally occurring compartments or aggregates. They can reproduce wherever the necessary periodic fluctuations occur. It is a convergent phase because any system able to complete the synthesis of its envelope faster than another system has selection advantages and will prevail. At this point, selection favors systems endowed with a more sophisticated protein-synthesizing apparatus.

The remnants of these systems of life at a very elementary level are still among us. There are viruses composed of a single strand of nucleic acid enclosed by a capsule of protein. This protein shell (the capsid) is composed in general of several small pieces, which have the remarkable property of spontaneously rearranging themselves and reconstituting a full capsid when the loose subunits are present in sufficiently high concentration in a solution. This happens even when there is no nucleic acid present to be enclosed. In addition, some viral capsids will enclose any suitable nucleic acid, even if not the legitimate one. Many viruses of this elementary type are known: poliovirus, foot-and-mouth disease virus, tobacco mosaic virus, etc. (fig. 2.8).

Figure 2.8. Viruses present tremendous variations in size, shape and organizational levels.

Just like minerals, these viruses can form crystals. The nucleic acid of these viruses does not need the capsid in order to multiply, and is still "alive" even if deprived of it: the naked nucleic acid can invade cells and start an infection.

More elaborate viral systems are those where the viral capsid is itself enclosed within an envelope made of proteins and lipids. Influenza and mumps (fig. 2.8) are representatives of this evolutive level. Functionally, these virus types are also superior, because they carry within them a protein that will direct and facilitate the synthesis of new viral nucleic acid.

The nucleic acid of these virus types is a single strand of RNA that serves as the depository of the genetic information. In small viruses such as poliovirus, it is this RNA that directly specifies the synthesis of various viral proteins. Complementary strands of RNA, also called "negative" strands, are necessary intermediates for the synthesis of new "positive" infectious nucleic acid. At the influenza level of organization, the system becomes more elaborate. In this case, the "negative" strand is put to better use and specifies the synthesis of various viral proteins. Within the influenza group, a higher level of organization is reached with the paramyxoviruses, such as measles, mumps and Newcastle disease virus, which manage to enclose within the capsid not only the positive strand of nucleic acid and the viral protein destined to start viral duplication, but also the negative strand of nucleic acid. The whole is enclosed within an envelope.
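The relation between a "positive" strand and its "negative" intermediate is plain base complementarity, which a few lines of code can illustrate. This is a minimal sketch: the function name complementary_strand and the sample sequence are hypothetical, and only standard Watson-Crick RNA pairing (A with U, G with C) is assumed.

```python
# Standard RNA base pairing: A pairs with U, G pairs with C.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_strand(rna):
    """Return the complementary RNA strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(rna.upper()))


if __name__ == "__main__":
    positive = "AUGGCUUAC"                       # made-up sample sequence
    negative = complementary_strand(positive)    # the "negative" intermediate
    restored = complementary_strand(negative)    # copying it again
    print(positive, negative, restored)
```

Copying the negative strand gives back the original positive strand, which is why the negative strand serves as an intermediate in producing new infectious RNA.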

With a refined protein-synthesizing mechanism evolving through selection advantages offered to mutants, all that remains to be achieved is to stop the waste of material occurring every time that the duplication mechanism of RNA is working. Since both RNA strands duplicate while only one serves as messenger RNA for the synthesis of protein, the other strand of RNA is almost useless. Salvage of material is accomplished by storing the genetic information in a more stable structure. This is the double-stranded DNA. The discovery of the double-strandedness of DNA is a major scientific accomplishment. In 1968, J. Watson attributed the discovery to himself and depicted himself as a prototype of the scientific hero. His account of the discovery (The Double Helix) was a scandal even before publication. Harvard University Press refused to publish it because of his self-aggrandizement and the scurrilous portraits of all of the principals in the story. He used the work of Rosalind Franklin without paying due tribute to her (Rosalind Franklin, by B. Maddox, Harper Collins, London and NY, 2002). His behavior is a good example of the indecent way science and medicine are currently practiced, as I will point out again on different occasions and have pointed out already (La France malade de sa médecine. Ed. de Paris, 2005).

The origin of DNA is an enigma. The most elusive aspect of the emergence of DNA is the nature of its template: it is indeed commonly assumed that DNA is copied from an RNA template. But where does the replicative enzyme, the reverse transcriptase, come from? It has been found (note 11) that no template is necessary. In fact, the enzyme that synthesizes RNA in the presence of magnesium (polynucleotide phosphorylase) is just as well able to synthesize DNA in the presence of ferric iron. But was ferric iron readily available in primeval times? Apparently not. We will see in the next chapter that ferric iron became available only 2 billion years ago, with the activity of bacteria. In that case DNA-based viruses are late arrivals in evolution, having evolved after the emergence of the bacteria that made ferric iron available.

There are 16 possible nucleotide bases that could pair up to make DNA. Why did nature pick the four we know as adenine (A), thymine (T), guanine (G) and cytosine (C) for the genomic alphabet? This choice, according to Dónall Mac Dónaill of Trinity College Dublin, incorporates a tactic for minimizing the occurrence of errors in the pairing of the bases, in the same way that error-detecting codes are incorporated into ISBNs on books, bank account numbers and airline tickets. Each nucleotide presents three bonding sites to its partner, each being either a hydrogen donor or an acceptor. A nucleotide offering donor-acceptor-donor sites would bond only with an acceptor-donor-acceptor nucleotide. A mismatch may accidentally occur, but its likelihood is further reduced by the chemical structure of the two classes of bases, purines (adenine and guanine) versus pyrimidines (thymine and cytosine). The A-T and G-C choice forms the pairs that are the most different from each other, so that their ubiquitous use in living beings represents an efficient and successful choice rather than an accident of evolution.
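The donor-acceptor matching rule can be made concrete by enumerating the candidate bases. The sketch below is illustrative only and rests on two labeled assumptions: a candidate base is modeled as a three-site pattern of donors (D) and acceptors (A) plus a ring class, which gives 2^3 x 2 = 16 candidates, and a pair is taken to form only between a purine and a pyrimidine whose patterns are exact complements. The names complement and can_pair are hypothetical.

```python
from itertools import product

SITES = "DA"                          # D = hydrogen donor, A = hydrogen acceptor
CLASSES = ("purine", "pyrimidine")    # ring class, modeled as a simple label


def complement(pattern):
    """Swap donors and acceptors at every site, e.g. DAD -> ADA."""
    return "".join("A" if site == "D" else "D" for site in pattern)


def can_pair(base_a, base_b):
    """Assumed rule: opposite ring classes and fully complementary sites."""
    (pattern_a, class_a), (pattern_b, class_b) = base_a, base_b
    return class_a != class_b and pattern_b == complement(pattern_a)


# All 16 hypothetical bases: 8 donor/acceptor patterns x 2 ring classes.
candidates = [("".join(p), c) for p in product(SITES, repeat=3) for c in CLASSES]

# For each candidate, list the candidates it can pair with under the rule.
partners = {b: [o for o in candidates if can_pair(b, o)] for b in candidates}

if __name__ == "__main__":
    for base, matches in partners.items():
        print(base, "->", matches)
```

Under these assumptions every candidate has exactly one legitimate partner, and a single flipped site is enough to break a pair, which is the sense in which the pattern acts like a checking code.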

The double-stranded DNA is retained thereafter throughout the whole of the subsequently appearing living forms. A form of life of this type is represented in its primitive aspect by phage ΦX174, a virus infecting bacteria, whose genetic material is composed of a single strand of DNA. Herpes simplex virus, with 84 known genes, represents a higher level of organization. The genetic material of this virus is composed of double-stranded DNA. A messenger RNA, synthesized on this DNA, is required for the synthesis of proteins. Like influenza, this virus carries within its capsid the enzyme responsible for the duplication of its own nucleic acid. The system is further perfected in the vaccinia virus group. In this group, the complexity of the organism increases considerably, as does its size. Yet vaccinia is still dependent on the presence of amino acids in the environment for its multiplication and cannot synthesize these building blocks from simpler supplies. Human cytomegalovirus is a herpes virus with more than 200 genes. This giant among viruses contains not only DNA but also four species of RNA, producing four proteins synthesized as soon as the viral particles penetrate into the host's cells (note 12). Tumor viruses are entities that have retained a great flexibility: the DNA is the store of information and viral messenger RNA is synthesized on this DNA. Under various conditions, some of the viral messenger RNA can be reverted straight back into DNA. But this reverse synthesis now occurs on a portion of the DNA of the cell. By this process, viral DNA is in part integrated into the cellular genome of the host, and this leads to wild multiplication of the host cell, i.e. a tumor (note 13).

The psittacosis group of viruses represents higher forms of life, reaching the border of autonomous synthesis of high quality nutrients. The autonomous level is perhaps also reached by the pleuro-pneumonia-like organisms. This step is certainly reached by the bacteria.

These refinements in the duplication of the genetic material and the synthesis of proteins took place during 500 million years, between 4.0 and 3.5 billion years ago. The oldest traces of biological material observed in rocks are 3.5 billion years old. Bacteria are 3 billion years old.

An analysis of the succession of events needed to reach the organizational level of the bacteria, which are fully autonomous and able to synthesize about 1,000 proteins, reveals that the time granted was more than sufficient. Starting from simple polymers, this level could have been reached after the passing of about 10^11 generations. If we assume that a generation occurs within a day, we are within reasonable limits since, under optimal conditions, bacteriophages, i.e. viruses that multiply in bacteria, multiply within 7 minutes, a bacterium divides within 6 to 7 hours and a virus like polio or foot-and-mouth disease reproduces 10,000-fold within 6 to 7 hours. In this case, 1 x 10^8 years of 365 days of 24 hours (we know the days were much shorter in early times) would be needed to achieve the goal. The first microbial communities detected with certainty in South African sediments were thriving 3 x 10^9 years ago. As we will see in the next chapter, life may have been killed off as late as 3.8 x 10^9 years ago by impacts from large asteroids. Thus, 800 million years appear to have been left for the origin and early diversification of life.
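The estimate above is a product of a generation count and a generation time, so it can be recomputed for different assumptions. The following is a rough sketch for checking orders of magnitude only. The function names and the two trial generation times (24 hours and 7 hours) are illustrative choices; the 10^11 generations and the window between 3.8 and 3.0 billion years ago are the figures quoted in the text.

```python
HOURS_PER_YEAR = 365 * 24  # years of 365 days of 24 hours, as in the text


def years_needed(generations, hours_per_generation):
    """Total time, in years, for the given number of generations."""
    return generations * hours_per_generation / HOURS_PER_YEAR


def safety_margin(years_available, years_required):
    """How many times longer the available time is than the required time."""
    return years_available / years_required


GENERATIONS = 1e11                 # figure quoted in the text
YEARS_AVAILABLE = 3.8e9 - 3.0e9    # window quoted in the text: 800 million years

if __name__ == "__main__":
    for hours in (24, 7):          # trial generation times (assumptions)
        needed = years_needed(GENERATIONS, hours)
        margin = safety_margin(YEARS_AVAILABLE, needed)
        print(f"{hours:>2} h per generation: {needed:.2e} years, margin {margin:.1f}x")
```

The required time scales linearly with the assumed generation time, so the margin between time available and time needed depends strongly on that assumption.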

The time thus available for this achievement was at least five times longer than the time needed.


11. M. Beljanski: De novo synthesis of DNA-like molecules by polyribonucleotide phosphorylase in vitro. Journal of Molecular Evolution, 1996, 42: 493-499.

12. W. A. Bresnahan and T. Shenk: A subset of viral transcripts packaged within human cytomegalovirus particles. Science, 2000, 288: 2373-2376.

13. However, the genetic material of some tumor viruses is RNA.
