Vol. 59, No 4/2012

The polymerization of amino acids under anhydrous prebiotic conditions was first studied several decades ago. Here we use a stochastic model that stresses the role of amino acid polarity in the formation of oligopeptides in a prebiotic milieu. Our goal is to show that the predominance of co-polypeptides over homo-polypeptides results not only from randomness but also from the polarity properties of amino acids. Our results indicate that co-polypeptides had a higher probability of formation than homo-polypeptides. We further hypothesize that co-polypeptides would have a wider spectrum of possible chemical functions than homo-polypeptides.


INTRODUCTION
The polymerization of amino acids under anhydrous prebiotic conditions has been studied by several authors (see Nakashima et al., 1977; Fox et al., 1977). We have put forward a simple probabilistic model to analyze such processes (Mosqueira et al., 2000). This model pays particular attention to the polarity of the participating amino acids. We believe that this feature of amino acids should imprint a bias in the polypeptides produced.
There is experimental evidence of such a bias. It has been reported that the thermal anhydrous synthesis of tri-peptides involving glutamic acid, glycine, and tyrosine produced only 2 tri-peptides, whereas the formation of 36 tri-peptides is expected under an a priori assumption of an even (that is, purely statistical) probability of reaction between the different amino acids (Nakashima et al., 1977; Fox et al., 1977). (These authors studied only tyrosine-containing tri-peptides.) Furthermore, a mechanistic study of this reaction has been performed (Hartmann et al., 1981). It is worth mentioning that all the experimental work on this type of polymerization reaction described in the literature provided only semi-quantitative results, and there are no kinetic data with which to study the evolution of oligopeptides with time.
We also have experimental evidence in organic chemistry for the synthesis of biased polymers (Katime, 1994). In the presence of two monomers, $M_1$ and $M_2$, and their respective free radicals, $M_1^{\bullet}$ and $M_2^{\bullet}$, the propagation reaction is described using 4 kinetic constants: $k_{11}$ and $k_{12}$ for the reactions $M_1^{\bullet} + M_i$, with $i = 1, 2$, and $k_{21}$ and $k_{22}$ for the reactions $M_2^{\bullet} + M_i$, with $i = 1, 2$. Of course, $k_{11} \neq k_{12} \neq k_{21} \neq k_{22}$. Such conditions lead to the synthesis of biased polymers and not of purely random polymers. Likewise, the polarity of amino acids will result in the synthesis of random but biased oligopeptides, that is, oligopeptides with limited randomness.
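As an illustration of how four unequal propagation constants produce biased rather than purely random chains, the sketch below converts the constants into conditional addition probabilities, in the manner of the standard terminal model of copolymerization kinetics. The function name, the rate constant values, and the equimolar monomer concentrations are hypothetical and chosen only for illustration; they are not taken from Katime (1994).

```python
def propagation_probabilities(k11, k12, k21, k22, m1, m2):
    """Return a 2x2 matrix of conditional addition probabilities.

    Row i corresponds to the radical at the chain end (M1* or M2*);
    column j corresponds to the monomer added next (M1 or M2).
    m1 and m2 are the monomer concentrations (arbitrary units).
    """
    rate11, rate12 = k11 * m1, k12 * m2
    rate21, rate22 = k21 * m1, k22 * m2
    return [
        [rate11 / (rate11 + rate12), rate12 / (rate11 + rate12)],
        [rate21 / (rate21 + rate22), rate22 / (rate21 + rate22)],
    ]


# Hypothetical rate constants (arbitrary units) and an equimolar feed.
probabilities = propagation_probabilities(
    k11=1.0, k12=4.0, k21=3.0, k22=0.5, m1=1.0, m2=1.0
)
for row in probabilities:
    print([round(x, 3) for x in row])
```

Because the rows differ from each other and from (0.5, 0.5), the identity of the chain end influences which monomer is added next, which is the sense in which the resulting polymer is biased.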
To take into account the different polarities of amino acids, we have adopted, as a fairly good approximation, the Dickerson and Geis (1969) classification of amino acids into polar positive (p+), polar negative (p-), neutral (n), and non-polar (np) groups. This is an electrostatic or electromagnetic classification (the latter when the charges are moving, which is usually the case). Such an electromagnetic classification is important because we are focusing on possible chemical reactions between amino acids. In chemical kinetics, it is important to consider the electromagnetic nature of the reacting species. For example, we may have a reaction between an ion and a molecule (ion-molecule reactions are very effective and fast), or a quite different reaction between 2 non-polar molecules. It is with such ideas in mind that we adhere to this classification of amino acids.
We apply this model to the anhydrous polymerization of amino acids. Of course, we know this is an unrealistic situation in a prebiotic environment since, more probably, such media would contain a mixture of molecules and not only amino acids. We proceed in this manner, firstly, because it is the usual assumption in prebiotic chemistry and, secondly, because there are some experimental reference data with which to compare our theoretical predictions. We remark, however, that our probabilistic method may be applied to other situations that include a variety of molecules, not only amino acids. The validity of our model remains to be verified by future research.
We applied our probabilistic model, allowing interactions between two groups (out of the total of 4 groups mentioned above), to different stages of the known reaction mechanism for the synthesis of tri-peptides from glutamic acid, tyrosine, and glycine. Our model was able to explain the strong bias previously observed (Mosqueira et al., 2000). Further, we found that the initial conditions did not play an important role in this process, as the steady state attained was not influenced by their values. In another work, we discussed the role of the steady state as an important constraint limiting or biasing the sequences of the oligopeptides produced (Mosqueira et al., 2002). Furthermore, we have exhaustively studied two-group interactions and their possible outcomes.
Finally, we extended our probabilistic model to any possible permutation of the 4 groups of amino acids, and referred once more to the relevance of the initiator for this oligomerization reaction (Mosqueira et al., 2008).

POLYMERIZATION OF α-AMINO ACIDS IN HYPOTHETICAL PREBIOTIC CONDITIONS
Several decades ago, it was experimentally established that to polymerize amino acids under anhydrous thermal conditions, there must be a sufficient proportion of at least one tri-functional amino acid, such as aspartic acid, glutamic acid, or lysine (Harada & Fox, 1965). This situation may lead to the presumption that bi-functional amino acids are unable to homo-polymerize. However, glycine is an exception to the rule and does not need a tri-functional amino acid to self-polymerize. There are only a few examples of homo-polymerization of amino acids under prebiotic conditions.
It is important to realize that polymerization is possible only if the conditions are anhydrous. Nonetheless, asparagine (a tri-functional amino acid) can homo-polymerize in aqueous media upon heating (Kovacs & Nagy, 1961; Harada et al., 1978; Munegumi et al., 1994). Aspartic acid is a tri-functional amino acid; at pH 6 it carries a negative charge on its carboxyl groups and a positive charge on its amino group. The self-polymerization of finely powdered DL-aspartic acid has been reported (Kovacs et al., 1961). It was carried out either by heating in vacuum (at 200°C for 120 hours) or by removing the water formed through azeotropic distillation. The polymeric material, referred to as anhydropolyaspartic acid, was formed by the loss of 2 molecules of water during condensation.
Glycine can self-polymerize either in aqueous media upon heating (Meggy, 1956) or in a water-ammonia mixture at temperatures below 150°C (Oro & Guidry, 1961). The polymerization of glycine was investigated in detail by Meggy (1956), who found that glycine polymerized to poly-glycine at temperatures above 140°C in the presence of a limited amount of water, i.e., less than about 3 parts of water per 1 part of diketopiperazine (diketopiperazine results from the cyclodehydration of 2 glycines). If the ratio is 4 to 1, no polymer of glycine is formed.
In the present work, we show the advantage of applying our probabilistic model to the process of homo-polymerization of amino acids. As an important complement, we also refer briefly to copolymerization (by which we mean the polymerization of two different monomers) to support our argument.

METHODS
In this work, we consider the polymerization of amino acids under possible thermal prebiotic conditions, via a dehydration-condensation reaction. From the electric standpoint, all amino acids have identical amino groups and acid groups. They differ only in the electric properties of the residue group. It is this group that determines the electric properties of an amino acid. As already mentioned, we adopted the Dickerson and Geis (1969) classification of amino acids based on their electric characteristics.
We now consider the reactivity among such electric groups. Taken pairwise (as in a typical bimolecular reaction), we may find amino acid pairs with a high chance of reacting and pairs with a lower probability of condensing. This is, in essence, a process in which randomness is present. For this reason, we adopt first-order Markov chains as our mathematical tool. We present a summary of this stochastic process in the following section.

THE MODEL
Let us define a finite Markov chain (Moran, 1984). Consider events that can occur at successive discrete stages, denoted by a variable $k$ that can take the values $0, 1, 2, \ldots$ At each stage, one of a finite number of events $E_1, E_2, \ldots, E_n$ can occur. These are the possible states of the system.
At each stage $k+1$, we suppose that the events $E_1, \ldots, E_n$ occur with certain probabilities, which depend only on the event that occurred at stage $k$ and not on anything that happened previously. We write $p_{ij}$ for the probability that $E_j$ occurs at stage $k+1$, conditional on $E_i$ having occurred at stage $k$.
The set of quantities $p_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, n$, known as the transition probabilities, are non-negative and satisfy the conditions

$$\sum_{j=1}^{n} p_{ij} = 1, \qquad i = 1, \ldots, n. \tag{1}$$

Besides, $P = (p_{ij})$ is an $n \times n$ matrix known as the transition probability (or reactivity) matrix of the system (or stochastic matrix of the system).
If the probabilities of the events $E_1, \ldots, E_n$ at any stage $k$ are denoted by $p_1(k), \ldots, p_n(k)$, then for the state matrix at the following stage we have

$$p_j(k+1) = \sum_{i=1}^{n} p_i(k)\, p_{ij}, \qquad j = 1, \ldots, n, \tag{2}$$

and these equations can be written in the matrix form

$$p(k+1) = p(k)\,P, \tag{3}$$

where $p(k)$ is a row vector (or $1 \times n$ matrix) whose elements are $p_1(k), \ldots, p_n(k)$. Let us define a $1 \times n$ initial state matrix (or initial state row vector) $p(0)$. By applying (3) repeatedly we see that

$$p(k) = p(0)\,P^{k}, \tag{4}$$

where $k$ is an integer.

Now, we assume different electromagnetic interactions between the reacting monomers (amino acids). To that end, in accordance with Dickerson and Geis (1969), we classify amino acids into four groups: polar positive (p+), polar negative (p-), neutral (n), and non-polar (np). So, we arrange the possible electromagnetic interactions between these four groups of amino acids into a 4×4 matrix $P$:

$$P = \begin{pmatrix} p^{+}p^{+} & p^{+}p^{-} & p^{+}n & p^{+}np \\ p^{-}p^{+} & p^{-}p^{-} & p^{-}n & p^{-}np \\ n\,p^{+} & n\,p^{-} & n\,n & n\,np \\ np\,p^{+} & np\,p^{-} & np\,n & np\,np \end{pmatrix} \tag{5}$$

Biased versus unbiased randomness in homo-polymers and copolymers of amino acids in the prebiotic world

Thus, for example, the element $p_{13}$ is equal to $p^{+}n$. Besides, the state of the system is represented at any stage $k$ by a state matrix, which is a row matrix with four elements:

$$p(k) = \big(p_1(k),\; p_2(k),\; p_3(k),\; p_4(k)\big). \tag{6}$$

As time elapses, the initial state evolves toward a steady state. This state may be calculated from the following equation (Moran, 1984):

$$p^{\circ} = p^{\circ} P. \tag{7}$$

This equation states that the row vector of a given stage is the same as the row vector of the following stage. This, of course, is the steady state condition. This state seems to appear once $k$ has attained a sufficiently large value (i.e., a value of $k$ no greater than 6-11). It persists through all subsequent stages as long as the process is sustained, i.e., in our case, as long as the chemical process of polymerization is sustained.

The existence and attainment of the steady state stabilizes the proportion of the different sequences produced. It is an important condition that limits variability in polymer sequencing. Furthermore, we found that the initial conditions do not play a relevant role in the attainment of the steady state. The same steady state is attained irrespective of the initial conditions. (For further details see Mosqueira et al. (2002).) To calculate the steady state row vector, we should use equation (7) plus the probabilistic condition expressed by equation (1).
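The iteration of equation (3) and the approach to the steady state can be sketched numerically as follows. This is a minimal illustration, not the authors' computation: the entries of the 4×4 matrix are hypothetical values chosen only so that each row sums to 1, and the helper names `step` and `iterate` are introduced here for convenience.

```python
def step(state, P):
    """One stage of equation (3): p(k+1) = p(k) P, with p(k) a row vector."""
    size = len(P)
    return [sum(state[i] * P[i][j] for i in range(size)) for j in range(size)]


def iterate(state, P, stages):
    """Apply equation (3) repeatedly, which is equivalent to equation (4)."""
    for _ in range(stages):
        state = step(state, P)
    return state


# Hypothetical reactivity matrix; rows and columns ordered (p+, p-, n, np).
# Each row sums to 1, so the matrix is stochastic.
P = [
    [0.05, 0.50, 0.30, 0.15],
    [0.50, 0.05, 0.30, 0.15],
    [0.30, 0.30, 0.20, 0.20],
    [0.20, 0.20, 0.30, 0.30],
]

start_a = [1.0, 0.0, 0.0, 0.0]      # only p+ present initially
start_b = [0.25, 0.25, 0.25, 0.25]  # all groups equally represented

state_a = iterate(start_a, P, 30)
state_b = iterate(start_b, P, 30)
print([round(x, 4) for x in state_a])
print([round(x, 4) for x in state_b])
```

For this particular matrix the two starting vectors lead to the same row vector, and a further application of `step` leaves it unchanged, which is the steady state condition of equation (7).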
We remark that matrix (5) reduces its order when fewer than four groups of amino acids are present. That is, if there are only 3 groups of amino acids, matrix (5) becomes a 3×3 matrix. Likewise, with only 2 groups it becomes a 2×2 matrix, and with only 1 group it reduces to a 1×1 matrix. This is necessary in order to maintain a stochastic transition matrix in every instance.
Finally, we should comment briefly on the interpretation that we give to $p_{ij}$ in equation (5), which differs slightly from the orthodox interpretation of a transition matrix in a Markov chain. In a Markov chain, a matrix element $p_{ij}$ signifies the probability that an entity i becomes an entity j. In our approach, we interpret it as the probability of a chemical reaction between entities i and j. This completes the summary of the model.

COPOLYMERIZATION VERSUS HOMO-POLYMERIZATION
We now apply our model to discern the relative abundances of copolymers and homo-polymers.
In a homo-polymerization reaction, the electric nature of the participating chemical species is the same. On the other hand, in a copolymerization reaction (two different monomers), a minimum variation in the electric nature of the reacting species is introduced.
Consider a bimolecular reaction, as is usually the case, with only 2 kinds of species, say p+ and n. Then matrix (5) becomes a 2×2 matrix and the state matrix (6) becomes a 1×2 matrix, respectively

$$P = \begin{pmatrix} p^{+}p^{+} & p^{+}n \\ n\,p^{+} & n\,n \end{pmatrix} \tag{8}$$

and

$$p(k) = \big(p_1(k),\; p_2(k)\big). \tag{9}$$

Now, let us examine, within qualitative criteria, the numerical magnitudes to be assigned to the elements of matrix (8). We should expect a larger frequency of collisions for $p^{+}n$ than for $p^{+}p^{+}$. This supposition is based on basic physics: it is well known that like charges repel each other and opposite charges attract. We assume that this result is qualitatively correct. Then $p^{+}n > p^{+}p^{+}$.
With respect to the symmetrical elements (i.e., $p^{+}n$ and $np^{+}$), it might seem that we should assign them the same numerical value, since it might be thought that it is the same phenomenon whether p+ interacts with n or n interacts with p+. However, a careful examination of this situation leads us to the conclusion that in chemistry the symmetrical case is the exception and the asymmetrical situation is the rule.
To illustrate this aspect, we use specific members of groups p+ and n to form a dimer: lysine (p+) and glycine (n). We then construct the gly-lys and lys-gly dimers.
It can be seen from Fig. 1 that neither object is symmetrical. These dimers possess different charge distributions and therefore are not equivalent. Using basic chemistry and enzyme biochemistry, it can be shown that the two dimers react differently in chemical and enzymatic reactions. This suggests that the symmetric elements in matrix (8) do not have equal values. That is, we will assume $p^{+}n \neq np^{+}$.
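Finally, in the second row of matrix (8) we will assume $np^{+} > nn$ as a reasonable chemical hypothesis. Then, we define

$$p^{+}p^{+} = \alpha, \qquad p^{+}n = 1 - \alpha, \qquad \text{assuming } \alpha \text{ is small, that is, } 1 - \alpha \gg \alpha. \tag{10}$$

Likewise,

$$nn = \beta, \qquad np^{+} = 1 - \beta, \qquad \text{assuming } \beta \text{ is small, that is, } 1 - \beta \gg \beta. \tag{11}$$

Under such assumptions we write matrix (8) as

$$P = \begin{pmatrix} \alpha & 1 - \alpha \\ 1 - \beta & \beta \end{pmatrix}. \tag{12}$$

The calculation of the steady state (7), plus the condition $p_1 + p_2 = 1$, indicates that the composition of the sequences of co-polypeptides in the steady state, $(p_1^{\circ}, p_2^{\circ})$, is

$$\big(p_1^{\circ},\; p_2^{\circ}\big) = \left( \frac{1 - \beta}{2 - \alpha - \beta},\; \frac{1 - \alpha}{2 - \alpha - \beta} \right). \tag{13}$$

Bearing in mind definitions (10) and (11), it is reasonable to assume $\beta > \alpha$. On this basis, let us identify the effect on the element $a_{12}$ of (13), that is, $p_2^{\circ}$, when $n\alpha \to \beta$, that is, $\beta = n\alpha$ with $n = 1, 2, \ldots$ (here $n$ is an integer multiplier, not the neutral group). Using this definition, the element $p_2^{\circ}$ takes the form

$$p_2^{\circ} = \frac{1 - \alpha}{2 - (n + 1)\alpha}. \tag{14}$$

Observe in (14) that as $n$ increases, $p_2^{\circ}$ increases as well. This result shows that among the copolymers formed in the steady state, the chemical species n would be more abundant than p+ (see equation (9)). This is the result that would be expected on physical grounds. In the unrealistic case $\alpha = \beta$, both monomers (p+ and n) would be equally represented in the co-polypeptides in the steady state.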
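The steady state of the two-group case can be checked numerically. The sketch below assumes the transition matrix with rows (α, 1-α) and (1-β, β) defined by (10) and (11), uses the closed form that follows from the steady state condition together with p1 + p2 = 1, and sets β = nα for a few integer multipliers. The value α = 0.05 is hypothetical, and the function names are introduced here only for illustration.

```python
def steady_state(alpha, beta):
    """Closed-form steady state, ordered (p+, n), for the 2x2 matrix
    [[alpha, 1 - alpha], [1 - beta, beta]]."""
    denom = 2.0 - alpha - beta
    return ((1.0 - beta) / denom, (1.0 - alpha) / denom)


def apply_matrix(state, alpha, beta):
    """One stage p(k+1) = p(k) P for the same 2x2 matrix."""
    p1, p2 = state
    return (p1 * alpha + p2 * (1.0 - beta), p1 * (1.0 - alpha) + p2 * beta)


alpha = 0.05  # hypothetical value, small as required by condition (10)
for multiplier in (1, 2, 3, 4):
    beta = multiplier * alpha
    p1, p2 = steady_state(alpha, beta)
    print(multiplier, round(p1, 4), round(p2, 4))
```

With multiplier 1 (α = β) the two species are equally represented, and the share of the neutral species grows as the multiplier increases, in line with the behaviour described for the element p2°.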

A FIRST APPROXIMATION TO CALCULATE THE RELATIVE ABUNDANCE OF A CO-POLYMER IN COMPARISON WITH A HOMO-POLYMER
We propose in this section a simple means to calculate, from our model, the relative ratio η of the yield of a hetero-polymer to that of a homo-polymer. In brief, η compares the probability elements leading to hetero-polymer synthesis with the probability elements leading to homo-polymer synthesis. Let us look at matrix (8) and consider the former elements. These are $p^{+}n$, $np^{+}$, and $nn$. We take the average of those quantities and divide it by the matrix element for homo-polymer synthesis, that is, $p^{+}p^{+}$. Then

$$\eta = \frac{\tfrac{1}{3}\left(p^{+}n + np^{+} + nn\right)}{p^{+}p^{+}}. \tag{15}$$

Obviously, the value of η is greater than one (see condition (10) on $p^{+}p^{+}$), and we may calculate, as a first approximation, the ratio of co-polymers to homo-polymers. In this form, the role of the second monomer participating in the co-polymerization may be clearly identified.

Harada (1959) studied the homo-polymerization of lysine, and some other co-polymerizations. He reported that free DL-lysine converted to its liquid lactam at 150-170°C with vigorous evolution of water vapor, and homo-polymerized at 180-230°C. This appears to describe a reaction mechanism that occurs in 2 stages (see Fig. 2). In the first stage, there is an internal cyclo-dehydration of lysine (A), giving rise to a lactam with a net positive charge (B). That is, a tri-functional amino acid (A) converts to a mono-functional amino acid (B). In the second stage, at a higher temperature, such positive lactam molecules react with each other to yield the DL-lysine homo-polymer, although with some difficulty because of their identical electrical charges. We find similarities between the self-cyclization of glutamic acid (Mosqueira et al., 2000) and that of lysine. In both cases, there is a cyclo-dehydration reaction between the amino and carboxylic groups of the same tri-functional molecule. After this reaction, glutamic acid gives rise to a p- species, whereas lysine forms a lactam (a p+ species).
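As a numerical illustration of the ratio η (the average of the elements p+n, np+, and nn divided by p+p+), the sketch below evaluates it using definitions (10) and (11). Substituting those definitions gives η = (2 - α)/(3α), which does not depend on β and exceeds 1 whenever α < 1/2. The parameter values are hypothetical, since the text assigns no specific numbers to α and β.

```python
def eta(alpha, beta):
    """First-approximation ratio of co-polymer to homo-polymer yield:
    the mean of (p+n, np+, nn) divided by p+p+."""
    p_plus_p_plus = alpha
    p_plus_n = 1.0 - alpha
    n_p_plus = 1.0 - beta
    n_n = beta
    return ((p_plus_n + n_p_plus + n_n) / 3.0) / p_plus_p_plus


# Hypothetical values of alpha; beta is set to 2 * alpha as an example.
for alpha in (0.05, 0.10, 0.20):
    print(alpha, round(eta(alpha, 2 * alpha), 3))
```

The smaller the like-charge probability α, the larger the ratio, which is the qualitative trend the model relies on.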

RESULTS
In the case of lysine polymerization, the use of our model is straightforward, as the reaction mixture is quite simple because the number of intermediate species is greatly reduced. The transition matrix (5) reduces to a 1×1 matrix, as the only possible interaction is $p^{+}p^{+}$. To this interaction we should assign the value 1. Note that a probability equal to 1 must be assigned to this pair interaction even though the reaction is hindered by the identical electrical charges of the reacting molecules. This appears to be contradictory, but it must be so because there is no other species in the reaction mixture, and the stochastic character of the reactivity matrix demands that $p^{+}p^{+} = 1$. Accordingly, the initial state matrix (6) with 4 elements reduces to a state matrix with 1 element, described simply as (1), a 1×1 matrix. The use of equation (2) shows that the state matrix will not change in time: the 1×1 state matrix at stage k [equal to (1)] is multiplied by the 1×1 transition matrix [equal to (1)] to give, every time, a 1×1 state matrix at stage k+1 [equal to (1)]. This process proceeds without changes as long as there are p+ monomers to react.

Harada (1959) measured the yields of (1) the DL-lysine homo-polymer, (2) the DL-lysine-glycine co-polymer, and (3) the DL-lysine-DL-aspartic acid co-polymer. It is most interesting to note that the yield of (2) was about 9 times higher than that of (1), and that of (3) was about 6 times higher than that of (1). These results are revealing. When a pure homo-polymerization process is compared with a copolymerization process (with the participation of 2 different monomers), the latter leads to a greater yield.
In our model, which takes into account the electric nature of the molecules, these results follow straightforwardly. When at least two classes of monomers participate, the role of the second monomer can be identified. Basically, it frees the like-charge element $p^{+}p^{+}$ from the probability value of 1 (which is required to maintain a stochastic matrix when only one species is present), so that it can take a more plausible small value α, far below 1 (but, of course, not equal to zero; see condition (10)).
Let us assume that the second monomer is neutral (n), as is the case for glycine in the DL-lysine-glycine co-polymer of Harada's experiment (1959). Then, among the polymers produced, sequences of the type …np+nnnnnp+nnp+nnnp+… (of course, this is a single example from a statistical set of co-polymers) will be favored in comparison with sequences of the type …p+p+p+p+p+p+p+…, which represent the homo-polymer case. For this reason, we conclude that in any instance we will obtain a much larger yield of copolymers than of homo-polymers. This expectation may be derived from our model through equation (15). We have, as a first approximation, a ratio η > 1. That is, the ratio of the yield of a co-polymer to that of a homo-polymer is greater than 1. This result is in agreement with experiment (Harada, 1959), in which the DL-lysine-glycine co-polymer is 9 times more abundant than the DL-lysine homo-polymer. Of course, we cannot give a specific value to η, as α and β have not been assigned specific numbers. We stress that the aim of the present work is not to determine the numerical values of such parameters (α and β), but to infer from basic physics the relative forces of repulsion and attraction between electrostatic charges, and to construct a simple probabilistic model to deal with such prebiotic polymerization phenomena.
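The scarcity of contiguous p+p+ pairs in such co-polymers can be illustrated by sampling a chain from the two-group model. The sketch below rests on an assumption that goes beyond the text: the element for a pair (i, j) is read as the probability that a chain ending in residue i adds residue j next. The values of α and β, the initiator, the chain length, and the function names are all hypothetical, and the sampled chain is only one realization from a statistical set.

```python
import random


def sample_sequence(alpha, beta, length, seed=0):
    """Grow one chain of p+ and n residues, using the 2x2 matrix
    [[alpha, 1 - alpha], [1 - beta, beta]] as the rule for the next residue."""
    rng = random.Random(seed)
    sequence = ["p+"]  # arbitrary initiator
    while len(sequence) < length:
        if sequence[-1] == "p+":
            # p+p+ has probability alpha; p+n has probability 1 - alpha.
            nxt = "p+" if rng.random() < alpha else "n"
        else:
            # nn has probability beta; np+ has probability 1 - beta.
            nxt = "n" if rng.random() < beta else "p+"
        sequence.append(nxt)
    return sequence


def like_pair_fraction(sequence, residue):
    """Fraction of adjacent pairs in which both residues equal `residue`."""
    pairs = list(zip(sequence, sequence[1:]))
    return sum(1 for a, b in pairs if a == b == residue) / len(pairs)


# Hypothetical parameter values, both small as in conditions (10) and (11).
chain = sample_sequence(alpha=0.05, beta=0.10, length=2000, seed=1)
print("".join(chain[:20]))
print(like_pair_fraction(chain, "p+"))
```

Under these assumed values, adjacent p+p+ pairs make up only a small fraction of the chain, whereas in the homo-polymer every adjacent pair is p+p+.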

DISCUSSION AND FINAL REMARKS
It can be envisaged that contiguous like charges or monomers will not be favored in a polymerization process. On the contrary, it would be easier to unite contiguous charges of different polarities. This suggests that, in general, co-polypeptides were produced more abundantly in a prebiotic environment than homo-polypeptides, and therefore the former had more chances to last than the latter. This result may be extended to polymerization processes in which more than 2 types of amino acids participate (giving what we call hetero-peptides). Furthermore, we may hypothesize that co-polypeptides and hetero-peptides should have a wider spectrum of possible chemical functions than homo-polypeptides. We see, then, a natural emergence and predominance of complex polypeptides (co-polypeptides and hetero-polypeptides) over simpler homo-polypeptides. This is undoubtedly a valuable result.
We should make clear that the hypothesized larger presence of co-polypeptides and hetero-polypeptides in comparison with homo-polypeptides is not only due to the presence of a variety of monomers and the rules of chance. It is also due to the basic rules of physics described in matrix (5) for a polymerization process. We propose that the polypeptides produced in a prebiotic environment were random, of course, but were biased and had limited randomness, owing to the differences in the polarity of the participating amino acids described in matrix (5). We are currently using supercomputing resources to evaluate the extent of such bias.