Discrete dynamic system oriented on the formation of prebiotic dipeptides from Rode's experiment

This work attempts to rationalize the possible prebiotic profile of the first dipeptides of about 4 billion years ago based on a computational discrete dynamic system that uses the final yields of the dipeptides obtained in Rode’s experiments of salt-induced peptide formation (Rode et al., 1999, Peptides 20: 773–786). The system built a prebiotic scenario that allowed us to observe that (i) the primordial peptide generation was strongly affected by the abundances of the amino acid monomers, (ii) small variations in the concentration of the monomers have almost no effect on the final distribution pattern of the dipeptides and (iii) the most plausible chemical reaction of prebiotic peptide bond formation can be linked to Rode’s hypothesis of a salt-induced scenario. The results of our computational simulations were related to former simulations of the Miller, and Fox & Harada experiments on amino acid monomer and oligomer generation, respectively, offering additional information to our approach.


INTRODUCTION
If we consider the geochemical conditions that supposedly prevailed on Earth 4 billion years ago (Vogel, 1998), it seems that peptides had a greater chance of being formed than any other biomolecules. One plausible chemical scenario for their generation is the salt-induced peptide formation (SIPF) proposed by Rode and coworkers (Schwendinger & Rode, 1989; Schwendinger & Rode, 1992; Plankensteiner et al., 2005; Reiner et al., 2006; Fraser et al., 2011), involving high concentrations of NaCl subjected to wetting/drying cycles (Saetia et al., 1993) and acting as a condensation reagent for peptide bond formation. The SIPF hypothesis is supported by the estimated oxygen content of the secondary primitive Earth atmosphere, which allowed the oxidation of Cu(I) to Cu(II) (Cloud, 1973; Ochiai, 1978), considered a fundamental condition for the characterization of the amino acid side chain electronegativities (Schwendinger & Rode, 1992; Rode, 1999). An acidic pH and temperatures between 80 and 100°C must have prevailed during the cooling of the Earth after the formation of the first hydrosphere, together with regular drying/wetting and day/night cycles, heavy rainfalls, tidal fluctuations and various atmospheric processes.
Under such a scenario, laboratory experiments indicated the formation of peptides from binary mixtures of amino acids. It turned out that some amino acids promote the formation of homo-dipeptides and others of hetero-dipeptides. In this context, Rode carried out a systematic study of those amino acids that played a major role in the formation of dipeptides and observed their generation under SIPF conditions. His pioneering work yielded a detailed quantitative description of 81 dipeptides formed by the combination of 9 amino acids: Gly, Ala, His, Asp, Glu, Lys, Pro, Val, and Leu (Rode, 1999). The obtained concentrations of the 81 dipeptides are called here the Rode profile (Table 1; taken from Table 6 of Rode, 1999). Such a profile is very useful since it can be considered a quantifiable precedent of the relative composition profile of the starting amino acid monomers. As in the case of the amino acid distribution in the Murchison meteorite (Wolman et al., 1972), the Rode profile, i.e. the measure of the final composition of the dipeptides in Rode's experiments, can serve as valuable information enabling us to build a mathematical-computational model of the assumed prebiotic peptide formation.
The present work focuses on the possible prebiotic peptide profile formed 4 billion years ago by using the information of the Rode profile through computational simulation and by comparing this profile with our former studies (Polanco et al., 2013; Polanco et al., 2013a) on the Miller-type generation of amino acid monomers (Miller, 1953) as well as with the experiments by Fox & Harada (1960) on the generation of the so-called "proteinoids". In particular, we simulated in three computational scenarios the hypothetical peptide building (i) resulting from the Miller experiments on lightning-induced amino acid generation, using the experimentally observed monomer abundances, (ii) considering the initial conditions of the Fox & Harada experiments, and (iii) reproducing the Rode profile, taking into account the starting mixtures of the Rode experiments. The latter allowed us to extrapolate the future and past states of peptide building under those salt-induced conditions, i.e. the hypothetical building of peptides longer than dimers and the inverse process, respectively.
Our computational model intends to recreate the prebiotic scenario from a discrete dynamic system that satisfies the Markov property. This computational scheme allows multiple variables to delimit the model without affecting either the complexity or the required processing time, owing to the assumption of the Markov property (Isaacson & Madsen, 1976). A Markov process is a stochastic system in which the occurrence of a future state depends on the immediately previous state and only on that previous state. Thus the set of random variables {X_n} in a process is said to have the Markov property if P(X_(n+1) = x | X_1, ..., X_n) = P(X_(n+1) = x | X_n). Roughly speaking, the Markov property is satisfied if the future location of the object under study depends on its present state and not on its past states.
From this Markov process three relevant results can be identified: (1) the Rode profile enabled us to build a past-future profile of the prebiotic composition very accurately and with a minimal number of amino acids; (2) the profiles of the final composition from the Miller experiment on amino acid monomer formation and from the Fox & Harada and Rode experiments on amino acid oligomerization converge into a single profile, despite significantly different numbers and proportions of the involved amino acids and despite the fact that the Rode approach results in peptide bond formation whereas the Fox & Harada approach does not; and (3) the polarity bias in the amino acids does not seem to affect the composition of the prebiotic peptides constructed this way. The comparison of the three experimental approaches was performed by constructing a polarity matrix for each of them. The polarity matrix plays a fundamental role in the polarity index method that we have been using as a versatile fingerprint to identify the main pathogenic role of antimicrobial peptides (Polanco et al., 2012).
The dipeptide formation was considered in the spirit of our former toy model simulations (Polanco et al., 2013), i.e. without taking into account the thermodynamic details of a particular chemical process.

MATERIAL AND METHODS
This work is essentially a comparative study of the abiogenetic experiments by Miller, Fox & Harada, and Rode. The first two experimental approaches have already been computationally modeled by us considering the polarity as a bias (Polanco et al., 2013 and 2013a). The computational platform was designed to simulate the evolutionary process of polymerization based on the abundance and concentration of the amino acids in order to project the future trend of dipeptide formation. Subsequently, a polarity matrix was calculated for each set of the obtained dipeptides. This polarity matrix was then linearized and its geometric representation as smooth curves was used to compare the future trend of the dipeptides.
To carry out the computational modeling, the same polar classification was considered for the amino acids in each experiment, i.e. the four polar groups P+, P-, N and NP, as well as the same proportion and number of amino acids. In the Miller and Fox & Harada experiments in particular, the number of amino acids considered was like that prevailing in the prebiotic scenario 4 billion years ago. To recreate the Rode experiments, the experimental final composition of the dipeptides (matrix A^1; Section Discrete dynamic system) was not used. Instead, we estimated the dipeptide formation starting from a prebiotic scenario based on the amino acid monomer abundances as predicted by the Miller and Fox & Harada experiments. To achieve this, each of the 81 dipeptide proportions was extrapolated first forward (expressed in matrix A^6; Section Discrete dynamic system) and then, through the construction of analytic functions, backwards (expressed in the B^0.9997 matrix; Section Construction of the B^6 matrix). These functions were verified with the dipeptide proportions in the Rode experiments (Section Proximity between two matrices). The B^0.9997 matrix was verified by measuring its proximity to the Rode matrix A^1 (Section Past-future profile). Then the matrices B^0.9997 and A^1 were iterated to obtain the B^6 matrix, representing the distant future of the B^0.9997, A^6 and A^1 matrices. Afterwards the proximity between the B^0.9997 and A^1 matrices was verified. In this way we built a broader past-future scenario than that defined by the A^1 matrix from Rode's experiments. Finally, with the B^0.9997 matrix, the abiogenetic laboratory experiments were computationally modeled.
Then, with the restrictions of abundance, polarity and number of amino acids, each of the abiogenetic experiments was computationally evolved with the polarity bias enabled and disabled, and in all six cases the polarity matrices were calculated. Finally, the polarity matrices, expressed as smooth curves, were compared with and without the polarity bias. In both comparisons the consolidated set of genes of three microorganisms from Delaye et al. (2005) was included, representing the closest experimental precedent of the evolutionary trend.

Table 1 (notes). Values are taken from Table 6 of Rode (1999), where (i,j) is the yield of the i-j linkage between amino acids i and j. (na): Linkages not investigated yet; the value 0.0001 is used instead of zero (data supplied by Rode). (nf): Linkages analyzed but not found; the value is 0.0000. (tr): Linkages found in traces but not measurable; we used 0.0100 (data by Rode).

Discrete dynamic system
The typical Rode experiment (Rode, 1999) consisted of the 9 amino acids Asp, Glu, Gly, Pro, Lys, His, Ala, Leu and Val. The computer simulation of the prebiotic scenario considered a discrete dynamic system (Thom, 1975), which can be written as a matrix equation of the form A^k = A^1 A^1 ... A^1 (k factors). The A^1 matrix represented the final abundance of the experimentally formed dipeptides. The (i,j) element of the A^k matrix is read as the yield of the i-j peptide linkage between amino acids i and j at time k. The notation used in this paper for the (i,j) element of the k-th matrix is A^k_(i,j).
Construction of the future. The sequence of the Rode system started with the A^1 matrix (Table 1; Table 6 of Rode, 1999), which multiplied by itself produced the A^2 matrix, i.e. A^2 = A^1 A^1. Since the system represented the transformation of the A^1 matrix over time k, successive iterations of the A^1 matrix led to its future states. This procedure induced a succession of matrices A^1, A^2, A^3, ..., A^k, A^(k+1), ..., in which the left-end element corresponded to the past state of the system and the right-end element to the future state. It is important to note that the discrete dynamic system builds a future state from a present state, but a present state cannot be used to build a past state. The matrix representing the future of the A^1 matrix was obtained with 6 iterations and was called A^6 (Table 2).
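The forward iteration described above amounts to repeated matrix multiplication. The following minimal Python sketch illustrates it under stated assumptions: the function name `iterate_matrix` is hypothetical, and the 3x3 matrix is an invented stand-in for the 9x9 Rode matrix of Table 1, not Rode's data and not the authors' program.

```python
import numpy as np

def iterate_matrix(a1, k):
    """Return the list [A^1, A^2, ..., A^k], where A^(n+1) = A^n A^1."""
    states = [np.asarray(a1, dtype=float)]
    for _ in range(k - 1):
        # Each new state is the previous state multiplied by A^1.
        states.append(states[-1] @ states[0])
    return states

# Hypothetical 3x3 stand-in for the 9x9 A^1 matrix (illustrative values only).
a1 = np.array([[0.38, 0.10, 0.05],
               [0.20, 0.30, 0.01],
               [0.02, 0.15, 0.25]])

trajectory = iterate_matrix(a1, 6)
a6 = trajectory[-1]  # plays the role of the "future" matrix A^6
```

Keeping every intermediate state, rather than only the last one, matches the later use of the values A^1, ..., A^6 of each element as the points for the polynomial fit.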
Construction of the past. To estimate the past of the A^1 matrix (Table 1) (Rode, 1999) it was necessary to know its future trend, that is A^1, A^2, ..., A^6. As each of the 81 elements of these matrices represented the final measure of a dipeptide, to obtain a B matrix representing the past of the A^1 matrix we constructed 81 sixth-degree polynomials to act as predictor functions, which were then used to extrapolate the values back in time. The following example clarifies the procedure. To find the past of the composition Asp-Asp = (1,1) (the element in row 1, column 1 of the A matrix), we take the values of element (1,1) of the A^1, A^2, ..., A^6 matrices, i.e. A^1_(1,1) = 0.3800 (Table 1), A^2_(1,1) = 0.8852 (data not shown in the tables), and so on up to A^6_(1,1) = 1498.1847 (Table 2). This gives the succession of points (x_k, y_k) = (1, 0.3800), (2, 0.8852), ..., (6, 1498.1847). With these points we build, using the least-squares method, the polynomial P(x) = 0.80772x^6 - 10.74002x^5 + 52.65633x^4 - 108.34695x^3 + 58.31035x^2 + 76.21210x - 68.51954. Finally, we evaluate this polynomial at values of x less than 1 to extrapolate the succession into the past. For example, at x = 0.9999 the polynomial gives P(0.9999) = 0.3770, so the (1,1) element of the B^0.9999 matrix is B^0.9999_(1,1) = 0.3770. With this procedure the points {0.9999, 0.9998 and 0.9997} (Table 3) are evaluated, generating the B^0.9999, B^0.9998 and B^0.9997 matrices that represent the remote past of A^1, with the B^0.9997 matrix representing the most remote past of the A^1 matrix.
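The Asp-Asp example can be checked numerically. The sketch below evaluates the polynomial quoted in the text at x = 1 and x = 0.9999; the coefficients are copied from the text, while the function name `extrapolate` is an illustrative assumption. It only evaluates the reported polynomial and does not attempt to reproduce the authors' least-squares fit.

```python
import numpy as np

# Coefficients of P(x) for the Asp-Asp element (1,1), highest degree first,
# copied from the polynomial quoted in the text.
coeffs = [0.80772, -10.74002, 52.65633, -108.34695,
          58.31035, 76.21210, -68.51954]

def extrapolate(x):
    """Evaluate the predictor polynomial P at time index x."""
    return float(np.polyval(coeffs, x))

p_present = extrapolate(1.0)   # expected to be close to A^1_(1,1) = 0.3800
p_past = extrapolate(0.9999)   # expected to be close to B^0.9999_(1,1) = 0.3770
print(round(p_present, 4), round(p_past, 4))
```

One point worth keeping in mind when reusing this approach: a sixth-degree polynomial has seven coefficients, so six points alone do not determine it uniquely, and the text does not state how the fit resolved this.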

Construction of the B^6 matrix
The B^6 matrix (Table 4) was built by multiplying the B^0.9997 matrix by itself for 6 iterations, i.e. B^6 = B^0.9997 B^0.9997 ... B^0.9997. Just as the A^6 matrix represented the future of the A^1 matrix (Table 2), the B^6 matrix represented the future of the B^0.9997 matrix. Note that the B^0.9997 matrix was built by polynomial extrapolation (Section Discrete dynamic system) and not as the result of experimental inspection.

Proximity of the A^6 and B^6 matrices
The A^6 and B^6 matrices represented the most distant future trend of the peptide linkage composition. The first one corresponded to the experimental data and the second one was the result of the discrete dynamic system. These matrices were verified by measuring the proximity between their respective elements (Table 5).

Proximity of the A^1 and B^0.9997 matrices
The A^1 and B^0.9997 matrices represented the trend of the most distant past of the peptide linkage composition. The first one corresponded to Rode's experiment and the second one was the result of polynomial extrapolation (Table 6).
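The proximity measure can be sketched as an element-wise relative difference, following the ratio |A_(i,j) - B_(i,j)| / |B_(i,j)| given in the caption of Table 6. The function name `relative_difference` is a hypothetical choice, and the single-element example uses the Asp-Asp values quoted in the text (0.3800 and 0.3770) rather than the full 9x9 matrices.

```python
import numpy as np

def relative_difference(a, b):
    """Element-wise |a - b| / |b| for two matrices of the same shape."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.abs(a - b) / np.abs(b)

# Element (1,1): A^1 = 0.3800 (Table 1) versus the extrapolated value 0.3770
# reported in the text for the B^0.9999 matrix.
diff = relative_difference([[0.3800]], [[0.3770]])
```

With this definition, any element of the reference matrix that is exactly zero (for instance a linkage marked "nf") gives an undefined ratio, so such entries would need separate handling.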

The Rode approach
The B^0.9997 matrix represented, by polynomial extrapolation, the remote past of the A^1 matrix (Section Discrete dynamic system), and for us it was a measure of the abundance of the 81 different interactions from Rode's experiment forming the dipeptides from 9 amino acids in that remote past. With a computer program already used before to recreate prebiotic scenarios (Polanco et al., 2013), we generated a set of 3000 short peptides. The model used two factors: abundance and polarity. As a bias for the abundance we used the inverse relative abundance represented by the B^0.9997 matrix (Table 7), and for the polarity we used two inverse polarity distributions, one of which induced a bias (Table 8-A) and one of which did not (Table 8-B).

The Fox & Harada approach
The initial conditions of the Fox & Harada experiments used for a hypothetical peptide building scenario were simulated by us before (Polanco et al., 2013a). The simulation considered 18 amino acids. The proportions were 10 g Asp and 10 g Glu as well as 5 g of each of the remaining 16 amino acids given in Table 10. We took these proportions and two polarity distributions for the amino acids, one of which induced a bias (Table 8-A) and one of which did not (Table 8-B). 3000 peptides were generated. In these simulations, Gly was assigned to the neutral polar group in order to allow comparison with Rode's experiment.

Table 3. The B^0.9997, B^0.9998 and B^0.9999 matrices represent the past trend of the A^1 matrix, where the superscript 0.9997 represents the oldest trend in the past (in mM). These matrices were calculated by polynomial extrapolation evaluated at the corresponding value (Section Discrete dynamic system, Construction of the past). (na): Linkages not investigated yet. (nf): Linkages analyzed but not found. (tr): Linkages found in traces but not measurable.

Polarity matrix
The polarity matrix is an array of 16 elements, with 4 rows and 4 columns corresponding to the polar groups P+, P-, N, and NP, called for simplicity the M matrix. The M matrix is an essential part of the mathematical-computational polarity index method (Polanco et al., 2012; Polanco et al., 2013; Polanco et al., 2014) and it was used to describe exhaustively the polar profile of the analyzed peptides. In order to build this matrix from the set of 3000 peptides, taking into account the experiments of Rode, Miller and Fox & Harada with the described hypothetical peptide building extrapolations, we took the 3000 sequences in terms of their amino acids and translated them into their polar groups with the following convention: His, Arg and Lys were translated to the first group; Asp and Glu to the second group; Gly, Ser, Thr, Cys and Tyr to the third group; and α-amino-n-butyric acid (9), α-aminoisobutyric acid (0), Nva, γ-aminobutyric acid (7), β-aminoisobutyric acid (6), β-amino-n-butyric acid (5), β-alanine (4), N-methylalanine (3), N-ethylglycine (2), Sar, Ala, Leu, Pro, Val, Trp, Met, Phe and Ile to the fourth group.

Table 5. Difference between the elements of the A^6 and B^6 matrices, where the A^6 matrix is the future matrix of the A^1 matrix, i.e. A^6 = A^1 A^1 ... A^1 (Section Discrete dynamic system), and the B^6 matrix is the future matrix of the B^0.9997 matrix, i.e. B^6 = B^0.9997 B^0.9997 ... B^0.9997 (Section Construction of the B^6 matrix). (na): Linkages not investigated yet. (nf): Linkages analyzed but not found. (tr): Linkages found in traces but not measurable.

Table 6. Difference |A^1_(i,j) - B^0.9997_(i,j)| / |B^0.9997_(i,j)| (concentrations in mM), where the A^1 matrix is Rode's matrix (Section Discrete dynamic system) and the B^0.9997 matrix is calculated by polynomial extrapolation (Section Discrete dynamic system). (na): Linkages not investigated yet. (nf): Linkages analyzed but not found. (tr): Linkages found in traces but not measurable.
In this way, the file of amino acid sequences was rewritten in terms of an alphabet of 4 numbers {1, 2, 3, 4}. After this step the number of polar interactions was counted, reading each sequence from left to right one pair at a time. To illustrate this procedure, the polar equivalent of the sequence EEGPKHKDEV is 2234111224. At the start, the polarity matrix is equal to zero, i.e. M_(i,j) = 0. Reading the sequence from left to right, we first find the pair (2,2) and therefore add 1 to the M matrix, i.e. M_(2,2) = 1. After counting this first interaction we move one place to the right, find the interaction (2,3) and add 1 to this position, i.e. M_(2,3) = 1, and so forth until we find the interaction (4,1) and add 1, i.e. M_(4,1) = 1. In the following two steps the interaction (1,1) occurs twice, therefore M_(1,1) = 2, and so on until the end of the sequence; then we continue with the next sequence.
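The counting procedure can be sketched in a few lines of Python. The names `POLAR_GROUP`, `to_polar_string` and `polarity_matrix` are illustrative assumptions, and the mapping covers only the standard residues named in the convention above (the non-standard amino acids with numeric codes are left out for brevity).

```python
import numpy as np

# Polar group of each residue in the one-letter code, following the text:
# 1 = P+, 2 = P-, 3 = N, 4 = NP. Non-standard residues are omitted here.
POLAR_GROUP = {
    **dict.fromkeys("HRK", 1),
    **dict.fromkeys("DE", 2),
    **dict.fromkeys("GSTCY", 3),
    **dict.fromkeys("ALPVWMFI", 4),
}

def to_polar_string(seq):
    """Translate an amino acid sequence into its polar-group digits."""
    return "".join(str(POLAR_GROUP[aa]) for aa in seq)

def polarity_matrix(sequences):
    """Count adjacent polar-group pairs over all sequences (4x4 matrix M)."""
    m = np.zeros((4, 4), dtype=int)
    for seq in sequences:
        groups = [POLAR_GROUP[aa] for aa in seq]
        # Slide a window of two residues from left to right.
        for left, right in zip(groups, groups[1:]):
            m[left - 1, right - 1] += 1
    return m

m = polarity_matrix(["EEGPKHKDEV"])
```

For the example sequence, the ten residues give nine adjacent pairs, so the entries of `m` sum to 9.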

Polar profile of prebiotic peptides
The M polarity matrix collected all the peptide combinatorial interactions built with the prebiotic computational model. In order to interpret the M matrix, it was normalized to 1 and ordered in a column-vector of 16 positions (Table 11). In this way the column-vector contained the relative polar distribution of the sequences generated by the model. From this column-vector, a graph was drawn with smooth curves for the four scenarios described (Figs. 1, 2).
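The normalization and linearization step can be sketched as follows. Two assumptions are made here that the text does not state explicitly: "normalized to 1" is taken to mean that the 16 entries sum to 1, and the 16 positions are taken in row-major order, (1,1), (1,2), ..., (4,4). The function name `polar_profile` is hypothetical, and the example counts are those of the sequence EEGPKHKDEV discussed in the previous section.

```python
import numpy as np

def polar_profile(m):
    """Normalize a 4x4 count matrix to sum 1 and flatten it to 16 entries."""
    m = np.asarray(m, dtype=float)
    total = m.sum()
    if total == 0:
        raise ValueError("polarity matrix has no counted interactions")
    # Row-major order (assumed): (1,1), (1,2), ..., (4,4).
    return (m / total).reshape(-1)

# Pair counts for the example sequence EEGPKHKDEV (polar string 2234111224).
counts = np.array([[2, 1, 0, 0],
                   [0, 2, 1, 1],
                   [0, 0, 0, 1],
                   [1, 0, 0, 0]])
profile = polar_profile(counts)
```

Under the row-major assumption, position 6 of the vector is the (2,2) interaction between two P- residues.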

Preserved genes
The same number of E. coli, M. jannaschii and S. cerevisiae genes used by Delaye and coworkers (Delaye et al., 2005) was used here, extracted from the KEGG database (Kanehisa et al., 2000) for a previous publication (Polanco et al., 2013).

Past-future profile
The terms "remote past" and "distant future" should be understood as approximations. The past and future profiles result from matrix multiplications and the construction of analytic functions. It is not possible to quantify a time scale, and for that reason the kinetics of dipeptide formation in our simulated scenarios cannot be defined. However, it is possible to affirm that these approximations by analytic functions have enabled us to build a past-future scenario with a time period large enough to be compared with the set of preserved genes (Section Preserved genes).
The exponents or superscripts used in the estimation of the remote past (0.9997, 0.9998, and 0.9999) are not arbitrary. Integer values would have produced extremely high values in the final concentrations. Therefore the selection of the exponents was related to the analytic functions. The superscripts used for the distant future (1, 2, ..., 6) were integers, as the multiplication of the resulting matrices did not induce extreme concentration values.

RESULTS
The analysis of similarities between the A^6 matrix, which represents the future state of the dipeptide composition from Rode's experiment, and the B^6 matrix, obtained by polynomial extrapolation, shows a small difference between the 81 elements (Table 5). The same small difference is observed between the A^1 matrix, representing the initial dipeptide composition from Rode's experiment, and the B^0.9997 matrix built with the discrete dynamic system (Table 6).
Interestingly, the bias by polarity did not alter the polar profile of the peptides significantly (Table 9).In all cases the percentage difference (+/-) between the two distributions with and without bias was not significant.
The curves of all three computational scenarios, either with or without the polarity bias (Figs. 1-2), almost preserved the same maximum and minimum points, despite the fact that the amino acid numbers and participation percentages were different.
The Fox & Harada distribution (Fig. 2, column 6) reveals a maximum for Glu and Asp, as does the Rode distribution (Fig. 2, column 11). Something similar occurs with Gly.
The preserved protein distribution (Section Preserved genes) shows an almost total coincidence with the three scenarios without polar bias (Fig. 1). This is not the case for the scenarios with polar bias (columns 2 and 6; Fig. 2).

DISCUSSION
According to our simulations of short peptide formation, the polarity matrices of the discrete dynamic system based on the Miller, Fox & Harada, and Rode approaches were nearly coincident and converged into the same profile regardless of the bias induced by the polarity; this profile is also consistent with the set of preserved genes (Polanco et al., 2013). From the mathematical point of view, we consider the starting 9 amino acids used in the Rode experiments as a basis (Poole, 2011), i.e. the minimum number of elements in a set capable of generating that set. We do not know if 9 amino acids are in fact the minimum possible to induce the same profile as in the hypothetical peptide formation based on the Miller and Fox & Harada approaches. Nevertheless, they represent 40% of those generated in the Miller experiment and 50% of those in the starting conditions of the Fox & Harada experiment. In this regard, the Rode experiment in itself is important, since it can open the discussion about the minimum number of amino acids capable of generating a prebiotic profile of the proteins.
Our results indicate that the relative abundance of the amino acids is the most influential factor for the sequential characteristics of the "first peptides", as shown by the coincident distributions of the three scenarios, which do not seem to be greatly affected by a polarity bias. This last observation could lead to the modeling of a prebiotic scenario with greater granularity, since it would be possible to prioritize the involved biases and use a hierarchical hidden Markov model (Fine et al., 1998) in which the abundance, in particular, would be a non-visible component and the amino acid profile would be the visible element to be determined. Computer simulations in this direction are in progress, because the mathematical profile of this type of model allows several biases to be considered without increasing the computational complexity.

Figure 1. Linear polar interaction between simulated peptides formed in the Rode, Miller, and Fox & Harada approaches. The 16 columns on the x-axis correspond to the 16 polar interactions from the polarity matrix without polar bias (Table 11).

Figure 2. Same as Fig. 1, taking into account the polar bias during the peptide formation.

Table 1. A^1 matrix profile.
Initial amino acid concentrations allowing dipeptide formation (in mM) (Table 6; Rode, 1999).

Table 2. A^6 matrix profile.
Future trend of peptide linkage composition (in mM) once the A^1 matrix was iterated six times (Section Discrete dynamic system). (na): Linkages not investigated yet. (nf): Linkages analyzed but not found. (tr): Linkages found in traces.

Table 4. B^6 matrix profile.
Future trend of peptide linkage composition (in mM) from the B^0.9997 matrix (Section Discrete dynamic system, Construction of the past). (na): Linkages not investigated yet. (nf): Linkages analyzed but not found. (tr): Linkages found in traces but not measurable.

Table 7. Rode matrix of pre-established values by abundance.
Inverse relative abundances in the B^0.9997 matrix (Section Discrete dynamic system). (na): Linkages not investigated yet. (nf): Linkages analyzed but not found. (tr): Linkages found in traces but not measurable.

Table 8. Polarity composition by lateral chain.
Inverse relative polarities by lateral chain: [P-] polar, [N] neutral, [P+] basic hydrophilic and [NP] non-polar amino acids. A bias: with polar bias; B bias: without polar bias.