Vol. 48 No. 4/2001 893–902 QUARTERLY
Sequence analysis of enzymes with asparaginase activity

Asparaginases catalyze the hydrolysis of asparagine to aspartic acid and ammonia. Enzymes with asparaginase activity play an important role both in the metabolism of all living organisms and in pharmacology. The main goal of this paper is to attempt a classification of all known enzymes with asparaginase activity, based on their amino-acid sequences. Some possible phylogenetic consequences are also discussed using dendrograms and structural information derived from crystallographic studies.

enzymes, the periplasmic proteins, known as type II asparaginases [8], from Escherichia coli (EcAII) and Erwinia chrysanthemi (ErA), have been in clinical use in the treatment of acute lymphoblastic leukemia and some other tumors for more than 30 years [9–12]. Crystallographic studies [13,14] and sequence homology analyses [15] have demonstrated that all known type II asparaginases possess two highly conserved amino-acid motifs (Fig. 2). Asparaginases of this type are homotetramers with four active sites. Each active site is created by amino acids from two monomers, including the amino acids from the conserved motifs [13,15]. In the nomenclature of EcAII, the four subunits are labeled ABCD, and the active-site-competent intimate dimers are composed of subunits AC or BD. It was postulated [13] that the mechanism of the asparaginase reaction could be a variant of the reaction catalyzed by serine proteases, but with a threonine (12 or 89 in the EcAII sequence) in the role of the nucleophilic serine. The other two residues of the putative catalytic triad could be Asp90 and Lys162. The antitumor activity of these enzymes results from their high affinity for the substrate, L-asparagine (Km = 10⁻⁵ M). Depletion of L-asparagine in the circulating pools starves the tumor cells, which have reduced levels of L-asparagine synthesis. Some enzymes with asparaginase activity can also hydrolyze L-glutamine. In the cases where L-glutamine is the better substrate, the enzymes are termed glutaminase-asparaginases (EC 3.5.1.38). Crystallographic studies of glutaminase-asparaginases [16,17] have revealed that they share the same tertiary and quaternary structure with type II asparaginases.
The E. coli genome also encodes a related, cytoplasmic (type I) asparaginase (EcAI). It has the conserved amino-acid motifs found in EcAII, but the quaternary structure of these two enzymes is probably different. EcAI has been reported to function as a homodimer [18], and is characterized by a much lower substrate affinity (Km = 10⁻³ M). The function of the two isoforms in E. coli and in other bacteria is not clear. It is possible that their main physiological function in the cell is different from L-asparagine hydrolysis. Recently, the amino-acid sequences of proteins with unrelated activities have revealed high homology to bacterial asparaginases [19–21]. One of them is mammalian lysophospholipase (EC 3.1.1.5), which is believed to play a major role in the hydrolytic degradation of lysophosphatidylcholine. The N-terminal fragment of the enzyme from rat liver shares 45% identity with EcAI. The N-terminal domain is followed by a leucine-zipper motif and a C-terminal domain that comprises two ankyrin repeats [19].
Residues in red are involved in catalysis. In PROSITE [56] notation, -x(2)- stands for -x-x- and a bracketed position shows the variability at this site.
Apart from the lysophospholipase activity, the protein also possesses asparaginase and platelet-activating acetylhydrolase activities. It has been postulated that the same active center is involved in all these reactions [19]. The crystallographic structure is not known yet, but biochemical data indicate that the protein (both isolated and recombinant) is active in monomeric form [19].
In archaea, the synthesis of Gln-tRNA(Gln) can proceed via an unusual pathway that is an alternative to the standard aminoacylation mechanism [20]. This pathway involves mischarging of tRNA(Gln) with the noncognate amino acid Glu, with subsequent modification of the tRNA-bound glutamic acid in a transamidation reaction. This second process is catalyzed in archaea by a two-domain amidotransferase (Glu-AdT) [21,22]. The α subunit of these enzymes shares about 20% identity and 30% similarity with EcAI. The α subunit probably provides the amino group to modify the tRNA-bound glutamic acid. The enzyme can use both L-glutamine and L-asparagine as amide donors [21]. The involvement of the amino-acid-metabolizing asparaginase in early protein biosynthesis machinery, as well as the fact that in some organisms tRNA-dependent transamidation may be the sole biosynthetic route to asparagine, has led to an interesting implication of Asp and Glu as the earliest amino acids and lent support to the idea that amino-acid metabolism had an important role in the evolution of the protein synthesis system [21,23,24].
The reaction of L-asparagine hydrolysis in plants is catalyzed by a different class of enzymes, plant-type asparaginases, with no homology to the bacterial-type enzymes [25–31]. The most studied enzymes in this class are from legume plants and are involved in metabolic pathways connected with the assimilation of atmospheric nitrogen [28,30,31]. These enzymes are still awaiting their crystallographic characterization, but the existing biochemical data and sequence homology indicate that they belong to the family of N-terminal nucleophile (Ntn) amidohydrolases [28,30–36]. The affinity for L-asparagine is low (Km = 10⁻² M), but the high concentration of L-asparagine (40 mM) presented to the enzyme when its activity is required guarantees efficient processing [37]. The asparaginases from plants show about 60% sequence similarity to aspartylglucosaminidases (glycosylasparaginases). Aspartylglucosaminidases (EC 3.5.1.26) in humans catalyze the last stage of degradation of glycosylated proteins, i.e. the hydrolysis of the glycosidic bond between the sugar chains and the L-asparagine side chain [38]. They can also act as asparaginases, albeit with rather low substrate affinity (Km = 10⁻³ M) [39,40]. The crystallographic structures determined for aspartylglucosaminidases from Homo sapiens [41] and Flavobacterium meningosepticum [42–44] show them to be heterotetramers created by autoproteolytic cleavage of two precursor protein chains. The first residue (Thr, Ser, or Cys) of the C-terminal (β) domain, liberated in the autocatalytic process, serves as the nucleophilic agent. The nucleophilic character of the oxygen or sulfur atom is increased by the free amino group of the same amino acid [34]. It is intriguing that the E. coli genome contains a sequence, deposited in the genomic databases as an open reading frame (ORF) with unknown function (NCBI access code: AAC73915) [45], that shows very high homology to plant-type asparaginases and glycosylasparaginases. We have shown earlier that the product of this ybiK gene is a functional protein and described some of its properties, demonstrating, for example, that it is related to aspartylglucosaminidases and plant-type asparaginases [46]. A similar protein from Salmonella enterica serovar Typhimurium that has recently been expressed and purified also reveals the characteristic properties of an Ntn hydrolase. It is interesting that this protein has no detectable aspartylglucosaminidase activity but instead has been reported to have isoasparagine aminopeptidase activity [33].
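The remark above that a 40 mM substrate concentration compensates for a Km of about 10⁻² M can be made concrete with a short calculation. It is only a sketch: it assumes simple Michaelis-Menten kinetics (an assumption, not stated in the text) and takes the quoted order-of-magnitude Km at face value, i.e. Km = 10 mM and [S] = 40 mM.

```latex
\frac{v}{V_{\max}} = \frac{[S]}{K_m + [S]}
                   = \frac{40\ \mathrm{mM}}{10\ \mathrm{mM} + 40\ \mathrm{mM}}
                   = 0.8
```

Under these assumptions the enzyme would operate at roughly 80% of its maximal rate, which is consistent with the statement that processing is efficient despite the low affinity.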
The third class of asparaginase sequences is typified by the thermolabile asparaginase from Rhizobium etli (ReA), with no homology to other proteins with known function [47]. Rhizobium etli uses the enzyme in reactions that are a source of carbon and nitrogen in its metabolism.

MATERIALS AND METHODS
The amino-acid sequences of the three Escherichia coli asparaginases encoded by the ansA, ansB, and ybiK genes (NCBI access codes: AAC74837, AAC75994, and AAC73915, respectively) and the ReA sequence encoded by the ansA gene (NCBI access code: AAF00929) were used as probes for searching GenBank [48] with BLAST [49,50]. One file was created for those sequences that showed similarity to the E. coli sequences EcAI or EcAII (products of the ansA or ansB genes, respectively). A separate file held the sequences identified by the EcAIII probe (product of the ybiK gene) or by the ReA probe. Only complete sequences were used for alignments. Multiple sequence alignments for each file were performed using CLUSTALX [51]. Subsequently, phylogenetic trees were created using the neighbor-joining method [52] for the bacterial- and plant-type asparaginases. The sequences that showed similarity to Rhizobium etli asparaginase were too few for a reliable phylogenetic tree. Finally, the trees were drawn using the TreeView program [53].
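The neighbor-joining step named above can be illustrated with a minimal pure-Python sketch. It is not the CLUSTALX implementation used in this work; the function name `neighbor_joining`, the tuple-keyed distance dictionary, and the four-taxon toy distances are all hypothetical and chosen only for illustration. The sketch repeatedly joins the pair of nodes minimising the standard criterion Q(i, j) = (n - 2)·d(i, j) - r(i) - r(j) and returns the unrooted tree in Newick format.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor joining and return it as Newick text.

    names: list of at least three unique taxon labels.
    dist:  dict mapping (label_a, label_b) tuples to pairwise distances.
    """
    # Store distances under unordered keys so lookup order does not matter.
    d = {frozenset(pair): value for pair, value in dist.items()}
    nodes = list(names)
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence: sum of distances from each node to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Choose the pair with the smallest Q value.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        # Branch lengths from the new internal node to i and to j.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        joined = f"({i}:{li:.3f},{j}:{lj:.3f})"
        rest = [k for k in nodes if k not in (i, j)]
        # Distances from the new node to every remaining node.
        for k in rest:
            d[frozenset((joined, k))] = 0.5 * (
                d[frozenset((i, k))] + d[frozenset((j, k))] - dij)
        nodes = rest + [joined]
    # Three nodes remain: connect them to a single central node.
    a, b, c = nodes
    dab, dac, dbc = (d[frozenset(p)] for p in ((a, b), (a, c), (b, c)))
    la = (dab + dac - dbc) / 2
    lb = (dab + dbc - dac) / 2
    lc = (dac + dbc - dab) / 2
    return f"({a}:{la:.3f},{b}:{lb:.3f},{c}:{lc:.3f});"


# Toy distances (hypothetical, not taken from the asparaginase alignments).
toy = {("A", "B"): 3, ("A", "C"): 5, ("A", "D"): 6,
       ("B", "C"): 6, ("B", "D"): 7, ("C", "D"): 3}
print(neighbor_joining(["A", "B", "C", "D"], toy))
```

In practice the distances would be derived from the multiple alignment, and the resulting Newick text is the kind of input a tree-drawing program expects. Note that neighbor joining can produce negative branch lengths on noisy data, which this sketch does not correct.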

Bacterial-type asparaginases
All analyzed sequences show absolute conservation of the two threonine-containing motifs (Fig. 2) and of the lysine residue corresponding to EcAII Lys162 (vide supra). The dendrogram for the bacterial-type asparaginases (Fig. 3) is in good agreement with the tree of life [54] and is clearly divided into two main parts. One of them comprises mainly sequences from archaea and eukaryota, while the other one includes eubacterial asparaginases. In the archaeal part of the dendrogram, the α subunit of the Glu-tRNA-modifying amidotransferases forms one subset. Another group of proteins, lysophospholipases, also forms a discernible subset. It is interesting that differences in quaternary structure can be correlated with the results of the sequence alignments. In all the enzymes labeled in green in Fig. 3, but not in the other sequences, there is a characteristic extra insertion, about 25 amino acids long and with some sequence conservation, which follows the position of residue 42 in EcAII. Interestingly, all those "green" sequences consistently correspond to eukaryotic organisms, even though the branch where they are located on the dendrogram contains bacterial species as well. Residue 42 in EcAII marks the end of a long, flexible loop that in all type II enzymes creates the environment of the active site. The presence of a longer loop in lysophospholipases explains why these enzymes can catalyze a different reaction and why they are active as monomers. For archaeal amidotransferases, the alignment reveals not only the presence of an additional archaeal motif at the N-terminus reported earlier [21], but also a short motif after residue 151 (EcAII sequence). In EcAII, residue 151 is the first amino acid of a β-hairpin loop that is located at the interface between monomers A and B, and that seems to play a role in tetramer stability.
The second branch of the dendrogram comprises mainly eubacterial sequences known as type II asparaginases. Typically, these enzymes are secreted into the periplasm, in contrast to type I bacterial L-asparaginases, which are cytosolic. Yeasts, such as Saccharomyces cerevisiae and Schizosaccharomyces pombe, also produce two biologically and genetically distinct L-asparaginases. One of these is a cell-wall glycoprotein, while the other is a constitutively expressed cellular enzyme. Their amino-acid sequences clearly classify both these enzymes in the bacterial type II branch (Fig. 3), in line with the earlier suggestion that the asparaginase genes were independently duplicated in prokaryota and eukaryota after the divergence of these lineages [15].

Plant-type asparaginases
All sequences in this group show an absolute conservation of the catalytic threonine residue [40–43]. The dendrogram for plant-type asparaginases (Fig. 4) is clearly divided into four branches. The aspartylglucosaminidase sequences form one of the branches. The enzymes from plants and their close homologs form another branch. The product of the E. coli ybiK gene (EcAIII) is also placed on this branch, supporting the hypothesis based on biochemical data that it is a close relative of plant asparaginases [55]. The third branch comprises mainly archaeal sequences with unknown function, while the fourth one includes sequences from eukaryota for which no function is known either. It is interesting that sequences from Homo sapiens, Drosophila melanogaster, and Arabidopsis thaliana are found on all branches, except the archaeal branch, sometimes in more than one example. This suggests a possible fundamental role of this class of proteins in the metabolism of higher organisms, especially since there are only very few eubacterial sequences in this dendrogram.

Fig. 3 legend: Orange branch: type I bacterial asparaginases; blue branch: type II bacterial asparaginases. Lettering colors: blue: sequences with high homology to EcAII; brown: sequences with high homology to EcAI; green: sequences with high homology to lysophospholipases. Yellow box: subunit α of Glu-AdT amidotransferases; green oval: lysophospholipases; pink oval: EcAI, reported to be active in dimeric form [18]; blue ovals: tetrameric bacterial type II asparaginases with antitumor activity. gi is a unique sequence identifier assigned by the NCBI [48]. The scale bar represents 10% divergence.

Fig. 4 legend: Violet branch: aspartylglucosaminidases. Orange branch: predominantly archaeal plant-type asparaginases. Brown branch: eukaryotic sequences with homology to plant asparaginases but of unknown biochemical characteristics. Green branch: biochemically characterized (green ovals) asparaginases from plants and their homologs in other organisms. Pink ovals: biochemically characterized aspartylglucosaminidases; yellow oval: the product of the E. coli ybiK gene with asparaginase activity.

Sequences with homology to Rhizobium etli asparaginase
The alignment of eight sequences similar to Rhizobium etli asparaginase (Table 1) reveals the presence of a highly conserved motif, NCSGKH. Since this is the only fragment with high conservation, it could be implicated in the catalytic mechanism. It is interesting to note that no homologs of ReA are present in many of the genomes that have been sequenced to date.
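Locating a conserved motif such as NCSGKH in a set of sequences can be sketched in a few lines of Python. The example below is illustrative only: it translates a simplified subset of PROSITE-style notation (single residues, `x` wildcards, bracketed alternatives, and fixed repeats such as `x(2)`) into a regular expression and reports 1-based match positions. The function names `prosite_to_regex` and `find_motif` and the toy sequences are hypothetical; the sequences are invented and are not the ReA homologs analyzed here.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regex string.

    Supported elements (separated by '-'): a residue letter, 'x' for any
    residue, a bracketed set such as [ST], each optionally followed by a
    fixed repeat count such as x(2). Other PROSITE syntax is not handled.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        token = "." if residue == "x" else residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def find_motif(pattern, sequences):
    """Return {name: [1-based start positions]} for each sequence."""
    regex = re.compile(prosite_to_regex(pattern))
    return {name: [m.start() + 1 for m in regex.finditer(seq)]
            for name, seq in sequences.items()}


# Invented toy sequences, used only to exercise the functions.
toy_sequences = {
    "toy_seq_1": "MKTANCSGKHLLV",
    "toy_seq_2": "MNCSGKHAA",
    "toy_seq_3": "MKTAAGGHLL",
}
print(find_motif("N-C-S-G-K-H", toy_sequences))
```

A motif with variable positions would be written in the same notation, for example a made-up pattern such as `T-x(2)-[ST]-G`, which the converter turns into `T.{2}[ST]G`.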

General classification
Based on the above alignments, as well as on biochemical and crystallographic data, the known asparaginase sequences can be divided into three families (Fig. 5). The first family corresponds to bacterial-type asparaginases, the second to plant-type asparaginases, and the third one to enzymes similar to Rhizobium etli asparaginase. The enzymes included in this classification catalyze a number of different enzymatic reactions, including reactions connected with protein biosynthesis [21].

Table 1. Sequence identity (red) and similarity (blue) for the Rhizobium etli asparaginase family