Vol. 47 No. 1/2000 11–22 QUARTERLY

8-Hydroxy-2-[2-(3-hydroxy-4-methoxyphenyl)ethenyl]-7-quinoline carboxylic acid and 8-hydroxy-2-[2-(3-methoxy-4-hydroxyphenyl)ethenyl]-7-quinoline carboxylic acid inhibit the processing and strand transfer reactions catalyzed by HIV-1 integrase with an IC50 of 2 µM. Some of their spectral properties are briefly reported. Their fluorescence is too weak to be of use in an experimental determination of binding to the protein, so we resorted to computer simulation. Both styrylquinoline derivatives, in their monoanionic form, have several dozen tautomers, and each of these forms has four planar rotamers. In this work computer simulations have been performed to determine which tautomer is the most abundant in aqueous solution and which binds to the Rous sarcoma virus (RSV) integrase catalytic core. As the substituents on the quinoline moiety are the same as on salicylic acid, the energies of hydroxy benzoic acid tautomers were also computed both in vacuo and embedded in a continuous medium with the dielectric constant of bulk water, using the recent CPCM technique. The CPCM method was then applied to the two integrase inhibitors to estimate the tautomer population in water. The binding site of the compounds on the RSV integrase catalytic core was determined through a docking protocol that couples a grid search method with full energy minimization. The method designed here offers a route to the identification of potent integrase inhibitors through in silico experiments.

AIDS is essentially a viral disease and should be treated by antiretroviral agents (Ho et al., 1995). The advent of combination antiretroviral therapy has made it possible to suppress the replication of HIV-1 in infected persons to such an extent that the virus becomes undetectable in the plasma for more than two years, although it seems to persist in peripheral-blood mononuclear cells. This means that HIV-1 infection can be controlled but not eradicated by current treatments (Zhang et al., 1999; Furtado et al., 1999). It is therefore important to search for agents that could block the virus at those steps of its replicative cycle which are not affected by current treatments. From this standpoint, HIV DNA integration into the genomic DNA of host cells, a crucial step in the replicative life cycle of the virus, constitutes a particularly attractive target for AIDS chemotherapy drugs, including potential synergy with currently available HIV reverse transcriptase and protease inhibitors (De Clercq, 1995; Pommier & Neamati, 1999). In HIV-1 and other retroviruses such as RSV, integration is mediated by a viral enzyme, integrase, which is necessary for the production of progeny viruses (LaFemina et al., 1992). Integration proceeds in two steps, each consisting of the nucleophilic attack of a phosphoester bond by the lone pair of a hydroxyl group (Engelman et al., 1991). In the first step, called 3′ processing, the enzyme catalyzes the hydrolysis of a phosphoester bond, which removes two nucleotides from each viral DNA strand and forms free 3′-OH groups at a conserved CA dinucleotide sequence. In the second step, called strand transfer, these groups act as nucleophiles and attack phosphoester bonds in opposite strands of the genomic DNA separated by five base pairs in HIV-1 and four base pairs in RSV (reviews: Brown, 1990; Katz & Skalka, 1994).
The HIV-1 integrase produced in an Escherichia coli expression system can carry out both 3′ processing and strand transfer in vitro in the presence of divalent cations such as Mg++ and Mn++ (Bushman & Craigie, 1991). It can also, in the same ionic environment, carry out the apparent reversal of the transfer step (disintegration) in the presence of a synthetic Y-shaped oligonucleotide (Chow et al., 1992). Only the catalytic core, IN(50-212), comprising residues 50 to 212, rather than the entire protein, is required for the disintegration step, which shows that the core contains the enzyme active site. Site-directed mutagenesis experiments showed that catalytic activity is abolished by the substitution of any of the three absolutely conserved carboxylate residues: two aspartic acid residues, D64 and D116, and a glutamic acid residue, E152 (the so-called D,D-35E motif). The same arrangement of catalytically essential carboxylates is conserved not only in all retroviral integrases but also in enzymes that catalyze cognate reactions, such as prokaryotic transposases. This observation strongly suggested that their spatial conformation should be similar to that of the corresponding amino acids D64, D121 and E157 in the RSV integrase catalytic core. Although this seemed to be disproved by the first crystal structures of the HIV-1 catalytic core (Dyda et al., 1994), it was finally confirmed by two independent groups (Maignan et al., 1998; Goldgur et al., 1998). The divalent cation is in the immediate vicinity of the essential carboxylates, which is a common feature among enzymes in nucleic acid biochemistry (Cowan, 1998). Some years ago, we designed molecules that could bridge two Mg++ cations because, at the time, in the absence of crystallographic data, the integrase active site was thought to have a conformation similar to that of the β polymerase Klenow fragment (Beese & Steitz, 1991). These compounds had two moieties: the quinoline part could bind a cation through its 7-carboxyl group or through the region between N1
and O8, since 8-hydroxyquinoline is known to chelate bivalent and trivalent metal ions (Bardez et al., 1997). The hydroxyl groups that reside on the phenyl ring could bind a second cation. Indeed, styrylquinoline derivatives having a catechol or a resorcinol moiety have been shown to be potent HIV-1 integrase inhibitors in vitro, with IC50 values in the 1 µM range (Mekouar et al., 1998). Moreover, these compounds displayed antiviral activity in a de novo infection assay of CEM4fx cells and therefore represent an exciting new lead for the design of new anti-HIV drugs. These styrylquinoline derivatives were likely to target the integrase catalytic core, since they acted as inhibitors in the disintegration assay performed with the active site-containing deletion mutant (Mekouar et al., 1998). However, a more accurate knowledge of the molecular interactions between these compounds and their presumed target was needed to further improve their pharmacological properties. Consequently, Ouali et al. (2000) determined the thermodynamic and geometric parameters of their binding to the catalytic core of RSV integrase by computationally intensive simulation. Each of the 14 styrylquinoline derivatives was systematically rotated and translated relative to the crystallographic coordinates of the RSV catalytic core. The rigid-body displacements that gave a good overlap between the drug and protein surfaces were stored. The 1100 best conformations were then minimized to take partial account of the flexibility of the ligand and of the protein. This method allowed us to predict that the most likely binding sites of these 14 styrylquinoline derivatives were in the immediate vicinity of the Mg++ cation that is essential for the activity of the enzyme and that is reported in the crystallographic data (Bujacz et al., 1995). The same could be done for the various tautomeric forms. It was shown that the keto form (7-CO2H quinoline 8-hydroxylate) had a more favorable interaction energy than
the enol form (8-OH quinoline 7-carboxylate). Moreover, we showed that the interaction energy between the drugs and the RSV integrase catalytic core calculated for the best-fitted models correlates perfectly with biological data. Finally, and surprisingly, the catechol moiety was much less important than the quinoline part for the binding. This suggested that substituting one of the hydroxy groups by a methoxy group should have only a minor effect on the binding. This work was undertaken to check this hypothesis, and we present here the results for two styrylquinoline derivatives containing methoxy substitutions. The inhibitory power of a tautomer depends on two conditions: the active form should be available in water and should have the greatest affinity for the protein. How much of the active form is available can be determined if solvent effects are taken into account in the computation of the electronic structure. Usually this is done by placing the solute molecule in a cavity surrounded by a polarizable continuum whose reaction field modifies the energy and the properties of the solute. We used here the conductor-like screening model (COSMO) first proposed by Klamt & Schürmann (1993) and now available as the conductor type polarisable continuum model (CPCM) option in the GAUSSIAN98 package (Frisch et al., 1998; Barone & Cossi, 1998; Rega et al., 1999). In this implementation, each atom of the solute is surrounded by a spherical cavity with a conductive boundary. Water is assumed to be a conductor. Hydrogens are not treated as independent atoms, but are united with adjacent heavier atoms. The radius of each atom is fitted to reproduce the solvation energies at a computational level (Hartree-Fock with a medium size basis set) that allows the study of relatively large systems (Barone et al., 1997). This approach should yield an estimate of the availability of the various tautomers in water. Moreover, it is possible to classify the inhibitory power by docking the drugs to the RSV integrase catalytic
core.

Modelling of protein structure. Before making any docking attempt, it was necessary to know the conformation of the catalytic core. First, the structure of the catalytic core of RSV integrase published by Bujacz et al. (1995) was minimized with our minimizer (Le Bret et al., 1991) and the AMBER force field (Cornell et al., 1995) to remove any close contact. The atoms missing from the crystallographic coordinate file were set using the XRAY option in the EDIT module of AMBER. The total coulombic charge of the catalytic core (with the divalent cation) was +10 proton charges. The effect of the minimization on the RSV integrase catalytic core was very small.

Docking algorithm. The docking method used in this study has been incorporated into our program MORCAD (Le Bret et al., 1991). It consists of several steps. First, a grid search is performed by rotating (15° increments) and translating (0.916 Å increments) the drug while holding the protein rigid, as described by Katchalski-Katzir et al. (1992). A score function that takes into account the surface overlap and the electrostatic interaction (Rogers & Sternberg, 1984; Gabb et al., 1997) is computed for each position. The 1100 best conformations are then minimized using a quasi-Newton algorithm (Le Bret et al., 1991) to allow some flexibility. The force field combines the AMBER force field (Cornell et al., 1995) for amino acids, the Aqvist (1990) parameters for the Mg++ divalent cation and a sigmoidal dielectric constant to mimic the aqueous medium (Lavery et al., 1986). The parameters for the drugs were taken from those of the unmethylated compounds (Ouali et al., 2000). A 10 Å cut-off was used for the nonbonded interactions during minimization. Once minimization of the whole system (protein + drug) was completed, the interaction energy was re-evaluated without the cut-off. The drug-catalytic core interaction energy is the sum of all interactions between any drug atom and any catalytic core atom. Such grid searches have the advantage of getting to the neighborhood
of the correct solution (Shoichet, 1996). The full cycle requires 24 h of SGI R10000 195 MHz processor time per tautomer.

Quantum mechanical calculations. The structure of the drugs was optimized with the ab initio program GAUSSIAN98 (Frisch et al., 1998) at the restricted Hartree-Fock level, using the 6-31G(d) basis. The electrostatic charges were calculated by the Merz-Kollman procedure (Besler et al., 1990) for each tautomer and each rotamer for use in the docking procedure. The Gibbs free energy difference was computed at 298.27 K using the Freq option of GAUSSIAN98.
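The rigid grid-search stage of the docking protocol described above can be illustrated with a minimal sketch. It is not the MORCAD implementation: the scoring function `toy_score`, its distance thresholds and all function names are hypothetical stand-ins for the surface-overlap and electrostatic score used in the paper, and the three rotation angles are each enumerated over a full turn for simplicity, so some orientations are visited more than once. Only the step sizes (15°, 0.916 Å) and the number of retained poses (1100) are taken from the text.

```python
import heapq
import itertools
import math


def rotate(point, a, b, c):
    """Rotate a 3D point about z by a, then y by b, then x by c (radians)."""
    x, y, z = point
    x, y = x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a)
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    y, z = y * math.cos(c) - z * math.sin(c), y * math.sin(c) + z * math.cos(c)
    return (x, y, z)


def toy_score(ligand, protein, clash=3.0, contact=5.0):
    """Hypothetical score: +1 per atom pair in contact, -10 per clashing pair."""
    score = 0
    for lig_atom in ligand:
        for prot_atom in protein:
            d = math.dist(lig_atom, prot_atom)
            if d < clash:
                score -= 10
            elif d <= contact:
                score += 1
    return score


def grid_search(ligand, protein, angle_step=15.0, shift_step=0.916,
                n_shifts=3, keep=1100):
    """Score every rigid pose on the grid and return the `keep` best ones.

    The protein is held fixed; the ligand is rotated and translated.
    Each result is (score, (a, b, c), (dx, dy, dz)).
    """
    n_angles = int(round(360.0 / angle_step))
    angles = [math.radians(k * angle_step) for k in range(n_angles)]
    shifts = [k * shift_step for k in range(-n_shifts, n_shifts + 1)]

    def poses():
        for a, b, c in itertools.product(angles, repeat=3):
            rotated = [rotate(p, a, b, c) for p in ligand]
            for dx, dy, dz in itertools.product(shifts, repeat=3):
                moved = [(x + dx, y + dy, z + dz) for x, y, z in rotated]
                yield toy_score(moved, protein), (a, b, c), (dx, dy, dz)

    # Keep only the best-scoring poses; these would then be energy-minimized.
    return heapq.nlargest(keep, poses(), key=lambda pose: pose[0])
```

In the protocol of the paper, the retained poses are subsequently minimized with a flexible ligand and protein; that stage is not sketched here.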

Spectroscopic properties
The absorption spectrum of drug 1 is shown in Fig. 2. Drug 1 is weakly fluorescent in 0.1 M Tris buffer. The solution is not homogeneous, since the emission spectrum depends on the excitation wavelength. For instance, there is no detectable fluorescence when drug 1 is excited at 550 nm, but a very weak fluorescence can be measured when it is excited at 350 nm. When the compound is left in the aqueous buffer, its absorption spectrum varies by 10% during the first hour and stabilizes after a day. These kinetics are rather slow with respect to the time required to measure the IC50. The weak fluorescence vanishes with time. Evidently the drug is degraded. The degradation might be related to that of stilbene: the major products in the photolysis of trans-stilbene in the near ultraviolet are trans- and cis-stilbene. In the case of cis-stilbene photolysis the major products are trans- and cis-stilbene, but dihydrophenanthrene can also be formed (Waldeck, 1991). No fluorescence of drug 2 could be detected when it was directly dissolved in Tris buffer. Because of its low solubility, the drug was first dissolved in dimethylsulfoxide, then in Tris buffer. In contrast to drug 1, no degradation of drug 2 was observed after it had stayed for 24 h in 0.1 M Tris buffer. Both drugs are stable when dissolved in Tris buffer, ethanol or dimethylsulfoxide. Both compounds were fluorescent in ethanol, a solvent in which the solute can be quantitatively measured at 10⁻⁵ M concentration. The emission maximum occurred at 545 nm for drug 1 and 520 nm for drug 2 when they were excited at 350 nm. These fluorescence yields are too small for the determination of binding constants, and we resort below to computer simulations. Divalent cations such as Ca++ or Mg++ at 10 mM concentration have no detectable effect. In contrast, Zn++, Mn++ and Fe++ completely quench the fluorescence and modify the absorption spectrum in a way that is comparable to the effect of aging.

CPCM calculations on hydroxy benzoic acids
In our styrylquinoline derivatives, the carboxy and hydroxy substituents lie at adjacent positions R7 and R8, exactly as in salicylic acid. It has long been known that in salicylic acid the proximity of the hydroxyl and carboxyl substituents favors the formation of a chelate ring, in which a divalent hydrogen atom acts as the link (Branch & Yabroff, 1934). Hence, both pKs of salicylic acid differ significantly from those of meta and para hydroxy benzoic acids (Dunn & McDonald, 1969; Lange, 1967). It is generally accepted that the enol form of salicylic acid prevails both in ground and excited states (Lahmani & Zehnacker-Rentien, 1997; Sobolewski & Domcke, 1998).

Figure 2. Molar extinction coefficients of a fresh solution (10 mg/l) of 1 (curve 1) and 24 h later (curve 2). The molar extinction coefficient of 2 in the same conditions did not vary (curve 3).

It was interesting to estimate the validity of this model in our styrylquinoline derivatives (see Fig. 1) and to compute the energy and the free energies both in vacuo and in water in a case where the pKs are experimentally known. If the computations were correct, the experimental pKs of the two compounds should follow the expression:

$$\mathrm{p}K = \frac{\Delta G}{RT \ln 10}$$

where RT is the thermal energy (R being the gas constant and T the absolute temperature), and ΔG is the free energy difference between the protonated and unprotonated forms of a compound. To predict a pK to within one unit, the error in the energy computations must be less than 1.4 kcal/mol. Such accuracy is still a challenge for the current state of the art, and we would be very satisfied if we could rank the conformers and tautomers correctly.
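The 1.4 kcal/mol figure quoted above can be checked with a short calculation. The sketch below assumes the standard thermodynamic relation ΔG = RT ln(10) · pK; the function name `pk_units` and the default temperature of 298.15 K are illustrative choices, not taken from the paper.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 1.987204e-3


def pk_units(delta_g_kcal, temperature=298.15):
    """Convert a free energy difference (kcal/mol) into pK units,
    assuming pK = delta_G / (R T ln 10)."""
    return delta_g_kcal / (R_KCAL * temperature * math.log(10))


# One pK unit corresponds to roughly 1.4 kcal/mol near room temperature,
# so an energy error of that size shifts the predicted pK by about one unit.
shift = pk_units(1.4)
```

Because the conversion is linear, an uncertainty of a few kcal/mol in a computed free energy already translates into several pK units, which is why only the ranking of forms is expected to be reliable.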
If we exclude the case in which both carboxylic oxygens are protonated, the ortho and meta hydroxy benzoic acids have 15 forms: 8 neutral forms, 6 with a charge of -1 and one with a charge of -2 (cf. Fig. 3). Because of its symmetry, the para compound has only 8 forms. Table 1 shows the results. From the in vacuo Hartree-Fock results, the ortho keto form 00. is predicted to be more favorable than the enol form .0, which is not what is observed experimentally in aqueous solution. When electron correlation is treated more accurately with B3LYP/6-311G(2d,p), the keto form is still more favorable than the enol form by 1.3 kcal/mol (result not shown in Table 1). When solvent effects are included with CPCM, the enol form .0 becomes more favorable than the keto form 00. (see Table 1). At this level, however, the meta compound is more acidic than the ortho compound, in contrast to experimental data. We conclude that CPCM does not give good results when it is used with B3LYP. When CPCM is used at the HF/6-31+G(d) level, the ΔG correlates poorly with the experimental pKs, but the ordering is correct, and this is a significant improvement. Finally, we note that the value ΔG = 264.9 kcal/mol for water protonation is probably underestimated. It should be greater than 277.2 kcal/mol if all hydroxybenzoic acids are to have a charge of -1 when dissolved in water, and less than half of 564.9 kcal/mol (282.4 kcal/mol) to prevent them from having a charge of -2.

Calculations on the styrylquinoline derivatives
Tables 2 and 3 show the results concerning drugs 1 and 2, respectively. The keto forms are favored in vacuo, but the enol forms prevail in the CPCM computations.
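To turn computed free energies into an estimate of which tautomers are available in water, one can weight each form by its Boltzmann factor. The sketch below is a generic illustration of that step, not the authors' code; the function name, the default temperature and the example energies (a hypothetical 1.4 kcal/mol gap between two forms) are assumptions and are not values from Tables 2 and 3.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 1.987204e-3


def boltzmann_populations(free_energies, temperature=298.15):
    """Return the equilibrium fraction of each form.

    free_energies maps a form label to its free energy in kcal/mol;
    only differences between the values matter.
    """
    rt = R_KCAL * temperature
    g_min = min(free_energies.values())
    # Subtract the minimum first so that the exponentials stay well scaled.
    weights = {name: math.exp(-(g - g_min) / rt)
               for name, g in free_energies.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical example: a form lying 1.4 kcal/mol above the most stable one
# is populated roughly ten times less.
populations = boltzmann_populations({"enol": 0.0, "keto": 1.4})
```

With gaps of several kcal/mol between tautomers, the minor form can fall well below one percent of the population, which is what makes the availability of the tightly binding form in water a separate question from its affinity.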
The docking results (Tables 2 and 3) may be summarized as follows:
• Both drugs bind close to the Mg++ divalent cation that is embedded in the crystallographic structure in the immediate vicinity of the D,D-35E motif. This remains valid whatever the tautomer or rotamer form. The atoms of the drug that are close to the cation are indicated in the Tables. In most cases, the drugs bind by the quinoline moiety. The keto form is favored.
• The drugs seem to bind either in a "horizontal" slit as in Fig. 4 or "vertically" as in Fig. 5. This is shown by the characters h or v in Tables 2 and 3. When it binds horizontally, the drug separates the two aspartic acids from the glutamic residue of the D,D-35E motif on the catalytic core and touches the following residues: D64, N122, N149, Q153, N160. In the vertical binding site the drug touches the following residues: D64, Q153, E157, N160, W176, R195 and V196.
• The drug, which was planar in the rigid docking phase, becomes slightly distorted after minimization. The distortion is always small, so only the final values of the angle α are indicated; they are always very close to their starting values (0° or 180°).

Figure caption (conformation coding). The conformations are coded by two binary numbers separated by a dot. If the substituent is not protonated the number is replaced by a blank. The first number concerns CO2H. The low bit of the first number is set to 1 if the oxygen that carries the proton is close to the hydroxyl group and is set to 0 otherwise. The high bit is set to 1 if the proton is close to the ring. The second number concerns the hydroxyl group. A keto form is noted by a blank. If the proton points to the carboxyl group the bit is set to 0.

Table 1 (column headings and legend). Columns: in vacuo; in water (CPCM); CO2 dihedral (degrees). The rotamer conformations are encoded as described in Fig. 2. In vacuo ΔEs are the optimised ab initio RHF/6-31G(d) energies. In vacuo ΔGs contain the thermal correction to the Gibbs energy multiplied by the ad hoc factor 0.9. The CPCM ΔGs contain the latter thermal correction and the optimised CPCM total free energy (with non-electrostatic terms) computed at either the HF/6-31+G(d) or the B3LYP/6-311G(2d,p) level in a medium of dielectric constant equal to 78.39. The last line reports the deprotonation energies of H3O+ in vacuo and in water.
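The two-field conformation code used in the text (for example 00. for the ortho keto form and .0 for the enol form) can be unpacked mechanically. The following sketch is an illustration only: the function name and dictionary keys are hypothetical, and it assumes that the high bit of the carboxyl field is written first, which the figure caption does not state explicitly.

```python
def decode_rotamer(code):
    """Decode a code of the form '<carboxyl bits>.<hydroxyl bit>'.

    An empty (blank) field means that the substituent is not protonated.
    Assumption: the carboxyl field is written high bit first.
    """
    carboxyl, dot, hydroxyl = code.partition(".")
    if dot != ".":
        raise ValueError("expected '<carboxyl bits>.<hydroxyl bit>'")
    carboxyl, hydroxyl = carboxyl.strip(), hydroxyl.strip()
    if carboxyl and (len(carboxyl) != 2 or set(carboxyl) - {"0", "1"}):
        raise ValueError("carboxyl field must be blank or two bits")
    if hydroxyl and hydroxyl not in ("0", "1"):
        raise ValueError("hydroxyl field must be blank or one bit")

    info = {
        "carboxyl_protonated": bool(carboxyl),
        "hydroxyl_protonated": bool(hydroxyl),
    }
    if carboxyl:
        # High bit: the carboxyl proton is close to the ring.
        info["carboxyl_proton_near_ring"] = carboxyl[0] == "1"
        # Low bit: the protonated oxygen is close to the hydroxyl group.
        info["protonated_oxygen_near_hydroxyl"] = carboxyl[1] == "1"
    if hydroxyl:
        # Bit 0: the hydroxyl proton points to the carboxyl group.
        info["hydroxyl_proton_points_to_carboxyl"] = hydroxyl == "0"
    return info
```

For instance, `decode_rotamer("00.")` reports a protonated carboxyl group and an unprotonated hydroxyl oxygen (the keto form), whereas `decode_rotamer(".0")` reports a carboxylate with the hydroxyl proton pointing towards it (the enol form).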

DISCUSSION - CONCLUSIONS
Integrase inhibitors of possible medical use should have IC50 values below 10⁻⁹ M. The best drugs in the styrylquinoline series have IC50 values slightly smaller than 10⁻⁶ M, and the IC50 of the inhibitors we have studied here is 2 × 10⁻⁶ M. It is also necessary to gain more insight into the conformation of the drug in solution and bound to its presumed target. This is rather simple in the case of many compounds. In the styrylquinoline series it is far from trivial because of the many possible tautomeric forms. The study of the tautomers in vacuo is only a first approach, which should be supported by computations such as CPCM in which solvent effects are taken into account. Even if the method is not reliable enough for pKs to be predicted, the results reported here show that the enol form is favored in aqueous solution. Docking computations have been performed to gain insight into the binding. The computed interaction energies are very large. They are not related in a simple way to experimental dissociation constants because entropy and solvent effects have not been fully taken into account in our computations. In any case, the values obtained do not correspond to the experimental dissociation constants of around 10⁻⁶ M that would be expected if the IC50 values were translated directly into binding constants. Although the absolute values are probably incorrect, the calculated energies may be interpreted as a score. Ouali et al. (2000) report scores in the -117 to -113 kcal/mol range for styrylquinoline derivatives that have an IC50 less than 10⁻⁶ M and in the -80 to -50 kcal/mol range for drugs that have an IC50 greater than 10⁻⁴ M. The interaction energies presented here are perfectly compatible with drugs that have an IC50 of about 2 × 10⁻⁶ M. Docking therefore seems to be a valid tool in this case. The remarkable agreement of these ex nihilo calculations with experimental in vitro data will allow us to make predictions and should eventually lead to the design of more active drugs.

Figure 5 (caption, continued). Here the divalent cation (black) can be seen.
These results show that the drug inhibits the integrase by complexing the protein close to the active site. This binary complex is of course much simpler to model than a ternary complex in which DNA is somehow involved. Consequently, drug design should not be as difficult as could have been imagined. While this manuscript was being revised, another inhibitor, 5CITEP, was found by X-ray crystallography to be bound centrally in the active site of the HIV-1 integrase catalytic core (Goldgur et al., 1999). As this compound has two rings connected by a linker that is similar to ours, the localization found here in silico for styrylquinolines is probably right. We found that the keto tautomer showed a better interaction energy with RSV integrase than the enol tautomer. Unfortunately, it is not abundant in solution. In the near future we should be able to design drugs that have the right tautomeric form.

Figure 4. Stereoview of drug 1 in its most favorable binding to the RSV integrase catalytic core. The catalytic core is shown as a wire model. The three carboxylic acid residues D64, D121 and E157 are in light gray. The drug is darker and hides the Mg++ divalent cation.

Figure 5. Stereoview of drug 2 in its most favorable binding to RSV integrase catalytic core.
(a) This planar conformation is unstable. If it is distorted along one of its two imaginary normal modes, it spontaneously goes to one of the other conformations.

Table 2. Energy and geometrical characteristics of various drug 1 tautomers, in vacuo, in water, and bound to the ASV integrase catalytic core.
Rotamer coding for R7 and R8 is the same as in Fig. 2. If R3′ points towards R4′ the bit is set to 0. If R4′ points towards R3′ the bit is set to 0.