A systematic method for DNA fragment amplification and sequencing based on DNA indexing technology. Protocol and technical considerations*

DNA indexing is based on a presynthesized library of oligonucleotide adaptors (256 in total), named indexers, and type-IIS restriction endonucleases. It enables amplification and direct analysis of large DNA fragments with low overall redundancy and without subcloning. Here, we describe a detailed protocol for PCR-based amplification of DNA fragments followed by DNA sequencing by indexer walking and provide helpful hints on its practical use. The proposed protocol can be applied to the sequencing of plasmids, cDNA clones, and longer DNA fragments. It can also be used for gap filling at the final stage of genome sequencing projects.


INTRODUCTION
Present-day DNA research requires fast preparation of templates for sequencing so that hundreds of samples can be analyzed simultaneously (Chan, 2005). Problems arise when there is a need to close gaps in large sequencing projects or to determine the exact sequence of highly repetitive DNA fragments (Bzymek & Lovett, 2001). Moreover, research groups that concentrate their work on small plasmid molecules have often used protocols that are widely applied for sequencing of much larger molecules (Lee & O'Sullivan, 2006). All those issues seemed to be addressed by the introduction of DNA sequencing by primer walking (Strauss et al., 1986). This method allows a very systematic approach to DNA analysis. To be efficient, however, it requires rapid synthesis of an individually designed primer after every round of DNA sequencing, and the resulting collection of primers may never be used again. Even with the considerably low prices of oligonucleotide synthesis, it is still a time-consuming process (Lashkari et al., 1995; Sanghvi & Schulte, 2004). Several modifications were introduced to limit the number of once-used primers, such as the application of short oligonucleotides that form a full-length primer during the annealing step (Szybalski, 1990; Kieleczawa et al., 1992; Kaczorowski & Szybalski, 1994). Still, libraries of 5- or 6-mers are challenging to handle, and the process of primer formation is not easily adapted to modern sequencing chemistry (Lohdi & McCombie, 1996; Raja et al., 1997; Kaczorowski & Szybalski, 1996a; Kaczorowski & Szybalski, 1996b). In 1990 it was proposed to use a short double-stranded oligonucleotide adaptor (called an oligo-cassette) to sequence fragments adjacent to already known genomic DNA (Rosenthal & Jones, 1990). Subsequent development made it possible to retrieve, without any cloning, unknown DNA fragments obtained by digestion with type-IIS restriction endonucleases; it was based on ligation of two oligonucleotide adaptors, the so-called indexers (Fig. 1), chosen from a presynthesized library (Unrau & Deugau, 1994; Szybalski et al., 1991). With the idea of "the universal restriction enzyme," Prof. Waclaw Szybalski pioneered the use of type-IIS restriction endonucleases as tools in DNA processing, which resulted in the development of many ingenious laboratory techniques (Szybalski, 1985; Podhajska & Szybalski, 1985; Hasan et al., 1986; Posfai & Szybalski, 1988; Velculescu et al., 1995). This also started the long-standing interest of Prof. Anna J. Podhajska's laboratory (University of Gdansk, Poland) in the biochemistry of shifters, with special focus on FokI (Kaczorowski et al., 1989; Skowron et al., 1993; Kaczorowski et al., 1999), MboII (Sektas et al., 1992; Sektas et al., 1995; Furmanek-Blaszk et al., 2009), and MmeI enzymes (Tucholski et al., 1998; Nakonieczna et al., 2009).
Our laboratory applied the DNA indexing method to sequence unknown DNA molecules in a highly systematic mode. We called it DNA sequencing by indexer walking (Gromek & Kaczorowski, 2005). The idea was conceived during a stay in Prof. Waclaw Szybalski's laboratory at the University of Wisconsin-Madison. The basic outline of DNA sequencing by indexer walking can be best summarized by four steps: cloning of the unknown DNA fragment into a suitable vector; a first step of DNA amplification, sequencing, and analysis; a second step consisting of indexer ligation to the DNA template and amplification; and a third step of DNA sequencing and analysis (Fig. 2). The second and third steps are repeated until the whole sequence of the DNA fragment is determined. Then, the process is repeated for the complementary DNA strand. During initial testing of this method, we concentrated on sequencing small plasmids up to 5 kb. With the discovery of more robust thermostable DNA polymerases and the variations introduced by enzyme manufacturers, however, the initial unknown fragment can be longer (Lee et al., 2007; Cline et al., 1996). The key element of this method, the double-stranded indexer, is formed by annealing the common indexer primer (CIP) to one of the 256 longer oligonucleotides, which differ only by 4 nt at the 5'-end (Fig. 1). The small library of 256 indexers is more than affordable, given the low prices of oligonucleotide synthesis and the minimal amount used per ligation reaction (100 fmol of indexer). It is also universal because it can be used for any DNA sequencing project.
Figure 1. The common indexer primer (20-mer) is annealed to a longer oligonucleotide (24-mer) that defines the double-stranded indexer's specificity. Each indexing strand has a unique four-nucleotide-long 5'-protruding end and is part of the library of 256 oligonucleotides.
Figure 2 (Gromek and Kaczorowski, 2005). The protocol of DNA sequencing by indexer walking incorporates efficient ligation of double-stranded synthetic oligonucleotides (indexers) to DNA fragments produced by class IIS restriction endonucleases, which generate four-nucleotide-long 5'-protruding ends, and their subsequent amplification, which provides enough template for automated DNA sequencing. Data gathered in the first sequencing reaction permit further movement into the unknown DNA sequence by digestion with a class IIS restriction endonuclease followed by ligation of the next indexer. The presynthesized library of indexers (256 oligonucleotides) enables bi-directional analysis of any DNA molecule and provides universal primers for sequencing.
The type-IIS restriction endonucleases used for systematic trimming of DNA molecules digest DNA leaving 4-nt 5'-overhangs that are then matched to adaptors from our library (Table 1). Many manufacturers sell these enzymes, and several enzymes with new specificities might become commercially available in the future (http://rebase.neb.com). Below we present a detailed protocol for DNA sequencing by indexer walking.
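The size of the indexer library follows directly from the 4-nt overhang: there are 4^4 = 256 possible four-base sequences. The sketch below is only an illustration and not part of the published protocol. It enumerates the overhangs and picks the one that would anneal to a given fragment end, assuming both overhangs are written 5' to 3' so that a matching pair are reverse complements. The function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_overhang_library(length: int = 4) -> list[str]:
    """Enumerate every possible overhang of the given length (4**length entries)."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def matching_indexer_overhang(fragment_overhang: str) -> str:
    """Overhang (5'->3') of the indexer that anneals to a fragment's 5'-overhang.

    Assumption: both overhangs are written 5'->3', so they pair antiparallel
    and the indexer overhang is the reverse complement of the fragment's.
    """
    return reverse_complement(fragment_overhang.upper())


library = build_overhang_library()
print(len(library))                       # 256
print(matching_indexer_overhang("AAGC"))  # GCTT
```

Palindromic overhangs such as GATC are their own reverse complement, so fragments carrying them can also ligate to each other; this is one reason the protocol aims for nonidentical overhangs.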
2) DNA purification Kit Gel Out (A&A Biotechnology, Poland). Indexing strands are phosphorylated at the 5'-terminus to make them suitable for ligation to DNA fragments produced by type-IIS restriction endonucleases.
3) Stop the reaction by heating at 80 °C for 2 min.
4) Allow to cool to room temperature (RT, 20 °C).
5) Can be stored at -20 °C.
2) Heat at 65 °C for 5 min and then allow to cool slowly to room temperature.
2) Calculate plasmid DNA concentration using a NanoDrop 1000 UV-Vis spectrophotometer (Thermo Fisher Scientific). We also found that the One-Dscan ver. 1.33 software (Scanalytics, USA) works well for measuring DNA concentration against known standards. The choice of technique depends on the equipment and software available in a given laboratory.

P.4. Cloning DNA of Interest into Vectors to Obtain Library of Inserts up to 5-kb in Length
We use the pGEM3Zf(+) vector (Promega, USA) as our primary choice because of its high DNA yields, the wide selection of restriction enzymes for cloning, and the ability to use the universal M13/pUC primers, Forward (-23) 5'-GTTGTAAAACGACGGCCAGT and Reverse (-28) 5'-CACAGGAAACAGCTATGACC.
1) Mix 1 μl T4 DNA ligase, 20 ng of linearized vector DNA, and 100 ng of target DNA (linearized natural plasmid or DNA fragment processed with restriction enzymes) in 10 μl total volume of 1× T4 DNA ligase buffer with ATP.
2) Incubate at room temperature for 2 hr.
3) Introduce ligated DNA molecules into bacterial cells by transformation or electroporation.
4) Screen for recombinant clones.
5) Purify plasmid DNA according to step P.3.
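Because the annealing temperature used in the amplification steps depends on the melting temperature of the primers, a quick estimate for the two universal vector primers can be useful. The sketch below applies the simple Wallace rule, Tm = 2 × (A+T) + 4 × (G+C) in °C. This rule is our choice for illustration only; it is a rough approximation and is not the method the protocol itself prescribes.

```python
FORWARD = "GTTGTAAAACGACGGCCAGT"  # Forward (-23) primer from P.4
REVERSE = "CACAGGAAACAGCTATGACC"  # Reverse (-28) primer from P.4


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (an approximation)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), f"GC={gc_fraction(primer):.0%}", f"Tm~{wallace_tm(primer)} C")
```

Both primers are 20-mers with 50% GC content, so this estimate gives about 60 °C for each, in line with the 55-61 °C annealing range used in the cycling steps.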

P.5. 1st Step - DNA Amplification and Sequencing
Amplify the cloned DNA fragment using the universal vector primers that flank the cloned insert. This step provides a DNA template for the next round of DNA analysis and processing.
1) Mix 60 ng of cloned plasmid DNA, 50 pmol of universal vector primers, and 0.8 mM dNTPs in a total volume of 50 μl of 1× PCR buffer.
3) Add a mixture of two thermostable DNA polymerases (4 U of DyNAzyme II DNA polymerase, Finnzymes, and 0.2 U of Pfu DNA polymerase, Thermo Scientific).
4) Preheat at 94 °C for 2 min, and then cycle 30 times at 94 °C for 40 s, 55-61 °C for 40 s, and 72 °C for 4-9 min.
5) Follow with an additional extension step at 72 °C for 10 min, and hold at 4 °C as needed.
The melting temperature of both primers determines the annealing temperature, and the extension time should be adequate to the length of the amplified DNA fragment.
6) Purify by isolating from a 1% agarose gel after electrophoretic separation (2 h, 40 mA).
7) Calculate DNA concentration.
The first round of DNA sequencing is performed with one of the universal vector primers (forward or reverse).
10) Separate extension reaction products using ExTerminator spin columns according to the manufacturer's manual.
11) Denature the purified extension product by heating at 95 °C for 5 min and cooling rapidly on ice for at least 2 min.
12) Separate sequencing reaction products on an ABI PRISM 310 Genetic Analyzer (Perkin Elmer, Applied Biosystems, USA) equipped with a 60-cm capillary filled with POP6 polymer, for 150 min under standard running conditions.

P.6. 1st Step - DNA Analysis and Partial Digestion
The purpose of this step is to shorten the amplified DNA molecule by 450-500 bp from either side to obtain fragments with 4-nt nonidentical 5'-overhangs.
1) Analyze the obtained DNA sequence using software (Vector NTI) to find restriction sites recognized by type-IIS restriction endonucleases.
2) Partially digest 300 ng of an amplified DNA fragment with 2 U of appropriate type-IIS restriction endonuclease in a total volume of 20 μl of 1× restriction enzyme buffer for 15 min. The incubation temperature depends on the properties of restriction endonucleases and is suggested by the enzyme manufacturer.
Alternatively, DNA can be extracted with phenol-chloroform, precipitated with ethanol, and finally dissolved in 10 μl of 0.1× Tris-EDTA buffer, pH 8.0.
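The search for type-IIS sites in step 1 can be illustrated with a short script. The sketch below is a minimal example for one enzyme only, FokI, which recognizes GGATG and cuts 9 nt downstream on that strand and 13 nt downstream on the opposite strand, leaving a 4-nt 5'-overhang. It is not a substitute for the sequence analysis software named in the protocol, and the function names and the demonstration sequence are hypothetical.

```python
import re
from typing import NamedTuple


class Cut(NamedTuple):
    site_start: int      # 0-based index of the recognition site on the given strand
    strand: str          # "+" if GGATG reads on the given strand, "-" if on the opposite one
    overhang_start: int  # 0-based start of the 4-nt single-stranded region
    overhang: str        # that region, read 5'->3' on the given strand


def find_foki_cuts(seq: str) -> list[Cut]:
    """Locate FokI sites, GGATG(9/13), and the 4-nt region left single-stranded."""
    seq = seq.upper()
    cuts = []
    # Site on the given strand: the overhang region lies 9 nt downstream of the site.
    for m in re.finditer("(?=GGATG)", seq):
        start = m.start() + 5 + 9
        if start + 4 <= len(seq):
            cuts.append(Cut(m.start(), "+", start, seq[start:start + 4]))
    # Site on the opposite strand (seen here as CATCC): the enzyme cuts upstream.
    for m in re.finditer("(?=CATCC)", seq):
        start = m.start() - 13
        if start >= 0:
            cuts.append(Cut(m.start(), "-", start, seq[start:start + 4]))
    return sorted(cuts, key=lambda c: c.overhang_start)


# Hypothetical demonstration sequence with a single FokI site.
demo = "AAAA" + "GGATG" + "ACGTACGTA" + "TTGC" + "AAAA"
for cut in find_foki_cuts(demo):
    print(cut)
```

The reported overhang is written as it reads on the given strand. Which of the two fragments is carried forward, and therefore which indexer is needed, depends on the direction of the walk.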

P.7. 2nd Step -DNA Indexing and Amplification
Partially digested DNA is the template for attaching a double-stranded indexer with complementary 4-nt 5'-protruding end, followed by amplification with one universal vector primer and an indexer-specific primer.
1) Perform ligation reaction in 10 μl of 1× T4 DNA ligase buffer with ATP by mixing 60 ng of partially digested DNA, 10 U of T4 DNA ligase, and 100 fmol of the appropriate indexer.
2) Incubate at room temperature for 1 hr.
3) Inactivate the enzyme by heating at 65 °C for 10 min.
4) Prepare the PCR mix by adding 12 ng of indexed DNA, 25 pmol of primers (universal vector primer and common indexer primer), 0.8 mM dNTPs, and 10 μl of betaine to a total volume of 50 μl of 1× PCR buffer.
5) Incubate at 94 °C for 2 min.
6) Add the mixture of two thermostable DNA polymerases (4 U of DyNAzyme II DNA polymerase, Finnzymes, and 0.2 U of Pfu DNA polymerase, Thermo Scientific).
7) Preheat at 94 °C for 2 min, and then cycle 30 times at 94 °C for 40 s, 55-61 °C for 40 s, and 72 °C. The time for the extension step depends on the length of the DNA template.
8) Follow with an additional extension step at 72 °C for 10 min, and hold at 4 °C as needed.
9) Purify the DNA fragment by isolating from a 1% agarose gel after electrophoretic separation (2 h, 40 mA).
10) Calculate DNA concentration.
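To judge how the 100 fmol of indexer compares with the amount of template in the ligation, the mass of DNA can be converted to moles. The sketch below assumes an average mass of 650 g/mol per base pair, a common approximation that is not stated in the protocol. The 3000-bp fragment length is purely hypothetical, and a partial digest will in practice contain a mixture of lengths.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (approximation, our assumption)


def dsdna_fmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to femtomoles of molecules."""
    return mass_ng * 1e6 / (length_bp * AVG_BP_MASS)


def indexer_to_template_ratio(indexer_fmol: float, mass_ng: float, length_bp: int) -> float:
    """Molar ratio of indexer to template molecules in the ligation."""
    return indexer_fmol / dsdna_fmol(mass_ng, length_bp)


# 60 ng of digested DNA and 100 fmol of indexer, as in step 1 of P.7;
# the 3000-bp length is a hypothetical example value.
template = dsdna_fmol(60, 3000)
ratio = indexer_to_template_ratio(100, 60, 3000)
print(f"template: {template:.1f} fmol, indexer:template = {ratio:.2f}:1")
```

Under these assumptions the indexer is in roughly threefold molar excess. The excess is smaller for shorter fragments, because the same mass then contains more molecules.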

P.8. 3rd Step - DNA Sequencing, Analysis, and Partial Digestion
The amplified DNA fragment, flanked by the universal vector primer sequence and the attached indexer, is a substrate for further analysis. The DNA sequencing reaction is performed with the common indexer primer.
2) Cycle 35 times at 96 °C for 30 s, 51 °C for 20 s, and 60 °C for 4 min. Hold at 4 °C as needed.
3) Separate extension reaction products using ExTerminator spin columns according to the manufacturer's manual.
4) Denature the purified extension product by heating at 95 °C for 5 min and cooling rapidly on ice for at least 2 min.
5) Separate sequencing reaction products on an ABI PRISM 310 Genetic Analyzer (Perkin Elmer, Applied Biosystems, USA) equipped with a 60-cm capillary filled with POP6 polymer, for 150 min under standard running conditions.
6) Analyze the obtained DNA sequence to find restriction sites recognized by type-IIS restriction endonucleases.
7) Partially digest 300 ng of the amplified DNA fragment with 2 U of the appropriate type-IIS restriction endonuclease in a total volume of 20 μl of 1× restriction enzyme buffer for 15 min. The incubation temperature depends on the properties of the restriction endonuclease and is suggested by the enzyme manufacturer. Alternatively, DNA can be extracted with phenol-chloroform, precipitated with ethanol, and finally dissolved in 10 μl of 0.1× Tris-EDTA buffer, pH 8.0.

P.9. Consecutive Cycles of 2nd and 3rd Step
Follow with cycles of the 2nd and 3rd step until the complete sequence of the analyzed DNA fragment is determined. Then, to obtain the second DNA strand sequence, follow the whole procedure (point P.5.8 to P.8) using the second universal vector primer.
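The number of cycles needed for one strand can be estimated from the fragment length and the distance covered per round. The sketch below assumes that every sequencing round advances by the 450-500 bp mentioned in P.6, which is an idealization. The function name and the 5000-bp example length are hypothetical.

```python
import math


def sequencing_rounds(fragment_bp: int, advance_bp: int) -> int:
    """Rounds needed to cover one strand if each round advances by advance_bp."""
    if fragment_bp <= 0 or advance_bp <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(fragment_bp / advance_bp)


for advance in (450, 500):
    rounds = sequencing_rounds(5000, advance)
    # The first round is primed from the vector; the rest use the indexer primer.
    print(f"advance {advance} bp: {rounds} rounds per strand, "
          f"{rounds - 1} of them indexer-primed")
```

In practice the count also depends on where suitable type-IIS sites fall, so this is only an approximate figure for planning.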

COMMENTS
In testing DNA sequencing by indexer walking, we introduced several modifications and improvements to the overall performance of this method.
1. Oligonucleotides used in this protocol are in a desalted form and are applied without any purification procedure.
2. The sequence of our CIP oligonucleotide is as follows: 5'-TAC ACT GGC TGC GTA TCT GG-3'. Table 2 presents the indexing strands used for sequencing of the pEC278 plasmid (GenBank accession no.
4. At present, there are 31 commercially available type-IIS restriction endonucleases that recognize 11 unique recognition sequences 5-7 bp in length (Table 1). The average size of the fragment produced ranges from 512 to 8196 bp. In the absence of any appropriate type-IIS restriction enzyme, it is possible to use regular type-II restriction enzymes that rarely cut the already sequenced DNA fragment. We have successfully employed this approach in sequencing the pEC278 plasmid (Gromek & Kaczorowski, 2005; GenBank accession no. AY589571). For the 3900-4000 region, the only enzyme of choice (a rare cutter) was HindIII. The same enzyme was used in determining the sequence of the corresponding region on the complementary strand.
5. For increased amplification specificity, it is possible to use a biotinylated universal vector primer in every amplification reaction. The biotin residue is attached to the 5'-end of the universal vector primer. Recently, we found that the addition of recombinases can also improve the specificity of PCR-based DNA amplification (Stefanska et al., 2014; Stefanska et al., 2016). The following protocol is optional:
a) After ligation of the indexer to the DNA target, recover biotinylated indexed DNA by mixing 5 μl of the ligation mixture with 40 μl of 1× saline washed streptavidin-coated beads (Streptavidin MagnaSphere Paramagnetic Particles, Promega, USA). Incubate at room temperature for 5-10 min.
b) Isolate the DNA molecules with a newly attached indexer from the ligation mix using a magnetic separation stand (Promega, USA).
c) Wash twice with 2× Binding and Washing solution (10 mM Tris-HCl, pH 7.5; 1 mM EDTA; 2 M NaCl).
d) Suspend the beads with attached DNA in 20 μl of 10 mM Tris-HCl, pH 8.0.
e) Use 5 μl of the suspended beads with attached DNA as a template in PCR amplification.
6. In parallel to the standard procedure for isolation of DNA fragments from agarose gels, we developed a highly efficient protocol for electrophoretic transfer of DNA from a gel to a DEAE-cellulose membrane.
7. The manufacturer of Terminator Ready Reaction Mix (TRRM) with AmpliTaq DNA polymerase (Perkin Elmer, Applied Biosystems, USA) used for sequencing recommends applying 8 μl TRRM per reaction. We found that using less TRRM and adding 5× Sequencing Buffer (also supplied by the same company) gave better resolution. The capillary injection time for the ABI Prism 310 Genetic Analyzer was in the range of 20 to 60 s depending on the quality of DNA used in sequencing reactions.
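The fragment sizes quoted in comment 4 can be related to the length of the recognition sequence by a simple probability argument. The sketch below assumes equal frequencies of the four bases and random sequence, which real DNA only approximates. An asymmetric site of n bp can occur on either strand, so it is expected about once every 4^n / 2 bp. The function name is hypothetical.

```python
def expected_spacing_bp(site_length: int, asymmetric: bool = True) -> float:
    """Expected distance between sites in random DNA with equal base frequencies.

    An asymmetric site can match on either strand, which doubles its frequency;
    a palindromic site reads the same on both strands, so it is counted once.
    """
    per_strand = 4 ** site_length
    return per_strand / 2 if asymmetric else float(per_strand)


for n in (5, 6, 7):
    print(f"{n}-bp asymmetric site: about one per {expected_spacing_bp(n):.0f} bp")
```

For 5-bp sites this gives 512 bp, matching the lower value quoted, and for 7-bp sites 8192 bp, close to the upper value. For comparison, a palindromic 6-bp site such as that of HindIII would be expected about once every 4096 bp under the same assumptions.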