Alternative 3' Acceptor Site in Exon 2 of the Human PAX8 Gene Resulting in the Expression of an Unknown mRNA Variant Found in Thyroid Hemiagenesis and Some Types of Cancers

The PAX8 gene encodes one of the transcription factors that regulate proper development of the thyroid gland as well as the Müllerian and renal/upper urinary tracts. So far, six alternatively spliced transcripts have been reported; however, sequences of only four have been deposited in the NCBI database. Here, we evaluate a fragment of a novel variant of PAX8 mRNA formed by an alternative 3' acceptor site located in the second exon. The molecular outcome, evident at the mRNA level, is an extension of the 5' untranslated region of exon 2 by 97 nucleotides. This new insert may impair binding of the mRNA to the ribosome and, in consequence, significantly decrease expression of the PAX8 protein. We show for the first time that the novel insert in exon 2 might be associated with congenital thyroid hemiagenesis and might influence the development of different types of cancer.


INTRODUCTION
The PAX8 gene belongs to the paired-box (PAX) family of nine transcription factors (PAX1-9). They contain a conserved DNA-binding domain, the paired box, shared by all family members (Xu et al., 1995). These transcription factors contribute to the development of eukaryotic organisms, playing a fundamental role in organogenesis as regulatory proteins. They are expressed in embryonic or neoplastic cells. Members of the PAX family are required for cell growth and differentiation in fetal tissues. Accordingly, PAX8 participates in the regulation of embryogenesis of the thyroid gland as well as the Müllerian and renal/upper urinary tracts.
The PAX8 gene is located on human chromosome 2q12-q14 (Stapleton et al., 1993) and consists of at least 12 exons. So far, the NCBI database provides information about four alternatively spliced transcripts (a, c, d, e), which produce various PAX8 proteins. However, two more isoforms (b and f) are described in the literature (Kozmik et al., 1993). PAX8a (NCBI Reference Sequence: NM_003466) is encoded by all 12 exons, resulting in translation of the longest protein, 450 amino acids (AA) long. PAX8c (NM_013952) has a 79-nucleotide deletion in exon 9, which results in a frameshift and synthesis of a completely different amino acid sequence downstream of the deletion. In consequence, PAX8c is only 398 AA long. PAX8d (NM_013953) lacks two alternate exons, 8 and 9, compared to the PAX8a variant, which also results in a frameshift and introduction of a premature stop codon (321 AA long protein). The PAX8b transcript arises by skipping the entire exon 8, thus leading to an in-frame fusion of exon 7 and exon 9 (Kozmik et al., 1993). PAX8e and PAX8f were observed only in the placenta, where isoforms c and d are missing. Interestingly, the expression of the e and f isoforms is abundant at the early embryonic stages but then gradually decreases, while PAX8a is the predominant transcript (Kozmik et al., 1993).
PAX8 demonstrates dual functionality resulting from two structural parts of the protein: (1) the N-terminal DNA-binding region and (2) the C-terminal transactivation region (Campagnolo et al., 2007). The N-terminal region is composed of three domains: (1) the paired box domain (Wang et al., 2008), (2) a phylogenetically conserved octapeptide, and (3) a distal homeodomain. The paired box domain contains an evolutionarily conserved 128 amino acid long fragment formed by two subdomains known as PAI (N-terminal) and RED (C-terminal), each composed of a helix-turn-helix motif, joined by a linker region. Consequently, both subdomains bind DNA independently (Bowen et al., 2007; Narlis et al., 2007). The C-terminal domain has a sequence rich in proline, serine and threonine, which forms a potential transactivation domain. Any alterations in this region may reduce PAX8 activity and thus lower the expression of its target genes (Esperante et al., 2008).
The PAX gene family is considered to play a critical role in the specification of cell types and their location in the developing embryo. Moreover, PAX genes are subsequently expressed in progenitor and mature cells of adult organisms. In mice, the PAX8 gene is expressed in the developing central nervous system, kidneys, and thyroid (Narlis et al., 2007; Poleev et al., 1995). In the adult organism PAX8 is expressed exclusively in the thyroid and kidneys (Parlato et al., 2004). Notably, PAX8 is the only member of the PAX gene family that is expressed in the thyroid (Poleev et al., 1995). Additionally, higher expression of the PAX8 protein was demonstrated in multiple human tumors, e.g. ovarian carcinoma and kidney adenocarcinoma (Bowen et al., 2007; Tong et al., 2008). Therefore, PAX8 gene expression can be a helpful marker for detection of cancer in its early stages (Nonaka et al., 2008). In consequence, research devoted to the PAX8 gene will broaden our knowledge and shed new light on embryogenesis as well as carcinogenesis.

MATERIALS AND METHODS
Biological material for study. Postsurgical tissue samples were received from the Department of General, Gastrointestinal and Plastic Surgery and the Department of Gynecological Oncology, Poznań University of Medical Sciences, or from the Department of General and Trauma Surgery, with subdivision of Gastroenterological and Endocrine Surgery, District Hospital of Poznań. We used four samples derived from the thyroid gland. One patient suffered from thyroid hemiagenesis, and the other three, operated on for non-cancer-related disorders, served as control material. A second group of patients, affected by different types of cancer, was sampled as well; their tissue specimens were received from the Department of Gynecological Oncology. All patients in this group underwent surgery due to suspicion of ovarian cancer. Ultimately, this group comprised two primary ovarian cancers and one peritoneal cancer, as well as two diagnosed metastases: a Krukenberg tumor and a breast cancer. All tissue samples were preserved in RNAlater® (Life Technologies-Ambion) and maintained at -80°C. Blood samples for PAX8 gene sequencing were obtained from 20 patients affected by thyroid hemiagenesis and 3 healthy individuals as a control group.
Patients participating in this research provided written informed consent, and all studies were approved by the Bioethical Committee of Poznan University of Medical Sciences.
RNA isolation. Total RNA was isolated using the TRI Reagent® method, based on materials and procedures from Sigma-Aldrich. Briefly, approximately 1 g of each tissue sample was homogenized on ice in the presence of the TRI Reagent® solution. The mixture was centrifuged at 12 000 × g for 10 min at 4°C in order to eliminate insoluble material. The supernatant was then transferred into a fresh tube, and 0.2 ml of chloroform was added per 1 ml of the TRI Reagent® solution. The mixture was shaken vigorously and centrifuged at 12 000 × g for 15 min at 4°C. The upper, aqueous phase was collected and transferred into a fresh tube, and 0.5 ml of 2-propanol was added per 1 ml of the TRI Reagent® solution. After a 10 min incubation at room temperature, samples were centrifuged again at 12 000 × g for 10 min at 4°C. The RNA precipitated and formed a pellet, which was washed with 75% ethanol and centrifuged at 7 500 × g for 5 min at 4°C. The pellet was dried for a few minutes at room temperature and finally dissolved in 50 µl of sterile water. RNA was kept at -80°C.
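The chloroform and 2-propanol volumes above are both defined per 1 ml of TRI Reagent® solution, so they scale linearly with the volume of reagent used. The following is a minimal sketch of that scaling; the function name and the 1.5 ml example volume are illustrative assumptions, not part of the original protocol.

```python
# Ratios taken from the protocol text: 0.2 ml chloroform and
# 0.5 ml 2-propanol per 1 ml of TRI Reagent solution.
CHLOROFORM_PER_ML_TRI = 0.2
ISOPROPANOL_PER_ML_TRI = 0.5


def tri_reagent_volumes(tri_ml: float) -> dict:
    """Return the chloroform and 2-propanol volumes (ml) for a given TRI Reagent volume."""
    if tri_ml <= 0:
        raise ValueError("TRI Reagent volume must be positive")
    return {
        "chloroform_ml": CHLOROFORM_PER_ML_TRI * tri_ml,
        "isopropanol_ml": ISOPROPANOL_PER_ML_TRI * tri_ml,
    }


# Hypothetical example: a homogenate prepared in 1.5 ml of TRI Reagent.
print(tri_reagent_volumes(1.5))
```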
RT-PCR. Reverse transcription was carried out with 1.0 μg of total RNA following the manufacturer's instructions (First-strand cDNA synthesis kit, Fermentas). Template RNA, random hexamer primers (1 μl) and diethylpyrocarbonate (DEPC)-treated water were mixed in a total volume of 11 μl and pre-incubated at 65°C for 5 min in a Biometra thermocycler. The samples were then chilled on ice, centrifuged and transferred into new PCR tubes in order to prevent leakage of the cap. Subsequently, 5X reaction buffer (4 μl), RiboLock™ RNase inhibitor 20 U/μl (1 μl), 10 mM dNTP mix (2 μl) and M-MuLV reverse transcriptase 20 U/μl (2 μl) were added to the pre-incubated solution, mixed by pipetting and incubated at 42°C for 60 min (Biometra thermocycler). The cDNA was used immediately for subsequent amplification reactions.
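As a consistency check on the reverse transcription set-up, the listed volumes can be summed and the final concentrations of the buffer and dNTP mix derived. The sketch below uses only the volumes stated above; the RNA stock concentration of 0.5 µg/µl is a hypothetical value chosen to illustrate how the water volume would be worked out.

```python
PREMIX_UL = 11.0   # RNA + random hexamers + DEPC-treated water
PRIMER_UL = 1.0    # random hexamer primers
ADDITIONS_UL = {
    "5X reaction buffer": 4.0,
    "RNase inhibitor (20 U/ul)": 1.0,
    "10 mM dNTP mix": 2.0,
    "M-MuLV reverse transcriptase (20 U/ul)": 2.0,
}


def final_volume_ul() -> float:
    """Total reaction volume after adding all components to the pre-mix."""
    return PREMIX_UL + sum(ADDITIONS_UL.values())


def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Dilution of a stock (in any unit) after adding added_ul to total_ul."""
    return stock * added_ul / total_ul


def water_volume_ul(rna_ug_per_ul: float, rna_mass_ug: float = 1.0) -> float:
    """DEPC-treated water needed to bring RNA plus primers to the 11 ul pre-mix."""
    rna_ul = rna_mass_ug / rna_ug_per_ul
    water = PREMIX_UL - PRIMER_UL - rna_ul
    if water < 0:
        raise ValueError("RNA stock too dilute for an 11 ul pre-mix")
    return water


total = final_volume_ul()
print("total volume (ul):", total)
print("buffer (X):", final_concentration(5.0, ADDITIONS_UL["5X reaction buffer"], total))
print("dNTP (mM):", final_concentration(10.0, ADDITIONS_UL["10 mM dNTP mix"], total))
print("water (ul) for a hypothetical 0.5 ug/ul RNA stock:", water_volume_ul(0.5))
```

Under these volumes the reaction totals 20 μl, which brings the 5X buffer to 1X.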
For each PCR reaction, 2 μl of cDNA solution was used. INS primers (forward and reverse, Table 1) were added to a final concentration of 0.5 µM. The PCR was performed using FastStart Taq DNA polymerase (Roche Applied Science) according to the manufacturer's protocol. The PCR program was as follows: 95°C, 5 min (initial step); 95°C, 30 s (denaturation); 56-65°C, 30 s (annealing); 72°C, 1 min (extension); 35 cycles. Finally, the PCR product was analyzed on a 1-1.5% agarose gel with ethidium bromide (Sigma-Aldrich).
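The cycling program can be written down as a small data structure, which makes the gradient annealing temperatures explicit and allows the programmed hold time to be estimated. This is an illustrative sketch only: the class and function names are assumptions, and ramp times between steps are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def build_program(annealing_c: float):
    """Return (initial step, per-cycle steps) for one annealing temperature."""
    if not 56.0 <= annealing_c <= 65.0:
        raise ValueError("annealing temperature outside the 56-65 C gradient")
    initial = Step("initial step", 95.0, 5 * 60)
    cycle = (
        Step("denaturation", 95.0, 30),
        Step("annealing", annealing_c, 30),
        Step("extension", 72.0, 60),
    )
    return initial, cycle


def hold_time_seconds(initial: Step, cycle, n_cycles: int = 35) -> int:
    """Sum of programmed hold times; ramping between steps is not included."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


# The three annealing temperatures reported in the Results.
for annealing_c in (56.0, 62.0, 65.0):
    initial, cycle = build_program(annealing_c)
    print(annealing_c, hold_time_seconds(initial, cycle) / 60, "min")
```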
DNA extraction and PCR. Genomic DNA was isolated from 200 µl of peripheral blood with the use of the QIAamp® Blood Mini Kit (QIAGEN, Hilden, Germany). The quality of the isolated DNA was determined using 0.7% agarose gel electrophoresis. Resequencing primers used for determination of the promoter region sequence were acquired from NCBI. These primers are composed of the target sequence and an M13 phage universal tail. In our research, 5 selected primer pairs encompassed the entire promoter region of the PAX8 gene (Table 1, Fig. 3A). 50 ng of isolated DNA was used for the amplification (Fermentas, Thermo Scientific). PCR conditions were as follows: 95°C, 5 min (initial step); 95°C, 30 s (denaturation); 65°C, 30 s (annealing); 72°C, 1 min (extension); 35 cycles. Finally, the PCR products were analyzed on a 1.5% agarose gel with ethidium bromide (Sigma-Aldrich). Before sequencing, samples were purified using the PCR/DNA Clean-Up Purification Kit (EURx).
Sequencing. PCR products were sequenced on both strands with the forward and reverse primers, respectively. Capillary sequencing was conducted with the use of BigDye chemistry version 3.1 on an ABI 3130 DNA Analyzer according to the manufacturer's instructions (Applied Biosystems). Samples were processed through 30 cycles of amplification consisting of 30 s at 94°C, 30 s at 60°C, and 45 s at 72°C. The final step was lengthened to 5 min. Sequence traces were analyzed with the CodonCode Aligner software or the BioEdit Sequence Alignment Editor.
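Sequencing both strands allows each base call to be cross-checked: the reverse read, once reverse-complemented, should agree with the forward read over the region they share. The sketch below illustrates this check on short placeholder sequences (not PAX8 data) and assumes the two reads have already been trimmed to the same region; it is not a description of how CodonCode Aligner or BioEdit work internally.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def strand_mismatches(forward_read: str, reverse_read: str) -> list:
    """Positions (0-based, forward orientation) where the two strands disagree."""
    reoriented = reverse_complement(reverse_read)
    if len(reoriented) != len(forward_read):
        raise ValueError("reads must be trimmed to the same region first")
    return [
        i
        for i, (a, b) in enumerate(zip(forward_read.upper(), reoriented.upper()))
        if a != b
    ]


# Placeholder reads: the reverse read is generated from a copy of the
# forward read carrying one substitution at position 2.
forward = "ATGCAAGT"
reverse = reverse_complement("ATTCAAGT")
print(strand_mismatches(forward, reverse))
```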

RESULTS AND DISCUSSION
The tissue samples analyzed were divided into two groups. The first group encompassed thyroid-derived tissues: tissue isolated from the single thyroid lobe of a hemiagenesis patient, and tissue from patients with a bilobed thyroid serving as controls (individuals with non-cancerous thyroid tissue abnormalities). The second group consisted of cancers with different sites of origin.
In order to study the sequence of PAX8 mRNA, a conventional RT-PCR method was applied. The primary intention was a comprehensive search for novel mutations at the mRNA level, followed by confirmation by genomic DNA sequencing. This methodology allowed efficient and quick sequencing of PAX8 exons. Unexpectedly, a novel isoform of PAX8 mRNA was identified.
Reverse transcription, followed by gradient PCR, provided data on the product sizes at three different annealing temperatures, i.e. 56°C, 62°C and 65°C. Here, a pair of INS primers was used that allowed sequencing of a 587 bp long cDNA fragment. It encompasses parts of exons 1 and 5, as well as the entire sequence of exons 2, 3 and 4 (Fig. 2B). Surprisingly, electrophoretic examination of the PCR products in the agarose gel revealed a new and unexpected product of 684 bp, 97 bp longer than the canonical one. Direct sequencing showed that the longer product represents a novel variant of PAX8 mRNA, characterized by an additional fragment introduced between exon 1 and exon 2 (Fig. 2, A and B). Further, genomic DNA analysis revealed the intronic origin of the novel insert: the intron located between exons 1 and 2 is 263 bp in size (Fig. 2B). The last 97 bp at its 3' end were not spliced out and were incorporated into a new PAX8 transcript. Conceivably, an additional alternative 3' acceptor site in the second exon of PAX8 can therefore be designated.
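The relationship between the two RT-PCR products can be illustrated with a toy splicing model: using an acceptor site 97 nt upstream of the canonical one leaves the last 97 nt of the 263 bp intron in the transcript, so the amplicon grows from 587 bp to 684 bp. The sequences below are arbitrary placeholders with the reported lengths, not the PAX8 sequence, and the exon lengths are assumptions chosen only for illustration.

```python
# Values reported in the text.
INTRON_LEN = 263          # intron between exons 1 and 2 (bp)
RETAINED_LEN = 97         # 3'-terminal intron nucleotides kept in the new variant
CANONICAL_AMPLICON = 587  # canonical RT-PCR product (bp)


def splice(exon1: str, intron: str, exon2: str, retained_3prime: int = 0) -> str:
    """Join two exons, keeping the last `retained_3prime` nt of the intron.

    retained_3prime = 0 models the canonical acceptor site; a positive value
    models an alternative acceptor site located that many nt upstream.
    """
    if not 0 <= retained_3prime <= len(intron):
        raise ValueError("retained length must be between 0 and the intron length")
    kept = intron[len(intron) - retained_3prime:]
    return exon1 + kept + exon2


# Placeholder sequences (hypothetical, lengths only matter here).
exon1 = "A" * 40
intron = "GT" + "C" * (INTRON_LEN - 4) + "AG"
exon2 = "G" * 100

canonical = splice(exon1, intron, exon2)
variant = splice(exon1, intron, exon2, retained_3prime=RETAINED_LEN)

extra = len(variant) - len(canonical)
print(extra)                       # 97
print(CANONICAL_AMPLICON + extra)  # 684
print(INTRON_LEN - RETAINED_LEN)   # 166 nt of the intron are still removed
```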
The presence of the new transcript was reproducible and turned out to be tissue specific. It was observed in the single thyroid lobe of the hemiagenesis patient at all three annealing temperatures applied (Fig. 1A). Surprisingly, it was not present in the control thyroid. Instead, a previously described mRNA was observed there (a 587 bp long fragment obtained with the F and R primers, Fig. 1A). Further extensive analysis revealed that this additional alternative 3' acceptor site may be observed in various types of cancers. We identified this site in the following types of cancers: ovarian cancer (two patients) and metastases to the ovary, i.e. Krukenberg tumour (one patient), as well as peritoneal (one patient) and breast cancers (one patient). All samples yielded the longer RT-PCR product; however, the shorter variant could also be observed (Fig. 1B) at various annealing temperatures.
All products (either longer or shorter) were then sequenced on both strands. INS (forward and reverse) primers were used to initiate each read (see Materials and Methods). Fig. 2A shows the final sequence between the INS primers obtained from the longer RT-PCR product. The shorter product exhibited the same sequence as that already deposited in the NCBI database (data not shown). The location of the novel insert in one of the PAX8 mRNAs is shown in Fig. 2B. The insert represents a part of the first intron that was not completely spliced out. The 97 nucleotides therefore extend the second exon of the PAX8 gene at its 5' end, expanding it to a total of 197 nt. The sequence of this new transcript variant was deposited in the NCBI database (no. KC733810, annotated as the F variant).
Based exclusively on the limited region that was sequenced, only the alternative 3' acceptor site, clearly identified here, can be discussed with confidence. As already mentioned in the Introduction, 6 variants of PAX8 mRNA have been found, but sequences of only 4 of them have been deposited in the NCBI database. This suggests there might have been difficulties with isolating and sequencing these varied isoforms. The alternative 3' acceptor site discovered by our group represents a novel variant of PAX8 mRNA. Unfortunately, it was impossible to determine whether any of the already known isoforms also use the alternative acceptor site revealed here. It may concern all of them or only one particular variant.
The new insert does not directly change the size of the PAX8 protein. However, the location of this insert is critical for ribosome binding, and its presence can significantly affect translation efficiency. The start codon of the PAX8 protein is located 75 nucleotides from the intron-exon boundary. Lengthening the untranslated sequence upstream of the open reading frame could decrease expression of the protein through unpredictable mRNA folding, i.e. formation of hairpin loop structures, which has been described as reducing the affinity of the transcript for the ribosome (Uemura et al., 2007).
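The lengths quoted in this section can be tied together with simple arithmetic. The sketch below assumes that the 75 nt are measured from the canonical exon 1/exon 2 junction to the start codon; under that assumption, which is ours rather than a statement from the sequence data, the insert moves the junction 97 nt further upstream of the start codon.

```python
INSERT_NT = 97              # nucleotides retained from intron 1
EXON2_EXTENDED_NT = 197     # length of exon 2 in the new variant
START_CODON_OFFSET_NT = 75  # assumed: canonical junction to start codon


def canonical_exon2_length() -> int:
    """Exon 2 length implied for the canonical transcript."""
    return EXON2_EXTENDED_NT - INSERT_NT


def variant_start_codon_offset() -> int:
    """Distance from the alternative acceptor site to the start codon."""
    return START_CODON_OFFSET_NT + INSERT_NT


print("canonical exon 2 (nt):", canonical_exon2_length())
print("junction to start codon in the variant (nt):", variant_start_codon_offset())
```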
The consequences of impaired mRNA-ribosome interaction are reflected not only in lower protein expression but, in the longer term, might be far more severe, causing embryological abnormalities or cell cycle deregulation. Thyroid hemiagenesis is a rare inborn anomaly that occurs when one thyroid lobe fails to develop (Ruchala et al., 2010), but the mechanism of its origin remains unknown (Szczepanek et al., 2011). Here we suggest a possible explanation, regarding thyroid hemiagenesis as a consequence of impaired synthesis of the PAX8 protein. This hypothesis is supported by previous literature findings emphasizing PAX8 as an important contributor to accurate thyroid and ovary development. In addition, the new PAX8 variant was also observed in neoplastic ovarian cells. This may suggest an association with carcinogenesis, not as the major cancer-inducing factor but rather as a moderately disruptive factor that enhances carcinogenic processes.
Genomic analysis of the PAX8 promoter region was performed in order to determine the origin of the intron-exon transition resulting in the novel variant of PAX8 mRNA. A group of 23 individuals was used to sequence the 5' region of the PAX8 gene (Fig. 3A). Three single nucleotide polymorphism (SNP) sites were confirmed, one located in exon 1 and two in exon 2. Genetic analysis of allele frequency for these SNPs is presented in Fig. 3B. The SNPs are located close to the new insert (two of them in the same exon 2); however, they are not associated with sites designated as transcription factor binding sites. In silico analysis of such binding sites revealed 6 potential sites in the promoter region (Fig. 3, A and C).
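For a biallelic SNP, allele frequencies such as those summarized in Fig. 3B follow directly from genotype counts, since each individual contributes two alleles. The sketch below shows the calculation for a group of 23 individuals; the genotype counts are hypothetical placeholders and do not reproduce the data in Fig. 3B.

```python
def allele_frequencies(n_ref_hom: int, n_het: int, n_alt_hom: int) -> tuple:
    """Return (reference, alternate) allele frequencies from genotype counts."""
    n_individuals = n_ref_hom + n_het + n_alt_hom
    if n_individuals == 0:
        raise ValueError("at least one genotyped individual is required")
    n_alleles = 2 * n_individuals
    ref = (2 * n_ref_hom + n_het) / n_alleles
    return ref, 1.0 - ref


# Hypothetical counts for one SNP in 23 individuals (15 + 6 + 2 = 23).
ref_freq, alt_freq = allele_frequencies(15, 6, 2)
print(f"reference allele: {ref_freq:.3f}, alternate allele: {alt_freq:.3f}")
```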

Figure 1. RT-PCR analysis of PAX8 mRNA with F and R primers. Two main PCR products, 684 bp and 587 bp, were observed in the thyroid hemiagenesis and control thyroids, respectively (A). Various types of cancers demonstrated almost exclusive presence of the 684 bp long PCR product (B). M, DNA ladder (bp).

Figure 2. Insert localization in exon 2 resulting from the alternative 3' acceptor site. (A) Sequence of the novel PAX8 mRNA variant (the sequence of the F and R primers is underlined, the insert sequence is in bold, exon-intron junctions are indicated as slashes "/"). (B) Illustration of alternative splicing of PAX8 mRNA resulting in the 97 nucleotide long extension of exon 2 (INSERT).

Figure 3. Genetic analysis of the 5'-end fragment of the PAX8 gene. (A) Resequencing strategy for sequence determination of the 5'-end fragment. (B) Detection of SNPs in the analyzed DNA sequence. (C) In silico analysis of transcription factor binding sites in the promoter region of the PAX8 gene.