The EGFR rs2233947 polymorphism is associated with lung cancer risk: a study from Jordan

The epidermal growth factor receptor (EGFR) is a tyrosine kinase cell surface protein that plays a role in the process of carcinogenesis. In this study, we investigated the association between EGFR rs2233947 and rs884225 SNPs and the risk of lung cancer. A total of 258 participants (129 lung cancer patients and 129 healthy controls) took part in the study. The restriction fragment length polymorphism-polymerase chain reaction (RFLP-PCR) technique was used to genotype the EGFR SNPs. A strong association was detected between rs2233947 and lung cancer (P<0.01). Compared with the rs2293347 GG genotype, the AA/AG genotypes were associated with a significantly decreased risk of lung cancer (adjusted OR=0.28, 95% confidence interval [CI]=0.13–0.61, P<0.01). EGFR rs2233947 correlated with lung cancer in males, in smokers, and in the squamous cell carcinoma subtype (P<0.01). Haplotype analysis of rs2233947 and rs884225 showed that the AA haplotype was associated with a significantly decreased risk of lung cancer (P<0.01). The data presented in the current study support a protective role for the rs2233947 A allele against the development of lung cancer. This result, however, requires further validation in a larger population.


INTRODUCTION
Lung cancer is among the leading causes of cancer deaths worldwide (McIntyre & Ganti, 2017). It is estimated that 2 million people succumb to lung cancer on an annual basis (Wong et al., 2017). Factors that increase the risk of lung cancer include tobacco use, radon exposure and air pollution (Brennan et al., 2011; Field & Withers, 2012; Young & Hopkins, 2011). For example, exposure to tobacco smoke (active or second-hand) has been shown to account for the majority of lung cancer cases (Steliga & Dresler, 2011). In addition to the above environmental factors, genetic predisposition appears to play a role in determining or modifying lung cancer risk. Indeed, for several genes, it has been shown that a variation in sequence or copy number is implicated in the development of lung cancer or in modifying lung cancer risk. These genes include epidermal growth factor receptor (EGFR), vascular endothelial growth factor, ataxia telangiectasia mutated, the cytochrome P450 family 1 subfamily A member 1, and many others (Farago & Azzoli, 2017; Feng et al., 2014; Hussein et al., 2014; Xu et al., 2017; Yang et al., 2018).
Among the many genes shown to be associated with lung cancer, EGFR has received considerable attention. EGFR is a member of the cell surface receptor tyrosine kinase family (Sato, 2013) and has been shown in several studies to play a pivotal role in the carcinogenesis of several solid tumors (Mitsudomi & Yatabe, 2010). Following stimulation by its ligands, EGFR elicits a downstream cell-signaling cascade which promotes proliferation, invasion, metastasis, angiogenesis, and inhibition of apoptosis (Brand et al., 2013; Takahashi et al., 2015). Several studies have demonstrated that genetic variations in EGFR are involved in determining or modifying the risk of lung cancer development and/or progression (Dubey et al., 2006; Zhang et al., 2006). Additionally, genetic variations in EGFR, including single nucleotide polymorphisms, have been shown to modulate the response of lung cancer patients to a number of therapies, including tyrosine kinase inhibitors, a class of targeted therapies effective against a number of solid malignancies (Bonomi et al., 2018; Jung et al., 2012; Li et al., 2018; X. Liu et al., 2017; Y. Liu et al., 2013; Metro et al., 2018).
In Jordan, lung cancer is the third most common cancer, following breast and colorectal cancers (Ismail et al., 2013). A recent study has shown an association of the EGFR rs712829 and rs2072454 SNPs with the risk of lung cancer among Jordanians (Bashir et al., 2018). In the current investigation, the relationship between two other SNPs in EGFR (rs2233947 and rs884225) and lung cancer risk was studied.

MATERIALS AND METHODS
Participants. The study was case-control in design. Patient recruitment took place at two large tertiary referral centers in Jordan (King Abdullah University Hospital and King Hussein Cancer Center) between January 2016 and December 2017. A total of 129 healthy participants matched to patients by age, sex and smoking status were recruited from the same geographical areas to act as a control group. The study was approved by the Institutional Review Board of Jordan University of Science and Technology (approval number 12/93/2016) and followed the guidelines of the Declaration of Helsinki regarding medical research involving human participants. Written informed consent was obtained from all participants after a full explanation of the study aims and procedures. Patients with primary lung cancer were included in the study regardless of the cancer type or stage. Medical diagnoses were made by a respiratory consultant and were further confirmed by histopathological examination. Patients with secondary lung cancer were excluded from the study (Bashir et al., 2018).
Samples and data collection. A structured questionnaire was used to collect demographic data (e.g. age, gender and toxicant exposure). Clinical data were collected from patients' medical records. For molecular analysis, 3 mL of venous blood was collected from each participant into sterile EDTA tubes.
DNA extraction. DNA was extracted from whole blood using commercially available kits (Puregene Blood Core Kit B, Germantown, MD, USA). The quality and concentration of the isolated DNA were analyzed using a Thermo Scientific NanoDrop ND-2000 UV-Vis Spectrophotometer (Waltham, MA, USA). Samples were stored at -80°C until further use.
The PCR products were digested overnight with the appropriate restriction enzymes (MlyI for rs2233947 and AciI for rs884225; New England Biolabs, Beverly, MA, USA). Digestion reactions were performed according to the manufacturer's recommendations. The digested PCR products were resolved on a 3% agarose gel with ethidium bromide and visualized under UV light using a G:BOX UV with dedicated software (Syngene, Frederick, MD, USA). For rs2233947, digestion of the 240 bp PCR product indicates the presence of the G allele and yields 137 bp and 103 bp fragments. In the case of rs884225, digestion of the 279 bp PCR product indicates the presence of the G allele and yields 152 bp and 127 bp fragments.
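The banding rules described above can be written down as a small lookup, shown here as an illustrative sketch rather than the procedure actually used in the study. The fragment sizes are those stated in the text; the function name `call_genotype`, the `ASSAYS` table and the 5 bp size tolerance are assumptions introduced only for this example.

```python
# Fragment sizes (bp) taken from the text: a digested product indicates the
# G allele, an undigested product the alternative (A) allele.
ASSAYS = {
    "rs2233947": {"uncut": 240, "cut": (137, 103)},
    "rs884225": {"uncut": 279, "cut": (152, 127)},
}


def call_genotype(snp, bands, tolerance=5):
    """Return 'GG', 'AG' or 'AA' from observed band sizes (bp).

    Returns None when the pattern is ambiguous. The tolerance is a
    hypothetical allowance for imprecise size estimates on a gel.
    """
    assay = ASSAYS[snp]

    def seen(size):
        return any(abs(band - size) <= tolerance for band in bands)

    cut_seen = [seen(size) for size in assay["cut"]]
    if any(cut_seen) and not all(cut_seen):
        return None  # only one of the two digestion fragments is visible
    has_cut = all(cut_seen)
    has_uncut = seen(assay["uncut"])
    if has_cut and has_uncut:
        return "AG"  # both digested and undigested product present
    if has_cut:
        return "GG"  # fully digested
    if has_uncut:
        return "AA"  # undigested
    return None  # no informative bands


print(call_genotype("rs2233947", [240, 137, 103]))
```

A pattern showing only one of the two expected digestion fragments is returned as `None` so that incomplete digestion is flagged for repeat analysis instead of being called.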
Statistical analysis. Deviation of the tested SNPs from Hardy-Weinberg equilibrium was evaluated using SNPStats software (http://bioinfo.iconcologia.net/snpstats/start.htm). Differences in the genotypes and alleles of each SNP and in the haplotypes of the two SNPs were analyzed using SNPStats and IBM SPSS Statistics version 20.0 (IBM Co., Armonk, NY, USA). The Chi-square test and the SNPStats co-dominant, dominant, and recessive models were used to analyze the association between the tested polymorphisms and lung cancer. A P value of <0.05 was considered statistically significant.

RESULTS
Table 1 shows the demographics of the study participants. The control group (n=129) was matched to the lung cancer patients on the major environmental risk factors and did not differ significantly from them in gender, age or smoking status (P>0.05). Males accounted for 81.4% of participants. The majority of participants were either ever smokers (83.7%) or ex-smokers (9.3%), with an average of 1.5±0.43 packs per day and a smoking history of more than 20 years.
According to the histopathological diagnosis (Table 2), the lung cancer patients were classified into the following groups: adenocarcinoma (31.8%), squamous cell (38.8%), small cell/large cell (8.5%), and non-small cell cancer of unidentified histology (10.1%). The majority of cases (69.8%) were diagnosed at a late stage (III and IV) and 55.0% had distant metastasis (Table 2).
The genotype and allele frequencies of the EGFR rs2293347 and rs884225 polymorphisms among the patients and controls, and their association with lung cancer risk, are shown in Table 3. The genotype distributions of both SNPs in the control group were in Hardy-Weinberg equilibrium (P>0.05). The frequency of the rs2293347 genotype differed significantly between the patients and controls (GG, AG and AA genotypes: 90%, 8% and 2% in patients versus 82%, 15% and 3% in controls, respectively; Table 3). Thus, the A allele was significantly less frequent in patients with lung cancer than in the controls (6% versus 15%, respectively, P<0.01).
The association between the rs2293347 genotype and the risk of lung cancer was further examined after stratifying the study population according to gender, smoking status, and histological type (Table 4). The association of the rs2293347 genotype with lung cancer risk was maintained in males and ever smokers and was pronounced in patients with squamous cell carcinoma (P<0.01, Table 4).
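For readers unfamiliar with the Hardy-Weinberg check reported above, the following is a minimal sketch of a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. It is not the SNPStats procedure, which may use a different (for example, exact) test, and the function name `hwe_chi_square` and the counts in the usage line are hypothetical illustrations, not study data.

```python
import math


def hwe_chi_square(n_gg, n_ag, n_aa):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns the A allele frequency, the chi-square statistic and its P value.
    """
    n = n_gg + n_ag + n_aa
    p = (2 * n_gg + n_ag) / (2 * n)  # G allele frequency
    q = 1 - p                        # A allele frequency
    observed = (n_gg, n_ag, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value


# Hypothetical genotype counts (GG, AG, AA), chosen only for illustration.
freq_a, chi2, p_value = hwe_chi_square(64, 32, 4)
print(f"A allele frequency={freq_a:.2f}, chi2={chi2:.3f}, P={p_value:.3f}")
```

A P value above 0.05 from such a test is what is meant when genotype distributions are described as being in Hardy-Weinberg equilibrium.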
With respect to rs884225, there was no significant association between this SNP and lung cancer (P>0.05). The distribution of the genotypes was similar in lung cancer patients and in controls (GG, AG and AA genotypes: 2.3%, 17.1% and 80.6% in patients versus 0.8%, 17.8% and 81.4% in controls, respectively; Table 3). Likewise, the allelic distribution of rs884225 was similar between the two groups, with the A allele being the most common (about 90% in both groups, Table 3).
Finally, Table 5 summarizes the haplotype analysis of the EGFR rs2293347 and rs884225 SNPs. The G/A haplotype was the most frequent haplotype in the population (80%), followed by the A/A haplotype (9%). The A/A haplotype was more frequent in controls (13.6%) than in patients with lung cancer (4.9%) and was associated with a significantly decreased risk of lung cancer (adjusted OR=0.37, 95% CI=1.19-2.72, P=0.004, global P<0.05; Table 5).
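As a reminder of how an odds ratio and its confidence interval relate to each other, the sketch below computes a crude (unadjusted) odds ratio with a Woolf-type 95% CI from a 2×2 table. It does not reproduce the adjusted estimates reported here, which come from models that include covariates; the function name `odds_ratio_ci` and the counts in the usage line are hypothetical.

```python
import math


def odds_ratio_ci(carrier_cases, noncarrier_cases,
                  carrier_controls, noncarrier_controls, z=1.96):
    """Crude odds ratio with a log-scale (Woolf) confidence interval.

    All four cell counts must be greater than zero.
    """
    a, b = carrier_cases, noncarrier_cases
    c, d = carrier_controls, noncarrier_controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not study data).
or_, lower, upper = odds_ratio_ci(10, 90, 20, 80)
print(f"OR={or_:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

Because the interval is built symmetrically around the logarithm of the odds ratio, the lower and upper limits computed this way always bracket the point estimate.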

DISCUSSION
The current study addressed the association between the EGFR rs2293347 and rs884225 SNPs and lung cancer. The results showed a strong association between rs2293347 and lung cancer, with the A allele being protective against the development of lung cancer.
The role of the EGFR pathway in the regulation of cellular growth is well documented (Lui & Grandis, 2002). Activation of EGFR has been implicated in cellular proliferation, angiogenesis and survival (Brand et al., 2013; Takeuchi & Ito, 2010). EGFR exerts its functions via dimerization and auto-phosphorylation of the tyrosine kinase domain of the protein. This activates signal transduction pathways that upregulate transcription factors and control expression of downstream genes (Shepard et al., 2008). Somatic mutations in the tyrosine kinase region can lead to constitutive activation of the receptor and the initiation of carcinogenesis. Likewise, certain EGFR genetic variations have also been shown to change EGFR protein levels in the body, including the lungs (Nie et al., 2015), and subsequently modulate cancer susceptibility. In accordance with this, variations in the EGFR gene have been shown to modulate the risk of several types of cancer, including gastric (Torres-Jasso et al., 2015), bladder (Chu et al., 2013), breast (Sobti et al., 2012) and lung cancer (Bashir et al., 2018; Feng et al., 2014; X. Liu et al., 2017).
The results of the present study showed a strong association between EGFR rs2293347 and lung cancer, with the A allele being protective against cancer development. When patients were stratified according to type of lung cancer, the association was found to be prominent in the squamous cell carcinoma subtype. In accordance with this finding, a study conducted in the Korean population showed a strong relationship between the rs2293347 SNP and lung cancer, with the A allele being associated with a significantly decreased risk of lung cancer (Choi et al., 2007). Similarly, the rs2293347 SNP was suggested as a risk factor for the development of gastric cancer. More generally, the clinical significance of the rs2293347 SNP has been reported in several settings. It was found to be associated with the efficacy of tyrosine kinase inhibitors (TKIs) in patients with non-small cell lung cancer (Zhang et al., 2006), and it has been shown to contribute to benign prostatic hyperplasia (Kim et al., 2016) and airway hyperresponsiveness (Yoshikawa & Kanazawa, 2012). Furthermore, the rs2293347 SNP has been shown to be a significant predictor of prolonged progression-free and overall survival in an independent cohort of EGFR mutation-positive lung cancer patients treated with erlotinib (Winther-Larsen et al., 2019). Finally, using a knowledge-based bioinformatic approach, rs2293347 was found to be among the best predictors of TKI response in gastric cancer (Takahashi et al., 2015). Thus, the results of the current study, together with those of others, indicate that the rs2293347 SNP in the EGFR gene plays a clinical role in some malignancies, including lung cancer.
It is worth mentioning that rs2233947 is a synonymous SNP located in exon 25 of the gene and does not result in an amino acid change in the EGFR protein. Therefore, this polymorphism does not directly alter the way in which the EGFR protein mediates its function. However, in some cases, synonymous SNPs can affect gene expression through several mechanisms, including modulation of RNA splicing, RNA stability and interactions with the translational machinery (Lee et al., 2011; Shastry, 2009; Venza et al., 2010). Alternatively, the observed effect of rs2233947 on the risk of lung cancer may be due to linkage disequilibrium between this SNP and an unidentified polymorphism in the EGFR gene (Choi et al., 2007). Further research is required to understand the mechanism by which rs2233947 mediates its clinical effects.
The EGFR rs884225 SNP is located within a predicted binding site for a microRNA (miRNA-214). Therefore, this SNP might be of clinical significance, as it has been shown to be associated with changes in EGFR gene expression (Chu et al., 2013). However, the results of the present study showed a lack of association between rs884225 and lung cancer. This is in agreement with a study that was conducted in a Chinese population and showed no association between rs884225 and nasopharyngeal carcinoma (Liu et al., 2013). In contrast, the GG genotype of rs884225 was found to be associated with a significantly increased risk of bladder cancer (Chu et al., 2013) and with response to tyrosine kinase inhibitors in Chinese lung cancer patients (Ruan et al., 2016). Thus, the effects of the rs884225 SNP might have a population-specific or cancer type-specific component, being affected by the genetic background and/or interactions with environmental factors.
The literature suggests that association tests based on haplotypes provide greater statistical power than tests based on the underlying SNPs (Bader, 2001; Lin & Schaid, 2009). Therefore, the association of lung cancer with haplotypes of rs2233947 and rs884225 was examined, showing that the AA haplotype was associated with a significantly decreased risk of lung cancer. This result supports a protective role of the rs2233947 A allele against the development of lung cancer.
One of the limitations of the current study is that it did not address the relationship between the rs2233947 and rs884225 SNPs of the EGFR gene and the clinical parameters of lung cancer (prognosis, stage, and metastasis). Future studies that include such parameters are strongly recommended.
In conclusion, the rs2233947 SNP is associated with the risk of lung cancer in the Jordanian population. More studies are required to confirm this finding in other populations.