Wayne State University Press
Abstract

N-Acetyltransferase 2 (NAT2) is an important enzyme involved in the metabolism of a wide spectrum of naturally occurring xenobiotics, including therapeutic drugs and common environmental carcinogens. Extensive polymorphism in NAT2 gives rise to a wide interindividual variation in acetylation capacity, which influences individual susceptibility to various drug-induced adverse reactions and cancers. Striking patterns of geographic differentiation have been described for the main slow acetylation variants of the NAT2 gene, suggesting the action of natural selection at this locus. In the present study, we took advantage of whole-genome sequence data available from the 1000 Genomes project to investigate the global patterns of population genetic differentiation at NAT2 and determine whether they are atypical compared with the remaining variation of the genome. The nonsynonymous substitution c.590G>A (rs1799930) defining the slow NAT2*6 haplotype cluster exhibited an unusually low FST value compared with the genome average (FST = 0.006, P = 0.016). It was indicated as the most likely target of a homogenizing process of selection promoting the same allelic variant in globally distributed populations. The rs1799930 A allele has been associated with the slowest acetylation capacity in vivo, and its substantial correlation with the subsistence strategy adopted by past human populations suggests that it may have conferred a selective advantage in populations shifting from foraging to agricultural and pastoral activities in the Neolithic period. Results of neutrality tests further supported an adaptive evolution of the NAT2 gene through either balancing selection or directional selection acting on multiple standing slow acetylation variants.

Key Words

NAT2, Acetylation Polymorphism, Population Differentiation, Natural Selection, Linkage Disequilibrium, rs1799930

The human acetylation polymorphism is one of the oldest and best-characterized pharmacogenetic traits that underlie interindividual and interethnic differences in response to xenobiotics. Acetylation catalyzed by N-acetyltransferase 2 (NAT2) is a major biotransformation pathway for aromatic and heterocyclic amines present in the environment and diet, which can be either detoxified or bioactivated into metabolites that have the potential to cause toxicity and cancer (Butcher et al. 2002; Hein 2002). From the clinical point of view, NAT2 acetylation is increasingly recognized as associated with significant health problems. Many clinically useful drugs are excreted by acetylation, some of them crucial in the treatment of diseases representing worldwide concerns, such as tuberculosis, AIDS-related complex diseases, and hypertension. An individual’s acetylation status has proven to be an important determinant of both the effectiveness of prescribed medications and the development of adverse drug reactions and toxicity during drug treatment (Ladero 2008; Meisel 2002). Moreover, epidemiological studies have associated acetylation phenotype with increased susceptibility to various cancers following exposure to aromatic amine carcinogens (Agúndez 2008; Hein 2006; Sanderson et al. 2007; Selinski et al. 2013).

Figure 1. Distribution of rs1799930 (c.590G>A) allele frequencies in human populations. Allele frequency data for the rs1799930 SNP were collected for 146 worldwide samples by performing an extensive survey of the literature. Description and numbering of samples and retrieved allele frequency data are provided in Supplemental Table S2. Only those samples comprising at least 20 individuals (N = 105) were represented on the map. Six samples could not be localized on the map because of unspecified sampling location (sample 70) or because of divergence between sampling location and region of origin (samples 42, 55, 60, 61, and 102); these samples are displayed in a box beneath the caption.

N-Acetylation activity has been investigated in a wide range of populations, leading to classification of humans into two main phenotypes: fast acetylators, who exhibit the so-called wild-type or normal acetylation activity, and slow acetylators, characterized by decreased enzyme activity. The proportions of rapid and slow acetylators vary remarkably among populations of different ethnic and/or geographic origin (Sabbagh et al. 2011; Walker et al. 2009; Weber and Hein 1985). Depending on the test substrate administered, a trimodal rather than bimodal distribution can be observed, revealing an additional, intermediate phenotype (Cascorbi et al. 1995; Kilbane et al. 1990; Parkin et al. 1997). Moreover, recent results suggest that the slow acetylator phenotype is not homogeneous and that several slow acetylator phenotypes may exist, resulting from allelic heterogeneity and differential functional effects of the slow acetylation alleles (Ruiz et al. 2012; Selinski et al. 2013). A refinement in phenotype inference, notably by the consideration of an “ultra-slow” acetylator category, is advocated to help identify new clinically relevant associations with one or more of these phenotype subcategories.

Acetylation polymorphism arises from allelic variations in the NAT2 gene, which result in the production of arylamine N-acetyltransferase 2 (NAT2) proteins with variable enzyme activity or stability. The NAT2 gene contains two exons with a relatively long intronic region of about 8.6 kb. Exon 1 is very short (100 bp), and the entire protein-coding region is contained within the 870-bp exon 2. Extensive polymorphism has been described in exon 2, with 38 nucleotide variations registered to date (see the NAT database at http://nat.mbg.duth.gr/). Of these, four common nonsynonymous substitutions at positions 191, 341, 590, and 857 are the most studied and characterize the major NAT2 slow haplotype clusters (NAT2*14, NAT2*5, NAT2*6, and NAT2*7, respectively). Individuals who are homozygous or compound heterozygous for two of these low-activity haplotypes are classified as slow acetylators.

NAT2 acetylation has attracted much research interest in evolutionary biology, and several population genetic studies have attempted to clarify the role that slow acetylation could have played in the adaptation of our species (Fuselli et al. 2007; Luca et al. 2008; Magalon et al. 2008; Mortensen et al. 2011; Patin et al. 2006; Sabbagh and Darlu 2006; Sabbagh et al. 2011). The high prevalence of slow acetylators in humans (well above 50% worldwide) is thought to be a consequence of the shift in modes of subsistence and lifestyle in the last 10,000 years, which triggered significant changes in diet and human exposure to xenobiotic compounds. Several surveys of NAT2 sequence variation have indeed provided compelling evidence that at least some of the slow acetylation variants of NAT2 have been driven to present-day frequencies through the action of natural selection (Luca et al. 2008; Magalon et al. 2008; Mortensen et al. 2011; Patin et al. 2006; Sabbagh et al. 2011). The slow acetylation phenotype may thus have been a key adaptation that increased our species’ fitness in response to the transition from foraging to farming.

Striking patterns of geographic differentiation have been described for the major NAT2 slow acetylation variants (García-Martín 2008; Sabbagh et al. 2011). The function of NAT2 in mediating the interactions between humans and their chemical environment, which varies depending on diet and lifestyle, makes it an excellent candidate for population-specific selection pressures. Notably, an unusually high level of population differentiation between East Asians and other populations (FST values ~ 0.40) has been described for the c.341T>C slow acetylation variant (rs1801280), as well as the two linked c.481C>T (rs1799929) and c.803A>G (rs1208) nonfunctional SNPs, compared with an empirical distribution of FST computed across a 400-kb region encompassing the whole human NAT gene family (Sabbagh et al. 2008). In contrast, the slow 590A variant (rs1799930) was found to occur at roughly similar frequencies among widely dispersed populations (Luca et al. 2008; Sabbagh et al. 2011). Figure 1 describes the frequency distribution of this variant in 105 worldwide samples. Such a low level of geographic differentiation may suggest a homogenizing process of natural selection, promoting the same allelic variant in otherwise disparate populations (through either directional or balancing selection). Although many polymorphisms have been described in other regions of the NAT2 gene (Mortensen et al. 2011; Patin et al. 2006), limited data exist on the geographic distribution of these variants in worldwide populations.

In this study, we took advantage of whole-genome sequence data available from the 1000 Genomes (1KG) project to explore global patterns of population genetic differentiation for the whole set of variants occurring in the entire NAT2 gene sequence (~10 kb). An outlier approach was used to determine whether the patterns of geographic differentiation at this locus were atypical compared with those observed for the remaining variation of the genome. Selection tests based on the site frequency spectrum and extended haplotype homozygosity (EHH) were further applied to determine the possible role of natural selection in shaping the atypical patterns observed.

Materials and Methods

Data Retrieval

Whole-genome variation data generated by the 1KG project in 1,089 unrelated individuals were directly downloaded from the 1KG website (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/), using the phase 1 integrated release, version 3, dated April 2012 (1000 Genomes Project Consortium et al. 2012). The 1,089 individuals are drawn from 14 different populations in sub-Saharan Africa, Europe, East Asia, and the Americas (see Table 1). From the obtained .vcf (variant call format) files, we extracted exclusively the low-coverage VQSR (variant quality score recalibrator method) single nucleotide variant (SNV) calls to avoid any bias that might result from differences between low-coverage whole-genome calls and high-coverage exome SNV calls. Indels were not used. Functional annotation of the 36,382,866 SNVs retrieved was performed using classification from the dbSNP database, build 137. SNVs were assigned to two main classes: genic and nongenic. Genic SNVs were further classified as intronic, 5’-UTR, 3’-UTR, coding synonymous, coding non-synonymous, or splice site.

Population Genetic Differentiation

Global levels of population genetic differentiation at NAT2 (chr8: 18248755–18258723 in the human GRCh37/hg19 assembly) were evaluated by using the fixation index FST (Wright 1951), which quantifies the proportion of genetic variance explained by allele frequency differences among populations. FST ranges from 0 (for genetically identical populations) to 1 (for completely differentiated populations). FST scores were computed for all NAT2 SNVs occurring with a minor allele frequency (MAF) ≥ 0.05 in at least one of the 14 1KG populations, using the BioPerl module PopGen (Stajich et al. 2002). Extreme values of FST can result from natural selection but also from nonselective events linked to the demography of populations, such as genetic drift. Because such nonselective processes randomly act on the genome, they are expected to have the same average effect across the genome, in contrast to natural selection, which impacts population differentiation in a locus-specific manner. The genome-wide variation data provided by the 1KG project can thus be used to infer the action of natural selection by adopting an outlier approach (Kelley et al. 2006). For that purpose, we built eight empirical distributions of the FST statistic by considering different subsets of SNVs defined according to their physical location and/or functional impact. To obtain distributions of likely independent observations, a linkage disequilibrium (LD)-based pruning procedure was applied to each of these eight subsets using Plink (Purcell et al. 2007) with default parameters (pruning based on a variance inflation factor of at least 2 within each sliding window of 50 SNVs with a step of five SNVs). This resulted in a total of 25,532,386 independent autosomal SNVs included in the genome-wide empirical distribution. The corresponding numbers for the subset distributions were 15,141,160 nongenic, 11,282,100 genic, 10,477,050 intronic, 24,395 5’-UTR, 198,718 3’-UTR, 107,644 coding synonymous, 146,572 coding nonsynonymous, and 1,912 splice-site SNVs. These eight distributions were then used as reference to assess whether the patterns of genetic differentiation observed at NAT2 are atypical. Empirical P-values were estimated as the proportion of FST scores in the empirical distribution that are either higher (diversifying selection) or lower (homogenizing selection) than the value observed at the locus of interest. Since FST strongly correlates with heterozygosity (Barreiro et al. 2008; Beaumont and Nichols 1996; Elhaik 2012), empirical P-values were calculated within bins of SNVs grouped according to their global MAF. A total of 27 bins were considered for the whole MAF range: ten bins of size 0.001 for MAF = 0–0.01, nine bins of size 0.01 for MAF = 0.01–0.10, and eight bins of size 0.05 for MAF = 0.10–0.50.
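The outlier approach described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study (FST scores were computed with the BioPerl PopGen module); the unweighted Wright estimator, the function names, and the handling of bin boundaries are all assumptions made for illustration. Only the 27-bin MAF layout and the strict "higher or lower than the observed value" definition of the empirical P-value are taken from the text.

```python
from bisect import bisect_left

def wright_fst(freqs):
    """Unweighted Wright's FST = (HT - HS) / HT for one biallelic SNV.

    freqs: allele frequency of the same allele in each population.
    Assumption: equal population weights, no sample-size correction.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                         # monomorphic everywhere
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-population
    return (h_t - h_s) / h_t

# Upper edges of the 27 MAF bins: 10 of width 0.001, 9 of width 0.01, 8 of width 0.05.
BIN_EDGES = (
    [round(0.001 * i, 3) for i in range(1, 11)]
    + [round(0.01 * i, 2) for i in range(2, 11)]
    + [round(0.10 + 0.05 * i, 2) for i in range(1, 9)]
)

def maf_bin(maf):
    """Index (0-26) of the MAF bin; each bin includes its upper edge (assumption)."""
    return min(bisect_left(BIN_EDGES, maf), len(BIN_EDGES) - 1)

def empirical_p(observed, reference_scores, tail="lower"):
    """Proportion of reference scores strictly lower (or higher) than the observed one.

    reference_scores should hold only the scores of SNVs from the same MAF bin.
    """
    if tail == "lower":
        hits = sum(1 for s in reference_scores if s < observed)
    else:
        hits = sum(1 for s in reference_scores if s > observed)
    return hits / len(reference_scores)
```

In this sketch, a SNV with a global MAF of 0.246 falls in the 0.20–0.25 bin, and its lower-tail P-value would be obtained by passing only the reference FST scores from that bin to `empirical_p`.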

Selection Tests

To determine whether natural selection has played a role in the unusual patterns of geographic differentiation disclosed, we used two complementary approaches based on the allele frequency spectrum of segregating sites and on the local haplotype structure. Tajima’s D (Tajima 1989) is a classical neutrality test that compares estimates of the number of segregating sites and the average number of pairwise differences between nucleotide sequences (π). A zero value of the test statistic D is expected under the null hypothesis of selective neutrality, a positive D indicates balancing selection, and a negative value indicates directional selection. Tajima’s D scores were computed across the whole NAT2 coding region by using a sliding window approach with a window size of 1 kb and a step size of 100 bp. Statistical significance of the test statistic was assessed using an empirical approach. From the genome-wide data available from the 1KG project, we selected a set of unlinked noncoding regions expected to be mostly neutrally evolving. A total of 100 autosomal regions of 1 kb were selected that met the following criteria: (a) at least 100 kb away from any known or predicted genes or expressed sequence tag or region transcribed into mRNA; (b) outside any segmental duplication or region transcribed into a long noncoding RNA or conserved noncoding element, as defined in Woolfe et al. (2007); (c) distant from each other by at least 100 kb and not in LD with each other; and (d) containing a number of SNVs equal to the mean number of SNVs included in the 1-kb sliding windows spanning the NAT2 coding region. Tajima’s D scores were computed for these 100 regions so as to obtain the null (neutral) distribution of the test statistic in each population sample. 
An empirical P-value was estimated at each sliding window position within NAT2 by considering the proportion of regions showing a test statistic higher (excess of intermediate-frequency variants) or lower (excess of low-frequency variants) than the value observed at that specific position.
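As a rough illustration of the sliding-window computation, the sketch below implements the standard Tajima (1989) formula from per-site allele counts. The function names and the input layout (a list of `(position, allele_count)` pairs) are hypothetical, and the software actually used in the study is not specified in the text; only the 1-kb window and 100-bp step are taken from it.

```python
from math import sqrt

def tajimas_d(allele_counts, n):
    """Tajima's D for one window.

    allele_counts: count of one allele at each site among n sequences.
    Sites that are not segregating (count 0 or n) are ignored.
    Returns None when the statistic is undefined.
    """
    counts = [k for k in allele_counts if 0 < k < n]
    s = len(counts)                        # number of segregating sites
    if s == 0 or n < 4:
        return None
    # Average number of pairwise differences (pi), summed over sites
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

def sliding_tajimas_d(sites, n, start, end, window=1000, step=100):
    """Tajima's D in windows of `window` bp moved by `step` bp.

    sites: list of (position, allele_count) pairs (hypothetical layout).
    Returns a list of (window_start, D) pairs.
    """
    results = []
    pos = start
    while pos + window <= end:
        counts = [k for p, k in sites if pos <= p < pos + window]
        results.append((pos, tajimas_d(counts, n)))
        pos += step
    return results
```

A window dominated by singletons yields a negative D (excess of low-frequency variants), whereas a window of intermediate-frequency variants yields a positive D, matching the interpretation given above.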

We next used methods based on the extended haplotype homozygosity (EHH) measure, that is, the sharing of identical alleles across relatively long distances by most haplotypes in a population sample (Sabeti et al. 2002). We calculated the integrated haplotype score (iHS) (Voight et al. 2006), which compares the rate of EHH decay observed for both the derived and ancestral allele at each core SNV. An extremely positive or negative value at the core SNV provides evidence of positive selection with unusually long haplotypes carrying the ancestral or the derived allele, respectively. The raw iHS scores were computed for all NAT2 SNVs using the iHS option implemented in WHAMM! software (Voight et al. 2006), which we slightly modified to speed up computation times: thresholds for EHH decay were modified from 0.25 to 0.15, and the size of the analyzed region was set to 0.2 Mb instead of 2.5 Mb. Information on ancestral allele state was obtained from a four-way alignment of human, chimpanzee, orangutan, and rhesus macaque species, provided by the 1KG consortium (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/supporting/ancestral_alignments/).
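To make the EHH measure concrete, the following sketch computes EHH decay from a core SNV for the haplotypes carrying a given core allele, and integrates the curve until it drops below a cutoff. It is a simplified illustration and not the WHAMM! implementation: the function names, the string representation of haplotypes, the trapezoidal integration, and the way the 0.15 cutoff is applied are assumptions.

```python
from collections import Counter

def ehh_decay(haplotypes, core, allele, direction=1):
    """EHH at increasing distance from the core SNV.

    haplotypes: equal-length sequences of alleles (e.g. strings of '0'/'1').
    core: index of the core SNV; allele: core allele defining the carriers.
    direction: +1 to walk downstream, -1 upstream.
    EHH is the fraction of carrier pairs identical from the core to each marker.
    """
    carriers = [h for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return []
    total_pairs = n * (n - 1) / 2
    values = []
    idx = core
    while 0 <= idx < len(haplotypes[0]):
        lo, hi = min(core, idx), max(core, idx)
        groups = Counter(tuple(h[lo:hi + 1]) for h in carriers)
        identical_pairs = sum(c * (c - 1) / 2 for c in groups.values())
        values.append(identical_pairs / total_pairs)
        idx += direction
    return values

def integrated_ehh(values, positions, cutoff=0.15):
    """Trapezoidal area under the EHH curve, stopping once EHH falls below cutoff.

    positions: genetic or physical position of each marker in `values`.
    """
    area = 0.0
    for i in range(1, len(values)):
        if values[i] < cutoff:
            break
        area += 0.5 * (values[i - 1] + values[i]) * abs(positions[i] - positions[i - 1])
    return area
```

Under these assumptions, a raw iHS would be the logarithm of the ratio between the integrated EHH of the ancestral and derived alleles, summed over both directions from the core SNV.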

We also applied a cross-population test by computing the cross-population EHH (XP-EHH) statistic (Sabeti et al. 2007), which compares the integrated EHH computed in a test population with that of a reference population. XP-EHH scores were computed using the same EHH decay parameters and window size as for iHS. The Yoruba (YRI) sample was used as a reference for samples outside Africa, and the Utah residents of European ancestry (CEU) as a reference for African samples.

iHS and XP-EHH scores were also computed for all available SNVs in the 100 neutral regions described above, thus providing a reference distribution for each test statistic to estimate empirical P-values. Raw scores of iHS and XP-EHH in NAT2 were standardized in bins of derived allele frequency (step size of 0.05) using the corresponding distribution of each statistic. For both iHS and XP-EHH, we used the 1KG phased data from the integrated phase 1 release, version 3 (April 2012), and genetic distances were obtained from the high-density genetic combined map based on 1KG pilot 1 data.
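The frequency-binned standardization can be sketched as follows. This is one plausible reading of the procedure, in which each raw score is converted to a z-score using the mean and standard deviation of the neutral-region scores falling in the same derived allele frequency (DAF) bin; the function names, the input layout, and the use of the sample standard deviation are assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev

def daf_bin(daf, step=0.05):
    """Index of the derived-allele-frequency bin (20 bins for step 0.05)."""
    n_bins = int(round(1 / step))
    # Small epsilon guards against floating-point error at bin boundaries
    return min(int(daf / step + 1e-9), n_bins - 1)

def standardize(raw, reference, step=0.05):
    """Z-scores of raw (daf, score) pairs against reference (daf, score) pairs.

    Returns None for a SNV whose bin holds fewer than two reference scores
    or has zero variance.
    """
    by_bin = defaultdict(list)
    for daf, score in reference:
        by_bin[daf_bin(daf, step)].append(score)
    standardized = []
    for daf, score in raw:
        ref = by_bin.get(daf_bin(daf, step), [])
        if len(ref) < 2 or stdev(ref) == 0:
            standardized.append(None)
        else:
            standardized.append((score - mean(ref)) / stdev(ref))
    return standardized
```

Binning by derived allele frequency matters because the expected haplotype length around an allele depends on its frequency, so raw scores are only comparable among SNVs of similar frequency.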

Estimation of the Age of the Derived Allele at rs1799930

The age of the derived A allele at SNP rs1799930 was estimated using the maximum likelihood method implemented in the Estiage software (Génin et al. 2004). This method was originally developed to estimate the age of the most recent common ancestor of a rare mutation involved in rare diseases using microsatellite data. It is based on the length of ancestral haplotype segments around the mutation shared by mutation carriers. Haplotypes in a 20-kb region centered around rs1799930 were obtained from the phased 1KG data (see Supplemental Table S1). Only SNVs were considered, and their genetic positions were obtained from the genetic maps provided on the Beagle website (http://bochet.gcc.biostat.washington.edu/beagle/genetic_maps/). Only one SNV every 0.01 cM was kept in the final analysis with Estiage. Age estimates were derived in each population separately. The mutation rate is a key parameter in Estiage that accounts for the fact that the most proximal marker where a different allele from the ancestral haplotype is observed in an individual might in fact indicate not a recombination but a mutation or a genotyping error. For SNVs, mutation rates are assumed to be very low, on the order of 10⁻⁶, but genotyping errors can be relatively high. Therefore, analyses were performed using mutation rate values between 10⁻⁶ and 10⁻⁴ per marker per individual per generation. Similar age estimations were obtained with the different mutation rates, and results are presented for the lower and upper bound values (Table 1). Population allele frequencies at the different markers were estimated on the haplotypes not carrying the A alleles at rs1799930.

In Silico Prediction of SNV’s Functional Effects

The F-SNP method (Lee and Shatkay 2008; see http://compbio.cs.queensu.ca/F-SNP/) was applied to assess the potential functional effect of SNVs. This integrative scoring method combines assessments from 16 independent computational tools and databases, using a probabilistic framework that takes into account both the certainty of each prediction and the reliability of the different tools, depending on the physical and functional annotation of the specific variant tested. It provides a functional significance (FS) score that quantitatively measures the possible deleterious effect of the tested SNV at the splicing, transcriptional, translational, and posttranslational levels. An FS score of 0.5 is considered the cutoff point for predicting a deleterious effect (Lee and Shatkay 2009).

Results and Discussion

Global patterns of population genetic differentiation were examined in the genomic region spanning the entire NAT2 gene (~10 kb), using the FST statistic (Wright 1951) and the sequence variation data provided by the 1KG project (1000 Genomes Project Consortium et al. 2012). P-values were estimated from empirical distributions built from the background genomic variation (see Materials and Methods). We assigned to each genetic variant of the NAT2 gene a main P-value derived from the genome-wide empirical distribution and a subset P-value derived from the distribution including the subset of SNVs having a similar location and/or functional impact as the SNV of interest (i.e., nongenic, intronic, 5’-UTR, 3’-UTR, coding synonymous, coding nonsynonymous, and splice site).

No SNVs in NAT2 exhibited significantly high FST values compared with the genomic background, when considering the global differentiation among the 14 worldwide populations from 1KG (all P > 0.05). No significant P-values were observed when contrasting East Asia to the rest of the world (data not shown). Although high FST scores were observed for the three SNVs c.341T>C (rs1801280), c.481C>T (rs1799929), and c.803A>G (rs1208) in this specific pairwise comparison (FST ≈ 0.30), they could not be considered atypical compared with the rest of the genome (P-values = 0.06–0.09). This contrasts with previous findings indicating an atypical pattern of differentiation for these three variants when considering HapMap data and only the set of variants located within a 400-kb region surrounding the NAT2 gene as a reference distribution (Sabbagh et al. 2008). This difference may be due to the different set of populations surveyed or to the more accurate empirical distribution used to represent the background genomic variation in the present study.

Table 1. Estimation of the Age of the Derived Allele at rs1799930

In contrast, five NAT2 SNVs exhibited unusually low FST values compared with the genome average, with both main and subset P-values below 0.05 (Figure 2). Four of them (rs6984200, rs2087852, rs11996129, and rs1112005) are located in the intronic region of NAT2, whereas the fifth one (rs1799930) is a nonsynonymous substitution defining the NAT2*6 slow haplotype cluster (c.590G>A resulting in R197Q). Note that the four intronic SNVs are in high LD with the rs1799930 variant (r² = 0.80–0.88; Table 2). They all occur at high frequencies in the global human population (within the 0.23–0.26 MAF range). Such low levels of population genetic differentiation suggest that at least one of these polymorphisms may be subject to balancing or species-wide directional selection, with rs1799930 being the most likely target given its location in the gene and its functional impact.
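For readers unfamiliar with the LD measure quoted here, the squared correlation between two biallelic SNVs can be computed from phased haplotypes as in the sketch below. The function name and the 0/1 allele coding are assumptions for illustration; the text does not state which software produced the reported LD values.

```python
def ld_r2(alleles_a, alleles_b):
    """Squared correlation (r^2) between two biallelic SNVs.

    alleles_a, alleles_b: 0/1 alleles carried by the same phased haplotypes,
    in the same order. Returns None if either SNV is monomorphic.
    """
    n = len(alleles_a)
    p_a = sum(alleles_a) / n
    p_b = sum(alleles_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNVs
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                   # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom
```

Two SNVs whose alleles always co-occur on the same haplotypes give a value of 1, and statistically independent SNVs give a value of 0.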

To determine whether another putative candidate in the genomic region surrounding NAT2 might explain the patterns observed, through a significant LD with these variants, we extended the analysis to a 600-kb region centered on the human NAT multigene family on chromosome 8 (Figure 3). All the variants exhibiting an r² value above the 0.10 threshold with the rs1799930 SNV are located within a 56-kb region (chr8: 18229877–18285763 in hg19) in the direct vicinity of the NAT2 gene (Figure 3A), making the involvement of another gene in the region unlikely. Table 2 provides the list of variants that show a significantly low interpopulation FST value for either the main or subset P-value and are in moderate to strong LD with the rs1799930 variant (r² > 0.50). They are all either intergenic (located up to ~3 kb upstream or 22 kb downstream of the NAT2 gene) or intronic to NAT2. The prediction of their functional impact with the F-SNP method revealed high FS scores for some of them (FS = 0.50), denoting a potentially deleterious effect by affecting either the transcriptional or splicing regulation. However, the highest FS score was observed for the rs1799930 polymorphism (0.87), which also displayed the lowest subset P-value (P = 0.016; Table 2). Altogether, these results point to the rs1799930 polymorphism as the most likely target of homogenizing selection in the genomic region surveyed.

Table 2. List of Variants in the 600-kb Region Surrounding the NAT2 Gene with Significantly Low FST Values and in High Linkage Disequilibrium with rs1799930

Figure 2. Distribution of –log10 (P-values) of interpopulation FST scores across the human NAT2 gene. FST scores were computed among the 14 worldwide populations from the 1000 Genomes project. P-values were estimated from the lower tail of the empirical distributions. (A) Main P-value derived from the genome-wide empirical distribution (including 25,532,386 SNVs). (B) Subset P-value derived from the empirical distribution, including the subset of SNVs having a similar location and/or functional impact as the SNV of interest (i.e., intronic, 3’-UTR, coding synonymous, and coding nonsynonymous). The red dotted line indicates the 0.05 significance threshold.

Figure 3. Distribution of –log10 (P-values) of interpopulation FST scores across a 600-kb region centered on the human NAT gene family for those variants in linkage disequilibrium (r² > 0.10) with the rs1799930 polymorphism. (A) Level of linkage disequilibrium, as measured by the r² statistic (Hill and Robertson 1968), with the rs1799930 genetic variant. (B) Main P-value estimated from the lower tail of the genome-wide empirical distribution (including 25,532,386 SNVs). (C) Subset P-value estimated from the lower tail of the empirical distribution including the subset of SNVs having a similar location and/or functional impact as the SNV of interest (i.e., nongenic, intronic, 5’-UTR, 3’-UTR, coding synonymous, coding nonsynonymous, and splice site). The red dotted line indicates the 0.05 significance threshold. SNVs are displayed in different colors according to their r² value with rs1799930, ranging from dark blue (r² = 0.10) to dark red (r² = 1.0). The rs1799930 polymorphism is represented as a black triangle. Coding genes and pseudogenes in the 600-kb region are represented below as dark gray and light gray boxes, respectively. The genomic position (in megabases) on chromosome 8 is indicated on the horizontal axis (human GRCh37/hg19 assembly).

Interestingly, recent evidence suggests that the NAT2*6 haplotype cluster (characterized by the rs1799930 A slow acetylation allele) is associated with the slowest acetylation capacity in vivo and that the homozygous genotype NAT2*6/*6 thus defines a new category of “ultra-slow” acetylators (Ruiz et al. 2012; Selinski et al. 2013). Ultra-slow acetylators have about 30% lower activities for caffeine metabolism compared with other slow acetylators. This is of the same order of magnitude as the reduction in enzyme activity between rapid and intermediate acetylators (Ruiz et al. 2012). These findings are consistent with a previous study by Cascorbi et al. (1995) that demonstrated a markedly decreased NAT2 activity in vivo in NAT2*6/*6 compared with NAT2*5/*5 genotypes. Indirect evidence is also provided by clinical association studies related to both drug toxicity and cancer risk. Anti-tuberculosis drug-induced hepatotoxicity risk has been shown to be particularly high in carriers of the NAT2*6/*6 genotype (An et al. 2012; Huang et al. 2002; Lee et al. 2010; Leiro-Fernandez et al. 2011; Selinski et al. 2014; Teixeira et al. 2011). Similarly, the ultra-slow genotype, and not the common slow NAT2 genotype, has been significantly associated with an increased risk of urinary bladder cancer (Selinski et al. 2013). Altogether, both in vivo phenotyping studies and clinical reports suggest that the NAT2*6 variant is likely to be associated with a specific acetylation phenotype. This could explain why this particular NAT2 slow acetylation variant, and not another one, may have been a specific target of natural selection.

The rs1799930 A allele is distributed highly homogeneously across worldwide populations (global frequency of 0.246), resulting in an unusually low FST value (interpopulation FST = 0.006). Nevertheless, a threefold lower frequency of this allele has been reported in hunter-gatherers (~0.08) compared with agriculturalists and pastoralists (~0.25) in a comprehensive survey of human NAT2 variation comprising 128 population samples classified according to their major subsistence strategy (Sabbagh et al. 2011). This significant difference in the frequency of NAT2*6 alleles (P < 0.0001) was identified as the main genetic cause of the higher prevalence of the slow acetylation phenotype in populations practicing farming and herding compared with those relying mostly on hunting and gathering (46% vs. 22%, respectively) (Sabbagh et al. 2011). Given this marked correlation between the rs1799930 A allele and the subsistence strategy adopted by past populations in the last 10,000 years, it has been suggested that this slow acetylation allele may have conferred a selective advantage in populations shifting from foraging to agricultural and pastoral activities in the Neolithic period. New or more concentrated NAT2 substrates introduced into the chemical environment of food-producing communities are likely to have favored a slower acetylation rate in these populations. This hypothesis is further supported by the age estimation of the rs1799930 A allele in the 1KG populations provided by a maximum-likelihood method implemented in the Estiage software (Génin et al. 2004; Table 1). Our estimations showed that this allele started to increase in frequency at similar and recent times in all populations, less than 10,000 years ago (except in Japanese, where estimations are just over 10,000 years ago), thereby supporting a global expansion since the emergence of agriculture in the Neolithic. Consequently, the markedly low level of population differentiation observed at the rs1799930 locus may result from the convergent selection of the rs1799930 A allele in agriculturalist and pastoralist populations that are now present in most parts of the world.
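
As a purely illustrative aside, the link between allele frequency differences and FST can be sketched for a single biallelic locus. The Python sketch below uses Wright's heterozygosity-based definition, FST = (HT − HS)/HT, and assumes equally weighted populations; it is not necessarily the estimator applied to the 1KG data in this study, and the function name `wright_fst` is hypothetical. The example frequencies are the approximate values quoted above for hunter-gatherers (~0.08) and food producers (~0.25), treated here as two populations for the sake of the example.

```python
def wright_fst(freqs):
    """Wright's FST for one biallelic locus, equal population weights.

    freqs: allele frequencies (one per population) of the same allele.
    Illustrative only; sample-size corrections are deliberately ignored.
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall; no differentiation to measure
    return (h_total - h_within) / h_total


# Approximate rs1799930 A frequencies quoted in the text, used as a toy input.
fst_subsistence = wright_fst([0.08, 0.25])
```

Identical frequencies in all populations give a value of zero, and the value grows with the variance in allele frequency among populations.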

Several lines of evidence support the hypothesis that the rs1799930 G>A nonsynonymous substitution (R197Q) has specifically occurred in the human lineage. First, the NAT2 197R residue appears to be highly conserved throughout primate evolution, with 100% of the orthologous NAT2 sequences generated in 19 distinct simian species harboring an arginine (R) at this position (Sabbagh et al. 2013). Second, the 197R position was found to be monomorphic in 103 individuals from six great ape species (Pan troglodytes, Pan paniscus, Gorilla beringei, Gorilla gorilla, Pongo abelii, Pongo pygmaeus) (Prado-Martinez et al. 2013; E. S. Poloni, unpublished observations), as well as in 28 rhesus monkeys (Macaca mulatta) fully sequenced for the NAT2 gene (A. Sabbagh, unpublished observations), making the R197Q polymorphism a specific feature of the human lineage. The hypothesis of a transspecies polymorphism maintained for several million years, through shared balancing selection pressures, therefore seems unlikely.

Assuming that the rs1799930 A allele has conferred a selective advantage to populations shifting from food collection to farming and animal breeding in the Holocene, this could have happened through either directional selection or balancing selection. A gene-dose effect has indeed been described for this variant, with a significant trend toward a slower acetylation capacity in individuals carrying an increasing number of NAT2*6 haplotypes (0, 1, or 2) (Ruiz et al. 2012; Selinski et al. 2013). Therefore, individuals heterozygous for this allele display an intermediate metabolic phenotype that may have been advantageous if one considers the competing needs of both maintaining an efficient detoxification of harmful xenobiotics and avoiding the damaging effects of the putative carcinogens that can be activated through NAT2 acetylation. In an attempt to provide further insights into the evolutionary mechanisms that might have driven and maintained the rs1799930 A allele at high frequencies in most human populations worldwide, we carried out several tests of selective neutrality based on the allele frequency spectrum (Tajima’s D; Tajima 1989) and on haplotype structure (iHS, Voight et al. 2006; XP-EHH, Sabeti et al. 2007). An empirical approach using sequence variation data from 100 unlinked noncoding regions was adopted to assess statistical significance.
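
For readers unfamiliar with the frequency-spectrum test mentioned above, the following Python sketch computes Tajima's D (Tajima 1989) from derived-allele counts at segregating sites, together with a simple upper-tail empirical P-value against a set of reference values. It is a minimal illustration, not the pipeline used in this study: the function names, the input format, and the upper-tail convention are assumptions, and the toy counts and reference values are made up.

```python
import math


def tajimas_d(derived_counts, n):
    """Tajima's D from derived-allele counts at segregating sites.

    derived_counts: one count k (0 < k < n) per segregating site.
    n: number of sampled chromosomes (assumed n >= 3).
    """
    if n < 3:
        raise ValueError("need at least 3 chromosomes")
    S = len(derived_counts)
    if S == 0:
        return float("nan")  # D is undefined without segregating sites

    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in derived_counts)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between pi and Watterson's estimator, scaled by its SD.
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


def empirical_p_upper(observed, reference_values):
    """Fraction of reference values at least as large as the observed one."""
    hits = sum(1 for v in reference_values if v >= observed)
    return hits / len(reference_values)


# Hypothetical toy data: 10 chromosomes, 5 segregating sites.
d_intermediate = tajimas_d([5, 5, 5, 5, 5], n=10)  # intermediate-frequency alleles
d_rare = tajimas_d([1, 1, 1, 1, 1], n=10)          # singletons only
# Made-up reference D values standing in for neutral regions.
p_upper = empirical_p_upper(d_intermediate, [-1.2, -0.4, 0.1, 0.8, 3.0])
```

Positive values arise when intermediate-frequency variants are in excess, and negative values when rare variants are in excess.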

Table 3. Results of Selection Tests for the NAT2 rs1799930 Polymorphism

None of the iHS and XP-EHH scores computed for the rs1799930 SNV in the individual populations from 1KG was significant at the 0.05 threshold (Table 3). This argues against a clear signal of positive selection for this variant such as the one expected under a “hard sweep” model, which assumes the rapid fixation of a single newly arisen advantageous mutation (Pritchard et al. 2010). In contrast, we found significant Tajima’s D scores in the 1-kb regions encompassing the rs1799930 variant in five population samples: British, Finnish, Tuscans, Utah residents of European ancestry, and Puerto Ricans (P < 0.05; Tables 3, 4). We acknowledge that these results become nonsignificant when a correction for multiple testing is applied, but we also note that the ratio of five significant tests out of 14 is higher than the expected 5% proportion of false positives. Nonsignificant scores with P-values close to the 5% threshold were observed in two additional samples (Colombians, P = 0.07; Luhya, P = 0.08). Furthermore, although nonsignificant scores prevent rejection of the null hypothesis of selective neutrality, it is noteworthy that all populations tested but one (Japanese) gave positive Tajima’s D values, suggesting a trend toward an excess of intermediate-frequency variants compatible with the action of balancing selection. Such consistent results for populations with different demographic pasts make it unlikely that they are due to demography rather than to balancing selection. This is also in agreement with previous findings demonstrating globally positive and significant Tajima’s D values in different continental populations and the absence of any signature of positive selection, as detected by EHH-based tests (Fuselli et al. 2007; Luca et al. 2008; Magalon et al. 2008; Mortensen et al. 2011). A notable exception concerns the c.341T>C slow acetylation variant, for which a selective sweep was detected in western and central Eurasian populations (Patin et al. 2006) with the long-range haplotype test (Sabeti et al. 2002). We did not confirm such a signature of positive selection at this locus in any of the 14 populations from 1KG (both iHS and XP-EHH scores not significant at the 0.05 level). No significant scores were observed for any of the other slow acetylation variants (c.191G>A, c.341T>C, and c.857G>A; data not shown). Therefore, patterns of diversity at NAT2 seem compatible with either balancing selection or a more complex model of “multiallelic” directional selection where different slow variants of NAT2 may have simultaneously become targets of directional selection, thereby generating an excess of intermediate-frequency alleles. This would explain why our conventional tests of selection based on EHH, more suited to detect classical selective sweeps, failed to detect a signature of positive selection at the rs1799930 locus. The signature of selection at this individual position could have been weakened by the global increase in frequency of other NAT2-altering mutations. Note, however, that contrary to c.191G>A (NAT2*14), c.341T>C (NAT2*5), and c.857G>A (NAT2*7), which mainly cluster in specific continental regions (sub-Saharan Africa, Europe, and Asia, respectively; Sabbagh et al. 2011), the cosmopolitan distribution of the c.590G>A variant (NAT2*6) suggests that it may have been positively selected in globally distributed food-producing communities. Finally, the hypotheses of balancing and directional selection are not mutually exclusive, and multiple modes of selection may have operated at the NAT2 locus on a population-specific basis, as previously suggested (Mortensen et al. 2011).
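
The comparison of five significant tests out of 14 with the nominal 5% false-positive rate can be made explicit with a binomial tail probability. The Python sketch below assumes that the 14 tests are independent, which is unlikely to hold strictly because the populations share ancestry; the result is therefore a rough guide only, and the function name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_tests = 14        # populations tested (from the text)
n_significant = 5   # tests with P < 0.05 (from the text)
alpha = 0.05        # nominal significance threshold

# Expected number of false positives if every null hypothesis were true.
expected_false_positives = n_tests * alpha
# Chance of seeing five or more nominally significant tests under independence.
tail = binomial_upper_tail(n_significant, n_tests, alpha)
```

Under these assumptions fewer than one false positive is expected among 14 tests, so five nominally significant results would be improbable by chance alone.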

Table 4. Tajima’s D Scores in the 1-kb Sliding Windows Spanning the Whole NAT2 Coding Region

Conclusion

We have described an atypical pattern of geographic differentiation for five genetic variants of the NAT2 gene: the functional rs1799930 SNP defining the slow NAT2*6 haplotype series and four intronic SNPs in high LD with it. An extended analysis of a 600-kb region surrounding NAT2 pointed to the rs1799930 polymorphism as the most likely target of a homogenizing process of natural selection promoting the same allelic variant in most human populations, resulting in an unusually low FST value (FST = 0.006). The rs1799930 A allele has been associated with the slowest acetylation capacity in vivo and is much more frequent in agriculturalists and pastoralists than in hunter-gatherers, suggesting that it may have been positively selected in food-producing communities that are now present in most parts of the world. Neutrality tests based on the allele frequency spectrum revealed a trend toward an excess of intermediate-frequency variants at NAT2, compatible with either balancing selection or a more complex model of multiallelic directional selection. Our findings provide further insights into the functional importance of the rs1799930 polymorphism and the role it may have played in human adaptation to fluctuating xenobiotic environments.

B. Patillon
IRD UMR216, Mère et enfant face aux infections tropicales, Paris, France.
PRES Sorbonne Paris Cité, Université Paris Descartes, Faculté de Pharmacie, Paris, France.
Université Paris Sud, Kremlin-Bicêtre, France.
P. Luisi
Institute of Evolutionary Biology, CEXS-UPF-PRBB, Barcelona, Spain.
E. S. Poloni
Laboratory of Anthropology, Genetics and Peopling History, Anthropology Unit, Department of Genetics and Evolution, University of Geneva, Geneva, Switzerland.
S. Boukouvala
Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, Greece.
P. Darlu
CNRS UMR7206, Muséum National d’Histoire Naturelle, Université Paris Diderot, Paris, France.
E. Genin
INSERM U1078, Université de Bretagne Occidentale, Etablissement Français du Sang-Bretagne, Brest, France.
These authors contributed equally to the work and share senior authorship.
A. Sabbagh
IRD UMR216, Mère et enfant face aux infections tropicales, Paris, France.
PRES Sorbonne Paris Cité, Université Paris Descartes, Faculté de Pharmacie, Paris, France.
These authors contributed equally to the work and share senior authorship.
Correspondence to: Audrey Sabbagh, UMR 216 IRD – Université Paris Descartes, Faculté de Pharmacie, 4 avenue de l’Observatoire, 75270 Paris Cedex 06, France. E-mail: audrey.sabbagh@ird.fr.
Received 10 June 2014; revision accepted for publication 12 November 2014.

acknowledgments

This work was financially supported by the Institut Médicament Toxicologie Chimie Environnement. B.P. is supported by a Ph.D. fellowship from the doctoral program in Public Health from Paris Sud University. P.L. is supported by a Ph.D. fellowship from “Acción Estratégica de Salud, en el Marco del Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica 2008–2011” from Instituto de Salud Carlos III.

literature cited

1000 Genomes Project Consortium, G. R. Abecasis, A. Auton et al. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.
Agúndez, J. A. G. 2008. Polymorphisms of human N-acetyltransferases and cancer risk. Curr. Drug Metab. 9:520–531.
An, H.-R., X.-Q. Wu, Z.-Y. Wang et al. 2012. NAT2 and CYP2E1 polymorphisms associated with antituberculosis drug-induced hepatotoxicity in Chinese patients. Clin. Exp. Pharmacol. Physiol. 39:535–543.
Barreiro, L. B., G. Laval, H. Quach et al. 2008. Natural selection has driven population differentiation in modern humans. Nat. Genet. 40:340–345.
Beaumont, M. A., and R. A. Nichols. 1996. Evaluating loci for use in the genetic analysis of population structure. Proc. Biol. Sci. 263:1,619–1,626.
Butcher, N. J., S. Boukouvala, E. Sim et al. 2002. Pharmacogenetics of the arylamine N-acetyltransferases. Pharmacogenomics J. 2:30–42.
Cascorbi, I., N. Drakoulis, J. Brockmöller et al. 1995. Arylamine N-acetyltransferase (NAT2) mutations and their allelic linkage in unrelated Caucasian individuals: Correlation with phenotypic activity. Am. J. Hum. Genet. 57:581–592.
Elhaik, E. 2012. Empirical distributions of F(ST) from large-scale human polymorphism data. PLoS One 7:e49837.
Fuselli, S., R. H. Gilman, S. J. Chanock et al. 2007. Analysis of nucleotide diversity of NAT2 coding region reveals homogeneity across Native American populations and high intra-population diversity. Pharmacogenomics J. 7:144–152.
García-Martín, E. 2008. Interethnic and intraethnic variability of NAT2 single nucleotide polymorphisms. Curr. Drug Metab. 9:487–497.
Génin, E., A. Tullio-Pelet, S. Lyonnet et al. 2004. Estimating the age of rare disease mutations: The example of triple A syndrome. J. Med. Genet. 41:445–449.
Hein, D. W. 2002. Molecular genetics and function of NAT1 and NAT2: Role in aromatic amine metabolism and carcinogenesis. Mutat. Res. 506–507:65–77.
Hein, D. W. 2006. N-Acetyltransferase 2 genetic polymorphism: Effects of carcinogen and haplotype on urinary bladder cancer risk. Oncogene 25:1,649–1,658.
Hill, W. G., and A. Robertson. 1968. Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38:226–231.
Huang, Y.-S., H.-D. Chern, W.-J. Su et al. 2002. Polymorphism of the N-acetyltransferase 2 gene as a susceptibility risk factor for antituberculosis drug-induced hepatitis. Hepatology 35:883–889.
Kelley, J. L., J. Madeoy, J. C. Calhoun et al. 2006. Genomic signatures of positive selection in humans and the limits of outlier approaches. Genome Res. 16:980–989.
Kilbane, A. J., L. K. Silbart, M. Manis et al. 1990. Human N-acetylation genotype determination with urinary caffeine metabolites. Clin. Pharmacol. Ther. 47:470–477.
Ladero, J. M. 2008. Influence of polymorphic N-acetyltransferases on non-malignant spontaneous disorders and on response to drugs. Curr. Drug Metab. 9:532–537.
Lee, P. H., and H. Shatkay. 2008. F-SNP: computationally predicted functional SNPs for disease association studies. Nucleic Acids Res. 36:D820–D824.
Lee, P. H., and H. Shatkay. 2009. An integrative scoring system for ranking SNPs by their potential deleterious effects. Bioinformatics 25:1,048–1,055.
Lee, S.-W., L. S.-C. Chung, H.-H. Huang et al. 2010. NAT2 and CYP2E1 polymorphisms and susceptibility to first-line anti-tuberculosis drug-induced hepatitis. Int. J. Tuberc. Lung Dis. 14:622–626.
Leiro-Fernandez, V., D. Valverde, R. Vázquez-Gallardo et al. 2011. N-Acetyltransferase 2 polymorphisms and risk of anti-tuberculosis drug-induced hepatotoxicity in Caucasians. Int. J. Tuberc. Lung Dis. 15:1,403–1,408.
Luca, F., G. Bubba, M. Basile et al. 2008. Multiple advantageous amino acid variants in the NAT2 gene in human populations. PLoS One 3:e3136.
Magalon, H., E. Patin, F. Austerlitz et al. 2008. Population genetic diversity of the NAT2 gene supports a role of acetylation in human adaptation to farming in Central Asia. Eur. J. Hum. Genet. 16:243–251.
Meisel, P. 2002. Arylamine N-acetyltransferases and drug response. Pharmacogenomics 3:349–366.
Mortensen, H. M., A. Froment, G. Lema et al. 2011. Characterization of genetic variation and natural selection at the arylamine N-acetyltransferase genes in global human populations. Pharmacogenomics 12:1,545–1,558.
Parkin, D. P., S. Vandenplas, F. J. Botha et al. 1997. Trimodality of isoniazid elimination: Phenotype and genotype in patients with tuberculosis. Am. J. Respir. Crit. Care Med. 155:1,717–1,722.
Patin, E., L. B. Barreiro, P. C. Sabeti et al. 2006. Deciphering the ancient and complex evolutionary history of human arylamine N-acetyltransferase genes. Am. J. Hum. Genet. 78:423–436.
Prado-Martinez, J., P. H. Sudmant, J. M. Kidd et al. 2013. Great ape genetic diversity and population history. Nature 499:471–475.
Pritchard, J. K., J. K. Pickrell, and G. Coop. 2010. The genetics of human adaptation: Hard sweeps, soft sweeps, and polygenic adaptation. Curr. Biol. 20:R208–R215.
Purcell, S., B. Neale, K. Todd-Brown et al. 2007. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81:559–575.
Ruiz, J. D., C. Martínez, K. Anderson et al. 2012. The differential effect of NAT2 variant alleles permits refinement in phenotype inference and identifies a very slow acetylation genotype. PLoS One 7:e44629.
Sabbagh, A., and P. Darlu. 2006. SNP selection at the NAT2 locus for an accurate prediction of the acetylation phenotype. Genet. Med. 8:76–85.
Sabbagh, A., P. Darlu, B. Crouau-Roy et al. 2011. Arylamine N-acetyltransferase 2 (NAT2) genetic diversity and traditional subsistence: A worldwide population survey. PLoS One 6:e18507.
Sabbagh, A., A. Langaney, P. Darlu et al. 2008. Worldwide distribution of NAT2 diversity: Implications for NAT2 evolutionary history. BMC Genet. 9:21.
Sabbagh, A., J. Marin, C. Veyssière et al. 2013. Rapid birth-and-death evolution of the xenobiotic metabolizing NAT gene family in vertebrates with evidence of adaptive selection. BMC Evol. Biol. 13:62.
Sabeti, P. C., D. E. Reich, J. M. Higgins et al. 2002. Detecting recent positive selection in the human genome from haplotype structure. Nature 419:832–837.
Sabeti, P. C., P. Varilly, B. Fry et al. 2007. Genome-wide detection and characterization of positive selection in human populations. Nature 449:913–918.
Sanderson, S., G. Salanti, and J. Higgins. 2007. Joint effects of the N-acetyltransferase 1 and 2 (NAT1 and NAT2) genes and smoking on bladder carcinogenesis: A literature-based systematic HuGE review and evidence synthesis. Am. J. Epidemiol. 166:741–751.
Selinski, S., M. Blaszkewicz, K. Ickstadt et al. 2013. Refinement of the prediction of N-acetyltransferase 2 (NAT2) phenotypes with respect to enzyme activity and urinary bladder cancer risk. Arch. Toxicol. 87:2,129–2,139.
Selinski, S., M. Blaszkewicz, K. Ickstadt et al. 2014. Improvements in algorithms for phenotype inference: The NAT2 example. Curr. Drug Metab. 15:233–249.
Stajich, J. E., D. Block, K. Boulez et al. 2002. The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 12:1,611–1,618.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
Teixeira, R. L., R. G. Morato, P. H. Cabello et al. 2011. Genetic polymorphisms of NAT2, CYP2E1 and GST enzymes and the occurrence of antituberculosis drug-induced hepatitis in Brazilian TB patients. Mem. Inst. Oswaldo Cruz 106:716–724.
Voight, B. F., S. Kudaravalli, X. Wen et al. 2006. A map of recent positive selection in the human genome. PLoS Biol. 4:e72.
Walker, K., G. Ginsberg, D. Hattis et al. 2009. Genetic polymorphism in N-acetyltransferase (NAT): Population distribution of NAT1 and NAT2 activity. J. Toxicol. Environ. Health B Crit. Rev. 12:440–472.
Weber, W. W., and D. W. Hein. 1985. N-Acetylation pharmacogenetics. Pharmacol. Rev. 37:25–79.
Woolfe, A., D. K. Goode, J. Cooke et al. 2007. CONDOR: A database resource of developmentally associated conserved non-coding elements. BMC Dev. Biol. 7:100.
Wright, S. 1951. The genetical structure of populations. Ann. Eugen. 15:323–354.

Supplemental Table S1. Haplotypes and List of Single Nucleotide Variants Considered to Estimate the Age of the Derived Allele at rs1799930

Supplemental Table S2. Allele Frequency Data of the rs1799930 Single Nucleotide Polymorphism Retrieved from the Literature

supplemental table s2 notes

1. Loktionov, A., W. Moore, S. P. Spencer et al. 2002. Differences in N-acetylation genotypes between Caucasians and Black South Africans: Implications for cancer prevention. Cancer Detect. Prev. 26:15–22.

2. Dandara, C., C. M. Masimirembwa, A. Magimba et al. 2003. Arylamine N-acetyltransferase (NAT2) genotypes in Africans: The identification of a new allele with nucleotide changes 481C>T and 590G>A. Pharmacogenetics 13:55–58.

3. Matimba, A., J. Del-Favero, C. Van Broeckhoven, and C. Masimirembwa. 2009. Novel variants of major drug-metabolising enzyme genes in diverse African populations and their predicted functional effects. Hum. Genomics 3:169–190.

4. Patin, E., C. Harmant, K. K. Kidd et al. 2006b. Sub-Saharan African coding sequence variation and haplotype diversity at the NAT2 gene. Hum. Mutat. 27:720.

6. Delomenie, C., L. Sica, D. M. Grant et al. 1996. Genotyping of the polymorphic N-acetyltransferase (NAT2*) gene locus in two native African populations. Pharmacogenetics 6:177–185.

10. Al-Yahyaee, S., U. Gaffar, M. M. Al-Ameri et al. 2007. N-acetyltransferase polymorphism among northern Sudanese. Hum. Biol. 79:445–452.

11. Agúndez, J. A., K. Golka, C. Martínez et al. 2008. Unraveling ambiguous NAT2 genotyping data. Clin. Chem. 54:1,390–1,394.

12. Gil, J. P., and M. C. Lechner. 1998. Increased frequency of wild-type arylamine-N-acetyltransferase allele NAT2*4 homozygotes in Portuguese patients with colorectal cancer. Carcinogenesis 19:37–41.

13. Aynacioglu, A. S., I. Cascorbi, P. M. Mrozikiewicz, and I. Roots. 1997. Arylamine N-acetyltransferase (NAT2) genotypes in a Turkish population. Pharmacogenetics 7:327–331.

14. Krajinovic, M., C. Richer, H. Sinnett et al. 2000. Genetic polymorphisms of N-acetyltransferases 1 and 2 and gene-gene interaction in the susceptibility to childhood acute lymphoblastic leukemia. Cancer Epidemiol. Biomarkers Prev. 9:557–562.

16. Schnakenberg, E., M. Lustig, R. Breuer et al. 2000. Gender-specific effects of NAT2 and GSTM1 in bladder cancer. Clin. Genet. 57:270–277.

17. van der Hel, O. L., P. H. Peeters, D. W. Hein et al. 2003. NAT2 slow acetylation and GSTM1 null genotypes may increase postmenopausal breast cancer risk in long-term smoking women. Pharmacogenetics 13:399–407.

18. Deitz, A. C., W. Zheng, M. A. Leff et al. 2000. N-Acetyltransferase-2 genetic polymorphism, well-done meat intake, and breast cancer risk among postmenopausal women. Cancer Epidemiol. Biomarkers Prev. 9:905–910.

19. Smith, C. A., M. Wadelius, A. C. Gough et al. 1997. A simplified assay for the arylamine N-acetyltransferase 2 polymorphism validated by phenotyping with isoniazid. J. Med. Genet. 34:758–760.

20. Okkels, H., T. Sigsgaard, H. Wolf, and H. Autrup. 1997. Arylamine N-acetyltransferase 1 (NAT1) and 2 (NAT2) polymorphisms in susceptibility to bladder cancer: The influence of smoking. Cancer Epidemiol. Biomarkers Prev. 6:225–231.

21. Mrozikiewicz, P. M., I. Cascorbi, J. Brockmoller, and I. Roots. 1996. Determination and allelic allocation of seven nucleotide transitions within the arylamine N-acetyltransferase gene in the Polish population. Clin. Pharmacol. Ther. 59:376–382.

22. Habalová, V., J. Salagovic, I. Kalina, and J. Stubna. 2005. A pilot study testing the genetic polymorphism of N-acetyltransferase 2 as a risk factor in lung cancer. Neoplasma 52:364–368.

23. Rabstein, S., K. Unfried, U. Ranft et al. 2006. Variation of the N-acetyltransferase 2 gene in a Romanian and a Kyrgyz population. Cancer Epidemiol. Biomarkers Prev. 15:138–141.

24. Belogubova, E. V., E. Sh. Kuligina, A. V. Togo et al. 2005. “Comparison of extremes” approach provides evidence against the modifying role of NAT2 polymorphism in lung cancer susceptibility. Cancer Lett. 221:177–183.

25. Gaikovitch, E. A., I. Cascorbi, P. M. Mrozikiewicz et al. 2003. Polymorphisms of drug-metabolizing enzymes CYP2C9, CYP2C19, CYP2D6, CYP1A1, NAT2 and of P-glycoprotein in a Russian population. Eur. J. Clin. Pharmacol. 59:303–312.

26. Bu, R., M. I. Gutiérrez, M. Al-Rasheed et al. 2004. Variable drug metabolism genes in Arab population. Pharmacogenomics J. 4:260–266.

27. Woolhouse, N. M., M. M. Qureshi, S. M. Bastaki et al. 1997. Polymorphic N-acetyltransferase (NAT2) genotyping of Emiratis. Pharmacogenetics 7:73–82.

28. Al-Moundhri, M. S., M. Al-Kindi, M. Al-Nabhani et al. 2007. NAT2 polymorphism in Omani gastric cancer patients-risk predisposition and clinicopathological associations. World J. Gastroenterol. 13:2,697–2,702.

29. Torkaman-Boutorabi, A., M. Hoormand, N. Naghdi et al. 2007. Genotype and allele frequencies of N-acetyltransferase 2 and glutathione S-transferase in the Iranian population. Clin. Exp. Pharmacol. Physiol. 34:1,207–1,211.

30. Anitha, A., and M. Banerjee. 2003. Arylamine N-acetyltransferase 2 polymorphism in the ethnic populations of South India. Int. J. Mol. Med. 11:125–131.

31. Singh, N., S. Dubey, S. Chinnaraj et al. 2009. Study of NAT2 gene polymorphisms in an Indian population: Association with plasma isoniazid concentration in a cohort of tuberculosis patients. Mol. Diagn. Ther. 13:49–58.

32. Srivastava, D. S., and R. D. Mittal. 2005. Genetic polymorphism of the N-acetyltransferase 2 gene, and susceptibility to prostate cancer: A pilot study in north Indian population. BMC Urol. 5:12.

34. Yuliwulandari, R., Q. Sachrowardi, N. Nishida et al. 2008. Polymorphisms of promoter and coding regions of the arylamine N-acetyltransferase 2 (NAT2) gene in the Indonesian population: proposal for a new nomenclature. J. Hum. Genet. 53:201–209.

35. Lin, H. J., C. Y. Han, B. K. Lin, and S. Hardy. 1994. Ethnic distribution of slow acetylator mutations in the polymorphic N-acetyltransferase (NAT2) gene. Pharmacogenetics 4:125–134.

36. Bechtel, Y. C., P. R. Bechtel, H. Lelouët et al. 2001. The acetylator polymorphism in a Khmer population: Clinical consequences. Therapie 56:409–413.

37. Cavaco, I., S. Asimus, M. Peyrard-Janvid et al. 2007. The Vietnamese Khin population harbors particular N-acetyltransferase 2 allele frequencies. Clin. Chem. 53:1,977–1,979.

38. Guo, W. C., G. F. Lin, Y. L. Zha et al. 2004. N-Acetyltransferase 2 gene polymorphism in a group of senile dementia patients in Shanghai suburb. Acta Pharmacol. Sin. 25:1,112–1,117.

39. Song, D. K., D. L. Xing, L. R. Zhan et al. 2009. Association of NAT2, GSTM1, GSTT1, CYP2A6, and CYP2A13 gene polymorphisms with susceptibility and clinicopathologic characteristics of bladder cancer in Central China. Cancer Detect. Prev. 32:416–423.

40. Tanaka, E., A. Taniguchi, W. Urano et al. 2002. Adverse effects of sulfasalazine in patients with rheumatoid arthritis are associated with diplotype configuration at the N-acetyltransferase 2 gene. J. Rheumatol. 29:2,492–2,499.

41. Deguchi, M., S. Yoshida, S. Kennedy et al. 2005. Lack of association between endometriosis and N-acetyl transferase 1 (NAT1) and 2 (NAT2) polymorphisms in a Japanese population. J. Soc. Gynecol. Investig. 12:208–213.

42. Sekine, A., S. Saito, A. Iida et al. 2001. Identification of single-nucleotide polymorphisms (SNPs) of human N-acetyltransferase genes NAT1, NAT2, AANAT, ARD1 and L1CAM in the Japanese population. J. Hum. Genet. 46:314–319.

43. Machida, H., K. Tsukamoto, C. Y. Wen et al. 2005. Crohn’s disease in Japanese is associated with a SNP-haplotype of N-acetyltransferase 2 gene. World J. Gastroenterol. 11:4,833–4,837.

44. Lee, K. M., S. K. Park, S. U. Kim et al. 2003. N-acetyltransferase (NAT1, NAT2) and glutathione S-transferase (GSTM1, GSTT1) polymorphisms in breast cancer. Cancer Lett. 196:179–186.

45. Lee, S. Y., K. A. Lee, C. S. Ki et al. 2002. Complete sequencing of a genetic polymorphism in NAT2 in the Korean population. Clin. Chem. 48:775–777.

47. Hegele, R. A., K. Kwan, S. B. Harris et al. 2000. NAT2 polymorphism associated with plasma glucose concentration in Canadian Oji-Cree. Pharmacogenetics 10:233–238.

48. Martinez, C., J. A. Agundez, M. Olivera et al. 1998. Influence of genetic admixture on polymorphisms of drug-metabolizing enzymes: Analyses of mutations on NAT2 and CYP2E1 genes in a mixed Hispanic population. Clin. Pharmacol. Ther. 63:623–628.

49. Jorge-Nebert, L. F., M. Eichelbaum, E. U. Griese et al. 2002. Analysis of six SNPs of NAT2 in Ngawbe and Embera Amerindians of Panama and determination of the Embera acetylation phenotype using caffeine. Pharmacogenetics 12:39–48.

50. Arias, T. D., L. F. Jorge, E. U. Griese et al. 1993. Polymorphic N-acetyltransferase (NAT2) in Amerindian populations of Panama and Colombia: high frequencies of point mutation 857A, as found in allele S3/M3. Pharmacogenetics 3:328–331.

51. Teixeira, R. L., A. B. Miranda, A. G. Pacheco et al. 2007. Genetic profile of the arylamine N-acetyltransferase 2 coding gene among individuals from two different regions of Brazil. Mutat. Res. 624:31–40.

52. Bailliet, G., M. R. Santos, E. L. Alfaro et al. 2007. Allele and genotype frequencies of metabolic genes in Native Americans from Argentina and Paraguay. Mutat. Res. 627:171–177.

53. Le Marchand, L., J. H. Hankin, L. R. Wilkens et al. 2001. Combined effects of well-done red meat, smoking, and rapid N-acetyltransferase 2 and CYP1A2 phenotypes in increasing colorectal cancer risk. Cancer Epidemiol. Biomarkers Prev. 10:1,259–1,266.

54. International HapMap 3 Consortium. 2010. Integrating common and rare genetic variation in diverse human populations. Nature 467:52–58.
