Genome-wide genetic variations are highly correlated with proximal DNA methylation patterns.

5-methyl-cytosines at CpG sites frequently mutate into thymines, accounting for a large proportion of spontaneous point mutations. The repair system would leave substantial numbers of errors in neighboring regions if the synthesis of erased gaps around deaminated 5-methyl-cytosines is error-prone. Indeed, we identified an unexpected genome-wide role of the CpG methylation state as a major determinant of proximal natural genetic variation. Specifically, 507 Mbp (∼18%) of the human genome was within 10 bp of a CpG site; in these regions, the single nucleotide polymorphism (SNP) rate was significantly higher, by ∼50% (P < 10^-566 by a two-proportion z-test), when the neighboring CpG sites were methylated. To reconfirm this finding in another vertebrate, we compared six single-base resolution methylomes in two inbred medaka (Oryzias latipes) strains with sufficient genetic divergence (3.4%). We found that the SNP rate also increased by ∼50% (P < 10^-2170), and the substitution rates in all dinucleotides increased simultaneously (P < 10^-441) around methylated CpG sites. In the hypomethylated regions, the "CGCG" motif was significantly enriched (P < 10^-680) and evolutionarily conserved (P = ∼0.203%), and slow CpG deamination rather than fast CpG gain was seen, indicating a possible role of CGCG as a candidate cis-element for the hypomethylation state. In regions that were hypermethylated in germline-like tissues but hypomethylated in somatic liver cells, the SNP rate was significantly smaller than that in regions hypomethylated in both tissues, suggesting a positive selective pressure during DNA methylation reprogramming. This is the first report showing that the CpG methylation state is significantly correlated with the characteristics of evolutionary change in neighboring DNA.
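For readers wondering how a P-value as small as 10^-566 can be reported at all, here is a minimal sketch of a two-proportion z-test that works in log space. It is not the authors' code: the function names and the SNP counts are hypothetical and chosen only for illustration, and the pooled-variance form of the test is an assumption, since the abstract does not say which variant was used. Ordinary floating point underflows to zero below about 10^-308, so for very large z the sketch switches to the standard asymptotic expansion of the normal tail.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion (assumed variant)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def log10_two_sided_p(z):
    """log10 of the two-sided normal p-value, usable for very large |z|."""
    z = abs(z)
    if z < 30:
        # erfc is accurate in this range and does not underflow.
        return math.log10(math.erfc(z / math.sqrt(2)))
    # Asymptotic upper tail: Q(z) ~ phi(z) / z * (1 - 1/z^2 + 3/z^4)
    log_q = (-0.5 * z * z - math.log(z) - 0.5 * math.log(2 * math.pi)
             + math.log(1 - 1 / z**2 + 3 / z**4))
    return (math.log(2) + log_q) / math.log(10)


# Hypothetical counts, NOT taken from the paper: SNPs observed among
# positions near methylated vs. unmethylated CpG sites.
z = two_proportion_z(15_000, 1_000_000, 10_000, 1_000_000)
print(f"z = {z:.2f}, log10(P) = {log10_two_sided_p(z):.1f}")
```

With genome-scale position counts, even a modest difference in SNP rate produces a very large z, which is why the exponents quoted in the abstract are so extreme.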