Study of peculiarities of vertebrate promoter region evolution

Kolpakov F.A., Kolchanov N.A.

Institute of Cytology & Genetics, Laboratory of Theoretical Molecular Genetics, Lavrentieva 10, Novosibirsk, 630090 Russia, E-mail: fedor@bionet.nsc.ru

Hemoglobin, apolipoprotein, and several other families of isofunctional genes were used to study the peculiarities of vertebrate promoter region evolution. In this work, the promoter region was defined as the region in which transcription factor binding sites are predominantly located. Analysis of over 2,000 actual site locations from the TRANSFAC database allowed us to delimit this region as [-320; +20] relative to the transcription start site.
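The abstract does not state how the boundaries of the region were derived from the site locations. One plausible reading, given here only as a sketch, is to take the shortest interval that contains a chosen fraction of the annotated site positions. The function name, the `coverage` parameter, and the toy positions are assumptions for illustration and are not taken from TRANSFAC.

```python
import math


def shortest_covering_interval(positions, coverage=0.9):
    """Return (start, end) of the shortest interval that contains at least
    `coverage` of the given positions (relative to the transcription start).

    Illustrative assumption: this is not necessarily the procedure the
    authors applied to the TRANSFAC site locations.
    """
    if not positions:
        raise ValueError("positions must not be empty")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    pts = sorted(positions)
    k = math.ceil(coverage * len(pts))  # number of sites the interval must hold
    best = (pts[0], pts[k - 1])
    # Slide a window of k consecutive sorted positions and keep the narrowest.
    for i in range(1, len(pts) - k + 1):
        lo, hi = pts[i], pts[i + k - 1]
        if hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best


# Toy positions only; these are not real TRANSFAC data.
toy_sites = [-900, -300, -250, -100, -50, -30, 10, 15, 400, 20]
region = shortest_covering_interval(toy_sites, coverage=0.5)
```

With real data, the choice of `coverage` determines how wide the resulting region is, so it would have to be reported alongside the boundaries.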

The nucleotide sequences of the promoter regions, as well as the intron and exon sequences of the corresponding genes, were aligned pairwise within several gene families. The rate of neutral mutation fixation was determined from the alignments of the introns and the third codon positions, which evolve predominantly in the neutral mode. In the adaptive evolution mode, the rate of mutation fixation exceeds the rate of mutation emergence; under stabilizing selection, it is lower. Hence, the contributions of adaptive and stabilizing selection to promoter evolution were estimated by comparing the fixation rate of substitutions in the promoter regions with the fixation rate of neutral mutations. It was demonstrated that the promoters of several genes of rather closely related organisms (for example, human and macaque) evolve predominantly in the adaptive mode (the rate of mutation fixation is higher than in their introns), whereas in considerably more distant organisms (for example, human and swine), the evolution mode is predominantly stabilizing (the rate of mutation fixation in the promoters is considerably lower than in the introns and approximately equal to that in the exons). Deletions were also shown to be more abundant in the promoters than in the exons and as abundant as in the introns.
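The comparison described above can be sketched as follows. The abstract does not name the distance measure, so this sketch assumes the simple proportion of mismatched sites (p-distance) over ungapped alignment columns, and it classifies the mode with an arbitrary `tolerance` rather than a statistical test. All function names and thresholds are hypothetical.

```python
def p_distance(a, b):
    """Fraction of mismatches among the ungapped columns of a pairwise alignment."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are counted separately in gap_fraction
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 0.0


def gap_fraction(a, b):
    """Fraction of alignment columns in which either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 0.0
    gapped = sum(1 for x, y in zip(a, b) if x == "-" or y == "-")
    return gapped / len(a)


def selection_mode(promoter_rate, neutral_rate, tolerance=0.1):
    """Label the mode by the ratio of promoter rate to neutral rate.

    Assumption: a ratio within +/- tolerance of 1 is treated as neutral.
    """
    if neutral_rate <= 0:
        raise ValueError("neutral_rate must be positive")
    ratio = promoter_rate / neutral_rate
    if ratio > 1 + tolerance:
        return "adaptive"
    if ratio < 1 - tolerance:
        return "stabilizing"
    return "neutral"
```

In this sketch, `neutral_rate` would come from the intron or third-codon-position alignments of the same gene pair, and `gap_fraction` gives a rough counterpart to the deletion abundance compared between promoters, introns, and exons.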

The mutational events were shown to be unevenly distributed along the promoter regions. The regions [-40; -20] (TATA box) and [-110; -90] (CAAT box) relative to the transcription start site were the most conserved across the sets of aligned promoter sequences.
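A positional profile of this kind can be sketched by binning alignment columns into fixed-width windows along the reference sequence. The window width, the coordinate convention (consecutive integers starting at `start`, with no special handling of position 0), and the function name are assumptions for illustration.

```python
from collections import defaultdict


def window_mismatch_profile(ref, other, start=-320, window=20):
    """Return {window_start: mismatch fraction} for a pairwise alignment.

    Coordinates follow the reference sequence: its first ungapped base is
    at `start`, and columns where the reference has a gap are skipped.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    counts = defaultdict(lambda: [0, 0])  # window start -> [mismatches, compared]
    pos = start
    for r, o in zip(ref.upper(), other.upper()):
        if r == "-":
            continue  # insertion relative to the reference has no coordinate
        if o != "-":
            w = start + ((pos - start) // window) * window
            counts[w][1] += 1
            if r != o:
                counts[w][0] += 1
        pos += 1
    return {w: m / n for w, (m, n) in sorted(counts.items())}


# Toy alignment, not real promoter data.
profile = window_mismatch_profile("ACGTACGT", "ACGTACCA", start=-4, window=4)
```

Windows with the lowest mismatch fraction across many aligned pairs would correspond to the most conserved stretches, such as those reported around the TATA and CAAT boxes.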

All potential transcription factor binding sites were identified in the promoter regions. A dependence between the locations of the sites and the locations of the mutations was demonstrated for several gene groups.