Mironov A.A., Gelfand M.S.1
National Center for Biotechnology NIIGENETIKA, 113545, Moscow, Russia
1Institute of Protein Research, Russian Acad. Sci., 142292, Pushchino, Russia. E-mail: misha@imb.imb.ac.ru
Recognition of transcription regulation sites is one of the most difficult problems in computational molecular biology. In most cases, small sample size and a low degree of sequence conservation do not allow reliable recognition rules to be constructed. We suggest a new approach to this problem based on simultaneous analysis of several related genomes. We assume that the groups of genes subject to the analyzed regulation are evolutionarily stable. Thus, in each genome we select the genes that have candidate sites in their regulatory regions, and then find groups of homologous genes within the selected sets. Under this assumption, we predict that the groups found are subject to the analyzed regulation.
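The two-step selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the site detector (a consensus match with a mismatch budget), the function names, and the precomputed list of homologous gene pairs are all assumptions made for illustration, and a real analysis would more likely use a positional weight matrix and a sequence-similarity search.

```python
from typing import Dict, List, Set, Tuple


def has_candidate_site(upstream: str, consensus: str, max_mismatches: int) -> bool:
    """Return True if any window of `upstream` matches `consensus`
    with at most `max_mismatches` mismatches (hypothetical site detector)."""
    k = len(consensus)
    for i in range(len(upstream) - k + 1):
        window = upstream[i:i + k]
        mismatches = sum(1 for a, b in zip(window, consensus) if a != b)
        if mismatches <= max_mismatches:
            return True
    return False


def candidate_genes(upstreams: Dict[str, str], consensus: str,
                    max_mismatches: int) -> Set[str]:
    """Step 1: genes of one genome whose regulatory region holds a candidate site."""
    return {gene for gene, seq in upstreams.items()
            if has_candidate_site(seq, consensus, max_mismatches)}


def conserved_candidates(cands_a: Set[str], cands_b: Set[str],
                         homolog_pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Step 2: keep homologous pairs in which both genes are candidates.
    `homolog_pairs` is assumed to be computed elsewhere."""
    return [(a, b) for a, b in homolog_pairs if a in cands_a and b in cands_b]


# Toy data: the gene names, sequences, and motif are invented, not real PurR sites.
genome_a = {"a1": "TTTACGTACGTAAA", "a2": "GGGGGGGGGGGG", "a3": "CCACGTTCGTCC"}
genome_b = {"b1": "AACGTACGTT", "b2": "TTTTTTTTTT", "b3": "GGGGCCCCGGGG"}
pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]

cands_a = candidate_genes(genome_a, "ACGTACGT", max_mismatches=1)
cands_b = candidate_genes(genome_b, "ACGTACGT", max_mismatches=1)
print(conserved_candidates(cands_a, cands_b, pairs))
```

In this toy run, gene a3 carries a weak candidate site but its homolog b3 does not, so the pair is discarded; only pairs supported in both genomes are retained as predictions.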
This approach was applied to the analysis of purine regulons in Escherichia coli and Haemophilus influenzae. We were able to identify PurR binding sites in the regulatory regions of H. influenzae genes homologous to the E. coli genes subject to PurR regulation, and to find a new family of E. coli and H. influenzae permeases belonging to the purine regulon.
This study was partially supported by a grant from the Russian Foundation for Basic Research.