Frolov A.S.8, Lavryushev S.V., Grigorovich D.A., Kel A.E., Ptitsyn A.A.6, Kolchanov N.A., Podkolodny N.L.1,4, Solovyev V.V.7, Milanesi L.2, Bourne P.5, Wingender E.3, and Overton G.C.2
Institute of Cytology & Genetics, Novosibirsk, Russia;
1Institute of Computational Mathematics & Mathematical Geophysics, Novosibirsk, Russia;
2Istituto Di Tecnologie Biomediche Avanzate, Milano, Italy;
3Gesellschaft für Biotechnologische Forschung mbH, Braunschweig, Germany;
4UPenn, Philadelphia, USA;
5San Diego Supercomputer Center, California, USA;
6South African National Bioinformatics Institute, SA;
7The Sanger Centre, Cambridge, UK
8Corresponding author: IC&G, Novosibirsk, 630090, Russia; FAX: +7(3832)351-278; E-mail: fas@bionet.nsc.ru
A number of WWW servers that link databases and programs for molecular genetic studies are now available. They offer the user a current list of WWW resources, from which the user selects one and then applies it independently of the others. However, some molecular genetic tasks require several databases and programs to be used together. For example, genome annotation requires a homology search through the GenBank and EMBL databases along with computer recognition of functional sites by their patterns. Linking WWW servers cannot support integrative studies of this kind. Because genome annotation is becoming a pivotal task, integrative servers are in demand. We have therefore created an integrative WWW server, http://wwwmgs.bionet.nsc.ru, designed specifically to integrate databases and programs for analyzing molecular genetic data. Its key idea is to link the programs that analyze a given kind of data with the databases that store that kind of data. Thus, the programs recognizing transcription elements and eukaryotic promoters were cross-linked with the Transcription Regulatory Region Database (TRRD); the programs predicting the activities of functional sites, with the database of functional site activity; the programs predicting translation efficiency, with the database of leader mRNA sequences; and the computer system LIKENESS, which performs a fast search for conformationally similar proteins through the complete PDB, with the WWW-based version of PDB named "MOOSE". To produce all the necessary programs, automated generators of C-code programs have been developed and integrated. In addition, the relevant entries of all these databases were linked to one another as well as to the previously cross-linked TRRD, COMPEL, EMBL, GenBank, TRANSFAC, GERD, EpoDB, SWISS-PROT, and PDB databases.
Together, this work resulted in an integrative WWW server for the complex analysis of a DNA sequence: recognizing a promoter, predicting transcription elements and their activities, estimating the translation efficiency of the respective mRNA, and, finally, searching for the potential protein by its similarity to all known proteins, among other analyses.
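The idea of pairing each analysis program with the database that stores the same kind of data, and then running the steps on one sequence, can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the `Stage` class, the `analyze` function, and both toy programs are hypothetical stand-ins invented here, not the server's actual programs or interfaces. Only the step names and the databases paired with them come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stage:
    name: str                         # analysis step named in the abstract
    database: str                     # database cross-linked with that step
    program: Callable[[str], object]  # hypothetical stand-in, not the server's program


def toy_promoter_sites(seq: str) -> List[int]:
    # Placeholder: report 0-based positions of the literal motif TATAAA.
    motif = "TATAAA"
    last_start = len(seq) - len(motif)
    return [i for i in range(last_start + 1) if seq[i:i + len(motif)] == motif]


def toy_gc_fraction(seq: str) -> float:
    # Placeholder score standing in for a translation-efficiency estimate.
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


# Each step is paired with its database, as the abstract describes.
STAGES = [
    Stage("promoter recognition", "TRRD", toy_promoter_sites),
    Stage("translation efficiency", "leader mRNA database", toy_gc_fraction),
]


def analyze(seq: str) -> Dict[str, dict]:
    # Run every stage on the same sequence and keep each result next to
    # the name of the database the stage is linked with.
    seq = seq.upper()
    return {s.name: {"database": s.database, "result": s.program(seq)} for s in STAGES}


report = analyze("ggTATAAAgc")
for step, entry in report.items():
    print(step, entry)
```

Keeping the database name beside each result is one simple way to let a combined report point back to the relevant database entries; a real integrative server would need far richer cross-references than this.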
We are grateful to the Russian Foundation for Basic Research (grant No. 97-07-90309) and the Russian Human Genome program.