GeneExpress: a WWW-oriented integrator of databases and computer systems for studying eukaryotic gene expression

Kolchanov N.A., Ponomarenko M.P., Kel A.E., Kondrakhin Y.V., Frolov A.S., Kolpakov F.A., Goryachkovskaya T.N., Kel O.V., Ananko E.A., Ignatieva E.V., Podkolodnaya O.A., Stepanenko I.L., Merkulova T.I., Babenko V.V., Vorobyiev D.V., Lavryushev S.V., Ponomarenko J.V., Kochetov A.V., Kolesov G.N., Podkolodny N.L.1, Milanesi L.2, Wingender E.3, Heinemeyer T.3, Solovyev V.V.4

Institute of Cytology & Genetics, 630090, Novosibirsk, Russia; FAX: +7(3832)356-558; E-mail: kol@bionet.nsc.ru;

1Institute of Computational Mathematics & Mathematical Geophysics, Novosibirsk, Russia;

2Istituto Di Tecnologie Biomediche Avanzate, Milano, Italy;

3Gesellschaft für Biotechnologische Forschung mbH, Braunschweig, Germany;

4The Sanger Centre, Cambridge, UK;

Eukaryotic gene expression is one of the most complex biological phenomena and involves many molecular events. It may start with the cellular reception of a specific stimulus, which is then passed along a particular signal transduction pathway to initiate transcription of the relevant genes. Their pre-mRNAs are processed by 3' cleavage/polyadenylation, capping, and splicing, and the corresponding proteins are finally translated from the mature mRNAs. The totality of these molecular events forms a particular gene network that provides the cell's response to the stimulus. Gene networks likewise maintain cellular and organismal homeostasis and control cell/tissue differentiation and development. Investigating gene expression is therefore an integrative problem in biology. For this reason we have developed GeneExpress, a WWW-oriented integrator of databases and computer systems for studying gene expression. The GeneNet database, which describes the molecular events forming gene networks, serves as its integrative core. To study transcription, this core was supplemented with the TRRD database on transcription regulatory regions and with TFBSC, a compilation of sequence sets of transcription factor binding sites. TRRD and TFBSC were linked to the computer system RgScan, which recognizes these sites in DNA sequences. For translation, a database on mRNA leaders was included and linked to a program that predicts High/Low translation levels from a given mRNA sequence. Gene expression is also described quantitatively by the ACTIVITY system, which compiles the activity magnitudes of functional sites and is linked to programs that predict these activities from site sequences. GeneExpress can, and must, continue to develop and expand, updating and integrating new resources for investigating other molecular events of gene expression, such as splicing and DNA/protein interactions.
To navigate GeneExpress, SRS, HTML, and Java viewers were developed (http://wwwmgs.bionet.nsc.ru/systems/GeneExpress/).

This work was supported by grants from the Russian Human Genome program and the Russian Foundation for Basic Research.