Computer system MGL: a tool for sample generation, visualization, and analysis of regulatory genomic sequences

Kolpakov F.A.

Institute of Cytology & Genetics, Laboratory of Theoretical Molecular Genetics, 630090, Novosibirsk, Lavrentieva 10, Russia; E-mail: fedor@bionet.nsc.ru

The MGL (Molecular Genetic Language) computer system was designed for database search and extraction, sample generation, visualization, and analysis of regulatory genomic sequences (RGS): regions of DNA/RNA sequences involved in the regulation of various molecular genetic processes (replication, transcription, splicing, translation, etc.).

The MGL system offers a wide range of options for searching for and extracting information on RGS from the EMBL, TRRD, Compel, EPD, and Nucleo databases. MGL automatically generates samples of promoters, transcription factor binding sites, and nucleosome positioning sites, using the EMBL database as the source of nucleotide sequences; the data on RGS location in the sequences are extracted from the EPD, TRRD, and Nucleo databases, respectively. The MGL system also automatically generates nucleotide sequence samples of various RGS (promoters, splice sites, mRNA, CDS, 5'- and 3'-untranslated regions, polyadenylation sites) based on semantic analysis of the information in the EMBL feature table.
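As an illustration of feature-table-driven sample generation, the following minimal Python sketch cuts a subsequence out of a nucleotide sequence given an EMBL-style location string. It is not the MGL implementation: the function names `extract_feature` and `reverse_complement` are hypothetical, and only the two simplest location forms (`start..end` and `complement(start..end)`, 1-based and inclusive) are assumed to be handled.

```python
import re

# Translation table for complementing unambiguous DNA bases (both cases).
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_feature(seq, location):
    """Extract the subsequence described by a simple location string.

    Supported forms (an assumption of this sketch):
      'start..end'              forward strand
      'complement(start..end)'  reverse strand
    Coordinates are 1-based and inclusive.
    """
    on_complement = True
    match = re.fullmatch(r"complement\((\d+)\.\.(\d+)\)", location)
    if match is None:
        on_complement = False
        match = re.fullmatch(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")

    start, end = int(match.group(1)), int(match.group(2))
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"location out of range: {location!r}")

    # Convert 1-based inclusive coordinates to a Python slice.
    fragment = seq[start - 1:end]
    return reverse_complement(fragment) if on_complement else fragment
```

Real feature tables also contain joined, partial, and remote locations, which a complete extractor would need to handle; this sketch rejects them with a `ValueError`.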

MGL includes a number of tools for RGS analysis: calculation of nucleotide and oligonucleotide composition, pairwise global and local alignment, rapid estimation of the significance of pairwise global alignments, multiple local alignment, search for transcription factor binding sites, etc.
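To make the first of these analyses concrete, here is a minimal Python sketch of nucleotide and oligonucleotide composition computed over overlapping windows. The function name `oligo_composition` and the choice to report relative frequencies are assumptions made for illustration; the abstract does not describe how MGL computes or reports composition.

```python
from collections import Counter


def oligo_composition(seq, k=1):
    """Return relative frequencies of all overlapping k-mers in seq.

    k=1 gives nucleotide composition; k>1 gives oligonucleotide
    composition. Sequences shorter than k yield an empty dict.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    seq = seq.upper()
    # Slide a window of width k along the sequence, one base at a time.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {oligo: n / total for oligo, n in counts.items()}
```

For example, `oligo_composition(promoter, 2)` would give the dinucleotide frequencies of a hypothetical `promoter` string; comparing such frequencies across samples is one simple way to contrast different classes of RGS.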

The MGL system displays, in map form, the locations of RGS in nucleotide sequences from the TRRD and EMBL databases, as well as the results of transcription factor binding site searches.

The MGL system includes a specialized high-level object-oriented language that allows users to carry out sample generation and RGS analysis automatically.

The MGL system has a user-friendly interface designed for Windows 95 and Windows NT. It is available at http://wwwmgs.bionet.nsc.ru/systems/MGL/Mgl.html.