Prediction of RNA secondary structure using genetic algorithm with steepest descent

Titov I.I., Ivanisenko V.A., Kolchanov N.A.

Institute of Cytology and Genetics SD RAS, 630090, Novosibirsk, Ave. Lavrentieva 10, Russia; E-mail titov@bionet.nsc.ru; fax (8-3843) 33-12-78

According to the FDC criterion (fitness-distance correlation, Jones and Forrest, 1995), the RNA secondary structure prediction is not GA-easy, since the number of structures lacking common stems with the minimal-energy structure is exponentially great. On the other hand, the additivity of energetic rules for RNA secondary structure is a good basis for validity (important for GA performance) of building blocks hypothesis; in addition, a number of examples demonstrates successful GA application for secondary structure energy minimization.

We modified GA to provide the maximal distinction of its dynamics from the routes in static FDC-metric. The major peculiarities of our algorithm are: 1) initial diversity defines population size: the initial set is created of random individuals, their maximal noncoincidence provided; the set is formed when the stem set is exhausted; 2) crossover symmetry: we employ exchange of the clusters with approximately equal number of stems, whereas the recombinations evenly random in size are conventionally used; 3) local exhaustion instead of stochastic mutations: we remove a defined number of random stems and form new stems on the principle of the fastest energy optimization.

The described algorithm possesses a higher performance compared to the GA with conventional rules. On IBM PC-486, it takes our algorithm less than a minute to find the global minimum of RNA free energy in a space of 109 states. Thus, the criterion of GA quality in the space of its own operators is essential. However, some complications connected with dynamic character of operators' action may arise. For example, the absence of any convergence is observed when recombination frequency is considerable ("error catastrophe").

The work was supported by INTAS- Russian Foundation for Basic Research grant 95-0653.