Abstract
The GeneMark method has proven to be an efficient gene-finding tool for the analysis of prokaryotic genomic sequence data. We have developed a procedure for deriving and using several GeneMark models to improve gene-detection performance. Applying this procedure to the 1.0 Mb contiguous DNA sequence of Synechocystis sp. strain PCC6803, we were able to cluster the predicted genes into distinct classes and to produce class-specific GeneMark models reflecting the statistical characteristics of each gene class. One gene class apparently includes genes of exogenous origin. Using class-specific models reduces the gene underprediction error rate to 1.7%, compared with the 8.1% reported in the previous study, in which only one GeneMark model was used.