Abstract
Identifying co-expressed genes that share similar biological behaviour is an essential step in functional genomics. Traditional clustering techniques are generally based on the overall similarity of expression levels and often generate clusters with mixed profile patterns. This paper proposes a novel pattern recognition method that selects co-expressed genes based on the rate of change and the modulation status of gene expression at each time interval. The method can identify gene clusters whose expression profiles have highly similar shapes and modulation patterns. Furthermore, we develop a quality index, based on the semantic similarity of gene annotations, to assess the likelihood that a cluster is a co-regulated group. The effectiveness of the proposed methodology is demonstrated by applying it to the well-known yeast sporulation dataset and an in-house cancer genomics dataset.
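To make the idea of grouping by per-interval modulation concrete, the following is a minimal illustrative sketch, not the method as implemented in this paper. It assumes unit-spaced time points, encodes each interval as up, down, or unchanged using a hypothetical fixed threshold, and groups genes whose encoded patterns are identical. The function names, the threshold value, and the toy profiles are all assumptions introduced for illustration.

```python
from collections import defaultdict


def modulation_pattern(profile, threshold=0.5):
    """Encode each time interval as +1 (up), -1 (down), or 0 (unchanged).

    The threshold is a hypothetical cut-off chosen for illustration only.
    """
    pattern = []
    for prev, curr in zip(profile, profile[1:]):
        change = curr - prev  # rate of change, assuming unit-spaced time points
        if change > threshold:
            pattern.append(1)
        elif change < -threshold:
            pattern.append(-1)
        else:
            pattern.append(0)
    return tuple(pattern)


def group_by_pattern(profiles, threshold=0.5):
    """Group gene names by their identical per-interval modulation pattern."""
    groups = defaultdict(list)
    for gene, profile in profiles.items():
        groups[modulation_pattern(profile, threshold)].append(gene)
    return dict(groups)


# Toy expression profiles (invented values, not taken from any dataset).
profiles = {
    "geneA": [0.0, 1.2, 2.5, 2.4],
    "geneB": [0.1, 1.0, 2.0, 2.1],
    "geneC": [2.0, 1.0, 0.9, 0.0],
}
clusters = group_by_pattern(profiles)
```

In this toy input, geneA and geneB rise over the first two intervals and then level off, so they share one pattern, while geneC falls and lands in a separate group. A practical implementation would also need to compare the magnitude of the rate of change, which this sketch deliberately ignores.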
Acknowledgements
The authors acknowledge the contributions of all team members involved in this research at the Institute for Information Technology and the Biotechnology Research Institute, National Research Council Canada. In addition, we would like to thank Bob Orchard, who reviewed an earlier version of this paper and provided valuable comments. This is publication NRC 48820 of the National Research Council Canada.