Article; Bioinformatics

DeepFinder: An integration of feature-based and deep learning approach for DNA motif discovery

Pages 759-768 | Received 17 Feb 2017, Accepted 03 Feb 2018, Published online: 10 Feb 2018
 

ABSTRACT

We propose an improved solution to the three-stage DNA motif prediction approach. The three-stage approach uses only a subset of the input sequences for initial motif prediction, and the initial motifs obtained are then used for site detection in the remaining, non-overlapping subset of the input. The currently available solution is not robust because motifs obtained from the initial subset are represented as position weight matrices, which results in a high number of false positives. Our approach, called DeepFinder, employs deep learning neural networks with features associated with binding sites to construct a motif model. Furthermore, multiple prediction tools are used in the initial motif prediction process to obtain a higher number of positive hits. Our features are engineered from the context of binding sites, which is assumed to be enriched with specificity information about the sites recognized by transcription factor proteins. DeepFinder is evaluated using several performance metrics on ten chromatin immunoprecipitation (ChIP) datasets. The results show a marked improvement of our solution over the existing solution. This indicates the effectiveness and potential of the proposed DeepFinder for large-scale motif analysis.
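For readers unfamiliar with the position weight matrix (PWM) representation mentioned above, the following is a minimal illustrative sketch of PWM-based site detection. It is not the authors' code and not DeepFinder itself. The function names (`build_pwm`, `score`, `scan`), the pseudocount, the uniform background frequency, and the score threshold are all assumptions chosen for illustration. Because a PWM scores each position independently and ignores the surrounding sequence context, short matrices can match by chance in long sequences, which is one commonly cited source of false positives.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned binding sites.

    Assumes a uniform background frequency; both defaults are illustrative.
    """
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2(counts[base] / total / background) for base in BASES}
        )
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))

def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window_score = score(pwm, sequence[start:start + width])
        if window_score >= threshold:
            hits.append((start, window_score))
    return hits

# Hypothetical sites standing in for motifs found in the initial subset.
initial_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(initial_sites)

# Hypothetical sequence standing in for the remaining subset.
hits = scan(pwm, "GGGTATAATCCC", threshold=4.0)
```

In this sketch the threshold alone controls the trade-off between missed sites and chance matches. The abstract's proposal replaces this kind of matrix with a learned model over context-derived features.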

Acknowledgements

YS is supported by the Malaysian MyBrain15 MyPhD Scholarship. ON is supported by the Fundamental Research Grant Scheme FRGS/1/2014/SG03/UNIMAS/02/2.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This study was supported by the Malaysia Research Acculturation Grant Scheme of the Ministry of Higher Education Malaysia [grant number RAGS/b(5)/927/2012(28)].