Reviews

An overview of molecular fingerprint similarity search in virtual screening

Pages 137-148 | Received 21 Sep 2015, Accepted 03 Nov 2015, Published online: 04 Dec 2015

Bibliography

• = of interest, •• = of considerable interest
  • Johnson MA, Maggiora G. Concepts and Applications of Molecular Similarity. Johnson MA, Maggiora G, editors. New York (NY): Wiley; 1990.
  • Bender A, Glen RC. Molecular similarity: a key technique in molecular informatics. Org Biomol Chem. 2004;2(22):3204–3218.
  • Roy K, Kar S, Das RN. Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment. New York (NY): Academic Press; 2015.
  • Tropsha A, Golbraikh A. Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Curr Pharm Des. 2007;13(34):3494–3504.
  • Varnek A, Tropsha A. Chemoinformatics approaches to virtual screening. Varnek A, Tropsha A, editors. Cambridge: RSC Publishing; 2008.
  • Willett P. Similarity-based approaches to virtual screening. Biochem Soc Trans. 2003;31(Pt 3):603–606.
  • Cereto-Massague A, Ojeda MJ, Valls C, et al. Molecular fingerprint similarity search in virtual screening. Methods. 2015;71:58–63.
  • Willett P. Similarity-based virtual screening using 2D fingerprints. Drug Discov Today. 2006;11(23–24):1046–1053.
  • Maggiora G, Vogt M, Stumpfe D, et al. Molecular similarity in medicinal chemistry. J Med Chem. 2014;57(8):3186–3204.
  • Walters WP, Stahl MT, Murcko MA. Virtual screening - an overview. Drug Discov Today. 1998;3(4):160–178.
  • Muegge I, Oloff S. Virtual Screening. In: Abraham DJ, Rotella DP, editors. Burger’s Medicinal Chemistry, Drug Discovery, and Development. 7th ed. Hoboken (NJ): Wiley; 2010. p. 1–46.
  • Klebe G. Virtual ligand screening: strategies, perspectives and limitations. Drug Discov Today. 2006;11(13–14):580–594.
  • Shoichet BK. Virtual screening of chemical libraries. Nature. 2004;432(7019):862–865.
  • Bajorath J. Integration of virtual and high-throughput screening. Nat Rev Drug Discov. 2002;1:882–894.
  • Baber JC, Shirley WA, Gao Y, et al. The use of consensus scoring in ligand-based virtual screening. J Chem Inf Model. 2006;46(1):277–288.
  • Zhu T, Cao S, Su PC, et al. Hit identification and optimization in virtual screening: practical recommendations based on a critical literature analysis. J Med Chem. 2013;56(17):6560–6572.
  • Kar S, Roy K. How far can virtual screening take us in drug discovery? Expert Opin Drug Discov. 2013;8(3):245–261.
  • Sheridan RP. Chemical similarity searches: when is complexity justified? Expert Opin Drug Discov. 2007;2(4):423–430.
  • Matter H, Potter T. Comparing 3D Pharmacophore Triplets and 2D Fingerprints for Selecting Diverse Compound Subsets. J Chem Inf Comput Sci. 1999;39:1211–1225.
  • Daylight fingerprints. Daylight Chemical Information Systems, Inc. Mission Viejo (CA); 2015 [cited 2015 Nov 19]. Available from: www.daylight.com.
  • Carhart RE, Smith DH, Venkataraghavan R. Atom Pairs as Molecular Features in Structure-Activity Studies: Definition and Application. J Chem Inf Comput Sci. 1985;25:64–73.
  • MACCS (Molecular ACCess System). 2002. MDL Information Systems, San Leandro (CA).
  • Barnard JM, Downs GM, Von Scholley-Pfab A, et al. Use of Markush structure analysis techniques for descriptor generation and clustering of large combinatorial libraries. J Mol Graph Model. 2000;18(4–5):452–463.
  • PubChem Substructure Fingerprints. 2015 [cited 2015 Nov 19]. Available from: ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_fingerprints.txt.
  • Bender A, Mussa HY, Gill GS, et al. Molecular surface point environments for virtual screening and the elucidation of binding patterns (MOLPRINT 3D). J Med Chem. 2004;47(26):6569–6583.

• introduction of circular fingerprint concept.

  • Rogers D, Hahn M. Extended-connectivity fingerprints. J Chem Inf Model. 2010;50(5):742–754.
  • Schneider G, Neidhart W, Giller T, et al. “Scaffold-Hopping” by Topological Pharmacophore Search: A Contribution to Virtual Screening. Angew Chem Int Ed Engl. 1999;38(19):2894–2896.
  • McGregor MJ, Muskal SM. Pharmacophore Fingerprinting. 1. Application to QSAR and Focused Library Design. J Chem Inf Comput Sci. 1999;39:569–574.
  • McGregor MJ, Muskal SM. Pharmacophore Fingerprinting. 2. Application to Primary Library Design. J Chem Inf Comput Sci. 2000;40:117–125.
  • Mason JS, Cheney DL. Library design and virtual screening using multiple 4-point pharmacophore fingerprints. Pac Symp Biocomput. 2000;5:576–587.
  • Unity 2D fingerprints. Certara USA Inc; 2015 [cited 2015 Nov 19]. Available from: https://www.certara.com/.
  • Schwartz J, Awale M, Reymond JL. SMIfp (SMILES fingerprint) chemical space for virtual screening and visualization of large databases of organic molecules. J Chem Inf Model. 2013;53(8):1979–1989.
  • Deng Z, Chuaqui C, Singh J. Structural interaction fingerprint (SIFt): a novel method for analyzing three-dimensional protein-ligand binding interactions. J Med Chem. 2004;47(2):337–344.
  • Molecular Operating Environment (MOE), 2013.08. Chemical Computing Group Inc., 1010 Sherbrooke St. West, Suite #910, Montreal, QC, Canada, H3A 2R7; 2015.
  • Rarey M, Dixon JS. Feature Trees: A new molecular similarity measure based on tree matching. J Comput Aided Mol Des. 1998;12:471–490.
  • Lesnik S, Stular T, Brus B, et al. LiSiCA: A Software for Ligand-Based Virtual Screening and Its Application for the Discovery of Butyrylcholinesterase Inhibitors. J Chem Inf Model. 2015;55(8):1521–1528.
  • McGaughey GB, Sheridan RP, Bayly CI, et al. Comparison of topological, shape, and docking methods in virtual screening. J Chem Inf Model. 2007;47(4):1504–1519.
  • Muegge I. Synergies of virtual screening approaches. Mini Rev Med Chem. 2008;8(9):927–933.
  • Sheridan RP, Kearsley SK. Why do we need so many chemical similarity search methods? Drug Discov Today. 2002;7(17):903–911.
  • Zhang Q, Muegge I. Scaffold hopping through virtual screening using 2D and 3D similarity descriptors: ranking, voting, and consensus scoring. J Med Chem. 2006;49(5):1536–1548.

• comparison of fingerprint similarities and docking in VS.

  • Awale M, Reymond JL. Atom pair 2D-fingerprints perceive 3D-molecular shape and pharmacophores for very fast virtual screening of ZINC and GDB-17. J Chem Inf Model. 2014;54(7):1892–1907.
  • Huang N, Shoichet BK, Irwin JJ. Benchmarking Sets for Molecular Docking. J Med Chem. 2006;49:6789–6801.
  • Mysinger MM, Carchia M, Irwin JJ, et al. Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J Med Chem. 2012;55(14):6582–6594.
  • Awale M, Jin X, Reymond JL. Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints. J Cheminform. 2015;7:3.
  • Horvath D, Marcou G, Varnek A. Do not hesitate to use Tversky - and other hints for successful active analogue searches with feature count descriptors. J Chem Inf Model. 2013;53(7):1543–1562.
  • Patterson DE, Cramer RD, Ferguson AM, et al. Neighborhood behavior: a useful concept for validation of “molecular diversity” descriptors. J Med Chem. 1996;39(16):3049–3059.

• introduction of neighborhood behavior principle.

  • Papadatos G, Cooper AW, Kadirkamanathan V, et al. Analysis of neighborhood behavior in lead optimization and array design. J Chem Inf Model. 2009;49(2):195–208.
  • Brown RD, Martin YC. Use of Structure-Activity Data To Compare Structure-Based Clustering Methods and Descriptors for Use in Compound Selection. J Chem Inf Comput Sci. 1996;36:572–584.
  • Brown RD, Martin YC. The information content of 2D and 3D structural descriptors relevant to ligand-receptor binding. J Chem Inf Comput Sci. 1997;37:1–9.
  • Martin YC, Kofron JL, Traphagen LM. Do structurally similar molecules have similar biological activity? J Med Chem. 2002;45(19):4350–4358.

•• an analysis of 150 HTS screens affords a quantitative correlation of molecular similarity and probability of activity.

• highlights the dependency of VS outcomes on databases searched against.

  • Muegge I, Oloff S. Advances in virtual screening. Drug Discov Today Technol. 2006;3:405–411.
  • Muegge I, Zhang Q. 3D virtual screening of large combinatorial spaces. Methods. 2015;71:14–20.
  • Hessler G, Zimmermann M, Matter H, et al. Multiple-ligand-based virtual screening: methods and applications of the MTree approach. J Med Chem. 2005;48(21):6575–6584.
  • Yap CW. PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J Comput Chem. 2011;32(7):1466–1474.
  • Klekota J, Roth FP. Chemical substructures that enrich for biological activity. Bioinformatics. 2008;24(21):2518–2525.
  • Steinbeck C, Han Y, Kuhn S, et al. The Chemistry Development Kit (CDK): an open-source Java library for Chemo- and Bioinformatics. J Chem Inf Comput Sci. 2003;43(2):493–500.
  • Ewing T, Baber JC, Feher M. Novel 2D fingerprints for ligand-based virtual screening. J Chem Inf Model. 2006;46(6):2423–2431.
  • Smusz S, Kurczab R, Satala G, et al. Fingerprint-based consensus virtual screening towards structurally new 5-HT(6)R ligands. Bioorg Med Chem Lett. 2015;25(9):1827–1830.
  • Hamza A, Wei NN, Zhan CG. Ligand-based virtual screening approach using a new scoring function. J Chem Inf Model. 2012;52(4):963–974.
  • Hamza A, Wagner JM, Evans TJ, et al. Novel mycosin protease MycP(1) inhibitors identified by virtual screening and 4D fingerprints. J Chem Inf Model. 2014;54(4):1166–1173.
  • JChem version 5.3.1, ChemAxon. 2015 [cited 2015 Nov 19]. Available from: http://www.chemaxon.com.
  • Gabrielsen M, Kurczab R, Siwek A, et al. Identification of novel serotonin transporter compounds by virtual screening. J Chem Inf Model. 2014;54(3):933–943.
  • Marcou G, Rognan D. Optimizing fragment and scaffold docking by use of molecular interaction fingerprints. J Chem Inf Model. 2007;47(1):195–207.
  • Irwin JJ, Shoichet BK. ZINC–a free database of commercially available compounds for virtual screening. J Chem Inf Model. 2005;45(1):177–182.
  • Jansen C, Wang H, Kooistra AJ, et al. Discovery of novel Trypanosoma brucei phosphodiesterase B1 inhibitors by virtual screening against the unliganded TbrPDEB1 crystal structure. J Med Chem. 2013;56(5):2087–2096.
  • Tsigelny IF, Kouznetsova VL, Biswas N, et al. Development of a pharmacophore model for the catecholamine release-inhibitory peptide catestatin: virtual screening and functional testing identify novel small molecule therapeutics of hypertension. Bioorg Med Chem. 2013;21(18):5855–5869.
  • Distinto S, Esposito F, Kirchmair J, et al. Identification of HIV-1 reverse transcriptase dual inhibitors by a combined shape-, 2D-fingerprint- and pharmacophore-based virtual screening approach. Eur J Med Chem. 2012;50:216–229.
  • Hert J, Willett P, Wilton DJ, et al. Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures. Org Biomol Chem. 2004;2(22):3256–3266.
  • Hert J, Willett P, Wilton DJ, et al. Comparison of fingerprint-based methods for virtual screening using multiple bioactive reference structures. J Chem Inf Comput Sci. 2004;44(3):1177–1185.
  • Hert J, Willett P, Wilton DJ, et al. Enhancing the effectiveness of similarity-based virtual screening using nearest-neighbor information. J Med Chem. 2005;48(22):7049–7054.
  • Hert J, Willett P, Wilton DJ, et al. New methods for ligand-based virtual screening: use of data fusion and machine learning to enhance the effectiveness of similarity searching. J Chem Inf Model. 2006;46(2):462–470.
  • Heikamp K, Bajorath J. How do 2D fingerprints detect structurally diverse active compounds? Revealing compound subset-specific fingerprint features through systematic selection. J Chem Inf Model. 2011;51(9):2254–2265.
  • Bender A, Jenkins JL, Scheiber J, et al. How similar are similarity searching methods? A principal component analysis of molecular descriptor space. J Chem Inf Model. 2009;49(1):108–119.
  • Duan J, Dixon SL, Lowrie JF, et al. Analysis and comparison of 2D fingerprints: insights into database screening performance using eight fingerprint methods. J Mol Graph Model. 2010;29(2):157–170.
  • Riniker S, Landrum GA. Open-source platform to benchmark fingerprints for ligand-based virtual screening. J Cheminform. 2013;5(1):26.
  • Whittle M, Gillet VJ, Willett P, et al. Enhancing the effectiveness of virtual screening by fusing nearest neighbor lists: a comparison of similarity coefficients. J Chem Inf Comput Sci. 2004;44(5):1840–1848.
  • Whittle M, Gillet VJ, Willett P, et al. Analysis of data fusion methods in virtual screening: theoretical model. J Chem Inf Model. 2006;46(6):2193–2205.
  • Duesbury E, Holliday J, Willett P. Maximum common substructure-based data fusion in similarity searching. J Chem Inf Model. 2015;55(2):222–230.
  • Riniker S, Fechner N, Landrum GA. Heterogeneous classifier fusion for ligand-based virtual screening: or, how decision making by committee can be a good thing. J Chem Inf Model. 2013;53(11):2829–2836.
  • Whittle M, Gillet VJ, Willett P, et al. Analysis of data fusion methods in virtual screening: similarity and group fusion. J Chem Inf Model. 2006;46(6):2206–2219.
  • Muegge I. Selection Criteria for Drug-Like Compounds. Med Res Rev. 2003;23:302–321.
  • Wassermann AM, Geppert H, Bajorath J. Searching for target-selective compounds using different combinations of multiclass support vector machine ranking methods, kernel functions, and fingerprint descriptors. J Chem Inf Model. 2009;49(3):582–592.
  • Zang Q, Rotroff DM, Judson RS. Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. J Chem Inf Model. 2013;53(12):3244–3261.
  • Liu R, Wallqvist A. Merging applicability domains for in silico assessment of chemical mutagenicity. J Chem Inf Model. 2014;54(3):793–800.
  • Zhou D, Alelyunas Y, Liu R. Scores of extended connectivity fingerprint as descriptors in QSPR study of melting point and aqueous solubility. J Chem Inf Model. 2008;48(5):981–987.
  • Clark AM, Ekins S. Open Source Bayesian Models. 2. Mining a “Big Dataset” To Create and Validate Models with ChEMBL. J Chem Inf Model. 2015;55(6):1246–1260.
  • Clark AM, Dole K, Coulon-Spektor A, et al. Open Source Bayesian Models. 1. Application to ADME/Tox and Drug Discovery Datasets. J Chem Inf Model. 2015;55(6):1231–1245.
  • Ajay, Walters WP, Murcko MA. Can we learn to distinguish between ‘drug-like’ and ‘nondrug-like’ molecules? J Med Chem. 1998;41:3314–3324.
  • Podlogar BL, Muegge I. “Holistic” In Silico Methods to Estimate the Synthetic and CNS Bioavailabilities of Potential Chemotherapeutic Agents. Curr Top Med Chem. 2001;1:257–275.
  • Balfer J, Bajorath J. Introduction of a methodology for visualization and graphical interpretation of Bayesian classification models. J Chem Inf Model. 2014;54(9):2451–2468.
  • Sutherland JJ, Higgs RE, Watson I, et al. Chemical fragments as foundations for understanding target space and activity prediction. J Med Chem. 2008;51(9):2689–2700.
  • Vieth M, Erickson J, Wang J, et al. Kinase inhibitor data modeling and de novo inhibitor design with fragment approaches. J Med Chem. 2009;52(20):6456–6466.
  • Bender A, Jenkins JL, Glick M, et al. “Bayes affinity fingerprints” improve retrieval rates in virtual screening and define orthogonal bioactivity space: when are multitarget drugs a feasible concept? J Chem Inf Model. 2006;46(6):2445–2456.
  • Olah M, Rad R, Ostopovici L, et al. WOMBAT and WOMBAT-PK: bioactivity databases for lead and drug discovery. In: Schreiber KL, Kapoor TM, Wess G, editors. Chemical Biology: From Small Molecules to Systems Biology and Drug Design. Weinheim (Germany): Wiley-VCH Verlag GmbH; 2008. p. 760–786.
  • Martin E, Mukherjee P, Sullivan D, et al. Profile-QSAR: a novel meta-QSAR method that combines activities across the kinase family to accurately predict affinity, selectivity, and cellular activity. J Chem Inf Model. 2011;51(8):1942–1956.
  • Mukherjee P, Martin E. Profile-QSAR and Surrogate AutoShim protein-family modeling of proteases. J Chem Inf Model. 2012;52(9):2430–2440.
  • Gaulton A, Bellis LJ, Bento AP, et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40(Database issue):D1100–D1107.
  • O’Boyle NM, Banck M, James CA, et al. Open Babel: an open chemical toolbox. J Cheminform. 2011;3:33.
  • Steinbeck C, Hoppe C, Kuhn S, et al. Recent developments of the chemistry development kit (CDK) - an open-source java library for chemo- and bioinformatics. Curr Pharm Des. 2006;12(17):2111–2120.
  • Indigo. 2015 [cited 2015 Nov 19]. Available from: http://lifescience.opensource.epam.com/indigo/index.html
  • RDKit: 2015 [cited 2015 Nov 19]. Available from: http://www.rdkit.org.
  • Ambure A, Aher RB, Roy K. Recent Advances in the Open Access Cheminformatics Toolkits, Software Tools, Workflow Environments, and Databases. In: James KY, editors. Methods in Pharmacology and Toxicology. New York (NY): Springer Science+Business Media New York; 2014. p. 1–40.
  • Petrone PM, Wassermann AM, Lounkine E, et al. Biodiversity of small molecules–a new perspective in screening set selection. Drug Discov Today. 2013;18(13–14):674–680.
