Review

Chemoinformatic approaches for navigating large chemical spaces

Pages 403-414 | Received 12 Dec 2023, Accepted 30 Jan 2024, Published online: 05 Feb 2024
 

ABSTRACT

Introduction

Large chemical spaces (CSs) include traditional large compound collections, combinatorial libraries covering billions to trillions of molecules, DNA-encoded chemical libraries comprising complete combinatorial CSs in a single mixture, and virtual CSs explored by generative models. The diverse nature of these types of CSs requires different chemoinformatic approaches for navigation.

Areas covered

An overview of different types of large CSs is provided. Molecular representations and similarity metrics suitable for large CS exploration are discussed. A summary of navigation of CSs in generative models is provided. Methods for characterizing and comparing CSs are discussed.

Expert opinion

The size of large CSs might restrict navigation to specialized algorithms and limit it to considering neighborhoods of structurally similar molecules. Efficient navigation of large CSs requires not only methods that scale with size but also smart approaches that focus on better, but not necessarily larger, molecule selections. Deep generative models aim to provide such approaches by implicitly learning features relevant for targeted biological properties. Whether these models can fulfill this ideal remains unclear, because validation is difficult as long as the covered CSs remain mainly virtual and lack experimental verification.
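As an illustration of what a neighborhood of structurally similar molecules can mean in practice, the following is a minimal sketch and not a method taken from the article. It assumes that each molecule is represented by a fingerprint stored as a set of integer bit indices and that similarity is measured with the Tanimoto coefficient, a common choice that the abstract itself does not name. The function names, the toy fingerprints, and the threshold of 0.5 are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def neighborhood(
    query: Set[int],
    library: Dict[str, Set[int]],
    threshold: float = 0.5,  # hypothetical cutoff, chosen only for illustration
) -> List[Tuple[str, float]]:
    """Return library members with similarity >= threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints: made-up bit indices, not derived from real molecules.
library = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3, 9},
    "mol_c": {7, 8, 9},
}
query = {1, 2, 3, 4, 5}

for name, score in neighborhood(query, library, threshold=0.5):
    print(f"{name}: {score:.2f}")
```

This brute-force scan compares the query against every library member, so its cost grows linearly with library size. That is feasible for a toy dictionary but not for spaces of billions to trillions of molecules, which is one way to read the need for specialized algorithms stated above.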

Article highlights

  • Large chemical spaces include compound collections, combinatorial libraries, DNA-encoded chemical libraries, and virtual chemical spaces explored by generative models.

  • Molecular representations and similarity metrics suitable for large CS exploration are discussed.

  • Approaches to characterizing and comparing CSs are discussed.

  • The size of large chemical spaces might restrict efficient navigation to specialized algorithms and to neighborhoods of structurally similar molecules.

  • Smarter navigation approaches are required that focus on better but not necessarily larger molecule selections.

  • Validation of deep generative models remains challenging as long as CSs of these models remain virtual without experimental verification.

Declaration of interest

The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

Reviewer disclosures

Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

Additional information

Funding

This paper was not funded.
