Data Note

Compound Dataset and Custom Code for Deep Generative Multi-Target Compound Design

Article: FSO715 | Received 10 Mar 2021, Accepted 16 Apr 2021, Published online: 30 Apr 2021
 

Abstract

Aim: Generating a data and software infrastructure for evaluating multi-target compound (MT-CPD) design via deep generative modeling. Methodology: The REINVENT 2.0 approach for generative modeling was extended for MT-CPD design and a large benchmark data set was curated. Exemplary results & data: Proof-of-concept for deep generative MT-CPD design was established. Custom code and the benchmark set comprising 2809 MT-CPDs, 61,928 single-target and 295,395 inactive compounds from biological screens are made freely available. Limitations & next steps: MT-CPD design via deep learning is still at a conceptual stage, and its experimental impact remains to be demonstrated. The data and software we provide enable further investigation of MT-CPD design and the generation of candidate molecules for experimental programs.

Lay abstract

Small molecules with well-defined activity against multiple biological targets are increasingly considered for therapy of complex diseases. Generating such compounds is far from trivial. Therefore, deep machine learning, a form of artificial intelligence, is applied to aid in this process. For this purpose, we have generated a data set and software that we make freely available to further advance deep learning for designing multi-target compounds.

Graphical abstract

A group of three compounds with multi- or single-target activity or no activity (no target). For grouping of compounds according to a set of targets, confirmed activity or inactivity against all targets is taken into account.
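
The grouping rule described in this caption can be illustrated with a minimal Python sketch. It is not the custom code released with this Data Note. The function name `classify_compound`, the dictionary-based activity record, and the reading of "multi-target" as confirmed activity against two or more targets of the set are assumptions made purely for illustration. A compound lacking a confirmed result for any target of the set is left ungrouped.

```python
from typing import List, Mapping, Optional


def classify_compound(
    activity: Mapping[str, Optional[bool]], targets: List[str]
) -> Optional[str]:
    """Group one compound relative to a set of targets.

    `activity` maps a target name to True (confirmed active) or
    False (confirmed inactive). A missing key or None means untested.
    Hypothetical helper; the labels and thresholds are assumptions.
    """
    calls = [activity.get(t) for t in targets]

    # Grouping requires confirmed activity or inactivity against ALL targets.
    if any(c is None for c in calls):
        return None

    n_active = sum(1 for c in calls if c)
    if n_active == 0:
        return "inactive"
    if n_active == 1:
        return "single-target"
    # Assumption: activity against two or more targets counts as multi-target.
    return "multi-target"


if __name__ == "__main__":
    target_set = ["target_A", "target_B"]  # placeholder target names
    print(classify_compound({"target_A": True, "target_B": True}, target_set))
    print(classify_compound({"target_A": True, "target_B": False}, target_set))
    print(classify_compound({"target_A": False, "target_B": False}, target_set))
    print(classify_compound({"target_A": True}, target_set))
```

Under these assumptions, a compound that is active against one target but untested against the other is not counted as single-target, because its inactivity against the second target has not been confirmed.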

Author contributions

T Blaschke and J Bajorath conceived the study; T Blaschke generated the datasets and code and carried out the analysis; T Blaschke and J Bajorath analyzed the results and prepared the manuscript.

Financial & competing interests disclosure

The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

No writing assistance was utilized in the production of this manuscript.