Journal of Enzyme Inhibition and Medicinal Chemistry Volume 21, 2006 - Issue 6

Research Article

Modeling of acetylcholinesterase inhibition by tacrine analogues using Bayesian-regularized Genetic Neural Networks and ensemble averaging

Michael Fernández Molecular Modeling Group, Center for Biotechnological Studies, University of Matanzas, Matanzas, C.P, 44740, Cuba

M Carmo Carreiras Centro de Estudos de Ciências Farmacêuticas, Faculdade de Farmácia da Universidade de Lisboa, Av. das Forças Armadas, 1600-083, Lisboa, Portugal

José L Marco Laboratorio de Radicales Libres (IQOG, CSIC), C/ Juan de la Cierva 3, 28006, Madrid, Spain

Julio Caballero Molecular Modeling Group, Center for Biotechnological Studies, University of Matanzas, Matanzas, C.P, 44740, Cuba

Pages 647-661 | Received 02 Apr 2006, Published online: 04 Oct 2008

https://doi.org/10.1080/14756360600862366

Abstract

Acetylcholinesterase inhibition was modeled for a set of 136 tacrine analogues using Bayesian-regularized Genetic Neural Networks (BRGNNs). In the BRGNN approach the Bayesian-regularization avoids overtraining/overfitting and the genetic algorithm (GA) allows exploring an ample pool of 3D-descriptors. The predictive capacity of our selected model was evaluated by averaging multiple validation sets generated as members of diverse-training set neural network ensembles (NNEs). The ensemble averaging provides reliable statistics. When 40 members are assembled, the NNE provides a reliable measure of training and test set R values of 0.921 and 0.851 respectively. In other respects, the ability of the nonlinear selected GA space for differentiating the data was evidenced when the total data set was well distributed in a Kohonen Self-Organizing Map (SOM). The location of the inhibitors in the map facilitates the analysis of the connection between compounds and serves as a useful tool for qualitative predictions.

Keywords:

Bayesian regularized-Genetic Neural Networks
neural network ensemble
self-organizing maps
acetylcholinesterase inhibitors
QSAR

Introduction

Tacrine (9-amino-1,2,3,4-tetrahydroacridine) was the first drug for the symptomatic treatment of Alzheimer's disease (AD) [Citation1]. The rationale for its use is the elevation of acetylcholine (ACh) levels by reversible inhibition of acetylcholinesterase (AChE), which can compensate for the cholinergic deficit associated with the brain lesions present in AD. However, tacrine has several deficiencies as a drug, related to liver toxicity and peripheral cholinomimetic actions [Citation2]. In order to reduce these undesirable side effects, many analogues of tacrine have been reported [Citation3-11]; most of these are structurally closely related to the parent compound and retain the aminopyridine or aminoquinoline moiety (Figure 1).

Figure 1 Structure of the tacrine analogues from modification of tacrine.

The use of computer-based rational drug design has increased in the last decade. Most of these approaches focus on quantitative structure–activity relationship (QSAR) studies, using different kinds of molecular descriptors to encode chemical information [Citation12]. After computing a set of descriptors, multivariate linear and/or nonlinear relationships are established between a reduced subset of variables and the inhibitory activity, leading to a mathematical model. AChE inhibitors have been approached with many kinds of computational strategies. A thorough source of information on this topic is offered by Dimoglo et al. [Citation13], who made a comprehensive review of the 'state of the art' concerning AChE inhibition modeling.

Artificial neural networks (ANNs) arose from attempts to model the functioning of the human brain [Citation14]. They usually outperform methods limited to linear regression models, such as multiple linear regression or partial least squares [Citation15-18]. Unlike these methods, ANNs can be used to model complex nonlinear relationships. Since biological phenomena are complex by nature, this ability has promoted the use of ANNs in drug design studies. In the current study we model the AChE inhibitory activity of a set of tacrine analogues [Citation3-11]. The data set includes 136 compounds with the biological activity reported as IC₅₀ values. The characteristics of the inhibitors were represented by relevant 3D descriptors extracted by genetic algorithm (GA) feature selection using Bayesian-regularized ANNs (BRANNs) as predictors. The modeling was carried out with a neural network ensemble approach, which provides reliable statistics. In addition, the capacity of the selected variables for differentiating the data was evaluated by means of the unsupervised training of Kohonen Self-Organizing Maps (SOMs).

Materials and methods

Dataset and molecular descriptors

AChE inhibitory activities [log(10⁵/IC₅₀)] of 136 tacrine analogues were taken from the literature [Citation3-11]. The chemical structures and experimental activities are shown in Table I. IC₅₀ values represent the micromolar concentration that inhibits 50% of AChE activity. For inactive compounds the threshold value of log(10⁵/IC₅₀) = 3 (IC₅₀ = 100 μM) was assigned. Since the whole dataset was extracted from reports of several groups, there are some differences between the AChE sources used in the pharmacological assays. In order to evaluate the impact of these differences, we analyzed the correlation between inhibitory activities measured with bovine and human AChEs for 17 compounds reported by Camps et al. [Citation7,Citation8]. Table II shows that bovine log(10⁵/IC₅₀) and human log(10⁵/IC₅₀) values correlate for these compounds (R² > 0.8), and the correlation improved when one poorly fitted compound was removed. In the equation that describes this relationship, the slope tends to 1 and the intercept tends to zero. Furthermore, the equation with a fixed intercept also fits this relationship well. According to this analysis, we consider that the differences between AChE sources do not introduce a large error into our work.
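The activity scale used throughout can be illustrated with a minimal sketch. The function name and the capping of IC₅₀ values above 100 μM are assumptions made here to mirror the threshold rule for inactive compounds; they are not taken from the authors' code.

```python
import math

def activity_from_ic50(ic50_um: float, inactive_threshold_um: float = 100.0) -> float:
    """Return log10(1e5 / IC50) for an IC50 given in micromolar.

    Assumption: compounds with IC50 above the threshold are treated as
    inactive and receive the threshold activity (3 for 100 uM).
    """
    capped = min(ic50_um, inactive_threshold_um)
    return math.log10(1e5 / capped)

# A 1 uM inhibitor maps to 5, and the 100 uM threshold maps to 3.
print(activity_from_ic50(1.0), activity_from_ic50(100.0))
```

On this scale, each unit corresponds to a tenfold change in IC₅₀.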

Table I. Experimental and predicted AChE inhibitory activities (log(10⁵/IC₅₀)) of tacrine analogues^a.

Table II. Correlation between log(10⁵/IC₅₀) using bovine AChE and log(10⁵/IC₅₀) using human AChE^a.

Prior to molecular descriptor calculations, 3D structures of the studied compounds were geometrically optimized using the semiempirical quantum-chemical method PM3 [Citation19] implemented in the MOPAC 6.0 [Citation20] computer software. The 3D descriptors from the Dragon software [Citation21] were calculated for each compound: aromaticity indices [Citation22,Citation23], Randic molecular profiles [Citation24], geometrical descriptors [Citation25], RDF (radial distribution function) descriptors [Citation26], 3D-MoRSE (molecule representation of structures based on electron diffraction) descriptors [Citation27], WHIM (weighted holistic invariant molecular) descriptors [Citation28] and GETAWAY (GEometry, Topology and Atom-Weight AssemblY) descriptors [Citation29]. In all, 721 descriptors were calculated. Descriptors that stayed constant or almost constant were eliminated, and pairs of variables with a correlation coefficient greater than 0.95 were classified as intercorrelated, and only one of these was included in the model. Finally, 271 descriptors were obtained.
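The descriptor pre-filtering step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the variance cut-off for "almost constant" descriptors is hypothetical, and keeping the first descriptor of each intercorrelated pair is an arbitrary choice, since the text does not say which member of a pair was retained.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def filter_descriptors(columns, min_variance=1e-8, max_abs_r=0.95):
    """columns maps descriptor name -> list of values; returns kept names."""
    kept = []
    for name, values in columns.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        if variance <= min_variance:
            continue  # constant or almost constant (cut-off is an assumption)
        if any(abs(pearson(values, columns[k])) > max_abs_r for k in kept):
            continue  # intercorrelated with a descriptor already kept
        kept.append(name)
    return kept
```

Applied to a table of 721 descriptor columns, a filter of this kind is what reduces the pool to a smaller, less redundant set.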

Artificial neural network regression procedure

Only a subset of the available pool of 271 descriptors is statistically significant in terms of correlation with the modeled AChE inhibitory activity. Our general procedure is described in Figure 2.

Figure 2 Flowchart of the modeling procedure.

Bayesian-Regularized Genetic Neural Network (BRGNN) is a framework that combines BRANNs with GA feature selection [Citation30]. Our BRGNN approach is a version of the So and Karplus GA feature selection method [Citation31] incorporating Bayesian regularization.

Bayesian networks are optimal devices for solving learning problems. They diminish the inherent complexity of ANNs, being governed by Occam's Razor, whereby complex models are automatically self-penalizing under Bayes's rule. The Bayesian approach to ANN modeling considers all possible values of network parameters weighted by the probability of each set of weights. The BRANN method was designed by MacKay [Citation32,Citation33] to overcome the deficiencies of ANNs. Only a brief summary is provided here. The Bayesian approach yields a posterior distribution of network parameters P(w|D,H) from a prior probability distribution P(w|H) according to updates provided by the training set D using the BRANN model H. Predictions are expressed in terms of expectations with respect to this posterior distribution. Bayesian methods can simultaneously optimize the regularization constants in ANNs, a process that is very laborious using cross-validation. Instead of trying to find the global minimum, the Bayesian approach finds the (locally) most probable parameters.
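For orientation, the posterior mentioned above can be written out in the standard form used in MacKay's framework. The symbols α, β, E_D and E_W are not defined in this article; they are introduced here as the usual notation (regularization constants, data error and weight penalty), assuming Gaussian noise and a Gaussian prior on the weights.

```latex
% Bayes's rule for the network weights w, given data D and model H
P(\mathbf{w} \mid D, H) = \frac{P(D \mid \mathbf{w}, H)\, P(\mathbf{w} \mid H)}{P(D \mid H)}

% Under the Gaussian assumptions, maximizing the posterior is equivalent
% to minimizing the regularized objective
F(\mathbf{w}) = \beta E_D + \alpha E_W, \qquad
E_D = \tfrac{1}{2} \sum_{n} \bigl(y_n(\mathbf{w}) - t_n\bigr)^2, \qquad
E_W = \tfrac{1}{2} \sum_{i} w_i^2
```

Here t_n are the target activities and y_n the network outputs. The constants α and β are the regularization constants that the Bayesian treatment optimizes alongside the weights.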

The Bayesian approach produces predictors that are robust and well matched to the data. These properties make BRANNs accurate predictors for QSAR analysis [Citation34,Citation35]. They give models which are relatively independent of ANN architecture, above a minimum architecture, since the Bayesian regularization method estimates the number of effective parameters. The concerns about overfitting and overtraining are also eliminated by this method, so that a definitive and reproducible model is attained. Combining BRANNs with GA feature selection (BRGNN) extends the modeling possibilities of BRANNs, as we indicated in previous works [Citation16,Citation17,Citation30]. This method is relatively fast and considers the whole data set in the training process. In other hybrids of ANN and GA, the use of the MSE as fitness function can lead to undesirable solutions: networks that fit well but generalize poorly. BRGNN avoids such results in two ways: 1) keeping network architectures as simple as possible inside the GA framework and 2) implementing Bayesian regularization in the network training function.

Fully connected, three-layer BRANNs with back-propagation training were implemented in the MATLAB environment [Citation36]. In these nets, the transfer functions of input and output layers were linear and the hidden layer had neurons with a hyperbolic tangent transfer function. Inputs and targets took the values from independent variables selected by the GA and log(10⁵/IC₅₀) values, respectively; both were normalized prior to network training. BRANN training was carried out according to the Levenberg-Marquardt optimization [Citation37]. The initial value for μ was 0.005 with decrease and increase factors of 0.1 and 10, respectively. The training was stopped when μ became larger than 10¹⁰.
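The network architecture and the μ schedule described above can be sketched in plain Python. This is only an illustration of the forward pass (linear inputs, hyperbolic tangent hidden layer, linear output) and of the multiplicative μ update with the stated factors; the function names are hypothetical, and the weight update itself (the Levenberg-Marquardt step with Bayesian regularization) is not reproduced.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Output of a three-layer net: tanh hidden neurons, linear output.

    x is the normalized descriptor vector; w_hidden holds one weight row
    per hidden neuron.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def next_mu(mu, step_reduced_error, decrease=0.1, increase=10.0):
    """Shrink mu after a successful step, grow it otherwise."""
    return mu * decrease if step_reduced_error else mu * increase

MU_INITIAL, MU_MAX = 0.005, 1e10  # training stops once mu exceeds MU_MAX
```

With seven inputs and three hidden neurons this corresponds to a 7-3-1 layout.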

The GA implemented in this paper keeps the same characteristics as the one reported in our earlier work [Citation30]. Initially, a set of 50 chromosomes was randomly generated. The population fitness was then calculated and the members were rank ordered according to fitness. The 2 best scoring models were automatically retained as members for the next round of evolution. More progeny models were then created for the next generation by preferentially mating parent models with higher scores. A crossover operator and single-point mutations were used in the evolution process until the best MSE scoring model remained constant for at least 10 generations. Our GA was programmed within the MATLAB environment using the genetic algorithm and neural networks toolboxes [Citation36]. The predictors are BRANNs with a simple architecture (two or three neurons in a sole hidden layer). We used the MSE of data fitting for the BRANN models as the individual fitness function. The best models according to R value (R > 0.8) were selected, and they were tested in cross-validation experiments to avoid chance correlations.
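The evolution loop described above can be sketched as follows. This is a simplified illustration, not the authors' MATLAB code: chromosomes are assumed to be fixed-length lists of distinct descriptor indices, the rank-based mating weights and the mutation rate are hypothetical, and `mse` is a caller-supplied fitness function (in the paper it would be the fitting error of a trained BRANN, which is expensive to evaluate).

```python
import random

def run_ga(n_descriptors, n_selected, mse, pop_size=50, n_elite=2,
           patience=10, mutation_rate=0.1, max_generations=500, seed=0):
    """Return (best_subset, best_mse); lower mse means fitter."""
    rng = random.Random(seed)
    population = [rng.sample(range(n_descriptors), n_selected)
                  for _ in range(pop_size)]
    best, best_score, stalled = None, float("inf"), 0
    for _ in range(max_generations):
        # Rank-order the population by fitness (best first).
        scored = sorted(((mse(c), c) for c in population), key=lambda t: t[0])
        ranked = [c for _, c in scored]
        if scored[0][0] < best_score:
            best_score, best, stalled = scored[0][0], list(ranked[0]), 0
        else:
            stalled += 1
        if stalled >= patience:
            break  # best model unchanged for `patience` generations
        # Elitism: the top models pass unchanged to the next generation.
        children = [list(c) for c in ranked[:n_elite]]
        weights = [pop_size - rank for rank in range(pop_size)]
        while len(children) < pop_size:
            mother, father = rng.choices(ranked, weights=weights, k=2)
            cut = rng.randrange(1, n_selected)  # crossover point
            head = mother[:cut]
            child = (head + [g for g in father if g not in head])[:n_selected]
            if rng.random() < mutation_rate:  # single-point mutation
                unused = [g for g in range(n_descriptors) if g not in child]
                child[rng.randrange(n_selected)] = rng.choice(unused)
            children.append(child)
        population = children
    return best, best_score
```

Because the elite members are copied unchanged, the best score never worsens from one generation to the next, which makes the "constant for 10 generations" stopping rule well defined.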

The predictive power of the model was measured by an external validation process that consists of predicting the activity of unknown compounds forming the test set. To avoid the influence of fortuitous external sets, neural network ensembles (NNEs) were employed, building all members by random partition of the whole data set into training (80%) and test (20%) sets, following the proposal of Agrafiotis et al. [Citation38]. As a result, averaged external predictions were obtained. The quality and reliability of the model were assessed by examining the correlation coefficient R and the MSE of the test set fitting.
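A minimal sketch of this ensemble validation scheme is given below. The function name and the `fit` callback are hypothetical: `fit` stands for any routine that trains a predictor on a training subset and returns a prediction function (a BRANN in the paper, anything else in a test).

```python
import random

def ensemble_test_predictions(X, y, fit, n_members=40, test_fraction=0.2, seed=0):
    """Average, per compound, the predictions made while it was in a test set."""
    rng = random.Random(seed)
    n = len(y)
    n_test = round(n * test_fraction)
    collected = {i: [] for i in range(n)}
    for _ in range(n_members):
        idx = list(range(n))
        rng.shuffle(idx)  # new random partition for every member
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            collected[i].append(predict(X[i]))
    # Compounds never drawn into a test set are omitted from the result.
    return {i: sum(p) / len(p) for i, p in collected.items() if p}
```

For 136 compounds and a 20% test fraction, each member is trained on 109 compounds and tested on 27.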

Assembling multiple versions of a predictor provides 'smoother', more stable predictions [Citation39]. Ensemble averaging minimizes uncertainty and produces more stable and accurate predictors. Recently, our group demonstrated the advantages of the ensemble solution for QSAR validation [Citation40]. The robustness of this method lies in the adequacy of many external predictions; therefore, it can replace the traditional validation processes.

Kohonen self-organizing maps

The Kohonen SOMs [Citation41] are ANNs related to classic clustering algorithms, in that they generate groupings of data points taken to be described by a single vector of typical values. However, the SOMs are distinct from standard clustering methods in that they do not operate with separate clusters: rather, they allocate data points to groups which are related [Citation42].

Kohonen SOMs are networks of spatially related nodes each of which represents a ‘prototype’ of a particular region of data (input) space. Each node comprises a set of weights corresponding to the dimensions of the data. Their characteristic feature is their ability to map nonlinear relations in multidimensional data sets into easily visualizable two-dimensional grids of neurons displaying the topology of a data set. Essentially, SOMs permit the perception of similarities in objects.

To establish structural similarities among the modeled AChE inhibitors, a Kohonen SOM was built. The 3D descriptors selected by GA were used for unsupervised training of 13 × 13 neuron maps. SOMs were implemented in the MATLAB environment [Citation36]. Neurons were initially located in a grid topology. The ordering phase was carried out in 1000 steps with a 0.9 learning rate until a tuning neighborhood distance (1.0) was achieved. The tuning-phase learning rate was 0.02. Training was performed for a period of 2000 epochs in an unsupervised manner.
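The two-phase training described above can be illustrated with a simplified, self-contained SOM. This sketch is not the MATLAB toolbox algorithm: it assumes inputs scaled to [0, 1], presents one randomly chosen sample per step (rather than whole epochs), uses Manhattan distance on the grid, and interpolates the learning rate and neighborhood linearly during the ordering phase.

```python
import random

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k]))

def train_som(data, rows=13, cols=13, order_steps=1000, total_steps=2000,
              order_lr=0.9, tune_lr=0.02, tune_nd=1.0, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in nodes]
    max_nd = float(rows + cols - 2)  # largest grid distance on the map
    for step in range(total_steps):
        if step < order_steps:  # ordering phase: shrink rate and neighborhood
            frac = step / order_steps
            lr = order_lr + (tune_lr - order_lr) * frac
            nd = max_nd + (tune_nd - max_nd) * frac
        else:                   # tuning phase: small rate, fixed neighborhood
            lr, nd = tune_lr, tune_nd
        x = rng.choice(data)
        wr, wc = nodes[best_matching_unit(weights, x)]
        for k, (r, c) in enumerate(nodes):
            if abs(r - wr) + abs(c - wc) <= nd:
                weights[k] = [w + lr * (xi - w) for w, xi in zip(weights[k], x)]
    return nodes, weights
```

After training, `best_matching_unit` gives the map position of any compound, which is how compounds are placed on the grid for visual inspection.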

Results and discussion

Molecular descriptors

The BRGNN methodology was applied to evaluate nonlinear relationships between the inhibitory activity and molecular descriptors. We found a seven-descriptor space able to explain the AChE inhibition; the descriptors that constitute this space are shown in Table III. The selected space includes the geometrical descriptor SPAM, the RDF descriptor RDF075m, four 3D-MoRSE descriptors and the GETAWAY descriptor HATS6u. It is noteworthy that there is no significant intercorrelation between these descriptors, as seen in Table IV.

Table III. Symbols of the 3D-descriptors selected by Genetic Algorithm and their definitions.

Table IV. Correlation matrix of the descriptors selected by Genetic Algorithm.

Geometrical descriptor SPAM [Citation43] describes the 3D structures of tacrine analogues based on the Volkenstein approach for estimation of the statistical properties of polymer chains considering short-range interactions.

RDF descriptors [Citation26] are calculated from the radial distribution function of an ensemble of N atoms, which can be interpreted as the probability distribution of finding an atom in a spherical volume of radius r. Equation (1) represents the radial distribution function code:

$$g(r) = f \sum_{i=1}^{N-1} \sum_{j>i}^{N} A_i A_j \, e^{-B (r - r_{ij})^2} \qquad (1)$$

where f is a scaling factor, N is the number of atoms, A_i and A_j are atomic properties of atoms i and j, r_ij represents the interatomic distances and B is a smoothing parameter that defines the probability distribution of the individual distances. g(r) was calculated at a number of discrete points with defined intervals. Different atomic properties A_i were used, such as atomic mass, atomic van der Waals volumes, atomic Sanderson electronegativities, and atomic polarizabilities. The possibility of choosing an appropriate atomic property gives the RDF space great flexibility for adapting it to the problem under investigation. RDF075m takes into account the atoms around 7.5 Å in the atomic mass weighting scheme.
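As an illustration, the radial distribution function code can be evaluated at a single radius with a few lines of Python. The default values of the smoothing parameter `B` and the scaling factor `f` are placeholders chosen for this sketch, not values stated in the article, and no property scaling is applied to the atomic weights.

```python
import math

def rdf_code(coords, props, r, B=100.0, f=1.0):
    """g(r): property-weighted sum of Gaussians centred on each pair distance.

    coords: list of (x, y, z) in angstroms; props: atomic property per atom.
    """
    total = 0.0
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):
            r_ij = math.dist(coords[i], coords[j])
            total += props[i] * props[j] * math.exp(-B * (r - r_ij) ** 2)
    return f * total

# Two atoms 7.5 angstroms apart contribute fully at r = 7.5.
print(rdf_code([(0, 0, 0), (7.5, 0, 0)], [12.0, 16.0], r=7.5))
```

Evaluating the function at a series of radii yields the descriptor vector; a mass-weighted value at r = 7.5 Å plays the role of RDF075m in this simplified picture.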

The 3D-MoRSE code [Citation27] considers the molecular information derived from an equation used in electron diffraction studies. Electron diffraction does not directly yield atomic coordinates but provides diffraction patterns from which the atomic coordinates are derived by mathematical transformations. The 3D-MoRSE code is given by Equation (2):

$$I(s) = \sum_{i=2}^{N} \sum_{j=1}^{i-1} A_i A_j \, \frac{\sin(s \, r_{ij})}{s \, r_{ij}} \qquad (2)$$

In this equation, N is the number of atoms, A_i and A_j are atomic properties of atoms i and j, r_ij represents the interatomic distances, and s measures the scattering angle. The value of s is considered only at discrete positions within a certain range: values of I(s) are defined at 32 evenly distributed values of s in the range 0–31.0 Å⁻¹. These 32 values constitute the 3D-MoRSE code of the three-dimensional structure of a molecule. As in the RDF approach, several atomic properties A_i were used (atomic masses, atomic van der Waals volumes, atomic Sanderson electronegativities and atomic polarizabilities). The possibility of choosing an appropriate atomic property gives the 3D-MoRSE code great flexibility for adapting it to the problem under investigation. In this work, the selected 3D-MoRSE descriptors were the unweighted Mor32u, Mor03v and Mor17v weighted by van der Waals volumes, and Mor32p weighted by atomic polarizabilities. This code represents a restricted 3D space that captures relevant molecular information related to the modeled AChE inhibitory activity.
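The 3D-MoRSE signal at one value of s can be sketched in the same style. The function names are hypothetical, and the 32-point grid below simply takes s = 0, 1, …, 31 Å⁻¹; how descriptor numbers such as Mor03 map onto that grid is a software convention that is not stated in the text.

```python
import math

def morse_signal(coords, props, s):
    """I(s): pairwise sum of A_i * A_j * sin(s * r_ij) / (s * r_ij)."""
    total = 0.0
    for i in range(1, len(coords)):
        for j in range(i):
            x = s * math.dist(coords[i], coords[j])
            # sin(x)/x tends to 1 as x tends to 0, which covers s = 0.
            total += props[i] * props[j] * (math.sin(x) / x if x else 1.0)
    return total

def morse_code(coords, props):
    """The 32-value code evaluated at s = 0, 1, ..., 31 (inverse angstroms)."""
    return [morse_signal(coords, props, float(s)) for s in range(32)]
```

Setting every entry of `props` to 1 gives the unweighted variant, while atomic volumes or polarizabilities give the weighted ones.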

GETAWAY descriptors [Citation29] are based on the molecular influence matrix (MIM). They match the 3D molecular geometry provided by the MIM, and atom relatedness by molecular topology, with chemical information. The diagonal elements h_ii of the MIM, called leverages, encode atomic information and represent the 'influence' of each atom in determining the whole shape of the molecule; in fact, mantle atoms always have higher h_ii values than atoms near the molecule center. Each off-diagonal element h_ij represents the degree of accessibility of the jth atom to interactions with the ith atom. Specifically, HATS6u belongs to the H-GETAWAY descriptors, which are based on Moreau-Broto autocorrelation descriptors [Citation44]. In such descriptors, geometrical information provided by leverage values is combined with atomic weightings, accounting for specific physicochemical properties of the atoms of the molecule. HATS indices consider the MIM diagonal elements. As in the Moreau-Broto autocorrelations, HATSk(w) descriptors are defined as:

$$\mathrm{HATS}_k(w) = \sum_{i=1}^{A-1} \sum_{j>i} (w_i h_{ii}) (w_j h_{jj}) \, \delta(k; d_{ij})$$

where k (1, 2, …, d) is the path length (lag) in the molecular graph, A is the number of atoms, d_ij is the topological distance between atoms i and j, and w_i and w_j are the elements of the A-dimensional property vector w for atoms i and j. The function δ(k; d_ij) is a Dirac-delta function defined as

$$\delta(k; d_{ij}) = \begin{cases} 1 & \text{if } d_{ij} = k \\ 0 & \text{otherwise} \end{cases}$$
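A leverage-weighted autocorrelation of the HATS type can be sketched as below, assuming the leverages h_ii and the topological distance matrix have already been computed elsewhere. The function name and the toy three-atom values are hypothetical and serve only to show how the lag k selects atom pairs.

```python
def hats(k, weights, leverages, topo_dist):
    """Sum (w_i * h_ii) * (w_j * h_jj) over atom pairs at topological distance k."""
    total = 0.0
    n = len(weights)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if topo_dist[i][j] == k:  # the delta function selects lag k
                total += (weights[i] * leverages[i]) * (weights[j] * leverages[j])
    return total

# Toy three-atom chain with unit weights (the unweighted 'u' scheme).
chain = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
print(hats(1, [1.0, 1.0, 1.0], [0.5, 0.2, 0.5], chain))
```

In this picture HATS6u corresponds to lag k = 6 with unit weights.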

The information contained in the selected RDF, 3D-MoRSE and GETAWAY descriptors is related to distributions of relevant atomic properties across the entire molecule. For this reason, elucidating the key molecular features of the modeled data set from our proposed space is a difficult task. However, the nonlinear structural information obtained here indicates that an adequate distribution of atomic masses, van der Waals atomic volumes and polarizabilities has a great influence on the AChE inhibitory activities of the tacrine analogues. This suggests that molecular size, shape and atomic constitution, rather than electronic properties, play an important role in the activity of the tacrine analogues. These facts agree well with reports showing that the access of inhibitors to the buried AChE active site is strongly limited by the presence of a long and narrow gorge or channel leading from the surface of the enzyme [Citation45]; this imposes shape- and size-related constraints on AChE inhibitors. A striking feature of this gorge is that it is lined by 14 aromatic residues, which make up approximately 40% of its surface and allow the AChE active site to be qualified as hydrophobic.

In order to gain a deeper insight into the relative effects of each 3D descriptor in our model, a recently reported weight-based input ranking scheme was applied. The black-box nature of three-layer ANNs has been "deciphered" in a recent report by Guha et al. [Citation46]. Their method allows an understanding of how an input descriptor is correlated to the output predicted by the network, and consists of two parts. First, the nonlinear transform for a given neuron is linearized. Afterward, the magnitude in which a given neuron affects the downstream output is determined. Next, a ranking scheme for neurons in the hidden layer is developed. The ranking scheme is carried out by determining the Square Contribution Values (SCV) for each hidden neuron (see Reference 46 for details). This method for ANN model interpretation is similar in manner to the partial least squares interpretation method for linear models described by Stanton [Citation47].

Results of the ANN deciphering study appear in Table V. The reported effective weight matrix for our model shows that the third hidden neuron makes the major contribution to the model, with an SCV value 4-fold higher than that of the first hidden neuron. On this neuron, the SPAM and Mor03v descriptors have the highest impacts, equal to −2.774 and 2.676, respectively. From this analysis we can also derive the approximate effect of the selected descriptors. The sign of the weights indicates the trend of the output value. According to the sign of the effective weights in each hidden neuron, no descriptor shows a completely positive or negative trend, which suggests complex nonlinear effects. The most relevant descriptors, SPAM and Mor03v, which have opposite effects in the third hidden neuron (negative and positive, respectively), change their effects in other neurons.

Table V. Effective weight matrix for the optimum model for the AChE inhibitory activity of the studied tacrine analogues^a. Most relevant descriptors appear in bold letter.

Ensemble averaging

The descriptors selected above constitute the vectorial space that best describes the modeled activity. To validate this vectorial space we evaluated the correlation coefficient and standard deviation (Q² and S_CV values) in a leave-one-out (LOO) cross-validation process. The architecture of the network hidden layer was explored, but no relevant differences were found, owing to the Bayesian regularization. Q² reached a value of 0.745 with S_CV = 0.611 when the 7-3-1 architecture was employed. As can be seen, the correlation and predictive power were satisfactory, which also confirms the ability of the GA-selected descriptors to act as relevant information for the modeled activity.
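The leave-one-out statistics can be sketched generically. The article does not give formulas for Q² and S_CV, so this sketch assumes the common definitions Q² = 1 − PRESS/SS and S_CV = sqrt(PRESS/n); the `fit` callback is hypothetical and stands for training one model on the remaining compounds.

```python
import math

def loo_cross_validation(X, y, fit):
    """Return (Q2, S_CV) from leave-one-out predictions.

    fit(X_train, y_train) must return a function that predicts one sample.
    """
    n = len(y)
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        predict = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(X[i])) ** 2
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_total, math.sqrt(press / n)
```

A model that predicts no better than the training mean gives Q² at or below zero, which is the baseline against which a value such as 0.745 should be read.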

In order to obtain a more reliable model, we evaluated the predictive power by creating several training sets and predicting the activities of unknown compounds. Instead of selecting a sole test set, we generated multiple ones by means of the NNE approach and then averaged the external predictions. Members of NNEs were randomly generated by dividing the whole data set into 109 inhibitors for the training sets (80%) and 27 inhibitors for the test sets (20%), keeping the previous BRANN architecture. Averaged multiple correlation coefficients (averaged R) of the test sets for 20 instances of NNEs with 1, 5, 10, 20 and 40 predictors were examined (Figure 3). Each averaged R value stems from averaging 1, 5, 10, 20 or 40 external set R values, each computed on 27 inhibitors.

Figure 3 Multiple correlation coefficients (R) of 20 replicates of neural network ensembles using 1, 5, 10, 20 and 40 members for test sets.

NNEs containing one member correspond to cases in which a single training set and a single test set were selected without integrating them into any ensemble. It is evident that relying on one random partition is highly unsatisfactory: diverse partitions generate a broad scope of external sets, and some are far better predicted than others. Therefore, although it is widely used, assessing the predictive power by means of a single randomly selected external data set yields a rather fortuitous result. It is noteworthy that more reliable information is obtained as the number of members in the ensemble increases. Ensembles containing 40 members are similarly predictive, with averaged test set R values around 0.841. The accumulation of members leads to an averaged model that weights the contribution of each predictor; in this way, deceptive conclusions are suppressed. Since each test set consists of 20% of the whole data set, the probability that a given inhibitor will be selected as part of a test set is low. When a new partition is carried out, the new test set can contain elements from the original test set as well as new elements. Assembling successive members allows the predictions for each compound to be stored. The higher the number of members, the more replicate predictions can be collected; therefore, we can establish a test set prediction for the ensemble by averaging them, which includes all compounds when the number of members is sufficiently large. The NNE thus provides a reliable measure of training and test set R values. Accordingly, we report training set R = 0.921 and test set R = 0.851 when 40 members are assembled. The plots of training and external predictions versus experimental activities for 40-member NNEs are shown in Figure 4.

Figure 4 Plot of predicted versus experimental log(10⁵/IC₅₀) values for AChE inhibition by tacrine analogues using 40-member NNEs. () training predictions; (○) external predictions.

Since the ensemble approach leads to stable ANN models, it becomes easy to inspect the influence of some aspects that may affect the quality of our modeling procedure. The number of hidden nodes defines the number of parameters, which must be carefully adjusted in traditional ANNs. This is less critical in BRANNs; however, it is useful to study the statistical behaviour of our ensemble when the number of hidden nodes is increased. Figure 5 shows the effect of varying the number of hidden nodes on the test set R value for the 40-member NNE. We conclude that, owing to the Bayesian regularization, there is no appreciable effect. In addition, we explored partitions other than 20% for generating test sets. This analysis allows us to evaluate which proportion of the dataset is predictable from the rest, that is, the redundancy in the whole dataset. Figure 6 shows the effect of varying the dataset partition on the test set R value for the 40-member NNE. The statistics of test set fitting decline when the test set exceeds 50% of the data, which indicates that one half of the data contains the information of all the modeled tacrine analogues.

Figure 5 Plot of test set R values vs. number of hidden nodes in neural networks for 40-member NNE.

Figure 6 Plot of test set R values vs. % of compounds in test set in neural networks for 40-member NNE.

Kohonen self-organizing neural network analysis

In order to achieve data differentiation, a Kohonen SOM with 13 × 13 neurons was trained with the GA-selected descriptors as input vectors. In a self-organizing neural network, if two input data vectors are similar, they will be mapped into the same neuron or into very close neurons in the two-dimensional map. Therefore, any group in the map can be interpreted as a set of analogues defined by the vectorial space.

Figure 7 depicts a Kohonen SOM for the 136 modeled AChE inhibitors. It is clearly seen that compounds are adequately distributed across the entire map: 77 out of a total of 169 neurons were occupied. As observed, compounds with a similar range of activities were grouped into neighboring areas. It is noteworthy that there is a kind of gradient from the less-active compounds across the upper zone to the most-active compounds in the lower-right zone. As a consequence, this map can be used to carry out qualitative predictions: the position in the map can be used to assign an approximate range of activity to unknown compounds.

Figure 7 Kohonen SOM for the data set using descriptors selected by genetic algorithm. Maps at bottom represent the ranges of AChE activities [log(10⁵/IC₅₀)].

The analysis of test set predictions of the 136 modeled AChE inhibitors reported in Table I suggests that compounds 5, 24, 32, 46, 69, 71, 74, 79 and 125 can be considered test set outliers (residuals > 2 S_CV) in our vectorial space. The reason for the wrong prediction of these compounds is related to inaccurate associations with similar molecules of unequal activity. Some evidence can be derived from a closer inspection of the Kohonen SOM of Figure 7. Compounds 5, 46, 69, 71, 74, 79 and 125 were predicted as more active than they really are. This can be attributed to the high molecular similarity that these compounds share with more-active compounds, which prevents a correct discrimination; this is suggested by the position they occupy in the map. Compounds with log(10⁵/IC₅₀) values between 3 and 3.99 are in the upper-left zone, but compounds 5 and 74 are outside this zone. Meanwhile, the inactive compound 125 is located far from the other inactive ones. Besides, compounds 46, 69, 71 and 79 were located in the lower-right zone, which is related to the most-active inhibitors. On the other hand, compounds 24 and 32 were associated with less-active inhibitors. This can be inferred from their surroundings in the SOM: compound 32 (log(10⁵/IC₅₀) = 5.77) was located in the upper-left zone, where the majority of compounds have log(10⁵/IC₅₀) values between 3 and 4.99, while compound 24 is located in a complex zone where the majority of compounds present a lower AChE inhibitory activity.
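The outlier criterion (residual larger than twice S_CV) amounts to a simple threshold test. A minimal sketch follows, with a hypothetical function name and made-up activity values used only for illustration; the default S_CV is the 0.611 value from the LOO cross-validation.

```python
def flag_outliers(experimental, predicted, s_cv=0.611, factor=2.0):
    """Return indices whose absolute residual exceeds factor * s_cv."""
    return [i for i, (e, p) in enumerate(zip(experimental, predicted))
            if abs(e - p) > factor * s_cv]

# With S_CV = 0.611 the threshold is 1.222 log units.
print(flag_outliers([3.0, 5.0, 4.0], [4.5, 5.1, 4.0]))
```

The sign of each residual then tells whether a flagged compound was predicted as more or less active than measured.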

Despite the help that an analysis of the SOMs may offer, a simple comparison of the structures of the outliers with the associated compounds in the data set does not reveal why these compounds are poorly predicted. Given the complexity of the modeled activity and the structural diversity of the data set, no obvious explanation is available.

Comparison with previous computational models

Depicting the interaction of tacrine analogues with the AChE active site is not trivial. The flexibility of the ligands, in conjunction with the rather large volume available to them inside the gorge, allows a very large number of binding modes. There is experimental evidence that tacrine is also able to bind at a peripheral site [Citation48]. This knowledge originated from tacrine dimers, which can be more potent and selective than tacrine because their tacrine units bind simultaneously to the catalytic and peripheral sites of AChE [Citation49]. Since some of the tacrine analogues can bind at the catalytic site, while others can bind at the peripheral site, a broad QSAR model would represent an all-embracing relationship for subsets of compounds acting at different or both sites.

In general, previous SAR and QSAR studies have identified hydrophobicity and the presence of an ionizable nitrogen as essential features for the inhibitors to interact with AChE [Citation13]. In addition, docking and molecular dynamics approaches confirm that the 3D positioning of the inhibitor in the active site of the enzyme, i.e. the mode of interaction, varies among different chemical classes.

Some previous QSAR models of AChE inhibition include tacrine analogues. Recanatini et al. carried out a comparative QSAR analysis aimed at identifying the physico-chemical properties governing the inhibitory activity of 13 series of compounds, including benzylamines, physostigmine analogues and 2 series of tacrines [Citation50]. The QSARs for the tacrine series were bilinear models in steric effects. However, the collinearity between steric and hydrophobic parameters did not allow the authors to draw any final conclusion about this model. In other work, Recanatini et al. applied the Hansch approach and CoMFA to a series of 23 tacrine analogues substituted in positions 6 and 7 of the acridine nucleus and bearing selected groups on the 9-amino function [Citation4]. The two methods provided separate models that show satisfactory consistency, pointing to the negative steric effect of substituents in position 7 and the relative steric freedom of position 6 as the main SAR aspects of the tetrahydroacridine-based AChE inhibitors. The report of Martín-Santamaría et al. includes 21 tacrines, 7 huprines, and 7 dihydroquinazolines in a QSAR modeling study [Citation51]. Using the comparative binding energy (COMBINE) methodology, these authors identified the key residues that modulate the inhibitory potencies of these classes of AChE inhibitors. They report an interpretable COMBINE model that was able to fit and predict the activities of the three series of inhibitors reasonably well (Q² = 0.76). They also found a more robust predictive model when the same chemometric analysis was applied to the huprine set alone (Q² = 0.81), but the method was unable to provide predictive models for the other two families when they were treated separately from the rest.
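For reference, the Q² values quoted above are conventionally the cross-validated coefficient of determination. A common definition is given below as an assumption about the cited works, not as a formula taken from them; here y_i is the observed activity of compound i, ŷ_(i) is its prediction from a model fitted without that compound (or without its cross-validation group), ȳ is the mean observed activity, and n is the number of compounds:

```latex
Q^{2} = 1 - \frac{\mathrm{PRESS}}{\mathrm{SS}_{\mathrm{tot}}}
      = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_{(i)} \right)^{2}}
                 {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{2}}
```

Under this definition, Q² = 0.76 means that the cross-validated prediction error sum of squares is 24% of the total variance of the observed activities.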

QSAR studies may be classified as interpretive or predictive, depending on the purpose of the study [Citation52]. Interpretive studies often use a relatively small number of compounds and molecular descriptors that can be easily related to structural characteristics. They attempt to illustrate how the descriptors found to be important in a model relate to the interactions between the ligand and the target. Predictive QSAR studies usually use a large and diverse data set and computationally efficient descriptors. In contrast to the earlier interpretive studies mentioned above, our QSAR model is predictive, since it encompasses a diverse set of tacrine analogues. It is therefore more useful as a predictive tool than as a means of mechanistic evaluation for molecular design. Consequently, we employed molecular descriptors chosen for their computational efficiency and information-rich character. Judged by its predictive ability, the combination of BRANNs and GA leads to a highly satisfactory model.

Conclusions

In conclusion, a large and diverse data set of tacrine analogues with assessed AChE inhibitory activity has been assembled. A nonlinear QSAR model was developed with descriptors generated from 3D molecular structure using the BRGNN methodology. The model successfully explains the modeled structure-activity relationship, in terms of both statistical significance and predictive ability. As in previous studies, this work has generated a system capable of rapid virtual screening of tacrine-related compounds for AChE inhibition. Unlike previous studies, this model is derived from an ample and varied set of tacrine analogues that integrates the current trends in tacrine modification. In this sense, our model provides more robust extrapolation in chemical space than the models reported previously.

References

1. Davis KL, Powchik P. Lancet 1995; 345: 625–630
2. Manning FC. Am Fam Physician 1994; 50: 819–823
3. Rampa A, Bisi A, Belluti F, Gobbi S, Valenti P, Andrisano V, Cavrini V, Cavalli A, Recanatini M. Bioorg Med Chem 2000; 8: 497–506
4. Recanatini M, Cavalli A, Belluti F, Piazzi L, Rampa A, Bisi A, Gobbi S, Valenti P, Andrisano V, Bartolini M, Cavrini V. J Med Chem 2000; 43: 2007–2018
5. Badia A, Baños JE, Camps P, Contreras J, Görbig DM, Muñoz-Torrero D, Simón M, Vivas NM. Bioorg Med Chem 1998; 6: 427–440
6. Camps P, El Achab R, Görbig DM, Morral J, Muñoz-Torrero D, Badía A, Baños JE, Vivas NM, Barril X, Orozco M, Luque FJ. J Med Chem 1999; 42: 3227–3242
7. Camps P, El Achab R, Morral J, Muñoz-Torrero D, Badía A, Baños JE, Vivas NM, Barril X, Orozco M, Luque FJ. J Med Chem 2000; 43: 4657–4666
8. Camps P, Gómez E, Muñoz-Torrero D, Badia A, Vivas NM, Barril X, Orozco M, Luque FJ. J Med Chem 2001; 44: 4733–4736
9. Marco JL, de los Ríos C, Carreiras MC, Baños JE, Badía A, Vivas NM. Bioorg Med Chem 2001; 9: 727–732
10. Marco JL, de los Ríos C, Carreiras MC, Baños JE, Badía A, Vivas NM. Arch Pharm (Weinheim Ger) 2002; 7: 347–353
11. Marco JL, de los Ríos C, García AG, Villarroya M, Carreiras MC, Martins C, Eleutério A, Morreale A, Orozco M, Luque FJ. Bioorg Med Chem 2004; 12: 2199–2218
12. Hansch C, Leo A. Exploring QSAR. Fundamentals and Applications in Chemistry and Biology. ACS Professional Reference Book. American Chemical Society, Washington DC 1995
13. Dimoglo AS, Shvets NM, Tetko IV, Livingstone DJ. Quant Struct-Act Relat 2001; 20: 31–45
14. Zupan J, Gasteiger J. Anal Chim Acta 1991; 248: 1–30
15. Fernández M, Tundidor-Camba A, Caballero J. Mol Simulat 2005; 31: 575–584
16. González MP, Caballero J, Tundidor-Camba A, Helguera AM, Fernández M. Bioorg Med Chem 2006; 14: 200–213
17. Fernández M, Caballero J. Bioorg Med Chem 2006; 14: 280–294
18. Guha R, Jurs PC. J Chem Inf Comput Sci 2004; 44: 2179–2189
19. Stewart JJP. J Comput Chem 1989; 10: 210–220
20. MOPAC 6.0. Frank J. Seiler Research Laboratory, US Air Force Academy, Colorado Springs, CO 1993
21. Todeschini R, Consonni V, Pavan M. DRAGON software version 2.1. 2002
22. Kruszewski J, Krygowski TM. Tetrahedron Lett 1972; 36: 3839–3842
23. Jug K. J Org Chem 1983; 48: 1344–1348
24. Randic M. J Chem Inf Comput Sci 1995; 35: 373–382
25. Kier LB, Hall LH. Molecular connectivity in structure-activity analysis. RSP-Wiley, Chichester, UK 1986
26. Hemmer MC, Steinhauer V, Gasteiger J. Vib Spectrosc 1999; 19: 151–164
27. Schuur J, Selzer P, Gasteiger J. J Chem Inf Comput Sci 1996; 36: 334–344
28. Todeschini R, Lasagni M, Marengo E. J Chemom 1994; 8: 263–272
29. Consonni V, Todeschini R, Pavan M. J Chem Inf Comput Sci 2002; 42: 682–692
30. Caballero J, Fernández M. J Mol Model 2006; 12: 168–181
31. So SS, Karplus M. J Med Chem 1996; 39: 1521–1530
32. MacKay DJC. Neural Comput 1992; 4: 415–447
33. MacKay DJC. Neural Comput 1992; 4: 448–472
34. Burden FR, Winkler DA. J Med Chem 1999; 42: 3183–3187
35. Winkler DA, Burden FR. Biosilico 2004; 2: 104–111
36. MATLAB 7.0. The MathWorks Inc., Natick, MA. http://www.mathworks.com. 2004
37. Foresee FD, Hagan MT. Gauss-Newton approximation to Bayesian learning. Proceedings of the 1997 International Joint Conference on Neural Networks. IEEE, Houston 1997; 1930–1935
38. Agrafiotis DK, Cedeño W, Lobanov VS. J Chem Inf Comput Sci 2002; 42: 903–911
39. Hansen LK, Salamon P. IEEE Trans Pattern Anal Machine Intell 1990; 12: 993–1001
40. Fernández M, Tundidor-Camba A, Caballero J. J Chem Inf Model 2005; 45: 1884–1895
41. Kohonen T. Biol Cybern 1982; 43: 59–69
42. Mangiameli P, Chen SK, West D. Eur J Oper Res 1996; 93: 402–417
43. Volkenstein MV. Configurational Statistics of Polymeric Chains. Wiley-Interscience, NY 1963
44. Moreau G, Broto P. Nouv J Chim 1980; 4: 757–764
45. Sussman JL, Harel M, Frolow F, Oefner C, Goldman A, Toker L, Silman I. Science 1991; 253: 872–879
46. Guha R, Stanton DT, Jurs PC. J Chem Inf Model 2005; 45: 1109–1121
47. Stanton DT. J Chem Inf Comput Sci 2003; 43: 1423–1433
48. Radic Z, Reiner E, Taylor P. Mol Pharmacol 1991; 39: 98–104
49. Pang YP, Quiram P, Jelacic T, Hong F, Brimijoin S. J Biol Chem 1996; 271: 23646–23649
50. Recanatini M, Cavalli A, Hansch C. Chem-Biol Interact 1997; 105: 199–228
51. Martín-Santamaría S, Muñoz-Muriedas J, Luque FJ, Gago F. J Med Chem 2004; 47: 4471–4482
52. Polley MJ, Winkler DA, Burden FR. J Med Chem 2004; 47: 6230–6238
