Original Articles

The impact of variable selection on the modelling of oestrogenicity

Pages 171-190 | Received 13 May 2004, Accepted 26 Aug 2004, Published online: 01 Feb 2007
 

Abstract

Many oestrogenic chemicals exert their activity via specific interactions with the oestrogen receptor (ER). The objective of the present study was to identify significant descriptors associated with the ER binding affinities of a large and diverse set of compounds in order to derive quantitative structure–activity relationships (QSARs). To this end, a variety of statistical methods were employed for variable selection. These included stepwise regression and partial least squares (PLS) analyses, as well as a non-linear recursive partitioning method (Formal Inference-based Recursive Modelling). A total of 157 molecular descriptors, including quantum mechanical and graph theoretical descriptors, indicator variables and log P, were used in the study. Furthermore, cluster analysis of variables was performed to identify groups of descriptors representing similar molecular features. Hierarchical PLS analyses were then performed, in which the scores of the significant components of either PLS or principal component analysis (PCA), performed separately on each cluster, were used as the variables for the top model. This reduced the number of variables representing the larger clusters, leading to a similar number of descriptors for each distinct molecular feature. The results showed that the most important molecular properties for stronger ER binding affinity are molecular size and shape, the presence of a phenol moiety as well as other aromatic groups, hydrophobicity and the presence of double bonds. The best PLS model obtained, in terms of predictive ability, was a hierarchical PLS model. However, a rigorous validation study showed that the multiple linear regression (MLR) model using descriptors selected by stepwise regression has greater predictive power than the PLS models.
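
The hierarchical scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the cluster names and sizes are hypothetical, the assignment of descriptors to clusters is assumed to be known already (the cluster analysis step is skipped), and the helper names block_scores and pls1_fit are invented for illustration. Each cluster is summarised by its first PCA score, and the scores form the variables of a top-level PLS1 model fitted with the NIPALS algorithm.

```python
import numpy as np

def block_scores(X_block, n_components=1):
    """PCA scores of one descriptor cluster (columns autoscaled first)."""
    Z = (X_block - X_block.mean(axis=0)) / X_block.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def pls1_fit(X, y, n_components):
    """PLS1 via NIPALS; returns coefficients for mean-centred X and y."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector
        t = Xr @ w                      # latent scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        c = yr @ t / tt                 # y loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic stand-in data (NOT the study's data): three underlying
# molecular features, each measured by a cluster of noisy descriptors.
rng = np.random.default_rng(0)
n = 60
latent = rng.normal(size=(n, 3))
sizes = {"size_shape": 8, "aromaticity": 3, "hydrophobicity": 2}  # hypothetical
blocks = [latent[:, [k]] + 0.3 * rng.normal(size=(n, m))
          for k, m in enumerate(sizes.values())]
y = latent @ np.array([1.0, 0.5, -0.8]) + 0.3 * rng.normal(size=n)

# Base level: one PCA score per cluster, so an 8-descriptor cluster and a
# 2-descriptor cluster each contribute a single variable to the top model.
top = np.hstack([block_scores(b) for b in blocks])

# Top level: PLS1 on the cluster scores.
coef = pls1_fit(top, y, n_components=2)
pred = (top - top.mean(axis=0)) @ coef + y.mean()
r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(f"top-model variables: {top.shape[1]}, fitted R^2: {r2:.2f}")
```

The fitted R^2 printed here describes the training fit only. As the abstract notes, predictive power has to be judged by a separate validation, which this sketch does not attempt.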

Acknowledgements

This study was funded in part by the European Union EASYRING project (QLK4-2002-02286).

Notes

Presented at the 11th International Workshop on Quantitative Structure–Activity Relationships in the Human Health and Environmental Sciences (QSAR2004), 9–13 May 2004, Liverpool, England.

Additional information

Notes on contributors

M.T.D. Cronin

