Abstract
Development of accurate quantitative structure–activity relationship (QSAR) models requires high-quality, validated data. International regulations such as REACH in Europe now accept (Q)SAR-based evaluations for risk assessment. The number of toxicity datasets available to those wishing to share knowledge, or to use them for data mining and modelling, is continually expanding. The challenge is that these data are currently held in a multitude of different formats. The difficulty of comparing or combining disparate data applies to both public and proprietary sources. The ToxML project addresses the need for a common data exchange standard that allows these data to be represented and communicated in a well-structured electronic format. It is an open standard based on Extensible Markup Language (XML). Supporting information for overall toxicity endpoint data can be included within ToxML files, which makes it possible to assess the quality and detail of the data used in a model. The data file model allows experimental data to be aggregated to the compound level in the detail needed to support (Q)SAR work. The standard is published on a website together with tools to view, edit and download it.
Acknowledgements
The authors wish to thank the current members of the Advisory Board for their support: R. Benz (US FDA), N. Jeliazkova (IdeaConsult Ltd), I. Tetko (VCC LAB), M. Manibusan (EPA), B. Dagallier (OECD) and S. Nath (PointCross Life Sciences). Thanks also to C. Yang (Altamira LLC) for her key role in initiating and launching the ToxML project.