Fingerprint-based detection of acute aquatic toxicity

Published: 04 May 2010

In this work we show the effectiveness of 2D structural fingerprints in predicting the aquatic toxicity of chemical compounds, and we create a self-contained system for structure-based aquatic toxicity classification. Using data from the U.S. Environmental Protection Agency Fathead Minnow (EPA-FHM) dataset [1], we build a non-linear RBF SVM [2] classifier that distinguishes acutely toxic compounds from less toxic compounds, loosely following the criterion stipulated by the E.U. REACH legislation [3]. The classifier achieves up to 86% accuracy in leave-one-out validation using 580 of the dataset's 614 compounds. This performance is comparable with that of models built from the same dataset using more sophisticated molecular descriptors, such as AutoMEP and Sterimol descriptors [4]. We apply our classification model to predict the aquatic toxicity of 3M compounds in the MMsINC database [5]. Furthermore, we create a linear SVM model using the same technique and apply it to the MMsINC data, integrating the EXPLAIN system [6], which shows which structural features are responsible for the model classifying a molecule as less toxic or acutely toxic.
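The leave-one-out validation of a kernel classifier on binary fingerprints can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a kernel perceptron with an RBF kernel stands in for the RBF SVM so that the example needs only the standard library, the six-bit fingerprints and their labels are invented toy data rather than EPA-FHM entries, and the function names and the value of gamma are hypothetical.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel between two equal-length 0/1 fingerprints.

    For binary vectors the squared Euclidean distance equals the
    Hamming distance (the number of differing bits).
    """
    dist_sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * dist_sq)

def decision_score(X, y, alpha, x, gamma):
    """Weighted sum of kernel values between x and the training points."""
    return sum(a * yj * rbf_kernel(xj, x, gamma)
               for a, xj, yj in zip(alpha, X, y))

def train_kernel_perceptron(X, y, gamma=0.5, epochs=20):
    """Stand-in for SVM training (assumption): a simple kernel perceptron.

    alpha[i] counts how often training point i was misclassified.
    """
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_score(X, y, alpha, xi, gamma) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alpha

def predict(X, y, alpha, x, gamma=0.5):
    """Return +1 (acutely toxic) or -1 (less toxic)."""
    return 1 if decision_score(X, y, alpha, x, gamma) > 0 else -1

def leave_one_out_accuracy(X, y, gamma=0.5):
    """Hold out each compound once, train on the rest, score the held-out one."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        alpha = train_kernel_perceptron(X_train, y_train, gamma)
        if predict(X_train, y_train, alpha, X[i], gamma) == y[i]:
            correct += 1
    return correct / len(X)

# Invented toy fingerprints (not real compounds): +1 = acutely toxic, -1 = less toxic.
X = [
    (1, 1, 1, 0, 0, 0),
    (1, 1, 0, 0, 0, 0),
    (1, 0, 1, 0, 0, 0),
    (0, 0, 0, 1, 1, 1),
    (0, 0, 0, 1, 1, 0),
    (0, 0, 0, 1, 0, 1),
]
y = [1, 1, 1, -1, -1, -1]

accuracy = leave_one_out_accuracy(X, y)
print(f"Leave-one-out accuracy on toy data: {accuracy:.2f}")
```

Each compound is predicted by a model that never saw it during training, so the reported accuracy is the fraction of held-out compounds classified correctly. With a real SVM library the training and prediction functions would be replaced, while the hold-out loop stays the same.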

