For virtual screening and similarity searching, numerous descriptors can be employed to represent molecular structures and properties. Together, these descriptors always span a chemical property space. Typically, we lack information on how these spaces are structured and organised because of their high dimensionality. We present a projection method that allows the property spaces of large databases to be visualised while maintaining their high-dimensional spatial structure and neighbourhood behaviour (Figure 1). Visualisation can help us understand how descriptors ‘perceive’ molecules and can give surprising insights into which molecules are actually considered similar in these spaces. Furthermore, we implemented a clustering algorithm that uses convex hulls to separate arbitrary molecule classes, and an automated feature extraction algorithm for identifying space-defining features.
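The abstract does not specify how the convex hulls are computed or how molecules are assigned to them, so the following is only a minimal sketch of the general idea, assuming the molecules have already been projected to 2D coordinates. The hull construction uses Andrew's monotone chain algorithm, chosen here for illustration. The function names (`convex_hull`, `hulls_by_class`, `inside`) and the example coordinates and class labels are hypothetical.

```python
from collections import defaultdict


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def hulls_by_class(projected, labels):
    """Group projected 2D points by class label and build one hull per class."""
    groups = defaultdict(list)
    for point, label in zip(projected, labels):
        groups[label].append(tuple(point))
    return {label: convex_hull(pts) for label, pts in groups.items()}


def inside(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


# Hypothetical projected coordinates for two molecule classes.
projected = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5),
             (3, 3), (4, 3), (4, 4), (3, 4)]
labels = ["a"] * 5 + ["b"] * 4

hulls = hulls_by_class(projected, labels)
query = (3.5, 3.5)
matches = [label for label, hull in hulls.items() if inside(hull, query)]
```

In this toy example the query point falls only inside the hull of class `b`, so `matches` contains that single label. Classes whose hulls overlap in the projection would return more than one label, which is one way to see where a descriptor space fails to separate them.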
Institute of Pharmaceutical Sciences, ETH, Zurich, Switzerland
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.