Expanding the fragrance chemical space for virtual screening
© Ruddigkeit et al.; licensee Chemistry Central Ltd. 2014
Received: 25 March 2014
Accepted: 12 May 2014
Published: 22 May 2014
The properties of fragrance molecules in the public databases SuperScent and Flavornet were analyzed to define a “fragrance-like” (FL) property range (Heavy Atom Count ≤ 21, only C, H, O, S, (O + S) ≤ 3, Hydrogen Bond Donor ≤ 1) and the corresponding chemical space including FL molecules from PubChem (NIH repository of molecules), ChEMBL (bioactive molecules), ZINC (drug-like molecules), and GDB-13 (all possible organic molecules up to 13 atoms of C, N, O, S, Cl). The FL subsets of these databases were classified by MQN (Molecular Quantum Numbers, a set of 42 integer-valued descriptors of molecular structure) and formatted for fast MQN-similarity searching and interactive exploration of color-coded principal component maps in the form of the FL-mapplet and FL-browser applications, freely available at http://www.gdb.unibe.ch. MQN-similarity is shown to efficiently recover 15 different fragrance molecule families from the different FL subsets, demonstrating the relevance of the MQN-based tool for exploring the fragrance chemical space.
Fragrance molecules are relatively small, lipophilic and volatile organic compounds that trigger the sense of smell by interacting with olfactory receptor neurons in the upper part of the nose, which display a diverse array of olfactory G-protein coupled receptors [1–7]. These molecules are essential ingredients in foods, perfumes, soaps, shampoos and lotions, and can be classified according to their perceived smell into tens to hundreds of families. Fragrance molecules form an important class of compounds [9, 10], and a sizable number of them have recently been collected in the public databases SuperScent and Flavornet, which list almost two thousand documented fragrance molecules and their properties.
However, global chemical space analyses of fragrance molecules have so far been very limited [13, 14]. In the context of drug discovery, chemical space is understood as the ensemble of all organic molecules [15–27]. It comprises millions of known molecules collected in public databases such as PubChem, ChemSpider, ZINC or ChEMBL, and a much larger number of theoretically possible molecules such as those in the Chemical Universe Databases GDB-11 [32, 33], GDB-13 and GDB-17, which list all organic molecules possible up to 11, 13 and 17 atoms obeying simple rules for chemical stability and synthetic feasibility [30–33]. Herein we used the concept of chemical space to analyse and visualize fragrance molecules. Starting from the public databases SuperScent and Flavornet, a “fragrance-like” property range was defined and used to expand the fragrance chemical space by extracting fragrance-like molecules from the public databases ChEMBL, PubChem, ZINC and GDB-13 to form the corresponding fragrance-like subsets ChEMBL.FL, PubChem.FL, ZINC.FL and GDB-13.FL. The resulting fragrance-like chemical space was then analyzed using Molecular Quantum Numbers (MQN), a set of 42 simple integer-valued descriptors that count atoms, bonds, polar groups and topological features such as cycles. MQN provide a simple classification system for large databases with good performance in prospective virtual screening [36, 37] as well as for database visualization [38, 39]. The MQN-space approach was used to classify and represent the fragrance-like chemical space in the form of an interactive application, the FL-mapplet, which is adapted from a previously reported MQN-mapplet application for the focused FL chemical space (freely available from http://www.gdb.unibe.ch) [40, 41]. In this visualization, FL-molecules stand out as relatively simple due to their low number of heteroatoms and functional groups, and are therefore appealing from the point of view of organic synthesis.
Fragrance chemistry is constantly searching for new fragrance molecules. A series of 15 different subsets of fragrance molecules were extracted from the SuperScent database and used to test ligand-based virtual screening (LBVS). MQN-similarity sorting enabled the efficient recovery of these known fragrance molecule families from the various FL subsets with equal or better performance than binary substructure fingerprints (Sfp) or extended connectivity fingerprints (ECfp4), illustrating the relevance of the MQN-classification with regard to fragrance molecule properties. The search for MQN-nearest neighbours is enabled by the FL-browser, which might serve as a guide to identify new fragrance molecules.
Results and discussion
Table 1. Databases of molecules used in this work
SuperScent: database of scents from literature
Flavornet: volatile compounds from literature based on GC-MS
SuperSweet: database of carbohydrates and artificial sweeteners
BitterDB: database of bitter compounds from literature and Merck index
PubChem: NIH repository of molecules
ZINC: commercial small molecules
ChEMBL: bioactive drug-like small molecules annotated with experimental data
GDB-13: possible small molecules up to 13 atoms of C, N, O, S, Cl
FragranceDB: SuperScent + Flavornet
TasteDB: SuperSweet + BitterDB
Fragrance-like subset of FragranceDB
ChEMBL.FL: fragrance-like subset of ChEMBL
PubChem.FL: fragrance-like subset of PubChem
ZINC.FL: fragrance-like subset of ZINC
GDB-13.FL: fragrance-like subset of GDB-13
In terms of polarity as estimated by the calculated octanol/water partition coefficient clogP, FragranceDB overlapped well with PubChem, ChEMBL and ZINC by covering the range 0 < clogP < 5, which is a polarity range suitable for rapid diffusion in biological media (Figure 1C). This probably reflects the necessity for fragrance molecules to diffuse from the gas phase to the olfactory neurons to reach their receptors, which requires properties similar to those necessary for drugs to reach their site of action. This property was also shared by the majority of TasteDB. In this case, however, a significant fraction of the database extended into negative clogP values, comprising monosaccharides, disaccharides and related polyols, steviol glycosides, and amino acids and peptides such as aspartame. GDB-13, which reflects the combinatorial enumeration of the entire chemical space, peaked at clogP = 0 due to the large fraction of cationic polyamines in the database, which extend into negative clogP values. Due to its size, however, GDB-13 still contained an extremely large number of molecules in the polarity range of fragrance molecules compared to the other databases.
FragranceDB further stood out as a collection of acyclic and structurally flexible molecules, with an abundance of acyclic aliphatic alcohols, aldehydes, acids and esters found for example in butter and fruit aroma (Figure 1D). Monocyclic molecules were also abundant, in particular cyclic terpenes such as limonene or menthol and aromatics such as cinnamaldehyde. By comparison PubChem, ChEMBL and ZINC were more abundant in polycyclic molecules due to the larger size of their molecules and the tendency to use rigid molecules for medicinal chemistry. On the other hand the combinatorial enumeration in GDB-13, which corresponds to the size-range of fragrance molecules, featured bicyclic molecules as the most frequent topology. TasteDB contained mostly monocyclic molecules, many of which were mono-saccharides, but also extended into polycyclic molecules due to the presence of oligosaccharides and steroids in the collection.
Fragrance-likeness and fragrance-like subsets
The property profiles above indicated that fragrance molecules formed a family of relatively small molecules with a low number of heteroatoms and few cycles, in contrast to taste molecules in TasteDB and drug-like molecules, which covered a much broader range of structural properties. A simple “fragrance-like” (FL) property range was defined as molecules with HAC ≤ 21 containing only carbon, hydrogen, oxygen or sulfur atoms, with a maximum of three heteroatoms (S + O ≤ 3) and a maximum of one hydrogen-bond donor atom (HBD ≤ 1). These FL criteria retained 84% of the molecules listed in the combined database (FragranceDB) and were used to define the fragrance-like subsets PubChem.FL (1.2% of PubChem), ChEMBL.FL (0.68% of ChEMBL), ZINC.FL (0.28% of ZINC) and GDB-13.FL (6.1% of GDB-13) (Table 1). Note that excluding nitrogen-containing molecules from the FL criteria eliminated important fragrance molecules such as pyrazines. However, because of the extremely large number of nitrogen-containing molecules in the reference databases, any subset that admitted nitrogen would have been too strongly enriched in this molecule class, which forms only a minor fraction of fragrance molecules.
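The FL criteria stated above can be written as a short filter. The following Python sketch is only an illustration: it assumes that element counts and the hydrogen-bond donor count have already been computed by a cheminformatics toolkit, and the function name `is_fragrance_like` and its input format are hypothetical rather than taken from this work.

```python
from typing import Mapping

# Elements allowed by the FL criteria (C, H, O, S only)
ALLOWED_ELEMENTS = {"C", "H", "O", "S"}


def is_fragrance_like(atom_counts: Mapping[str, int], hbd: int) -> bool:
    """Check the FL criteria: HAC <= 21, only C/H/O/S, (O + S) <= 3, HBD <= 1.

    atom_counts maps element symbols to atom counts (hydrogens may be
    included or omitted); hbd is the number of hydrogen-bond donor atoms,
    assumed to be computed elsewhere.
    """
    present = {element for element, n in atom_counts.items() if n > 0}
    if not present <= ALLOWED_ELEMENTS:
        return False
    heavy_atoms = sum(n for element, n in atom_counts.items() if element != "H")
    hetero_atoms = atom_counts.get("O", 0) + atom_counts.get("S", 0)
    return heavy_atoms <= 21 and hetero_atoms <= 3 and hbd <= 1


# Illustrative inputs (molecular formulae entered by hand)
menthol = {"C": 10, "H": 20, "O": 1}            # one OH donor
methoxypyrazine = {"C": 5, "H": 6, "N": 2, "O": 1}  # contains nitrogen
glucose = {"C": 6, "H": 12, "O": 6}             # five OH donors

print(is_fragrance_like(menthol, hbd=1))
print(is_fragrance_like(methoxypyrazine, hbd=0))
print(is_fragrance_like(glucose, hbd=5))
```

Menthol passes all four conditions, whereas the pyrazine is rejected by the element filter and glucose by both the heteroatom and the donor limits.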
The property profiles of the FL-subsets showed that FL criteria brought the subsets within the range of FragranceDB. In the HAC profile however, PubChem.FL, ChEMBL.FL and ZINC.FL peaked in the range 15–21 atoms following the abundance of larger molecules in the parent databases, which is substantially higher than the abundance peak of FragranceDB. GDB-13.FL had a sharp abundance peak at HAC = 13 like its parent database GDB-13 (Figure 1E). Most FL molecules from these databases contained three heteroatoms (S + O) while FragranceDB peaked at only two heteroatoms (Figure 1F). Nevertheless FL molecules from PubChem.FL, ChEMBL.FL and ZINC.FL had a somewhat higher clogP indicating higher lipophilicity reflecting their somewhat larger size at similar number of heteroatoms (Figure 1G). GDB-13.FL had a lower clogP value distribution due to the combinatorial enumeration of heteroatom substitutions giving a larger number of possibilities at high numbers of heteroatoms. In contrast to FragranceDB which contains mostly acyclic molecules, the FL subsets were most abundant in monocyclic and bicyclic molecules, again reflecting either the larger molecular size in PubChem.FL, ChEMBL.FL and ZINC.FL, or the larger diversity of cyclic structures formed by combinatorial enumeration in GDB-13.FL (Figure 1H).
Interactive visualization of the fragrance chemical space
To provide a uniform visualization, all FL subsets were represented in the (PC1, PC2)-plane corresponding to the PCA of FragranceDB. As illustrated for GDB-13.FL (Figure 2B) and ZINC.FL (Figure 2C), the layout was similar to that observed previously with MQN datasets of a variety of small molecule databases. The MQN-maps appeared as a left-pointing triangle containing parallel diagonal stripes corresponding to groups of molecules with an increasing number of cycles. In these maps, small molecules appeared at the left and large molecules at the right, acyclic molecules at the bottom and polycyclic molecules at the top. Due to the heteroatom restrictions imposed in the FL criteria, the depth of the FL subsets in the PC3 dimension spanning polarity was rather limited.
An interactive FL-mapplet was then generated by modifying the data in the previously reported MQN-mapplet application. This Java application allows the user to view directly the structural formulae of the compounds in each pixel of the color-coded MQN-maps, and subsequently to access the compound information at the source database (e.g. DrugBank, ChEMBL, ZINC, PubChem). The FL-mapplet was also linked to the MQN-browser for fragrance molecules to enable MQN-nearest neighbour searches (see below). Like the MQN-mapplet, the FL-mapplet can be downloaded as a Java application from gdb.unibe.ch, and contains a link to the same help page providing detailed explanations on how to use the application.
The main advantage of the interactive FL-mapplet is that one can rapidly inspect the structural formulae of the molecules in the various FL-subsets prearranged in the logical layout of the MQN based PCA maps. One of the striking aspects seen by inspecting the FL subsets is that FL-molecules are relatively simple due to the low number of heteroatoms and functional groups. FL compounds are clearly appealing from the point of view of organic synthesis because of their low number of polar functional groups which draws attention to the carbon skeletons classically at the center of synthesis planning. Concerning the FL-subsets presented here, inspecting GDB-13.FL where almost all molecules are novel might prove particularly inspiring for designing new yet tractable synthetic targets in the fragrance chemical space [47, 48].
Ligand-based virtual screening in the FL chemical space
Although fragrance molecules interact simultaneously with hundreds of different olfactory receptors, structure-activity relationships (SAR) in these compounds are not fundamentally different from those of drug-receptor interactions [13, 14]. Certain compound classes are well correlated with fragrance types, e.g. short chain aliphatic esters with fruity flavors. On the other hand, completely different compound classes may elicit the same smell, for example the very different types of musks. Furthermore, subtle differences such as chirality may erase the fragrant property or completely switch the fragrance type, e.g. the classical case of (−)- and (+)-carvone displaying spearmint and caraway flavor, respectively. Despite many such cases of extreme sensitivity of activity to structural alterations, which represent activity cliffs in the SAR landscape, we asked whether ligand-based virtual screening (LBVS) in the FL subsets, as is used to identify drug analogs, might also be useful to identify fragrance molecule analogs. To the best of our knowledge, a systematic study of LBVS in the fragrance chemical space is unprecedented [51, 52].
Table: Recovery of fragrance molecule families from various databases. Columns: FragranceDB recovery at 10%; PubChem.FL recovery at 1%; ChEMBL.FL recovery at 1%; ZINC.FL recovery at 1%; GDB-13.FL recovery at 0.1%; number of best scores per series.
The performance of LBVS for fragrance molecule analogs was further illustrated by displaying the average recovery of actives and of the various databases from the corresponding references as a function of the city-block distance (Figure 3C-F). MQN stood out from the other fingerprints by its ability to differentiate fragrance molecule analogs at low CBD over the other databases including FragranceDB. The sigmoidal shape of the recovery curve for MQN, Sfp and ECfp4, which was absent in the case of MW, illustrates why these fingerprints provide high enrichment factors of actives at low percentage coverage of the various databases.
Overall, MQN performed as well as and sometimes better than ECfp4 and Sfp in LBVS for fragrance molecules, despite the fact that Sfp and ECfp4 contain much more detailed representations of the molecular structure than MQN, suggesting that the MQN-based analysis and visualization presented above were relevant in terms of fragrance molecule properties. This observation confirmed our previous reports that MQN-similarity performs quite well in LBVS of drug analogs, such as the recovery of actives from decoys in the directory of useful decoys (DUD) [39, 55] and the recovery of shape and pharmacophore analogs from GDB-13 [36, 56].
Number of fragrance molecule analogs found by nearest-neighbour searches in the MQN-space of ZINC, ZINC.FL, GDB-13 and GDB-13.FL within the distance boundary CBD MQN ≤ 12
The general properties of fragrance molecules, which are relatively small organic compounds with few polar functional groups and are therefore volatile, were used to define a “fragrance-like” subset of the chemical space, which was extracted from the public databases PubChem, ChEMBL, ZINC and GDB-13. The FL chemical space contains fragment-size, relatively non-polar molecules, and is clearly separate from the well-known drug-like chemical space. The representation of the FL chemical space using interactive color-coded MQN-maps illustrates the extent of the structural diversity at hand. The corresponding FL-mapplet for interactive visualization (Java application to download) and FL-browser for fast MQN-similarity searching of the various FL subsets are freely accessible at gdb.unibe.ch. Inspecting fragrance molecules through these interactive tools shows that FL-molecules are particularly appealing from the point of view of organic synthesis due to their low number of heteroatoms and functional groups.
The fragrance chemical space, although relatively narrowly defined, is currently only sparsely populated compared to its potential, implying that many millions of additional fragrance molecules remain to be discovered. Here we showed that MQN-similarity searching efficiently recovers known fragrance molecule families collected from SuperScent from the various FL subsets, with equal or better performance than the substructure fingerprint Sfp or the extended connectivity fingerprint ECfp4. The ability to perform efficient LBVS by MQN-proximity searching as enabled by the FL-browser suggests that this resource might facilitate the identification of new fragrance molecules by rapidly pointing to compound series to be evaluated.
FragranceDB and TasteDB
Structure representations from SuperScent were retrieved from its chemical classes folder. The list was inspected visually and corrected in a few cases. Names from Flavornet were retrieved and converted with Molconvert from ChemAxon Pvt. Ltd (http://www.chemaxon.com/). Furthermore, in some cases Msketch (from ChemAxon) was used. Both datasets were combined and checked for duplicates to give a final list of 1760 fragrance molecule structures. For TasteDB, structure representations were retrieved from the browsing option of BitterDB and from the Sweet-tree of SuperSweet. Both datasets were combined and checked for duplicates to give a final list of 806 taste structures.
FL-mapplet and MQN-browser for fragrance molecules
The FL-mapplet was adapted from our previously published MQN-mapplet by mapping the various FL-subsets (Table 1) onto the (PC1, PC2)-plane of the PCA calculated for FragranceDB (see Figure 2), creating the corresponding color-coded maps, and importing the data into the MQN-mapplet. For the PCA maps and the assembly of the FL-mapplet, the PC1-PC2 plane was represented by 1000 × 1000 grid points (pixels), and each database molecule was assigned to a point on the grid. Each pixel was color-coded according to the average and standard deviation of a property (e.g. heavy atom count) of the molecules residing in that pixel. The HSL color space was used for the color coding. The base color (H) changes from blue through cyan, green, yellow and red to magenta with increasing average value of the property in the pixel, and fades towards grey with increasing standard deviation. The average molecule of each pixel was then determined as follows: a) the 42 average MQN values were calculated from the MQNs of all molecules in the pixel; b) the city-block distance was calculated between the 42 MQN values of each molecule in the pixel and the 42 average MQN values; c) the molecule with the lowest city-block distance to the average MQN values was taken as the “average molecule” of the pixel.
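The pixel assignment, the selection of the “average molecule” and the color coding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code of the FL-mapplet: the function names, the linear scaling of PC coordinates to pixels, the linear mapping of the property average to hue, and the fixed lightness of 0.5 are all hypothetical choices.

```python
import colorsys
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

GRID = 1000  # the text states a 1000 x 1000 pixel grid


def to_pixel(value: float, lo: float, hi: float, n: int = GRID) -> int:
    """Scale a PC coordinate in [lo, hi] linearly to a pixel index in [0, n - 1]."""
    if hi == lo:
        return 0
    index = int((value - lo) / (hi - lo) * n)
    return min(max(index, 0), n - 1)


def bin_molecules(coords: Sequence[Tuple[float, float]],
                  pc1_range: Tuple[float, float],
                  pc2_range: Tuple[float, float]) -> Dict[Tuple[int, int], List[int]]:
    """Group molecule indices by the (PC1, PC2) pixel they fall into."""
    pixels: Dict[Tuple[int, int], List[int]] = defaultdict(list)
    for i, (pc1, pc2) in enumerate(coords):
        key = (to_pixel(pc1, *pc1_range), to_pixel(pc2, *pc2_range))
        pixels[key].append(i)
    return dict(pixels)


def city_block(a: Sequence[float], b: Sequence[float]) -> float:
    """City-block (Manhattan) distance between two descriptor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def average_molecule(mqns: Sequence[Sequence[int]]) -> int:
    """Steps a) to c): index of the molecule closest to the mean MQN vector."""
    n = len(mqns)
    mean = [sum(column) / n for column in zip(*mqns)]
    return min(range(n), key=lambda i: city_block(mqns[i], mean))


def pixel_color(mean: float, std: float, vmin: float, vmax: float,
                std_max: float) -> Tuple[float, float, float]:
    """RGB color: hue follows the property average, saturation drops with std."""
    t = 0.0 if vmax == vmin else min(max((mean - vmin) / (vmax - vmin), 0.0), 1.0)
    # Hue runs blue (240) -> cyan -> green -> yellow -> red (0) -> magenta (300)
    hue_degrees = (240.0 - 300.0 * t) % 360.0
    saturation = 1.0 if std_max <= 0 else 1.0 - min(std / std_max, 1.0)
    return colorsys.hls_to_rgb(hue_degrees / 360.0, 0.5, saturation)
```

For example, `average_molecule([[1, 0], [3, 0], [10, 0]])` selects the second vector, since it lies closest to the mean, and a pixel whose standard deviation reaches `std_max` is rendered in neutral grey regardless of its average.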
FL-mapplet is a Java application. Details of the application usage are available on the help page accessible from within the application.
The MQN-browser for fragrance molecules is a web-based application which is accessible from within the FL-mapplet or directly at gdb.unibe.ch. This browser was programmed as previously described for the MQN-browser for other databases to allow nearest neighbour searching of any query molecule within the FL-subsets using CBDMQN as similarity measure. Searching in database space is enabled by the use of bit mask values that store the database membership of each structure. A bit was assigned to each database. During similarity searching, the databases chosen by the user are combined into a “wanted bit mask” using the bitwise OR operation.
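The bit mask scheme described above can be illustrated as follows. This Python sketch is hypothetical: the bit positions assigned to each database and the function names are assumptions made for the example, and the test of a structure against the wanted mask with a bitwise AND is a common way to use such masks rather than a detail given in the text.

```python
from typing import Dict, Iterable

# Hypothetical bit assignment: one bit per database
DATABASE_BITS: Dict[str, int] = {
    "FragranceDB": 1 << 0,
    "ChEMBL.FL": 1 << 1,
    "PubChem.FL": 1 << 2,
    "ZINC.FL": 1 << 3,
    "GDB-13.FL": 1 << 4,
}


def make_mask(databases: Iterable[str]) -> int:
    """Combine the bits of the named databases with bitwise OR."""
    mask = 0
    for name in databases:
        mask |= DATABASE_BITS[name]
    return mask


def in_selection(structure_mask: int, wanted_mask: int) -> bool:
    """True if the structure occurs in at least one of the wanted databases."""
    return (structure_mask & wanted_mask) != 0


# A structure listed in both ChEMBL.FL and ZINC.FL
structure_mask = make_mask(["ChEMBL.FL", "ZINC.FL"])
# The user restricts the search to ZINC.FL and GDB-13.FL
wanted_mask = make_mask(["ZINC.FL", "GDB-13.FL"])
print(in_selection(structure_mask, wanted_mask))
```

Storing membership as a single integer per structure keeps the database filter to one bitwise operation per candidate during nearest neighbour searching.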
Ligand-based virtual screening
Enrichment studies for the recovery of various fragrance molecule classes (actives) from the fragrance-like databases (decoys) ChEMBL.FL, FragranceDB, PubChem.FL, ZINC.FL and GDB-13.FL were carried out using a Java program written in-house, with the JChem chemistry library from ChemAxon Ltd. as a starting point. Fragrance classes were collected from the SuperScent database (http://bioinf-applied.charite.de/superscent/). The molecules within each fragrance class were then filtered for duplicates and for the FL criteria. After processing, 15 fragrance classes containing at least 10 molecules each were retained for further study. For enrichment against GDB-13.FL, the fragrance classes were additionally filtered to contain only molecules with a maximum of 13 heavy atoms. This resulted in 12 fragrance classes with at least 10 molecules each.
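The class preparation described above (duplicate removal, FL filtering, an optional heavy atom limit, and a minimum class size) might look as follows in Python. This is a sketch only: molecules are represented by opaque string identifiers, and the predicates `is_fl` and `heavy_atoms` are hypothetical callables standing in for whatever toolkit computes these properties.

```python
from typing import Callable, Dict, Iterable, List, Optional


def filter_classes(classes: Dict[str, Iterable[str]],
                   is_fl: Callable[[str], bool],
                   heavy_atoms: Callable[[str], int],
                   min_size: int = 10,
                   max_hac: Optional[int] = None) -> Dict[str, List[str]]:
    """Keep fragrance classes that still hold at least min_size molecules.

    Each class is deduplicated, restricted to fragrance-like molecules and,
    if max_hac is given (13 for GDB-13.FL in the text), to molecules with
    at most that many heavy atoms.
    """
    kept: Dict[str, List[str]] = {}
    for name, molecules in classes.items():
        unique = list(dict.fromkeys(molecules))  # drop duplicates, keep order
        selected = [m for m in unique if is_fl(m)]
        if max_hac is not None:
            selected = [m for m in selected if heavy_atoms(m) <= max_hac]
        if len(selected) >= min_size:
            kept[name] = selected
    return kept
```

Calling the function once with `max_hac=None` and once with `max_hac=13` would reproduce the two class sets used in the text, assuming the same input classes.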
After ionization of the molecules at pH 7.4, Molecular Quantum Numbers (MQN, 42 dimensions), the Daylight-type binary substructure fingerprint (Sfp, 1024 bits, path length 7), the circular extended connectivity fingerprint with a bond diameter of 4 (ECfp4, 1024 bits) and molecular weight (MW) were calculated for the fragrance molecule classes and the database molecules. Molecular properties and fingerprints were computed with the JChem 5.4.1 chemistry library from ChemAxon Pvt. Ltd. City-block distance (CBD) was used as the scoring function for virtual screening. Within each fingerprint space, enrichment studies were carried out as follows: a) for each of the 15 fragrance molecule classes (defined above, 12 in the case of GDB-13.FL), the reference/query molecule was defined as the compound most similar to all other compounds in the class (the molecule with the lowest CBD to all other compounds); b) each of the 15 fragrance molecule classes (12 in the case of GDB-13.FL) was separately diluted in the five FL databases, giving (4 × 15) + 12 = 72 diluted databases; c) each diluted database was screened against its query molecule using the city-block distance as scoring function; d) each screened database was sorted by increasing CBD to the query molecule, followed by computation of the ROC (receiver operating characteristic) curve and of the enrichment factor (EF) at 0.1%, 1% and 10%. The data in Figure 3A were obtained by averaging the AUC values for the 15 fragrance classes (12 in the case of GDB-13.FL) within each fingerprint space.
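Steps a) to d) above can be sketched in Python for any fixed-length integer fingerprint. The sketch makes several assumptions that are not stated in the text: the enrichment factor uses the common definition (fraction of actives in the top slice divided by the fraction of actives in the whole database), ties in distance are broken by input order, and the top slice always contains at least one molecule. Function names are hypothetical.

```python
from typing import List, Sequence


def city_block(a: Sequence[float], b: Sequence[float]) -> float:
    """City-block distance (CBD) between two fingerprints."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pick_reference(actives: Sequence[Sequence[float]]) -> int:
    """Step a): index of the active with the lowest summed CBD to the others."""
    return min(range(len(actives)),
               key=lambda i: sum(city_block(actives[i], other) for other in actives))


def rank_labels(query: Sequence[float],
                library: Sequence[Sequence[float]],
                labels: Sequence[int]) -> List[int]:
    """Steps c) and d): labels (1 = active, 0 = decoy) sorted by increasing CBD."""
    order = sorted(range(len(library)), key=lambda i: city_block(query, library[i]))
    return [labels[i] for i in order]


def enrichment_factor(ranked: Sequence[int], fraction: float) -> float:
    """EF at the given database fraction (assumed standard definition)."""
    n = len(ranked)
    n_top = max(1, int(fraction * n))
    hit_rate_top = sum(ranked[:n_top]) / n_top
    hit_rate_all = sum(ranked) / n
    return hit_rate_top / hit_rate_all


def roc_auc(ranked: Sequence[int]) -> float:
    """Area under the ROC curve from a ranked label list (ties ignored)."""
    n_actives = sum(ranked)
    n_decoys = len(ranked) - n_actives
    decoys_seen = 0
    actives_before_decoys = 0
    for label in reversed(ranked):
        if label == 0:
            decoys_seen += 1
        else:
            # every decoy ranked after this active counts as a correct pair
            actives_before_decoys += decoys_seen
    return actives_before_decoys / (n_actives * n_decoys)
```

A diluted database in the sense of step b) is then simply the list of active fingerprints concatenated with the decoy fingerprints, with a matching list of 1/0 labels passed to `rank_labels`.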
This work was supported financially by the University of Bern and the Swiss National Science Foundation.
- Buck L, Axel R: A novel multigene family may encode odorant receptors: a molecular basis for odor recognition. Cell. 1991, 65: 175-187. 10.1016/0092-8674(91)90418-X.
- Malnic B, Hirono J, Sato T, Buck LB: Combinatorial receptor codes for odors. Cell. 1999, 96: 713-723. 10.1016/S0092-8674(00)80581-4.
- Shepherd GM: The human sense of smell: are we better than we think?. PLoS Biol. 2004, 2: e146. 10.1371/journal.pbio.0020146.
- Mason JR, Clark L, Morton TH: Selective deficits in the sense of smell caused by chemical modification of the olfactory epithelium. Science. 1984, 226: 1092. 10.1126/science.6494927.
- Briggs MH, Duncan RB: Odour receptors. Nature. 1961, 191: 1310-1311. 10.1038/1911310a0.
- Lledo P-M, Gheusi G, Vincent J-D: Information processing in the mammalian olfactory system. Physiol Rev. 2005, 85: 281-317. 10.1152/physrev.00008.2004.
- Pick H, Etter S, Baud O, Schmauder R, Bordoli L, Schwede T, Vogel H: Dual activities of odorants on olfactory and nuclear hormone receptors. J Biol Chem. 2009, 284: 30547-30555. 10.1074/jbc.M109.040964.
- Kaeppler K, Mueller F: Odor classification: a review of factors influencing perception-based odor arrangements. Chem Senses. 2013, 38: 189-209. 10.1093/chemse/bjs141.
- Kraft P, Bajgrowicz JA, Denis C, Fráter G: Odds and trends: recent developments in the chemistry of odorants. Angew Chem Int Ed. 2000, 39: 2980-3010. 10.1002/1521-3773(20000901)39:17<2980::AID-ANIE2980>3.0.CO;2-#.
- Gautschi M, Bajgrowicz JA, Kraft P: Fragrance chemistry - milestones and perspectives. Chimia. 2001, 55: 379-387.
- Dunkel M, Schmidt U, Struck S, Berger L, Gruening B, Hossbach J, Jaeger IS, Effmert U, Piechulla B, Eriksson R, Knudsen J, Preissner R: SuperScent—a database of flavors and scents. Nucleic Acids Res. 2009, 37: D291-D294. 10.1093/nar/gkn695.
- Arn H, Acree TE: Flavornet: A Database of Aroma Compounds Based on Odor Potency in Natural Products. Developments in Food Science. Volume 40. Edited by: Contis CTHCJMTHPFS ET. 1998, Spanier AM: Elsevier, 27-
- Boyle SM, McInally S, Ray A, Luo L: Expanding the olfactory code by in silico decoding of odor-receptor chemical space. Elife. 2013, 2: e01120. 10.7554/eLife.01120.
- Pal P, Mitra I, Roy K: A quantitative structure–property relationship approach to determine the essential molecular functionalities of potent odorants. Flavour Fragr J. 2013, doi:10.1002/ffj.3191
- Pearlman RS, Smith KM: Novel software tools for chemical diversity. Persp Drug Discovery Des. 1998, 9–11: 339-353.
- Oprea TI, Gottfries J: Chemography: the art of navigating in chemical space. J Comb Chem. 2001, 3: 157-166. 10.1021/cc0000388.
- Medina-Franco JL, Martinez-Mayorga K, Giulianotti MA, Houghten RA, Pinilla C: Visualization of the chemical space in drug discovery. Curr Comput-Aided Drug Des. 2008, 4: 322-333. 10.2174/157340908786786010.
- Medina-Franco JL, Martinez-Mayorga K, Bender A, Marin RM, Giulianotti MA, Pinilla C, Houghten RA: Characterization of activity landscapes using 2D and 3D similarity methods: consensus activity cliffs. J Chem Inf Model. 2009, 49: 477-491. 10.1021/ci800379q.
- Rosen J, Gottfries J, Muresan S, Backlund A, Oprea TI: Novel chemical space exploration via natural products. J Med Chem. 2009, 52: 1953-1962. 10.1021/jm801514w.
- Singh N, Guha R, Giulianotti MA, Pinilla C, Houghten RA, Medina-Franco JL: Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository. J Chem Inf Model. 2009, 49: 1010-1024. 10.1021/ci800426u.
- Akella LB, DeCaprio D: Cheminformatics approaches to analyze diversity in compound screening libraries. Curr Opin Chem Biol. 2010, 14: 325-330. 10.1016/j.cbpa.2010.03.017.
- Reymond JL, Van Deursen R, Blum LC, Ruddigkeit L: Chemical space as a source for new drugs. Med Chem Comm. 2010, 1: 30-38. 10.1039/c0md00020e.
- Le Guilloux V, Colliandre L, Bourg S, Guénegou G, Dubois-Chevalier J, Morin-Allory L: Visual characterization and diversity quantification of chemical libraries: 1. Creation of delimited reference chemical subspaces. J Chem Inf Model. 2011, 51: 1762-1774. 10.1021/ci200051r.
- Reymond JL, Ruddigkeit L, Blum LC, Van Deursen R: The enumeration of chemical space. Wiley Interdiscip Rev Comput Mol Sci. 2012, 2: 717-733. 10.1002/wcms.1104.
- Reymond JL, Awale M: Exploring chemical space for drug discovery using the chemical universe database. ACS Chem Neurosci. 2012, 3: 649-657. 10.1021/cn3000422.
- Yu MJ: Druggable chemical space and enumerative combinatorics. J Cheminf. 2013, 5: 19.
- Virshup AM, Contreras-Garcia J, Wipf P, Yang W, Beratan DN: Stochastic voyages into uncharted chemical space produce a representative library of all possible drug-like compounds. J Am Chem Soc. 2013, 135: 7296-7303. 10.1021/ja401184g.
- Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Bryant SH: PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic Acids Res. 2009, 37: W623-W633. 10.1093/nar/gkp456.
- Williams AJ: Public chemical compound databases. Curr Opin Drug Discov Devel. 2008, 11: 393-404.
- Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman RG: ZINC: a free tool to discover chemistry for biology. J Chem Inf Model. 2012, 52: 1757-1768. 10.1021/ci3001277.
- Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP: ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40: D1100-D1107. 10.1093/nar/gkr777.
- Fink T, Bruggesser H, Reymond JL: Virtual exploration of the small-molecule chemical universe below 160 daltons. Angew Chem Int Ed Engl. 2005, 44: 1504-1508. 10.1002/anie.200462457.
- Fink T, Reymond JL: Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. J Chem Inf Model. 2007, 47: 342-353. 10.1021/ci600423u.
- Blum LC, Reymond JL: 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J Am Chem Soc. 2009, 131: 8732-8733. 10.1021/ja902302h.
- Ruddigkeit L, van Deursen R, Blum LC, Reymond JL: Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J Chem Inf Model. 2012, 52: 2864-2875. 10.1021/ci300415d.
- Blum LC, van Deursen R, Bertrand S, Mayer M, Burgi JJ, Bertrand D, Reymond JL: Discovery of alpha7-Nicotinic receptor ligands by virtual screening of the chemical universe database GDB-13. J Chem Inf Model. 2011, 51: 3105-3112. 10.1021/ci200410u.
- Bürgi JJ, Awale M, Boss SD, Schaer T, Marger F, Viveros-Paredes JM, Bertrand S, Gertsch J, Bertrand D, Reymond J-L: Discovery of potent positive allosteric modulators of the α3β2 Nicotinic acetylcholine receptor by a chemical space in ChEMBL. ACS Chem Neurosci. 2014, doi:10.1021/cn4002297
- Nguyen KT, Blum LC, van Deursen R, Reymond J-L: Classification of organic molecules by molecular quantum numbers. ChemMedChem. 2009, 4: 1803-1805. 10.1002/cmdc.200900317.
- van Deursen R, Blum LC, Reymond JL: A searchable map of PubChem. J Chem Inf Model. 2010, 50: 1924-1934. 10.1021/ci100237q.
- Awale M, van Deursen R, Reymond JL: MQN-mapplet: visualization of chemical space with interactive maps of DrugBank, ChEMBL, PubChem, GDB-11, and GDB-13. J Chem Inf Model. 2013, 53: 509-518. 10.1021/ci300513m.
- Schwartz J, Awale M, Reymond JL: SMIfp (SMILES fingerprint) chemical space for virtual screening and visualization of large databases of organic molecules. J Chem Inf Model. 2013, 53: 1979-1989. 10.1021/ci400206h.
- Wiener A, Shudler M, Levit A, Niv MY: BitterDB: a database of bitter compounds. Nucleic Acids Res. 2012, 40: D413-D419. 10.1093/nar/gkr755.
- Ahmed J, Preissner S, Dunkel M, Worth CL, Eckert A, Preissner R: SuperSweet—a resource on natural and artificial sweetening agents. Nucleic Acids Res. 2011, 39: D377-D382. 10.1093/nar/gkq917.
- Temussi PA: Chapter six - new insights into the characteristics of sweet and bitter taste receptors. Int Rev Cell Mol Biol Volume 291. Edited by: Kwang WJ. 2011, Academic Press, 191-226.
- Congreve M, Carr R, Murray C, Jhoti H: A rule of three for fragment-based lead discovery?. Drug Discov Today. 2003, 8: 876-877.
- Ceunen S, Geuns JMC: Steviol glycosides: chemical diversity, metabolism, and function. J Nat Prod. 2013, 76: 1201-1228. 10.1021/np400203b.
- Narula APS: The search for new fragrance ingredients for functional perfumery. Chem Biodivers. 2004, 1: 1992-2000. 10.1002/cbdv.200490153.
- Plessis C: The search for innovative fragrant molecules. Chem Biodivers. 2008, 5: 1083-1098. 10.1002/cbdv.200890087.
- Sell CS: On the unpredictability of odor. Angew Chem Int Ed. 2006, 45: 6254-6261. 10.1002/anie.200600782.
- Bajorath J: Modeling of activity landscapes for drug discovery. Expert Opin Drug Discovery. 2012, 7: 463-473. 10.1517/17460441.2012.679616.
- Martinez-Mayorga K, Medina-Franco JL: Chapter 2 Chemoinformatics—Applications in Food Chemistry. Advances in Food and Nutrition Research. Volume 58. Edited by: Steve LT. 2009, Academic Press, 33-56.
- Nicholls A, McGaughey GB, Sheridan RP, Good AC, Warren G, Mathieu M, Muchmore SW, Brown SP, Grant JA, Haigh JA, Nevins N, Jain AN, Kelley B: Molecular shape and medicinal chemistry: a perspective. J Med Chem. 2010, 53: 3862-3886. 10.1021/jm900818s.
- Hagadone TR: Molecular substructure similarity searching: efficient retrieval in two-dimensional structure databases. J Chem Inf Comput Sci. 1992, 32: 515-521. 10.1021/ci00009a019.
- Rogers D, Hahn M: Extended-connectivity fingerprints. J Chem Inf Model. 2010, 50: 742-754. 10.1021/ci100050t.
- van Deursen R, Blum LC, Reymond JL: Visualisation of the chemical space of fragments, lead-like and drug-like molecules in PubChem. J Comput-Aided Mol Des. 2011, 25: 649-662. 10.1007/s10822-011-9437-x.
- Blum LC, van Deursen R, Reymond JL: Visualisation and subsets of the chemical universe database GDB-13 for virtual screening. J Comput-Aided Mol Des. 2011, 25: 637-647. 10.1007/s10822-011-9436-y.
- Ruddigkeit L, Blum LC, Reymond JL: Visualization and virtual screening of the chemical universe database GDB-17. J Chem Inf Model. 2013, 53: 56-65. 10.1021/ci300535x.
- Reymond J-L, Blum LC, van Deursen R: Exploring the chemical space of known and unknown organic small molecules at www.gdb.unibe.ch. Chimia. 2011, 65: 863-867. 10.2533/chimia.2011.863.
- Medina-Franco JL, Martínez-Mayorga K, Peppard TL, Del Rio A: Chemoinformatic analysis of GRAS (Generally Recognized as Safe) flavor chemicals and natural products. PLoS One. 2012, 7: e50798. 10.1371/journal.pone.0050798.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.