XMetDB: an open access database for xenobiotic metabolism
© The Author(s) 2016
Received: 30 March 2016
Accepted: 4 September 2016
Published: 15 September 2016
Xenobiotic metabolism is an active research topic, but the limited amount of openly available high-quality biotransformation data constrains predictive modeling. Current databases often default to commonly available information: which enzyme metabolizes a compound, but neither experimental conditions nor the atoms that undergo metabolization are captured. We present XMetDB, an open access database for drugs and other xenobiotics and their respective metabolites. The database contains chemical structures of xenobiotic biotransformations with substrate atoms annotated as reaction centers, the resulting product formed, the catalyzing enzyme, the type of experiment, and literature references. Associated with the database is a web interface for the submission and retrieval of experimental metabolite data for drugs and other xenobiotics in various formats; a web API for programmatic access is also available. The database is open for data deposition, and a curation scheme is in place for quality control. An extensive guide on how to enter experimental data into the database is available from the XMetDB wiki. XMetDB formalizes how biotransformation data should be reported, and the openly available, systematically labeled data is a big step forward towards better models for predictive metabolism.
Xenobiotic metabolism covers the biochemical modification of drugs and other xenobiotics by living organisms. These biotransformations are usually carried out by specialized enzymatic systems such as the cytochrome P450s and the UDP-glucuronosyltransferases, with the aim of making compounds more soluble and more easily excreted from the body.
Understanding how xenobiotic metabolism occurs in the human body is particularly important in two fields: drug discovery and toxicology. In drug discovery, one needs to know which metabolites of a potential drug are formed in order to study their effects. Occasionally, a metabolite is more potent than the drug itself. For example, cyclophosphamide is a prodrug that is reported to be activated by P450s. Furthermore, metabolism can cause unwanted side effects, for example when a compound inhibits certain enzymes and thereby changes the blood concentrations of other drugs, interfering with their use. In toxicology, understanding the metabolism of all types of chemicals is important because, even if the compounds themselves are not toxic, their metabolites may cause toxic effects. This also makes it important to consider metabolism when building in silico models that predict toxicity, as the molecular properties of the original compound and its metabolites may differ significantly. The evaluation of the metabolic fate and metabolism similarity of target and analog compounds in the context of read-across is an essential part of the framework for toxicological assessment proposed by Wu et al., and relevant methods and tools are emphasized in a recent review on in silico approaches for predicting toxicity.
The creation and validation of in silico models for predicting the metabolism of xenobiotics is an active field of research [7–17]. However, the availability of suitable data is limited in two ways. First, the data used to construct these models are either locked away in proprietary and commercial databases, such as the Accelrys Metabolite database (http://accelrys.com/products/collaborative-science/databases/bioactivity-databases/biovia-metabolite.html), or are put together specifically for each new model. The availability of data in a public database, accessible in an open format and curated by the scientific community, would be a major step forward in decreasing the work required to create new models and in enabling comparisons between different models.
Second, experiments that accurately link a metabolic conversion to a specific enzyme are rare. Often microsomes and hepatocytes of animal and human origin are used, in which case it is not always certain which enzyme performs the conversion. Currently, the most detailed experimental procedures for verifying which metabolites are formed use cDNA-expressed drug-metabolizing enzymes, followed by LC/MS analysis of the formed metabolites. A further complication is that even when reference compounds of the metabolites are available in such studies, it is still not always possible to identify accurately at which atomic position the conversion happens, for example in an aromatic ring hydroxylation. Other types of experiments give information about the metabolism, but few studies give the full picture. Nevertheless, precise recording of experimental detail is critical, whatever method is used.
Published data on xenobiotic metabolism, which may serve as an information source for a new database, are currently fragmented over many journals and publications, use different experiment types, and are provided in different formats conforming to different standards. Several existing databases contain information relevant for xenobiotic metabolism, including DrugBank, SuperCYP, hDBMdb, Metrabase, the Human Metabolome Database (HMDB) and Transformer. DrugBank reports substrate, product, and enzyme, and while it includes references, these are typically to other databases rather than to the primary literature. Transformer reports substrate, enzyme, and reference, but not the product. Metrabase focuses on transporters and xenobiotic metabolism. HMDB lists the enzyme and literature but not the species in which the transformation was observed. SuperCYP reports substrate, product, and enzyme. hDBMdb does not seem to be operational anymore. Importantly, these databases do not include atom–atom mapping, which is crucial for building predictive models for site-of-metabolism. Further, none of them has an open application programming interface (API) for direct consumption in scripts and third-party applications, making them less useful for large-scale modeling [26–29].
Further, the commercially available databases are very expensive and sometimes contain unpublished data, which forces additional literature searches to find validation data that can be used in publications. Also, since different formats and standards are used across labs, there is no consistent evaluation system for these kinds of models. Published statistics are therefore not directly comparable between publications, e.g. because the mapping from mechanism to atoms has not been performed in an identical fashion.
This lack of publicly available data has led to numerous repetitive literature searches by academic research groups interested in metabolite prediction modeling. However, finding appropriate literature with enough detail is a challenge in itself. The knowledge about biotransformations is limited and reports about it are scattered. Peer-reviewed literature is not the only source of information, nor does it report all the details we want to capture. Anecdotal examples even show literature citing conference posters and package inserts as primary sources. Other sources include FDA submissions, but these too can provide only limited detail and may also lack information about the products formed.
We here present XMetDB, an open access database for xenobiotic metabolism implemented as an online system for deposition and sharing of experimental data. It is the first database that contains atom reaction center mapping, and it also includes a new reporting standard for this, with data and software available under open licenses.
XMetDB consists of a database, an application layer for interacting with the database, and a user interface. The core data stored in the database are Observations, each of which consists of an experimentally detected enzyme-catalyzed reaction of a substrate that yields a product. XMetDB contains the chemical structures of substrates and products, and also annotates the atoms that are affected by the metabolizing reaction, providing experimentally derived site-of-metabolism annotations. The experimental conditions are limited to the type of experiment (one of enzymes, hepatocytes and microsomes) and the enzymes involved. The species is assumed to be human: the initial design does not allow specifying the species, as the intention was to build a database for human metabolism. Associated structured information includes the uncertainty of the atom mappings (either certain or uncertain), the amount of product formed in the reaction (one of major, minor, or unknown), a literature reference, and comments as free text. Enzymes, such as members of the cytochrome P450 family, can be added separately and are linked to UniProt via their UniProt ID.
In addition, each observation can be tagged as curated or not, and assigned a free-text comment.
All Observations have unique identifiers. For example, the arbitrarily chosen entry XMETDB153 records the P450-mediated oxidation of quinoline into 3-hydroxyquinoline by the CYP2E1 isoform. It comprises a metabolic transformation (substrate, product), the enzyme, identified by its name and UniProt identifier, a literature reference, and a categorization of the experiment. The reference in this example is provided as free text, although increasingly a DOI is given instead of a textual reference.
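The structure of an Observation described above can be sketched as a small data model. This is only an illustration under stated assumptions, not the actual XMetDB schema: the class and field names, the identifier pattern `XMETDB` followed by digits, and the use of 0-based atom positions within the SMILES string are all hypothetical. The experiment type and reference in the example record are placeholders and are not taken from the real XMETDB153 entry.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ExperimentType(Enum):
    ENZYMES = "Enzymes"
    HEPATOCYTES = "Hepatocytes"
    MICROSOMES = "Microsomes"


class ProductAmount(Enum):
    MAJOR = "Major"
    MINOR = "Minor"
    UNKNOWN = "Unknown"


class AtomMappingStatus(Enum):
    CERTAIN = "Certain"
    UNCERTAIN = "Uncertain"


@dataclass(frozen=True)
class Enzyme:
    name: str
    uniprot_id: str


@dataclass
class Observation:
    identifier: str
    substrate_smiles: str
    product_smiles: str
    enzyme: Enzyme
    experiment: ExperimentType
    reference: str  # DOI or free-text reference
    # Reaction-center atoms, here as 0-based positions in the SMILES
    # strings (an assumption made for this sketch only).
    substrate_atoms: List[int] = field(default_factory=list)
    product_atoms: List[int] = field(default_factory=list)
    atom_mapping: AtomMappingStatus = AtomMappingStatus.UNCERTAIN
    product_amount: ProductAmount = ProductAmount.UNKNOWN
    comment: str = ""
    curated: bool = False

    def __post_init__(self) -> None:
        # Assumed identifier pattern, inferred from the example XMETDB153.
        if not re.fullmatch(r"XMETDB\d+", self.identifier):
            raise ValueError(f"unexpected identifier: {self.identifier!r}")


quinoline_oxidation = Observation(
    identifier="XMETDB153",
    substrate_smiles="c1ccc2ncccc2c1",   # quinoline
    product_smiles="Oc1cnc2ccccc2c1",    # 3-hydroxyquinoline
    enzyme=Enzyme(name="CYP2E1", uniprot_id="P05181"),  # human CYP2E1
    experiment=ExperimentType.ENZYMES,   # placeholder, not from the entry
    reference="(reference of the original entry, not reproduced here)",
    substrate_atoms=[6],                 # C3 of quinoline in this SMILES
    product_atoms=[1],                   # hydroxylated carbon in the product
)
```

Using enumerations for the experiment type, product amount, and mapping certainty mirrors the closed vocabularies named in the text and makes invalid values fail early.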
Version 1.0 of XMetDB contains 162 observations from 21 scientific papers from 14 journals, covering 117 chemical structures and 95 enzymes.
XMetDB provides two interfaces: an HTML-based interface for access via a web browser and aimed at humans, and an API set up for programmatic interaction. The key features of the interfaces are: (1) browse and search for observations, (2) submit new observations, and (3) add and browse enzymes. The API is described later, and we first explain the graphical user interface.
The user interface in XMetDB supports searching for observations by status, experiment type, enzyme, reference, XMETID, and allele. It is also possible to search for substrates and products by chemical similarity and substructure, using a structure diagram editor or a chemical identifier (CAS, chemical name, SMILES, InChI, or SMARTS for a substructure search). The resulting list of structures is ordered by similarity; clicking the folder icon returns a list of observations involving that structure.
XMetDB supports a curator role, whose purpose is to ensure high-quality data in the database. Data that has been verified by a curator other than the original submitter has thus been validated by at least one additional person, and each curator has been verified by the administrators as an expert on biotransformations for at least one enzyme family. Curators can edit all observations, but not essential information such as the experiment and enzymes. They can change atom highlighting and comments, correct typos in references, and set the curated flag to yes for any observation. Users may indicate their availability to act as a curator, but the role must be assigned by an administrator.
Export column headers:
- XMetDB observation URI
- XMetDB observation identifier
- SMILES of the substrate structure
- InChI of the substrate structure
- SMILES of the product structure
- InChI of the product structure
- The product amount (one of Major, Minor or Unknown)
- The experiment (one of Hepatocytes, Microsomes, Enzymes)
- Enzyme UniProt ID
- The publication from which the data is taken, either as a DOI or as a plain text reference
- Free text comment
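As a rough illustration of how such an export might be consumed, the sketch below reads a comma-separated export and counts observations per experiment type. The file format, the presence of a header row, and every column name in `COLUMNS` are assumptions made for this example; the text above lists what the columns contain, not what they are called. The sample row is synthetic and only reuses the quinoline example.

```python
import csv
import io
from collections import Counter
from typing import Dict, List

# Hypothetical column names, in the order of the descriptions above.
COLUMNS = [
    "uri", "xmetdb_id",
    "substrate_smiles", "substrate_inchi",
    "product_smiles", "product_inchi",
    "product_amount", "experiment",
    "enzyme_uniprot", "reference", "comment",
]


def read_export(text: str) -> List[Dict[str, str]]:
    """Parse an export with a header row into one dict per observation."""
    reader = csv.DictReader(io.StringIO(text))
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


# Synthetic sample: InChI and reference fields are deliberately left empty.
SAMPLE = "\n".join([
    ",".join(COLUMNS),
    "http://xmetdb.org/protocol/XMETDB153,XMETDB153,"
    "c1ccc2ncccc2c1,,Oc1cnc2ccccc2c1,,Unknown,Enzymes,P05181,,synthetic row",
]) + "\n"

rows = read_export(SAMPLE)
per_experiment = Counter(row["experiment"] for row in rows)
```

Real InChI strings contain commas, so a real export in this format would need them quoted; `csv.DictReader` handles quoted fields without further changes.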
The XMetDB database exposes an open application programming interface (API) that allows the data to be accessed programmatically by software applications (e.g. workflow engines, scripts) and other web services.
Each resource is identified by a URL and supports the standard HTTP operations:
- GET: retrieve the content of the resource, e.g. GET http://xmetdb.org/protocol/XMETDB1
- PUT: update the content of the resource, e.g. PUT (some data to) http://xmetdb.org/protocol/XMETDB1
- POST: create a new resource
- DELETE: remove the resource, e.g. after DELETE http://xmetdb.org/protocol/XMETDB1 this resource no longer exists
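The four operations can be sketched with the Python standard library as follows. The example only builds request objects and does not contact the server. The helper name `build_request` is hypothetical, and sending POST to the collection URL without an identifier is an assumption, since the text gives no example URL for POST. Actually sending the requests (for instance with `urllib.request.urlopen`) would need a reachable server and, presumably, authentication for the write operations.

```python
from typing import Optional
from urllib.request import Request

BASE_URL = "http://xmetdb.org/protocol"
SUPPORTED_METHODS = {"GET", "PUT", "POST", "DELETE"}


def build_request(method: str,
                  identifier: Optional[str] = None,
                  body: Optional[bytes] = None) -> Request:
    """Build (but do not send) an HTTP request for an observation resource."""
    method = method.upper()
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    # Without an identifier the request targets the collection itself,
    # which this sketch assumes is where new resources are created.
    url = BASE_URL if identifier is None else f"{BASE_URL}/{identifier}"
    return Request(url, data=body, method=method)


get_request = build_request("GET", "XMETDB1")
put_request = build_request("PUT", "XMETDB1", body=b"updated content")
post_request = build_request("POST", body=b"new observation")
delete_request = build_request("DELETE", "XMETDB1")
```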
From a deployment perspective, XMetDB consists of two web applications, xmetdb.war and ambit2.war, available both as pre-deployed online web services at http://xmetdb.org and as a downloadable web application that can be deployed in a compatible servlet container. The chemical structures and datasets are represented as described previously, enhanced with a JSON representation to facilitate the user interface implementation (http://ambit.sourceforge.net/api.html). The supported data formats are JSON and RDF (RDF/XML and RDF/N3), and asynchronous jobs are handled according to the OpenTox API version 1.2 specifications.
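One plausible way for a client to choose between these formats is HTTP content negotiation through the `Accept` header. The sketch below assumes that XMetDB selects the representation this way and that it recognizes the common media types for JSON, RDF/XML and N3; neither is confirmed by the text, and the function name `request_as` is hypothetical. As before, the request is only constructed, not sent.

```python
from urllib.request import Request

# Commonly used media types; whether the service uses exactly these
# is an assumption of this sketch.
MEDIA_TYPES = {
    "json": "application/json",
    "rdfxml": "application/rdf+xml",
    "n3": "text/n3",
}


def request_as(url: str, fmt: str) -> Request:
    """Build a GET request asking for the given representation."""
    try:
        media_type = MEDIA_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown format: {fmt}") from None
    return Request(url, headers={"Accept": media_type}, method="GET")


json_request = request_as("http://xmetdb.org/protocol/XMETDB1", "json")
rdf_request = request_as("http://xmetdb.org/protocol/XMETDB1", "rdfxml")
```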
Users and roles
The XMetDB application supports the following user roles: (1) regular user, (2) curator and (3) administrator. A regular user is any registered user who is logged in, and can add observations to the database, save searches as alerts, edit the user profile, and flag availability as a curator. A curator is a regular user who has been approved as a curator. Beyond regular user rights, a curator can confirm that data submitted by other users has been added in a consistent manner, and verify that the atom mappings and references are correct. As part of the curation process, a curator can also edit the atom mappings and metadata of an observation. An administrator is a curator with additional rights (to grant the curator role to other users, modify the enzyme list, delete observations, modify user information, etc.). Searching and browsing the observations, as well as exporting data through the web interface or API, does not require the user to be logged in.
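The cumulative nature of these roles, where each role inherits the rights of the one below it, can be sketched as nested permission sets. The action names are hypothetical labels chosen for this example and do not correspond to identifiers in the XMetDB software. `None` stands for a visitor who is not logged in.

```python
from enum import Enum
from typing import FrozenSet, Optional


class Role(Enum):
    REGULAR = "regular user"
    CURATOR = "curator"
    ADMINISTRATOR = "administrator"


# Rights available without logging in.
ANONYMOUS_RIGHTS: FrozenSet[str] = frozenset({"search", "browse", "export"})

REGULAR_RIGHTS = ANONYMOUS_RIGHTS | {
    "add_observation", "save_search_alert",
    "edit_profile", "flag_curator_availability",
}

CURATOR_RIGHTS = REGULAR_RIGHTS | {
    "approve_observation", "edit_atom_mapping", "edit_observation_metadata",
}

ADMINISTRATOR_RIGHTS = CURATOR_RIGHTS | {
    "grant_curator_role", "modify_enzyme_list",
    "delete_observation", "modify_user_information",
}

RIGHTS_BY_ROLE = {
    Role.REGULAR: REGULAR_RIGHTS,
    Role.CURATOR: CURATOR_RIGHTS,
    Role.ADMINISTRATOR: ADMINISTRATOR_RIGHTS,
}


def is_allowed(role: Optional[Role], action: str) -> bool:
    """Return True if the role (None = not logged in) may perform the action."""
    rights = ANONYMOUS_RIGHTS if role is None else RIGHTS_BY_ROLE[role]
    return action in rights
```

Building each set from the previous one keeps the hierarchy explicit: adding a right to regular users automatically extends it to curators and administrators.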
Discussion and conclusions
Extracting biotransformation data from literature is a cumbersome process, and crowdsourcing initiatives are needed in order to propel scientific discoveries and enable computational model building. The aim of XMetDB is to provide scientists with a public repository where xenobiotic metabolism data can be uploaded, shared, and integrated with other observations. Other databases related to metabolism do not include atom–atom mapping, and in this manuscript we propose a formalization of how xenobiotic metabolism data should be reported in order to improve computational model building. We acknowledge, however, that current experimental methods may not always provide all the detail we ask to be reported. The open access philosophy of XMetDB allows any content to be uploaded, and a curation system allows curators to ensure that the database contents are of high quality. XMetDB hence provides the means for the community to collectively build up a knowledge base over time, and it is our hope that the community will adopt the system and deposit annotated metabolism data upon publication in scientific journals. In the future we also envision importing reaction data from other databases or possibly via text mining, which could encourage reporting and investigations on atom–atom mapping.
The publicly available, systematically labeled data in XMetDB will be a big step forward towards improved models for predictive metabolism. Future plans include integrating XMetDB into Bioclipse and expanding the content with more data.
OS, PR and NJ: Original idea, database and GUI design. EW, CE: Experimental data and interpretations. All authors read and approved the final manuscript.
Funding was received from Stiftelsen Olle Engkvist Byggmästare, and the Swedish strategic research program eSSENCE. During the XMetDB project, we were shocked by the tragic death of Patrik Rydberg. We would here like to acknowledge his scientific contributions in the field of xenobiotic metabolism and will continue the XMetDB project in his memory.
The authors declare that they have no competing interests.
XMetDB is available at http://www.xmetdb.org. The data is available under the CC-BY license while the software is licensed under the LGPL.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Testa B, Pedretti A, Vistoli G (2012) Reactions and enzymes in the metabolism of drugs and other xenobiotics. Drug Discov Today 17(11–12):549–60
- Claesson A, Spjuth O (2013) On mechanisms of reactive metabolite formation from drugs. Mini Rev Med Chem 13(5):720–9
- Huttunen KM, Raunio H, Rautio J (2011) Prodrugs—from serendipity to rational design. Pharmacol Rev 63(3):750–771
- Ito K, Iwatsubo T, Kanamitsu S, Ueda K, Suzuki H, Sugiyama Y (1998) Prediction of pharmacokinetic alterations caused by drug–drug interactions: metabolic interaction in the liver. Pharmacol Rev 50(3):387–412
- Wu S, Blackburn K, Amburgey J, Jaworska J, Federle T (2010) A framework for using structural, reactivity, metabolic and physicochemical similarity to evaluate the suitability of analogs for SAR-based toxicological assessments. Regul Toxicol Pharmacol 56(1):67–81
- Patlewicz GY, Fitzpatrick J (2015) Current and future perspectives on the development, evaluation and application of in silico approaches for predicting toxicity. Chem Res Toxicol 5:00388
- Rostkowski M, Spjuth O, Rydberg P (2013) WhichCyp: prediction of cytochromes P450 inhibition. Bioinformatics 29(16):2051–2052
- Kirchmair J, Williamson MJ, Tyzack JD, Tan L, Bond PJ, Bender A, Glen RC (2012) Computational prediction of metabolism: sites, products, SAR, P450 enzyme dynamics, and mechanisms. J Chem Inf Model 52(3):617–48
- Kirchmair J, Bender A, Kubinyi H, Folkers G (2014) Drug metabolism prediction. Methods and principles in medicinal chemistry, vol 63. Wiley, Weinheim
- Kirchmair J, Göller AH, Lang D, Kunze J, Testa B, Wilson ID, Glen RC, Schneider G (2015) Predicting drug metabolism: experiment and/or computation? Nat Rev Drug Discov 14(6):387–404. doi:10.1038/nrd4581
- Carlsson L, Spjuth O, Adams S, Glen RC, Boyer S (2010) Use of historic metabolic biotransformation data as a means of anticipating metabolic sites using MetaPrint2D and Bioclipse. BMC Bioinform 11:362
- Piechota P, Cronin MTD, Hewitt M, Madden JC (2013) Pragmatic approaches to using computational methods to predict xenobiotic metabolism. J Chem Inf Model 53(6):1282–92
- Rudik AV, Dmitriev AV, Lagunin AA, Filimonov DA, Poroikov VV (2014) Metabolism site prediction based on xenobiotic structural formulas and pass prediction algorithm. J Chem Inf Model 54(2):498–507
- Rydberg P, Olsen L (2012) Ligand-based site of metabolism prediction for cytochrome P450 2D6. ACS Med Chem Lett 3(1):69–73
- Kolanczyk RC, Schmieder P, Jones WJ, Mekenyan OG, Chapkanov A, Temelkov S, Kotov S, Velikova M, Kamenska V, Vasilev K, Veith GD (2012) MetaPath: an electronic knowledge base for collating, exchanging and analyzing case studies of xenobiotic metabolism. Regul Toxicol Pharmacol RTP 63(1):84–96
- Rudik A, Dmitriev A, Lagunin A, Filimonov D, Poroikov V (2015) SOMP: web server for in silico prediction of sites of metabolism for drug-like compounds. Bioinformatics 31(12):2046–2048
- Lapins M, Worachartcheewan A, Spjuth O, Georgiev V, Prachayasittikul V, Nantasenamat C, Wikberg JES (2013) A unified proteochemometric model for prediction of inhibition of cytochrome P450 isoforms. PLoS ONE 8(6):66566
- Iwatsubo T, Hirota N, Ooie T, Suzuki H, Shimada N, Chiba K, Ishizaki T, Green CE, Tyson CA, Sugiyama Y (1997) Prediction of in vivo drug metabolism in the human liver from in vitro metabolism data. Pharmacol Ther 73(2):147–171
- Ekins S, Ring BJ, Grace J, McRobie-Belle DJ, Wrighton SA (2000) Present and future in vitro approaches for drug metabolism. J Pharmacol Toxicol Methods 44(1):313–324
- Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J (2006) DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 34:668–72 (Database issue)
- Preissner S, Kroll K, Dunkel M, Senger C, Goldsobel G, Kuzman D, Guenther S, Winnenburg R, Schroeder M, Preissner R (2010) SuperCYP: a comprehensive database on cytochrome P450 enzymes including a tool for analysis of CYP–drug interactions. Nucleic Acids Res 38:237–43 (Database issue)
- Erhardt PW (2003) A human drug metabolism database: potential roles in the quantitative predictions of drug metabolism and metabolism-related drug–drug interactions. Curr Drug Metab 4(5):411–22
- Mak L, Marcus D, Howlett A, Yarova G, Duchateau G, Klaffke W, Bender A, Glen RC (2015) Metrabase: a cheminformatics and bioinformatics database for small molecule transporter data analysis and (Q)SAR modeling. J Cheminform 7:31. doi:10.1186/s13321-015-0083-5
- Wishart DS, Jewison T, Guo AC, Wilson M, Knox C, Liu Y, Djoumbou Y, Mandal R, Aziat F, Dong E, Bouatra S, Sinelnikov I, Arndt D, Xia J, Liu P, Yallou F, Bjorndahl T, Perez-Pineiro R, Eisner R, Allen F, Neveu V, Greiner R, Scalbert A (2013) HMDB 3.0—the human metabolome database in 2013. Nucleic Acids Res 41:801–7. doi:10.1093/nar/gks1065 (Database issue)
- Hoffmann MF, Preissner SC, Nickel J, Dunkel M, Preissner R, Preissner S (2014) The Transformer database: biotransformation of xenobiotics. Nucleic Acids Res 42:1113–7 (Database issue)
- Jeliazkova N (2012) Web tools for predictive toxicology model building. Expert Opin Drug Metab Toxicol 8(5):1–11
- Frey JG, Bird CL (2013) Cheminformatics and the semantic web: adding value with linked data and enhanced provenance. Wiley Interdiscip Rev Comput Mol Sci 3(5):465–481
- Goldmann D, Montanari F, Richter L, Zdrazil B, Ecker GF (2014) Exploiting open data: a new era in pharmacoinformatics. Future Med Chem 6(5):503–14
- Williams AJ, Harland L, Groth P, Pettifer S, Chichester C, Willighagen EL, Evelo CT, Blomberg N, Ecker G, Goble C, Mons B (2012) Open PHACTS: semantic interoperability for drug discovery. Drug Discov Today 17(21–22):1188–98
- DeLeon A, Patel NC, Crismon ML (2004) Aripiprazole: a comprehensive review of its pharmacology, clinical efficacy, and tolerability. Clin Ther 26(5):649–66
- Willighagen EL, Jeliazkova N, Hardy B, Grafström RC, Spjuth O (2011) Computational toxicology using the OpenTox application programming interface and Bioclipse. BMC Res Notes 4:487
- Jeliazkova N, Jeliazkov V (2011) AMBIT RESTful web services: an implementation of the OpenTox application programming interface. J Cheminform 3:18
- Jeliazkova N, Kochev N (2011) AMBIT-SMARTS: efficient searching of chemical structures and fragments. Mol Inform 30(8):707–720
- Burger M (2015) ChemDoodle web components: HTML5 toolkit for chemical graphics, interfaces, and informatics. J Cheminform 7(1):1–7
- Pico AR, Kelder T, van Iersel MP, Hanspers K, Conklin BR, Evelo C (2008) WikiPathways: pathway editing for the people. PLoS Biol 6(7):184
- Kutmon M, Riutta A, Nunes N, Hanspers K, Willighagen EL, Bohler A, Mélius J, Waagmeester A, Sinha SR, Miller R, Coort SL, Cirillo E, Smeets B, Evelo CT, Pico AR (2016) WikiPathways: capturing the full diversity of pathway knowledge. Nucleic Acids Res 44(D1):488–494
- Spjuth O, Helmus T, Willighagen EL, Kuhn S, Eklund M, Wagener J, Murray-Rust P, Steinbeck C, Wikberg JES (2007) Bioclipse: an open source workbench for chemo- and bioinformatics. BMC Bioinform 8:59