Predicting the mechanism of phospholipidosis
© Lowe et al; licensee Chemistry Central Ltd. 2012
Received: 4 November 2011
Accepted: 26 January 2012
Published: 26 January 2012
The mechanism of phospholipidosis is still not well understood. Numerous different mechanisms have been proposed, varying from direct inhibition of the breakdown of phospholipids to the binding of a drug compound to the phospholipid, preventing breakdown. We have used a probabilistic method, the Parzen-Rosenblatt Window approach, to build a model from the ChEMBL dataset which can predict from a compound's structure both its primary pharmaceutical target and other targets with which it forms off-target, usually weaker, interactions. Using a small dataset of 182 phospholipidosis-inducing and non-inducing compounds, we predict their off-target activity against targets which could relate to phospholipidosis as a side-effect of a drug. We link these targets to specific mechanisms of inducing this lysosomal build-up of phospholipids in cells. Thus, we show that the induction of phospholipidosis is likely to occur by separate mechanisms when triggered by different cationic amphiphilic drugs. We find that both inhibition of phospholipase activity and enhanced cholesterol biosynthesis are likely to be important mechanisms. Furthermore, we provide evidence suggesting four specific protein targets. Sphingomyelin phosphodiesterase, phospholipase A2 and lysosomal phospholipase A1 are shown to be likely targets for the induction of phospholipidosis by inhibition of phospholipase activity, while lanosterol synthase is predicted to be associated with phospholipidosis being induced by enhanced cholesterol biosynthesis. This analysis provides the impetus for further experimental tests of these hypotheses.
Since the observation of phospholipidosis by Nelson and Fitzhugh in 1948, many attempts have been made at understanding the underlying mechanism(s) [2, 3]. Phospholipidosis is the excess accumulation of phospholipids induced in several cell types by numerous cationic amphiphilic drugs (CADs). The most reliable way of determining whether a compound has induced phospholipidosis is by electron microscopy. This analysis is important in the drug development process, where the occurrence of phospholipidosis can cause delays and possibly termination of a project (as more tests need to be carried out to satisfy regulatory bodies). It is still unclear whether an accumulation of phospholipids is harmful to human health, and the process is often reversible upon withdrawal of the compound. Despite these efforts, there is still no mechanistic understanding of how CADs can induce the accumulation of phospholipids in various cell types across different species.
A build-up of phospholipids can be explained by an inhibition of the breakdown or an increase in the synthesis of the phospholipids. Early studies supported the idea that inhibition of the breakdown of phospholipids was a possible mechanism. Hostetler et al. showed strong support for the theory that the action of CADs was located in the lysosomes and that inhibition of the lysosomal phospholipases A and C caused a build-up of phospholipids. However, there was no way to distinguish between a drug-enzyme and a drug-phospholipid binding event as the cause of the inhibition. Joshi et al. tried to address this problem by measuring binding of phospholipidosis-inducing drugs to L-α-dipalmitoyl phosphatidylcholine vesicles. The assumption was that if a drug was found to bind, then drug-phospholipid binding would be the cause of the inhibition of the phospholipases. While most of the drugs tested did bind to L-α-dipalmitoyl phosphatidylcholine vesicles, chloroquine (a phospholipidosis-inducing CAD) did not bind, suggesting that its main mechanism is the direct inhibition of one or more phospholipase enzymes. Abe et al. produced the first study that distinguished between lysosomal phospholipases A1 and A2. This showed that two CADs, amiodarone and D-threo-1-phenyl-2-decanoylamino-3-morpholino-1-propanol, caused inhibition of lysosomal phospholipase A2. They found that no inhibition occurred on exposure to tetracycline, despite its being a CAD. Hirode et al., however, found evidence that at high doses tetracycline may induce phospholipidosis. In a further study of lysosomal phospholipase A2 inhibition by CADs, Hiraoka et al. used lysosomal phospholipase A2 (LYPLA2)-deficient mice to study the relationship between LYPLA2 and phospholipidosis. A deficiency of the enzyme resulted in foam cell formation, surfactant lipid accumulation, splenomegaly (enlargement of the spleen), and phospholipidosis.
A smaller number of studies have also looked at the possibility of increased synthesis of phospholipids being the mechanism for phospholipidosis by showing that an increase or redirection of synthesis leads to increased amounts of acidic phospholipids [8, 9].
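Sawada et al. proposed four possible mechanisms on the basis of gene expression experiments, referred to below as mechanisms 1 to 4:
1. Inhibition of lysosomal phospholipase activity;
2. Inhibition of lysosomal enzyme transport;
3. Enhanced phospholipid biosynthesis;
4. Enhanced cholesterol biosynthesis.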
Attempts have been made to predict the occurrence of phospholipidosis using in silico methods. Ploemen et al. suggested that a compound would be phospholipidosis-inducing (PPL+) provided that it has pKa > 8 and ClogP > 1 and that the sum of the squares ($\mathrm{ClogP}^2 + \mathrm{pKa}^2$) is greater than 90, showing that ClogP and pKa are important descriptors. Other authors have developed increasingly sophisticated models, introducing more complicated Quantitative Structure-Property Relationship (QSPR) methods and descriptors [13–15].
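The rule attributed to Ploemen et al. can be written as a three-part test. The sketch below is illustrative only: the function name and the descriptor values are hypothetical, and the thresholds are used exactly as stated in the sentence above.

```python
def ploemen_ppl_positive(pka: float, clogp: float) -> bool:
    """Return True if the rule, as stated in the text, flags a compound as PPL+."""
    return pka > 8 and clogp > 1 and (clogp ** 2 + pka ** 2) > 90


# Hypothetical descriptor values, chosen only to exercise the rule.
print(ploemen_ppl_positive(pka=9.5, clogp=4.0))  # 16 + 90.25 = 106.25 > 90 -> True
print(ploemen_ppl_positive(pka=8.5, clogp=1.5))  # 2.25 + 72.25 = 74.5 < 90 -> False
```

The second compound passes both single-descriptor thresholds but fails the sum-of-squares condition, which is why the rule needs all three parts.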
In this study, our aim is to use an in silico approach to predict the possible targets that may be relevant for phospholipidosis. By predicting the targets for a database of phospholipidosis-inducing compounds, we can rank targets by their potential to cause phospholipidosis and compare them to targets previously suggested.
The study of off-target interactions, known as secondary pharmacology, is now recognised as crucial to the understanding of both drug action and toxicology. In favourable cases, one drug may modulate several disease-relevant targets, a property known as polypharmacology. More commonly, off-target interactions present the risk of side-effects, as is the case with phospholipidosis. Given the prevalence, expense, and risk to patients associated with unforeseen side-effects related to drug-target interactions, studies in this area have particular relevance to the pharmaceutical industry.
This study uses a methodology more complex than many seen in cheminformatics. Our objective is not simply to appeal to the similar property principle. A prediction based on that would run something like this: molecule B is similar to molecule A, which induces phospholipidosis, hence we predict that molecule B induces phospholipidosis too. Here, by way of contrast, we are interested in teasing out a mechanistic understanding much richer than can be obtained by similarity searching or QSPR. Thus, our interest is in predicting compound-target associations that will allow us to understand how phospholipidosis is induced and in suggesting and informing experimental approaches directed towards gaining a deeper mechanistic understanding.
The ChEMBL database was mined for compounds and their related protein targets. A number of rules were used to filter the dataset. Only compounds which had an associated structure were selected. A compound-target pair was selected if the target description included the word "enzyme", "cytosolic", "receptor", "agonist" or "ion channel" and the bioactivity record of the compound contained an IC50, Ki or Kd < 500 μM or had an activity > 50% binding affinity. Of course, we recognise that differences between these measures may sometimes be significant; for instance, Ki and Kd are not strictly equivalent quantities. This selection process produced a dataset which consisted of compounds and their corresponding targets, where a compound may be related to more than one target. A relatively high IC50, Ki or Kd threshold was used because the aim of the study is to look at off-target prediction and therefore at potentially weak binding targets. This approach selected a total of 249358 compounds which are related to a total of 3493 different targets. A further stipulation was that for a target to be present in the dataset it must have at least 20 compounds associated with it. This reduced the total dataset to 241145 compounds with 1923 different targets. In other words, the procedure yields N (= 241145) molecules belonging to M (= 1923) classes.
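The filtering rules above can be sketched as a small routine over bioactivity records. This is a minimal illustration, not the authors' pipeline: the record layout and all field names (`compound`, `smiles`, `target`, `target_description`, `type`, `value`) are hypothetical, and the "> 50%" criterion is assumed to refer to a percentage activity value.

```python
from collections import defaultdict

KEYWORDS = ("enzyme", "cytosolic", "receptor", "agonist", "ion channel")
POTENCY_TYPES = {"IC50", "Ki", "Kd"}  # values assumed to be in micromolar


def keep_record(rec):
    """Apply the structure, target-description and activity filters to one record."""
    if not rec.get("smiles"):  # compound must have an associated structure
        return False
    description = rec["target_description"].lower()
    if not any(word in description for word in KEYWORDS):
        return False
    if rec["type"] in POTENCY_TYPES:
        return rec["value"] < 500.0  # IC50, Ki or Kd below 500 uM
    if rec["type"] == "percent":  # assumed encoding of a percentage activity
        return rec["value"] > 50.0
    return False


def build_dataset(records, min_compounds=20):
    """Group retained compounds by target; drop targets with too few compounds."""
    by_target = defaultdict(set)
    for rec in records:
        if keep_record(rec):
            by_target[rec["target"]].add(rec["compound"])
    return {t: c for t, c in by_target.items() if len(c) >= min_compounds}
```

Because a compound can pass the filters for several targets, it can appear in more than one class of the resulting dictionary, matching the compound-target instances described above.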
where $(x_i - x_j)^T (x_i - x_j)$ corresponds to the number of features in which $x_i$ and $x_j$ disagree, while $h$ is the so-called smoothing factor. In the scenario where equal probabilities are calculated for two classes, $p(\omega_\alpha) \times p(x_i|\omega_\alpha) = p(\omega_\beta) \times p(x_i|\omega_\beta)$, these classes are ranked arbitrarily.
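To make the ranking step concrete, the following is a minimal sketch of a Parzen-Rosenblatt Window classifier over binary fingerprints. Several things here are assumptions rather than the authors' implementation: a Gaussian kernel of the form exp(-d / (2h²)), where d is the number of disagreeing features; fingerprints stored as Python sets of on-bit indices; and a class prior estimated as the class frequency in the training data. The kernel's normalising constant is omitted because it is identical for every class and does not change the ranking.

```python
import math
from collections import defaultdict


def gaussian_kernel(fp_a, fp_b, h):
    """Kernel value from the number of features in which two fingerprints disagree."""
    disagreements = len(fp_a ^ fp_b)  # symmetric difference of on-bit sets
    return math.exp(-disagreements / (2.0 * h * h))


def rank_targets(query_fp, training, h):
    """Rank classes by p(class) * p(query | class).

    training: list of (fingerprint_set, target_label) instances.
    """
    kernel_sums = defaultdict(float)
    counts = defaultdict(int)
    for fp, target in training:
        kernel_sums[target] += gaussian_kernel(query_fp, fp, h)
        counts[target] += 1
    n_total = len(training)
    scores = {}
    for target, n in counts.items():
        prior = n / n_total
        likelihood = kernel_sums[target] / n  # mean kernel value over the class
        scores[target] = prior * likelihood
    # sorted() is stable, so tied classes keep an arbitrary (insertion) order.
    return sorted(scores, key=scores.get, reverse=True)


training = [({1, 2, 3}, "A"), ({1, 2, 4}, "A"), ({7, 8, 9}, "B")]
print(rank_targets({1, 2, 3}, training, h=1.0))
```

A small h makes the kernel sharply peaked, so only near-identical training compounds contribute; a large h lets the class priors dominate.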
The mined ChEMBL dataset was partitioned into ten randomly split training and validation partitions, the size of which was determined by 99% of each class being present in the training and 1% in the validation set. For classes with fewer than 100 instances, a single instance was present in the validation and the rest in the training set. This produces a training data set with 238086 compounds and a validation set of 3059 compounds for each of the ten partitions. The Parzen-Rosenblatt Window method was applied to each of the ten splits with the smoothing factor $h$ being varied according to $2^{-15}, 2^{-13}, \ldots, 2^{3}$. We also carried out analogous calculations using the Naïve Bayes method, implemented as described elsewhere, allowing us to compare the results from these two techniques.
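The partitioning and the smoothing-factor grid can be sketched as follows. This is an illustration under stated assumptions: the validation share of each class is taken as the floor of 1% with a minimum of one instance (the text does not give the rounding rule), and the function names and seeding scheme are hypothetical.

```python
import random

# Smoothing factors h = 2^-15, 2^-13, ..., 2^3 (ten values in total).
SMOOTHING_GRID = [2.0 ** k for k in range(-15, 4, 2)]


def split_class(instances, rng):
    """Hold out about 1% of one class for validation, and at least one instance."""
    items = list(instances)
    rng.shuffle(items)
    n_val = max(1, len(items) // 100)  # assumed rounding: floor, minimum 1
    return items[n_val:], items[:n_val]


def make_partitions(by_class, n_partitions=10, seed=0):
    """Build independent random training/validation splits, stratified by class."""
    partitions = []
    for p in range(n_partitions):
        rng = random.Random(seed + p)
        train, val = [], []
        for label, instances in by_class.items():
            tr, va = split_class(instances, rng)
            train += [(x, label) for x in tr]
            val += [(x, label) for x in va]
        partitions.append((train, val))
    return partitions
```

Splitting class by class guarantees that every target is represented in both the training and the validation set of each partition.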
The ten different models produced on the ten different training partitions were then used to predict the targets of a phospholipidosis dataset with the Parzen-Rosenblatt Window method. The dataset consists of 182 compounds, of which 100 are positive for phospholipidosis (PPL+) and 82 are negative (PPL-). We emphasise that all positives and negatives in our data are experimentally confirmed as such; there are no unverified assumed negatives. The data were primarily derived from Pelletier et al., with a number of additional molecules taken from other literature sources, and are almost identical to the dataset we used in earlier work. The full dataset is presented as Additional File 1. We note that an instance is a compound-target relation and not simply a compound, so another target association of a compound from the phospholipidosis dataset may appear in our training set. As we are interested in obtaining as comprehensive as possible a set of targets for these compounds, the other known compound-target relations were not removed from the training set. Our approach allows experimentally known associations of these 182 compounds with other targets, not directly relevant to phospholipidosis, to contribute to our predictions. From the targets predicted for each compound, the top 100 were used, as this corresponds to approximately 5% of the total targets. As we are interested in off-targets, the order in which the targets were predicted for each compound is of limited interest here, and hence a scoring system was designed to account for this. For the phospholipidosis dataset we have a label, $c_p$, which represents whether a compound, $x_i$, is PPL+ ($c_p(x_i) = +1$) or PPL- ($c_p(x_i) = -1$). For each target, $\omega_\alpha$, we calculate the phospholipidosis score $PS_\alpha$ using equation (5).
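Equation (5) is not reproduced in this text, so the following sketch should be read only as one plausible form of such a score that is consistent with the description: each compound contributes its label (+1 or -1) to every target appearing in its top 100 predictions, regardless of position within that list. The function name, the data layout, and the handling of a single model rather than ten are all assumptions.

```python
from collections import defaultdict


def phospholipidosis_scores(ranked_targets, labels, top_k=100):
    """Sum compound labels over the targets found in each compound's top_k list.

    ranked_targets: dict mapping compound -> list of targets, best first.
    labels: dict mapping compound -> +1 (PPL+) or -1 (PPL-).
    """
    scores = defaultdict(int)
    for compound, ranking in ranked_targets.items():
        for target in ranking[:top_k]:
            scores[target] += labels[compound]
    return dict(scores)


# Toy example with hypothetical compounds and targets, using top_k=2.
ranked = {"c1": ["T1", "T2", "T3"], "c2": ["T2", "T1", "T3"], "c3": ["T3", "T1", "T2"]}
labels = {"c1": +1, "c2": +1, "c3": -1}
print(phospholipidosis_scores(ranked, labels, top_k=2))
```

Under this reading, a target scores highly when it is predicted for many PPL+ compounds and few PPL- compounds, and a target predicted equally often for both groups scores near zero.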
Table 1. Comparison of the Parzen-Rosenblatt Window and Naïve Bayes methods
Table 2. Top 20 $PS_\alpha$ scores for targets
5-hydroxytryptamine receptor 2B (r)
5-hydroxytryptamine receptor 2C (r)
D(2) dopamine receptor (r)
5-hydroxytryptamine receptor 1A (r)
Potassium voltage-gated channel subfamily H member 2 (h)
Sodium-dependent serotonin transporter (r)
D(3) dopamine receptor (r)
D(3) dopamine receptor (h)
Muscarinic acetylcholine receptor M5 (r)
Alpha-1D adrenergic receptor (r)
Alpha-1A adrenergic receptor (r)
Alpha-1B adrenergic receptor (r)
5-hydroxytryptamine receptor 2A (r)
Sodium-dependent serotonin transporter (h)
5-hydroxytryptamine receptor 1B (r)
Muscarinic acetylcholine receptor M1 (r)
Sodium-dependent dopamine transporter (r)
Sigma 1-type opioid receptor (h)
Sodium channel protein type 2 subunit alpha (h)
The average ranks of the actual targets in the validation set in Table 1 show that the models are on average able to predict the correct target in the top 1%. This suggests that using high IC50, Ki and Kd values, which correspond to low activity, to select the dataset still allows for good predictive models and hence that it is possible to predict weak binding. If the cut-off is increased to the top 5% of targets, then the proportion of actual targets present amongst those predicted increases from 96.1% to 98.8%. It was therefore decided to use the top 5% of targets (actually 100/1923, about 5.2%) for the phospholipidosis dataset prediction. Using this higher number allows more of the off-targets to be selected, as the top predicted targets will often be the intended drug target of the cationic amphiphilic drug (CAD) or targets closely related to it.
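The two validation measures discussed above, the average rank of the known target and the fraction of known targets recovered within a top-percentage cut-off, can be computed as in the sketch below. The function name and data layout are hypothetical, and each ranking is assumed to list every target in the model exactly once.

```python
def validation_summary(rankings, true_targets, top_fraction=0.05):
    """Return (mean rank of the true target, fraction of true targets in the top slice).

    rankings: list of full target rankings, best first, one per validation instance.
    true_targets: the experimentally known target for each instance.
    """
    ranks = []
    hits = 0
    for ranking, truth in zip(rankings, true_targets):
        rank = ranking.index(truth) + 1  # 1-based rank of the known target
        ranks.append(rank)
        if rank <= top_fraction * len(ranking):
            hits += 1
    return sum(ranks) / len(ranks), hits / len(ranks)


# Toy example: ten hypothetical targets and a top-20% cut-off (ranks 1 and 2).
targets = list("ABCDEFGHIJ")
print(validation_summary([targets, targets], ["A", "E"], top_fraction=0.2))
```

With 1923 targets, a cut-off of 100 corresponds to a `top_fraction` of 100/1923 rather than exactly 0.05.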
None of the expected phospholipidosis-relevant targets appear in the top 20 ranked targets using the $PS_\alpha$ score. The highest scoring target that had been previously suggested was lanosterol synthase (LSS), which is in a tie for rank 114. A large number of the highly placed targets in our $PS_\alpha$ rankings are the intended drug targets of CADs, which can be used as antiarrhythmics, α-blockers and antipsychotics targeting ion channel transporters (such as sodium-dependent serotonin transporter), as well as D2/D3 dopamine and serotonin receptors. We also note that a number of the targets are within the same protein family and hence these fill a large number of the higher ranked positions.
Table 3. $PS_\alpha$ scores and ranks for phospholipidosis-relevant targets
Sphingomyelin phosphodiesterase (SMPD) (h)
Lysosomal Phospholipase A1 (LYPLA1) (r)
Phospholipase A2 (PLA2) (h)
Elongation of very long chain fatty acids protein 6 (ELOVL6) (h)
Acyl-CoA desaturase (SCD) (m)
3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMGCR) (h)
Squalene monooxygenase (SQLE) (h)
Lanosterol synthase (LSS) (h)
Table 3 shows the ranked positions of the various targets predicted by Sawada et al. Sphingomyelin phosphodiesterase (SMPD) is responsible for the breakdown of sphingomyelin into phosphocholine and ceramide. Inhibition of SMPD would cause accumulation of the phospholipid sphingomyelin. A build-up of sphingomyelin is associated with Niemann-Pick disease, which is often linked to phospholipidosis. Lysosomal phospholipase A2 (LYPLA2) has previously been linked with phospholipidosis; however, due to the lack of data in ChEMBL it was not present in the model. Only two compounds have an associated binding affinity with this target and hence the target did not meet the requirement of having data for at least 20 compounds. LYPLA1 and phospholipase A2 (PLA2) were present in the model and produced $PS_\alpha$ scores of 90 and 97, respectively. We expect that lysosomal phospholipase A2 would produce a similar score. Both of these targets act by breaking down phospholipids and hence are shown in Table 3 as being associated with mechanism 1. Since there are no relevant targets present in the original training data, it is not possible to comment on the likelihood of mechanism 2. However, it is clear that our model predicts that the induction of phospholipidosis via the mechanism 3 targets ELOVL6 or SCD is unlikely, as neither is predicted to interact with any of the 100 positive phospholipidosis-inducing compounds. For mechanism 4, out of the targets included in our model, lanosterol synthase produced the best result of those related to Sawada et al.'s mechanisms. Lanosterol synthase is involved in steroid biosynthesis, catalysing the cyclisation of (S)-2,3-oxidosqualene to lanosterol; hence it is associated with enhanced cholesterol biosynthesis (mechanism 4).
Since three targets for mechanism 1 and one for mechanism 4 score highly, our results suggest that a combination of mechanisms 1 and 4 is responsible for inducing phospholipidosis. Thus we find support, from an independent source of evidence and a quite different methodology, for two of the four mechanisms (1 & 4) which Sawada et al. proposed on the basis of their gene expression experiments. A lack of data for suitable targets meant that we could not test any targets for their mechanism 2, while our study suggests that their mechanism 3 does not occur via the targets ELOVL6 or SCD. Our method can only predict drug-protein associations and cannot predict whether phospholipidosis will occur via drug-phospholipid binding. Therefore it can only predict a mechanism which involves direct interaction with the protein.
It is also interesting to observe from Figure 2 that the compounds which are predicted to bind to SMPD are mostly different to those which are predicted to bind to LSS. A Pearson correlation coefficient of -0.847 was calculated between these two targets, which suggests that there is some anti-correlation. A chi-squared test was used to assess the null hypothesis that the compounds' scores for LSS and SMPD are independent. The calculated p-value is $5.76 \times 10^{-5}$ and hence at the 5% significance level the null hypothesis is rejected. The lack of independence between the scores for these two targets, coupled with the observed anti-correlation, suggests that different compounds induce phospholipidosis via each of these two targets, which are associated with different mechanisms. We have also investigated the correlation between scores for other pairs of targets; the independence of scores between SMPD and LYPLA1 has an associated p-value of 0.507, and hence at the 5% level the null hypothesis that they are independent is not rejected. The Pearson correlation coefficient was calculated to be -0.247, suggesting that LYPLA1 and SMPD are at most weakly anti-correlated.
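The two statistics used above can be illustrated with a short standard-library sketch. It makes assumptions the text does not confirm: each compound's score for a target is reduced to a 0/1 flag (predicted or not predicted), the chi-squared test is run on the resulting 2×2 contingency table without a continuity correction, and the p-value uses the closed form for one degree of freedom, p = erfc(√(χ²/2)). The function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def chi2_independence_2x2(a_flags, b_flags):
    """Chi-squared statistic and p-value (1 d.o.f.) for two 0/1 flag sequences.

    Assumes every row and column total is non-zero.
    """
    table = [[0, 0], [0, 0]]
    for a, b in zip(a_flags, b_flags):
        table[int(a)][int(b)] += 1
    n = sum(map(sum, table))
    rows = [sum(table[0]), sum(table[1])]
    cols = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # survival function, 1 d.o.f.
    return chi2, p_value


# Toy data: two hypothetical targets predicted for disjoint sets of compounds.
target_a = [1] * 10 + [0] * 10
target_b = [0] * 10 + [1] * 10
print(pearson(target_a, target_b))
print(chi2_independence_2x2(target_a, target_b))
```

In this toy case the flags are perfectly anti-correlated and independence is rejected, the same qualitative pattern reported for SMPD and LSS.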
Using the Parzen-Rosenblatt Window method, predictive models of protein target associations were constructed based on compound structures. For our validation set, experimentally known targets were ranked (on average) in the top 1% of predicted targets. When applied to a dataset of phospholipidosis-inducing and non-inducing compounds, it was found that a number of targets may be linked to phospholipidosis. Sphingomyelin phosphodiesterase, lysosomal phospholipase A1, phospholipase A2 and lanosterol synthase all score highly according to our phospholipidosis score, $PS_\alpha$. It was shown that predicted activities against different targets are often uncorrelated or even anti-correlated. More simply put, different phospholipidosis-inducing compounds are predicted to interact with different putative phospholipidosis-relevant targets. This strongly suggests that different compounds induce phospholipidosis via different targets, and therefore also by different mechanisms. We note that, considering only the four different targets found to be significant here, there remain a number of PPL+ compounds for which a relevant target cannot be identified. This may indicate that further protein targets are mechanistically relevant, or that binding of the compound directly to the lipid is a possible mechanism.
HYM and RCG would like to thank Unilever for financial support. RL and JBOM would like to thank the EPSRC (Grant EP/F049102/1) and the Scottish Universities Life Sciences Alliance (SULSA) for funding. FN thanks the Education Office of the Novartis Institutes for BioMedical Research for a Presidential Postdoctoral Fellowship. We acknowledge the use of the CamGrid service in carrying out this work.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.