Identifying Drug–Drug Interactions by Data Mining
A Pilot Study of Warfarin-Associated Drug Interactions
Background—Knowledge about drug–drug interactions commonly arises from preclinical trials, from adverse drug reports, or from knowledge of mechanisms of action. Our aim was to investigate whether drug–drug interactions were discoverable without prior hypotheses using data mining. We focused on warfarin–drug interactions as the prototype.
Methods and Results—We analyzed altered prothrombin time (measured as international normalized ratio [INR]) after initiation of a novel prescription in previously INR-stable warfarin-treated patients with nonvalvular atrial fibrillation. Data sets were retrieved from clinical work. Random forest (a machine-learning method) was set up to predict altered INR levels after novel prescriptions. The most important drug groups from the analysis were further investigated using logistic regression in a new data set. Two hundred and twenty drug groups were analyzed in 61 190 novel prescriptions. We rediscovered 2 drug groups having known interactions (β-lactamase-resistant penicillins [dicloxacillin] and carboxamide derivatives) and 3 antithrombotic/anticoagulant agents (platelet aggregation inhibitors excluding heparin, direct thrombin inhibitors [dabigatran etexilate], and heparins) causing decreasing INR. Six drug groups with known interactions were rediscovered causing increasing INR (antiarrhythmics class III [amiodarone], other opioids [tramadol], glucocorticoids, triazole derivatives, and combinations of penicillins, including β-lactamase inhibitors) and two had a known interaction in a closely related drug group (oripavine derivatives [buprenorphine] and natural opium alkaloids). Antipropulsives had an unknown signal of increasing INR.
Conclusions—We were able to identify known warfarin–drug interactions without a prior hypothesis using clinical registries. Additionally, we discovered a few potentially novel interactions. This opens the way for using data mining to discover unknown drug–drug interactions in cardiovascular medicine.
WHAT IS KNOWN
Warfarin is well known for its drug interactions; coadministration of many drugs can cause sudden changes in international normalized ratio levels.
Data-driven (in contrast with hypothesis-driven) analyses may help identify novel drug–drug interactions.
WHAT THE STUDY ADDS
We investigated whether drug–drug interactions could be discovered without a prior hypothesis using a combination of a random forest approach and logistic regression in international normalized ratio–stable warfarin-treated patients with nonvalvular atrial fibrillation.
We identified several known drug groups causing changes in international normalized ratio levels soon after initiation and at least one plausible interaction involving a drug group that has not been described previously.
Polypharmacy is common in cardiovascular medicine, and drug–drug interactions may cause many unwarranted adverse effects. Today, most interactions are identified from premarket studies, based on knowledge of mechanisms of action, or from reports of potential adverse drug reactions.1 Presumably, many interactions are unknown because only a small fraction of adverse drug effects is reported.2 Important drug–drug interactions may, therefore, be unknown for a long period, and some may never be revealed. A way of discovering interactions without a prior hypothesis is therefore warranted to increase safety and efficacy of drug treatment.
Data mining is a data-driven approach that operates without a hypothesis. The idea is to build a prediction model and subsequently identify important variables. To verify that the machine-learning model captures important and true trends, the ability to predict new and unseen data is tested. Often, data are divided into 2 parts, where one is used to construct the model (training set) and one is used to test the performance (test set).3,4 Random forest has been shown to be useful in agnostic gene association analyses, which share many of the same methodological issues as studies aiming to search for drug–drug interactions.5–8 Because of the wealth of variables relative to the number of samples available in such analyses, a standard statistical approach like logistic regression would not be ideal, and multiple testing would require a high degree of correction (eg, Bonferroni). Random forest is more flexible (ie, handles interactions without interaction terms), is able to handle many variables compared with the size of the data set, has a built-in test set, and may be better at capturing weak signals compared with logistic regression.9 We hypothesized that it was possible to determine drug–drug interactions without a prior hypothesis using a combination of random forest and logistic regression. Random forest reduced the number of drug groups (from 220 to 10), and logistic regression specified the direction of the association between the independent variables and the dependent variable in addition to the level of significance in the lower dimensional space. We used warfarin–drug interactions in patients with nonvalvular atrial fibrillation as the prototype because of warfarin's many recognized interactions that could serve as positive controls.10 Warfarin affects the prothrombin time and can be monitored using the international normalized ratio (INR). Other drugs may affect the metabolism of warfarin, causing an unwarranted increase or decrease of INR.
We assumed that a sudden (within days to weeks) change in INR after a novel prescription in patients with a stable prior INR served as a proxy for possible warfarin–drug interactions. Although it is theoretically possible that initiation of other drugs could cause changes in INR independently of warfarin, we are not familiar with any drugs that have major effects on INR in the absence of warfarin (and would, thus, be misclassified as interacting with warfarin).
Therefore, we built a method to detect sudden changes to INR after a novel prescription as a proxy for interactions.
This study was based on data from the Danish administrative healthcare registries. A permanent identification number was used for cross-linking 4 registries containing medical and social information. Information on sex, dates of birth, and deaths was collected from the National Population Registry. The Danish National Patient Registry contains diagnoses from hospitalizations based on the International Classification of Diseases, currently the 10th version (ICD-10). This information is reliable because hospital departments are reimbursed based on correct diagnostic and procedural registration.11 Prothrombin time was collected from 3 areas in Denmark containing samples from both general practitioners and hospitals. The diagnosis of atrial fibrillation (ICD-8 codes 427.93 and 427.94; ICD-10 code I48) has been validated with a positive predictive value of 99% in the Danish National Patient Registry.12
Prothrombin time (measured as INR) was collected from 4 registries administered by the authorities: (1) a regional registry including all blood samples in the Northern region of Denmark from 1995 to 2012 (4645 patients included in this study); (2) a regional registry including all blood samples drawn from hospital inpatients and outpatients in the Copenhagen area from 2001 to 2011 (2190 patients included in this study); (3) all blood samples from several general practitioners in Copenhagen from 2000 to 2012 (7365 patients included in this study); and (4) all samples drawn at Roskilde Hospital between 2000 and 2011 (843 patients included in this study).
We identified patients with nonvalvular atrial fibrillation with available INR values between 1995 and 2012. The INR value had to approximate the therapeutic level (1.8<INR<3.2) at least 60 days before a novel prescription. INR values that were considerably out of therapeutic range (INR≤1.5 or INR≥4.0) ≤45 days after a novel prescription were defined as events. INR values in the therapeutic range (2.0<INR<3.0) after a novel prescription were defined as nonevents. In-between INR values (1.5<INR≤2.0 or 3.0≤INR<4.0) after a novel prescription were defined as residual INR values (residual category). A drug that had not been prescribed in a 2-year period was defined as a novel prescription. Each drug group had to have at least 15 observations (sum of events and nonevents) to be included. Warfarin usage (Anatomic Therapeutic Chemical [ATC] Classification system code B01AA03) was assessed by estimating the daily dose from ≤5 consecutive prescriptions adjusted for hospital stays when INR was below 2. Patients were allowed to appear in all groups but not in relation to the same prescription. In this study, drug–drug interactions refer to the biological net effects of coadministration, that is, when one drug affects the activity of another drug (not to be confused with statistical interactions).
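The outcome definition above can be illustrated with a minimal sketch. The function names (`was_stable`, `classify_inr`) and the category labels are hypothetical and chosen only for illustration; the thresholds are those stated in the paragraph, and any value not covered by the event or nonevent definitions is assumed to fall in the residual category.

```python
def was_stable(prior_inr: float) -> bool:
    """Prior INR close to the therapeutic level (1.8 < INR < 3.2)."""
    return 1.8 < prior_inr < 3.2


def classify_inr(inr: float) -> str:
    """Assign a post-prescription INR value to an outcome category."""
    if inr >= 4.0:
        return "increasing"  # event: considerably above therapeutic range
    if inr <= 1.5:
        return "decreasing"  # event: considerably below therapeutic range
    if 2.0 < inr < 3.0:
        return "nonevent"    # INR remained in therapeutic range
    return "residual"        # in-between values (assumed catch-all)


# Hypothetical example values, not study data.
example_categories = [classify_inr(v) for v in (4.2, 1.4, 2.5, 3.5, 1.8)]
```

Treating the residual category as a catch-all keeps the four categories mutually exclusive and exhaustive, which is an assumption of this sketch rather than a documented detail of the study code.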
Except for diabetes mellitus and hypertension, which were based on claimed prescriptions, all comorbidities were identified using the Danish National Patient Registry. Diagnostic codes are provided in the Data Supplement. Diabetes mellitus was identified by a claimed prescription of glucose-lowering medications (ATC A10) within 120 days using the Danish Registry of Medicinal Product Statistics. Hypertension was identified using a validated algorithm combining at least 2 classes of antihypertensive medications.13
The following variables were included in the analyses: age, sex, calendar year, household income, the level of education, hospitalization in the period, the total number of novel prescriptions claimed in the period, and variables included in the HAS-BLED and CHA2DS2VASc scores (the compliance of warfarin therapy has previously been associated with comorbidity, household income, and education14,15).13,16 The first 5 characters of the ATC code were used to define drug groups. In the random forest analyses, values were normalized between zero and one. To reduce the number of variables in the logistic regression analyses, calendar year was not included in the main analysis but in a sensitivity analysis.
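As a minimal sketch of two preprocessing steps mentioned above, the following shows how a drug group could be derived from the first 5 characters of an ATC code and how a variable could be scaled to the interval from zero to one. The helper names are hypothetical, and min-max scaling is an assumption; the text states only that values were normalized between zero and one.

```python
def atc_drug_group(atc_code: str) -> str:
    """Return the drug group, defined as the first 5 characters of the ATC code."""
    return atc_code.strip().upper()[:5]


def min_max_normalize(values):
    """Scale values linearly to [0, 1] (assumed normalization method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


warfarin_group = atc_drug_group("B01AA03")   # warfarin's ATC code from the text
scaled_ages = min_max_normalize([50, 75, 100])  # hypothetical ages
```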
We applied random forest to identify important variables. Random forest consists of decision trees, each built using a unique bootstrapped sample.5,17 A bootstrap is a random sample selected as if taking one observation out of a bag, registering it, and putting it back in before drawing the next. The bootstrapped sample will have repeated observations, and not all observations will be used in each tree (the unused observations are named out-of-bag). In the final forest, the prediction accuracy is evaluated tree by tree using the out-of-bag observations. The majority vote of all trees determines the overall outcome. Random forest provides a measure of the contribution of each variable to the prediction accuracy (named variable importance). This measure does not specify if an association is positive or negative but is used for ranking the independent variables.
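The bootstrap and the resulting out-of-bag observations can be sketched as follows. This is an illustration of sampling with replacement only, not the study's implementation; the function name and the sample size of 1000 are assumptions.

```python
import random


def bootstrap_sample(n_obs: int, rng: random.Random):
    """Draw n_obs indices with replacement; return in-bag and out-of-bag indices."""
    in_bag = [rng.randrange(n_obs) for _ in range(n_obs)]
    # Observations never drawn are left out of this tree and can be used to test it.
    out_of_bag = sorted(set(range(n_obs)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(0)
in_bag, out_of_bag = bootstrap_sample(1000, rng)
oob_fraction = len(out_of_bag) / 1000
```

For large samples, the expected out-of-bag fraction approaches 1/e (about 37%), so each tree has a sizeable set of unseen observations available for evaluation.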
The effect of multiple testing is often reduced by changing the cutoff P value (eg, Bonferroni).3 Another way of reducing the effect of multiple testing is by retesting new findings using an unused data set.3 In our study, this was done both by cross-validation (explained later in this section) and by the nature of random forest, where the variable importance was measured using out-of-bag data. In a sensitivity analysis, we applied registries 2, 3, and 4 as a training set and registry 1 as a test set. This way the training and test sets were collected in different parts of the country, making them more disjoint and the result more reliable.
The general idea of this study was to locate the most important drug groups by random forest using a training set. Logistic regression then tested these drug groups in the lower dimensional space using another part of the data set (test set; Figure 1). In detail, the most important drug groups were identified using the permutation variable importance in an optimized random forest model with 3 targets—increasing INR (INR>4.0), decreasing INR (INR≤1.5+warfarin), and nonevents (2.0≤INR≤3.0).6 The drug groups were ranked relative to the variable importance. The 10 most important drug groups were then tested using 2 logistic regression models. The first model tested the selected drug groups+clinical variables (specified under the clinical variables section) in relation to increasing INR (INR>4.0) versus nonincreasing INR (INR<3.0). The second model tested the selected drug groups+all clinical variables in relation to decreasing INR (INR≤1.5) versus nondecreasing INR (INR>2.0). To be considered as having a true impact on INR, the drug group had to have a significant positive association (positive β-estimate) in the logistic regression model.
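The idea behind permutation variable importance can be sketched in a few lines: shuffle one variable at a time and record how much the prediction accuracy drops. The code below is a toy illustration under stated assumptions. The `predict` function stands in for a fitted model (it is not a random forest), the data are simulated, and all names are hypothetical.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, rng, n_repeats=10):
    """Mean drop in accuracy after shuffling each column in turn."""
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between variable j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Simulated data: 3 binary "drug group" indicators; only group 0 drives the outcome.
rng = random.Random(42)
X = [[rng.randint(0, 1) for _ in range(3)] for _ in range(500)]
y = [row[0] for row in X]


def predict(row):
    """Stand-in for a fitted model that has learned the true rule."""
    return row[0]


importances = permutation_importance(predict, X, y, rng)
ranking = sorted(range(3), key=lambda j: importances[j], reverse=True)
```

As in the text, the resulting importance ranks variables but carries no information about the direction of an association, which is why a second step such as logistic regression is needed.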
Cross-validation divides the entire data set into k folds. In turn, each fold serves as the test set while the remaining folds serve as the training set. Fourfold cross-validation was applied. Because the data set was divided into 4 folds, the procedure was repeated 4 times. A result found in all 4 folds was considered more reliable than one found in only 1 fold.
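A minimal sketch of the fold rotation is shown below. It assumes that observations are assigned to folds at random, which the text does not specify; the function name and the toy size of 20 observations are illustrative.

```python
import random


def k_fold_indices(n_obs: int, k: int, seed: int = 0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(20, 4))
```

With k=4, every observation is used for testing once and for training three times, so a finding can be counted as significant in 0 to 4 folds.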
Random forest has 3 parameters that need to be tuned to optimize the prediction accuracy (hyperparameters: the number of trees in the model, the number of divisions in each tree [depth of the trees], and the number of independent variables [predictors] tested at each branching in the decision trees [mtry]). For each branching, a random sample of possible independent variables (predictors) is chosen, and the one that best separates the dependent variable is selected. Because of the unbalanced data set (more nonevents than events), the trees were grown until each observation had its own branch (full depth).5 The effect of the number of trees was tested. Because data were unbalanced, we applied the best cutoff relative to the area under the curve (AUC) for each target (increasing INR; decreasing INR; nonevent) to balance misclassification. To exclude nonimportant drug groups, we applied backward selection using the following algorithm: (1) optimizing mtry relative to AUC; (2) creating a random forest model using the 50 best drug variables for predicting increasing INR and the 50 best drug variables for predicting decreasing INR; (3) optimizing mtry relative to AUC; (4) creating a random forest model using the 20 best drugs for predicting increasing INR and the 20 best for predicting decreasing INR; (5) optimizing mtry relative to AUC; and (6) statistically testing the 10 best drugs for each end point (decreasing INR; increasing INR), together with the clinical variables, using logistic regression. The hyperparameters in the random forest algorithm were chosen based on the accuracy of the out-of-bag estimates. The statistical tests were performed using the test set (nonapplied kth fold).
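The staged reduction from all drug groups to 50, then 20, then 10 can be sketched as a simple loop. This is a simplified illustration for a single end point. The `rank_fn` argument is a hypothetical stand-in for refitting the random forest (including mtry tuning) and ranking variables by importance, and the scores below are invented toy values.

```python
def staged_selection(variables, rank_fn, stages=(50, 20, 10)):
    """Repeatedly re-rank the remaining variables and keep the top n at each stage."""
    kept = list(variables)
    for n_keep in stages:
        ranked = rank_fn(kept)  # most important first; would involve refitting the model
        kept = ranked[:n_keep]
    return kept


# Toy setup: 220 hypothetical drug groups with fixed, made-up importance scores.
drug_groups = [f"group_{i:03d}" for i in range(220)]
toy_scores = {name: (i * 37) % 220 for i, name in enumerate(drug_groups)}


def rank_by_toy_score(candidates):
    """Stand-in ranking; a real run would recompute importance at every stage."""
    return sorted(candidates, key=lambda name: toy_scores[name], reverse=True)


selected = staged_selection(drug_groups, rank_by_toy_score)
```

Re-ranking at each stage matters in practice because variable importance can shift once weak predictors are removed; with the fixed toy scores used here the result simply equals the overall top 10.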
Because of the possibility of repeated prescriptions from the same patients, a sensitivity analysis using generalized estimating equations was performed with patient as the cluster ID. We analyzed 1 drug at a time, adjusting for the CHA2DS2VASc score, household income, and hospitalization.
For the tests of the regression parameters being equal to zero in the logistic regression model, a 2-sided test with a P value <0.05 was considered statistically significant. All statistical calculations were performed using SAS (version 9.4 for Windows; SAS Institute Inc, Cary, NC) and R (version 3.2.3 for Windows; The R Foundation).
The study was approved by the Danish Data Protection Agency (Ref no 2007-58-015 I-Suite nr: 02720, GEH-2014-012). The data were available at an individual level, but individuals could not be identified. In Denmark, such retrospective register-based studies do not need ethical approval.
We analyzed 61 190 novel prescriptions and 220 drug groups in 13 695 patients. Of these, 1348 patients had blood samples in more than one registry. Five patients were observed in both registry 1 (test set in the sensitivity analysis) and in one of the registries 2, 3, and 4 (training set in the sensitivity analysis). We observed 6022 events of increasing INR after a novel prescription and 5843 events of decreasing INR (Figure 2). In 40 120 cases, INR remained in therapeutic range after a novel prescription (nonevents). Lastly, 9205 prescriptions were in between these categories (ie, residuals). Characteristics are outlined in Table 1. In general, patients with events were older, had more comorbidity, and had a lower household income compared with the patients with nonevents.
We identified 87 possible interactions based on information from the Danish Health Authority (Table I in the Data Supplement). Forty-seven of these were available in the data set, of which 7 were rediscovered in our analyses (Tables 2 and 3). Already known interactions are marked if reported in the Danish Drug Interaction Database.10 The robustness of the findings is illustrated by the number of folds (of the 4-fold cross-validation) where the drug group had a significant pattern of interaction together with the P values for each fold. Table 4 illustrates how the significant drug groups from Tables 2 and 3 were distributed between INR categories. Additional drug groups were included to illustrate the distribution of undiscovered interactions and noninteracting drug groups. The novel finding of natural opium alkaloids was confirmed (P<0.001) by a sensitivity analysis applying registries 2, 3, and 4 as a training set and a disjoint test set (registry 1). The same sensitivity analysis did not, however, confirm an association with increasing INR for the other novel finding, antipropulsives. In contrast, we observed another interacting drug group in the analysis (imidazole and triazole derivatives [ATC; D01AC]; Tables 2 and 3).
In another sensitivity analysis, including calendar year, we observed a pattern similar to that in the main analysis (data not shown). The generalized estimating equations analysis (adjusting for repeated prescriptions) illustrated the same pattern as the main analysis (data not shown). However, the group of combinations of penicillins, including β-lactamase inhibitors, was only significant in 1 fold (instead of 2). In 2 instances, the generalized estimating equations did not converge (heparin group; insulins and analogues for injection, long-acting), possibly because of a low number of data points for these drugs in the respective folds (n=11; n=5).
The AUCs for the 3 classes (nonevent; increasing INR; and decreasing INR) were between 0.66 and 0.72 using data not applied in the training of the model (the kth-fold and the out-of-bag data; Table II in the Data Supplement). The sensitivity and specificity for the first fold are illustrated in Table III in the Data Supplement.
When the number of trees was >50, the AUC did not change considerably. Through the optimization cycle, the optimal mtry was between 5 and 20. The optimal cutoff was 0.85, 0.08, and 0.07 for nonevents, increasing INR, and decreasing INR, respectively.
We were able to identify 7 out of 47 possible warfarin–drug interactions (at a 5-digit ATC-code level) without a prior hypothesis. This was possible even though the physicians were expected to pay particular attention to patients after a prescription of an interacting drug and adjust the warfarin dose accordingly. A known interaction would, therefore, be expected to be harder to detect than an unknown interaction. One drug group with an unknown interaction had a strong signal (natural opium alkaloids). Six drug groups (antiarrhythmics class III [amiodarone], other opioids [tramadol], glucocorticoids, triazole derivatives, and combinations of penicillins, including β-lactamase inhibitors) were known to interact with warfarin and identified as generating elevated INR. Two drug groups (oripavine derivatives and natural opium alkaloids) had not previously been tested but are flagged as having a possible interaction in the Danish interaction database because of the similarity with tramadol.10,18–24 Tramadol has a well-known interaction with warfarin, likely because of the shared CYP3A4-mediated metabolism.22,25,26 Buprenorphine (oripavine derivatives) is also metabolized via the CYP3A4 system, and our data support a potential interaction between buprenorphine and warfarin. Morphine (natural opium alkaloids) has no apparent shared metabolism with warfarin.10,26 Yet, natural opium alkaloids had a strong signal of increasing INR (significant in 4 out of 4 folds and significant in the sensitivity analysis). Similarly, antipropulsives had a weaker, previously unknown signal (significant in 2 out of 4 folds and not observed in the sensitivity analysis). Natural opium alkaloids and antipropulsives affect the gastrointestinal tract and may alter the absorption of warfarin or vitamin K. Potentially, this may have led to the observed signal.
The total number of warfarin interactions is not well established. Some warnings are based on few case reports (eg, dopamine agonists) or studies finding only a small increase in the INR (eg, propionic acid derivatives). Some drugs only increase the bleeding rate but not INR (eg, salicylic acid and derivatives). The real number of warfarin interactions affecting INR may, therefore, be lower than the 47 identified by the Danish Drug Interaction Database.
We had expected 3-hydroxy-3-methylglutaryl coenzyme-A reductase inhibitors (eg, simvastatin) to have a signal of increasing INR because they have been reported to have a significant interaction with warfarin.27 Table 4 illustrates that many of these prescriptions were found in the residual category and to a lesser extent in the increasing INR category compared with the rediscovered drug groups having an increasing INR signal. It is possible that physicians may be particularly aware of this interaction and regulate warfarin dosages accordingly or that the interaction causes weaker changes in INR levels than our predefined cutoff points for defining high or low INR.
We were able to identify some warfarin–drug interactions causing decreasing INR, despite several potential barriers. If warfarin is not taken, INR decreases to one. Therefore, poor compliance would create noise. A shift in antithrombotic therapy without completing the warfarin therapy would also create a signal of decreasing INR. This is likely to be the case for the drug groups, including platelet aggregation inhibitors excluding heparin, direct thrombin inhibitors, and heparin.
The aim of this study was to clarify whether patterns of drug–drug interactions could be detected without a prior hypothesis. Seven out of 47 possible interactions were rediscovered. Only 10 drug groups for each category (increasing INR and decreasing INR) were extracted from the 220 available drug groups and tested. The maximal number of interactions that could be identified by the current method was, therefore, 10 for each category. This limit was chosen to have enough power for the logistic regression model and to reduce the unwanted effect of multiple testing. Yet, it could be argued that correction (eg, Bonferroni) should be applied to the logistic regression because we were testing 10 drug groups. Increasing the number of important drug groups selected by random forest would increase the need for correction and, thereby, lower the power to detect drug groups. Even with the relatively low discovery rate, our method may, nevertheless, be a valuable supplement to established systems for evaluating drug–drug interactions.
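The trade-off between the number of tested drug groups and the corrected significance threshold can be made concrete with the Bonferroni rule, which divides the significance level by the number of tests. The sketch below is illustrative only; the study itself did not apply this correction, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value threshold under a Bonferroni correction."""
    return alpha / n_tests


# Testing only the 10 preselected drug groups versus all 220 drug groups.
threshold_10 = bonferroni_threshold(0.05, 10)    # 0.005
threshold_220 = bonferroni_threshold(0.05, 220)  # roughly 0.00023
```

Under this rule, testing all 220 drug groups would demand a threshold about 22 times stricter than testing 10, which is the loss of power the paragraph above refers to.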
Strengths and Limitations
The strength of our system is that nonobvious interactions may be more easily discovered because of the hypothesis-free approach, and some interactions may be discovered at an earlier stage. A drawback is the presence of many confounders, and this study demonstrated the importance of domain knowledge to identify these. To minimize unmeasured confounding, new findings should ideally be tested in a randomized study, but this may be unethical. Evaluation using other end points or a new data set would also strengthen the validity. This study was based on drugs claimed from prescriptions outside the hospitals. Drugs applied only in hospital would, thus, remain undiscovered. Moreover, in observational studies, physicians may create biases by their decisions. Because of the knowledge of interactions in warfarin treatment, INR may be measured more frequently than if another drug is initiated, precluding INR changes from reaching the prespecified thresholds (1.5 and 4.0). Some drugs may be correlated with confounders, which may affect INR levels. Antipropulsives (eg, loperamide) may, for instance, be correlated with obstipation or diarrhea, and the condition (in contrast to the treatment) could be the cause of altered INR.
In this study, the random forest and logistic regression models held information about the number of novel drugs prescribed in the period but could not further discern if multiple drugs were initiated simultaneously. If 2 drugs were often prescribed together and one but not the other had an interaction, the noninteracting drug could wrongly be marked as interacting with warfarin. We did not, however, observe any implausible interactions, suggesting that this may not be a problem in the current study.
This study focused on random forest to test the present research question, but other methods could also have been considered (eg, logistic regression with the least absolute shrinkage and selection operator).
This study demonstrated the ability to discover known and possibly unknown warfarin–drug interactions without a prior hypothesis using clinical registries. This opens the way for new approaches in the search for unknown drug–drug interactions in cardiovascular medicine.
Sources of Funding
Drs Hansen, Sehested, and Gislason are funded by The Danish Heart Foundation.
The Data Supplement is available at http://circoutcomes.ahajournals.org/lookup/suppl/doi:10.1161/CIRCOUTCOMES.116.003055/-/DC1.
- Received June 1, 2016.
- Accepted October 17, 2016.
- © 2016 American Heart Association, Inc.
- Jensen DD,
- Cohen PR
- Tatonetti NP,
- Fernald GH,
- Altman RB
- Caruana R,
- Niculescu-Mizil A
- 10. The Danish Health Authority. Interaktionsdatabasen.dk [Internet]. [cited 2016 Apr 7]. Available from: http://www.interaktionsdatabasen.dk/
- Lynge E,
- Sandegaard JL,
- Rebolj M
- Olesen JB,
- Lip GY,
- Hansen ML,
- Hansen PR,
- Tolstrup JS,
- Lindhardsen J,
- Selmer C,
- Ahlehoff O,
- Olsen AM,
- Gislason GH,
- Torp-Pedersen C
- Rouaud A,
- Hanon O,
- Boureau AS,
- Chapelet G,
- Chapelet GG,
- de Decker L
- Olesen JB,
- Lip GY,
- Hansen PR,
- Lindhardsen J,
- Ahlehoff O,
- Andersson C,
- Weeke P,
- Hansen ML,
- Gislason GH,
- Torp-Pedersen C
- Genuer R,
- Poggi J-M,
- Tuleau C
- Juel J,
- Pedersen TB,
- Langfrits CS,
- Jensen SE
- Westergren T,
- Johansson P,
- Molden E