Patient Registries of Acute Coronary Syndrome
Assessing or Biasing the Clinical Real World Data?
Background— The risk of selection bias in registries and its consequences are relatively unexplored. We sought to assess selection bias in a recent registry of acute coronary syndrome and to explore how patient registries of acute coronary syndrome are conducted and reported.
Methods and Results— We analyzed data from patients of a national acute coronary syndrome registry undergoing an audit of the comprehensiveness of recruitment/inclusion. Patients initially included by hospital investigators (n=3265) were compared with eligible nonincluded (missed) patients (n=1439). For 25 exposure variables, we assessed the deviation of the in-hospital mortality relative risks calculated in the initial sample from the actual relative risks. Missed patients were at higher risk and received fewer recommended therapies than included patients. In-hospital mortality was almost 3 times higher in the missed population (9.34% [95% CI, 7.84 to 10.85] versus 3.9% [95% CI, 2.89 to 4.92]). Initial relative risks diverged from the actual relative risks more than expected by chance (P<0.05) in 21 variables and by more than 10% in 17 variables. This deviation persisted, to a smaller degree, on multivariable analysis. Additionally, we reviewed a sample of 129 patient registries focused on acute coronary syndrome published in 13 journals, collecting information on good registry performance items. An audit of recruitment/inclusion was mentioned in only 38 (29.4%) registries, and an audit of data abstraction in only 48 (37.2%). Only 4 (3.1%) authors acknowledged potential selection bias because of incomplete recruitment.
Conclusions— Irregular inclusion can introduce substantial systematic bias in registries. This problem has not been explicitly addressed in many of them.
Received December 15, 2008; accepted August 7, 2009.
Patient registries (PRs) are organized systems using observational study methods to collect uniform data to evaluate specified outcomes for a population defined by a particular disease with predetermined scientific, clinical, or policy purposes.1 Because well-designed and well-performed PRs can provide a real-world view of clinical practice, they are increasingly popular. Furthermore, the information they provide is sometimes used in clinical guidelines to establish the range of benefit or harm of interventions.2
Historically, there has been a lack of standards for conducting and reporting methods and results for PRs.1 Additionally, the magnitude of biases introduced by an inadequately performed registry is not a usually explored issue. However, registries are more prone to biases, especially selection bias, than randomized clinical trials.3 “Selection bias” (SB) refers to situations where the procedures used to select study subjects lead to an effect estimate among those participating in the study that is different from the estimate that is obtainable from the target population.4
The present study has 2 hypotheses. We first proposed that the absence of quality control in a registry may result in misleading conclusions attributable to SB. Thus, we examined the magnitude of the potential SB in a recent registry of acute coronary syndrome (ACS) and its influence on both the outcome event rates and the association between exposures and in-hospital mortality. We also hypothesized that although recommendations have been produced to conduct and report registries so as to prevent bias,1,5 they are often insufficiently complied with. Therefore, we did a systematic literature survey to assess how registries of ACS have been reported in 13 prominent journals.
WHAT IS KNOWN
Use of patient registries is common, because they can be used for multiple purposes and require fewer resources than randomized clinical trials.
Well-designed and well-performed patient registries can provide a real-world view of clinical practice.
WHAT THE STUDY ADDS
In the absence of careful quality control, systematic error attributable to selection bias is a real possibility in patient registries.
Most recently published patient registries of acute coronary syndrome may be flawed because of a lack of quality control.
The reporting of many currently published registries of acute coronary syndrome is suboptimal, limiting the assessment of their validity.
Selection Bias in an Actual Registry
We used data from MASCARA, a previously reported6,7 observational study in 50 randomly selected Spanish hospitals (34 had angiography capability). From October 2004 to June 2005, all consecutive patients ≥18 years within 24 hours of the onset of anginal pain at rest were eligible. Patients were finally included if ACS was confirmed during admission according to specific clinical, ECG, and laboratory findings.6,7 The exclusion criteria were impossibility of follow-up, myocardial ischemia triggered by a noncardiac cause, and concomitant noncardiac disease with a life expectancy of less than 12 months.6,7 Participating hospitals were encouraged to enroll consecutive patients. At each site, the designated physician identified patients fulfilling the inclusion criteria and having no exclusion criteria, requested informed consent, and classified the patients into ST-segment elevation ACS, non–ST-segment elevation ACS, and unclassified ACS according to the ECG findings at admission. Thereafter, specifically trained external researchers recorded demographic and clinical data, in-hospital treatment, and outcomes on standardized case report forms.
Quality Control in the MASCARA Study
To assess the presence of potential SB, a post hoc quality control was carried out to verify consecutive enrollment. In a first step, the hospital episode statistics for ACS-related diagnoses (codes 410, 411, 413, and 414) were requested from all centers. Only 17 of 50 centers agreed to participate in the quality control. The coordinating center (Vall d’Hebron Hospital, Barcelona) cross-checked the hospital episode statistics database against the MASCARA database to find nonincluded eligible patients. All located patients who met the inclusion criteria were retrospectively included after a careful review of each clinical record. The initial cohort, prospectively obtained from the 17 centers before quality control, constituted the “prospective cohort.” The second cohort of nonincluded eligible patients obtained after cross-checking constituted the “retrospective cohort.” Finally, the sample resulting from assembling the prospective and retrospective samples constituted the “actual cohort.” Study investigators were unaware that a quality control would be carried out after data collection. The strategy for data quality assurance in centers that did not participate in this quality control is reported in detail elsewhere.7
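In essence, the cross-check described above is a set difference between the hospital episode statistics and the registry database. The following is a minimal sketch in Python (pandas) with hypothetical patient identifiers and a simplified linkage on a single ID column; the actual study matched episodes and then reviewed each clinical record before retrospective inclusion.

```python
import pandas as pd

# Hypothetical data: hospital episode statistics (left) vs the registry database (right).
hes = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "diagnosis_code": ["410", "411", "414", "410", "413", "428"],
})
registry = pd.DataFrame({"patient_id": [1, 3, 5]})

acs_codes = {"410", "411", "413", "414"}  # ACS-related diagnosis codes from the text
eligible = hes[hes["diagnosis_code"].isin(acs_codes)]

# Candidates for retrospective inclusion: eligible episodes absent from the registry.
missed = eligible[~eligible["patient_id"].isin(registry["patient_id"])]
```

In this toy example, patients 2 and 4 would be flagged for clinical-record review.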
The coordinating center was excluded from the analyses because it had undergone strict continuous quality control and its investigators were aware of the study development. In the 17 centers undergoing quality control, we assessed the differences in the rates of baseline prognostic variables, treatments during hospitalization, and outcome variables between the prospective cohort and the retrospective cohort. Discrete variables were compared using the Fisher exact test and continuous variables using the Student t test or the Mann–Whitney U test, as appropriate.
We assessed the impact of SB on the association between the exposure variables (ie, baseline prognostic and treatment-related variables) and in-hospital mortality using a bivariate and a multivariable approach. In the bivariate approach we compared, for each variable, the initial relative risks (RRs; ie, using only the prospective sample) with the actual RRs (ie, after assembling the prospective sample with the retrospective sample). We quantified the relative difference between both RRs in absolute value (ie, |RR in the prospective cohort − RR in the actual cohort| / RR in the actual cohort). To compute the statistical significance of the difference between the actual RRs and the initial RRs, we obtained the Z value from the standardized difference after applying a logarithmic transformation to normalize the distributions of both the actual and the prospective RRs.8
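The comparison of log-transformed RRs can be sketched as follows. The counts are illustrative, not MASCARA data, and the sketch treats the two RRs as independent for simplicity; the cited method8 compares two estimates via the standard error of the difference of their logarithms.

```python
import numpy as np
from scipy.stats import norm

def rr_and_log_se(a, n1, c, n0):
    """RR for an exposed group (a events among n1) vs an unexposed group
    (c events among n0), with the standard error of log(RR)."""
    rr = (a / n1) / (c / n0)
    se_log = np.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    return rr, se_log

def compare_rrs(rr1, se1, rr2, se2):
    """Z test for the difference of two log-transformed RRs."""
    z = (np.log(rr1) - np.log(rr2)) / np.sqrt(se1**2 + se2**2)
    p = 2 * norm.sf(abs(z))  # two-sided P value
    return z, p

# Illustrative (hypothetical) counts only.
rr_p, se_p = rr_and_log_se(40, 500, 20, 500)  # "prospective cohort"
rr_a, se_a = rr_and_log_se(90, 700, 30, 700)  # "actual cohort"
rel_dev = abs(rr_p - rr_a) / rr_a             # relative deviation, as defined in the text
z, p = compare_rrs(rr_p, se_p, rr_a, se_a)
```

Here rr_p = 2.0, rr_a = 3.0, so the relative deviation is about 33%, well above the 10% threshold used in the text.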
For the multivariable approach, we built a nonparsimonious model from the prospective cohort to simultaneously control for all the exposure variables. We used a multilevel approach with generalized estimating equations to take into account the clustering of observations within hospitals. Thereafter, the same model was applied to the actual cohort to analyze the change in the odds ratio (OR) estimations. Relative differences between OR estimations in the prospective and actual cohorts were quantified in the same manner as in the bivariate approach.
All statistical analyses were done using SAS and SPSS software; for the logistic multivariable approach, we used Stata software.
Having assessed the potential influence of SB on the results of a real study, we investigated the extent to which registries published in the medical literature might be prone to SB.
We included ACS registries, regardless of their predetermined purpose. We defined “PR” as any observational study conducted in a specific population defined by a particular disease to evaluate specific outcomes. We restricted the review to prospective, longitudinal studies with a forward design, that is, studies in which the outcome of interest was not present at the time the study was initiated (ie, case–control, cross-sectional, and retrospective cohort studies were not considered eligible) and that focused on ACS (ie, studies in which the target population was patients hospitalized for any type of ACS). Community-based studies of patients with a history of ACS were not considered eligible.
An experienced librarian (A.P.) used Medline to electronically search 7 high-impact general medicine journals (Lancet, Annals of Internal Medicine, JAMA, New England Journal of Medicine, American Journal of Medicine, British Medical Journal, Archives of Internal Medicine) and 6 cardiology journals (Circulation, Journal of the American College of Cardiology, European Heart Journal, Heart, American Heart Journal, and American Journal of Cardiology) over the preceding 3 years. The search was carried out until February 2008. A combination of MeSH terms and free-text words searched in different fields was used (Appendix I).
Two investigators (I.F.-G. and F.M.), working independently, used standardized forms to establish whether the abstracts of identified articles met the eligibility criteria (as defined above). They retrieved the full text of all potentially eligible articles. The same reviewers independently assessed the eligibility of the full-text articles with standardized forms and resolved discrepancies by discussion. An arbitrator (G.P.-M.) resolved any remaining discrepancies.
Data Extraction and Data Analysis
Two reviewers (I.F.-G. and F.M.), trained in health research methods, independently extracted data using a standardized form. Reviewers collected information on general characteristics such as the target population, number of participating hospitals, sample size, follow-up period, number of patients lost to follow-up, and ethical aspects. We were interested in the type of recruitment of the eligible population. We predefined 4 types: consecutive without interruptions, consecutive with predefined interruptions (eg, the first week of each month of the study period), administrative database (ie, the eligible population was located from pre-existing administrative databases created for other purposes), or other types (ie, not belonging to the mentioned categories). The first 2 categories correspond to the gold standard for avoiding SB, because all eligible patients presenting during the enrollment period would be included. Registries from administrative databases may also avoid SB9 but may be otherwise problematic.10 In addition, we specifically recorded whether the authors acknowledged the possibility of missing patients during recruitment, whether there was any action to retrospectively locate and include missed patients, and whether they acknowledged the possibility of selection bias because of incomplete recruitment. We specifically sought whether any quality control of the selection/inclusion process or the data abstraction process was mentioned. We defined quality control of the selection/inclusion process as any audit aimed at assessing whether all eligible patients were finally included and whether the included patients actually met the inclusion criteria.
We defined quality control of the data abstraction process as any structured process to ascertain the quality of data, such as (1) training data collectors, (2) continuous feedback to data sites on issues such as missing or out-of-range values and logical inconsistencies, (3) checking of data consistency across sites, and (4) reviewing screening logs and procedures or samples of data.1
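Item 2 above (feedback on missing and out-of-range values) is mechanical enough to sketch in code. The field names, plausibility limits, and records below are purely illustrative of the kind of per-site report a coordinating center might generate; they are not taken from any study instrument.

```python
import pandas as pd

# Hypothetical case-report-form extract; field names and limits are illustrative only.
records = pd.DataFrame({
    "site": ["A", "A", "B", "B", "B"],
    "age": [67, 154, 58, None, 72],          # 154 is out of range; one value missing
    "systolic_bp": [130, 90, None, 210, 115],
})

def qc_report(df, limits):
    """List missing and out-of-range values per site, for feedback to data sites."""
    issues = []
    for col, (lo, hi) in limits.items():
        missing = df[df[col].isna()]
        out_of_range = df[(df[col] < lo) | (df[col] > hi)]
        for idx in missing.index:
            issues.append((df.loc[idx, "site"], col, "missing"))
        for idx in out_of_range.index:
            issues.append((df.loc[idx, "site"], col, "out of range"))
    return issues

report = qc_report(records, {"age": (18, 110), "systolic_bp": (50, 260)})
```

In practice such a report would be produced periodically and returned to each site for correction, matching the continuous-feedback model described above.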
Because some of the retrieved articles corresponded to several publications of the same study, information concerning ethical issues, type of recruitment of the eligible population, and quality control was considered present in the study if any of the articles from the same study mentioned it. Articles that referenced other articles from the same study that had not been included in the sample but contained potentially relevant information were also reviewed to complete, if needed, the information required for our study.
The κ statistic provided a chance-corrected measure of interobserver agreement on the eligibility of registries.
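For two raters making a binary eligibility call, κ is the observed agreement corrected for the agreement expected by chance under independence. A minimal sketch, with hypothetical ratings:

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Chance-corrected agreement between two raters' binary calls."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    p_obs = np.mean(r1 == r2)                       # observed agreement
    p_yes = np.mean(r1) * np.mean(r2)               # chance agreement on "eligible"
    p_no = (1 - np.mean(r1)) * (1 - np.mean(r2))    # chance agreement on "not eligible"
    p_exp = p_yes + p_no
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical ratings: 1 = eligible, 0 = not eligible
rater_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
rater_b = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(rater_a, rater_b)
```

These toy ratings disagree on 1 of 10 registries, giving κ ≈ 0.78; values above 0.8, as reported below, are conventionally regarded as excellent agreement.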
The authors had full access to the data and take responsibility for its integrity. All authors have read and agree to the manuscript as written.
Selection Bias in the MASCARA Study
In the 17 centers, quality control identified 1439 eligible patients who had not been initially included, increasing the sample size from 3265 patients before the retrospective search to 4704 in the final actual sample. Table 1 shows the differences between the initial prospective cohort and the retrospective cohort. Almost all baseline risk variables were substantially more prevalent among patients not initially included, raising the global risk of the final actual cohort. By contrast, the majority of recommended therapies were used less often in these patients and, except for unclassified ACS, their in-hospital mortality was almost 3 times higher.
Table 2 shows the impact of SB on the association between exposure variables and in-hospital mortality. The initial RRs underestimated the actual RRs to some extent in 17 variables and overestimated them in 7. This deviation was higher than expected by chance in 21 variables (P<0.05) and higher than 10% in 17. The highest deviations occurred with the treatment-related variables; in 1 (glycoprotein IIb/IIIa inhibitors), the direction of the association shifted from inconclusive to a protective effect.
Table 3 shows the impact of SB as assessed by multilevel regression analysis. ORs of in-hospital mortality were underestimated in the prospective cohort in 10 variables and overestimated in 12. Although there were no statistically significant differences between prospective ORs and actual ORs, the deviations were higher than 10% in 14 variables. There was no change in the direction of the associations, but the variable “gender” lost statistical significance, whereas the variables “coronary care unit admission” and “statin therapy” reached significance when the model was applied in the actual cohort. In both the prospective and the actual cohorts, the variance of the intercept was statistically significant (0.51 and 0.86, respectively; P<0.001), indicating a cluster effect at the hospital level.
We retrieved 414 abstracts, from which we identified 169 potentially eligible registries, of which 129, corresponding to 77 studies (several studies had more than 1 substudy), proved eligible on consensus review of the full text (Figure). The agreement on eligibility was excellent (κ=0.91; 95% CI, 0.87 to 0.95).
Table 4 shows the information, by article and by study, concerning general and methodological characteristics. In 112 of 129 (86.8%) articles, the authors mentioned that they intended to recruit the eligible population consecutively, either without interruptions or with predefined interruptions. In 14 articles (10.9%), there was no statement about the type of recruitment. In 7 articles (3 studies), the authors declared that patients initially missed during recruitment had been subsequently located and retrospectively included. Only 4 articles (2 studies) acknowledged the possibility of SB because of incomplete recruitment.
Out-of-hospital follow-up was reported in 93 (72%) of the articles. Only 60 of these 93 (64.5%) reported the proportion of patients lost to follow-up, which was lower than 5% in most. Of the 9 articles with more than 5% of patients lost to follow-up, the resulting potential bias was acknowledged in only 2 (22.2%).
Quality control of the selection/inclusion process was reported in only 29.4% of articles. Specific actions to control the data abstraction process were mentioned in 48 publications (37.2%) from 15 studies (19.5%).
Our analyses show that irregular inclusion can introduce substantial bias in registries and that this problem has not been explicitly addressed in a substantial number of them, in which the relevant information is incomplete. In the MASCARA registry, patients who had been missed for prospective inclusion (30.6%) were at higher risk and received clearly fewer recommended therapies than those initially included. Moreover, mortality in this missed cohort was almost 3 times higher than in the prospective cohort. Therefore, incomplete recruitment/inclusion appears to have resulted in an underestimation of the global baseline risk, an overestimation of the rate of therapies applied, and, more importantly, an underestimation of in-hospital mortality. This deviation was also evident in the association between exposure variables and in-hospital mortality in both univariate and multivariate analyses. Concerning the systematic overview, although the majority of reviewed articles reported that consecutive enrollment was intended, quality control was carried out in less than one third of the reviewed studies, and in only 4 of 129 articles was the possibility of SB because of incomplete recruitment acknowledged.
The Grading of Recommendations Assessment, Development, and Evaluation Working Group noted that because randomized trials are not always feasible, in some instances observational studies may provide valid evidence.11 However, although methodological assessment of published randomized clinical trials is the rule, such practice seems less stringent for PRs, even though it is well known that registries are more prone to bias because of their nonexperimental design.1
The MASCARA registry provides a real example of how SB may affect the estimated rates of clinically relevant variables. This is especially problematic when the primary aim of the registry is the estimation of risk. Remarkably, the population initially missed was systematically at higher risk than the population included. Additionally, a registry could be an incentive to enroll only patients who either are at low risk of complications or have not suffered them, thus biasing the results toward lower event rates. In any case, the mechanisms of our findings and their generalizability deserve further study.
In more than half the reviewed articles, the number of enrolled patients was higher than 1200. In most of them, the authors stated that they intended to include all consecutive eligible patients. Given the expected difficulty of locating and including all patients with so prevalent a condition, quality control of the comprehensiveness of recruitment/inclusion would be desirable. However, it was apparently undertaken in only 9 of 77 studies. Similarly, in only 19.5% was any approach to avoid errors in the data abstraction process mentioned. Although there are no fixed rules, as with randomized clinical trials, certain strategies for limiting systematic bias in registries could be useful. For instance, the purpose of the registry should be clearly defined beforehand so that a realistic recruitment strategy can be chosen according to that purpose. In addition, continuous feedback to data sites and checking of data consistency across sites may limit data errors. Finally, a periodic random review of a sample of clinical records during the execution of the study to search for missing data and inconsistencies may also improve data quality.
Other findings of our analysis suggest that the current reporting of registries on ACS is far from optimal: up to 10.9% of the reviewed articles did not report what strategy had been used to enroll patients, 35.5% of articles with out-of-hospital follow-up did not mention whether there were follow-up losses, quality control of the data abstraction process was mentioned in only 37.2% of articles, and in 16.3% there was no mention of ethical issues. In view of the importance of the information coming from registries, it is surprising that the scientific community does not seem, as a whole, to have given comprehensive attention to their methodological issues. Although the MASCARA registry may seem an extreme instance of poor compliance, the key point is that many relevant registries are reported in a way that precludes assessing to what extent they are actually flawed.
Limitations and Strengths
Our assessment of SB in the MASCARA study depends on the reliability of the hospital episode statistics to locate all ACS episodes. Other studies have reported that although the specificity of this system for hospital episodes is high, its sensitivity is variable.12 This could have, to some extent, affected the findings of the study concerning the characteristics of the missed population. On the other hand, only 17 of 49 centers agreed to participate in the quality control. The characteristics of patients recruited in nonparticipating centers suggest (data not reported) that if they had been included as such in the analysis, the magnitude of the resulting SB would have been even higher. This is especially true for the in-hospital mortality event rate, which was substantially lower in centers that did not agree to participate.
Our assessment of SB could have been influenced by other biases, such as information bias. For instance, data collectors could have introduced bias by misreporting an outcome, either intentionally or unintentionally. However, the external researchers were specifically trained to collect data from clinical records according to standardized definitions, thus minimizing the possibility of such bias.
Concerning the systematic survey, we focused on registries on ACS similar to the MASCARA registry. Generalizing our findings to all registries is questionable, but there is no reason to believe that the way registries are reported depends on the characteristics of the target population. We focused on cohort studies because they are mostly used for descriptive purposes as well as for evaluating comparative effectiveness or safety. Our results may not apply to other observational designs, such as cross-sectional or case–control studies.
Our work has additional strengths. Our sample of 129 registries is the result of a systematic search. Our data collection was comprehensive and careful, including independent judgment and abstraction of data at all stages by reviewers trained in this methodology.
Our results show that systematic error attributable to SB in PRs is a real possibility and that the absence of quality control can lead to biased results. Although our systematic review could not evaluate to what extent misleading conclusions are present in the reviewed sample, the relatively few registries in which quality control was performed suggest that this possibility is a real one. Researchers should ensure, during the design and execution of a PR, that the risk of selection and information bias is minimal. In addition, they should report the results in a way that lets the reader assess the possibility of bias. Clinicians should view with caution the results of registries without adequately reported quality control.
We thank all the researchers who actively participated in the inclusion of patients in the MASCARA study (Appendix II). MASCARA researchers involved in the recruitment of patients received an honorarium from a grant for this purpose.
Sources of Funding
The present study has been funded with grants from the Fondo de Investigación Sanitaria (PI04/1408), from the Red de Investigación Cardiovascular del Instituto Carlos III (RECAVA), and from an unrestricted grant of Bristol-Myers Squibb.
The online-only Data Supplement is available at http://circoutcomes.ahajournals.org/cgi/content/full/CIRCOUTCOMES.108.844399/DC1.
Gliklich RE, Dreyer NA, eds. Registries for Evaluating Patient Outcomes: A User’s Guide. (Prepared by the Outcome DEcIDE Center [Outcome Sciences, Inc, dba Outcome] under Contract No. HHSA29020050035I TO1.) AHRQ Publication No. 07-EHC001-1. Rockville, Md: Agency for Healthcare Research and Quality; 2007.
Rothman KJ. Modern Epidemiology. Boston, Mass: Little, Brown; 1986.
Altman DG, Bland JM. Interaction revisited: the difference between two estimates. BMJ. 2003; 326: 219.
Raftery J, Roderick P, Stevens A. Potential use of routine databases in health technology assessment. Health Technol Assess. 2005: 1–iv.
Delamothe T, Smith R. Open access publishing takes off. BMJ. 2004; 328: 1–3.