Variation in Cardiologists’ Propensity to Test and Treat
Is It Associated With Regional Variation in Utilization?
Background— Regional variation in healthcare utilization, including cardiac testing and procedures, is well documented. Some factors underlying such variation are understood, including resource supply. However, less is known about how physician behaviors and attitudes may influence variation in utilization across regions.
Methods and Results— We surveyed a national sample of cardiologists using patient vignettes to ascertain physicians’ self-reported propensity to test and treat patients with cardiovascular problems. From each physician’s responses, we computed a Cardiac Intensity Score intended to measure the physician’s propensity to recommend high-tech and/or invasive tests and treatments. In addition, we asked under what circumstances they would order a cardiac catheterization “for other than purely clinical reasons.” For some survey items, there was substantial variation in physician responses. We found that the Cardiac Intensity Score was associated with 2 measures of population-based healthcare utilization measured within geographic regions, with a stronger association with general healthcare spending than with delivery of cardiac services. Although nearly all physicians denied ordering a potentially unnecessary cardiac catheterization for financial reasons, some physicians acknowledged ordering the test for other reasons, including meeting patient and referring physician expectations, meeting peer expectations, and malpractice concerns. More than 27% of respondents reported frequently or sometimes ordering a cardiac catheterization if a colleague would in the same situation, and nearly 24% reported doing so out of fear of malpractice. These 2 factors were significantly associated with the propensity to test and treat, but only fear of malpractice was associated with regional utilization.
Conclusions— Variability in cardiologists’ propensity to test and treat partly underlies regional variation in utilization of general health and cardiology services. The factor most closely associated with this propensity was fear of malpractice suits. This factor may be an appropriate target of intervention.
Received December 5, 2008; accepted January 20, 2010.
Variation in healthcare utilization across regions, sometimes called small area variation, is well documented in the United States. Across geographic regions, there are widely differing rates of total healthcare spending,1 utilization of high-cost interventions such as coronary revascularization2–4 and joint replacement,5,6 and less costly but more frequently performed medical care such as physician visits,1 mammography,7,8 and echocardiography.9 Even the use of evidence-based interventions for which there is consensus about “which rate is right,” such as the use of β-blockers after acute myocardial infarction, varies regionally.4,8 Increased utilization of general healthcare services does not result in better outcomes, increased patient satisfaction,10 or improved patient perceptions of the quality of medical care received.11 In an era of escalating healthcare costs and focus on the delivery of high-quality care at the lowest possible cost, it is critical to understand why some regions experience so much higher rates of healthcare utilization than others.
The diagnosis and treatment of coronary artery disease (CAD), a common condition among older Americans, is costly and often invasive. We4,12 and others2,3 have shown substantial regional variation in the utilization of cardiac services in every step of care from diagnostic testing to therapeutic intervention. Between regions, rates of stress testing, cardiac catheterization, and revascularization have been shown to differ from 3- to 8-fold, depending on the procedure under examination.4 Thus, testing and treatment for CAD provides an ideal context in which to examine reasons for regional variation.
Legitimate causes of variation include differences in underlying disease rates and differences in patient preferences, but these factors do not fully explain observed variation.4,10 Some other correlates of variations in care are well understood. For instance, variations in cardiac catheterization rates are highly associated with variations in the supply of cardiac catheterization laboratories.4,13 However, other influences on variation are less well understood, including the contribution made by varying physician attitudes. To better understand the potential contribution of physician factors, we surveyed cardiologists about their clinical decisions to test or intervene using patient vignettes. We wished to test (1) whether self-reported tendency to test for and treat CAD was associated with population-based measures of variation in utilization; and (2) whether self-reported tendency to test for and treat CAD was associated with nonclinical factors such as fear of malpractice suits and peer pressure.
WHAT IS KNOWN
Regional variation in healthcare utilization, including cardiac testing and procedures, is well documented.
Some factors underlying such variation are understood, including resource supply.
WHAT THE STUDY ADDS
A summary measure of self-reported tendency to test for and treat cardiac disease was associated with population-based healthcare utilization measured within geographic regions.
Increased tendency to test and treat was associated with the perceived practice of colleagues and fear of malpractice.
Only fear of malpractice was associated with regional utilization.
The survey described in this article is part of a larger physician survey project. A large, representative national sample of primary care physicians and cardiologists was surveyed regarding the intensity of their practice style. By “intensity,” we mean the propensity to treat aggressively, particularly the propensity to order high-tech tests and treatments. Some questions were the same for primary care physicians and cardiologists and others were different. The current study describes the results of the survey of cardiologists. Survey instrument development, including focus groups and cognitive interviews, was conducted in collaboration with the Center for Survey Research, an academic survey research organization affiliated with the University of Massachusetts Boston. This study was approved by the Institutional Review Boards of Dartmouth Medical School and the University of Massachusetts.
Survey Development and Overview
To learn how cardiologists test for and treat CAD, we conducted focus groups with cardiologists (2 focus groups, 12 cardiologists) in 2 cities with differing healthcare utilization rates (Miami, Fla, and Portland, Ore). We discussed diagnostic and therapeutic decision-making, both generally and related to specific draft vignettes describing theoretical patients. A draft survey instrument was developed based on the results of the focus groups and revised based on the results of 8 cognitive interviews with clinicians, conducted to ensure that the questions were understood and the answer choices meaningful.
The survey consisted of questions about the physician, his/her subspecialty, practice setting, and other variables specific to the physician and his/her clinical decision-making. Clinical care questions included vignettes presenting patient scenarios commonly encountered in cardiology practice; physicians were asked how often they would recommend a number of proposed interventions (which were not mutually exclusive) in response to each vignette (see Table 1 for the wording of the vignettes). Responses were assessed using 5-point Likert scales. To ascertain nonclinical factors that may affect decisions to test and treat, we also asked physicians about the likelihood they would recommend a cardiac catheterization for “other than purely clinical reasons” under various conditions using a 4-point Likert scale.
Using the Masterfiles of the American Medical Association, we drew a national random sample of physicians self-identified as cardiologists. Selected physicians were contacted by telephone to verify specialty, mailing address, and number of hours spent treating patients. A maximum of 3 call attempts were made to speak to either the physician or an informant (eg, receptionist). To be eligible for the study, a physician had to spend at least 20 hours per week treating patients. Residents were ineligible. A trained professional interviewer checked for eligibility. Over the course of the 8-week verification period, we identified 1157 eligible physicians of an original sample of 1340, of whom 999 were randomly selected to receive the survey.
Each potential participant was sent an initial questionnaire packet with a cover letter explaining the study, a $20 cash incentive, a survey instrument, and a postage-paid return envelope. About 2 weeks after their initial mailing, physicians who had not yet responded were sent another questionnaire packet with a letter, survey instrument, and return envelope. Of 999 physicians who were mailed the initial survey, 5 were found to be ineligible based on their survey responses; of the remaining 994, 614 responded for a response rate of 62% (calculated using AAPOR Response Rate 1 formula14). Nonresponders did not differ from responders with respect to gender, practice type, or number of years in practice. Despite prescreening for specialty, an additional 16 physicians reported their specialties as something other than cardiology on the survey; these observations were excluded from the analysis, leaving a total analytic sample of 598.
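The arithmetic behind the response rate and analytic sample reported above can be sketched as follows. This is a minimal illustration, not the authors' code: AAPOR Response Rate 1 divides complete responses by all eligible and possibly eligible cases, and because the text does not break out partial responses or cases of unknown eligibility, both are assumed here to be zero. The function name and argument names are hypothetical.

```python
def response_rate_1(completes, eligible_nonrespondents, unknown_eligibility=0):
    """RR1-style rate: completes over all eligible and possibly eligible cases.

    Assumption: partial responses and unknown-eligibility cases are zero,
    because the article does not report them separately.
    """
    denominator = completes + eligible_nonrespondents + unknown_eligibility
    return completes / denominator


mailed = 999                 # physicians mailed the initial survey
ineligible = 5               # found ineligible from their survey responses
responded = 614              # returned surveys
non_cardiologists = 16       # reported a specialty other than cardiology

eligible = mailed - ineligible
rate = response_rate_1(responded, eligible - responded)
analytic_sample = responded - non_cardiologists

print(f"eligible={eligible}, response rate={rate:.1%}, analytic sample={analytic_sample}")
```

Rounded to a whole percentage, the rate matches the 62% given in the text, and the analytic sample matches the reported 598.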
Regional Variation in Healthcare Utilization
We used 2 different measures to assess regional variation in the delivery of medical care as exposure variables, both based on utilization in the Medicare population, one general and another specific to cardiology. We used the Hospital Referral Region (HRR) as the unit of analysis for regional variation; HRRs are regions developed by the Dartmouth Atlas project representing referral patterns for tertiary care throughout the entire United States and provide a way to measure healthcare utilization at the population level.
The general measure of regional utilization is a measure of overall healthcare spending called the End-of-Life Expenditure Index (EOL-EI).1,10 This measure reflects the portion of Medicare spending in a region attributable to the quantity of services provided, with adjustment for differences in price and illness. It is calculated as age/sex/race-adjusted spending on hospital and physician services provided to Medicare enrollees in the last 6 months of their lives. We calculated the mean EOL-EI for each of the 306 HRRs in the United States; HRRs were then sorted in order of EOL-EI and divided into quintiles of approximately equal population size, based on the entire Medicare population aged 65 and older. Each physician was assigned to an HRR and corresponding quintile of EOL-EI based on the county of his/her primary practice location.
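One way to form quintiles of approximately equal population size, as described above, is to sort regions by the index and cut on cumulative population share. The sketch below is an assumption about the mechanics, not the authors' procedure: the text does not say how regions straddling a cut point were handled, so this version assigns each region by the midpoint of its population span. The region names, index values, and populations are made up for illustration.

```python
def population_weighted_quintiles(regions):
    """Assign each region a quintile (1-5) with roughly equal population per quintile.

    regions: list of (name, index_value, population) tuples.
    Assumption: a region is placed by the midpoint of its cumulative
    population span; the article does not specify a tie-breaking rule.
    """
    ordered = sorted(regions, key=lambda r: r[1])   # sort by utilization index
    total = sum(pop for _, _, pop in ordered)
    assignment = {}
    cumulative = 0
    for name, _, pop in ordered:
        midpoint = cumulative + pop / 2
        assignment[name] = min(int(5 * midpoint / total) + 1, 5)
        cumulative += pop
    return assignment


# Hypothetical toy data: 10 regions of equal size with increasing index values.
toy_regions = [(f"r{i}", float(i), 100) for i in range(1, 11)]
quintiles = population_weighted_quintiles(toy_regions)
counts = [list(quintiles.values()).count(q) for q in range(1, 6)]
print(quintiles, counts)
```

A physician would then inherit the quintile of the region containing his/her primary practice location.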
The second measure of regional utilization focused on cardiac services. For this measure, we used the population-based cardiac catheterization rate. Rates of cardiac catheterization are highly correlated with both stress testing and revascularization and thus are a good measure of the intensity of delivery of cardiac services. For purposes of this analysis, the catheterization rate may have an advantage over the EOL-EI because it directly relates to some cardiologists’ survey responses about clinical decision-making. We calculated population-based cardiac catheterization rates (CATH rates) for 2003 by counting the number of aged fee-for-service Medicare beneficiaries receiving the procedure in each HRR during the study period and dividing by the total aged Medicare population in the HRR at midyear. Rates were then standardized by age, sex, and race using the indirect method15; the entire fee-for-service Medicare population was the standard population. We divided HRRs into quintiles as described above. We then associated each physician with the cardiac catheterization quintile for his/her assigned HRR. EOL-EI and cardiac catheterization quintiles were modestly correlated (Spearman r=0.18, P<0.0001).
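Indirect standardization, as used for the CATH rates above, compares the events observed in a region with the number expected if the standard population's stratum-specific rates applied to that region's population, then scales the standard population's crude rate by that ratio. The sketch below is a generic illustration under stated assumptions: it stratifies by age only (the study used age, sex, and race), and all counts and rates are invented.

```python
def indirectly_standardized_rate(observed_events, region_pop_by_stratum,
                                 standard_rate_by_stratum, standard_crude_rate):
    """Return (observed/expected ratio, indirectly standardized rate).

    expected = sum over strata of region population x standard stratum rate.
    """
    expected = sum(region_pop_by_stratum[s] * standard_rate_by_stratum[s]
                   for s in region_pop_by_stratum)
    ratio = observed_events / expected
    return ratio, ratio * standard_crude_rate


# Hypothetical example: two age strata in one region.
region_pop = {"65-74": 1000, "75+": 500}
standard_rates = {"65-74": 0.02, "75+": 0.04}   # procedures per person in the standard population
ratio, adjusted_rate = indirectly_standardized_rate(
    observed_events=50,
    region_pop_by_stratum=region_pop,
    standard_rate_by_stratum=standard_rates,
    standard_crude_rate=0.03,
)
print(ratio, adjusted_rate)
```

In this toy case the region performs more procedures than expected for its age mix, so its standardized rate exceeds the standard population's crude rate.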
Physician Practice Style Intensity
We defined high-intensity practice as the propensity to preferentially recommend high-tech, aggressive, or invasive treatment options. Physician practice style intensity was measured using responses to 3 clinical vignettes representing typical patients seen for cardiac evaluation and treatment, in which the physician was asked how often he/she would recommend specific interventions for the hypothetical patient. Each vignette presented high- and low-intensity testing and treatment options. Responses were assessed using a 5-point Likert scale as described above. We combined responses into a summary measure of the tendency to test and treat intensively in the following way. We used a modified Delphi technique16 with responses from 8 research clinicians (physicians and nurses) to assign weights to each testing and treatment option for each vignette. The weights assigned an intensity score for each potential recommendation (eg, ordering a stress imaging study was designated as more intensive [weight=7] than ordering a standard exercise treadmill [weight=2] for a 75-year-old man with chest pain). We then calculated a physician’s intensity score for each vignette by summing the weighted responses after reverse coding the high-intensity items. All items were originally coded 1 for the most intense response and 5 for the least. For high-intensity items (weight >5), we reversed the coding so that more intense responses received a higher value (1 became 5 and 5 became 1); for items with weight <5, we retained the original coding. We derived a physician’s summary intensity score, which we called the Cardiac Intensity Score, by averaging the normalized values for each individual vignette intensity score (normalized to a mean of 50 and SD of 10). If a single physician had more than half the responses missing for 1 or more vignettes, no summary score was calculated; 32 physicians had no summary score because of missing data.
One question was deleted when examination of the results made it clear that respondents did not understand the intent of the question. (This question allowed physicians to say they would start empirical therapy without further testing; many respondents said yes to this choice and then also endorsed further testing.)
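The scoring steps described in this section (reverse coding of high-intensity items, weighted summation within each vignette, normalization to a mean of 50 and SD of 10, and averaging across vignettes) can be sketched as follows. This is an illustration under several assumptions rather than the authors' implementation: the text does not say how an item with weight exactly 5 was coded (this sketch leaves it unreversed), whether the population or sample SD was used (the population SD is used here), or how partially missing responses were handled (omitted). Only the two weights quoted in the text (7 and 2) are taken from the article; the physicians, responses, and second-vignette scores are invented.

```python
from statistics import mean, pstdev


def vignette_score(responses, weights):
    """Weighted sum of Likert responses (1-5) for one vignette.

    Items with weight > 5 are reverse coded (1 <-> 5); others keep their coding.
    Assumption: weight == 5 is treated like the low-intensity items.
    """
    total = 0.0
    for item, answer in responses.items():
        weight = weights[item]
        coded = 6 - answer if weight > 5 else answer
        total += weight * coded
    return total


def normalize(scores, target_mean=50.0, target_sd=10.0):
    """Rescale scores to the target mean and SD (population SD assumed)."""
    m, sd = mean(scores), pstdev(scores)
    return [target_mean + target_sd * (s - m) / sd for s in scores]


def summary_scores(raw_by_vignette):
    """Average each physician's normalized vignette scores."""
    normalized = [normalize(raw) for raw in raw_by_vignette]
    return [mean(per_physician) for per_physician in zip(*normalized)]


weights = {"stress_imaging": 7, "treadmill": 2}          # example weights from the text
responses = [                                            # hypothetical physicians A, B, C
    {"stress_imaging": 1, "treadmill": 5},
    {"stress_imaging": 3, "treadmill": 3},
    {"stress_imaging": 5, "treadmill": 1},
]
vignette_1 = [vignette_score(r, weights) for r in responses]
vignette_2 = [30.0, 20.0, 10.0]                          # hypothetical raw scores
normalized_1 = normalize(vignette_1)
cardiac_intensity = summary_scores([vignette_1, vignette_2])
print(vignette_1, cardiac_intensity)
```

Because each vignette is normalized before averaging, vignettes with more items or larger weights do not dominate the summary score.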
Because of the relatively subjective nature of our modified Delphi technique, we also performed a factor analysis to assess the robustness of our conclusions. We used eigenvalues, the proportion criterion, and examination of the interpretability of the results of several models using 2, 3, and 4 factors with varimax rotation to select the best model.
We included several survey items to assess nonclinical factors that might influence a physician to intervene by asking directly how often each of 5 factors (eg, malpractice concerns) had led them to perform a cardiac catheterization “for other than purely clinical reasons.” Responses to these questions were assessed on a 4-point Likert scale with responses described as “frequently,” “sometimes,” “rarely,” or “never.”
Characteristics of respondents were described using standard statistical methods, means and standard deviations for continuous variables, and proportions for categorical variables. For purposes of describing vignette responses, we combined the responses “always/almost always” and “most of the time,” using proportions to summarize the data.
The primary analysis of interest was the relationship between each of the 2 measures of utilization (general intensity in the region as measured by the EOL-EI and cardiac-specific healthcare utilization in the region as measured by CATH rate) and physician practice style intensity (as measured by the Cardiac Intensity Score). The relationship between Cardiac Intensity Score and both measures of region-level healthcare utilization intensity (quintile) was assessed using ANOVA, with quintile as the independent variable and Cardiac Intensity Score as the dependent variable. Tests for linear trend across quintiles were performed using linear regression models, entering quintile as a continuous variable. Separate models were created for EOL-EI and CATH rate regional variation measures. ANCOVA was performed to assess the primary relationship between region-level intensity of utilization and Cardiac Intensity Score while controlling for physician characteristics such as year of graduation, subspecialty, board certification, and practice type. Clustering of physician responses within HRR was accounted for by using the covariates from the final ANCOVA models in mixed models containing a random effect for HRR. ANOVA models were also used to assess the relationship between nonclinical physician practice style factors and regional variation in utilization. We used χ2 tests to assess the relationship between region-level intensity of utilization and propensity to order a cardiac catheterization for other than clinical reasons. Analyses were performed using SAS version 9.1 (SAS Institute, Cary, NC).
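The test for linear trend described above amounts to an ordinary least-squares regression of the Cardiac Intensity Score on quintile, with quintile entered as the numbers 1 through 5. The sketch below uses only the Python standard library and returns the slope and its t statistic; a P value would come from a t distribution with n-2 degrees of freedom, which is not computed here. It does not reproduce the SAS analysis, the covariate adjustment, or the random effect for HRR, and the data are invented.

```python
from math import sqrt


def linear_trend(quintiles, scores):
    """Simple OLS of score on quintile (treated as continuous).

    Returns (slope, t_statistic) for the slope, with n - 2 degrees of freedom.
    """
    n = len(scores)
    mean_x = sum(quintiles) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in quintiles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(quintiles, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(quintiles, scores))
    slope_se = sqrt(residual_ss / (n - 2) / sxx)
    return slope, slope / slope_se


# Hypothetical data: two physicians per quintile.
toy_quintiles = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
toy_scores = [46, 47, 47, 49, 48, 50, 50, 51, 50, 52]
slope, t_stat = linear_trend(toy_quintiles, toy_scores)
print(slope, t_stat)
```

A positive slope indicates that scores rise, on average, with each step up in utilization quintile.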
Respondents and nonrespondents were similar with respect to sex, year of medical school graduation, and practice type. The average age of cardiologists surveyed was 52; 93% were male and 82% white (Table 2). The majority, 54%, belonged to a specialty group, nearly all were board-certified (95%), and 23% were international medical school graduates. Of these cardiologists, 36% were general cardiologists, 21% invasive cardiologists, 36% interventional cardiologists, and 7% electrophysiologists. On average, 9% of physicians’ practices consisted of Medicaid patients, 52% Medicare patients, and 12% of their patients were in capitated plans.
Table 1 describes physicians’ responses to the vignettes in detail and the weights used in the computation of the Cardiac Intensity Score for each item. Generally, the management options with weights of 10 (the highest possible weight) elicited the smallest proportion of physicians responding “always/almost always” or “most of the time.” For example, only 7% of physicians reported that they would admit the patient in Vignette 1 (with new onset of angina on heavy exertion) to the hospital with such high frequency (assigned weight=10); an even smaller proportion (6%) of physicians reported that they would repeat angiography always or most of the time for the patient in Vignette 2 (with severe end-stage congestive heart failure and nonsustained ventricular tachycardia on Holter monitoring; weight=10); and less than 2% of physicians reported that they would place a pulmonary artery catheter always or most of the time in the Vignette 3 patient (with an exacerbation of end-stage congestive heart failure due to inoperable ASCVD; weight=10). However, for other high-weight items (weight, 8 to 9), more physicians reported that they would recommend such an intervention all or most of the time (eg, 30% of physicians so responded to the question about angiography for the patient in Vignette 1; 65% of physicians so responded to the question about implantable cardioverter-defibrillator implantation for the patient in Vignette 2). There was considerable disagreement for some other responses. Nearly half of physicians reported that they would recommend a biventricular pacemaker always or most of the time for the patient in Vignette 2 and a similar proportion would initiate or continue palliative care discussions for the patient in Vignette 3 always or most of the time. Figure 2 shows the statistically significant associations between EOL-EI quintile and individual survey item responses.
For the patient in Vignette 1, about 20% of physicians in the lowest spending quintile would order a resting echocardiogram all or most of the time, whereas nearly 40% of physicians in the highest spending regions would do so. More physicians endorsed an imaging stress test in this patient, but the differences between the lowest and highest quintiles were similar. Fewer physicians in the highest quintile would refer the patient in Vignette 2 to palliative care all or most of the time compared with physicians practicing in lower spending quintiles.
Practice type and international medical school graduation were the only characteristics related to the Cardiac Intensity Score (Table 3). International medical school graduates had higher Cardiac Intensity Scores than US medical school graduates. Physicians practicing in HMOs had much lower scores than those in other practice settings. In large geographic regions (those defined by the US Census), New England and Pacific physicians had the lowest scores while those in the Mid-Atlantic, South Atlantic, and West South Central regions had the highest scores (data not shown), although these differences were not statistically significant.
Increasing levels of overall utilization (EOL-EI) in HRRs were significantly associated with higher Cardiac Intensity Scores for physicians practicing in those HRRs (P trend=0.0002) (Figure 1A), although absolute differences were small, <1 SD (least-squares mean=46.4 in the lowest quintile and 51.1 in the highest quintile, SD 6.8). CATH rates were also positively associated with the Cardiac Intensity Score, although less strongly (P trend=0.0008) and with even smaller absolute differences (Figure 1B). Controlling for physician subspecialty, board certification, international medical school graduation, practice type, and year of graduation had only a minor effect on the results. In multiple regression models, international medical school graduation and practice type were the only physician factors significantly associated with Cardiac Intensity Score, with international medical graduates and members of multispecialty groups reporting higher practice intensity compared with others as measured by the Cardiac Intensity Score.
Factor analysis identified a general intensity factor that was highly correlated with the Cardiac Intensity Score (r=0.83, P<0.0001). Models substituting the general intensity factor score for the Cardiac Intensity Score demonstrated very similar results.
To identify nonclinical factors that might contribute to cardiac testing and treatment intensity, we asked how often a respondent would order a cardiac catheterization of questionable clinical necessity when certain conditions applied. We then assessed whether the Cardiac Intensity Score differed according to whether physicians self-reported that their clinical decisions were influenced by various nonclinical factors. The majority of cardiologists denied ordering a cardiac catheterization of questionable clinical necessity for any of these reasons, although the pattern differed by the reason given for the test (Table 4). Almost no one reported recommending a cardiac catheterization for financial gain, whereas a substantial proportion of cardiologists reported that they would recommend cardiac catheterization for each of the other 4 nonclinical reasons frequently or sometimes. There was no association between Cardiac Intensity Score and self-reported tendency to perform cardiac catheterization to meet the expectations of the patient or the referring physician. However, Cardiac Intensity Scores were higher for physicians who reported being more heavily influenced by what their colleagues would do (P=0.02) and by malpractice concerns (P=0.02), although in both cases the magnitude of the differences was small.
There was no association between ordering a test for nonclinical reasons and EOL-EI quintile (Table 5). Physicians practicing in areas with higher CATH rates were more likely to report frequently or sometimes ordering a cardiac catheterization when a referring physician expected the test (19.1% in the lowest quintile compared with 35.5% in the highest quintile, P=0.05) or out of fear of malpractice (11.9% in the lowest quintile and 35.1% in the highest quintile, P<0.0001).
Using a cardiologist survey and previously developed measures of variation in healthcare utilization in the population, we have shown that cardiologists report a higher propensity to test and treat intensively in high-utilization regions than in lower utilization regions. Thus, our hypothesis that physician practice style intensity is one factor that underlies regional variation was supported and highly statistically significant, although absolute differences were small. (With a derived measure such as our Cardiac Intensity Score created from responses to patient vignettes, it is difficult to define the size of a clinically meaningful difference.) These findings are similar to those of other studies assessing both cardiologist17 and primary care18,19 practice.
Although our 2 measures of utilization were correlated, the relationship was modest and the relationship between physician practice style intensity and overall utilization (EOL-EI) was somewhat stronger than that with cardiac specific utilization (CATH rate). Although it may be surprising that cardiologists’ reported practice intensity more closely mirrored an overall measure of local healthcare utilization (EOL-EI) than cardiac specific utilization (CATH rates), overall healthcare utilization is heavily dependent on practices (eg, hospitalization) such as those measured by our survey. EOL-EI measures overall utilization and is largely dependent on such things as hospitalization rates and the number of days spent in hospital. Some of the vignette responses that we rated as high intensity included admission to the hospital, so this finding is not particularly surprising.
What might affect practice style intensity? We found that 2 nonclinical factors were associated with cardiologist practice style intensity. Cardiologists with high Cardiac Intensity Scores were more likely to report recommending a cardiac catheterization that was not clinically indicated when they thought their peers would do so than those with lower scores, indicating that physicians who practice intensively may be more likely than others to be influenced by peers—or that conformity to perceived practice norms is a potent influence on practice style. In addition, those who reported practicing more intensively were more likely to indicate that fear of malpractice suits influenced their decisions. Physicians sometimes report that they provide care of questionable necessity because a patient asks for or expects such care; although 17% of cardiologists surveyed acknowledged this occurs (at least sometimes), we found no evidence for the hypothesis that the tendency to respond to patient expectations was a predictor of practice intensity. The Cardiac Intensity Score was not associated with either patient or referring physician expectation, nor did cardiologists acknowledge recommending testing to enhance the financial standing of their practices.
Are the nonclinical factors associated with cardiologists’ practice style intensity also related to region-level utilization? We hesitate to put much emphasis on the association between referring physician expectations and CATH rate because of the borderline probability value and the large number of statistical tests performed in this study. Thus, the only factor associated with region-level utilization was fear of malpractice suits, and this factor was associated only with CATH rate, not with overall utilization as measured with EOL-EI. As such, malpractice concerns seem to be the common denominator in each analysis and may provide a reasonable target for intervention.
Limitations of our work must be acknowledged. First, our measures of utilization were assessed in the Medicare population alone. However, there is evidence that such measures track closely with those derived from other populations.20,21 Second, we measured cardiologist practice intensity using vignettes rather than a “real-world” measure such as chart review or direct observation. This technique has the advantage of standardizing patient characteristics, so that all cardiologists made recommendations for the “same” set of patients, but we did not perform a validation study. However, several studies22–25 have validated the use of vignettes against both chart review and standardized patients with good results—vignettes provided greater concordance with a gold standard (standardized patient) than chart review. In other studies,26,27 the relationship between self-report and chart review, while present, was only modest (correlations in the 0.30 range). In addition, we used a convenience sample of like-minded research clinicians to assign weights. However, an analysis based on the primary care survey from this same project used both modified Delphi weights and factor analysis to create intensity scores and obtained nearly identical results from both,18 and our secondary analysis using factor analysis gave similar results to the primary analysis.
Although we found that cardiologist practice style intensity as measured in the “same” patient was associated with regional variation in utilization, practice style intensity explained only a small proportion of the overall variation in utilization, as measured by R2 from an ANOVA model, indicating that there are other factors at work. Some of these factors are known, such as resource supply, but others remain to be identified. Similarly, we identified some nonclinical factors that influence physician recommendations, but much work remains to identify others. Our results suggest that malpractice concerns may be a target for intervention to reduce variations.
Sources of Funding
This work was supported in part by grant P01 AG019783 from the National Institute on Aging.
The views expressed herein do not necessarily represent the views of the Department of Veterans’ Affairs or the US Government.
Wennberg DE. The Dartmouth Atlas of Cardiovascular Health Care. Chicago: AHA Press; 1999.
Weinstein JN. In: Birkmeyer J, Wennberg JE, Cooper M, eds. The Dartmouth Atlas of Musculoskeletal Health Care. Hanover NH: Center for the Evaluative Clinical Sciences, Dartmouth Medical School; 2000.
Burns RB, McCarthy EP, Freund KM, Marwill SL, Shwartz M, Ash A, Moskowitz MA. Variability in mammography use among older women. J Am Geriatr Soc. 1996; 12: 44–50.
Wennberg DE, Dickens J Jr, Soule D, Kellett M Jr, Malenka DJ, Robb J, Ryan T Jr, Bradley W, Vaitkus P, Hearne M, O'Connor G, Hillman R. The relationship between the supply of cardiac catheterization laboratories, cardiologists and the use of invasive cardiac procedures in Northern New England. J Health Serv Res Policy. 1997; 2: 75–80.
American Association for Public Opinion Research. 2004. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. III edition. Lenexa, KS: AAPOR.
Kahn HA, Sempos CT. Statistical Methods in Epidemiology. New York: Oxford University Press; 1989.
Hsu C, Sandford BA. The Delphi technique: making sense of consensus. Practical Assess Res Eval. 2007; 12: 1–8.
Sirovich G, Gallagher PM, Wennberg DE, Fisher ES. Discretionary decision making by primary care physicians and the cost of US health care. Health Affairs. 2008; 27: 813–823.
Bronner KK, Cooper MM, Stukel TA, Wennberg JE. The Dartmouth Atlas of Health Care in Pennsylvania. Chicago, IL: American Hospital Publishing; 1998.
Available at: www.bcbsm.org/atlas.