Reappraising the WHI study's evidence that hormone replacement therapy increases the risk of a heart attack

 

Introduction:

 

The WHI study was a randomised controlled trial (RCT) of approximately 16,000 postmenopausal women that was designed to establish whether hormone replacement therapy (HRT) increased or decreased the risk of a number of diseases, including heart attacks [1]. The WHI trialists concluded that long-term HRT significantly increased the risk of coronary heart disease events in postmenopausal women (HR 1.29). This conclusion was diametrically opposed to the conclusion drawn in many observational studies performed in the last two decades, which concluded that HRT has a cardio-protective or neutral effect (HR ≤ 1.0). Many people therefore changed their belief regarding the expected cardiac effects of long-term HRT in postmenopausal women on the basis of the WHI study's results. Is that change in belief warranted? Many opinion leaders argue that the results of a well-conducted RCT are more scientifically conclusive than the results of observational studies, which frequently suffer from the effects of uncontrollable, and often unrecognised, confounding variables [2]. However, it is my belief that an RCT can only produce scientifically conclusive results (a point estimate HR with a narrow 95%CI surrounding it) if the RCT has an adequate sample size relative to the size of the control event rate [3], a condition not met by the WHI study.

 

Considering the effects of a low control event rate on the scientific conclusiveness of an RCT's results:

 

The average annualised coronary heart disease control event rate (CHD-CER) in the placebo patients in the WHI study was 0.3% per year. However, the WHI study's annualised CHD-CER does not necessarily represent the mean annualised CHD event rate in the general population of postmenopausal women who have the same age range, and the same absence of known coronary heart disease, as the WHI study's sample of 8,102 placebo control patients. The "true" population mean annualised CHD event rate could be anywhere within the 95%CI surrounding the WHI study's annualised CHD-CER point estimate of 0.3%. Using a confidence interval calculator, and assuming a sample size of 8,102 patients and a CER of 0.3%, I calculated that the "true" annualised CHD-CER could be anywhere between 0.2% and 0.44% (with a 95% degree of confidence). Even if the "true" annualised CHD-CER is 0.3% on average (over a time period of 6+ years), there is no guarantee that the annualised CHD-CER will be exactly 0.3% in each year of the WHI study's time period of 6+ years. There is a significant probability that the yearly CHD-CER could vary from a low value of 0.2% to a high value of 0.44%, depending on the degree of year-by-year variance existing in that sample of ~8,000 patients. Table 1 shows the annualised CHD event rate in the placebo control patients and HRT patients in the WHI study.
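The text does not say which confidence interval calculator was used. As a rough check, the sketch below assumes that the 0.3% figure can be treated as a simple proportion observed among 8,102 patients (a simplification of annualised person-time data) and applies the Wilson score method, which gives approximately the quoted range. The function name `wilson_interval` is introduced here purely for illustration.

```python
import math

def wilson_interval(p_hat, n, z=1.959964):
    """Wilson score interval for a proportion p_hat observed among n subjects."""
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Figures quoted in the text: 8,102 placebo patients and a CER of 0.3% per year.
# Treating the annualised rate as a plain proportion is an assumption of this sketch.
low, high = wilson_interval(0.003, 8102)
print(f"95% CI: {low:.2%} to {high:.2%}")  # prints "95% CI: 0.20% to 0.44%"
```

A simple normal-approximation (Wald) interval would give a slightly lower range for the same inputs, so the choice of method matters a little when the event rate is this small.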

 

Table 1: Annualised CHD event rate in the WHI study (copy of table from reference [1]).

 

| Year of study (number of participants) | HRT therapy: number of patients with CHD event (annualized percentage) | Placebo therapy: number of patients with CHD event (annualized percentage) | Hazard ratio (95%CI) |
|---|---|---|---|
| Year 1 (8435 treated, 8050 placebo) | 43 (0.51%) | 23 (0.29%) | 1.78 (1.08-2.96) |
| Year 2 (8353 treated, 7980 placebo) | 36 (0.43%) | 30 (0.38%) | 1.15 (0.71-1.86) |
| Year 3 (8268 treated, 7888 placebo) | 20 (0.24%) | 18 (0.23%) | 1.06 (0.56-2.00) |
| Year 4 (7926 treated, 7562 placebo) | 25 (0.32%) | 24 (0.32%) | 0.99 (0.57-1.74) |
| Year 5 (5964 treated, 5566 placebo) | 23 (0.39%) | 9 (0.16%) | 2.38 (1.10-5.15) |
| Year 6 plus (5129 treated, 4243 placebo) | 17 (0.33%) | 18 (0.42%) | 0.78 (0.40-1.51) |

 

Note that HRT was apparently harmful only in year 1, when the HRT patients had an annualised CHD event rate (0.51%) that was slightly higher than the "expected" annualised CHD event rate of 0.2-0.44%, and in year 5, when the placebo patients had an unexpectedly low annualised CHD control event rate of 0.16%. HRT was not shown to be harmful in years 2, 3, 4 and 6+, when both the HRT-treated and placebo control patients had an annualised CHD event rate within the "expected" 95%CI range of 0.2-0.44%. The calculated HR results for year 1 (HR 1.78) and year 5 (HR 2.38) could therefore be a reflection of chance events, and they do not necessarily imply a causal relationship between HRT exposure and an increased risk of CHD events [4].
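The year-by-year comparison above can be sketched from the counts in Table 1. The code below is an approximation under stated assumptions: it uses a crude risk ratio with a log-normal (Katz) interval rather than the hazard ratio model used by the trialists, and it treats the number of participants at the start of each year as the denominator. The names `crude_ratio_ci` and `outside` are hypothetical helpers, not anything from the WHI paper.

```python
import math

# (year, HRT events, HRT participants, placebo events, placebo participants)
# copied from Table 1
TABLE_1 = [
    ("1", 43, 8435, 23, 8050),
    ("2", 36, 8353, 30, 7980),
    ("3", 20, 8268, 18, 7888),
    ("4", 25, 7926, 24, 7562),
    ("5", 23, 5964, 9, 5566),
    ("6+", 17, 5129, 18, 4243),
]
EXPECTED_RANGE = (0.0020, 0.0044)  # the 0.2-0.44% range quoted in the text

def crude_ratio_ci(a, n1, b, n0, z=1.959964):
    """Crude risk ratio with a log-normal (Katz) 95% interval."""
    ratio = (a / n1) / (b / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)

def outside(rate, bounds=EXPECTED_RANGE):
    """True when a yearly event rate falls outside the expected range."""
    return rate < bounds[0] or rate > bounds[1]

for year, a, n1, b, n0 in TABLE_1:
    ratio, low, high = crude_ratio_ci(a, n1, b, n0)
    notes = []
    if outside(a / n1):
        notes.append("HRT rate outside expected range")
    if outside(b / n0):
        notes.append("placebo rate outside expected range")
    print(f"Year {year}: ratio {ratio:.2f} ({low:.2f}-{high:.2f}) {'; '.join(notes)}")
```

Under these assumptions the crude ratios for years 1 and 5 land close to the tabulated hazard ratios, and only the year 1 HRT rate and the year 5 placebo rate are flagged as falling outside the 0.2-0.44% range.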

 

How does one demonstrate a causal relationship in a low control event rate RCT?

 

A causal relationship judgement, based on the results of a low control event rate RCT, is an inferential judgement based on the difference in outcome event rates between exposed and control patients (who are automatically, but often incorrectly, presumed to have exactly the same baseline risk of an outcome event as the exposed patients). This inferential judgement would be more solid, and carry greater evidentiary weight, if the causal relationship occurred consistently throughout the RCT's study period. For example, many statin trials have been performed, and it has been difficult to demonstrate that statins decrease the CHD mortality rate in patients with no prior history of CHD, because each individual statin trial has been significantly underpowered (sample size too small relative to the low annualised CHD mortality CER of <0.6%). Meta-analysts have therefore pooled the results from all the statin trials in an attempt to demonstrate more convincingly that statins reduce the future risk of a CHD-induced death in patients with no prior history of coronary heart disease [5]; see the annualised results in Table 2.

Table 2: Annualised CHD mortality event rate from meta-analysis of statin trials (derived from reference [6]).

 

| Year of study (number of participants) | Statin therapy: number of patients with CHD mortality event (annualized percentage) | Placebo therapy: number of patients with CHD mortality event (annualized percentage) | Hazard ratio (95%CI) |
|---|---|---|---|
| Year 1 (24051 treated, 23962 placebo) | 77 (0.32%) | 101 (0.42%) | 0.75 (0.56-1.01) |
| Year 2 (23743 treated, 23648 placebo) | 84 (0.35%) | 105 (0.44%) | 0.80 (0.60-1.06) |
| Year 3 (22722 treated, 22647 placebo) | 95 (0.42%) | 98 (0.43%) | 0.97 (0.73-1.28) |
| Year 4 (18082 treated, 17984 placebo) | 77 (0.43%) | 114 (0.63%) | 0.67 (0.50-0.90) |
| Year 5 (17194 treated, 17073 placebo) | 99 (0.58%) | 101 (0.59%) | 0.97 (0.74-1.28) |
| All years (averaged) (21158 treated, 21062 placebo) | 432 (2.04%) | 519 (2.46%) | 0.83 (0.73-0.94) |

 

Table 2 shows that statins appreciably decreased the annualised risk of a CHD death in only three of the five years of the pooled study period (years 1, 2 and 4), and that the 95%CI excluded an HR of 1.0 only in year 4. The relatively wide 95%CI surrounding each point estimate HR expresses the degree of uncertainty regarding the size of the risk reduction, and one cannot be certain that statins offer a clinically significant benefit in any particular year (or overall) if the "true" CHD mortality HR has to be lower than an arbitrarily selected minimum benefit threshold HR of 0.85-0.90. Consequently, it would require a much larger pooled sample size to generate a point estimate HR whose entire 95%CI lies below an HR of 0.85-0.90, and thereby prove, with a great degree of scientific conclusiveness, that statins provide a clinically significant CHD mortality benefit. This example demonstrates how difficult it is for clinical researchers to generate scientifically conclusive results from RCTs that have a low baseline CER, even when they pool the results from multiple studies to generate a larger sample size.
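To give a rough sense of "much larger", the sketch below takes the "All years" row of Table 2 and asks by what factor the pooled data would have to grow for the upper 95% bound to fall below a chosen threshold. It rests on several assumptions that are not in the source: a crude risk ratio with a log-normal interval stands in for the reported HR, and the point estimate and event rates are assumed to stay exactly the same as the sample grows. The function names are hypothetical.

```python
import math

Z = 1.959964  # two-sided 95% normal quantile

def crude_ratio(a, n1, b, n0):
    """Crude risk ratio and the variance of its logarithm."""
    ratio = (a / n1) / (b / n0)
    var_log = 1 / a - 1 / n1 + 1 / b - 1 / n0
    return ratio, var_log

def scale_needed(ratio, var_log, threshold):
    """Factor by which events and participants must grow so that the upper
    95% bound drops below `threshold`, assuming an unchanged point estimate.
    Scaling every count by k divides the log-variance by k."""
    max_se = (math.log(threshold) - math.log(ratio)) / Z
    return var_log / max_se**2

# "All years" row of Table 2: 432/21158 on statins versus 519/21062 on placebo
ratio, var_log = crude_ratio(432, 21158, 519, 21062)
upper = ratio * math.exp(Z * math.sqrt(var_log))
print(f"crude ratio {ratio:.2f}, upper 95% bound {upper:.2f}")

for threshold in (0.90, 0.85):
    factor = scale_needed(ratio, var_log, threshold)
    print(f"upper bound below {threshold:.2f}: roughly {factor:.1f} times the pooled size")
```

Under these assumptions the pooled data would need to be a little more than twice as large to clear a threshold of 0.90, and more than twenty times as large to clear 0.85, because the point estimate of 0.83 sits so close to the stricter threshold.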
 

 

Example of another low control event rate RCT that generated scientifically inconclusive results:

 

It is impossible to conclude, with a great deal of confidence, that HRT increased the risk of a CHD event in the WHI study, because the study was handicapped by i) a low control event rate, ii) a sample size that was too small relative to the size of the control event rate, and iii) a year-by-year inconsistency in HR values that suggests chance events rather than a causal relationship.

It is even more surprising that people readily conclude that the APPROVe trial [7] decisively demonstrated that rofecoxib increases the risk of cardiovascular (CV) events, considering that the low control event rate APPROVe study had an even smaller sample size and showed inexplicable event patterns in the placebo group during the latter half of the study period (see Table 3).

 

Table 3: Confirmed thrombotic cardiovascular events in the APPROVe study (from reference [8])  

 

| Time interval | Rofecoxib 25mg (N=1287): number at risk | Rofecoxib: events/patient years | Rofecoxib: rate (95%CI)* | Placebo (N=1299): number at risk | Placebo: events/patient years | Placebo: rate (95%CI)* | Relative risk (95%CI) |
|---|---|---|---|---|---|---|---|
| 0-6 months | 1287 | 7/602 | 1.16 (0.47-2.40) | 1299 | 5/622 | 0.80 (0.26-1.88) | 1.45 (0.40-5.78) |
| 6-12 months | 1129 | 5/544 | 0.92 (0.30-2.14) | 1195 | 7/586 | 1.20 (0.48-2.46) | 0.77 (0.19-2.81) |
| 12-18 months | 1057 | 10/510 | 1.96 (0.94-3.61) | 1156 | 8/558 | 1.43 (0.62-2.82) | 1.37 (0.49-3.99) |
| 18-24 months | 989 | 7/481 | 1.46 (0.59-3.0) | 1079 | 3/531 | 0.57 (0.12-1.65) | 2.58 (0.59-15.43) |
| 24-30 months | 938 | 6/456 | 1.32 (0.48-2.86) | 1042 | 2/510 | 0.39 (0.05-1.42) | 3.35 (0.60-33.97) |
| >30 months | 896 | 11/466 | 2.36 (1.18-4.23) | 1001 | 1/521 | 0.19 (0.00-1.07) | 12.30 (1.79-529.46) |

* Events per 100 patient years

 

Note that the RR point estimate was markedly elevated only in the second half of the APPROVe trial's study period (from 18 months onward), that the 95%CI excluded an RR of 1.0 only in the final interval (>30 months), and that the wide 95%CI surrounding each point estimate RR reflects the small number of events and patients at risk. The 95%CIs are so wide that the study's results can only be deemed scientifically inconclusive. Also note that the increased RR values found in the second half of the study period were mainly due to an unexplained, and pathophysiologically implausible, marked reduction in the number of cardiovascular events in the placebo group, rather than to a significantly increased number of cardiovascular events in the rofecoxib group. Chance events could have played a significant role in this underpowered study.
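The pattern described above can be checked directly from the event counts and patient-years in Table 3. The sketch below only recomputes the rates per 100 patient-years and their ratios; it does not attempt to reproduce the published confidence intervals, whose method is not stated here. The helper name `rate_per_100` is an illustrative assumption.

```python
# (interval, rofecoxib events, rofecoxib patient-years,
#  placebo events, placebo patient-years), copied from Table 3
TABLE_3 = [
    ("0-6 months", 7, 602, 5, 622),
    ("6-12 months", 5, 544, 7, 586),
    ("12-18 months", 10, 510, 8, 558),
    ("18-24 months", 7, 481, 3, 531),
    ("24-30 months", 6, 456, 2, 510),
    (">30 months", 11, 466, 1, 521),
]

def rate_per_100(events, patient_years):
    """Event rate expressed per 100 patient-years."""
    return 100 * events / patient_years

for interval, r_events, r_py, p_events, p_py in TABLE_3:
    r_rate = rate_per_100(r_events, r_py)
    p_rate = rate_per_100(p_events, p_py)
    print(f"{interval:>12}: rofecoxib {r_rate:.2f}, placebo {p_rate:.2f}, "
          f"ratio {r_rate / p_rate:.2f}")

rofecoxib_total = sum(row[1] for row in TABLE_3)
placebo_total = sum(row[3] for row in TABLE_3)
print(f"total events: rofecoxib {rofecoxib_total}, placebo {placebo_total}")
```

The placebo counts run 5, 7, 8, 3, 2, 1 across the six intervals, so each ratio in the second half of the trial rests on three or fewer placebo events, which is why a difference of one or two events moves the ratio so much.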

 

Conclusion:

 

The WHI study had too low an annualised control event rate, and too small a sample size, to produce scientifically conclusive results with respect to HRT-induced CHD events. The same admonition applies to the WHI study's conclusions regarding an HRT-induced risk of other disease entities, which also had very low annualised control event rates (e.g. the annualised CER of breast cancer, endometrial cancer, colorectal cancer, stroke and hip fractures was 0.3%, 0.06%, 0.16%, 0.21% and 0.15% respectively).

Low control event rate trials cannot produce scientifically conclusive results unless the sample size is unrealistically large, and it is questionable whether it is ethically acceptable to enrol trial participants in an RCT that is predestined to produce a scientifically inconclusive result.

 

Jeff Mann, MD.

Retired physician.

jmannemg@earthlink.net

Date of first draft: December 2005.

 

References:

 

1. Writing Group for the Women's Health Initiative Investigators. Risks and Benefits of Estrogen Plus Progestin in Healthy Postmenopausal Women: Principal Results From the Women's Health Initiative Randomized Controlled Trial. JAMA. 288(3):321-333, July 17, 2002.

2. Yusuf, Salim. Anand, Sonia. Hormone replacement therapy: a time for pause. CMAJ. 167(4):357-359, August 20, 2002.

3. Mann J. Can small RCTs produce results that are clinically significant and scientifically conclusive?

Available at http://jeffmann.net/soapbox/Statistics-smallsampleRCTs.htm

Adobe pdf version available at http://jeffmann.net/soapbox/Statistics-smallsampleRCTs.pdf

4. Mann J. Quantifying the potential magnitude of chance event noise in randomised controlled trials.

Available at http://jeffmann.net/soapbox/chanceevents.htm

Adobe pdf version available at http://jeffmann.net/soapbox/chanceevents.pdf

5. Cholesterol Treatment Trialists' (CTT) Collaborators. Efficacy and safety of cholesterol-lowering treatment: prospective meta-analysis of data from 90 056 participants in 14 randomised trials of statins. Lancet 2005 Oct 8;366(9493):1267-78. Epub 2005 Sep 27.

6. Webfigure 1i: Absolute effects on CHD death per mmol/L LDL cholesterol reduction for those with and without prior MI/CHD. From reference number 5 (online version offering extra web material).

7. Bresalier RS, Sandler RS, Quan H, Bolognese JA, Oxenius B, Horgan K, Lines C, Riddell R, Morton D, Lanas A, Konstam MA, Baron JA. Cardiovascular Events Associated with Rofecoxib in a Colorectal Adenoma Chemoprevention Trial. NEJM March 17th 2005. Vol 352. p1092-1102.

8. FDA public website. CDER Meeting Documents. Arthritis Drug Advisory Committee. February 16-18, 2005 Joint Meeting with the Drug Safety and Risk Management Advisory Committee. Available at http://www.fda.gov/ohrms/dockets/ac/cder05.html