The Xigris affair: Searching for a new paradigm in the world of evidence-based medicine 

Warning to readers: This exploratory essay is not focused on determining whether Xigris actually works in the treatment of sepsis. I am not an infectious diseases expert, nor am I sufficiently knowledgeable about clinical trial design, logistics and statistics, to be able to answer that specific question. I am writing about the "Xigris affair" primarily to demonstrate why I think we need to change the way we collect and evaluate scientific evidence before we use that evidence to alter our clinical practice of medicine.

A recent advance in drug therapy for sepsis was formally approved in November 2001, when the FDA gave the pharmaceutical company Eli Lilly and Company approval to market the drug Xigris (recombinant human activated protein C) in the USA. According to Thomas Burton, writing in the Wall Street Journal, the company hopes that sales of Xigris may rival those of its number 1 blockbuster drug, Prozac. The drug is regarded by some people as a breakthrough in the treatment of sepsis that does not respond promptly to conventional antibiotic therapy. A major milestone in the treatment of septic patients was the discovery of antibiotics, which markedly decreased the mortality rate of septic patients. However, even the most potent combinations of modern-day antibiotic therapy cannot guarantee success, and the mortality rate of sepsis still hovers around the 30% mark. Drug companies have searched for new therapeutic modalities to treat antibiotic-resistant sepsis, and many novel therapies have been developed over the past decade. However, they have all failed to advance the cure of sepsis in the "real" world's true testing arena - the community hospital ICU. Recombinant human activated protein C (rhAPC), the latest drug therapy for the treatment of sepsis, can reputedly reduce the mortality rate of septic patients by 20% (relative risk reduction), and many critical care physicians already regard rhAPC therapy as the new "standard of care" for septic patients. However, the new drug is expected to come with an astronomical price tag, and Thomas Burton quoted Raulo S. Frier, vice president of clinical services at Express Scripts Inc., a big pharmacy-benefits manager, as saying that "a lot of pharmacy directors are going to be hyperventilating over the cost". Nevertheless, he also concluded that physicians will be under severe pressure to use the drug in the USA because of a fear of lawsuits if they do not use it in the appropriate circumstances.

What are the appropriate circumstances for rhAPC use, and how were they determined?

Consider the evidence-based medical literature support for the use of rhAPC in the therapy of sepsis. The landmark journal article on rhAPC was written by one of Lilly's lead clinical investigators, Gordon R. Bernard, and published in the March 8th 2001 issue of the New England Journal of Medicine.


* Efficacy and safety of recombinant human activated protein C for severe sepsis.

Bernard GR, Vincent JL, Laterre PF, LaRosa SP, Dhainaut JF, Lopez-Rodriguez A, Steingrub JS, Garber GE, Helterbrand JD, Ely EW, Fisher CJ Jr; Recombinant human protein C Worldwide Evaluation in Severe Sepsis (PROWESS) study group.

Division of Allergy, Pulmonary and Critical Care Medicine, Vanderbilt University School of Medicine, Nashville, TN 37232, USA. gordon.bernard@mcmail.vanderbilt.edu

BACKGROUND: Drotrecogin alfa (activated), or recombinant human activated protein C, has antithrombotic, antiinflammatory, and profibrinolytic properties. In a previous study, drotrecogin alfa activated produced dose-dependent reductions in the levels of markers of coagulation and  inflammation in patients with severe sepsis. In this phase 3 trial, we assessed whether treatment with drotrecogin alfa activated reduced the rate of death from any cause among patients with severe sepsis. METHODS: We conducted a randomized, double-blind, placebo-controlled, multicenter trial. Patients with systemic inflammation and organ failure due to acute infection were enrolled and  assigned to receive an intravenous infusion of either placebo or drotrecogin alfa activated (24 microg per kilogram of body weight per hour) for a total duration of 96 hours. The prospectively defined primary end point was death from any cause and was assessed 28 days after the start of the infusion. Patients were monitored for adverse events; changes in vital signs, laboratory variables, and the results of microbiologic cultures; and the development of neutralizing antibodies against activated protein C. RESULTS: A total of 1690 randomized patients were treated (840 in the  placebo group and 850 in the drotrecogin alfa activated group). The mortality rate was 30.8 percent in the placebo group and 24.7 percent in the drotrecogin alfa activated group. On the basis of the prospectively defined primary analysis, treatment with drotrecogin alfa activated was associated with a reduction in the relative risk of death of 19.4 percent (95 percent confidence interval, 6.6 to 30.5) and an absolute reduction in the risk of death of 6.1 percent (P=0.005). The incidence of serious bleeding was higher in the drotrecogin alfa activated group than in the placebo group (3.5 percent vs. 2.0 percent, P=0.06). 
CONCLUSIONS: Treatment with drotrecogin alfa activated significantly reduces mortality in patients with severe sepsis and may be associated with an increased risk of bleeding.
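
The headline figures in the abstract can be checked with simple arithmetic. The following Python sketch recomputes the absolute risk reduction, a crude relative risk reduction, and the number needed to treat from the two reported mortality rates. The function name `risk_summary` is hypothetical, and the crude relative risk reduction is derived from the unadjusted rates, so it need not match the 19.4 percent figure exactly, which came from the trial's prospectively defined primary analysis.

```python
# Illustrative arithmetic only; the inputs are the 28-day mortality rates
# reported in the PROWESS abstract (30.8% placebo, 24.7% treated).

def risk_summary(control_rate, treated_rate):
    """Return (ARR, crude RRR, NNT) for two event rates given as fractions."""
    arr = control_rate - treated_rate   # absolute risk reduction
    rrr = arr / control_rate            # crude (unadjusted) relative risk reduction
    nnt = 1.0 / arr                     # number needed to treat
    return arr, rrr, nnt


arr, rrr, nnt = risk_summary(control_rate=0.308, treated_rate=0.247)
print(f"ARR       = {arr:.1%}")
print(f"crude RRR = {rrr:.1%}")
print(f"NNT       = {nnt:.1f}")
```

An absolute reduction of 6.1 percentage points corresponds to roughly 16 patients treated for each additional survivor at day 28, which is the figure quoted in the article's discussion section.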


In that NEJM article, the author concludes:

"In this study, the administration of drotrecogin alfa activated reduced the rate of death from any cause at 28 days in patients with a clinical diagnosis of severe sepsis, resulting in a 19.4 percent reduction in the relative risk of death and an absolute risk reduction of 6.1 percent. A survival benefit was evident throughout the 28-day study period, whether or not the groups were stratified according to the severity of disease. Our results indicate that in this population, 1 additional life would be saved for every 16 patients treated with drotrecogin alfa activated."
The evidence certainly seems to be clear to the author of the NEJM article, and he appears to believe that rhAPC is routinely indicated for the treatment of severe sepsis, irrespective of disease severity. Further along in the discussion section of the article, he writes:
"A consistent effect of treatment was seen among the groups examined, including those stratified according to age, APACHE II score, sex, number of dysfunctional organs or systems, type of infection (gram-positive, gram-negative, or mixed), site of infection, and presence of protein C deficiency at study entry. Reductions in relative risk of death were observed regardless of whether the patients had a deficiency of protein C at baseline, suggesting that drotrecogin alfa activated has pharmacological effects that go beyond simple physiologic replacement of activated protein C. This observation further suggests that measurements of protein C are not necessary to identify which patients would benefit from treatment with drotrecogin alfa activated. A consistent treatment effect was also observed regardless of the site of infection or the type of infection."
Clinicians are constantly admonished not to base their clinical practice on anecdotal experience, and they are constantly encouraged to seek evidence-based literature support to undergird their clinical practice of medicine. A clinical practitioner who uses his evidence-based medicine skills to parse that NEJM article on rhAPC therapy would find little reason to doubt the accuracy of the author's statements, and little reason to suspect that the quality of the research evidence supporting the use of rhAPC in routine community practice is actually very controversial, and possibly unsound. Where should a clinical practitioner turn for an alternative point of view, or a contrary opinion regarding the quality, or clinical significance, of an RCT? In its July 19th 2001 issue, the NEJM published a few letters from correspondents who questioned the validity of the PROWESS study group's trial. However, space limitations and strict editorial control by the NEJM limit its value as a forum for vigorous debate and intelligent discussion. What other sources of information does a clinical practitioner have for elucidating the scientific truth about the value of rhAPC therapy for sepsis? I think that there is no global medium or forum that enables an interested clinician to rigorously evaluate controversial EBM literature issues from a variety of angles, and I think that the medical profession needs to find a better method that will allow an EBM practitioner to consider controversial RCT results from a multiplicity of viewpoints.

Consider the PROWESS study group's trial of rhAPC therapy for sepsis in greater detail.

The PROWESS trial consisted of 1,690 septic patients - 840 in the placebo group and 850 in the rhAPC group.

Patients were eligible for the trial if they had a known or suspected infection on the basis of clinical data at the time of screening and if they met the following criteria within a 24-hour period: three or more signs of systemic  inflammation and the sepsis-induced dysfunction of at least one organ or system that lasted no longer than 24 hours. Patients had to begin treatment within 24 hours after they met the inclusion criteria. Patients were followed for 28 days after the start of the infusion or until death.

The trial was designed to enroll 2280 patients; two planned interim analyses by an independent data and safety monitoring board occurred after 760 and 1520 patients had been enrolled. Statistical guidelines to suspend  enrollment if rhAPC was found to be significantly more efficacious than placebo were determined a priori, and used the O'Brien-Fleming spending function according to the method of Lan and DeMets.
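
To give a feel for how strict such a stopping guideline is at early looks, the sketch below evaluates the O'Brien-Fleming-type alpha spending function of Lan and DeMets, alpha(t) = 2 - 2*Phi(z / sqrt(t)), at the planned interim analyses. Two things here are assumptions that do not come from the trial report: an overall two-sided alpha of 0.05, and the use of enrolled patients divided by planned enrollment as the information fraction t. The function name `obf_alpha_spent` is hypothetical, and the output is the cumulative alpha spent, not the trial's actual stopping boundaries, which require further numerical work.

```python
from statistics import NormalDist


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent by the Lan-DeMets
    O'Brien-Fleming-type spending function at information fraction t."""
    if not 0.0 < info_fraction <= 1.0:
        raise ValueError("info_fraction must be in (0, 1]")
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)  # e.g. about 1.96 for alpha = 0.05
    return 2.0 - 2.0 * std_normal.cdf(z / info_fraction ** 0.5)


PLANNED_ENROLLMENT = 2280           # planned size, from the trial design
LOOKS = [760, 1520, 2280]           # two interim analyses plus the final one

for enrolled in LOOKS:
    # ASSUMPTION: enrollment fraction stands in for the information fraction.
    t = enrolled / PLANNED_ENROLLMENT
    print(f"{enrolled:>4} patients (t = {t:.2f}): "
          f"cumulative alpha spent = {obf_alpha_spent(t):.5f}")
```

Under these assumptions very little of the overall alpha is available at the first look and only a modest share at the second, which is why a result has to be quite extreme to justify stopping a trial early.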

The trial was initiated in August 1998 and the trial was stopped after the second interim analysis in June 2000 showed that treatment with rhAPC was associated with a lower 28-day mortality rate (24.7% vs 30.8%), which exceeded the a priori guideline for stopping the trial.

The lead clinical investigator's conclusion was that:-

"In this study, the administration of drotrecogin alfa activated reduced the rate of death from any cause at 28 days in patients with a clinical diagnosis of severe sepsis, resulting in a 19.4 percent reduction in the relative risk of death and an absolute risk reduction of 6.1 percent. A survival benefit was evident throughout the 28-day study period, whether or not the groups were stratified according to the severity of disease."
Is the conclusion perfectly valid, or is there any reason to doubt the author's conclusions?

Consider the evidence.

Point number 1

Although the NEJM article's author stated that "a survival benefit was evident throughout the 28-day study period, whether or not the groups were stratified according to the severity of disease", he did not state whether the degree of benefit was equally great for all levels of sepsis severity.

A careful look at the factual evidence demonstrates that the survival benefit is not the same for all septic patients, irrespective of disease severity.

The severity of sepsis is usually graded according to the APACHE II scoring system, and in the PROWESS trial, the mortality rate of patients with lower APACHE II scores was less than the mortality rate of patients with higher APACHE II scores, for both treated and placebo patients.

Mortality as a function of APACHE II scores (slide 23 from the FDA collection - showing the mortality rate plotted against APACHE II scores)

* Note that the mortality rate is even higher in rhAPC-treated patients with very low APACHE II scores of 5-10; that there is no significant difference in the mortality rate in the APACHE II score ranges of 10-15, 15-20, and 20-25; and that there is only clear-cut evidence of a mortality rate benefit for APACHE II scores > 25.

* Note that although there is no mortality rate difference in the subgroup of septic patients with an APACHE II score of 40-45, it is wise not to draw any firm conclusions from a single subgroup analysis, because of the small number of patients found in a single subgroup.

The FDA reviewers, at the FDA's Advisory Board Meeting in October 2001, debated whether they should recommend FDA approval of rhAPC for patients with a lesser severity of sepsis. They finally concluded that there was no evidence to suggest that rhAPC was effective in patients with a lesser severity of sepsis, and that the small bleeding risks associated with rhAPC therapy outweighed the potential benefit in patients with very low APACHE II scores. They also recommended that further studies be performed to determine whether rhAPC is effective in patients with lesser degrees of sepsis.

The drug manufacturer was therefore obliged to note in the drug information insert, that rhAPC has not been shown to work in patients with lower APACHE II scores; and the drug manufacturer was also obliged to perform further studies involving the use of rhAPC in patients with a lesser severity of sepsis.


* Quote from the Xigris drug insert:

"Baseline APACHE II score, as measured in PROWESS, was correlated with risk of death; among patients receiving placebo, those with the lowest APACHE II scores had a 12% mortality rate, while those in the 2nd, 3rd and 4th APACHE quartiles had mortality rates of 26%, 36%, and 49% respectively. The observed mortality difference between Xigris and placebo was limited to the half of patients with higher risk of death i.e., APACHE II score > 25, the 3rd and 4th quartile APACHE II scores. The efficacy of Xigris has not been established in patients with lower risks of death eg. APACHE II score of < 25."

Quote from the FDA's letter of drug approval, mandating further studies:

"To evaluate the efficacy and safety of Drotrecogin alfa (activated) in a study of approximately 11,350 adult patients with severe sepsis and a lower risk of death (eg. APACHE II score of 24 or less). In addition, this trial will evaluate whether low-dose heparin has an effect on the mortality of Drotrecogin alfa (activated) treated patients in this patient population. The protocol will include appropriate neurological evaluation of patients to detect occult neurological events. The final protocol of this study will be submitted to CBER by May 15, 2002, a minimum of 5000 patients will be enrolled by December 1, 2003, patient accrual will be completed by March 1, 2005, and a final study report will be submitted to CBER by June 1, 2005"


Point number 2

The FDA committee members, when reviewing the results of the PROWESS trial, also noted that there was no significant difference in the number of patients sent home from hospital between the rhAPC-treated and placebo patients, despite the small 6.1% difference in mortality between the two groups.

Functional status at day-28 (slide 35 from the FDA collection)

* Note the small 1.7% difference (32.3% - 30.6%) in the number of patients sent home at day-28.

* Note that there was a 2.4% difference (23.5% - 20.9%) in the number of patients still remaining in hospital at day-28, and a 2.4% difference (12.2% - 9.8%) in the number of patients still remaining in the ICU at day-28 - for a total difference of 4.8%.

* Note that although rhAPC therapy resulted in a 6.1% absolute reduction in the mortality rate, those additional survivors did not go home: most of them (4.8%) remained in the hospital, and their further clinical course is unknown (because the study was not designed to examine what happened to patients beyond day-28).

The fact that so many patients remained in the hospital and ICU after 28 days troubled many FDA committee members, who wondered whether the 28-day mortality rate, used as the sole primary outcome measure, was an accurate reflection of the true efficacy of rhAPC therapy; they were also concerned that the subsequent discovery of a relative increase in morbidity and/or delayed mortality in rhAPC patient-survivors would undermine the scientific validity of the trial's conclusion. The drug company representatives at the meeting did not have any information about post-day-28 morbidity and/or mortality events, because obtaining that information was not a specification of the trial's design. However, they stated that they intended to gather that additional information in a subsequent study.

Therefore, the FDA, in its letter of drug approval, mandated the submission of additional data from ongoing studies, and it will certainly be in the public interest if this additional information is eventually acquired, and widely disseminated.


* Quote from the FDA's letter of approval for Xigris

Clinical

13. To submit data from an ongoing study to assess long-term survival outcome. The protocol for ---------- entitled "Long-term follow-up of survivors from the PROWESS trial (an observational study)" was submitted to IND --- on July 2, 2001. Patient follow-up data will be collected by June 15, 2002 and a final study report will be submitted by November 15, 2002. Observed initial ICU-stay mortality rates and initial hospitalization mortality rates will be provided for Drotrecogin alfa (activated) and placebo patients. In addition, ----------- estimates of 90-day, and 180-day, and one-year survival for Drotrecogin alfa (activated) patients and placebo patients will be included.


Point number 3

I think that point number 3, is the major point of concern regarding the PROWESS study group's trial.

In June 1999 (about half-way through the trial), a protocol amendment was submitted to the agency and the trial's inclusion and exclusion criteria were modified.

Some additional exclusion criteria added to the original protocol's list of exclusionary criteria, include:-

(* see the appendix for a more complete list of exclusionary criteria)

The major intended effect of the modification, from the sponsor's viewpoint, was to decrease the likelihood of a patient dying from a non-sepsis related death during the 28-day study period. The sponsors felt that the amendment would decrease the *noise and allow the trial investigators to make a more accurate estimation of the mortality-reducing effect of rhAPC therapy in the treatment of sepsis. The FDA's reviewing committee members were somewhat perplexed by the reputed need for this particular change in the trial's protocol, and they regarded the protocol change as a major confounding variable. Another confounding fact is that the trial investigators changed the drug formulation at roughly the same point in time, so that the drug formulation (BDS2) used in the first half of the trial was different to the drug formulation (BDS2+) used in the second half of the trial. The FDA reviewers may not have regarded these two confounding elements to be of major significance, if the trial results were similar between the two halves of the trial. However, the first-half trial period results showed no significant difference in the mortality rate between the treated and placebo patients, while the second-half trial period results showed a very significant difference.

Mortality rate differences between the original and amended protocol: (slide number 40 from the FDA collection)

* Note that 720 patients were treated under the original protocol, and that the mortality difference between treated and placebo patients was only 2% (30% - 28%).

* Note that 970 patients were treated under the amended protocol, and that the mortality rate difference between treated and placebo patients was 9% (31% - 22%).
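
A rough sense of how differently the two protocol periods behave statistically can be had from a pooled two-proportion z-test, sketched below in Python. This is an illustration under loud assumptions, not a re-analysis: the mortality rates are the rounded slide figures quoted above, the higher rate in each pair is taken to be the placebo arm, and each period is assumed to be split evenly between the two arms (360/360 and 485/485), because per-arm counts are not given here. The function name `two_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(p_control, p_treated, n_control, n_treated):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p_control * n_control + p_treated * n_treated) / (n_control + n_treated)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n_control + 1.0 / n_treated))
    z = (p_control - p_treated) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# ASSUMPTION: even split between arms; rates are rounded slide values.
# Tuple layout: (placebo rate, rhAPC rate, placebo n, rhAPC n)
periods = {
    "original protocol": (0.30, 0.28, 360, 360),
    "amended protocol": (0.31, 0.22, 485, 485),
}

for name, (p_placebo, p_drug, n_placebo, n_drug) in periods.items():
    z, p = two_proportion_z_test(p_placebo, p_drug, n_placebo, n_drug)
    print(f"{name}: z = {z:.2f}, two-sided p = {p:.4f}")
```

With these hypothetical arm sizes, the original-protocol difference falls well short of conventional significance while the amended-protocol difference clears it comfortably, which matches the qualitative contrast between the two halves of the trial.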

The FDA reviewers regarded the two halves of the trial as not being equal - because the results from the first-half of the trial suggested that rhAPC had no significant therapeutic effect, while the second-half results suggested that rhAPC had a definite therapeutic effect.

Some of the FDA reviewers felt that they were dealing with two separate studies, and that it was not acceptable to comingle the disparate results from the two halves of the entire trial.

It may be easier to perceive the true extent of this complex problem when you consider the trial's cumulative 28-day mortality graphs.

Cumulative 28-day mortality rate over time

* Note that the placebo 28-day mortality rate remained relatively constant during the entire second-half of the trial at 31 - 32%.

* Note that the rhAPC mortality rate remained around 27 - 28% during the mid-section of the trial period, and then rapidly declined during the latter part of the second-half of the trial to reach a point of statistical significance at the time of trial termination in June 2000.

* Note that if the trial had been terminated around October 1999, the trial's results would have shown a non-significant difference in mortality between the treated and placebo groups, and the trial's conclusion would have been negative rather than positive.

* Note that during the last few months of the trial (February 2000 - June 2000) the slope of the curve representing the rhAPC-treated group of patients was relatively steep, which suggests that patients recruited during that latter period at the end of the trial exhibited disproportionately greater degrees of response to the drug (for unknown reasons, according to the trial investigators).

What is very interesting is that Lilly's drug company representatives adamantly insist that there was no difference in drug potency between BDS2 and BDS2+, and that differences in drug efficacy could not have accounted for the large difference in results between the two halves of the trial. The company's representatives also could not account for the rapid decline in the 28-day cumulative mortality rate graph of the rhAPC-treated patients during the last part of the second half of the trial, and they felt that it could not be due to the change in the trial protocol's exclusionary criteria, which were only designed to decrease *noise and improve the *signal and *power of the study.

(* see the appendix for an explanation of the concepts of "noise", "signal" and "power", and how they are deliberately altered by clinical trial designers in order to achieve a greater level of confidence in a trial's conclusion)

What do you think caused the significant difference in trial results between the first-half and second-half of the trial?

Do you think that the significant difference in results between the first-half and second-half patients, which could be related to changes made in the trial protocol midway through the trial, automatically invalidates the trial's results?


Here are some comments made by FDA committee members at the FDA Advisory Board meeting:

Dr Fleming: Clearly the study isn't remotely significant in the number of people who are alive out of the hospital because there is only a 1 percent difference. Is there anything known about what the implications are, in essence that what we are doing really is keeping people alive but in the hospital what those implications are?

Dr Cross:  Well, clearly, patients here had satisfied entry criteria for severe sepsis. However, within that broad category, especially with the amendments made, I am not left with a good feeling of knowing whether or not there is a definable population, who would benefit from this drug. Many of the patients who were excluded are precisely those who would be most likely treated in my large, academic center. There are those who are immunocompromised with underlying malignancy, et cetera, et cetera. So, although there were some entry criteria used here, it would not really define how it would be used in my hospital. I still am left with not knowing what were the reasons for the change halfway through the protocol, which resulted in the differences reported.

Dr Fleming: My general philosophy is clinical trials ought to be designed to obtain the relevant real world answers. Hence, we should tend toward inclusive eligibility criteria.....So, I am particularly concerned if, in fact, there is an intention to use these data to justify a label that, in fact, is much more inclusive than what the eligibility criteria were in the trial.

Dr Munford: Yes, I think the criteria did define a population with severe sepsis. However, it was a very narrowly defined population as others have said, may or may not extrapolate to the larger pool of patients with severe sepsis.....Given the disparate results and the two phases of the trial and the fact that there is not an obvious explanation for these differences, I favor further study of this drug.

Dr Suffredini: You know, my concern is that we basically have two studies. We have entry criteria for one, entry criteria for two. There are overlap certainly, but in terms of -- and they define a very sick group of patients, but in terms of generalizability, et cetera, we really are competing one against the other study, basically, and not until the second part of the study where there were many things changed, inclusion criteria, drug change, among other things, I think we have concerns about combining the two studies. So, yes, the criteria defined a sick group of patients, but it is a moving target and I am not sure why the sponsors changed their target midway through the trial. In other words, with the knowledge that they have had with extensive clinical trial experience, why it had to be modified, I don't know the reason. I really don't -- I didn't really understand the rationale behind that.

Dr Reller: It seems to me we have basically two studies here and even though the study design may be appropriate in inclusion of sepsis in both groups, we have heard much that there is nothing to say that they couldn't be combined, but then we are left with quite different outcomes. One of those two still baffles me. If they are really not so different, then why the differences?  I am wondering if the first phase may not be more reflective of what physicians in intensive care units actually have to deal with.

Dr Eichacker: I think this is a very difficult trial to interpret, given the way criteria changed and given the way the drug changed. So, I think it is very important that it continue to be studied.

Dr Wald: I think this is very tough. I just feel that this drug was performing well and it sort of performed better and better as time went along and I am not sure exactly what that reflects, but I do think it is the truth. I think that if we don't approve it, we will be denying people who would benefit from it.

Dr Ebert: I think in an ideal world if we were able to approve this drug with the -- to the patients who were enrolled in this study with the restrictions that we have talked about, as far as restricting it to patients who would be at higher risk of mortality, I think this would be a no-brainer. Unfortunately, my concern is that the ultimate use of this drug might include, first of all, the low risk population and also probably more importantly the patients who were excluded from the study and, therefore, I would vote no.


Conclusion: Are you as perplexed as the FDA reviewers appear to be? The FDA reviewers were infectious disease specialists who had special expertise in the evaluation of clinical trials - and yet, they found this particular trial extremely difficult to evaluate. I think that the FDA reviewers made a sterling effort to understand the repercussions of the trial protocol changes, but the link between the protocol changes and the disparate results is obviously complex and uncertain. If these well-trained experts could not decide whether the trial's results can be "generalized" to the type of patient that was deliberately excluded from active participation in the trial (including patients with immunodeficiency disorders, recent organ transplantation, underlying malignancy, severe co-morbid diseases with a high short-term risk of death, and evidence of organ failure > 24 hours), then how can we expect community medical practitioners, who do not have special expertise in this area, to know how best to use this ultra-expensive drug in clinical practice?

Timothy Begany, in an article called "NEW SEPSIS DRUG PROVES IMPRESSIVE IN PHASE III TRIAL", which was recently published in Pulmonary Reviews, wrote:- "'Drotrecogin alfa (activated) will probably become a standard of care for severe sepsis not too far in the future, unless we see a peculiar toxicity or something else unexpected that did not appear in the clinical trials,' speculated Dr. Opal, who is also the Director of the Infection Control Service at Memorial Hospital of Rhode Island in Pawtucket."

However, Dr Opal failed to consider something else unexpected that did appear in the clinical trial, and I think that Dr Opal should seriously consider whether the following reasons justify a re-evaluation of the PROWESS study group's trial, before he regards rhAPC as the standard of care for the treatment of sepsis.

I think that the PROWESS study group's trial is a poster child example of why we need to change the way that we perform and evaluate randomized controlled clinical trials. I think that EBM practitioners, who only evaluate the results of clinical trials by examining the evidence presented in medical journal articles, are often not discovering the scientific truth - because pertinent information is often hidden from view. I think that EBM practitioners have to routinely take an additional step and examine the trial's raw data to ensure that there is no hidden source of systematic bias that can invalidate the trial's results. At the present time, editors of medical journals do not demand that authors who submit papers on RCTs for publication supply all of the trial's raw data. I think that this is a major problem that needs to be corrected! The editors of medical journals tend to rely on the peer review process to ensure that the quality of an RCT is adequate, but that process will obviously fail to discover the type of problems that were uncovered by the FDA's rigorous review of the PROWESS study group's trial.

I think that there is a better solution to the problem. I realize that my proposed solution is radical, but I think that it could better ensure that RCTs discover the scientific truth.

The proposed solution:

I think that we need to change the whole RCT process from the bottom up, by mandating that all study groups planning an RCT first establish a trial website page prior to the time of trial inception. That website page should provide a comprehensive amount of information about the intended trial. In particular, the official trial website statement should describe the intended purpose of the trial in precise detail. The study group should be obliged to provide a systematic review of the relevant literature as it pertains to their proposed research project, and the official statement should explicate how their proposed research project would advance medical knowledge or solve a particular medical problem. The statement should also tie the purpose of the proposed research project to all previously performed research studies that are relevant to it. The study group should also provide detailed information about the trial's design protocol, its methodology, the statistical techniques that are to be used to interpret the study's results, its data and safety monitoring process, its stopping plans, and how delta figures for clinical significance were derived. The trial website should also provide ongoing information about the trial's progress and timely publication of any interim results, and it should particularly include updated information on drug efficacy, drug adverse effects and drug safety concerns as the trial proceeds. Most importantly, the study group should be obliged to host an "open" online discussion group, refereed by an independent agency, that will enable public citizens to "openly" comment on any aspect of the proposed trial. The study group should be obliged to respond to all questions and criticisms which the independent referee deems especially relevant.

The main purpose of the "open" online discussion group is to allow the public citizenry to comment on, and freely discuss, a number of trial issues - especially whether the trial is conforming to the scientific standards established by the CONSORT statement and the ethical standards established by the ASSERT statement.

If the PROWESS study group had established a website with an "open" online discussion forum, informed public citizens - most likely physicians, statisticians, researchers and other professionals - would have had the opportunity to provide the PROWESS study group with useful advice and/or criticisms and/or feedback regarding:

Some would argue that a trial's raw data is proprietary information belonging to the drug company that sponsored the trial, and that the drug company sponsoring the trial is under no obligation to adopt an "open" stance with respect to the purpose, design and execution of the trial plan. That may be true, but eventually the public is going to have to pay for the drug; and therefore, the public has an extraordinary interest in ensuring that the clinical trial was fairly performed, and that the marketed drug is truly efficacious and safe.

Jeff Mann.

Appendix

FDA briefing information on Xigris

FDA letter of approval for Xigris

Xigris info sheet from the FDA

FDA advisory board meeting transcripts

Summary of exclusion criteria used in the PROWESS study group's trial

(* red highlighted text indicates exclusionary criteria added to the list midway through the trial)

Journal article: "Why randomized controlled trials fail but needn't: 2. Failure to employ physiological statistics, or the only formula a clinician-trialist is ever likely to need (or understand!)" by David Sackett - CMAJ 2001;165(9):1226-37.

(available at http://www.cma.ca/cmaj/vol-165/issue-9/1226.asp)

When reading the FDA's transcripts, I got the distinct impression that a number of FDA reviewer-physicians may not have fully understood the specific terms - *noise, *signal and *power - used by Dr Macias in his explanation of why the PROWESS study group needed to change the trial's exclusionary criteria after the trial had already begun.

If any readers are interested in learning more about those terms, I would recommend that they read David Sackett's article in the CMAJ.

In the following section, I have provided my own simplified interpretation of those terms - as they were described by David Sackett in that CMAJ article, and I have also related those terms to some events in the PROWESS study group's trial.

When a clinical trial investigator designs a clinical trial, he wants to ensure the validity of the trial by maximizing confidence in the trial's conclusion.

In simple terms, a formula can be used to express the confidence level of a trial's conclusion.

Confidence = signal/noise x square root of (sample size)

If expressed in words, the formula states that the confidence in the conclusion of an RCT is the ratio of the magnitude of the signal to the magnitude of the noise, times the square root of the sample size.

Confidence describes how narrow the confidence interval is (the narrower the better) around the effect of treatment, whether expressed as an absolute or relative risk reduction or as some other measure of efficacy.

The signal describes the differences between the effects of the experimental and control treatments.

The noise (or uncertainty) in an RCT is the sum of all the factors - "sources of variation" - that can affect the absolute risk reduction or absolute difference.

The sample size is the number of patients in the trial. The influence of sample size on the confidence in a trial is a function of its square root - if a trial designer wants to cut the confidence interval around a study's absolute risk reduction in half by adding more patients to it, he needs to quadruple the number of recruited patients. That is why trial designers may choose to concentrate their efforts on increasing the signal and decreasing the noise, rather than having to significantly increase the sample size in order to achieve the same effect.
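The square-root relationship described above can be checked with a few lines of Python. This is only a minimal sketch of the simplified formula as it is stated in this appendix; the function name and the input numbers are hypothetical, chosen purely for illustration, and are not taken from the PROWESS trial or from Sackett's article.

```python
import math


def confidence(signal, noise, sample_size):
    """Simplified formula from the text: (signal / noise) * sqrt(sample size)."""
    return (signal / noise) * math.sqrt(sample_size)


# Illustrative (made-up) inputs: a signal of 0.06 against a noise of 1.0.
baseline = confidence(signal=0.06, noise=1.0, sample_size=400)

# Option 1: quadruple the number of recruited patients.
more_patients = confidence(signal=0.06, noise=1.0, sample_size=1600)

# Option 2: keep the same sample size but halve the noise.
less_noise = confidence(signal=0.06, noise=0.5, sample_size=400)

print("gain from quadrupling the sample size:", more_patients / baseline)
print("gain from halving the noise:", less_noise / baseline)
```

Under this formula both options double the confidence, which is why halving the noise (or doubling the signal) is as effective as recruiting four times as many patients.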

Four determinants affect the magnitude of the signal generated in an RCT.

Restricting eligibility to patients who are at higher than average "baseline" risk of outcome events leads to higher "control event rates" (CER) among those receiving placebo or the treatment. Because the absolute risk reduction signal is equal to the product of this control event rate and the relative risk reduction from therapy (ARR = CER x RRR), it follows that, if the relative risk reduction achieved by the experimental treatment is both true and constant over different control event rates, the experimental treatment will generate a larger absolute risk reduction signal when the control event rate is high than when it is low.
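The effect of the control event rate on the signal can be made concrete with a short calculation. The sketch below applies ARR = CER x RRR with a constant relative risk reduction of 20% (the figure reputedly achieved by rhAPC); the three control event rates are hypothetical values chosen only to show the trend, and the function name is my own.

```python
def absolute_risk_reduction(control_event_rate, relative_risk_reduction):
    """ARR = CER x RRR, as stated in the text."""
    return control_event_rate * relative_risk_reduction


# Assumption: the RRR stays constant at 20% whatever the baseline risk.
RRR = 0.20

# Hypothetical low-, medium- and high-risk populations.
for cer in (0.10, 0.30, 0.50):
    arr = absolute_risk_reduction(cer, RRR)
    print(f"CER = {cer:.0%}  ->  ARR = {arr:.1%}")
```

With the same relative risk reduction, the absolute risk reduction rises in direct proportion to the control event rate, so a trial restricted to high-risk patients generates a larger signal.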

The second way that one can increase the ARR signal and the confidence in a positive result is by selectively enrolling highly responsive patients, who are more likely than average to respond to the experimental therapy.

The noise element in a trial is reduced by eliminating or minimizing sources of uncertainty.

Variations in the outcome of study patients can be reduced by making the patients more homogeneous - using the same strategies used to improve signal: assembling patients with similar risks and similar responsiveness, and making experimental and control patients as similar as possible.

Ensuring high compliance and minimizing sloppiness and inconsistency in the ascertainment of outcomes are other ways of reducing noise.

I am not exactly sure why the PROWESS study group's trial investigator decided to change the trial protocol midway through the trial. My guess is that the trial investigator decided to reduce the number of recruited patients who could die a non-sepsis death during the 28-day study period, because a non-sepsis death would not count as a sepsis-related death (control event rate), and that would decrease the *signal of the trial - presuming, of course, that the trial investigators could accurately differentiate between sepsis and non-sepsis deaths. If the trialists could not differentiate sepsis from non-sepsis deaths (which is the more likely scenario, considering that their trial's primary endpoint was defined as death from any cause), then the presence of a large number of non-sepsis deaths would represent *noise, because it would introduce an element of uncertainty into the trial results. A third possible explanation is that, if many patients were experiencing a non-sepsis death in both arms of the study, then there would be fewer patients remaining in the treatment arm who could theoretically respond to the drug treatment (dilution of numbers of potential responders), and therefore the relative difference between the treatment arm and the placebo arm could be decreased (increased *noise and decreased *signal). Perhaps the trial investigators would be willing to make their reasons for changing the trial protocol much more explicit, so that we don't have to guess. Either way, the question is whether the exclusion of patients who were likely to die from a non-sepsis related cause within 28 days really affected the trial's results. I noticed that the number of deaths was only 1 - 2% lower in the second half of the trial (after the trial protocol was changed), and I cannot appreciate how such a small mortality difference increased the trial's signal/noise ratio, and therefore its confidence power.
I personally suspect that the trial protocol change midway through the trial has more likely decreased many community physicians' confidence in the wisdom of the trial investigators, especially when some of those exclusionary criteria eliminated the types of patients commonly treated for sepsis in community ICUs - thereby making it much harder for community physicians to "particularize" the results of the trial to those types of patients.
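The "dilution" guess offered above can be illustrated with a toy calculation. This is my own simplified sketch, not the PROWESS investigators' reasoning: it assumes that sepsis and non-sepsis death rates simply add together, that the drug reduces only sepsis deaths, and that all the rates shown are hypothetical.

```python
def observed_relative_risk_reduction(sepsis_death_rate, other_death_rate, true_rrr):
    """Relative reduction in all-cause mortality under a simple additive model.

    Assumption: the drug lowers only the sepsis death rate (by true_rrr);
    non-sepsis deaths occur at the same rate in both arms.
    """
    control = sepsis_death_rate + other_death_rate
    treated = sepsis_death_rate * (1 - true_rrr) + other_death_rate
    return (control - treated) / control


# Hypothetical rates: 25% sepsis deaths, with or without 5% non-sepsis deaths.
undiluted = observed_relative_risk_reduction(0.25, 0.00, true_rrr=0.20)
diluted = observed_relative_risk_reduction(0.25, 0.05, true_rrr=0.20)

print(f"without non-sepsis deaths: {undiluted:.1%}")
print(f"with non-sepsis deaths:    {diluted:.1%}")
```

In this toy model the non-sepsis deaths shrink the relative difference between the two arms even though the drug's effect on sepsis deaths is unchanged. Whether anything of this kind actually happened in the trial is exactly what the investigators would need to clarify.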

Why the trial investigators decided, in the second half of the trial, to restrict the recruitment of patients with a history of malignancy, organ transplantation, or organ failure of more than 24 hours' duration is not entirely clear to me - I would be interested in suggestions from readers (or trial investigators).

Commentary, criticism and controversy:

Insightful comments from readers will be included in this section.