Ethics and Statistics
L. Michael Glodé, MD
04 Jun 2012 11:13 AM
Since I have no formal training in either of these disciplines, it is possible (or even likely) that I will say things that offend both camps. In three of the presentations at ASCO this week in which I have great professional interest (prostate cancer), questions have arisen that bring both disciplines into play. In one case, a study was closed when the DSMB's pre-planned interim analysis showed a very low p-value (something like 0.0003). A discussant walked through the complex math surrounding "spending alpha" and boundary crossing, and as I understood it, did not agree with this decision. In another presentation, a conclusion was said to be NOT "not inferior," and the presenter and discussant were caught up in defending their conclusion that one treatment should be considered "superior." In a third presentation, the conclusions of several committees were overturned by the FDA using a different way of analyzing the data.
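For readers who, like me, find the "spending alpha" machinery opaque, here is a minimal sketch of one common version of it, the Lan-DeMets O'Brien-Fleming-type spending function, in plain Python. The halfway-look timing and the 0.05 overall alpha below are my illustrative assumptions, not details from the actual trial or the discussant's analysis:

```python
import math

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_quantile(p):
    """Inverse standard normal CDF by bisection (plenty accurate here)."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def obf_alpha_spent(t, alpha=0.05):
    """Cumulative two-sided alpha 'spent' by information fraction t
    under the Lan-DeMets O'Brien-Fleming-type spending function:
        alpha*(t) = 2 * (1 - Phi(z_{1 - alpha/2} / sqrt(t)))
    By t = 1 (trial fully accrued) the whole alpha has been spent.
    """
    z = z_quantile(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - phi(z / math.sqrt(t)))

# A first interim look halfway through the trial leaves only a tiny
# sliver of alpha available, so the nominal p-value required to stop
# early is far stricter than 0.05.
print(obf_alpha_spent(0.5))  # on the order of 0.005
print(obf_alpha_spent(1.0))  # the full 0.05
```

On this (assumed) rule, a p-value near 0.0003 at a halfway look would sit well inside the early-stopping boundary; whether that makes stopping wise, and which spending function was the right one to use, is exactly what the discussant was debating.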
In all instances, both the presenters and discussants were cordial, thoughtful, and I believe really did keep the patients' best interests in mind. (Perhaps this could be a lesson for our politicians who hopefully keep our country's best interests in mind...) However, one wonders how best to marry math with sociology and biology, no doubt a daily challenge for the FDA and other decision makers.
We all recognize that patients also have a stake in decisions of these sorts. In their case, however, strong emotions may outweigh statistical considerations, just as financial desires can clearly impact a pharmaceutical company's point of view. In some well-publicized incidents over the past three decades, there have been direct conflicts among patients, pharma, and the FDA in working through these thorny issues. I have read a few ethics articles on such issues, but in some ways it surprises me that we are still dealing with them. Perhaps the answer would be to let the p-value for significance be a range, say "0.05-0.08," more accurately reflecting biology as we know it rather than reducing it to a binary function. It wouldn't work for grinding a mirror to place in the Hubble space telescope, but it might be helpful for those of us in medicine.
Comments (4)
Wednesday, June 06, 2012 6:36 AM
Dear Michael: I agree with your take on this and the "ethics" of statistics, and I am with you: to date, I am still unsure whether not being "not inferior" is the same as being "superior," being "equivalent," or neither.
Perhaps the biggest challenge, though, is something you alluded to in the concept of "statistical significance." In a recent presentation of a non-inferiority study, the "experimental" regimen was deemed statistically not inferior, yet compared with the control regimen it produced a median survival that was 6 months shorter. When asked about clinical versus statistical significance, the presenter stressed the statistical and did not comment on the clinical significance. I believe (as you allude to) he was trying to be balanced, but I think he reflected the conundrum we all face when trying to interpret statistics clinically.
How best to marry clinical and statistical significance is left to those talented enough to design the models. I am waiting to see what they come up with!
Wednesday, June 06, 2012 11:17 AM
Michael: I think the exit polling from yesterday's gubernatorial recall election in Wisconsin might help in understanding the early stopping problem. When the sample size is small (exit poll sample), the chance that it is not representative of the entire population is increased. Therefore, to reliably assure that the predicted outcome is real for the entire population, the difference between the arms needs to be larger than it would be if you waited to count all of the votes. CNN found out that using a small sample of the "study" population could lead to the wrong conclusion. The same is a possibility for the study that was stopped before reaching the O'Brien-Fleming boundary established for early analysis. Perhaps the statistics of study interpretation have gotten so complicated that we, as consumers, have to trust the statisticians since we don't understand how they developed the statistics they recommend. I guess it is a little like being a patient! Hopefully we do a better job explaining our recommendations to our patients.
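The exit-poll analogy can be made concrete with a short simulation in plain Python. The 53/47 vote split and the poll sizes below are invented numbers for illustration, not actual Wisconsin data:

```python
import random

def wrong_call_rate(true_share=0.53, n=200, polls=1000, seed=1):
    """Fraction of simulated exit polls of size n that 'call' the
    election for the trailing candidate, when the leading candidate's
    true share of the full vote is true_share."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(polls):
        # Draw n voters at random from the full electorate.
        votes_for_leader = sum(1 for _ in range(n) if rng.random() < true_share)
        if votes_for_leader <= n / 2:
            wrong += 1
    return wrong / polls

# A small poll calls the race wrong alarmingly often;
# a large one almost never does.
print(wrong_call_rate(n=200))    # on the order of 20%
print(wrong_call_rate(n=5000))   # essentially zero
```

The same logic drives interim-analysis boundaries: with only part of the data counted, the observed difference between arms has to clear a much higher bar before we believe it.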
Wednesday, June 06, 2012 11:20 AM
Statistics should inform us about clinical utility.
Sometimes, in trial design, execution, and interpretation, some of that is missed.
The discussions initiated at ASCO12 provided a good forum for asking these questions, online and at future meetings.
I was surprised at the overflowing rooms on clinical trial design (Bayesian methods, etc.).
There is clearly great interest in trial design.
Sunday, June 10, 2012 3:16 PM
I agree that proper clinical trial design and interpretation are critical. I think pharma trials are frequently designed to push toward a positive result; the use of multiple subgroup comparisons and excessive interim analyses are good examples. As clinicians, it is helpful to be wary of non-significant results.
I think guidelines should be based upon the best data possible. If we accepted p values of 0.10, then 10% of trials of truly ineffective treatments would be positive by chance.
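That 10% figure can be checked directly. A quick simulation, using a two-sample z-test on normal data as my assumed stand-in for a generic two-arm trial with no true treatment effect, shows the false-positive rate tracking whatever threshold we choose:

```python
import math
import random

def false_positive_rate(alpha=0.10, n=100, trials=2000, seed=7):
    """Share of simulated trials with NO true treatment effect that
    still come out 'positive' (two-sided p < alpha), using a
    two-sample z-test with known unit variance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both arms drawn from the same distribution: any 'effect' is noise.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(a) / n - sum(b) / n) / math.sqrt(2.0 / n)
        p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
        if p < alpha:
            hits += 1
    return hits / trials

# The false-positive rate is simply the threshold we agreed to tolerate.
print(false_positive_rate(alpha=0.10))  # near 0.10
print(false_positive_rate(alpha=0.05))  # near 0.05
```

This is the other side of the original post's suggestion of a 0.05-0.08 range: a looser threshold buys flexibility at the direct cost of more chance findings.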