I know this makes me sound old and is only a step away from “get off my lawn,” but…
It seems like things in clinical research and biostatistics used to be more straightforward. To be comfortable that a difference found in a comparison was real, the goal was to show that the odds of a false result were less than one in twenty (p<0.05). Various studies reported at ASCO 2012 have challenged that notion and have led to some significant discussion. I know Dr. Glodé has also discussed this today. The fact that the two of us have been moved to write about this can only signify that many others are pondering this as well.
A randomized comparison of abiraterone to placebo in chemotherapy-naïve men with prostate cancer was reported (Abstract LBA4518). Co-primary endpoints were radiographic progression-free survival and overall survival. Overall survival was given a pre-specified alpha level of p<0.0008. In other words, comfort with the result did not require that it be wrong fewer than one time in twenty, but rather fewer than one time in 1,250. At an interim analysis, the p-value for overall survival was 0.0097. Looking at the entire spectrum of results, an independent data monitoring and safety committee felt the study should be halted, patients unblinded, and results reported. Even though the chance of this representing a false positive result was about one in a hundred, roughly one-fifth of the one in twenty we usually look for, not reaching the planned alpha of 0.0008 leads many people to consider this a failed study. Not reaching the planned alpha may, in and of itself, be enough for the FDA to decline to approve a pre-chemotherapy indication for this drug in the treatment of prostate cancer.
A similar situation occurred in a Plenary abstract describing a comparison between a new agent for the treatment of breast cancer, T-DM1, and capecitabine with lapatinib (Abstract LBA1). This trial also had co-primary endpoints: progression-free survival and overall survival. At the time of interim analysis, the benchmark for progression-free survival was met (the usual p<0.05). For overall survival, however, the number of deaths at that point meant a p-value of 0.0003 or less was required to be comfortable that it was a true result. In other words, the odds of a false result had to be less than 3 in ten thousand. The p-value was only 0.0005, or 5 in ten thousand, so the statisticians tell us we cannot yet be comfortable that this is a true result. Fortunately, because this was not a blinded trial, data collection can continue, and the study has not been halted.
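For anyone who wants to check the arithmetic behind these two examples, here is a minimal sketch in Python. It only uses the p-values and thresholds quoted above, and it follows this post's informal reading of a p-value as "one in N" odds of a false result. The function names (`one_in_n`, `summarize`) are mine, made up for illustration, and are not anything from the trial analyses.

```python
# Illustrative arithmetic only: the numbers are the ones quoted in this post,
# and the "one in N" framing is the post's informal reading of a p-value.

CONVENTIONAL_ALPHA = 0.05  # the familiar "one in twenty" benchmark


def one_in_n(p):
    """Express a probability p as 'one in N', where N = 1 / p."""
    return 1.0 / p


def summarize(label, p_observed, alpha):
    """Compare an observed interim p-value with its pre-specified threshold."""
    return {
        "label": label,
        "threshold_one_in": round(one_in_n(alpha)),
        "observed_one_in": round(one_in_n(p_observed)),
        "ratio_to_threshold": p_observed / alpha,
        "ratio_to_conventional": p_observed / CONVENTIONAL_ALPHA,
        "meets_threshold": p_observed < alpha,
    }


# (label, observed interim p-value, pre-specified threshold)
comparisons = [
    ("Abiraterone overall survival (LBA4518)", 0.0097, 0.0008),
    ("T-DM1 overall survival (LBA1)", 0.0005, 0.0003),
]

results = [summarize(*c) for c in comparisons]

for r in results:
    print(r["label"])
    print(f"  required: about one in {r['threshold_one_in']}")
    print(f"  observed: about one in {r['observed_one_in']}")
    print(f"  observed p is {r['ratio_to_threshold']:.1f}x the planned threshold")
    print(f"  observed p is {r['ratio_to_conventional']:.2f}x the usual 0.05")
    print(f"  meets planned threshold: {r['meets_threshold']}")
```

Both results comfortably beat the old one-in-twenty benchmark, yet neither clears the stricter bar set in advance for its interim look, which is exactly the tension described above.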
I won’t even get into the statistical analysis used to determine that intermittent androgen deprivation was inferior to continuous androgen deprivation in men with metastatic prostate cancer (Abstract 4). In all honesty, I cannot explain it.
As much as I have appreciated biostatisticians before, it is now truly clear to me that the modern era of clinical trials has made a good biostatistician absolutely vital. If I ever know any children who have a knack for math and show any interest in statistics, I know what my advice will be. Not “plastics, Benjamin” but rather “biostatistics.”