By John W. Sweetenham, MD, FACP, FASCO, FRCP
The emergence of big data, AI, predictive analytics, and advanced bioinformatics platforms is transforming our understanding of cancer, our approaches to developing new treatments, and our thinking about cancer care delivery and the true value and benefit of new interventions to patients and their caregivers. Realization of the power of these new tools is growing and large data sets, with their associated analytics, have entered the mainstream of cancer research and cancer care. In the last few years, we have started to witness the transition of large data sets from a resource to mine for generation of hypotheses, to a data source which can be used as reliable information to test hypotheses and answer challenging questions, often based on the massive volume of data available.
A very tangible example of this is the growing willingness of the U.S. Food and Drug Administration (FDA) to approve new cancer treatments based partly on the use of what has now become known as “real-world data” (RWD). In fact, the FDA recently issued guidance to industry for the submission of documents using RWD and RWE (“real-world evidence”) for drugs and biologics. The document defines RWD as data relating to patient health status and/or delivery of health care that are routinely collected from a variety of sources including electronic health records (EHRs), medical claims and billing data, product and disease registries, patient-generated data (such as patient-reported outcomes) and mobile devices such as wearables. It defines RWE as the clinical evidence derived from analysis of RWD.
Acknowledgment of the potential contribution RWD can make to our understanding and decision making has been driven by many factors, not least the challenge of conducting relevant and impactful clinical trials of new therapeutics. The observation that patients recruited to cancer clinical trials are not representative of the general population of patients with cancer is well documented, and not surprising given that only 2% to 3% of all patients with cancer in the U.S. enroll in a study. The reasons for this are multifactorial and complex, including the often stringent eligibility criteria for trials, failure to adequately represent minority populations, inadequate information and education for the public regarding clinical research, and the time constraints of busy clinical practices. Even as the oncology world attempts to address all of these issues, including relaxing eligibility criteria for trials to make them more representative of the “real world” population, clinical trials still take many months or years from concept to design to completion. The new treatments under evaluation can still face an uncertain pathway to approval, especially when the benefits observed in some studies are marginal.
Big Data’s Big Problem
Recognizing the rapid increase in the number of new cancer therapeutics in the last few years, the FDA has streamlined its approval process—a change which has undoubtedly given our patients access to effective new therapies more quickly than in the past. This has been a positive development but the process is still far from perfect and there are several studies which show that many recently approved drugs do not meet criteria for clinical benefit by widely used value frameworks such as those of ASCO and ESMO. The use of RWD and RWE is another strategy to accelerate this process. We should regard this as a positive step—large sets of accurate, validated, and consistently collected data are a powerful research tool.
But therein lies a big problem—to be useful, the data need to be accurate, consistently collected, and verifiable to a level comparable with what we expect from a prospective clinical trial. If the data contained within these large sets are anything less, it erodes confidence in what they are telling us. And at the moment, we are seeing an increasing trend toward labeling data sets as “real-world data” when they fall far short of these benchmarks.
Why is this concerning? I think there are several reasons, which relate partly to data quality and partly to how we communicate results of studies.
A very quick and superficial PubMed search of “cancer” and “real-world data” brings up literally thousands of publications. A quick survey of a few of these shows how quickly the RWD concept has been embraced by the oncology world, and how much the term is being misused. The studies derive their data from multiple sources, including claims data, individual tumor or treatment registries, data platforms directly downloaded from EHRs, cancer registries, or other single-center series of patients not included in prospective trials. We should have major concerns about the quality of these data.
There is a real danger that we legitimize flawed data sets by labeling them as “real-world data” when they neither reflect the real world nor can be trusted. Historically, registry data have been an important source of hypothesis-generating research. I have personally published registry data, clearly labeled as such and clearly recognizing the limitations of these data because of the inherent bias in what gets included in a registry and what doesn’t. A close look at some recent publications of “real-world data” shows that these are registry data by another name.
From a data quality perspective, we need to be very circumspect about the accuracy of claims data and also of data derived from EHRs, many of which do not adequately capture even stage or performance status—two of the most fundamental data elements we would need for our decision making in the clinic. To label studies using these sources as “real world” is suspect and potentially very misleading.
I mean no disrespect to the author, but a recent presentation at the American Society of Hematology reported on the outcomes for patients with aggressive B-cell lymphoma who had undergone CAR-T cell therapy but were not eligible for a prospective trial, and were therefore treated with a commercial product according to the FDA indication. The results of this study were reported as real-world experience, yet this is a highly selected, young patient population with good performance status, which bears no resemblance to the real world of this disease, for which the median age at presentation is around 70 years, performance status is frequently poor, and there is a high incidence of comorbidity. The conclusions of this study are still important, but the results need to be reported in the appropriate context.
Resisting the Hype of RWD
This may seem overly critical, but it really matters. There is an extensive literature on the use of spin and the choice of language in reporting the results of studies, demonstrating that nuances in language can often imply a more positive or significant outcome than the study actually shows. As the term “real-world data” becomes socialized in the oncology world, we need to be careful that we do not allow the same thing to happen.
Without doubt, there are highly reliable big data sets, derived from multiple centers, abstracted according to consistent validated protocols with robust quality assurance and verification strategies. These sets are a valuable resource with great potential for research and care delivery and their value is already being recognized by many research institutions, health systems, and regulatory bodies. These merit the label of RWD. The term should be restricted to platforms like these. Otherwise, we run the risk that incomplete or inaccurate data derived from inherently biased, or poorly characterized, patient populations gain a new respectability as real-world data.
Since the concepts of RWD and RWE are now firmly embedded in the oncology vocabulary, it’s time to make sure they are clearly defined and that, in our scientific journals and at our major meetings, we limit the use of these terms to studies where they are truly merited.
Dr. Sweetenham is a professor in the Department of Internal Medicine at UT Southwestern (UTSW) Medical Center and the associate director for clinical affairs at UTSW’s Harold C. Simmons Comprehensive Cancer Center. He specializes in treating lymphomas and other hematologic malignancies. Dr. Sweetenham is the editor in chief of ASCO Daily News. Follow him on Twitter @JSweetenhamMD.