Real-World Evidence: Time for a Reality Check

Guest Commentary

Aug 07, 2019

By John W. Sweetenham, MD, FACP, FASCO, FRCP

The emergence of big data, AI, predictive analytics, and advanced bioinformatics platforms is transforming our understanding of cancer, our approaches to developing new treatments, and our thinking about cancer care delivery and the true value and benefit of new interventions to patients and their caregivers. Realization of the power of these new tools is growing, and large data sets, with their associated analytics, have entered the mainstream of cancer research and cancer care. In the last few years, we have started to witness the transition of large data sets from a resource mined to generate hypotheses into a data source reliable enough to test hypotheses and answer challenging questions, often because of the sheer volume of data available.

A very tangible example of this is the growing willingness of the U.S. Food and Drug Administration (FDA) to approve new cancer treatments based partly on the use of what has now become known as “real-world data” (RWD). In fact, the FDA recently issued guidance to industry for the submission of documents using RWD and RWE (“real-world evidence”) for drugs and biologics. The document defines RWD as data relating to patient health status and/or delivery of health care that are routinely collected from a variety of sources including electronic health records (EHRs), medical claims and billing data, product and disease registries, patient-generated data (such as patient-reported outcomes) and mobile devices such as wearables. It defines RWE as the clinical evidence derived from analysis of RWD.

Acknowledgment of the potential contribution RWD can make to our understanding and decision making has been driven by many factors, not least being the challenge of conducting relevant and impactful clinical trials of new therapeutics. The observation that patients recruited to cancer clinical trials are not representative of the general population of patients with cancer is well documented, and not surprising given that only 2% to 3% of all patients with cancer in the U.S. enter onto a study. The reasons for this are multifactorial and complex, including the often stringent eligibility criteria for trials, failure to adequately represent minority populations, inadequate information and education for the public regarding clinical research, and the time constraints of busy clinical practices. As the oncology world attempts to address all of these issues, including relaxing eligibility criteria for trials to make them more representative of the “real world” population, clinical trials still take many months or years from concept to design to completion and the new treatments under evaluation can still sometimes face an uncertain pathway to approval, especially when the benefits observed in some studies can be marginal.

Big Data’s Big Problem

Recognizing the rapid increase in the number of new cancer therapeutics in the last few years, the FDA has streamlined its approval process—a change which has undoubtedly given our patients access to effective new therapies more quickly than in the past. This has been a positive development, but the process is still far from perfect: several studies show that many recently approved drugs do not meet criteria for clinical benefit under widely used value frameworks such as those of ASCO and ESMO. The use of RWD and RWE is another strategy to accelerate this process. We should regard this as a positive step—large sets of accurate, validated, and consistently collected data are a powerful research tool.

But therein lies a big problem—to be useful, the data need to be accurate, consistently collected, and verifiable to a level comparable with what we expect from a prospective clinical trial. If the data contained within these large sets are anything less, it erodes confidence in what they are telling us. And at the moment, we are seeing an increasing trend toward labeling data sets as “real-world data” when they fall far short of these benchmarks.

Why is this concerning? I think there are several reasons, which relate partly to data quality and partly to how we communicate results of studies.

A very quick and superficial PubMed search of “cancer” and “real-world data” brings up literally thousands of publications. A quick survey of a few of these shows how quickly the RWD concept has been embraced by the oncology world, and how much the term is being misused. The studies derive their data from multiple sources, including claims data, individual tumor or treatment registries, data platforms directly downloaded from EHRs, cancer registries, or other single-center series of patients not included in prospective trials. We should have major concerns about the quality of these data.

There is a real danger that we legitimize flawed data sets by labeling them as “real-world data” when they neither reflect the real world nor contain data we can trust. Historically, registry data have been an important source of hypothesis-generating research. I have personally published registry data, clearly labeled as such and with clear recognition of their limitations, given the inherent bias in what gets included in a registry and what does not. A close look at some recent publications of “real-world data” shows that these are registry data by another name.

From a data quality perspective, we need to be very circumspect about the accuracy of claims data and also of data derived from EHRs, many of which do not adequately capture even stage or performance status—two of the most fundamental data elements we would need for our decision making in the clinic. To label studies using these sources as “real world” is suspect and potentially very misleading.

Meaning no disrespect to the author, a recent presentation at the American Society of Hematology reported on the outcomes for patients with aggressive B-cell lymphoma who had undergone CAR T-cell therapy but were not eligible for a prospective trial, and were therefore treated with a commercial product according to the FDA indication. The results of this study were reported as real-world experience, yet this was a highly selected, young patient population with good performance status, bearing no resemblance to the real world of this disease, in which the median age at presentation is around 70 years, performance status is frequently poor, and the incidence of comorbidity is high. The conclusions of this study are still important, but the results need to be reported in the appropriate context.

Resisting the Hype of RWD

This may seem overly critical, but it really matters. There is an extensive literature on the use of spin and the choice of language in reporting the results of studies, demonstrating that nuances in language can often imply a more positive or significant outcome than the study actually shows. As the term “real-world data” becomes socialized in the oncology world, we need to be careful that we do not allow the same thing to happen.

Without doubt, there are highly reliable big data sets, derived from multiple centers, abstracted according to consistent validated protocols with robust quality assurance and verification strategies. These sets are a valuable resource with great potential for research and care delivery and their value is already being recognized by many research institutions, health systems, and regulatory bodies. These merit the label of RWD. The term should be restricted to platforms like these. Otherwise, we run the risk that incomplete or inaccurate data derived from inherently biased, or poorly characterized, patient populations gain a new respectability as real-world data.

Since the concepts of RWD and RWE are now firmly embedded in the oncology vocabulary, it’s time to make sure they are clearly defined and that, in our scientific journals and at our major meetings, we limit the use of these terms to studies where they are truly merited.

Dr. Sweetenham is a professor in the Department of Internal Medicine at UT Southwestern (UTSW) Medical Center and the associate director for clinical affairs at UTSW’s Harold C. Simmons Comprehensive Cancer Center. He specializes in treating lymphomas and other hematologic malignancies. Dr. Sweetenham is the editor in chief of ASCO Daily News. Follow him on Twitter @JSweetenhamMD.


Michael Liebman, PhD

Aug 09, 2019 8:03 AM

I appreciate and support your comments, and I believe that you are pointing to the fundamental need for greater transparency in providing details about what, or who, is or is not included in any given study, and how, exactly, the data are collected, not just how the data field is named. This is critical for understanding the potential biases that exist in any and all data sets. These biases do not invalidate the actual data, but they need to be incorporated into any analysis that includes such potentially diverse/disparate data sets.

Gregory P. Hess, MD, MBA, MSc

Aug 17, 2019 9:38 AM

I agree with the concerns raised by Dr. Sweetenham and the comment offered by Dr. Liebman. Much of my past thirty years has focused extensively on the collection, integration, and application of real-world data, including claims, EMRs, EHRs, registries, and other types of data. From my perspective, it is critical to insist on extremely high quality from the first steps to the last. Not all firms have this perspective, particularly when there are for-profit drivers, including an aggressive time-to-market. Consequently, while real-world data can and does have tremendous potential value and advantages, I agree that a lack of transparency and other barriers can create the classic situation of "garbage in, garbage out" in the conduct of retrospective studies. Let's push together for significant improvements in RWE.

Jose Ales-Martinez, MD, PhD

Aug 19, 2019 9:51 AM

Very timely commentary. The problem is compounded by the flood of information coming from different sources and the lack of critical appraisal. Congratulations on sharing your view.
