Sequencing-Based Clinical Trial Matching Yields High False-Positive Rates


NEW YORK – Algorithms designed to match patients to precision oncology trials based on molecular profiling results aren't generating appropriate matches, suggests research out of the Vanderbilt-Ingram Cancer Center.

The study, published recently in the Journal of Clinical Oncology: Clinical Cancer Informatics, revealed that algorithms using only patients' diagnosis and next-generation sequencing data to match patients to clinical trials resulted in false-positive matches 88 percent of the time. These false positives were not due to any errors in the molecular testing results themselves, but rather were most commonly the result of delayed and disjointed information sharing about the trials' arm-specific enrollment status.

According to the study, 72 percent of false-positive trial matches occurred when patients were matched to specific arms of a given trial that had closed or suspended accrual, or perhaps had never opened at the patient's location in the first place. Precision oncology clinical trials, in other words, are often being processed by algorithms as "available" on the whole, even when the same is not true of the specific arm within that trial that would have been appropriate for a given patient.

The high false-positive matching rate poses a significant challenge not only to patients and their care teams, but also to the drugmakers and investigators running these trials, many of whom are struggling to recruit enough patients with rare biomarkers to conduct their trials. This in turn may delay or prevent targeted drugs from being approved and reaching the market, the researchers noted.

Multi-arm trial challenges

Precision oncology trials with multiple arms, including large umbrella and basket trials that stratify patients into cohorts based on their disease type or biomarker status, are becoming more common. But even as these types of trials expand their reach and oncologists gain familiarity with the design rationales, patient accrual remains a challenge.

"There's this trend of multi-disease trials like ASCO TAPUR or NCI-MATCH where there could be as many as 10, 15, 20 cohorts that are open at the time," explained Vanderbilt's Neha Jain, first author on the study. "And there could be five of the 25 arms open at institution X, but when you go on or other resources, you see that the trial is open, and the assumption is made that all 25 arms are open. But that's not the case."

This is the root of the issue with the algorithm-based matching systems, Jain said. Many of these systems, which are designed to find possible clinical trials based on next-generation sequencing (NGS) test results and diagnosis details, only process trials as open or closed on the whole, but are not powered to do the same for the trials' individual arms.
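The gap Jain describes can be illustrated with a minimal sketch. It is not Vanderbilt's PCTM code or any vendor's algorithm. The trial name, biomarkers, and field names below are hypothetical, chosen only to show how a trial-level status check can report a match that an arm-level check would reject.

```python
from dataclasses import dataclass, field


@dataclass
class Arm:
    biomarker: str       # biomarker that defines this arm (hypothetical values)
    open_at_site: bool   # whether this arm is accruing at the patient's site


@dataclass
class Trial:
    name: str
    status: str                        # trial-level status, e.g. "open"
    arms: list[Arm] = field(default_factory=list)


def trial_level_match(trial: Trial, biomarker: str) -> bool:
    """Match if the trial as a whole is open and any arm targets the biomarker."""
    return trial.status == "open" and any(
        arm.biomarker == biomarker for arm in trial.arms
    )


def arm_level_match(trial: Trial, biomarker: str) -> bool:
    """Match only if the specific arm for the biomarker is open at the site."""
    return trial.status == "open" and any(
        arm.biomarker == biomarker and arm.open_at_site for arm in trial.arms
    )


# Hypothetical basket trial: listed as open, but only one of two arms is accruing.
basket = Trial(
    name="HYPOTHETICAL-BASKET",
    status="open",
    arms=[Arm("BRAF V600E", open_at_site=False), Arm("EGFR L858R", open_at_site=True)],
)

print(trial_level_match(basket, "BRAF V600E"))  # True: a false-positive match
print(arm_level_match(basket, "BRAF V600E"))    # False: the relevant arm is closed
```

In this sketch the only difference between the two functions is whether the status check is applied per arm, which is the distinction the researchers say many matching systems do not make.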

When Jain and colleagues began conducting their study, they weren't expecting the information-sharing pitfalls to be the main culprit for false-positive matches. They were hoping to highlight the reasons why many patients weren't enrolling in clinical trials even when their NGS reports listed dozens of potential matches. Their initial thought was that stringent eligibility criteria — including requirements for previous lines of treatment or perhaps specific disease status details like the presence of brain metastases — were to blame.

Jain and colleagues compared clinical trial match results from Vanderbilt's algorithmic Precision Clinical Trial Matching (PCTM) system against "augmented" matches that incorporated human involvement, usually from a research nurse, to prescreen patients and interrogate trial availability. The study included any patient at Vanderbilt who received a new NGS test result between October 2018 and April 2019. After excluding patients who were deceased at the time of the prescreening or who had primary oncologists outside of Vanderbilt, the cohort totaled 82 patients, nearly half of whom had lung cancer.

Vanderbilt's PCTM algorithm, which was developed with Vanderbilt's software development partner GenomOncology, is an informatics platform that, according to Jain, was initially developed in a way that mirrored those used by many large NGS testing vendors. The platform uses biomarker status and diagnosis details, such as vital status, diagnosis date, and the presence of other biomarkers outside the NGS panel like immunohistochemistry results, to generate the matches returned with the NGS report. On the clinical trial side, Vanderbilt's system draws from the institution's internal clinical trial management system and interfaces with Vanderbilt's My Cancer Genome data model, which draws from external public data.

Without human involvement, the PCTM recommended 808 trials for the 82 patients involved in the study. Then, after research nurses manually filtered results based on treatment setting — that is, neoadjuvant, adjuvant, or metastatic — along with other treatment-specific data, that number dropped to 755. Finally, when a fully human-led prescreening was performed, which involved research staff individually confirming whether each trial was indeed a valid match, this number dropped to just 87 across all 82 patients.
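The counts above can be turned into rough rates with a few lines of arithmetic. The sketch below assumes that a false positive is any algorithm-suggested match that manual prescreening did not confirm. The article does not say which stage the study used as the denominator, so both are shown, and the variable names are illustrative only.

```python
# Counts reported in the article
algorithm_matches = 808      # PCTM suggestions with no human involvement
after_setting_filter = 755   # after nurses filtered by treatment setting
confirmed_matches = 87       # after fully human-led prescreening
patients = 82


def unconfirmed_fraction(candidates: int, confirmed: int) -> float:
    """Share of candidate matches that prescreening did not confirm (assumed definition)."""
    return (candidates - confirmed) / candidates


print(f"{unconfirmed_fraction(after_setting_filter, confirmed_matches):.1%}")  # 88.5%
print(f"{unconfirmed_fraction(algorithm_matches, confirmed_matches):.1%}")     # 89.2%

# Average number of suggested versus confirmed trials per patient
print(f"{algorithm_matches / patients:.1f}")   # 9.9
print(f"{confirmed_matches / patients:.1f}")   # 1.1
```

Under this assumed definition, the 755-match denominator gives a figure that rounds to the 88 percent reported in the study, though the paper's exact calculation may differ.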

Elaborating on what this prescreening step involved, Jain referred to the time-consuming process as "super unfancy."

"Usually it involved making calls to the main investigator of the trial and saying, 'Hey, we have a patient and they are eligible for all of these things … do you have a slot?' … And that knowledge is not transferrable. If I'm another person [on the research staff] at another institute doing the same exact thing for my patients, I'm going to make that call again. And that's the problem … it's a massive duplication of efforts."

In their paper, Jain and colleagues cited an estimate from another study that research nurses spend roughly four to nine hours per patient evaluating clinical trial eligibility. That same study, published in 2012, also estimated that the financial burden on a cancer center for this screening alone could exceed $90,000 annually.   

Jain acknowledged that the burden on research staff, as well as the high financial cost, can disproportionately affect smaller cancer centers and community oncology practices that don't have the centralized infrastructure and staff bandwidth built into some large academic networks. That burden can in turn fall on patients, who may end up unable to enroll in precision oncology trials. In many cases, community oncology centers have relationships with larger academic centers and can tap into those centers' clinical trial networks for their patients. Jain said this is the case with Vanderbilt and many practices across Tennessee and the surrounding region, but it is not always true, and disparities in access to precision oncology trials can result.

Reporting delays, possible fixes

With a field that moves as fast as precision oncology, keeping clinical trial status data current is no small feat. The public database — which Jain actually said is better today than it has been in years past when it comes to staying current — can lag by as much as seven months in updating recruitment status, according to another prior study.

While Jain was hesitant to point fingers in any one direction, she acknowledged that pharmaceutical companies that sponsor trials could play a bigger role in helping to facilitate information sharing and curation. They could also help establish application programming interfaces (APIs) between separate systems so this data is synchronized in real time.

"And it's actually in pharma's interest to do that, because the more quickly they fill trials, the more quickly the whole process moves," she said. "The ultimate goal for them and the entire oncology community is that a drug gets approved."

Pharmaceutical companies aren't the only stakeholders that need to be involved in an effort to keep this information current. The researchers noted that multi-institutional collaborations between entities such as the National Library of Medicine, National Cancer Institute, and commercial vendors could help by furthering both policy-based improvements and technological advancements.

Crucially, Jain pointed out that there needs to be some sort of centralized system in place to facilitate information sharing across multiple institutions in real time, whereby the research nurse who called ten PIs one morning to ask about availability would be able to disseminate the answers gleaned to other research nurses in the same boat. Jain likened this concept — and its feasibility — to real-time e-commerce updates that have become ubiquitous.

"If you go on Amazon or Walmart [websites] and you need something right away, you'll see there are exactly three items in this store and five items 10 miles away and 15 items 100 miles away… so it's not hard to do, it's just that somehow our side [medicine] has taken longer to catch up to the technology that probably exists."

Jain and colleagues also suggested that each biomarker- or histology-defined trial arm in the growing number of basket and umbrella precision oncology trials be treated as its own entity, at least insofar as algorithms can recognize it as such. The necessity of addressing the arm-specific challenge became clear when the researchers, after narrowing the matches down with the time-consuming manual prescreening, recorded the reasons for false-positive matches. Of 16 potential reasons, the most common was that the study itself or the relevant individual study arm was closed to enrollment.

"A more robust approach of assigning a unique identifier to each trial arm in addition to the trial may be pivotal in teasing apart the recruiting statuses among trial arms," they wrote.

Of course, this isn't a shift that can happen overnight; it will take an updated infrastructure. "It's a problem of keeping data updated on a fairly rapid basis [and] it's a moving target," Jain said. "We need much more infrastructure if we are going to record the status of each individual arm within a trial."
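One way to picture the arm-level identifier the researchers propose is a status record keyed by trial, arm, and site, with a date showing when the status was last verified. The sketch below is an assumption-laden illustration, not a description of any existing registry or of Vanderbilt's planned system. The identifiers, the in-memory dictionary, and the 30-day freshness window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ArmKey:
    trial_id: str   # identifier of the overall trial (hypothetical format)
    arm_id: str     # unique identifier for one arm within that trial
    site: str       # institution where the arm may or may not be open


@dataclass
class ArmStatus:
    recruiting: bool
    verified_on: date   # when someone last confirmed this status


# Stand-in for a shared, multi-institution store
registry: dict[ArmKey, ArmStatus] = {}


def report_status(key: ArmKey, recruiting: bool, verified_on: date) -> None:
    """Record what a staff member learned, so others need not repeat the call."""
    registry[key] = ArmStatus(recruiting, verified_on)


def is_recruiting(key: ArmKey, today: date, max_age_days: int = 30) -> Optional[bool]:
    """Return the arm's status, or None if it is unknown or too old to trust."""
    status = registry.get(key)
    if status is None or (today - status.verified_on).days > max_age_days:
        return None
    return status.recruiting


key = ArmKey("TRIAL-X", "ARM-1", "SITE-A")
report_status(key, recruiting=False, verified_on=date(2024, 1, 10))

print(is_recruiting(key, today=date(2024, 1, 20)))  # False: recently verified as closed
print(is_recruiting(key, today=date(2024, 6, 1)))   # None: the record is stale
```

Returning None for stale or missing records keeps "unknown" distinct from "closed," which matters because the article notes that public recruitment statuses can lag by months.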

While large-scale collaboration across all of precision oncology will be necessary to address these preventable false-positive clinical trial matches, Jain said that Vanderbilt itself, in the wake of this study, is taking steps to address the arm-level information in Phase I trials, where the inconsistent status updates were most common. "That matching system will be available to all of our nurses, but we can't [include in the updated system] information on what is happening at MD Anderson or some other institution," she said. "What's still missing is a bigger collaborative effort across sites."