NEW YORK (GenomeWeb) – A report published earlier this month has revealed new details from AstraZeneca's head-to-head evaluation of four commercial NGS liquid biopsy tests — an effort to precisely analyze concordance or discordance between the various assays, as well as how their results compare to tumor tissue sequencing.
As the company has mentioned previously in more limited presentations, the results, published in JCO Precision Oncology, raise questions about the reliability of variant calls that these labs provide commercially. Now published in full, the study also helps reveal more of the nuts and bolts behind these discrepancies despite aspects of the compared tests remaining proprietary or undisclosed.
In a commentary appearing alongside the AstraZeneca report, a trio of Dana-Farber clinicians (Cloud Paweletz, Christie Lau, and Geoffrey Oxnard) wrote that while many of the causes of discrepant liquid biopsy calls have become better understood in the last few years, the ways in which individual assay characteristics contribute to them have remained more of a mystery.
"We and others [have] identified that assay sensitivity is closely related to clinical factors such as stage and metastatic spread … [while] false positives … are routinely attributed to tumor heterogeneity," the Dana-Farber group wrote.
"What is least understood is how technical factors related to assay performance contribute to false-positive and false-negative results. Probe design and specifics of library generation are often considered proprietary and are only minimally described in the few analytical validation studies that have been published," they added.
In an effort to untangle some of this, the AstraZeneca team, led by Brian Dougherty, executive director of translational science in oncology, convinced four commercial test vendors to participate in a head-to-head analysis of a set of 24 matched samples, sending back both their raw data and their variant calls so that investigators could evaluate not just areas of discordance, but also how those discordances came about in light of each assay's unique parameters.
This design allowed the study to "address many of the pitfalls that [have] plagued prior vendor comparison studies," Oxnard, Lau, and Paweletz wrote in their commentary. It allowed the study to investigate inter-assay technical factors like mutation-calling biases, and "examine whether false negatives were the result of stochastic sample biases or thresholding nuances of the vendor’s bioinformatic filter."
The vendors, unnamed in the study, are all US-based CAP–CLIA-certified NGS commercial laboratories. Resolution Bioscience confirmed to GenomeWeb that its test is one of the four, specifically "vendor B," which features a characteristically smaller panel.
Possibilities for the other three labs, which the AstraZeneca team described as all including "60 to 70 genes" in their tests, include Guardant Health, Foundation Medicine, and Personal Genome Diagnostics, but none of the three confirmed or denied their participation when contacted.
In the JCO Precision Oncology publication, Dougherty and colleagues reported that the compared liquid biopsy assays demonstrated "substantial variability" in their calls for the 24 samples. Overall, the team calculated that ctDNA test sensitivity ranged from 38 percent to 89 percent, while positive predictive value (PPV) spread between 36 percent and 80 percent.
Although some plasma-tissue discordances could be attributed to things like heterogeneity and clonal hematopoiesis, this was to a much lesser extent than technical factors, the authors wrote.
Overall, vendor C had the highest calculated sensitivity, at 89 percent, and vendor B (Resolution) had the highest PPV, at 80 percent. Vendor D had a PPV of only 36 percent, reflecting a high level of what were deemed false positives, the authors reported.
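For readers less familiar with the two metrics, the following minimal sketch shows how sensitivity and PPV are conventionally computed from counts of true positives, false positives, and false negatives. The counts used here are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of reference variants (e.g. tissue-confirmed) that the assay called."""
    return tp / (tp + fn)


def ppv(tp: int, fp: int) -> float:
    """Fraction of the assay's calls that were confirmed by the reference."""
    return tp / (tp + fp)


# HYPOTHETICAL counts for one vendor; not data from the AstraZeneca study.
tp, fp, fn = 8, 2, 12

print(f"sensitivity = {sensitivity(tp, fn):.0%}")  # 8 / 20 -> 40%
print(f"PPV         = {ppv(tp, fp):.0%}")          # 8 / 10 -> 80%
```

In this toy case, a test can have a high PPV and a low sensitivity at the same time: a conservative caller reports few variants, most of which are real, but misses many others.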
In the case of false-positive variants, the study found that these tended to be novel, and were frequently related to vendor-specific mutational biases. False negatives, on the other hand, were largely attributed to bioinformatic filtering of suspected germline variants or limitations related to the signal-to-noise ratio.
For example, one TP53 variant was missed due to elevated background noise in vendor A's assay, but "was clearly present and reported by vendor C," the team wrote.
Resolution Bioscience CEO Mark Li said earlier this year that one heartening aspect of the AstraZeneca analysis for his company was that it saw relatively few false positives. This reflects a conscious choice on Resolution's part to maintain conservative bioinformatics. "A false positive... It's terrible," he said.
In the reported results, Resolution (vendor B) missed a PIK3CA H1047R variant that was called by vendor C, for example. On further examination, the variant was present in Resolution's raw data but was not reported, presumably because it was present at levels below the firm's bioinformatic calling threshold, but above that used by vendor C. There was a high signal-to-noise ratio for both companies and similar numbers of reads (five for vendor C and four for vendor B).
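How a calling threshold can turn nearly identical raw data into different reports can be illustrated with a toy filter. The read counts below are those described in the study, but the thresholds are purely hypothetical, since the vendors' actual cutoffs are not disclosed, and real pipelines likely weigh more than a single read count.

```python
def is_reported(supporting_reads: int, min_supporting_reads: int) -> bool:
    """Report a variant only if enough reads support it."""
    return supporting_reads >= min_supporting_reads


# Supporting-read counts for the PIK3CA H1047R variant, as described in the article.
reads = {"vendor B": 4, "vendor C": 5}

# HYPOTHETICAL thresholds; the vendors' real cutoffs are not public.
min_reads = {"vendor B": 5, "vendor C": 3}

for vendor, n in reads.items():
    status = "reported" if is_reported(n, min_reads[vendor]) else "filtered out"
    print(f"{vendor}: {n} reads -> {status}")
```

Under these assumed cutoffs, the same variant is reported by one vendor and filtered out by the other, even though the underlying evidence differs by a single read.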
In an email this week, Li added that the data highlight that "many vendors who claim super low detection rates may be reporting false positives, in one case even more false positives than true positives."
"We believe a false positive is the worst thing we can do," he added.
Further details from the analysis include the fact that most discordance observed in the cross-vendor and tissue comparison was observed below 1 percent variant allele frequency. Limiting the analysis only to mutations called at an allelic fraction greater than 1 percent, three of the vendors — A, B, and C — all achieved a PPV of 100 percent.
According to Oxnard and his commentary coauthors, the results "should give pause." Although positive prediction was high for three of the assays at higher allele frequencies, for mutations called at an AF below 1 percent PPV dropped as low as 17 percent for vendor A, for example.
"Note that variants detected at less than 1 percent AF are routinely reported by each vendor, and such sensitivity is advertised as a unique strength of plasma NGS assays," the commentators wrote.
Li also highlighted this aspect of the study. "Everything below 0.2 percent was not real," he said. "So the '.01 or .001 debate' seems to be going away a little bit." As liquid biopsy has advanced, companies have at times advertised limits of detection down to .01 percent or even lower, even though, given the amount of input DNA being used, there are not enough molecules to detect a variant at those levels. "The math just doesn't pencil [in] even if you're 100 percent efficient," Li explained.
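Li's point about the math can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a haploid human genome weighs roughly 3.3 picograms, so that one nanogram of DNA contains about 300 genome copies, and it assumes a hypothetical input of 30 ng of cell-free DNA with every molecule captured (the "100 percent efficient" case). The input amount is an illustrative assumption, not a figure from the study or from any vendor.

```python
import math

# Approximate mass of one haploid human genome, in picograms.
PG_PER_HAPLOID_GENOME = 3.3


def genome_equivalents(input_ng: float) -> float:
    """Haploid genome copies in the input, assuming 100 percent capture efficiency."""
    return input_ng * 1000.0 / PG_PER_HAPLOID_GENOME


def expected_mutant_copies(input_ng: float, allele_fraction: float) -> float:
    """Average number of mutant molecules present at a given allele fraction."""
    return genome_equivalents(input_ng) * allele_fraction


def prob_at_least_one(input_ng: float, allele_fraction: float) -> float:
    """Poisson probability that the sample contains at least one mutant molecule."""
    return 1.0 - math.exp(-expected_mutant_copies(input_ng, allele_fraction))


INPUT_NG = 30.0  # HYPOTHETICAL cfDNA input, for illustration only

for af in (0.01, 0.001, 0.0001):  # 1 percent, 0.1 percent, 0.01 percent
    copies = expected_mutant_copies(INPUT_NG, af)
    p = prob_at_least_one(INPUT_NG, af)
    print(f"AF {af:.2%}: ~{copies:.1f} mutant copies expected, P(at least one) = {p:.2f}")
```

Under these assumptions, a variant at 0.01 percent allele fraction corresponds to fewer than one mutant molecule in the tube on average, so even a perfectly efficient assay would often have nothing to detect.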
But, he also added in his email that the fact that the study showed 100 percent PPV for three of the tests with AF above 1 percent "should give the clinical community some confidence that a liquid biopsy can work."
Another interesting aspect of the AstraZeneca analysis was its reiteration of the need for blood-based assays to accurately distinguish germline from somatic variants, whether via paired normal sequencing or bioinformatics filtering. Investigators verified several higher-frequency plasma variants as germline by comparison with tumor-normal sequencing data. That said, the analysis reveals that such practices can influence discordances between different liquid biopsy assays.
For example, a BRCA2 variant was reported by vendor D at 20 percent AF but was not reported by vendor C, despite its presence in that vendor’s raw data, "possibly because of a bioinformatic filtering of suspected germline variants," the authors wrote.
Troublingly, one vendor (vendor A) mis-processed eight samples, associating results with the wrong patient sample. Correcting for this would have raised vendor A's PPV from 55 percent to 73 percent and its sensitivity from 38 percent to 44 percent. But the authors decided to present only the as-is data, because no other labs mixed up samples, and the mistake reflects a "serious issue."
Authors of the commentary piece accompanying the AstraZeneca data stressed that because the samples used were enriched for early-stage cancers, "which are not the intended population for plasma NGS," and because the results don't take into account clinical reporting strategies that can potentially mitigate some of the identified issues, implications for clinical practice should be drawn with caution.
"Extrapolation of these findings to the analysis of clinical samples from more advanced cancers … may lead to over-interpretation of the discrepancies," the study authors wrote.
Resolution's Li said that his company is dedicated to continuing to study these factors and participating in cross-platform analyses, considering that a public perception of variability or discrepancy in liquid biopsy tests is harmful to the field as a whole.
Previous studies comparing Guardant Health's test with those of Foundation Medicine or PGDx, which reported more dismal results, are "not good for anybody," he said.
In more recent analyses, also by AstraZeneca, he said that investigators have seen much higher concordance, at least for SNVs and indels, between Resolution's and Guardant's tests.
In their commentary piece, Oxnard and colleagues wrote that as the use of plasma NGS becomes increasingly widespread in cancer care, "there remains a clear need for [more] concordance studies." Future analyses should "ideally focus on actionable variants and … be limited to advanced cancer," the group added.