A number of lawsuits have alleged that Zoloft, an antidepressant manufactured by Pfizer, causes heart defects in babies when the drug is taken during pregnancy. The lawsuits turn on expert evidence that Zoloft causes cardiac defects in a fetus when taken early in pregnancy.
Hundreds of federal lawsuits were consolidated in the Eastern District of Pennsylvania, in a process known as multidistrict litigation. After excluding expert testimony offered by the plaintiffs’ steering committee, the district court judge granted summary judgment to Pfizer and dismissed the cases without a trial. That decision was appealed to the Court of Appeals for the Third Circuit.
The plaintiffs’ steering committee initially relied on the expert opinions of epidemiologist Anick Bérard. The trial court excluded Dr. Bérard’s proposed testimony because she relied (in the court’s words) on the “novel technique of drawing conclusions by examining ‘trends’ (often statistically non-significant) across selected studies.”
After Bérard was excluded as an expert witness, the plaintiffs’ steering committee proposed to call Nicholas Jewell, a statistician, to prove causation. Pfizer filed a Daubert motion to exclude Jewell’s testimony. The court’s decision to grant that motion was affirmed on appeal.
Jewell analyzed studies that found a significant association between Zoloft and cardiac defects. The court declined to consider one of those studies because scientists who tried to replicate its results could not do so, and declined to consider another because the study contained an error that invalidated its results.
The trial court expressed concerns about the remaining studies because those that reached consistent results were based on the same database, while a study with a larger database failed to replicate those results. Jewell could not explain the inconsistency in a way that satisfied the court.
The trial judge also faulted Jewell for relying on statistically insignificant results, for disregarding a meta-analysis that reported insignificant associations between Zoloft and cardiac defects, for reanalyzing two studies that found no significant association between Zoloft and cardiac defects, and for conducting his own meta-analysis that included two studies but disregarded others.
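The court’s concern about a selectively assembled meta-analysis can be illustrated numerically. The sketch below uses hypothetical effect sizes (not figures from any Zoloft study) and the standard inverse-variance, fixed-effect method of pooling log odds ratios. It shows how excluding a null study can flip a non-significant pooled result into a significant one:

```python
import math

def pooled_log_odds(studies):
    """Fixed-effect (inverse-variance) meta-analysis of log odds ratios.

    Each study is a (log_odds_ratio, standard_error) pair; each study's
    weight is 1 / se**2.  Returns the pooled estimate and its 95%
    confidence interval.
    """
    weights = [1 / se**2 for _, se in studies]
    pooled = sum(w * y for (y, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = 1 / math.sqrt(sum(weights))
    return pooled, (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

# Hypothetical studies: two reporting a positive association, one null.
all_three = [(0.5, 0.2), (0.4, 0.25), (-0.1, 0.15)]
positive_only = all_three[:2]

est_all, ci_all = pooled_log_odds(all_three)
est_sel, ci_sel = pooled_log_odds(positive_only)

print(ci_all)  # interval straddles 0: no significant pooled association
print(ci_sel)  # interval excludes 0: a "significant" pooled association
```

Which studies enter the pool thus drives the conclusion, which is why the court demanded a scientific justification for Jewell’s inclusion choices.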
The trial judge ultimately found that Jewell “failed to consistently apply the scientific methods he articulates, has deviated from or downplayed certain well-established principles in his field, and has inconsistently applied methods and standards to his data so as to support his a priori opinion.”
The question on appeal was whether the judge crossed the elusive line between acting as a gatekeeper to prevent the jury from hearing unreliable testimony and acting as a juror by judging the credibility of Jewell’s opinions.
The court of appeals noted that a judge must take care not to usurp the jury’s role. A trial court should exclude expert testimony only when the flaw in the expert’s methodology, or application of the methodology, is so large that the expert lacks “good grounds” for his or her conclusions.
According to the court of appeals, the central question on appeal was “whether statistical significance is necessary to prove causality.” Declining to state “a bright-line rule,” the court conceded that a causal connection between drug ingestion and a resulting harm may exist even in the absence of statistically significant findings. For example, studies of small populations might not detect significant differences in outcomes between pregnant mothers who took a drug and those who did not, while studies of larger populations (if they existed) might detect that difference.
Still, it was the plaintiffs’ obligation to prove a causal connection between Zoloft and birth defects. The court concluded that statistical significance is not a “magic criterion” of admissibility, but regarded it as “an important metric to distinguish between results supporting a true association and those resulting from mere chance.”
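The court’s point about study size is, in statistical terms, one of power. A minimal sketch, using hypothetical defect rates rather than any figures from the litigation, runs the same two-proportion z-test at two sample sizes and shows the identical underlying effect crossing the conventional 0.05 significance threshold only in the larger study:

```python
import math

def two_proportion_p_value(p1, p2, n):
    """Two-sided p-value for a two-proportion z-test, n subjects per arm."""
    pooled = (p1 + p2) / 2          # pooled rate (equal arm sizes)
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = abs(p1 - p2) / se
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))

# Hypothetical defect rates: 5% among exposed mothers, 3% among unexposed.
small = two_proportion_p_value(0.05, 0.03, n=200)     # small study
large = two_proportion_p_value(0.05, 0.03, n=20_000)  # large study

print(small > 0.05)  # True: real effect, yet not statistically significant
print(large < 0.05)  # True: significant once the sample is large enough
```

A non-significant result in a small study is therefore weak evidence of no effect, which is why the court declined to make significance a "magic criterion."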
The plaintiffs argued that the district court erroneously required “replicated, significant epidemiological results before inferring causality.” The appellate court decided that the trial judge did not impose that requirement as a legal standard, but made a factual finding about what the teratology community generally requires to establish causality. Of course, the difference between a factual finding and a legal standard is murky when the factual finding drives the court’s decision about whether the legal standard of reliability has been satisfied.
The court based its finding about what the teratology community requires on the court’s own review of the scientific literature to determine the “prevailing standard” that scientists follow. Although proof that a scientific methodology has been generally accepted is not required by Daubert, it is a factor the court can consider. Since the court also considered (and rejected) alternative methodologies used by Jewell and Bérard, including general trends analysis, reexamination of studies, and meta-analysis, the court of appeals decided that the trial court did not create an inappropriate legal standard that applies in all cases.
“Weight of Evidence” Methodology
Jewell’s expert opinions rested on a combination of two methodologies: a “weight of evidence” analysis and the Bradford Hill criteria. A weight of evidence analysis invokes a chain of reasoning to arrive at the best answer to a question. The Bradford Hill criteria are principles that epidemiologists use to distinguish a mere association from a causal connection.
The court of appeals agreed that the weight of evidence analysis and the Bradford Hill criteria are reliable methodologies for determining causation. The appellate court agreed with the trial court, however, that Jewell failed to apply the methodologies in a reliable way.
The court noted that “flexible methodologies” require an expert to make choices by, for instance, assigning more weight to one factor than another. Reliable application of a flexible methodology requires the expert to justify those choices with sound scientific reasoning. It is that reliance on the scientific method that distinguishes the reliable application of a methodology from an outcome-driven assessment of evidence.
The court of appeals accepted that methodologies such as trend analysis, meta-analysis, and reanalysis may be reliable, but faulted Jewell for failing to apply those techniques reliably and for failing to explain how his analysis supported selected Bradford Hill criteria. According to the court, Jewell “applied these techniques inconsistently, without explanation, to different subsets of the body of evidence.”
The court of appeals rejected some of the trial court’s reasoning. Unlike the trial court, the court of appeals did not regard it as inherently problematic for one scientist to reanalyze data obtained by another scientist and to arrive at a different conclusion as a result of that reanalysis. The court of appeals also thought the trial judge usurped the jury’s role in concluding that one study cannot replicate another when both studies are based on the same population, a proposition that Jewell disputed. The court of appeals nevertheless held that the trial court did not abuse its discretion in finding Jewell’s conclusions insufficiently reliable to satisfy Daubert.