Late last year, a forensic firearms analyst in Wisconsin emailed a remarkable document to more than 200 of her colleagues across the country. It was a handout from an online lecture given by Jim Agar, the assistant general counsel for the FBI Crime Lab.
For years, forensic firearms analysts have claimed the ability to examine the marks on a bullet found at a crime scene and match it to the gun that fired it—to the exclusion of all other guns. It can be powerfully persuasive to juries. But over the last decade or so, some scientists have cast doubt on the claim.
Forensic firearms analysis falls into a subcategory of forensics colloquially known as “pattern matching.” In these specialties, an analyst looks at a piece of evidence from a crime scene and compares it with a piece of evidence associated with a suspect.
The most damning criticism of the field came in a 2016 report by the President’s Council of Advisors on Science and Technology, or PCAST, which found that “firearms analysis currently falls short of the criteria for foundational validity,” and that the studies the field’s practitioners often cite to support their work are poorly designed and “seriously underestimate the false positive rate.”
After decades of deferring to these forensic analysts, a handful of judges started to heed the warnings from scientists, and have put limits on what some forensic witnesses can say in court. Those decisions have sparked a defensive backlash in the forensics community, along with rebukes from law enforcement officials and prosecutors.
Agar’s document is part of that backlash. In the two-page handout, Agar instructs firearms analysts on how to circumvent judges’ restrictions on unscientific testimony. He even suggests dialogue for prosecutors and analysts to recite if challenged. Most controversially, Agar advises analysts to tell judges that any effort to restrict their testimony to claims backed by scientific research is tantamount to asking them to commit perjury.
Agar’s document was so volatile, it was upbraided by the Texas Forensic Science Commission (TFSC). That agency—the only one of its kind—was formed in the wake of revelations that bogus expert testimony likely caused the state to convict and execute an innocent man, and is tasked with ensuring that expert testimony given in Texas courtrooms is scientifically valid. The TFSC called Agar’s advice to firearm analysts “irredeemably faulty,” and stated that it “runs counter to core principles in science.”
“This is just really unbelievable,” Ellen Yaroshefsky, a professor of legal ethics at Hofstra University, told The Daily Beast after reviewing Agar’s memo. “He’s encouraging false testimony and he’s undermining respect for the judiciary. I mean, he’s saying that if a judge says you can’t give unscientific testimony, you’re being forced to commit perjury? It’s just absurd.”
A Short History of FBI Forensic Blunders
Agar’s employer, the FBI crime lab, is often touted as the most prestigious forensics institution in the world. But the lab has also overseen some embarrassing, high-profile scandals.
In 2004, FBI analysts erroneously matched a partial fingerprint from the Madrid train bombings to falsely implicate Oregon lawyer Brandon Mayfield. A year later, the agency conceded there’s no scientific evidence to support “comparative bullet lead analysis,” a subfield of forensics based on the premise that each batch of bullets has a unique chemical signature. For years, analysts had cited this theory to claim that a bullet found at a crime scene could only have come from, say, a box of bullets found in a suspect’s home. It just wasn’t true.
In 2015, the agency was forced to cop to an even bigger scandal: For decades, its analysts had claimed an ability to match hair and carpet fibers that just isn’t scientifically feasible. One review found FBI analysts had made statements unsupported by science in 95 percent of the cases in which they testified. Such testimony sent hundreds of people to prison, including to death row. Those analysts also trained dozens—perhaps hundreds—of state and local analysts in the same dubious methods, potentially corrupting thousands more cases.
None of that has appeared to chasten the agency. Instead, the FBI and the Justice Department have been stubborn and defensive in the face of criticism, rejecting offers from scientific organizations to audit their methods and blind test their analysts. DOJ officials have assured the public that they’d conduct their own internal reviews, but have then been opaque about when or how or even if those reviews were conducted, or what they found. Agar’s handout to firearms analysts suggests little has changed.
The core problem with pattern matching fields of forensics is that they’re inherently subjective. In addition to firearms analysis, they include specialties like comparing a hair found on the victim with a hair from the suspect’s head, or pry marks found on a door frame to a screwdriver found in the suspect’s house, or a bite mark on the victim to a mold of the suspect’s teeth.
In nearly all of these fields, there has been little effort to identify how frequently the characteristics that might distinguish one piece of evidence from another occur among the entire population of those particular things. You can’t say that because a hair is a particular color or thickness it definitely came from a particular suspect unless you also know how often that color and thickness occur together in the general population. And in a field like tool mark analysis, this part of the equation may not even be knowable. For an analyst to say the pry marks on a door frame could only have been produced by a particular screwdriver, for example, would require that analyst to know for certain that no other object on Earth could possibly have created similar marks.
Most of the fields of forensics were developed not by scientists, but by law enforcement to generate leads or to help convict suspects once they had been identified. Until recently, neither the analysts nor their methods had been subjected to the rigors of scientific inquiry—to processes like peer review or blind proficiency testing. Most also aren’t amenable to scientific concepts, such as calculating a margin for error.
It’s helpful to contrast these specialties with DNA testing, which actually did come from the scientific community. We know precisely how often certain DNA markers occur in the human population. This means that when scientists generate a DNA profile from a spot of blood at the scene of a crime, an analyst can say exactly how likely it is that the sample came from a particular suspect. Tellingly, unlike other forensic specialists, DNA analysts tend to shy away from terms like “match.” Instead, they state the statistical probability that a sample could have come from anyone other than the suspect.
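The arithmetic behind that kind of statement can be sketched in a few lines. This is a minimal illustration, not a description of any lab's actual procedure: the four genotype frequencies are invented for the example, and the calculation assumes the markers are statistically independent, which is what allows the frequencies to be multiplied together.

```python
from math import prod

# Hypothetical genotype frequencies at four independent markers.
# These values are made up for illustration; real casework draws on
# published population databases and uses many more markers.
genotype_frequencies = [0.08, 0.11, 0.05, 0.09]


def random_match_probability(frequencies):
    """Product rule: multiply per-marker frequencies, assuming independence."""
    return prod(frequencies)


rmp = random_match_probability(genotype_frequencies)
print(f"Random match probability: {rmp:.2e}")
print(f"Roughly 1 in {round(1 / rmp):,} unrelated people")
```

The point of the sketch is the input it depends on: without measured population frequencies there is nothing to multiply, and that is the number most pattern matching fields lack.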
Moreover, for most pattern matching fields, even if it were possible to calculate how often distinguishing characteristics occur, there has been little effort to gauge how proficient the analysts are at actually identifying and distinguishing those characteristics. That may be because the tests that have been done are disconcerting. In one proficiency test given to bite mark analysts, for example, the participants couldn’t even agree on whether the test marks were human bites, animal bites, or some other injury.
The field of forensic firearms or ballistics analysis, the subject of Agar’s memo, rests on two underlying premises. The first is that when a gun is fired, it leaves unique, identifiable marks on the bullet—marks that can’t be replicated by any other gun. The second is that, by examining these marks, firearms analysts can objectively and reliably match them to the gun that fired them, to the exclusion of all other guns.
There is no scientific research to support either premise. At best, in some cases, an analyst could say with some certainty that a particular gun did not fire a particular bullet.
Alicia Carriquiry is director of the Center for Statistics and Applications in Forensic Evidence at Iowa State. She and her team have been assembling a database of the ballistics marks left on bullets. Their research thus far has indicated there’s little support for the claim that every gun leaves unique marks on the bullets it fires, or at least not in a way that’s useful for distinguishing one gun from another.
Controlled studies have also shown that the entire field of forensic firearms analysis is inherently subjective. The Houston Forensic Science Center is one of the few crime labs in the country to take a strictly scientific approach to forensics. Director Peter Stout regularly administers blind proficiency tests to his analysts. He first gave his ballistics analysts “sensitivity tests,” in which they were asked to determine whether two bullets were fired by the same gun. The analysts reached the correct conclusion about 76 percent of the time—leaving a lot of room for reasonable doubt.
Stout also gave his analysts “specificity tests,” in which they were asked to determine whether two bullets were fired by different guns. Here, the success rate dipped to 34 percent.
Carriquiry points to another recent sensitivity study—funded by the FBI itself—in which the analysts’ success rate was just 48 percent. “A dispassionate observer would say that they would have made fewer mistakes if they had flipped a coin,” Carriquiry says. “Given that astonishingly low accuracy, it seems pure hubris to be recommending to examiners to ‘push back.’”
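To see why error rates matter so much to a jury, it helps to work through what a declared “match” is actually worth. The sketch below applies Bayes’ rule under stated assumptions. The 76 percent sensitivity figure comes from the Houston tests described above. The false positive rates and the 50 percent prior are hypothetical: the test results cited here do not say how many of the misses on the specificity tests were false matches rather than inconclusive calls, so no real false positive rate is implied.

```python
def probability_match_is_correct(sensitivity, false_positive_rate, prior):
    """Bayes' rule: P(same gun | analyst declares a match)."""
    true_matches = sensitivity * prior
    false_matches = false_positive_rate * (1 - prior)
    return true_matches / (true_matches + false_matches)


SENSITIVITY = 0.76  # same-gun pairs correctly identified in the Houston tests
PRIOR = 0.5         # hypothetical: half of all compared pairs share a gun

# Hypothetical false positive rates, chosen only to show the trend.
for fpr in (0.01, 0.10, 0.30):
    p = probability_match_is_correct(SENSITIVITY, fpr, PRIOR)
    print(f"False positive rate {fpr:.0%}: declared match is correct {p:.1%}")
```

Whatever the true numbers turn out to be, the trend holds: the confidence a declared match deserves falls as the false positive rate rises, which is why a claim of certainty cannot be evaluated until that rate has been measured.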
A Repeating Pattern
In a series of decisions in the 1990s, the U.S. Supreme Court made judges the gatekeepers of science in the courtroom: Judges would determine which experts were credible enough to be heard by juries, and which were not.
But judges aren’t trained in science; they’re trained in the law. So it should come as no surprise that they’ve taken on this responsibility as lawyers might, not as scientists do. Because we have an adversarial legal system, for example, they’ve taken a similar approach to expert testimony. They tend to let each side bring in its own expert, let the experts fight it out on the witness stand, and then leave it to the jury to decide which expert is more credible.
The problem with an adversarial approach is that the skills it takes to persuade a jury aren’t necessarily the same skills it takes to be a thoughtful and careful scientist. In fact the two are often contradictory, and juries crave certainty. An expert who is willing to say, “this is the way it is,” will often seem more persuasive than an expert who says, “I don’t think we can say either way,” even though the latter is often more accurate.
Since the first fingerprint case in 1910, pattern matching analysts have given juries the certainty they crave. It wasn’t until revolutionary DNA testing began in the early 1990s that we started to discover that such testimony was sending innocent people to prison.
Citing these and other studies, defense attorneys and reform advocates have asked judges to limit firearms analysts to conclusions supported by science. For example, an analyst could say, “I can’t exclude the possibility that this particular bullet was fired by that particular gun,” but they wouldn’t be allowed to say “this gun and only this gun could have fired that bullet.”
Until recently, judges routinely denied those requests.
This brings us to a conflict between law and science: Science is constantly changing and evolving with new evidence and new testing. The rule of law requires stability and predictability, which is why courts tend to rely on precedent. Because forensics was born out of law enforcement, not science, by the time scientists began disproving the core premises of various fields of forensics, those fields had already gained a foothold in the legal system. It takes a lot to overturn precedent. So most judges have taken the path of least resistance, and continue to allow those fields into evidence.
It’s only in recent years, and only because of DNA testing and the growing body of scientific research, that judges have become more skeptical of pattern matching forensics.
The first shot across the bow came in 2009, when the National Academy of Sciences published the first comprehensive, scientific review of forensics, which found that analysts routinely give testimony unsupported by scientific research, even though it’s often presented to and perceived by jurors as science.
In the wake of that study, the Obama administration created the National Commission on Forensic Sciences (NCFS), a large group of lawyers, scientists, judges, and statisticians tasked with identifying the shortcomings in forensics and prescribing solutions and best practices.
In 2016, the aforementioned presidential advisory group PCAST issued the most damning report on forensics to date, calling for outright prohibitions on fields like bite-mark analysis, and providing a scathing critique of other pattern-matching fields.
The reaction to these reports from law enforcement officials has been derisive and defensive. When the PCAST report came out, then-Attorney General Loretta Lynch abruptly dismissed it, declaring that the Justice Department “will not be adopting the recommendations.” Groups like the National District Attorneys Association attacked the scientists’ motives, and accused them of harboring a political agenda. Other defenders of the status quo have argued that only other forensic specialists, not scientists or statisticians, are qualified to evaluate the accuracy and reliability of their peers, a claim akin to stating that only tarot card readers are qualified to evaluate the scientific validity of tarot cards.
If the Obama administration’s approach to forensics was contradictory—it provided a platform for scientists to expose the problems, while its law enforcement leaders refused to do anything about them—the Trump administration’s approach was to shut down the discussion altogether.
One of the first acts of then-Attorney General Jeff Sessions was to allow the NCFS charter to expire. Instead, Trump’s DOJ announced it would be conducting its own internal review of federal forensic practices to “give clear guidance to what the Department’s forensics examiners may discuss in a courtroom.”
The Trump administration put a former prosecutor named Ted Hunt in charge of the review. Hunt is an outspoken defender of the status quo. He was one of just two members of the NCFS to vote against its recommendation that pattern matching analysts be prohibited from making claims to juries that aren’t backed by science.
In the waning days of the Trump administration, a mysterious press release and paper appeared on the DOJ website, both unsigned but likely authored by Hunt. They essentially waved away the PCAST report as irrelevant and misguided and advised DOJ analysts to ignore it. The press release and paper were quickly denounced by groups like the Center for Science and Democracy and the Union of Concerned Scientists. But the Biden administration has yet to rescind or contradict the documents, so they remain the DOJ’s official position.
It’s in this context that we get the Agar document. Over the many decades in which police and prosecutors have benefited from judicial authority over the use of science in the courtroom, they’ve welcomed and endorsed it. Now that scientists are finally breaking through—overcoming the hurdles of precedent and adversarial justice—a high-ranking FBI official is no longer endorsing judicial authority, but offering strategies to undermine it.
“The Agar memo would be laughable if it was not so dangerous,” says Carriquiry. “His recommendation to examiners to push back and deny all possibility of errors or uncertainty runs contrary to science.”
Agar may well believe that forensic firearms analysis is scientific (neither Agar nor the FBI responded to requests for an interview). To be charitable, it may be that he isn’t advising analysts to give testimony he personally knows to be untrue. What is true is that the overwhelming majority of the scientific community disagrees with him.
“If 99 percent of scientists believe there’s no scientific basis to say this, but 1 percent say maybe there is, we can’t let the state present it to a jury as if it’s just an honest disagreement among experts,” says Yaroshefsky. “At some point, you have to say okay, there’s an overwhelming consensus here.”
Willful Ignorance
The FBI and DOJ claim to run the most elite, scientifically sound crime labs in the world while, at the same time, refusing to open those labs to review by outside scientists. They want to tell jurors their forensics are science, but they don’t want scientists scrutinizing their forensics.
Agar’s handout makes clear that he’s offering guidance to analysts on his own behalf, and not officially for the FBI or DOJ. But the fact that the attorney who advises the country’s premier crime lab—the lab that often trains analysts in state crime labs—would distribute such advice to hundreds of ballistics analysts ought to be alarming.
What Agar advises in the document is, at its core, no different from the hair/carpet fiber and bullet composition scandals from the FBI’s past. Now, as before, forensic analysts are corrupting trials by making statements to juries that, at best, are unsupported by scientific research and, at worst, are contradicted by it. And now, as before, they’ve been training and advising state and local analysts to do the same.
But there is one important, and chilling, difference. Since the onset of modern DNA testing, the potential for a wrongful conviction due to faulty testimony from, say, a bite mark or hair fiber analyst is far lower than it once was. Hair fibers typically contain DNA, and bite marks (if they’re real bite marks) typically include saliva. So in most of these cases, there’s no need for a pattern matching analysis. Law enforcement officials can go straight to DNA. Even if they do turn to a forensic analyst, DNA testing will quickly contradict any analyst who gets it wrong.
The blast radius of the DNA revolution should have hit all pattern matching fields, and called them into question. Instead, it was mostly limited to fields involving biological material—fields that DNA testing could directly disprove.
But other fields, like ballistics matching, tire tread analysis, and shoe print analysis, are just as scientifically dubious and can be just as subjective and susceptible to cognitive bias as other pattern matching fields. It’s a near-certainty that these too produce wrongful convictions. Even without DNA, we know forensic firearms analysts played a role in the wrongful convictions of Curtis Flowers in Mississippi and Patrick Pursley in Illinois.
Bullets, of course, aren’t made of biological material, and shooting someone from a distance is unlikely to leave behind probative DNA. This means that for shootings, we’re far less likely to have the slam-dunk proof of a wrongful conviction the courts often require.
This is probably why, despite the few rulings Agar laments in his handout, ballistics matching still retains more credibility with most judges than other pattern matching fields. It hasn’t been proven wrong as often, not because it isn’t just as flawed, but because the science-driven technology that has conclusively proven other wrongful convictions just isn’t applicable in these cases.
In the end, so long as high-ranking officials at agencies like the FBI continue to support and encourage unscientific testimony, the wrongful convictions will continue. We’re just far less likely to ever find out about them.