CHICAGO - In more than 140 cities across the United States, ShotSpotter’s artificial intelligence algorithm and complex network of microphones evaluate hundreds of thousands of sounds a year to determine whether they are gunfire, generating data that is now used in criminal cases nationwide.
But a confidential ShotSpotter document obtained by The Associated Press outlines something the company doesn’t always tout about its “precision policing system”: human employees can overrule and reverse the algorithm’s determinations, and are given broad discretion to decide whether a sound is a gunshot, fireworks, thunder or something else.
Such reversals happened 10% of the time, by a 2021 company account, which experts say could introduce subjectivity into increasingly important decisions and run counter to one of the reasons AI is used in law enforcement tools in the first place: to reduce the role of fallible humans.
“I’ve listened to a lot of gunshot recordings, and it’s not easy to do,” said Robert Maher, a leading national authority on gunshot detection at Montana State University who reviewed the ShotSpotter document. “Sometimes it is clearly a gunshot. Sometimes it’s just a ping, ping, ping … and you can convince yourself it’s a gunshot.”
The 19-page operations document, marked “Warning: Confidential,” outlines how employees at ShotSpotter review centers should listen to recordings and evaluate the algorithm’s findings of likely gunfire based on a series of factors that may require judgment calls, including whether the sound has the cadence of gunfire, whether the sound pattern looks like a “sideways Christmas tree” and whether there is “100% certainty of gunfire in the reviewer’s mind.”
ShotSpotter said in a statement to The Associated Press that the human role is a positive check on the algorithm and that the “simple language” document reflects the high standards of accuracy that reviewers must meet.
“Our data, based on a review of millions of incidents, proves that human review adds value, accuracy and consistency to the review process that our clients, and many gunshot victims, rely on,” said Tom Chittum, the company’s vice president of analytics and forensic services.
Chittum added that the company’s expert witnesses have testified in 250 court cases in 22 states, and that its “97% overall accuracy rate for real-time detections across all clients” was verified by an analytics firm commissioned by the company.
Another part of the document underscores ShotSpotter’s longstanding focus on speed and decisiveness, and its commitment to classifying sounds in under a minute and alerting local police and 911 dispatchers so they can send officers to the scene.
Entitled “Adopt a New York State of Mind,” it refers to the New York Police Department’s request that ShotSpotter avoid publishing alerts of sounds as “possible shootings” and issue only definitive classifications of gunfire or non-gunfire.
“The end result: It trains the reviewer to be decisive and accurate in their rating and attempts to remove the questionable post,” the document reads.
Experts say such guidance under tight time pressure may encourage ShotSpotter reviewers to err in favor of classifying a sound as a gunshot, even if some of the evidence for it falls short, potentially increasing the number of false positives.
“You don’t give humans a lot of time,” said Geoffrey Morrison, a UK-based voice recognition scientist who specializes in forensic operations. “And when humans are under a lot of stress, the probability of making mistakes is higher.”
ShotSpotter says it published 291,726 gunfire alerts to customers in 2021. That same year, in comments to the AP appended to an earlier story, ShotSpotter said that more than 90% of the time its human reviewers agreed with the machine’s classification, but that the company invested in its team of reviewers “for the 10% of the time where they disagree with the machine.” ShotSpotter did not respond to questions about whether this percentage still holds.
The ShotSpotter operations document, which the company argued in court for more than a year was a trade secret, was recently released from a protective order in a Chicago court case in which police and prosecutors used ShotSpotter data as evidence in charging a Chicago grandfather with murder in 2020 for allegedly shooting a man inside his car. Michael Williams spent nearly a year in jail before a judge dismissed the case due to insufficient evidence.
Evidence at Williams’ pretrial hearings showed that the ShotSpotter algorithm initially classified a noise picked up by the microphones as a firecracker, making that determination with 98% confidence. But a ShotSpotter reviewer who evaluated the sound quickly relabeled it as a gunshot.
The Cook County Public Defender’s Office says the operations document was the only paperwork ShotSpotter sent in response to multiple subpoenas for any scientific guidelines, manuals or other protocols. The publicly traded company has long resisted calls to open its operations to independent scientific scrutiny.
ShotSpotter, based in Fremont, California, acknowledged to The Associated Press that it has “extensive training and operational materials” but considers them “confidential and a trade secret.”
ShotSpotter installed its first sensors in Redwood City, Calif., in 1996, and for years relied solely on local 911 dispatchers and police to review every potential gunshot until adding its own human reviewers in 2011.
Paul Greene, a ShotSpotter employee who frequently testifies about the system, explained in a 2013 evidentiary hearing that employee reviewers addressed issues with a system that “has been known from time to time to give false positives” because it “has no ear to listen.”
“The classification is the most difficult component of the process,” Greene said at the hearing. “Simply because we have no… control over the environment in which shots are fired.”
Greene added that the company likes to hire former military and police officers who are familiar with firearms, as well as musicians because “they tend to have a more developed ear.” Their training includes listening to hundreds of sound samples of gunfire and even visits to rifle ranges to learn the characteristics of gun blasts.
As cities have weighed the system’s promise against its price, which can run as high as $95,000 per square mile annually, company employees have explained in detail how acoustic sensors on utility poles and light poles pick up loud pops, booms or bangs and then filter the sounds through an algorithm that automatically classifies whether they are gunfire or something else.
But until now, little has been known about the next step: how ShotSpotter’s human reviewers in Washington, D.C., and the San Francisco Bay Area decide what is a gunshot versus any other noise, 24 hours a day.
“Listening to audio downloads is important,” according to the document written by David Valdez, a former police officer and now-retired supervisor of one of the ShotSpotter review centers. “Sometimes the audio is so convincing for gunfire that it can override all other characteristics.”
One part of the decision-making process that has changed since the document was written in 2021 is whether reviewers can consider if the algorithm had “high confidence” that the sound was a gunshot. ShotSpotter said the company stopped showing the algorithm’s confidence rating to reviewers in June 2022 “to prioritize other elements more closely related to the accurate human-trained assessment.”
ShotSpotter CEO Ralph Clark has said the system’s machine classifications are improved by “real-world feedback loops from humans.”
However, a recent study found that humans tend to overestimate their ability to identify sounds.
A 2022 study published in the peer-reviewed journal Forensic Science International looked at how well human listeners identified voices compared with voice-recognition tools. It found that all the human listeners performed worse than the voice system alone, and its authors said the findings should lead to human listeners being eliminated from court cases whenever possible.
“Is this the case with ShotSpotter? Would the ShotSpotter system plus the reviewer outperform the system alone?” asked Morrison, who was one of the seven researchers who conducted the study.
“I don’t know. But ShotSpotter should do validation to prove it.”
Burke reported from San Francisco.
Follow Garance Burke and Michael Tarm on Twitter at @garanceburke and @mtarm. Contact the AP Global Investigative Team at Investigative@ap.org or https://www.ap.org/tips/