In 2015, The Intercept published documents from whistleblower Edward Snowden detailing the National Security Agency’s SKYNET program. According to the documents, SKYNET conducts mass surveillance of Pakistan’s wireless network, then runs an algorithm over the cell network metadata in an attempt to rate every member of the population on their likelihood of being a terrorist.
Patrick Ball, a data scientist and the director of research at the Human Rights Data Analysis Group, says that a flaw in how the NSA trains SKYNET’s algorithm could be leading to mistakes and improper surveillance. Ball has provided testimony for war crimes tribunals in the past, and he told Ars Technica that the NSA’s methods are “completely bullshit,” calling the algorithm scientifically unsound.
Ars Technica reported:
“Somewhere between 2,500 and 4,000 people have been killed by drone strikes in Pakistan since 2004, and most of them were classified by the US government as ‘extremists,’ the Bureau of Investigative Journalism reported. Based on the classification date of ‘20070108’ on one of the SKYNET slide decks (which themselves appear to date from 2011 and 2012), the machine learning program may have been in development as early as 2007.”
Based on Ball’s assessment, thousands of innocent people in Pakistan may have been labelled as terrorists, and possibly murdered, on the basis of the “scientifically unsound” algorithm that SKYNET employs.
SKYNET operates by collecting and storing metadata on the NSA cloud servers, extracting “relevant information,” and then using the algorithm to determine possible leads for targeted assassination. These assassinations may be carried out under the “Find-Fix-Finish” strategy, using Predator drones or “on-the-ground death squads.”
SKYNET’s analysis of metadata likely provides the program with details about people who may travel and sleep together, visits to other countries, and contact information. The slides released by Snowden reveal that the NSA machine learning algorithm uses more than 80 different categories to rate people as possible terrorists.
One large problem is that SKYNET relies on its operators to provide the machine with examples of “known terrorists” in order to teach the algorithm to look for similar qualities. Ars Technica noted that since most terrorists are unlikely to take a survey from the NSA, the agency relies on possibly faulty indicators of what a terrorist’s behavior will look like. SKYNET’s algorithm analyzes each person’s metadata, compares it against the “known terrorist” examples, and produces a threat score for that individual.
But what happens if the algorithm makes a mistake?
“If they are using the same records to train the model as they are using to test the model, their assessment of the fit is completely bullshit,” says Patrick Ball. “The usual practice is to hold some of the data out of the training process so that the test includes records the model has never seen before. Without this step, their classification fit assessment is ridiculously optimistic.”
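Ball’s point is straightforward to demonstrate. The sketch below uses synthetic data and a deliberately simple one-nearest-neighbour classifier (nothing from the SKYNET slides, whose actual features and model are not public) to show why testing a model on its own training records produces a “ridiculously optimistic” fit:

```python
import random
import math

random.seed(0)

def make_group(n, center, label):
    # Two overlapping clusters of 2-D points stand in for behavioral
    # metadata features; the labels stand in for "terrorist"/"not".
    return [((random.gauss(center[0], 1.5), random.gauss(center[1], 1.5)), label)
            for _ in range(n)]

data = make_group(200, (0, 0), 0) + make_group(200, (1, 1), 1)
random.shuffle(data)

# The step Ball describes: hold some records out of training entirely.
train, held_out = data[:300], data[300:]

def predict(training_set, point):
    # 1-nearest-neighbour: copy the label of the closest training record.
    nearest = min(training_set, key=lambda rec: math.dist(rec[0], point))
    return nearest[1]

def accuracy(training_set, eval_set):
    hits = sum(predict(training_set, x) == y for x, y in eval_set)
    return hits / len(eval_set)

# Scored on its own training records, each point is its own nearest
# neighbour, so the model looks flawless. Scored on records it has
# never seen, the real (much lower) accuracy appears.
print(f"fit on training records: {accuracy(train, train):.2f}")
print(f"fit on held-out records: {accuracy(train, held_out):.2f}")
```

The training-set score here is a perfect 1.00 regardless of how noisy the data is, while the held-out score reveals how often the classifier is actually wrong; without the held-out step, the assessment tells you nothing about performance on new people.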
The reality is that as long as the people of the United States allow agencies of the U.S. government to operate largely in secret, we will never know what type of programs and criteria are used to judge and analyze our behavior. How long will it be before SKYNET applies its algorithm to the American people?