Center for Strategic Communication

A CH-47 Chinook helicopter over Kabul, Afghanistan, on June 4, 2007. Photo: DoD

Insurgencies are amongst the hardest conflicts to predict. Insurgents can be loosely organized, split into factions, and strike from out of nowhere. But now researchers have demonstrated that with enough data, you might actually predict where insurgent violence will strike next. The results, though, don’t look good for the U.S.-led war.

And they’re also laden with irony. The data the researchers used was purloined by WikiLeaks, the site the Pentagon has tried to suppress. Meanwhile, the Pentagon has struggled for years to develop its own prediction tools.

That data would be the “Afghan War Diary,” a record of 77,000 military logs dated between 2004 and 2009 that were spilled onto the internet two years ago by WikiLeaks. In a paper published Monday by the Proceedings of the National Academy of Sciences, a team of researchers used the leaked logs to (mostly) accurately predict violence levels in Afghanistan for the year 2010. (Behind a paywall, alas, but a summary is available for free in .pdf.)

It sounds simple. Take six years’ worth of data, plug in the right formulas, and out come results that give a “deeper insight in the conflict dynamics than simple descriptive methods by providing a spatially resolved map of the growth and volatility of the conflict,” the researchers write. In practice, it’s maddeningly complicated, and the results suggest that the insurgency has successfully withstood the recent surge of U.S. troops.

One of the keys to accurate prediction, the report says, is a robust sample size. The military’s records almost definitely don’t contain every violent outburst that’s occurred in Afghanistan since 2004, and the events included range widely, taking in “elaborate preplanned military activity and spontaneous stop-and-search events.” Still, the leaked logs are better than relying on inaccurate or incomplete reports from NGOs or the media.

And yet the military has spent millions developing predictive tools. They don’t work very well. Darpa’s Integrated Crisis Early Warning System actually predicts few crises. Its predecessors, which date back to the 1980s, were arguably even more inaccurate. But those seek to predict big, sweeping geopolitical events. Researchers have had better luck estimating expected fatalities from the wars in Iraq and Afghanistan. But predicting violent events with news reports as data? #Fail.

But even when you’ve got more complete data, analyzing it is still an uphill battle. Data alone won’t reveal if a sudden spike in violence is a statistical blip or a marker of a trend. Conflict reports could steadily increase in one area, but elsewhere oscillate wildly between explosive violence and relative calm.

In order to get a sense of what the conflict will be like in the future, the researchers modeled volatility. The more volatility — the more conflict whips back and forth between extremes of war and peace — the less accurate the prediction.
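To see why volatility widens a forecast, here is a minimal Python sketch. It is not the researchers’ model. It simply assumes that year-over-year log growth rates are roughly normally distributed and uses their spread to draw a rough range around next year’s estimate. The incident counts, the function names and the 1.96 multiplier are all illustrative assumptions, not figures from the study.

```python
import math
import statistics


def log_growth_rates(counts):
    """Year-over-year growth rates on a log scale."""
    return [math.log(later / earlier) for earlier, later in zip(counts, counts[1:])]


def forecast_range(counts, z=1.96):
    """Rough (low, mid, high) estimate for the next year's count.

    Assumes log growth rates are approximately normal; z=1.96 then
    corresponds to a rough 95 percent range. This is a toy stand-in,
    not the method used in the paper.
    """
    rates = log_growth_rates(counts)
    mean_rate = statistics.mean(rates)
    spread = statistics.stdev(rates)  # the "volatility" of the series
    last = counts[-1]
    low = last * math.exp(mean_rate - z * spread)
    mid = last * math.exp(mean_rate)
    high = last * math.exp(mean_rate + z * spread)
    return low, mid, high


# Made-up yearly incident counts; both series start at 20 and end at 74.
steady = [20, 26, 34, 44, 57, 74]      # smooth growth
volatile = [20, 90, 15, 110, 30, 74]   # swings between extremes

for name, series in (("steady", steady), ("volatile", volatile)):
    low, mid, high = forecast_range(series)
    print(f"{name}: about {mid:.0f} incidents, plausible range {low:.0f} to {high:.0f}")
```

Both made-up series start and end at the same counts, so the central estimate is identical. Only the range around it changes, and for the volatile series it becomes too wide to be useful.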

In Afghanistan’s Sar-e Pul, Balkh and Badghis provinces, researchers observed a “modest number” of total events, which made these provinces look deceptively quiet, while also seeing “significant overall growth in activity throughout the years.” The growth was steady, which made predictions relatively easy. However, in western Farah province, which saw a highly volatile surge in violence in 2005 and 2006, the long-term trends are harder to pin down.

For less-volatile Baghlan province, the researchers predicted actions by “armed opposition groups” to increase by 128 percent: from 100 incidents in 2009 to 228 incidents in 2010. After comparing the results to data from the Afghanistan NGO Safety Office, it turns out they were pretty close. The office reported that Baghlan saw a roughly 120 percent increase in violence, from 100 to 222 incidents. A correlation test found that “strong support” exists for the model in predicting outcomes within all of Afghanistan’s 32 provinces. Even in provinces where the real result was different from the predicted result, it was still within the range of expected outcomes: accurate in a statistical sense.
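The Baghlan arithmetic is easy to check. The short Python sketch below uses only the incident counts quoted above; the function and variable names are made up for illustration. Note that going from 100 to 222 incidents works out to 122 percent, so the reported figure of about 120 percent is presumably rounded.

```python
def percent_increase(before, after):
    """Percentage growth from `before` to `after`."""
    return (after - before) * 100 / before


baseline_2009 = 100    # incidents in Baghlan in 2009
predicted_2010 = 228   # the researchers' prediction
reported_2010 = 222    # Afghanistan NGO Safety Office count

predicted_growth = percent_increase(baseline_2009, predicted_2010)
reported_growth = percent_increase(baseline_2009, reported_2010)

# How far the prediction landed from the reported count, in percent.
miss = abs(predicted_2010 - reported_2010) / reported_2010 * 100

print(f"predicted growth: {predicted_growth:.0f} percent")
print(f"reported growth:  {reported_growth:.0f} percent")
print(f"prediction missed the reported count by {miss:.1f} percent")
```

A miss of six incidents out of 222 is under 3 percent, which is what “pretty close” means here.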

Two takeaways from the study won’t comfort the military. It would appear that the insurgency resisted the Obama administration’s surge of 30,000 troops into Afghanistan, at least in the first year. “Our findings seem to prove that the insurgency is self-sustaining,” Guido Sanguinetti, a computational scientist and the study’s lead author, told the Los Angeles Times. Even with a new offensive, “this doesn’t seem to disturb the system,” he said.

What’s more, the Pentagon has made a big push to stop the next WikiLeaks. But the researchers’ study suggests it might be more fruitful to sift through the data WikiLeaks uncovered for clues about the direction of the war.

They also caution that when there’s a high level of volatility, it’s best not to jump to conclusions. Their model might only be good for insurgencies, where violence is more ad hoc. Punch-outs between rapidly moving and well-organized armies, on the other hand, might actually be even harder to predict. That’s “vital for decision purposes,” they write. “Simply stated, it might prove a better option to admit a large uncertainty about the future, than to base a policy decision on a highly uncertain prediction.”