[This post is part of a series on re-thinking conflict early warning.]
Let me give you the punch line first: peacebuilders should engage with big data, but not (only) to build predictive conflict early warning models. We should also use big data to inform systems thinking for conflict analysis.
Relatively little work has been done on how big data could be used to create predictive models for conflict early warning. Much of it so far is either tentative (exploratory pilots) or theoretical (explaining potential applications). Emmanuel Letouzé, Patrick Meier, and Patrick Vinck published an article as part of this review of new technology for conflict prevention that makes two concrete suggestions for uses of big data in conflict prevention. One is the use of data to understand population movement, mainly through call detail records (CDRs). In situations where migration patterns or group movements are known to affect conflict dynamics, such data provided in real time could be very valuable to operational conflict prevention activities. Second, big data can help us understand sentiment in a population by providing a source of perceptions data. UN Global Pulse has piloted a project to analyse perceptions expressed on Twitter in Indonesia.
Big data is typically defined as the digital traces of human activity. Although it doesn’t quite fit this definition, the Global Data on Events, Location and Tone (GDELT) dataset developed by Kalev Leetaru is similar to many big data feeds. The dataset comes from a “cross-section of all major international, national, regional, local, and hyper-local news sources, both print and broadcast, from nearly every corner of the globe, in both English and vernacular.” Events are coded by actors involved, the type of event, location, time and tone. The dataset is updated daily. In Kalev’s own words:
Measuring the global news tone essentially conducts a passive “poll” of the press across the world, summarizing their combined views on the likely outcome of the event, recording whether a bombing results in only a few isolated factual reports, or widespread extreme negativity.
I bring up GDELT because UNDP is carrying out two pilot studies (in Tunisia and Georgia) to look into how this kind of passive poll from GDELT can inform conflict early warning for UNDP Country Offices. Preliminary results from the Tunisia study suggest that looking at the changes in tone over time offers some insights on conflict-related trends. There is still more work to be done on looking at the predictive power of these trends in tone and examining whether they throw up too many false positives or false negatives. But in general, it looks like GDELT’s passive poll could serve as one indicator in a conflict early warning system. Furthermore, the UNDP studies point to the potential for predictive conflict early warning models using other big data sources. And with tools like the Artificial Intelligence for Disaster Response (AIDR) platform (which currently works for Twitter, but may be extended to take in GDELT data), it will soon be relatively simple for non-technical staff to process and make sense of these big datasets.
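To make the idea of tone as an early warning indicator concrete, here is a minimal sketch of how such a signal might be computed. It assumes a daily series of average tone values (GDELT records a tone score per event, which can be averaged per day); the data, window size, and threshold below are all illustrative, not taken from the UNDP pilots.

```python
# Sketch: flag potential early-warning days in a daily average-tone
# series, by detecting sharp drops relative to the recent past.
from statistics import mean, stdev

def tone_alerts(daily_tone, window=7, threshold=2.0):
    """Return indices of days where tone drops more than `threshold`
    standard deviations below the trailing `window`-day mean."""
    alerts = []
    for i in range(window, len(daily_tone)):
        past = daily_tone[i - window:i]
        mu, sigma = mean(past), stdev(past)
        if sigma > 0 and daily_tone[i] < mu - threshold * sigma:
            alerts.append(i)
    return alerts

# A stable week of mildly negative tone, then a sharp drop on day 8.
series = [-1.2, -1.0, -1.1, -0.9, -1.0, -1.1, -1.0, -4.5]
print(tone_alerts(series))  # → [7]: only the final day stands out
```

Even this toy version makes the false-positive/false-negative trade-off visible: a lower threshold catches more genuine escalations but also more noise.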
The question that bothers me is not whether we can build more accurate predictive models, but rather whether the predictions they offer can really help peacebuilders. Changes in tone in the GDELT dataset tell us that something is about to go very wrong, that tensions are rising, that conflict is brewing. So now what? Perhaps this kind of warning is useful for a large organization like UNDP, as a way to keep an eye on a whole region and decide on strategic priorities for budgets and staffing. Perhaps some local peacebuilders could find this kind of passive poll useful as a more efficient way of keeping up with the news. But does any of this provide actionable data for conflict prevention or peacebuilding?
I think it could, but we need to look at the analysis differently. The part of the UNDP study on GDELT data from Tunisia that is most interesting to me is the preliminary actor analysis that breaks down who was saying what and to whom it was addressed. In fact, it’s always the analysis of interactions between actors that catches my attention in big data analysis for conflict. Last May at SXSW, Crimson Hexagon spoke on a panel about their Foresight platform, which is being used to look at Twitter conversations in Libya and Egypt and examine how social media is used after the revolutions to build democracies and institutions. The research shows that people use Twitter in two ways post-revolution: instrumental (to tell social networks what’s going on) and integrative (to express the meaning of their socio-political conditions). Google Ideas built this fascinating network map of Syrian defectors. And yesterday morning at ICCM, when Patrick Meier presented his analysis of tweets during the Westgate attack, I found the exploration of actors tweeting, being tweeted to and being referred to very thought-provoking. In a recent commentary, Sanjana Hattotuwa refers to the interactions between and opinions expressed by different actors through big data as the “Track Two social hubbub”. Big data can provide actionable data for peacebuilding if we focus on unpacking the dynamics of this track two hubbub, and understand how it plugs into other tracks.
The trouble is, as Ben Ramalingam puts it, that “we don’t have a unified, conceptual framework for addressing questions of complexity” when it comes to big data. But I’ve been thinking: in peacebuilding, we do have a well-developed framework for thinking about complexity. For a number of years, CDA’s Reflecting on Peace Practice program has been training peacebuilders on how to use systems thinking to unpack conflict dynamics. In a nutshell, systems thinking looks at conflict as a group of interacting, interrelated, and interdependent components that form a complex and unified whole. To analyse a conflict, CDA recommends that practitioners create a system map of all the factors and issues, draw a force field analysis of forces for and against peace, and then chart the behavior of each force over time as well as the causal loops that bring them together. The point of this process is to identify the intervention point with the most leverage to alter a conflict dynamic. In other words, systems thinking is used to identify the factors that are driving the evolution of the conflict system and to pick out the weak links in the system for action. CDA training materials use a number of “conflict archetypes” to help peacebuilders identify what model best approximates the dynamics they see in their context and make their analysis actionable.
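A system map of this kind is, structurally, just a directed graph, which is exactly what makes it a candidate for marrying with data. As a minimal sketch, the snippet below represents a hand-drawn system map as a graph, enumerates its causal (feedback) loops, and scores each factor by how many loops pass through it, a crude proxy for leverage. The factor names are invented for illustration and are not from CDA materials.

```python
# Sketch: a CDA-style system map as a directed graph, with a crude
# "leverage" score: how many feedback loops pass through each factor.
def simple_cycles(graph):
    """Enumerate simple cycles in a small directed graph (DFS-based;
    fine for hand-drawn system maps, not large networks)."""
    cycles = []
    def dfs(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                cycles.append(path[:])
            elif nxt not in path and nxt > start:  # avoid duplicate cycles
                dfs(start, nxt, path + [nxt])
    for start in graph:
        dfs(start, start, [start])
    return cycles

system_map = {
    "land disputes": ["intergroup distrust"],
    "intergroup distrust": ["hate speech", "land disputes"],
    "hate speech": ["intergroup distrust"],
    "dialogue forums": ["intergroup distrust"],  # one-way: not in a loop
}

loops = simple_cycles(system_map)
leverage = {f: sum(f in c for c in loops) for f in system_map}
print(max(leverage, key=leverage.get))  # → intergroup distrust
```

In this toy map, “intergroup distrust” sits on both feedback loops, so it would be the candidate intervention point. The interesting question is whether the loops themselves could be inferred from data rather than drawn by hand.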
CDA’s systems thinking approach to conflict analysis is complexity-aware and action-oriented, but it is not particularly data-driven. How can we marry the insights of systems thinking using conflict archetypes with the social hubbub in big data? Kalev Leetaru has recently developed a Global Knowledge Graph that connects all the elements in the GDELT dataset into one massive network graph. In his own words:
One of the most exciting elements of the Global Knowledge Graph is the vast array of non-event analyses it makes possible, from mapping themes, people, and terror groups over space, to looking at the connections among people and analyzing influencer networks around themes and space.
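The influencer analysis Leetaru describes can be sketched very simply. The snippet below takes event records reduced to (source actor, target actor) pairs, the basic shape of GDELT’s actor coding, and ranks actors by degree centrality, i.e. how many distinct actors each one interacts with. The actor codes and events are invented; a real Global Knowledge Graph analysis would also bring in themes, tone, location, and time.

```python
# Sketch: ranking "influencers" in a GDELT-style actor network.
# Each event is an (actor1, actor2) pair; these records are invented.
events = [
    ("GOV", "OPP"), ("OPP", "GOV"), ("MEDIA", "GOV"),
    ("CIVIL", "GOV"), ("CIVIL", "OPP"), ("MEDIA", "CIVIL"),
]

# Build an undirected neighbour map from the event pairs.
neighbours = {}
for src, tgt in events:
    neighbours.setdefault(src, set()).add(tgt)
    neighbours.setdefault(tgt, set()).add(src)

# Degree centrality: number of distinct interaction partners.
centrality = {actor: len(n) for actor, n in neighbours.items()}
print(sorted(centrality, key=centrality.get, reverse=True))
```

Degree centrality is only the bluntest instrument here; the same neighbour map supports richer measures (betweenness, brokerage across communities) that map more closely onto systems-thinking questions about who connects which loops.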
That’s where I think the magic might be. Could we use big data analysis tools to listen to social hubbub – not to predict conflict spikes but to better understand the intervention points in complex conflict dynamics? If we can, then we’ll really have big data for peace.
PS: on the exclusions of big data
One oft-cited objection to using big data for conflict early warning (or for anything else, in fact) is that it comes with important exclusions. Jonas Lerman recently published an article that outlines how big data poses a unique threat to equality: billions of people remain on its margins because they are not connected to the digital systems that generate big data and are thus missing from any analysis. The nonrandom, systematic omission of people who live on big data’s margins makes any conclusions drawn from big data problematic. Although Lerman does not touch on bias in news media, this blog post does a good job of pointing out biases in media reporting too. The issue of exclusions is important, but it need not derail explorations of the potential of big data. Patrick Meier does a great job of putting this concern in context here.