This blog post was also distributed on Big Think, Truthout, and The Moderate Voice.
Last October, a colleague and I speculated on how a special, powerful form of predictive analytics would revolutionize presidential campaigning—and, if successful, how it might be poorly received by the public thereafter. In our work, he and I focus more on financial, marketing, and online applications of this technology. But we bet the story would break within politics by 2016 or 2020.
Surprise: There's no wait! After Obama's win in November, we've learned they already did this. The president won reelection with the help of the science of mass persuasion, a very particular, advanced use of predictive analytics, which is technology that produces a prediction for each individual customer, patient, or voter.
This is the first story ever of a presidential campaign performing and proving the effectiveness of mass scientific persuasion.
The technology's purpose is to predict for each individual, and act on each prediction. But you may be surprised to know what the Obama Campaign analytics team predicted. In this persuasion project, they did not predict:
- Who would vote Obama
- Who would vote Romney
- Who would turn out to vote at all
… and they didn't even predict:
- Who was "undecided"
Instead, they predicted persuasion:
- Who would be convinced to vote Obama if (and only if) contacted
This is the new microcosmic battleground of political campaigns—significantly more refined than the ill-defined concept of "swing voter".
Put another way, they predicted for which voters campaign contact would make a difference. Who is influenceable, susceptible to appeal? If a constituent was already destined to vote for Obama, contact would be a waste. If an individual was predicted to be more likely swayed toward Obama by contact than not swayed at all, he or she was added to the "to-contact" list. Finally, to top it off, if the voter was predicted to be negatively influenced by a knock on the door (a backfired attempt to convince), he or she was removed from the campaign volunteers' contact list: "Do-not-disturb!"
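To make the sorting described above concrete, here is a minimal Python sketch. It is an illustration only, not the campaign's actual system: the `Voter` fields, the `assign_list` helper, the 0.02 threshold, and the sample probabilities are all hypothetical. It assumes some model has already produced, for each voter, the probability of an Obama vote with and without contact.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    voter_id: str
    p_if_contacted: float      # hypothetical model output: P(votes Obama | contacted)
    p_if_not_contacted: float  # hypothetical model output: P(votes Obama | not contacted)


def uplift(v: Voter) -> float:
    """Predicted change in support caused by contact."""
    return v.p_if_contacted - v.p_if_not_contacted


def assign_list(v: Voter, min_uplift: float = 0.02) -> str:
    """Sort a voter into one of three lists (threshold is an assumption)."""
    u = uplift(v)
    if u >= min_uplift:
        return "to-contact"      # contact is predicted to help
    if u < 0:
        return "do-not-disturb"  # contact is predicted to backfire
    return "skip"                # contact changes little, e.g. a sure thing


voters = [
    Voter("A", 0.95, 0.94),  # already destined to vote Obama
    Voter("B", 0.55, 0.40),  # persuadable
    Voter("C", 0.30, 0.45),  # negatively influenced by contact
]
lists = {v.voter_id: assign_list(v) for v in voters}
print(lists)
```

Note that voter A scores highest on plain support yet lands on no contact list, which is exactly the difference between predicting a vote and predicting persuasion.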
I interviewed in detail Rayid Ghani, Chief Data Scientist of Obama for America—who will be keynoting on this work at Predictive Analytics World in San Francisco (April 14-19) and Chicago (June 11-12)—for an article (January 21, 2013 in The Fiscal Times) and book chapter on this topic.
To make this possible, team Obama first collected data on how campaign contact (door knocks, calls, direct mail) fared across voters within swing states. Of course, such contact normally helps more than it hurts. But since the number of volunteers available to pound the pavement and dial phones is limited, targeting their efforts where it counts, where contact actually makes a difference, meant more Obama votes. The same army of Obama activists was suddenly much stronger, simply by being issued more intelligent commands.
Therefore, they used the collected data not just to measure the overall effectiveness of campaigning, but to predict the persuadability of individual swing state constituents. Each person got a score, and the scores drove the army of volunteers' every move.
Persuasion modeling (aka uplift modeling or net lift modeling) has been honed in recent years for use in marketing. It's the same principle as for political campaigning, guiding calls and direct mail just the same (although marketing more rarely employs door knocks)—but selling a product rather than a president.
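For readers who want to see the arithmetic behind uplift, here is a deliberately simple sketch. It assumes contact was assigned at random, so that contacted and uncontacted voters are comparable, and it estimates uplift per segment as the difference in support rates between the two groups. The function name, the record layout, and the toy data (including the segment labels) are made up for illustration. Production uplift models typically use many features per individual, and the campaign's own method was not disclosed in this detail.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment from (segment, contacted, supported) records.

    Uplift = support rate among contacted - support rate among not contacted.
    Segments missing either group are skipped, since no comparison is possible.
    """
    # counts[segment][contacted] = [number of supporters, number of voters]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, contacted, supported in records:
        cell = counts[segment][contacted]
        cell[0] += int(supported)
        cell[1] += 1

    result = {}
    for segment, groups in counts.items():
        (t_yes, t_n), (c_yes, c_n) = groups[True], groups[False]
        if t_n == 0 or c_n == 0:
            continue
        result[segment] = t_yes / t_n - c_yes / c_n
    return result


# Toy data, invented for this example: (segment, contacted, supported)
records = (
    [("suburban", True, True)] * 3 + [("suburban", True, False)] * 1
    + [("suburban", False, True)] * 1 + [("suburban", False, False)] * 3
    + [("rural", True, True)] * 1 + [("rural", True, False)] * 3
    + [("rural", False, True)] * 2 + [("rural", False, False)] * 2
)
scores = uplift_by_segment(records)
print(scores)
```

In this toy data, contact raises support in one segment and lowers it in the other, so a volunteer's time is better spent in the first and should be withheld from the second.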
I've extensively covered this technology, which is more advanced than "regular" predictive analytics. Normally, you predict human behavior like click, buy, lie, or die (the subtitle of my forthcoming book on the topic). In this case, you predict the ability to influence said behavior.
If consumer advocates consider mass marketing a form of manipulation, they may find in this work even more to complain about. Was the election Moneyballed? As mere mortals, are we consumers, patients, and voters too susceptible to the invisible powers of advanced mathematics? Will privacy proponents whip out their favorite adjective-of-concern, creepy? Shouldn't elections be about policies, not number-crunching?
No question, the power of persuasion prediction is poignant. Industries are salivating and pouncing.
Sometimes this kind of work truly helps the world. Less paper is consumed when direct mail is more focused and consumers receive fewer "junk mail" items. Patients receive predictively improved healthcare. Police patrol more effectively by way of crime prediction. Fraud is similarly detected, several times more effectively. Movie and music recommendations improve.
How can this power be harnessed without doing harm? And how is "harm" to be defined in this arena?
On a related note, see my TV clip deliberating the tricky issue of Target's pregnancy prediction.
More details: my article in The Fiscal Times on this topic
It depends what you mean by “Obama won”. Sure he “won” the election, mostly because he only had one single dinosaur competitor who did a lousy job. I would not credit his “success” to use of analytics, and many people, probably far more than 50% of the population, consider his performance (past and future) as a total failure, though not as severe as the “Republican Failure”.
In the 21st century, any president will dramatically fail anyway (no matter what political color she harbors), but this is the subject for another article.
I think that a general rule of thumb for defining “harm” in this arena has to do with the data collection process. Each of us willfully submits data on a daily basis – we choose to give a thumbs up or thumbs down to a Pandora song, or we decide whether an email is ham or spam. We do so willingly, because we understand that giving this data will improve our user experience as a whole. However, while we are essentially giving our data to Google to improve our personal Gmail spam filter, we are not authorizing them to sell that data (our ham and spam emails) to any third party (at least I certainly hope we are not). I think that analytics derived from data voluntarily given for internal use is not “harmful”, so long as it is not subject to resale in any way, even if the company or organization profits at our expense.
As for the protest that the election was possibly Moneyballed, and even though I supported the other guy, I say let the mathematics rule. There is nothing wrong with crunching numbers to run a campaign more effectively and more intelligently, so long as the data collection process is ethical.
For the model itself, what kind of data was collected, and was it voluntarily given? Were the feature variables essentially name, address, contact type, contact history, etc. that would naturally arise in contacting a person, or were they derived from some other source of information? To me, for such a mass persuasion model to be acceptable, the information should be public and/or voluntarily given. That would ensure that team X does not have access to more information than team Y, unless of course individuals voluntarily give more information to team X and not team Y.
Finally, it seems like the best model would be a balance of exploration and exploitation, akin in a way to bandit problems. From your articles, it sounds like there was simply a data collection period, after which predictions were made and “persuadable” people were identified. Did the model continue to “learn” from its results, or was the entire statistical learning procedure based solely on the initial data collection?
Did someone do uplift modeling on "whether this whole uplift influence modeling will make Obama win vs. no uplift modeling"? I would first be interested in the result of that.
The Obama for America analytics team did not disclose enough info for it to be publicly known whether this effort had enough of an impact to be a deciding factor in the election.