http://neilperkin.typepad.com/only_dead_fish/2011/02/predicting-the-unpredictable.html

By Neil Perkin

Ever since I read Taleb’s Black Swan and Fooled By Randomness I’ve had a heightened awareness of the limitations of our attempts to predict the future, and our tendency to post-rationalise events, create false narratives and illusions around our own influence and control. Prediction is difficult.

But that shouldn’t stop us from trying. When Martin Bailie wrote that post (the one that won Post Of The Month back in December) about moving from ad-hoc research to real-time insight, he was writing about the failure of the research industry to innovate: to build new models for a future characterised by a multiplicity of new data sets that are now streaming from the web.

I thought he made an excellent point. One of his exceptions that proved his rule was Brainjuicer (disclosure: I have done some work with Brainjuicer in the past) whose successful Predictive Markets product asks 500 random respondents to play a game in which they can buy or sell shares in ideas based on how well they think the idea will perform in market (instead of, as most research methodologies would champion, asking them how likely they are to buy it).
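The mechanics of a predictive market like that can be sketched in a few lines. This is purely illustrative: the respondent "appeal" weights, prices and function names below are invented for the sketch, and bear no relation to how Brainjuicer's actual product works.

```python
import random

def run_prediction_market(concepts, n_respondents=500, start_price=100):
    """Toy prediction market: each respondent buys (+1) or sells (-1)
    a share in each concept, and the price drifts with net demand."""
    prices = {c: start_price for c in concepts}
    # Hypothetical per-concept 'appeal' stands in for real respondent judgement.
    appeal = {c: random.random() for c in concepts}
    for _ in range(n_respondents):
        for c in concepts:
            trade = 1 if random.random() < appeal[c] else -1  # buy or sell
            prices[c] += trade
    # Concepts ranked by closing price: the market's predicted winners.
    return sorted(prices, key=prices.get, reverse=True)
```

The point of the game format is that respondents predict how an idea will do with *other people*, rather than reporting their own purchase intent.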

Brainjuicer’s founder (and Chief Juicer) John Kearon co-authored an excellent Esomar paper (PDF) with Mark Earls not so long ago, on Me-To-We Research. It contains some good innovative thinking on a different approach to how research can add value to business. We are, says the paper, unreliable witnesses to our own motivations yet extremely good at noticing what other people are doing. So it shifts the focus onto a model shaped by mass anthropology (smart observation of mass behaviour from data streams), mass prediction (using the wisdom of the crowd to predict winning concepts), mass ethnography (using people’s natural ability to observe their peers) and mass creation (co-creation, collaboration).

Much market research is about the collection of fresh data. Yet with a sensible approach to the collection and application of such data sets, analysis of existing streams can yield some truly fascinating results.

A Hewlett-Packard study (PDF) of 3 million tweets about 25 blockbuster movies found that Twitter can accurately predict the future box-office takings of big-release films. The rate at which messages were posted proved to be an accurate indicator of box-office takings before a film opened, and sentiment analysis of the content of messages could similarly foresee the ongoing degree of success or failure. Both with a high degree of accuracy. Higher, for example, than the Hollywood Stock Exchange, the platform that enables thousands of people to buy and trade virtual shares in actors and movies and which is itself lauded for its predictive capability.
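The core idea of the tweet-rate finding is just a regression: fit box-office takings against how fast people are tweeting before release. The numbers below are made up for illustration and are not the HP study's data or model.

```python
# Hypothetical figures, not the HP dataset: pre-release tweet rate
# (thousands of tweets/hour) vs opening-weekend takings ($m).
tweet_rate = [1.2, 3.4, 0.8, 5.1, 2.0]
box_office = [18.0, 45.0, 11.0, 70.0, 27.0]

# Ordinary least-squares fit of takings against tweet rate.
n = len(tweet_rate)
mean_x = sum(tweet_rate) / n
mean_y = sum(box_office) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(tweet_rate, box_office))
         / sum((x - mean_x) ** 2 for x in tweet_rate))
intercept = mean_y - slope * mean_x

def predict_takings(rate):
    """Predicted opening takings ($m) for a given tweet rate."""
    return intercept + slope * rate
```

The HP researchers' contribution was showing that so simple a signal, drawn from a free public stream, out-predicted a dedicated prediction market.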

So if Twitter can predict the success of movie releases, can it predict the stock exchange? Researchers at Indiana University Bloomington studied 9.8 million tweets from 2.7 million tweeters, using a complex but well-recognised psychological profiling standard to select keywords, which were then monitored to gauge the mood of the twitterverse. The study found that on a given day this measure can foretell the direction of changes in the Dow Jones Industrial Average three days later with an accuracy of 86.7%. Yes, you read that right: 87%.
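At its simplest, mood-tracking of this kind is keyword counting over a day's tweets. The sketch below uses an invented word list; the actual study used an established psychological profiling instrument, not anything this crude.

```python
# Hypothetical keyword lists for illustration only.
CALM_WORDS = {"calm", "relaxed", "settled", "easy"}
ANXIOUS_WORDS = {"worried", "nervous", "tense", "panic"}

def daily_calm_score(tweets):
    """Net 'calmness' of a day's tweets: +1 per calm word, -1 per anxious word."""
    score = 0
    for tweet in tweets:
        words = tweet.lower().split()
        score += sum(w in CALM_WORDS for w in words)
        score -= sum(w in ANXIOUS_WORDS for w in words)
    return score
```

A daily time series of such scores is what the researchers correlated, at a three-day lag, with movements in the Dow.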

Similarly, a Facebook study found that social activity and the number of Facebook fans turned out to be a remarkably good predictor of the results of the US mid-term elections last year. Facebook tracked 98 of the most hotly contested House races (as decided by leading independent political observers), and 74% of the candidates with the most Facebook fans won their races. Just over 82% of the 34 Senate races were won by the candidate with more Facebook fans.

Studies such as these may be something of a newsworthy curiosity right now, but they are surely illustrative of a greater, untapped opportunity: the capability to generate useful knowledge from the mass of data that accumulates around behaviours, consumption and experiences. It's not difficult to see where this might go. For Google (for example) the future of search is unlikely to be search. More likely it will be 'contextual discovery': moving towards being able to look at either a person's browsing profile or their location profile and serving up interesting data to them without them searching for anything.

It strikes me that we have barely scratched the surface of what might be possible. The ability to utilise existing data streams to deliver new insights, to visualise the intangible, but also to support what Martin Bailie called more 'Flexible Guessing': to work out what to anticipate and what to commit to; to help inform short, medium and long-term activity; to propose what might happen, then adopt agile methodology to adapt on the fly; to embrace the future whilst keeping a link to the past. And what he called 'Responsive Doing': lighting lots of fires; starting small and amplifying; testing, learning and changing in real-time collaboration with our communities.

How much value lies untapped in existing corporate and public data streams? My guess is a lot. How many companies are truly joining up their data to extract and interpret the kind of intangible insight that has real tangible value? My guess is very few. That, to me, has opportunity written all over it.