Saturday, January 12, 2008

Defining Probability in Prediction Markets

The New Hampshire Democratic primary was one of the few(?) events in which prediction markets did not give an "accurate" forecast for the winner. In a typical "accurate" prediction, the candidate whose contract has the highest price ends up winning the election.

This result, combined with the growing interest in (and hype about) the predictive accuracy of prediction markets, generated a huge backlash. Many opponents of prediction markets pointed to the "failure" and started questioning the overall concept and the ability of prediction markets to aggregate information.

Interestingly enough, such failed predictions are absolutely necessary if we want to take the concept of prediction markets seriously. If the frontrunner in a prediction market were always the winner, the markets would be a seriously flawed mechanism. In that case, an obvious trading strategy would be to buy the frontrunner's contract and simply wait for the market to expire, collecting a guaranteed, huge profit. If, for example, Obama were trading at 66 cents and Clinton at 33 cents (indicating that Obama is twice as likely to win), and the markets were "always accurate", then it would make sense to buy Obama's contract the day before the election and get $1 back the next day, a risk-free gain of 34 cents on a 66-cent stake. If this happened every time, the market would not be efficient: the price of the frontrunner's contract would be too low.
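
To make the arithmetic concrete, here is a minimal Python sketch of the expected profit from buying one contract. The function name `expected_profit` and the $1 payout convention are my own assumptions for illustration; fees and the time value of money are ignored.

```python
def expected_profit(price, win_probability, payout=1.0):
    """Expected profit per contract bought at `price`.

    The contract pays `payout` if the candidate wins and nothing otherwise.
    This is an illustrative helper, not part of any trading API.
    """
    return win_probability * payout - price


# Price taken from the example above: Obama trading at 66 cents.
price = 0.66

# If the price is an honest probability, the bet breaks even on average.
fair = expected_profit(price, win_probability=0.66)

# If the frontrunner "always" wins, each contract earns about 34 cents with no risk.
always_right = expected_profit(price, win_probability=1.0)

print(f"fair market: {fair:+.2f}   always-accurate market: {always_right:+.2f}")
```

In an efficient market the first number should be roughly zero; a market that never "fails" corresponds to the second case, which is exactly the free lunch described above.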

In fact, I would like to argue that the markets' recent streak of always picking the election winner has been an anomaly, one that points to the favorite bias that exists in these markets. The markets were more accurate than they should have been, given the trading prices. If the market never fails, then the prices do not reflect reality, and the favorite is actually underpriced.
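
One way to check whether favorites are underpriced is to group past contracts by price and compare each group's average price with how often those contracts actually paid out. The sketch below does this; the function name `calibration_table` and the records are made up purely for illustration and are not real market data.

```python
from collections import defaultdict


def calibration_table(records, bin_cents=10):
    """Group (price_in_cents, won) records into price bins.

    Returns {bin_lower_edge_in_cents: (mean_price, win_frequency, count)},
    with mean_price expressed as a probability between 0 and 1.
    """
    bins = defaultdict(list)
    for price_cents, won in records:
        # Clamp so that a price of exactly 100 cents lands in the top bin.
        index = min(price_cents // bin_cents, 100 // bin_cents - 1)
        bins[index].append((price_cents, won))

    table = {}
    for index in sorted(bins):
        rows = bins[index]
        mean_price = sum(p for p, _ in rows) / len(rows) / 100
        win_frequency = sum(1 for _, w in rows if w) / len(rows)
        table[index * bin_cents] = (mean_price, win_frequency, len(rows))
    return table


# Hypothetical records: five favorites priced in the 60s, all of whom won.
toy_records = [(62, True), (65, True), (66, True), (68, True), (69, True)]
table = calibration_table(toy_records)

for lower_edge, (mean_price, win_frequency, count) in table.items():
    print(f"{lower_edge}-{lower_edge + 10}c: price {mean_price:.2f}, "
          f"won {win_frequency:.0%} of {count}")
```

With these toy numbers the favorites were priced at about 0.66 on average yet won every time, which is the pattern described above: a market that is "too accurate" for its own prices. Five observations are of course far too few to conclude anything.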

The other point that has been raised in many discussions (mainly by a mainstream audience) is how we can even define probability for a one-time event like the Democratic nomination for the 2008 presidential election. What does it mean that Clinton has a 60% probability of being the nominee and Obama a 40% probability? The common answer is that "if we repeat the event many times, in 60% of the cases Clinton will be the nominee and in 40% of the cases it will be Obama". Even though this is an acceptable answer for someone used to working with probabilities, it makes very little sense to the "average Joe" who wants to understand how these markets work. The notion of repeating the nomination process multiple times is an absurd concept.

The discussion brings to mind the ferocious battles between Frequentists and Bayesians over the definition of probability. Bayesians could not accept that we can use a Frequentist approach for defining probabilities for events. "How can we define the probability of success for a one-time event?" The Frequentist would approach the prediction market problem by defining a space of events and would say:
"After examining prediction markets for many state-level primaries, we observed that in 60% of the cases the frontrunners who had a contract priced at 0.60 one day before the election were actually the winners of the election. In 30% of the cases, the candidates who had a contract priced at 0.30 one day before the election were actually the winners of the election, and so on."
A Bayesian would criticize such an approach, especially when the sample size is small, and would point to the need for an initial belief function, which should be updated as information signals come from the market. Interestingly enough, the two approaches tend to agree as the number of samples grows without bound, but in practice such large samples are rarely available.
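
A rough sketch of the difference, assuming the Bayesian encodes the initial belief as a Beta prior over the win rate of contracts priced at 0.60 (my choice of a standard conjugate prior for illustration, not something the argument above depends on). The prior strength and the win/loss counts are invented.

```python
def frequentist_estimate(wins, losses):
    """Observed win frequency: the Frequentist's answer."""
    return wins / (wins + losses)


def beta_posterior_mean(prior_a, prior_b, wins, losses):
    """Posterior mean of a Beta(prior_a, prior_b) belief after seeing the data."""
    return (prior_a + wins) / (prior_a + prior_b + wins + losses)


# Assumed prior: mean 0.6 (the contract price), worth 10 pseudo-observations.
prior_a, prior_b = 6.0, 4.0

# Small hypothetical sample: five contracts priced at 0.60, all winners.
small_freq = frequentist_estimate(wins=5, losses=0)
small_bayes = beta_posterior_mean(prior_a, prior_b, wins=5, losses=0)

# Very large hypothetical sample with the same outcome.
large_freq = frequentist_estimate(wins=5000, losses=0)
large_bayes = beta_posterior_mean(prior_a, prior_b, wins=5000, losses=0)

print(f"n=5:    frequentist {small_freq:.3f}, Bayesian {small_bayes:.3f}")
print(f"n=5000: frequentist {large_freq:.3f}, Bayesian {large_bayes:.3f}")
```

With five observations the two answers differ noticeably, because the prior still carries most of the weight; with thousands of observations the prior is swamped and the two estimates nearly coincide.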

I just could not help but notice that the fight between the proponents and opponents of prediction markets is reminiscent of the battles between Bayesians and Frequentists :-)