Sunday, November 18, 2012

How big is Mechanical Turk?

A question that people ask me very often is about the size of Mechanical Turk. How many tasks are being completed on the marketplace every day? What is the transaction volume? Let me give a quick answer: I have no idea. Since Amazon does not release any statistics about the marketplace, it is pretty much impossible to know for sure.

Mechanical Turk Tracker

However, I do have some estimates, mainly by using the data that I have been collecting through the Amazon Mechanical Turk Tracker. For those not familiar with the site, over the last four years we have been crawling the Mechanical Turk site every few minutes, capturing the complete state of the market: what tasks are available, their prices, the number of HITs available, etc.

One feature that we revamped lately is the ability to see the number of tasks that are posted and completed every day. You can check the "Arrivals" tab to see the details.



Estimating HITs posted and completed

How do we estimate the number of tasks that get posted and completed? The estimation is a little bit tricky and not 100% foolproof but it works reasonably well, based on my current observations.

Since we can keep track of the history of a task over time, we can see the changes in the number of available HITs over time. For example, we may observe a task that has the following number of HITs in sequential crawls, over time:
1000...700...500...2000...1000...100...[disappeared]

For this task, we estimate that we have an initial posting of 1000 HITs. Then, we see 1000-700 = 300 HITs completed between the first and second crawls. Then, 700-500 = 200 HITs completed between the second and third crawls. However, between the third and fourth crawls we see a "refill" of 2000-500 = 1500 HITs, which have been posted. Then we see 2000-1000 = 1000 HITs being completed, then 1000-100 = 900 HITs completed, and finally the task disappears and the last 100 HITs are assumed to be completed. This generates a total of 1000+1500 = 2500 HITs posted, and 300+200+1000+900+100 = 2500 HITs completed.
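The bookkeeping above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this post, not the tracker's actual code: the function name `estimate_posted_completed` is made up for the example, and the extra sanity tests that the tracker applies are left out.

```python
def estimate_posted_completed(counts):
    """Estimate HITs posted and completed for one task.

    counts: number of available HITs seen at each crawl, in order.
    Assumption for this sketch: the task disappears after the last
    crawl, and whatever was left is counted as completed.
    """
    if not counts:
        return 0, 0

    posted = counts[0]      # the first observation is the initial posting
    completed = 0

    for prev, curr in zip(counts, counts[1:]):
        diff = curr - prev
        if diff > 0:
            posted += diff      # the count went up: a "refill"
        else:
            completed += -diff  # the count went down: HITs completed

    completed += counts[-1]     # task disappeared: remaining HITs assumed done
    return posted, completed


posted, completed = estimate_posted_completed([1000, 700, 500, 2000, 1000, 100])
print(posted, completed)  # 2500 2500
```

Note that the sketch shares the blind spot of the method itself: if a task gets a refill and some completions between two crawls, only the net change is visible.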

We do have some extra sanity tests but let's consider the current description as sufficient. For the record, I have checked with a few big requesters and my estimated numbers were pretty close to the actual ones, so I feel reasonably confident that I am not off completely.

Analyzing daily volumes

Now, by looking at the current arrivals data, we can see that my tracker estimates approximately \$30K-\$40K of tasks completed per day. Given that I cannot observe redundancy, and that I may miss HITs that are getting posted and completed between my crawls, I may be underestimating. However, I may also be wrong by considering as "completed" tasks that were simply taken down without being done. To be on the safe side, I will put my under-reporting factor somewhere between 1 and 10. In other words, I estimate the real daily volume to be somewhere between \$30K and \$400K. Yes, there is a huge difference between the two, but we get the order of magnitude, and you can be as pessimistic or as optimistic as you want.

These numbers imply a yearly transaction volume for Mechanical Turk between \$10M and \$150M. Given that Mechanical Turk takes 10% to 20% in fees, this means revenue for Amazon of between \$1M (low estimate) and \$30M (high estimate) per year.
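For anyone who wants to redo the arithmetic with their own assumptions, here is a rough sketch. The inputs are the estimates from this post (daily volume of \$30K to \$40K, an under-reporting factor of 1 to 10, fees of 10% to 20%); the function names and the 365-day year are assumptions made for the example.

```python
DAYS_PER_YEAR = 365  # assumption: the market runs at the same pace every day


def yearly_volume_range(daily_low, daily_high, factor_low=1, factor_high=10):
    """Return (low, high) yearly transaction volume in dollars.

    The low end pairs the low daily estimate with the smallest
    under-reporting factor; the high end pairs the high daily
    estimate with the largest factor.
    """
    low = daily_low * factor_low * DAYS_PER_YEAR
    high = daily_high * factor_high * DAYS_PER_YEAR
    return low, high


def fee_revenue_range(volume_low, volume_high, fee_pct_low=10, fee_pct_high=20):
    """Return (low, high) yearly fee revenue, with fees given in whole percent."""
    return volume_low * fee_pct_low // 100, volume_high * fee_pct_high // 100


volume = yearly_volume_range(30_000, 40_000)
print(volume)                      # (10950000, 146000000)
print(fee_revenue_range(*volume))  # (1095000, 29200000)
```

The unrounded results, about \$11M to \$146M in volume and about \$1.1M to \$29M in fees, are what the rounded \$10M-\$150M and \$1M-\$30M ranges stand for.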

What would be the value of Mechanical Turk as a startup?

I love that question. Not because it is sensible, but because I get to be completely tongue-in-cheek and make fun of the absolutely ridiculous P/E ratio for the Amazon stock: Currently the trailing P/E for Amazon is a wonderful 2,681 (yep, not a typo). Assuming that the Mechanical Turk division generates some earnings in the \$1M to \$5M range, the valuation of Mechanical Turk is somewhere between \$2 billion and \$10 billion! Not shabby for a 7-year-old startup :-p.

OK, getting more serious: The price-to-sales ratio for Amazon is somewhere in the 1.75 range. Therefore, given an estimated yearly transaction volume for Mechanical Turk between \$10M and \$150M, the estimated valuation for Amazon Mechanical Turk is somewhere between \$15M (pathetic) and \$250M (respectable).
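Both back-of-the-envelope valuations are a single multiplication, sketched below with the ratios quoted in this post (trailing P/E of 2,681, price-to-sales of about 1.75). The helper name `valuation_range` is made up for the example. The exact products come out somewhat above the rounded figures in the text, which is fine for an order-of-magnitude exercise.

```python
def valuation_range(multiple, base_low, base_high):
    """Apply a valuation multiple to a (low, high) range of a base figure.

    The base is yearly earnings for a P/E multiple, and yearly volume
    (treated as sales in this post) for a price-to-sales multiple.
    """
    return multiple * base_low, multiple * base_high


# Tongue-in-cheek version: trailing P/E of 2,681 on $1M-$5M of earnings.
pe_low, pe_high = valuation_range(2681, 1_000_000, 5_000_000)
print(pe_low, pe_high)  # 2681000000 13405000000

# More serious version: price-to-sales of 1.75 on $10M-$150M of volume.
ps_low, ps_high = valuation_range(1.75, 10_000_000, 150_000_000)
print(ps_low, ps_high)  # 17500000.0 262500000.0
```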

What is the growth?

While I am less certain about the numbers that have to do with the absolute transaction volume, I am much more confident about the growth numbers. Since my methodology has remained the same over time, the growth of the sample should match the growth of the overall market reasonably well.

If you go again to the Arrivals tab on Mechanical Turk Tracker, and change the date range to go back to 2009, you will be able to see how the arrivals and completions have changed over time.


Forget about the absolute numbers. What is very clear is that the last few years were very good for Mechanical Turk. While the numbers were pretty low early on, there was a 3x to 6x YoY growth in terms of transaction volume. This was really healthy.

One thing that puzzles me is what happened around March 2012. My tracker seems to detect a sudden stop in the growth. I am not quite sure what is going on there. Is there something wrong with my crawler? Did something change on the Mechanical Turk site that caused a lower rate of completed jobs? I noticed, for example, that Amazon now sets the "Masters" qualification as a default option for all the HITs posted through the web interface. This can definitely decrease the rate of completing jobs, but I am sure that it will also increase the overall level of satisfaction of the requesters with the answers submitted by the Turkers. Anyhoo, I do not have enough information, so I do not want to overanalyze that part.

Conclusion

Mechanical Turk is an interesting experiment for Amazon. It is not clear how important the project is to the rest of the company, or how much Jeff Bezos supports the effort after all these years. But Bezos is well-known for planning for the long term, and my (imperfect) statistics tend to confirm (tentatively) that the market is on a good path.

Let's see how things play out...