Sunday, November 18, 2012

How big is Mechanical Turk?

A question that people ask me very often is about the size of Mechanical Turk. How many tasks are being completed on the marketplace every day? What is the transaction volume? Let me give a quick answer: I have no idea. Since Amazon does not release any statistics about the marketplace, it is pretty much impossible to know for sure.

Mechanical Turk Tracker

However, I do have some estimates, mainly by using the data that I have been collecting through the Amazon Mechanical Turk Tracker. For those not familiar with the site, over the last four years, we are crawling the Mechanical Turk site every few minutes and we capture the complete state of the market: What tasks are available, their prices, the number of HITs available, etc.

One feature that we revamped lately is the ability to see the number of tasks that are posted and completed every day. You can check the "Arrivals" tab to see the details.

Estimating HITs posted and completed

How do we estimate the number of tasks that get posted and completed? The estimation is a little bit tricky and not 100% foolproof but it works reasonably well, based on my current observations.

Since we can keep track of the history of a task over time, we can see the changes in the number of available HITs over time. For example, we may observe a task that has the following number of HITs in sequential crawls, over time:

For this task, we estimate that we have an initial posting of 1000 HITs. Then, we see 1000-700 = 300 HITs completed between the first and second crawl. Then, 700-500=200 HITs completed between the second and third crawls. However, between the third and fourth crawl we see a "refill" with 2000-500=1500 HITs, which have been posted. Then we see 2000-1000 = 1000 HITs being completed, then 1000-100=900 HITs completed, and finally the task disappears and the last 100 HITs are assumed to be completed. This generates a total of 1000+1500 HITs posted, and 300+200+1000+900+100 HITs completed.

We do have some extra sanity tests but let's consider the current description as sufficient. For the record, I have checked with a few big requesters and my estimated numbers were pretty close to the actual ones, so I feel reasonably confident that I am not off completely.

Analyzing daily volumes

Now, by looking at the current arrivals data, we can see that my tracker estimates approximate \$30K-\$40K of tasks completed per day. Given that I cannot observe redundancy, and that I may miss HITs that are getting posted and completed between my crawls, I may be underestimating. However, I may also be wrong by considering as "completed" tasks that were simply taken down, without being done. To be on the safe side, I will put my under-reporting factor somewhere between 1 to 10. In other words, I estimate the real daily volume to be somewhere between \$30K to \$400K. Yes, there is a huge difference between the two, but we get the order of magnitude, and you can be as pessimistic or as optimistic as you want.

These numbers generate a yearly transaction volume for Mechanical Turk between \$10M and \$150M. Given that Mechanical Turk takes 10% to 20% as fees, this is a revenue for Amazon between \$1M (low estimate) to $30M (high estimate) per year.

What would be the value of Mechanical Turk as a startup?

I love that question. Not because it is sensible. But because I get to be completely tongue-in-cheek, and make fun of the absolutely ridiculous P/E ration for the Amazon stock: Currently the trailing P/E for Amazon is a wonderful 2,681 (yep, not a typo). Assuming that the Mechanical Turk division generates some earnings in the \$1M to \$5M range, the valuation of Mechanical Turk is somewhere between \$2 billion to \$10 billion dollars! Not shabby for a 7-year old startup :-p.

OK, getting more serious: The price-to-sales ratio for Amazon is somewhere in the 1.75 range. Therefore, given an estimated yearly transaction volume for Mechanical Turk between \$10M and \$150M, the estimated valuation for Amazon Mechanical Turk is somewhere between \$15M (pathetic) to \$250M (respectable).

What is the growth?

While I am less certain about the numbers that have to do with the absolute transaction volume, I am much more confident about the growth numbers. Since my methodology remained the same over time, the growth of the sample should match reasonably well the growth of the overall market.

If you go again to the Arrivals tab on Mechanical Turk Tracker, and change the date range to go back to 2009, you will be able to see how the arrivals and completions have changed over time.

Forget about the absolute numbers. What is very clear is the last few years were very good for Mechanical Turk. While the numbers were pretty low early on, there was a 3x to 6x YoY growth in terms of transaction volume. This was really healthy.

One thing that puzzles me is what happened around March 2012. My tracker seems to detect a sudden stop in the growth. I am not quite sure what is going on there. Is there something about my crawler? Did something change on the Mechanical Turk site that caused a lower rate of completed jobs? I noticed for example, that now Amazon puts the "Masters" qualification as a default option for all the HITs posted through the web interface. This can definitely decrease the rate of completing jobs but I am sure that it will also increase the overall level of satisfaction of the requesters with the answers submitted by the Turkers. Anyhoo, I have not enough information, so I do not want to try to overanalyze that part.


Mechanical Turk is an interesting experiment for Amazon. It is not clear how important is the project for the rest of the company and how much Jeff Bezos supports the effort after all these years. But Bezos is well-known for planning for the long term, and my (imperfect) statistics tend to confirm (tentatively) that the market is on a good path.

Let's see how things play out...

Tuesday, November 13, 2012

Why I love crowdsourcing (the concept) and hate crowdsourcing (the term)

The term crowdsourcing is in fashion. It is being used to describe pretty much everything under the sun today.

Unfortunately, the word crowdsourcing is also getting increasingly associated with "getting things done for free", or at least at ultra-cheap prices. The "crowd" will generate the content for the website. The "crowd" will fix the mistakes. The "crowd" will do everything, and preferably for "points", for "badges", for a spot on the leaderboard, or may be for a few pennies if we end up using Mechanical Turk.

But this association of the term crowdsourcing with low cost labor, is now visibly turning people off. Everybody wants to "use" the crowd but the workers in the crowd feel stiffed. The NoSpec movement was an early warning. The angry tone of some of the threads in Turker Nation is also an indication that many workers are not very happy with the way that they are treated by some requesters.

However, these negative associations are now endangering a very important concept: The idea that we can structure tasks in a way that are robust to the presence of imperfect workers, and that anyone can participate, as long as there is work available. Well-structured tasks allow the on-the-task evaluation of the workers, and can automatically infer whether someone is a good fit for a task or not.

This is not insignificant. It is well-known that one of the biggest barriers for breaking into the workforce is to have prior relevant experience. Students today often beg to get unpaid internships, just to have in their resume the lines with the coveted work experience. In online labor markers, newcomers often bid lower than what they would accept normally, just to build their feedback history. Crowdsourcing can change that.

But as long as crowdsourcing gets associated with low wages, nobody will see the real benefit: That work is within reach, immediately. That someone can experiment with different types of work easily (stock trading? product design?).

Perhaps a new term can describe better the true value of crowdsourcing, and also get the stigmatizing term "crowd" out of the name. (Nobody wants to be part of a "crowd".)

Personally, I favor the term "open work". As in the case of "open access" and "open source software", it describes the opportunity to access work, without barriers. I also like the "fair trade work" motto from MobileWorks but this is more closely connected to work being offered to developing countries. But I think that "open work" captures better the essence of the advantages behind crowdsourcing.

Update: The term open is indeed also associated with free-as-in-beer consumption. However, open can refer both to the supply-side (production) and the demand-side (consumption). For example:

  • Linux is open, in the sense that anyone can take the source code, modify it, and contribute back (open production); open source software is also available, often, for free, for installation to any machine (open consumption). 
  • In publishing, open access typically means accessing papers without paying (open consumption), but there are also journals (e.g., PLoS ONE) that accept pretty much any technically-valid paper (open production).
In the case of crowdsourcing, "open work" would refer mainly to the open production side. As in the production side of open source, and open access publishing, it does not mean that the participants are not paid for the generation of the artifacts.

What do you think?