Monday, February 29, 2016

A Cohort Analysis of Mechanical Turk Requesters

In my last post, I examined the number of "active requesters" on Mechanical Turk, and concluded that there is a significant decline in the numbers over the last year. The definition of "active requester" was: "A requester is active at time X if he has a HIT running at time X". A potential issue with this definition is that an improvement in the speed of HIT completion (e.g., due to increased labor supply) could drive down that number.

For this reason, I decided to perform a proper cohort analysis for the requesters on Mechanical Turk.  In the cohort analysis that follows, we will examine how many requesters that have first appeared in the platform on a given month (say September 2015), are still posting tasks in the subsequent months.

Here is the resulting "layer cake plot" that indicates that happens in each cohort. Each of the layers corresponds to requesters that were first seen on a given month. (code, data) (Read this post, if want a  little bit more background on how the plot should "look like".)

For example, the bottom layer corresponds to all the requesters that were first seen on May 2014 (the first month that the new version of MTurk Tracker started collecting data). We can see that we had ~2700 "new" requesters on that month. (The May-2014 cohort obviously contains all prior cohorts in our dataset, as we do not know when these requesters really started posting.) Out of these requesters, approximately 1700 also posted a task on June 2014 or later, approximately 1000 of these have posted a task on March 2015 or later, and approximately 500 have posted a task on February 2016.

The layer on top (slightly darker blue) illustrates the evolution of the June 2014 cohort. By stacking them on top of each other, we can see the composition of the requesters that have been active in every single month.

As the plot makes obvious, until March 2015, the acquisition of new requesters every month was compensating for the requesters that were lost from the prior cohorts. However, starting March 2015, we start seeing a decline in the overall numbers, as the total decline in requesters from prior cohorts dominates the acquisition of new requesters. So, the cohort analysis supports the conclusions of the prior post, as the trends and conclusions are very similar (always good to have a few robustness checks).

Of course, a more comprehensive cohort analysis would also analyze the revenue generated by each cohort, and not just the number of active users. That requires a little bit more digging in the data, but I will do that in a subsequent post.

Friday, February 26, 2016

The Decline of Amazon Mechanical Turk

It seems that after years of neglect, Mechanical Turk starts losing its appeal. In our latest measurement, we see Mechanical Turk losing 50% of its requesters in a YoY measurement.

A few days ago,  Kristy Milland (aka SpamGirl) asked me if there is a way to see the active requesters on Mechanical Turk over time. I did not have this dashboard on Mechanical Turk tracker, but it was an important metric, so I decided to add it in the MTurk Tracker website.

So, now MTurk Tracker has a tab called "Active Requesters" which shows how many requesters are "active" on Mechanical Turk at any given time. The definition of "Active at time X" means "had a task that was running on MTurk before time X and after time X".

Here is the chart for the active requesters between Jan 1, 2015 and February 28, 2016: 

As you can see, starting March 2016 (that is before the announcement of price increases), we see a decline in the number of active requesters. Interestingly, when the fee increases are announced, we see a small "valley" around the period of fee increases. The numbers remain stable until November, but after that we see a steady decline.

Overall, we observe a YoY decline of almost 50% in terms of active requesters.

What is driving the decline? Hard to tell. Perhaps requesters abandon crowdsourcing in favor of more automated solutions, such as deep learning. Perhaps requesters with long running jobs build their own workforce (eg using UpWork). Perhaps they use alternative platforms, such as Crowdflower. Or perhaps my own metric is flawed, and I need to revise it.

But, unless we have a bug in the code, the future does not seem promising for Mechanical Turk. And this is a shame.