Friday, July 20, 2012

On Retention Rates

I spent this week in Suncadia, a small resort near Seattle, in the amazing workshop on Crowdsourcing Personalized Education, organized by Dan WeldMausamEric Horvitz, and Meredith Ringel Morris. (The  slides should be available online soon.) It was an amazing workshop, and the quality of the projects that were presented was simply exceptional.

Beyond the main topic of the workshop (online education and crowdsourcing), I noticed one measure of success being mentioned by multiple projects: RetentionRetention is typically defined as the number of users that remain active, compared to the total number of users.

How exactly to define retention is a tricky issue. What is the definition of an "active" user? What is the total number of users? You can manipulate the number to give you back something that looks nice. 

For example, for online courses many people list the number of registered participants as users (e.g, 160,000 students enrolled for the AI class). Then, if you take the 22,000 students that graduated as active, you get a retention rate of 13.75%. 

Of course, if you want to make the number higher (13.75% seems low) you can just change the definition of what counts as user (e.g., "watched at least one video") and decrease the denominator, or change the definition of active user (e.g., "submitted an assignment") and increase the nominator. 

A relatively common definition is number of users that come back at least once a week, divided by the number of users registered in that time period. At least Duolingo, Foldit, and a few other projects seemed to have a similar definition. With this definition, a number of 20% and above is typically considered successful, as this was also noted to be the retention rate for Twitter.

How to measure retention in online labor?

So, I started wondering what is the appropriate measure of retention for online labor sites. The "come back" at least once every week" is a weak one. We need people to engage with the available tasks.

One idea is to measure percentage of users that earn >X dollars per week. To avoid comparing workers with different lifetimes, it is a good practice to compare users that started at the same time (e.g., the "May 2012 cohort") and see the retention rates stratified by cohort. The problem with the previous metric is that you need to examine it  not only for different cohorts but also for different values of X.

An alternative approach is to examine the "hour worked per week". In that case, we need to examine what percentage of the 40-hour working week is captured by the labor site.

Say that we have 500,000 registered users and we observe that at any given time we have 5,000 of them active on the site. (These are commonly quoted numbers for Mechanical Turk.) What this 1% activity mean?

First, we need to see what a good comparative metric. Suppose that full success is that all 500,000 workers come and work full time. In that case, we can expect an average activity level of (40*50)/(24*365)=22.8% (40 is the total working hours in a week, 50 is the working weeks in a year, and 24*365 is the total number of hours in a year). So an average of 22.8% of activity is the maximum attainable; to keep things simpler, we can say that seeing on average 20% of the users working on the site is perfect.

So, if a site has an average of 1% activity, it is not as bad as it sounds. It means that 1 out of 20 registered users actually work full time on the site.