Sunday, July 29, 2012

The disintermediation of the firm: The future belongs to individuals

My experience with online outsourcing

I joined the Stern Business School back in 2004. In my first couple of years, my research approach was pretty much a continuation of my PhD years: I was doing a lot of coding and experimentation myself. However, at some point I got tired of writing boring code: crawlers, front-end websites, and other "non-research" pieces of code were not only uninteresting but also a huge drain on my time.

So, I started experimenting with hiring coders, first locally at NYU. Unfortunately, non-research student coders turned out to be a bad choice: they were not experienced enough to write good code, and they were doing the task purely for monetary reasons, not for learning. I got nothing useful out of this, just expensive pieces of crap code.

In the summer of 2005, I started experimenting with online outsourcing. I tried eLance, Guru, and Rent-A-Coder, tentatively posting programming projects that were not conceptually interesting (e.g., "crawl this website and store the data in a CSV file", "grab the CSV data from that website and put them in a database", "create a demo website that does that", etc.).

Quickly, I realized that this was a win-win situation: The code was completed quickly, the quality of the websites was much better than what I could prepare myself, and I was free to focus on my research. Once I started getting PhD students, outsourcing non-research coding requirements became a key part of my research approach: PhD time was too valuable to waste on writing crawlers and dealing with HTML parsing peculiarities.

Seven years of outsourcing: Looking back

Seven years have passed since my first outsourcing experiments, so I thought it was a good time to look back and evaluate.

Across all outsourcing sites (excluding Mechanical Turk), I found that I had posted projects and hired contractors for a total of 235 projects. Honestly, I was amazed by the number, but amortized over the seven years this is just one project every 10 days or so, which is reasonably close to my expectations.

Given the reasonably large number of projects, I thought that I might be able to do some basic quantitative analysis to figure out what patterns led to my own, personal satisfaction with the result. I started coding the projects, adding variables that were both personal (how much did I care about the project? how detailed were the specs? how much did I want to spend?) and contractor-specific (past history, country of origin, communication while bidding, etc.).

Quickly, even before I finished coding, a pattern emerged: all the "exceeded expectations" projects were done by individual contractors or small teams of 2-3 people. All the "disappointments" came from contractors who were employees of a bigger contracting firm.
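For what it's worth, the "coding" here was nothing fancier than tabulating outcomes against project attributes. A minimal sketch in Python, with made-up example rows (my actual variables, labels, and counts differ):

```python
from collections import Counter

# Hypothetical coded projects: (contractor type, outcome).
# The rows below are illustrative, not my actual data.
projects = [
    ("individual", "exceeded expectations"),
    ("individual", "met expectations"),
    ("small team", "exceeded expectations"),
    ("firm employee", "met expectations"),
    ("firm employee", "disappointment"),
]

# Tabulate how often each outcome occurs for each contractor type.
counts = Counter(projects)
for (ctype, outcome), n in sorted(counts.items()):
    print(f"{ctype:15s} {outcome:22s} {n}")
```

Even with a crude cross-tabulation like this, the individual-versus-firm split jumped out immediately.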

In retrospect, it is a matter of incentives: employees do not have the incentive to produce at the maximum of their labor power. In contrast, individuals with their own company produce much closer to their maximum capacity; the contractor-owners are also directly connected to the product of their work, and they are better workers overall.

I would not attribute causality to my observation, but rather self-selection: knowledgeable individuals understand that the bigger firm does not have much to offer them. In the past, the bigger firm fulfilled the role of being visible and, therefore, bringing in projects; the firm also offers a stable salary, but for talented individuals this quickly becomes a stagnating salary.

With the presence of online marketplaces, the need for a big firm to bring in jobs started decreasing. Talented contractors no longer really need a bigger firm to bring them work.

The capable individuals disintermediate the firm.

The emergence of the individual

Although the phenomenon is still in its infancy, I expect the rise of individuals and the emergence of small teams to be an important trend in the next few years. The bigger firms will feel increased pressure from agile teams of individuals that can operate faster and get things done quicker. Furthermore, talented individuals, knowing that they can find good job prospects online, will start putting higher pressure on their employers: either there is a more equitable share of the surplus, or the value-producing individuals will move into their own ventures.

Marx would have been proud: The value-generating labor is now getting into the position of reaping the value of the generated work. Ironically, this "emancipation" is happening through the introduction of capitalist free markets that connect the planet, and not through a communist revolution.

Friday, July 20, 2012

On Retention Rates

I spent this week in Suncadia, a small resort near Seattle, at the workshop on Crowdsourcing Personalized Education, organized by Dan Weld, Mausam, Eric Horvitz, and Meredith Ringel Morris. (The slides should be available online soon.) It was an amazing workshop, and the quality of the projects that were presented was simply exceptional.

Beyond the main topic of the workshop (online education and crowdsourcing), I noticed one measure of success being mentioned by multiple projects: retention. Retention is typically defined as the number of users that remain active, compared to the total number of users.

How exactly to define retention is a tricky issue. What is the definition of an "active" user? What is the total number of users? You can manipulate the number to give you back something that looks nice. 

For example, for online courses many people list the number of registered participants as users (e.g., 160,000 students enrolled in the AI class). Then, if you count the 22,000 students that graduated as active, you get a retention rate of 13.75%.

Of course, if you want to make the number higher (13.75% seems low), you can just change the definition of what counts as a user (e.g., "watched at least one video") and decrease the denominator, or change the definition of an active user (e.g., "submitted an assignment") and increase the numerator.
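To make the denominator/numerator games concrete, here is the arithmetic in Python, using the AI-class figures from above; the "watched at least one video" and "submitted an assignment" counts are hypothetical:

```python
enrolled = 160_000            # registered participants ("users")
graduated = 22_000            # completed the course ("active")

retention = graduated / enrolled
print(f"{retention:.2%}")     # 13.75%

# Shrink the denominator with a stricter "user" definition...
watched_one_video = 100_000   # hypothetical count
print(f"{graduated / watched_one_video:.2%}")      # 22.00%

# ...or grow the numerator with a looser "active" definition.
submitted_assignment = 40_000 # hypothetical count
print(f"{submitted_assignment / enrolled:.2%}")    # 25.00%
```

Same course, same students, three very different "retention rates."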

A relatively common definition is the number of users that come back at least once a week, divided by the number of users that registered in that time period. At least Duolingo, Foldit, and a few other projects seemed to use a similar definition. With this definition, a rate of 20% or above is typically considered successful, as this was also noted to be the retention rate of Twitter.

How to measure retention in online labor?

So, I started wondering what the appropriate measure of retention for online labor sites would be. "Comes back at least once every week" is a weak one; we need people to engage with the available tasks.

One idea is to measure the percentage of users that earn more than X dollars per week. To avoid comparing workers with different lifetimes, it is good practice to compare users that started at the same time (e.g., the "May 2012 cohort") and examine the retention rates stratified by cohort. The problem with this metric is that you need to examine it not only for different cohorts but also for different values of X.
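A minimal sketch of cohort-stratified retention under an earnings threshold; the worker rows and the helper name are hypothetical:

```python
from collections import defaultdict

# Hypothetical worker rows: (worker id, starting cohort, earnings this week).
workers = [
    ("w1", "2012-05", 120.0),
    ("w2", "2012-05", 0.0),
    ("w3", "2012-05", 45.0),
    ("w4", "2012-06", 300.0),
    ("w5", "2012-06", 10.0),
]

def retention_by_cohort(rows, x):
    """Fraction of each cohort earning more than x dollars this week."""
    totals, active = defaultdict(int), defaultdict(int)
    for _, cohort, earnings in rows:
        totals[cohort] += 1
        if earnings > x:
            active[cohort] += 1
    return {c: active[c] / totals[c] for c in totals}

# The rate shifts as X changes -- the weakness noted above.
print(retention_by_cohort(workers, x=20))
print(retention_by_cohort(workers, x=100))
```

Running it with different values of X shows exactly why the metric needs to be examined across both cohorts and thresholds.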

An alternative approach is to examine the "hours worked per week". In that case, we need to examine what percentage of the 40-hour working week is captured by the labor site.

Say that we have 500,000 registered users and we observe that, at any given time, 5,000 of them are active on the site. (These are commonly quoted numbers for Mechanical Turk.) What does this 1% activity mean?

First, we need to see what a good comparative metric would be. Suppose that full success means that all 500,000 workers come and work full time. In that case, we can expect an average activity level of (40*50)/(24*365) = 22.8% (40 is the number of working hours in a week, 50 is the number of working weeks in a year, and 24*365 is the total number of hours in a year). So an average activity of 22.8% is the maximum attainable; to keep things simple, we can say that seeing, on average, 20% of the users working on the site is perfect.

So, if a site has an average of 1% activity, it is not as bad as it sounds: it means that 1 out of 20 registered users effectively works full time on the site.
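The arithmetic above, spelled out in Python:

```python
# Full-time ceiling: the maximum average fraction of registered users
# that can be active at any instant, if everyone worked full time.
hours_per_week = 40
weeks_per_year = 50
hours_per_year = 24 * 365

ceiling = (hours_per_week * weeks_per_year) / hours_per_year
print(f"ceiling: {ceiling:.1%}")          # 22.8%

# Observed activity (the commonly quoted Mechanical Turk numbers).
registered = 500_000
active_now = 5_000
observed = active_now / registered        # 1%

# Against the rounded 20% ceiling, 1% activity corresponds to roughly
# 1 in 20 users effectively working full time.
print(f"share of ceiling: {observed / 0.20:.0%}")
```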

Friday, July 13, 2012

Why is oDesk called oDesk?

[This post has been removed after a request. You will need to stay in the dark until the Singularity. Or may be not.]

Monday, July 9, 2012

Discussion on Disintermediating a Labor Channel

Last Friday, I wrote a short blog post with the title "Disintermediating a Labor Channel: Does it Make Sense?" where I argued that trying to bypass a labor channel (Mechanical Turk, oDesk, etc) in order to save on the extra fees does not make much sense.

Despite the fact that there was no discussion in the comments, that piece seemed to generate a significant amount of feedback across various semi-private channels (Facebook, Google Plus, Twitter) and in many real-life discussions.

Fernando Pereira wrote on Google Plus:
Your argument sounds right, but I'm wondering about quality: can I control quality/biases in the outside labor platform? How do I specify labor platform requirements to meet my requirements? It could be different from quality control for outsourced widgets because outsourced labor units might be interdependent, and thus susceptible to unwanted correlation between workers?
Another friend wrote in my email:
So, do you advocate that oDesk should be controlling the process? Actually, I'd rather have higher control over my employees and know who is doing what.
Both questions have a similar flavor, which indicates that I failed to express my true thoughts on the issue.

I do not advocate giving up control of the "human computation" process. I advocate letting a third-party platform handle the "low-level" recruiting and payment of the workers, preferably through an API-fied process. Payments, money-transfer regulations, and immigration are big tasks that are best handled by specialized platforms; they are too much for most other companies. Handling such things on your own is about as interesting as handling air conditioning, electricity supply, and failed disks and motherboards when you are building a software application: let someone else do these things for you.

One useful classification, I think, will clarify my argument further. Consider the different "service models" for crowdsourcing, which I have adapted from the NIST definition of cloud services:
  • Labor Applications/Software as a Service (LSaaS). The capability provided to the client is to use the provider’s applications running on a cloud-labor infrastructure. [...] The client does not manage or control the underlying cloud labor, with the possible exception of limited user-specific application configuration settings. Effectively, the client only cares about the quality of the provided results of the labor and does not want to know about the underlying workflows, quality management, etc. [Companies like CastingWords and uTest fall into this category: They offer a vertical service, which is powered by the crowd, but the end client typically only cares about the result]
  • Labor Platform as a Service (LPaaS). The capability provided to the client is to deploy onto the labor pool consumer-created or acquired applications created using programming languages and tools supported by the provider. The client does not manage or control the underlying labor pool, but has control of the overall task execution, including workflows, quality control, etc. The platform provides the necessary infrastructure to support the generation and implementation of the task execution logic. [Companies like Humanoid fall into this category: Creating a platform for other people to build their crowd-powered services on top.]
  • Labor Infrastructure as a Service (LIaaS). The capability provided to the client is to provision labor for the client, who then allocates workers to tasks. The consumer of labor services does not get involved with the recruiting process or the details of payment, but has full control over everything else. Much like the Amazon Web Services approach (use EC2, S3, RDS, etc. to build your app), the service provider just provides raw labor and guarantees that the labor force satisfies a particular SLA (e.g., response time within X minutes, has the skills that are advertised in the resume, etc.). [Companies like Amazon Mechanical Turk, oDesk, etc. fall into this category]
From these definitions, I believe that it does not make sense to build your own "infrastructure" if you are going to rely on remote workers. (I have a very different attitude for creating an in-house, local, team of workers that provides the labor, but this gets very close to being a traditional temp agency, so I do not treat this as crowdsourcing.)

I have not formed an opinion on the "platform as a service" or "software as a service" models (yet).

For the software as a service model, I think it is up to you to decide whether you like the output of the system (transcription, software testing, etc). The crowdsourcing part is truly secondary.

For the platform as a service model, I do not have enough experience with existing offerings to know whether to trust the quality assurance scheme. (Usual cognitive bias of liking-best-what-you-built-yourself applies here.) Perhaps in a couple of years, it would make no sense to build your own quality assurance scheme. But at this point, I think that we are all still relying on bespoke, custom-made schemes, with no good argument to trust a standardized solution offered by a third-party.

Friday, July 6, 2012

Disintermediating a Labor Channel: Does it Make Sense?

Over the years, I have talked with plenty of startups on building crowdsourcing services and platforms. Mechanical Turk (for the majority) and oDesk (for the cooler kids :-p) are common choices for recruiting workers. (For a comparison of the two, based on my personal experiences, look here.)

A common aspiration of many startups is to build their own labor force and channel. Through Facebook, through cell phones, through ads: everyone wants to have direct control of the labor.

My reaction: This is stupid!

(Usual disclaimer that I work for oDesk for this year, etc., applies here, but I will stand behind my opinion even without any relationship to any labor marketplace.)

A very short-sighted reason for this is cost savings: oDesk and Mechanical Turk charge a 10% fee, so by disintermediating the labor platform, the company can save 10% of the labor cost. Well, to make the adjustment immediately: you do not save 10%. You save at most 7%; the other 3% is the fee taken by the payment channel (credit card, PayPal, etc.). Pushing that cost onto the worker by using PayPal is not true savings. For foreign workers, you also take a 1%-2% hit when using PayPal or a credit card, which comes on top of the best available FX rates. Add the extra overhead of handling fraud, mistakes, and other things-that-happen, and the true savings are at most 5% to 6%.
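The fee arithmetic from the paragraph above, using its own numbers (the 1.5% FX figure is just the midpoint of the 1%-2% range):

```python
platform_fee = 0.10   # oDesk / Mechanical Turk fee you hope to save
payment_fee = 0.03    # credit card / PayPal processing fee
fx_hit = 0.015        # ~1-2% FX cost when paying foreign workers

true_savings = platform_fee - payment_fee - fx_hit
print(f"true savings before overhead: {true_savings:.1%}")  # 5.5%
# Fraud, mistakes, and other things-that-happen eat into the rest,
# leaving at most 5-6% in practice.
```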

But even 5%, isn't that something worth saving? No.

Problem #1: If you are a small startup, saving 5% in labor costs should not be the goal. Just the cost of developing, managing, and handling complaints about payments is going to cost far more in development time than the corresponding savings. Creating a payment network is typically not at the core of a crowdsourcing startup, and it should not be. Let others deal with the payments, and build your product.

Problem #2: If you are a bigger company, saving 5% in labor costs may be more important. However, if we are talking about labor, then bigger companies start hitting compliance issues. Handling money-laundering regulations, IRS regulations, and many other HR-related aspects typically costs more than the extra 5% is worth. Who wants to be in the HR business when their product is doing something else?

So, why do people still obsess about this? Why does everyone want to build their own labor platform?

Well, because VCs ask for it: "If you are building on top of MTurk/oDesk/whatever, what is your competitive advantage? What prevents others from duplicating what you have done?"

The knee-jerk reaction to this demand from VCs is to build a bespoke labor network. This works fine as long as you are talking about a relatively small network. Once the labor force grows bigger, other problems appear: identity verification, compliance, regulations, and immigration are all time-consuming tasks (especially when dealing with foreign contractors). And they are never tasks that add value to the company. They are pure overhead, and solving such issues is absolutely non-trivial.

Do you think it is accidental that Amazon does not pay MTurk contractors outside India and the US in cash? Having seen from the inside at oDesk what the overhead is to build reliable and compliant solutions for handling international payments, I can easily say: stay away. This is not something you want to do at scale, having to deal with bureaucrats from all the different countries around the world.

The parallel with building your own data centers vs. getting computing resources from the cloud is direct and should be evident. Unless there is a very good reason to handle your own machines (and space, and air conditioning, and electrical failures over the summer, etc.), you just build your infrastructure on the cloud. Same thing with labor.

Allocating resources to handle overhead tasks takes away resources from the main goal: building a better product! Let others take care of infrastructural issues.

Monday, July 2, 2012

Visualizations of the oDesk "oConomy"

[Crossposted from the oDesk Blog. Blog post written together with John Horton.]

A favorite pastime of the oDesk Research Team is to run analyses using data from oDesk’s database in order to provide a better understanding of oDesk’s online workplace and the way the world works. Some of these analyses were so interesting we started sharing them with the general public, and posted them online for the world to see.

Deep inside, however, we were not happy with our current approach. All our analyses and plots were static. We wanted to share something more interactive, using one of the newer javascript-based visualization packages. So, we posted a job on oDesk looking for d3.js developers and found Zack Meril, a tremendously talented Javascript developer. Zack took our ideas and built a great tool for everyone to use:

The oDesk Country Dashboard

This dashboard allows you to interactively explore the world of work based upon oDesk’s data. We list below some of our favorite discoveries from playing with its visualizations. Do let us know if you find something interesting. Note that the tool supports “deep linking,” which means that the URL in your address bar fully encodes the view that you see.

Visualization #1: Global Activity

The first interactive visualization shows the level of contractor activity of different countries across different days of the week and times of day. The pattern seems pretty “expected”:

On second thought, though, we started wondering: why do we see such regularity? The x-axis is GMT time. Given that oDesk is a global marketplace, shouldn't the contractor activity be smoother? Furthermore, oDesk has relatively few contractors from Western Europe, so it seems rather strange that our contractor community generally follows the waking and sleeping patterns of the UK. Investigating closer, if you hover over the visualization, you get a closer look at what contractors are doing throughout the world:

At 8am GMT on Wednesday morning: Russia, India, and China are awake and their activity is increasing.

As we move towards the peak of global activity at 3pm, the activity of the Asian countries has already started declining. However, at the same time, North and Latin America start waking up, compensating for the decrease in activity in Asia and leading to the world peak.

After 4pm GMT, Asia starts going to sleep, and the activity decreases. The activity continues to decline as America signs off, hitting the low point of activity at 4am GMT (but notice how China, Philippines, and Australia start getting active, preventing the activity level from going to zero).

Visualization #2: Country-Specific Activity

A few weeks back, we also wrote about the rather unusual working pattern of the Philippines: contractors there tend to keep a schedule that mostly follows U.S. working hours, rather than a "normal" 9-5 day. Since then, we have realized that the Philippines is not the only country following this pattern; for example, Bangladesh and Indonesia have similar activity patterns. So, we thought, why not make it easy to explore and find such working patterns? They reveal something about the culture, habits, and even the type of work that gets done in these countries. A few findings of interest:

Visualization #3: Work Type By Country

Finally, we wondered “What are the factors that influence these working patterns?” Why do some culturally similar countries have very similar working patterns (e.g., Russia and Ukraine), while others have very different patterns (e.g., Pakistan, Bangladesh, and India)? So, with our third visualization we examine types of work completed on oDesk broken down by country. We used the bubble chart from d3.js to visualize the results. Here is, for example, the breakdown for U.S.:

U.S. contractors mainly work on tasks related to writing. We see many clients explicitly limit their search for writing contractors to U.S.-based ones, both for English proficiency and (perhaps more importantly) for the cultural affinity of the writers to their audience. Now take a look at Russia: almost all the work done in Russia is web programming and design, followed by mobile and desktop development.

At the opposite end is the Philippines, where few programming tasks are being completed, but significant amounts of data entry, graphic design, and virtual assistant work happen:

Another interesting example is Kenya. As you can see, most of the work done there (and there is a significant amount of work done in Kenya) is about blog and article writing:

Exploring Further: Activity Patterns and Types of Projects

One pattern that was not directly obvious was the correlation between activity patterns and type of work. Countries that engage mainly in computer programming tend to have a larger fraction of users that stick with oDesk. For example, see the similarity in the activity patterns of Bolivia, Poland, Russia, and Ukraine, and the corresponding project types that get completed in these countries:

We should note, however, that the opposite does not hold: there are other countries with similar activity patterns and a high degree of contractor stickiness (e.g., Argentina, Armenia, Bolivia, Belarus, China, Uruguay, and Venezuela) that have rather different distributions of completed project types.

Source available on Github

One thing that attracted me to spend my sabbatical at oDesk was the fact that oDesk has been pretty open with its data from the beginning. To this end, you will notice that the Country Explorer is an open source project, so you are welcome to just fork us on Github and get the code for the visualizations.

New ideas and visualizations

I am thinking of what other types of graphs would be interesting to create. Supply and demand of skills? Asking prices and transaction prices of contractors across countries and across skills? Of course, if you have specific ideas you’d like to see us work on, tell us in the comments! Happy to explore directions and data that you are interested in exploring.