Friday, May 25, 2012

The Emergence of Teams in Online Work

When I started as an assistant professor at the NYU/Stern Business School, back in 2004, I found myself in a strange position. I had funding to spend but no students to work with. I had work to be done (mainly writing crawlers) that was time-consuming but not particularly novel or intellectually rewarding. Semi-randomly, at around the same time, I heard about the website Rent-A-Coder, which undergraduate students were using to "outsource" their programming assignments. I started using Rent-A-Coder, tentatively at first, to get programming tasks done. Over time, I became fascinated by the concept of online work and by the ability to hire people online and get things done. (My Mechanical Turk research and my current appointment at oDesk are a natural evolution of these interests.)

As I started completing increasingly complicated projects using remote contractors, I started thinking about how we can best manage a diverse team of remote workers, each in a different location, working on different tasks, etc. The topic raises many interesting questions, both in terms of theory and in terms of developing practical "best practices" guidelines.

While trying to better understand the theoretical problems that arise in this space, I read the paper "Online Team Formation in Social Networks," published at WWW 2012. The paper describes a technique for identifying teams of people in a social network (i.e., a graph) who have complementary skills and can form a well-functioning unit, and it tries to do so while respecting workload restrictions for individual workers.

Given my personal experience, from the practical side, and the existence of research papers that deal with the topic, I got curious to understand whether the topic of online team formation is a fringe topic, or something that deserves further attention.

Do we see teams being formed online? If yes, is this phenomenon growing in significance?

So, I pulled the oDesk data and tried to answer the question.

How many teams have a given size? How does this distribution evolve over time? For each week, I plotted the number of projects that had x active contractors (i.e., contractors who billed some time on the project that week).
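As a rough illustration of that counting step, here is a minimal sketch in Python. It is not the query that was run against the oDesk data: the record layout `(week, project_id, contractor_id, hours)` and the function name `team_size_counts` are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def team_size_counts(billing_rows):
    """Count projects per (week, team size).

    billing_rows: iterable of (week, project_id, contractor_id, hours).
    This layout is hypothetical, chosen only for illustration.
    A contractor is "active" in a project-week if they billed hours > 0.
    """
    active = defaultdict(set)
    for week, project_id, contractor_id, hours in billing_rows:
        if hours > 0:
            active[(week, project_id)].add(contractor_id)

    counts = Counter()
    for (week, _project_id), contractors in active.items():
        counts[(week, len(contractors))] += 1
    return dict(counts)


# Made-up sample records, not real data.
rows = [
    ("2012-W20", "p1", "alice", 5.0),
    ("2012-W20", "p1", "bob", 2.5),
    ("2012-W20", "p1", "carol", 0.0),  # billed nothing, so not active
    ("2012-W20", "p2", "dave", 8.0),
    ("2012-W21", "p1", "alice", 3.0),
    ("2012-W21", "p1", "alice", 1.0),  # same contractor, counted once
]
result = team_size_counts(rows)
```

Each key of `result` is a (week, team size) pair, so plotting one line per team size over the weeks gives the kind of chart described here.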

The results were revealing: not only do we observe teams of people being formed online, but we also see an exponential increase in the number of teams of any given size.

In fact, in the above graph, if we account for the fact that bigger teams contain an (exponentially) larger number of people, we can see that the majority of the online workers today are not working as individuals but are now part of an online team.

Update [thanks for the question, Yannis!]: Since the exponential growth makes it difficult to see what fraction of people work in teams and whether that fraction is increasing or decreasing, here is the chart that shows what percentage of workers work in teams of a given size:

What is interesting is the consistent decrease in the fraction of people working alone (teams of one) and in teams of 2-3. Instead, we see a slow but consistent increase in teams of size 4-7 and 8-16, as an overall fraction of the population. As you can see, over the last year, the percentage of contractors in teams of size 4-7 has come close to surpassing the percentage of contractors working alone. Similarly, the percentage of contractors in teams of 8-16 is close to surpassing the percentage of contractors in teams of 2-3. The trends for bigger teams also seem to be increasing, but there is still too much noise to infer anything.
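The step from "number of teams" to "fraction of workers" is a weighting by team size: a project with x active contractors contributes x workers. A minimal Python sketch of that arithmetic follows. The bucket boundaries are the ones named in the text, with everything larger lumped into "17+". The function names and sample numbers are hypothetical, and a contractor active on two projects is counted twice, which the actual analysis may have handled differently.

```python
# Buckets as named in the post; larger teams fall into "17+".
BUCKETS = [(1, 1, "1"), (2, 3, "2-3"), (4, 7, "4-7"), (8, 16, "8-16")]


def bucket_label(size):
    """Map a team size to its bucket label."""
    for low, high, label in BUCKETS:
        if low <= size <= high:
            return label
    return "17+"


def worker_share_by_bucket(projects_by_size):
    """projects_by_size: {team_size: number_of_projects} for one week.

    Returns {bucket_label: fraction of workers in that bucket}.
    """
    workers = {}
    for size, n_projects in projects_by_size.items():
        label = bucket_label(size)
        workers[label] = workers.get(label, 0) + size * n_projects
    total = sum(workers.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in workers.items()}


# Made-up numbers for a single week, not real data.
week = {1: 40, 2: 10, 3: 5, 5: 5}
shares = worker_share_by_bucket(week)
```

In this made-up week, one-person projects are 40 of 60 projects, yet they account for only 40 of 100 workers. That gap is why the per-worker view can look so different from the per-project view.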

What's coming?

Given the trend for online work to be done in teams that are formed online, I expect to see a change in the way many companies are formed in the future. At this point, it seems far-fetched that a startup company can be formed online, be distributed across the globe, and operate on a common project. (Yes, there are such teams, but they are the exception rather than the norm.)

But if these trends continue, expect to see, sooner rather than later, companies naturally hiring online and working with remote collaborators, no matter where the talent is located. People have been talking about online work as an alternative to immigration, but this seemed to be a solution for the remote future.

With the exponential increase that we observe, the future may come much sooner than expected.

Thursday, May 10, 2012

TREC 2012 Crowdsourcing Track

TREC 2012 Crowdsourcing Track - Call for Participation

 June 2012 – November 2012


As part of the National Institute of Standards and Technology (NIST)'s annual Text REtrieval Conference (TREC), the Crowdsourcing track investigates emerging crowd-based methods for search evaluation and/or developing hybrid automation and crowd search systems.

This year, our goal is to evaluate approaches to crowdsourcing high quality relevance judgments for two different types of media:
  1. textual documents
  2. images
For each of the two tasks, participants will be expected to crowdsource relevance labels for approximately 20k topic-document pairs (i.e., 40k labels when taking part in both tasks). In the first task, the documents will be from an English news text corpus, while in the second task the documents will be images from Flickr and from a European news agency.

Participants may use any crowdsourcing methods and platforms, including home-grown systems. Submissions will be evaluated against a gold standard set of labels and against consensus labels over all participating teams.
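To make the two comparison targets concrete, here is a minimal Python sketch. The call does not say how consensus labels are computed or which metric is used, so this assumes a simple majority vote across teams (ties dropped) and plain accuracy. All function names, team names, and labels below are hypothetical.

```python
from collections import Counter


def majority_consensus(labels_by_team):
    """labels_by_team: {team: {pair_id: label}} -> {pair_id: consensus label}.

    Assumption: consensus is a simple majority vote across teams.
    Pairs whose top two labels tie are left out.
    """
    votes = {}
    for team_labels in labels_by_team.values():
        for pair_id, label in team_labels.items():
            votes.setdefault(pair_id, Counter())[label] += 1

    consensus = {}
    for pair_id, counter in votes.items():
        (top_label, top_count), *rest = counter.most_common()
        if rest and rest[0][1] == top_count:
            continue  # tie: no consensus for this pair
        consensus[pair_id] = top_label
    return consensus


def accuracy(submitted, reference):
    """Fraction of shared pairs where the labels agree, or None if none shared."""
    shared = [pair_id for pair_id in submitted if pair_id in reference]
    if not shared:
        return None
    agree = sum(submitted[p] == reference[p] for p in shared)
    return agree / len(shared)


# Made-up binary relevance labels (1 = relevant, 0 = not relevant).
runs = {
    "teamA": {"t1-d1": 1, "t1-d2": 0, "t1-d3": 1},
    "teamB": {"t1-d1": 1, "t1-d2": 1, "t1-d3": 0},
    "teamC": {"t1-d1": 1, "t1-d2": 0},
}
gold = {"t1-d1": 1, "t1-d2": 0, "t1-d3": 0}

consensus = majority_consensus(runs)
acc_vs_consensus = accuracy(runs["teamB"], consensus)
acc_vs_gold = accuracy(runs["teamA"], gold)
```

Here `t1-d3` gets one vote each way, so it has no consensus label and is skipped when scoring against consensus.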

Tentative Schedule

  • Jun 1: Document corpora, training topics (for image task) and task guidelines available
  • Jul 1: Training labels for the image task
  • Aug 1: Test data released
  • Sep 15: Submissions due
  • Oct 1: Preliminary results released
  • Oct 15: Conference notebook papers due
  • Nov 6-9: TREC 2012 conference at NIST, Gaithersburg, MD, USA
  • Nov 15: Final results released
  • Jan 15, 2013: Final papers due


To take part, please register by submitting a formal application directly to NIST (even if you are a returning participant). See the bottom part of the page at

Participants should also join our Google Group discussion list, where all track-related communications will take place.


Organizers

  • Gabriella Kazai, Microsoft Research
  • Matthew Lease, University of Texas at Austin
  • Panagiotis G. Ipeirotis, New York University
  • Mark D. Smucker, University of Waterloo

Further information

For further information, please visit

We welcome any questions you may have, either by emailing the organizers or by posting on the Google Group discussion page.

Saturday, May 5, 2012