Monday, June 20, 2011

Crowdsourcing and the discovery of a hidden treasure

A few months back, I started advising Tagasauris, a company that provides media annotation services using crowdsourcing.

This month, Tagasauris is featured in a Wired article titled "Hidden Treasure". It is the story of rediscovering a "lost" set of photos from the shooting of the movie "American Graffiti". You can see the article by clicking the image:
Hidden Treasure

Rediscovered: Never before seen American Graffiti photos in the Magnum archive.

IN MARCH, the Magnum photo agency stumbled onto a remarkable find: Nearly two dozen lost photos from the set of American Graffiti. The images feature pre-Star Wars George Lucas as well as cast members like Richard Dreyfuss, Mackenzie Phillips, and Ron Howard, and they offer an unparalleled look at the making of the 1973 film. So where did Magnum discover these gems? In its own archive. Magnum had hired Tagasauris, a company that tags photos using Amazon Mechanical Turk workers, to add keywords to hundreds of thousands of untagged images. When those workers came across the Graffiti photos, they quickly identified the actors, scenes, and other image details. Magnum originally hoped the phototagging would improve its archive's searchability, which it has, but the agency was also thrilled that the initiative unearthed such an incredible trove - images that visually resurrect an American classic.

The story has some interesting aspects that go beyond the simple "tag using MTurk" narrative, so I would like to give a few more details.

Magnum Photos

One of the clients of Tagasauris is Magnum Photos, a cooperative owned by its photographer members and set up to handle the commercial side of their work. Its members include photographers such as Robert Capa, Henri Cartier-Bresson, David Seymour, George Rodger, Steve McCurry, and many others. (See their Wikipedia entry for further details.) A few photos in the Magnum Photos archive that you may recognize:


One of my favorite parts of the Magnum website is the Archival Calendar, where they have a set of photos showcasing various historic events. Beats Facebook browsing by a wide margin. But let's get back to the story.

The problem

So, what is the problem of Magnum Photos? The same problem that almost every single big media company faces: a very large number of media objects without useful, descriptive metadata. No keywords, no description, nothing to aid the discovery process. Just the image file and mechanical data about film number etc. (Well, my own photo archive looks very similar...)

This lack of metadata is the case not only for the archive but also for the new, incoming photos that arrive every day from its members. (To put it mildly, photographers are not exactly eager to sit, tag, and describe the hundreds of photos they shoot every day.) This means that a large fraction of the Magnum Photos archive, which contains millions of photos, is virtually unsearchable. The photos are effectively lost in the digital world, even though they are digitized and available on the Internet.

One example of such "lost" photos is a set from the shooting of the movie "American Graffiti". People at Magnum Photos knew that one of their photographers, Dennis Stock, who died in 2010, was on set during the production of the movie and had taken photos of the then young and unknown members of the team. Magnum Photos had no idea where these photos were. They knew they had digitized the archive of Dennis Stock, and they knew that the photos were in the archive, but nobody could locate them among the millions of other untagged photos.

For those unfamiliar with the movie, American Graffiti is a 1973 film by George Lucas (pre-Star Wars), starring the then unknown Richard Dreyfuss, Ron Howard, Paul Le Mat, Charles Martin Smith, Cindy Williams, Candy Clark, Mackenzie Phillips, and Harrison Ford. The later rise to stardom of so many of its actors has made the movie almost a cult classic.

The Magnum Photos archive is a trove of similar "hidden treasures". Sitting there, waiting for some accidental, serendipitous discovery.

The tagging solution and the machine support

Magnum Photos had its own set of annotators. However, the annotators could not even keep up with the volume of incoming photos. Going back and annotating the archive was even more daunting. This meant lost revenue for Magnum Photos: if you cannot find a photo, you cannot license it, and you cannot sell it.

Tagasauris proposed to solve the problem using crowdsourcing. With hundreds of workers working in parallel, it became possible to tame the influx of untagged incoming photos, and start going backwards and tagging the archive.

Of course, vanilla photo tagging is not a solution. Workers type misspelled words (named entities are systematic offenders), try to get away with generic tags, etc. Following the lessons learned from the ESP Game and all the subsequent studies, Tagasauris built solutions for cleaning the tags, rewarding specificity, and, in general, ensuring high quality in the noisy tagging process.
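To make the idea concrete, here is a minimal sketch of ESP Game style tag cleaning in Python. This is not the Tagasauris code. I am assuming a very simple rule: a tag is kept only if at least two different workers typed it independently, and a small, made-up list of generic tags earns no credit. The names GENERIC_TAGS, normalize, and agreed_tags are hypothetical.

```python
from collections import defaultdict

# Hypothetical list of generic tags that earn no credit
# (similar in spirit to the "taboo" words of the ESP Game).
GENERIC_TAGS = {"photo", "picture", "image", "man", "woman", "people"}


def normalize(tag):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(tag.lower().split())


def agreed_tags(submissions, min_workers=2):
    """Keep tags typed independently by at least `min_workers` distinct workers.

    `submissions` maps a worker id to the list of raw tags that worker typed.
    Generic tags are dropped even when many workers agree on them.
    """
    workers_per_tag = defaultdict(set)
    for worker, tags in submissions.items():
        for tag in tags:
            workers_per_tag[normalize(tag)].add(worker)
    return sorted(
        tag
        for tag, workers in workers_per_tag.items()
        if len(workers) >= min_workers and tag not in GENERIC_TAGS
    )


# Made-up submissions from three workers for a single photo.
submissions = {
    "w1": ["Ron Howard", "man", "car"],
    "w2": ["ron  howard", "man", "diner"],
    "w3": ["car", "Man", "night"],
}
print(agreed_tags(submissions))
```

Here "man" is dropped even though all three workers typed it, while the specific tag "ron howard" survives because two workers agreed on it. A real system would need something smarter than a fixed list to reward specificity, but the agreement check is the core of the idea.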

A key component was the ability to match the tags entered by the workers with named entities, which themselves were then connected to Freebase entities.
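A rough sketch of what such matching could look like, using fuzzy string matching from the Python standard library (difflib). The entity table and its identifiers are made up for illustration, they are not real Freebase ids, and the function link_tag and the 0.8 similarity cutoff are my own assumptions, not how Tagasauris does it.

```python
import difflib

# Hypothetical entity table: canonical name -> made-up identifier.
ENTITIES = {
    "George Lucas": "entity:george_lucas",
    "Richard Dreyfuss": "entity:richard_dreyfuss",
    "Ron Howard": "entity:ron_howard",
    "Mackenzie Phillips": "entity:mackenzie_phillips",
    "Harrison Ford": "entity:harrison_ford",
}


def link_tag(raw_tag, cutoff=0.8):
    """Return (canonical name, entity id) for the closest entity, or None.

    The comparison is case-insensitive and tolerates small misspellings;
    `cutoff` is the minimum difflib similarity ratio for a match.
    """
    by_lower = {name.lower(): name for name in ENTITIES}
    matches = difflib.get_close_matches(
        raw_tag.lower().strip(), list(by_lower), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    name = by_lower[matches[0]]
    return name, ENTITIES[name]


print(link_tag("Richard Dreyfus"))    # misspelled name still resolves
print(link_tag("mackenzie philips"))  # so does this one
print(link_tag("vintage car"))        # not a known entity
```

Plain string similarity is only a starting point. Telling apart two different people with the same name needs context, which is exactly where the links between entities become useful.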

The result? When workers were tagging the photos from Magnum Photos, they identified the actors in the shots, and the machine process in the background assigned "semantic tags" to the photos, such as [George Lucas], [Richard Dreyfuss], [Ron Howard], [Mackenzie Phillips], [Harrison Ford] and others.

Yes, humans + machines generate things that are better than the sum of the parts.

The machine support, cont.

So, how did the workers discover the photos from American Graffiti? As you may imagine, the workers had no idea that the photos they were tagging were from the shooting of the film. They could identify the actors, but that was it.

Going from actor tagging to understanding the context of the photo shoot is a task that cannot be expected of lay, non-expert taggers. You need experts who can "connect the dots". Unfortunately, subject experts are expensive. And they tend not to be interested in tedious tasks, such as assigning tags to photos.

However, this "connecting the dots" is a task where machines are better than humans. We have recently seen how Watson, by having access to semantically connected ontologies (often generated by humans), could identify the correct answers to a wide variety of questions.

Tagasauris employed a similar strategy. Knowing the entities that appear in a set of photos, it is then possible to identify additional metadata. For example, look at the five actors that were identified in the photos (red boxes, with white background), and the associated semantic graph that links the different entities together:

Bingo! The entity that connects together the different entities is the entity "American Graffiti", which was not used by any worker.

At this point, you can understand how the story evolved. A spreading activation algorithm over the graph suggests the tag, experts verify it, and the rest is history.
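For the curious, here is a toy version of spreading activation in Python. It is only a sketch under my own assumptions: the graph is a tiny hand-built one (not Freebase data), every tagged person starts with one unit of activation, and at each step a node passes a decayed share of its activation to its neighbors. The function names and the decay and steps parameters are hypothetical.

```python
from collections import defaultdict

# Toy graph: each work maps to some of the people connected to it.
WORKS = {
    "American Graffiti": ["George Lucas", "Richard Dreyfuss", "Ron Howard",
                          "Mackenzie Phillips", "Harrison Ford"],
    "Star Wars": ["George Lucas", "Harrison Ford"],
    "Jaws": ["Richard Dreyfuss"],
    "Happy Days": ["Ron Howard"],
}


def build_graph(works):
    """Build an undirected graph linking each work to its people."""
    graph = defaultdict(set)
    for work, people in works.items():
        for person in people:
            graph[work].add(person)
            graph[person].add(work)
    return graph


def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Each seed starts with activation 1.0; at every step a node passes
    `decay` times its incoming energy, split equally among its neighbors."""
    activation = {seed: 1.0 for seed in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            neighbors = graph.get(node, ())
            for neighbor in neighbors:
                next_frontier[neighbor] += decay * energy / len(neighbors)
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


def suggest_tags(graph, seeds, top=1):
    """Return the most activated nodes that the workers did not type themselves."""
    activation = spread_activation(graph, seeds)
    candidates = {n: a for n, a in activation.items() if n not in seeds}
    return sorted(candidates, key=candidates.get, reverse=True)[:top]


# The five people identified by the workers.
SEEDS = ["George Lucas", "Richard Dreyfuss", "Ron Howard",
         "Mackenzie Phillips", "Harrison Ford"]
print(suggest_tags(build_graph(WORKS), SEEDS))
```

On this toy graph, "American Graffiti" collects activation from all five people, while "Star Wars" gets it from only two, so the film comes out on top even though no worker typed its name. The suggestion still goes to a human expert for verification.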

Meagan Young looked at the stream of incoming photos, noticed the American Graffiti tag, realized that the "lost" photos were found, and she notified the others at Magnum Photos and Todd Carter, the CEO of Tagasauris. The "hidden treasure" was identified, and the Wired story was underway...

Crowdsourcing: It is not just about the humans

This is not a story to show how cool discovery based on linked entities is. This is old news for many people who work with such data. However, it is a simple example of using crowdsourcing in a more intelligent way than it is currently being used. Machines cannot do everything (in fact, they are especially bad at tasks that are "trivial" for humans), but when humans provide enough input, the machines can take it from there and significantly improve the overall process.

One can even see the obvious next step: use face recognition and let humans and machines do the tagging collaboratively. Google and Facebook have very advanced algorithms for face recognition. Match them intelligently with humans, and you are way ahead of solutions that rely simply on humans to tag faces.

I think the lesson is clear: Let humans do what they do best, and let machines do what they do best. (And expect the balance to change as we move forward and machines can do more.) Undoing and ignoring decades of research in computer science, just because it is easier to use cheap labor, is a disservice not only to computer science. It is a disservice to the potential of crowdsourcing as well.