Friday, June 28, 2013

Facebook implements brand safety, doing it "manually" (crowdsourcing?)

I just read that Facebook started thinking about brand safety, and will restrict ads from appearing next to content that may be controversial (e.g., adult-oriented content).

I was rather surprised to find out that Facebook has not been doing that already. It is know that Facebook has been using crowdsourcing to detect content that violates the terms of service. So, I assumed that the categorization of the content as brand-inappropriate was also part of that process. Apparently not.

Given the similarities of the two tasks (the difference between no-ads-for-brand-safety and violating-terms-of-service is often just part of intensity of the offense), I assume that Facebook is also going to adopt a crowdsourcing-style solution (perhaps with a private crowd), and then they will build a machine learning algorithm on top using the crowd judgements. At least the wording "In order to be thorough, this review process will be manual at first, but in the coming weeks we will build a more scalable, automated way" in the announcement seems to imply that.

Or perhaps, to blow my own horn, Facebook should just use Integral Ad Science, (aka AdSafe Media). At AdSafe, we built a solution for exactly this problem back in 2009, employing a combination of crowdsourcing and machine learning to detect brand-inappropriate content. We did not go just for porn, but also for other categories, such as alcohol use, offensive language, hate speech, etc. In fact, most of my work in crowdsourcing was inspired, one way or another, through the problems faced when trying to deploy a crowdsourcing solution at scale. Also, except for the academic research, my work with Integral also led to one of the best blog posts that I have written, "Uncovering an advertising fraud scheme (or, the Internet is for Porn)".

Perhaps, the next step is to demonstrate how to use Project Troia, together with a good machine learning toolkit in order to deploy quickly a system for detecting brand inappropriate content. Maybe Facebook could use that ;-)