A Computer Scientist in a Business School

Last week I was at Google for the annual Google Faculty Summit. While discussing research challenges related to social media and shopping, a common question emerged: How can we evaluate the trustworthiness of the reviews that abound on the Internet?

Past Performance Predicts Future Performance

Current solutions focus on the trustworthiness and history of a reviewer within each site. For example, Amazon has the reviewer rank, which is computed using the number of reviews contributed, the helpful votes the reviewer has amassed, and other secret-sauce factors. Other sites, like Yelp, follow similar approaches to identify the best reviewers.
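To make the idea of a site-specific reviewer score concrete, here is a minimal sketch. It is not Amazon's formula (which is not public); the `ReviewerHistory` type, the `reviewer_score` function, and the smoothing parameters are all hypothetical, chosen only to show how review volume and helpful votes could be combined into a single number.

```python
import math
from dataclasses import dataclass


@dataclass
class ReviewerHistory:
    reviews_written: int  # number of reviews contributed on this site
    helpful_votes: int    # votes that marked the reviewer's reviews as helpful
    total_votes: int      # all votes received, helpful or not


def reviewer_score(history, prior_ratio=0.5, prior_weight=10.0):
    """Hypothetical score: smoothed helpfulness ratio times log-scaled volume."""
    # Smooth the helpful ratio so a handful of votes cannot produce an extreme value.
    smoothed_ratio = (history.helpful_votes + prior_ratio * prior_weight) / (
        history.total_votes + prior_weight
    )
    # Log scaling gives diminishing returns to sheer review count.
    volume = math.log1p(history.reviews_written)
    return smoothed_ratio * volume


veteran = ReviewerHistory(reviews_written=200, helpful_votes=1800, total_votes=2000)
newcomer = ReviewerHistory(reviews_written=0, helpful_votes=0, total_votes=0)
print(reviewer_score(veteran) > reviewer_score(newcomer))
```

Any score of this kind only sees the history recorded on one site, so a reviewer with no local history scores zero regardless of what they have done elsewhere.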

Of course, knowing the good reviewers is very valuable: Unlike with investments, here past performance is a strong indicator of future performance. Reviewers who wrote good reviews in the past are likely to write good reviews in the future. Or, more generally, users who were high-quality in the past are likely to be high-quality in the future.

So, problem solved! We just need to know which user has high quality!

Network Effects and Closed Reputation Platforms

Well, at this point we have a problem. Today, we treat site-specific user profiles as separate individuals. A profile of a user on Amazon is not connected to the corresponding reviewer profile on Epinions, on NewEgg, or on B&H. So, reviewers who have written tens of reviews and amassed thousands of helpful votes will be insignificant newbies if they decide to write some reviews on the B&H website. Similarly, a person who has contributed plenty of reviews and discussions on Chowhound over the years will be an insignificant newbie on Yelp. The result? We cannot trust the reviews of these individuals, even if they have proven themselves trustworthy in the past!

This is a lock-in associated with network effects, similar to the lock-in that happens under closed and proprietary standards. For example, Microsoft has achieved domination in office productivity software by keeping the Office file formats proprietary. Since no other office productivity suite could interoperate with Office, smaller players simply could not compete: the minority of users who did not use Office could not exchange files with the Office users. Under such scenarios, we see either a market split into isolated markets (the case of reputation-based sites today) or a single dominant player (e.g., Office).

Closed Standards and Network Effects

Interestingly enough, when a market under closed standards ends up fragmented, the outcome is not optimal. An excellent example is the SMS market in the United States. Before 2000, the different telecoms did not allow their subscribers to send SMS messages to subscribers of other companies. The main rationale was to force users who wanted to text each other to adopt the network already chosen by their friends. So the market was a set of isolated networks, one per carrier, with no messages crossing between them.

It was an interesting approach, but the result was not ideal: texting in the US was essentially non-existent before 2000. (I vividly remember how, when I arrived in the US, nobody was using SMS to communicate. In Europe, sending SMS was commonplace.) However, once the networks decided to cooperate and allow SMS to flow freely across networks, the market took off, attracting more players and becoming much more useful.

In other words, if we do not have a monopoly, closed networks are suboptimal. Fragmented markets for goods with network effects are almost never optimal, even for the current dominant players.
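A small piece of arithmetic illustrates why fragmentation hurts when the value of a network comes from who can reach whom. The sketch below simply counts the pairs of subscribers that can message each other; the three carriers and their subscriber counts are made-up numbers for illustration, and counting pairs is only one rough proxy for the usefulness of a network.

```python
def reachable_pairs(subscribers):
    """Number of distinct pairs among `subscribers` people who can all reach each other."""
    return subscribers * (subscribers - 1) // 2


# Hypothetical carriers and subscriber counts (illustrative only).
carriers = {"A": 50, "B": 30, "C": 20}

# Closed networks: messages flow only between subscribers of the same carrier.
fragmented = sum(reachable_pairs(n) for n in carriers.values())

# Open networks: every subscriber can message every other subscriber.
connected = reachable_pairs(sum(carriers.values()))

print(fragmented, connected)
```

With these numbers, the closed networks support 1850 pairs in total while the connected network supports 4950, and even subscribers of the largest carrier gain new people to reach once the networks open up.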

The Need for Open Reputation and Reputation Integration

I believe that we see a similar case with reputation identities. Identities are fragmented across sites, and users have little incentive to contribute high-quality content to sites on which they do not have an established presence: Nobody pays attention to them in any case.

This also makes it difficult to start from scratch any website that needs trustworthy reviews in order to attract visitors. Visitors will not come because there are no trustworthy reviews, and reviewers already established on other sites will not join.

So, it is important to start thinking about how to integrate reputations across different websites. It is simply not optimal to have fragmented identities. Even though there are privacy concerns for anonymous profiles, there should at least be the capability to connect identities across sites for users who do not try to preserve their anonymity. By integrating profiles across websites, we can allow reputation to flow across websites and create a better, more comprehensive profile of the participants in today's Internet.

This can allow participants to build their reputation profiles without worrying about lock-in. OpenID moves in the right direction. Next versions should allow explicit, distributed connection of profiles, without requiring a dominant player to control the profile. (Thank you for the offer, Facebook. I will pass.)

Third-party services can play this role as well. By identifying profiles across sites that belong to the same individual, we can quickly learn about the identity of a new contributor. The history can follow the users who are not worried about anonymity. Such strong reputation signals can help improve any existing site that relies on user-contributed content. In the presence of strong reputations, low-quality contributors will simply never have a chance of getting a dominant position in the marketplace.
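As a rough sketch of what such a third-party service might do, the code below merges per-site histories for users who have explicitly linked their profiles and leaves everyone else untouched. The `SiteProfile` type, the `integrate_reputation` function, the `links` mapping, and all the sample data are hypothetical; how profiles get linked in practice (for example, through OpenID) is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteProfile:
    site: str
    username: str
    reviews_written: int
    helpful_votes: int


def integrate_reputation(profiles, links):
    """Combine per-site histories for opted-in users.

    `links` maps (site, username) to a person identifier and contains only
    profiles whose owners chose to connect them; unlinked profiles are skipped,
    so users who want to stay anonymous remain separate.
    """
    combined = {}
    for profile in profiles:
        person = links.get((profile.site, profile.username))
        if person is None:
            continue  # no opt-in link, so the profile stays site-specific
        totals = combined.setdefault(
            person, {"reviews_written": 0, "helpful_votes": 0, "sites": []}
        )
        totals["reviews_written"] += profile.reviews_written
        totals["helpful_votes"] += profile.helpful_votes
        totals["sites"].append(profile.site)
    return combined


# Illustrative data: an established Amazon reviewer who is new on B&H.
profiles = [
    SiteProfile("Amazon", "alice", reviews_written=120, helpful_votes=3000),
    SiteProfile("B&H", "alice_photo", reviews_written=1, helpful_votes=0),
    SiteProfile("Yelp", "anon42", reviews_written=15, helpful_votes=40),
]
links = {("Amazon", "alice"): "alice", ("B&H", "alice_photo"): "alice"}

print(integrate_reputation(profiles, links))
```

With the linked view, the single review on B&H arrives with the reviewer's Amazon history attached, instead of looking like the work of an unknown newcomer.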

As I mentioned in my earlier blog post, the lack of credible reputation mechanisms degenerates any market into a market for lemons. In the attention economy, the lack of strong reputation signals gives spammers the incentive to come and pollute. Reputation integration mechanisms can solve this issue.

Cross-posted at the Reppify blog (disclaimer: I am an advisor for Reppify)

Sunday, August 8, 2010

Reputation Integration and the Future of Reviews