Monday, May 3, 2010

KDD Accepted Papers, Deadlines for HCOMP 2010 and SNAKDD 2010 workshops

The accepted papers for KDD 2010 are now posted.

As a reminder, KDD will take place this year in Washington, DC, from July 25th to July 28th.

I would also like to draw your attention to two KDD workshops that I am involved with, the Human Computation Workshop (HCOMP 2010) and the Workshop on Social Network Mining and Analysis (SNAKDD 2010). Both have submission deadlines on May 7th. It is pretty easy to travel to DC, so if you have a cool idea or demo that would be appropriate for these workshops, please submit!

For those too bored to visit the respective websites, here are the calls for papers:

Human Computation Workshop (HCOMP 2010) - Call for Papers

Most research in data mining and knowledge discovery relies heavily on the availability of datasets. With the rapid growth of user generated content on the internet, there is now an abundance of sources from which data can be drawn. Compared to the amount of work in the field on techniques for pattern discovery and knowledge extraction, there has been little effort directed at the study of effective methods for collecting and evaluating the quality of data.

Human computation is a relatively new research area that studies the process of channeling the vast internet population to perform tasks or provide data towards solving difficult problems that no known efficient computer algorithms can yet solve. There are various genres of human computation applications available today. Games with a purpose (e.g., the ESP Game) specifically target online gamers who, in the process of playing an enjoyable game, generate useful data (e.g., image tags). Crowdsourcing marketplaces (e.g. Amazon Mechanical Turk) are human computation applications that coordinate workers to perform tasks in exchange for monetary rewards. In identity verification tasks, users need to perform some computation in order to access some online content; one example of such a human computation application is reCAPTCHA, which leverages millions of users who solve CAPTCHAs every day to correct words in books that optical character recognition (OCR) programs fail to recognize with certainty.

Human computation is an area with significant research challenges and increasing business interest, making this doubly relevant to KDD. KDD provides an ideal forum for a workshop on human computation as a form of cost-sensitive data acquisition. The workshop also offers a chance to bring in practitioners with complementary real-world expertise in gaming and mechanism design who might not otherwise attend this academic conference.

The first Human Computation Workshop (HComp 2009) was held on June 28th, 2009, in Paris, France, co-located with KDD 2009. The overall themes that emerged from this workshop were very clear. On the experimental side of human computation, there is research on new incentives for users to participate, new types of actions, and new modes of interaction. This includes work on new programming paradigms and game templates designed to enable rapid prototyping, allow partial completion of tasks, and aid in reusability of game design. On the more theoretical side, we have research modeling these actions and incentives to examine what theory predicts about these designs. Finally, there is work on noisy results generated by such games and systems: how can we best handle noise, identify labeler expertise, and use the generated data for data mining purposes?

Learning from HComp 2009, we have expanded the topics of relevance to the workshop. The goal of HComp 2010 is to bring together academic and industry researchers in a stimulating discussion of existing human computation applications and future directions of this new subject area. We solicit papers related to various aspects of both general human computation techniques and specific applications, e.g., general design principles; implementation; cost-benefit analysis; theoretical approaches; privacy and security concerns; and incorporation of machine learning / artificial intelligence techniques. An integral part of this workshop will be a demo session where participants can showcase their human computation applications. Specifically, topics of interest include, but are not limited to:

  • Abstraction of human computation tasks into taxonomies of mechanisms
  • Theories about what makes some human computation tasks fun and addictive
  • Differences between collaborative vs. competitive tasks
  • Programming languages, tools and platforms to support human computation
  • Domain-specific implementation challenges in human computation games
  • Cost, reliability, and skill of labelers
  • Benefits of one-time versus repeated labeling
  • Game-theoretic mechanism design of incentives for motivation and honest reporting
  • Design of manipulation-resistant mechanisms in human computation
  • Effectiveness of CAPTCHAs
  • Concerns regarding the protection of labeler identities
  • Active learning from imperfect human labelers
  • Creation of intelligent bots in human computation games
  • Utility of social networks and social credit in garnering data
  • Optimality in the context of human computation
  • Focus on tasks where crowds, not individuals, have the answers
  • Limitations of human computation

Workshop on Social Network Mining and Analysis - Call for Papers

Social network research has come a long way since the notable “six degrees of separation” experiment. In recent years, social network research has advanced significantly, thanks to the prevalence of online social websites and the availability of a variety of offline large-scale social network systems such as collaboration networks. These social network systems are usually characterized by complex network structures and rich accompanying contextual information. Researchers are increasingly interested in addressing a wide range of challenges residing in these disparate social network systems, including identifying common static topological properties and dynamic properties during the formation and evolution of these social networks, and understanding how contextual information can help in analyzing the corresponding social networks. These issues have important implications for community discovery, anomaly detection, and trend prediction, and can enhance applications in multiple domains such as information retrieval, recommendation systems, and security.

The fourth SNA-KDD workshop (SNA-KDD 2010) aims to bring together practitioners and researchers with a specific focus on the emerging trends and industry needs associated with the traditional Web, the social Web, and other forms of social networking systems. Both theoretical and experimental submissions are encouraged. Topics of interest include (1) data mining advances on the discovery and analysis of communities, on personalization for solitary activities (like search) and social activities (like discovery of potential friends), on the analysis of user behavior in open fora (like conventional sites, blogs and fora) and in commercial platforms (like e-auctions) and on the associated security and privacy-preservation challenges; (2) social network modeling, construction of scalable and customizable social network infrastructure, and identification and discovery of dynamic growth and evolution patterns using machine learning approaches or multi-agent based simulation.

SNA-KDD 2010 solicits contributions on social network analysis and graph mining, including the emerging applications of the Web as a social medium. Papers should elaborate on data mining methods and on issues associated with data preparation and pattern interpretation, both for conventional data (usage logs, query logs, document collections) and for multimedia data (pictures and their annotations, multi-channel usage data). Topics of interest include but are not limited to:

  • Community discovery and analysis in large-scale online and offline social networks
  • Personalization for search and for social interaction
  • Recommendations for product purchase, information acquisition and establishment of social relations
  • Data protection inside communities
  • Misbehavior detection in communities
  • Web mining algorithms for clickstreams, documents and search streams
  • Preparing data for web mining
  • Pattern presentation for end-users and experts
  • Evolution of patterns in the Web
  • Evolution of communities in the Web
  • Dynamics and evolution patterns of social networks, trend prediction
  • Contextual social network analysis
  • Temporal analysis of social network topologies
  • Search algorithms on social networks
  • Multi-agent based social network modeling and analysis
  • Application of social network analysis
  • Anomaly detection in social network evolution