Earlier this year, together with Ahmed Elmagarmid and Vassilios Verykios, I published a survey article in IEEE TKDE on duplicate record detection (also known as record linkage, deduplication, and by many other names).
Although I see this paper as a good effort in organizing the literature in the field, I will be the first to recognize that the paper is incomplete. We tried our best to include every research effort that we identified, and the reviewers helped a lot in this respect. However, I am confident that there are still many nice papers that we missed.
Furthermore, since the paper was accepted for publication, many more papers have been published, and many more will be published in the future. This means that the useful half-life of (any?) such survey is necessarily short.
How can we make such papers more relevant and more resistant to obsolescence? One solution that I am experimenting with is to turn the survey article into a wiki page and post it on Wikipedia, allowing other researchers to add their own papers to the survey.
I am not sure that Wikipedia is the best option, though. A personal wiki may be a better choice, but I do not have a good grasp of the pros and cons of each approach. One benefit of Wikipedia is its nice templates for handling citations. One disadvantage is its copyright license, which may discourage (or prevent) people from posting material there.
Furthermore, it is not clear that a wikified document is the best way to organize a survey. A few days back, I got a (forwarded) email from Foster Provost, who was seeking my opinion on the best way to organize an annotated bibliography. (Dragomir Radev had a similar question.) Is a wiki the best option? Or is it, by construction, too flat? Should we use some other type of software that allows people to create explicit, annotated connections between the different papers? (Is there any public tool for this?)
Any ideas?