The reasons are rather obvious: we need to be able to reproduce the experiment, avoid any hidden bias, catch errors, and even prevent outright fraud. Furthermore, this encourages the publication of techniques that are easy to implement and test. Why do we care? If a method is impossible to implement, it is an obstacle to research progress. A published paper that claims to be the state of the art but is not reproducible may prevent other, reproducible methods from being published, simply for lack of a comparison with the current state of the art.
Now, to achieve experimental repeatability we need two things:
- Access to the data sets
- Access to the code
The second aspect is access to the underlying code. One may argue that instead of giving access to the code we should describe clearly how to implement the algorithms, give the parameter settings, and so on. This avoids any intellectual property issues, and everyone is happy. Personally, I do not buy this. No matter how carefully someone reimplements someone else's algorithm, nobody is going to spend much time optimizing the code of a competing technique, and this can lead to flawed experimental comparisons. Another alternative is to use common datasets and simply pick the performance numbers from the published paper, without reimplementing the competing technique. (This works only when the underlying hardware is irrelevant, e.g., for precision/recall experiments in information retrieval.)
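To make that last point concrete, here is a minimal sketch in Python (the document ids and relevance judgments are made up): precision and recall depend only on which documents a system retrieved and which are judged relevant, not on the machine that produced the run, so such numbers can be lifted from a published paper as they stand.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: doc ids returned by a system vs. the judged relevant set.
p, r = precision_recall(retrieved=["d1", "d3", "d7", "d9"],
                        relevant=["d3", "d9", "d4"])
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.50, recall=0.67
```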
My own take? Encourage the publication of open source software. If the code is open and available, comparisons are easy, and the whole issue of experimental repeatability becomes moot. No need for committees to verify that the reported results are indeed correct, no need to upload code onto machines with a different architecture and make sure it runs without any segmentation faults, and so on. If the code is available, then even if the results are incorrect, someone will catch that in the future. (And if the results are incorrect, the code and data are available, and nobody cares to replicate the results, then experimental repeatability is a moot point anyway.)
Now, it is easy to talk about open source, but anyone who has tried knows what a pain it is to take the scripts used to run the experiments and make them ready to use by anyone else. (Or even to be reused later by the author :-) Therefore, we need to give further incentives. A good example is the idea of the JMLR journal to have a track for submissions of open source software; this track serves as "a venue for collection and dissemination of open source software".
Perhaps this is the way to proceed: an alternative to the "experimental repeatability requirements" that may be too difficult to follow.
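To give a feel for what the "pain" of packaging experiment scripts amounts to, here is a hypothetical, minimal sketch (the file name, flags, and helper are made up, not any particular paper's code) of the bare minimum that turns a private script into something others can run: explicit command-line arguments instead of hard-coded paths, a fixed random seed, and results written to a declared location.

```python
# Hypothetical sketch of a reusable experiment runner; run_experiment is a placeholder.
import argparse
import json
import random


def run_experiment(data_path: str, seed: int) -> dict:
    # Placeholder for the actual experiment; fixing the seed makes reruns match.
    random.seed(seed)
    return {"data": data_path, "seed": seed, "score": random.random()}


def main() -> None:
    parser = argparse.ArgumentParser(description="Reproducible experiment runner")
    parser.add_argument("--data", required=True, help="path to the data set")
    parser.add_argument("--seed", type=int, default=0, help="random seed")
    parser.add_argument("--out", default="results.json", help="where to write results")
    args = parser.parse_args()

    results = run_experiment(args.data, args.seed)
    with open(args.out, "w") as f:
        json.dump(results, f, indent=2)


if __name__ == "__main__":
    main()
```

Run as, e.g., `python run_experiment.py --data path/to/dataset --seed 42`; anyone who obtains the same data and uses the same seed should end up with the same results file.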