Evaluate 2011 is not a mini-conference, but a "work"shop. Our goal is to improve the culture of experimental evaluation in our community. The participants of the workshop will author and publish a paper at a venue like the Communications of the ACM. That paper will make the case for good evaluations by presenting evaluation anti-patterns -- fallacies and pitfalls related to the experimental evaluation of systems and software.
We solicit submissions of short descriptions of anti-patterns related to aspects such as picking baselines, benchmarks or workloads, experimental context, metrics, bias, perturbation, or statistical methods. Each submission should describe a fallacy, preferably one the authors have committed in their own past work. The submission should include a memorable title naming the fallacy, a concise description, a concrete example (including citations where possible) with potentially significant consequences, and a convincing argument for why the example is worth sharing with the community. The entire submission must be provided as a short text entered in the "abstract" field of the submission system.
At the workshop, authors of accepted submissions will introduce their anti-pattern in a short presentation, and together, the workshop participants will structure, organize, and integrate the collected fallacies into a common document.
Draft Title and Abstract of the Resulting Paper
Evaluation Anti-Patterns: A Guide to Bad Experimental Computer Science
Bad evaluations misdirect research and curtail creativity. A poorly performed but successfully published evaluation can encourage fruitless investigation of a flawed idea, while publication of a flawed observation can discourage further exploration of an important area of research. In this paper we identify N common methodological pitfalls, including many we've fallen for ourselves. We argue that this exposure to methodological shortcomings is damaging our research. We claim that this reflects a lack of a well-established culture of rigorous evaluation, and that this is due in part to the youth and dynamism of computer science, which work against the establishment of sound methodological norms. We challenge researchers to reconsider the importance of rigorous evaluation, and suggest that a critical reappraisal may lead to more productive and more creative computer science research.
Please submit your contribution as a short textual "abstract" on the Evaluate 2011 EasyChair site.
Evaluate 2011 is the second workshop of the Evaluate workshop series. The first Evaluate workshop, Evaluate 2010, was held at SPLASH/OOPSLA 2010 in Reno/Tahoe, Nevada, USA.