Experimental evaluation typically involves subjecting one or more systems to one or more workloads. A workload may be characterized as a benchmark with a given set of inputs.
A workload can be inappropriate, ignored, inconsistent, or irreproducible:
Inappropriate Workloads
A workload is inappropriate when it is flawed or does not reflect the workload that is implicit in the claim.
Ignored Workloads
In the real world, most systems are subjected to diverse workloads. The selection of workloads must therefore be diverse enough to support the intended claims; an evaluation ignores workloads when it omits ones that fall within the scope of the claim.
Inconsistent Workloads
When an evaluation compares two or more systems, it is essential that the systems under test are subjected to the same workloads.
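The consistency requirement can be checked mechanically. The following is a minimal sketch, not a prescribed method: it models a workload as a benchmark paired with its inputs, as characterized above, and reports any workload that one system was run on but another was not. The names `Workload` and `inconsistent_workloads`, and the benchmark and input labels in the example, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A workload: a benchmark together with a given set of inputs (hypothetical model)."""
    benchmark: str
    inputs: tuple[str, ...]


def inconsistent_workloads(runs: dict[str, set[Workload]]) -> dict[str, set[Workload]]:
    """Map each system to the workloads that some other system ran but it did not.

    An empty result means every system was subjected to the same workloads.
    """
    # Union of every workload that any system under test was run on.
    all_workloads: set[Workload] = set().union(*runs.values())
    missing = {}
    for system, workloads in runs.items():
        gap = all_workloads - workloads
        if gap:
            missing[system] = gap
    return missing


# Illustrative data only: the system names, benchmark, and inputs are made up.
runs = {
    "baseline": {
        Workload("sort", ("1M-random",)),
        Workload("sort", ("1M-sorted",)),
    },
    "proposed": {
        Workload("sort", ("1M-random",)),
    },
}
print(inconsistent_workloads(runs))
```

In this example the check flags that the hypothetical "proposed" system was never run on the sorted input, so a comparison across all reported results would not be like for like. Note that the same benchmark with different inputs counts as a different workload.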
Irreproducible Workloads
The workloads used to evaluate an innovation should include at least some that others can access, so that others can reproduce the results.