1) Do the authors clearly state the aims of the research?


1.1) Do the authors state research questions, e.g., related to time-to-market, cost, product quality, process quality, developer productivity, and developer skills?

1.2) Do the authors state hypotheses and their underlying theories?

2) Is there an adequate description of the context in which the research was carried out?


2.1) The industry in which products are used (e.g. banking, telecommunications, consumer goods, travel)

2.2) If applicable, the nature of the software development organization (e.g. in-house department or independent software supplier)

2.3) The skills and experience of the subjects (e.g. with a language, a method, a tool, an application domain)

2.4) The type of software products used (e.g. a design tool, a compiler)

2.5) If applicable, the software processes being used (e.g. a company standard process, the quality assurance procedures, the configuration management process)

3) Do the authors explain how experimental units were defined and selected?


3.1) Do the authors explain how experimental units were defined and selected?

3.2) Do the authors state to what degree the experimental units are representative?

3.3) Do the authors explain why the experimental units they selected were the most appropriate for providing insight into the type of knowledge sought by the experiment?

3.4) Do the authors report the sample size?

4) Do the authors describe the design of the experiment?


4.1) Do the authors clearly describe the chosen design (e.g., blocking, within- or between-subject design, whether treatments have levels)?

4.2) Do the authors define/describe all treatments and all controls?

5) Do the authors describe the data collection procedures and define the measures?


5.1) Are all measures clearly defined (e.g., scale, unit, counting rules)?

5.2) Is the form of the data clear (e.g., tape recording, video material, notes)?

5.3) Are quality control methods used to ensure consistency, completeness and accuracy of collected data?

5.4) Do the authors report drop-outs?

6) Do the authors define the data analysis procedures?


6.1) Do the authors justify their choice of analysis procedures, describe the procedures, or provide references to descriptions of them?

6.2) Do the authors report significance levels and effect sizes?

6.3) If outliers are mentioned and excluded from the analysis, is this justified?

6.4) Do the authors report or give references to raw data and/or descriptive statistics?
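
As an illustration of what item 6.2 asks reviewers to look for, the sketch below computes one common effect size, Cohen's d with a pooled standard deviation, for two treatment groups. The checklist does not prescribe any particular effect size measure; the function name `cohens_d` and the sample values are hypothetical and chosen only for the example.

```python
import math
import statistics


def cohens_d(treatment, control):
    """Cohen's d for two independent groups, using the pooled standard deviation.

    This is one of several possible effect size measures; it is an
    illustrative choice, not one mandated by the checklist.
    """
    n1, n2 = len(treatment), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")

    # Sample variances (n - 1 in the denominator).
    var1 = statistics.variance(treatment)
    var2 = statistics.variance(control)

    # Pooled standard deviation, weighting each variance by its degrees of freedom.
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero; d is undefined")

    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


# Hypothetical measurements (e.g., defects found per reviewer) for two groups.
treatment_group = [2, 4, 6]
control_group = [1, 3, 5]
print(cohens_d(treatment_group, control_group))
```

A paper that reports only a p-value leaves the reader unable to judge the practical magnitude of a difference; a standardized value such as this one, reported alongside the significance level, is the kind of evidence item 6.2 checks for.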

7) Do the authors discuss potential experimenter bias?


7.1) Were the authors the developers of some or all of the treatments? If yes, do the authors discuss the implications anywhere in the paper? (If the authors developed the treatments (or parts of them) without discussing the implications, the answer to question 7 is "not at all".)

7.2) Were training and conduct equivalent for all treatment groups?

7.3) Was there allocation concealment, i.e., were the researchers unaware of the treatment to which each subject was assigned?

8) Do the authors discuss the limitations of their study?


8.1) Do the authors discuss external validity with respect to subjects, materials, and tasks?

8.2) If the study was a quasi-experiment, do the authors discuss the design components that were used to address any study weaknesses?

8.3) If the study used novel measures, is the construct validity of the measures discussed?

9) Do the authors state the findings clearly?


9.1) Do the authors present results clearly?

9.2) Do the authors present conclusions clearly?

9.3) Are the conclusions warranted by the results and are the connections between the results and conclusions presented clearly?

9.4) Do the authors discuss their conclusions in relation to the original research questions?

9.5) Are limitations of the study discussed explicitly?

10) Is there evidence that the results can be used by other researchers / practitioners?


10.1) Do the authors discuss whether or how the findings can be transferred to other populations, or consider other ways in which the research can be used?

10.2) To what extent do the authors interpret results in the context of other studies / the existing body of knowledge?