Allen, A, Aragon C, Becker C, Carver J, Chis A, Combemale B, Croucher M, Crowston K, Garijo D, Gehani A et al. 2017. Engineering Academic Software (Dagstuhl Perspectives Workshop 16252). Dagstuhl Manifestos. 6:1–20.

Andujar, C, Schiaffonati V, Schreiber FA, Tanca L, Tedre M, van Hee K, van Leeuwen J. 2012. The Role and Relevance of Experimentation in Informatics.

Atmanspacher, H, Lambert BL, Folkers G, Schubiger PA. 2014. Relevance relations for the concept of reproducibility. J. R. Soc. Interface. 11(94).

Bachmann, A, Bird C, Rahman F, Devanbu P, Bernstein A. 2010. The missing links: bugs and bug-fix commits. Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering. :97–106.

Bailey, DH. 2009. Misleading performance claims in parallel computations. Proceedings of the 46th Annual Design Automation Conference. :528–533.

Bailey, DH. 1992. Misleading Performance Reporting in the Supercomputing Field. Sci. Program. 1:141–151.

Bailey, DH. 1991. Twelve ways to fool the masses when giving performance results on parallel computers. Supercomputing Review. :54–55.

Baker, M. 2012. Independent labs to verify high-profile papers. Nature | News.

Basili, VR, Caldiera G, Rombach DH. 1994. The Goal Question Metric Approach. Encyclopedia of Software Engineering.

Basili, VR. 1996. The role of experimentation in software engineering: past, current, and future. Proceedings of the 18th international conference on Software engineering. :442–449.

Begley, S. 2012. More trial, less error – An effort to improve scientific studies. Reuters.

Bernard, C. 1950. An introduction to the study of experimental medicine. Journal of the American Pharmaceutical Association. 39(10).

Beveridge, WIB. 1957. The Art of Scientific Investigation.

Bird, C, Bachmann A, Aune E, Duffy J, Bernstein A, Filkov V, Devanbu P. 2009. Fair and balanced?: bias in bug-fix datasets. Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering. :121–130.

Blackburn, SM, Garner R, Hoffmann C, Khang AM, McKinley KS, Bentzur R, Diwan A, Feinberg D, Frampton D, Guyer SZ et al. 2006. The DaCapo benchmarks: Java benchmarking development and analysis. OOPSLA '06: Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications. :169–190.

Blackburn, SM, Diwan A, Hauswirth M, Sweeney PF, Amaral JN, Brecht T, Bulej L, Click C, Eeckhout L, Fischmeister S et al. 2016. The Truth, The Whole Truth, and Nothing But the Truth: A Pragmatic Guide to Assessing Empirical Evaluations. ACM Trans. Program. Lang. Syst. 38:15:1–15:20.

Blackburn, SM, McKinley KS, Garner R, Hoffmann C, Khan AM, Bentzur R, Diwan A, Feinberg D, Frampton D, Guyer SZ et al. 2008. Wake Up and Smell the Coffee: Evaluation Methodology for the 21st Century. Commun. ACM. 51:83–89.

Bonnet, P, Manegold S, Bjørling M, Cao W, Gonzalez J, Granados J, Hall N, Idreos S, Ivanova M, Johnson R et al. 2011. Repeatability and workability evaluation of SIGMOD 2011. SIGMOD Rec. 40:45–48.

Buse, RPL, Sadowski C, Weimer W. 2011. Benefits and Barriers of User Evaluation in Software Engineering Research. OOPSLA '11: Proceedings of the ACM international conference on Object oriented programming systems languages and applications.

Childers, BR, Fursin G, Krishnamurthi S, Zeller A. 2016. Artifact Evaluation for Publications (Dagstuhl Perspectives Workshop 15452). Dagstuhl Reports. 5:29–35.

Clark, B, Deshane T, Dow E, Evanchik S, Finlayson M, Herne J, Matthews JN. 2004. Xen and the art of repeated research. Proceedings of the annual conference on USENIX Annual Technical Conference. :47–47.

Committee on Academic Careers for Experimental Computer Scientists, National Research Council. 1994. Academic Careers for Experimental Computer Scientists and Engineers. :152.

Computer Science and Telecommunications Board. 1994. Academic careers for experimental computer scientists and engineers. Commun. ACM. 37:87–90.

Curtsinger, C, Berger ED. 2013. STABILIZER: Statistically Sound Performance Evaluation. Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems. :219–228.

Curtsinger, C, Berger ED. 2015. Coz: Finding Code That Counts with Causal Profiling. Proceedings of the 25th Symposium on Operating Systems Principles. :184–197.

Davidson, JE. 2005. Evaluation Methodology Basics: The Nuts and Bolts of Sound Evaluation. :280.

Delling, D, Demetrescu C, Johnson DS, Vitek J. 2016. Rethinking Experimental Methods in Computing (Dagstuhl Seminar 16111). Dagstuhl Reports. 6:24–43.

Denning, PJ. 1981. ACM president's letter: performance analysis: experimental computer science at its best. Commun. ACM. 24:725–727.

Denning, PJ. 2005. Is Computer Science Science? Commun. ACM. 48:27–31.

Dittrich, J. 2010. Paper Bricks: An Alternative to Complete-Story Peer Reviewing. SIGMOD Record. 39(4).

Drummond, C. 2009. Replicability is not Reproducibility: Nor is it Good Science. The 4th workshop on Evaluation Methods for Machine Learning.

Druschel, P, Isaacs R, Gross T, Shapiro M. 2006. Fostering Systems Research in Europe: A White Paper by EuroSys, the European Professional Society in Systems.

Eide, E, Stoller L, Lepreau J. 2007. An experimentation workbench for replayable networking research. Proceedings of the 4th USENIX conference on Networked systems design & implementation. :16–16.

Eisenberg, M. 2003. Creating a computer science canon: a course of "classic" readings in computer science. Proceedings of the 34th SIGCSE technical symposium on Computer science education. :336–340.

Feitelson, DG. 2006. Experimental Computer Science: The Need for a Cultural Change.

Fenton, NE, Pfleeger SL. 1998. Software Metrics: A Rigorous and Practical Approach.

Feynman, R. 1974. Cargo Cult Science. Engineering and Science. 37(7):10–13.

Fleming, PJ, Wallace JJ. 1986. How Not to Lie with Statistics: The Correct Way to Summarize Benchmark Results. Commun. ACM. 29:218–221.

Frachtenberg, E, Feitelson DG. 2005. Pitfalls in parallel job scheduling evaluation. Proceedings of the 11th international conference on Job Scheduling Strategies for Parallel Processing. :257–282.

Georges, A, Eeckhout L, Buytaert D. 2008. Java performance evaluation through rigorous replay compilation. Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications. :367–384.

Georges, A, Buytaert D, Eeckhout L. 2007. Statistically rigorous Java performance evaluation. OOPSLA '07: Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications. :57–76.

Gil, JY, Lenz K, Shimron Y. 2011. A microbenchmark case study and lessons learned. Proceedings of the compilation of the co-located workshops on DSM'11, TMC'11, AGERE!'11, AOOPES'11, NEAT'11, & VMIL'11. :297–308.

Hanenberg, S. 2010. Faith, hope, and love: an essay on software science's neglect of human factors. Proceedings of the ACM international conference on Object oriented programming systems languages and applications. :933–946.

Hoefler, T, Belli R. 2015. Scientific Benchmarking of Parallel Computing Systems: Twelve Ways to Tell the Masses when Reporting Performance Results. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. :73:1–73:12.

Höst, M, Wohlin C, Thelin T. 2005. Experimental context classification: incentives and experience of subjects. Proceedings of the 27th international conference on Software engineering. :470–478.

Huff, D. 1954. How to Lie with Statistics.

Ioannidis, JPA. 2005. Why Most Published Research Findings Are False. PLoS Medicine. 2:e124.

Jain, R. 1991. The Art of Computer Systems Performance Analysis: techniques for experimental design, measurement, simulation, and modeling.

Kaiser, JF. 2010. Richard Hamming – You and Your Research. Simula Research Laboratory. :37–60.

Kalibera, T, Jones R. 2013. Rigorous Benchmarking in Reasonable Time. Proceedings of the 2013 International Symposium on Memory Management. :63–74.

Keahey, K, Desprez F. 2012. Supporting Experimental Computer Science.

Kitchenham, BA, Pfleeger SL, Pickard LM, Jones PW, Hoaglin DC, Emam KE, Rosenberg J. 2002. Preliminary guidelines for empirical research in software engineering. IEEE Trans. Softw. Eng. 28:721–734.

Ko, AJ, Latoza TD, Burnett MM. 2015. A Practical Guide to Controlled Experiments of Software Engineering Tools with Human Participants. Empirical Softw. Engg. 20:110–141.

Lea, D, Bacon DF, Grove D. 2008. Languages and performance engineering: method, instrumentation, and pedagogy. SIGPLAN Not. 43:87–92.

Lehrer, J. 2010. The Truth Wears Off: Is there something wrong with the scientific method? The New Yorker. (December 13, 2010):52.

Lilja, DJ. 2005. Measuring Computer Performance: A Practitioner's Guide.

Liskov, B. 1992. Report on Workshop on Research in Experimental Computer Science. :49.

Manegold, S, Manolescu I, Afanasiev L, Feng J, Gou G, Hadjieleftheriou M, Harizopoulos S, Kalnis P, Karanasos K, Laurent D et al. 2010. Repeatability & workability evaluation of SIGMOD 2009. SIGMOD Rec. 38:40–43.

Manolescu, I, Afanasiev L, Arion A, Dittrich J, Manegold S, Polyzotis N, Schnaitter K, Senellart P, Zoupanos S, Shasha D. 2008. The repeatability experiment of SIGMOD 2008. SIGMOD Rec. 37:39–45.

Matthews, JN. 2004. The case for repeated research in operating systems. SIGOPS Oper. Syst. Rev. 38:5–7.

Mitzenmacher, M. 2015. Theory Without Experiments: Have We Gone Too Far? Commun. ACM. 58:40–42.

Mudge, T. 1996. Report on the panel: "how can computer architecture researchers avoid becoming the society for irreproducible results?" SIGARCH Comput. Archit. News. 24:1–5.

Mytkowicz, T, Diwan A, Hauswirth M, Sweeney PF. 2009. Producing wrong data without doing anything obviously wrong! ASPLOS '09: Proceedings of the 14th international conference on Architectural support for programming languages and operating systems. :265–276.

Mytkowicz, T, Diwan A, Hauswirth M, Sweeney PF. 2010. Evaluating the accuracy of Java profilers. PLDI '10: Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation. :187–197.

Norvig, P. 2012. Warning Signs in Experimental Design and Interpretation.

Oreskes, N. 2014. Why we should trust scientists.

Perry, DE, Porter AA, Votta LG. 2000. Empirical studies of software engineering: a roadmap. Proceedings of the Conference on The Future of Software Engineering. :345–355.

Pieterse, V, Flater D. 2014. The ghost in the machine: don't let it haunt your software performance measurements.

Planck, M. 1949. The meaning and limits of exact science. Science. 110.

Popper, K. 1959. The Logic of Scientific Discovery.

Prechelt, L. 1997. Why We Need an Explicit Forum for Negative Results. Journal of Universal Computer Science. 3:1074–1083.

Robertson, J. 2011. Stats: We're Doing It Wrong. BLOG@CACM.

Schulte, E, Davison D, Dye T, Dominik C. 2012. A Multi-Language Computing Environment for Literate Programming and Reproducible Research. Journal of Statistical Software. 46:1–24.

Singer, J. 2011. A literate experimentation manifesto. Proceedings of the 10th SIGPLAN symposium on New ideas, new paradigms, and reflections on programming and software. :91–102.

Sjøberg, DIK, Anda B, Arisholm E, Dybå T, Jørgensen M, Karahasanovic A, Koren EF, Vokáč M. 2002. Conducting Realistic Experiments in Software Engineering. Proceedings of the 2002 International Symposium on Empirical Software Engineering. :17–.

Small, C, Ghosh N, Saleeb H, Seltzer M, Smith K. 1997. Does Systems Research Measure Up?

Tempero, E, Anslow C, Dietrich J, Han T, Li J, Lumpe M, Melton H, Noble J. 2010. Qualitas Corpus: A Curated Collection of Java Code for Empirical Studies. 2010 Asia Pacific Software Engineering Conference (APSEC2010).

Tichy, W. 2011. Empirical software research: an interview with Dag Sjøberg, University of Oslo, Norway. Ubiquity. 2011:2:1–2:14.

Tichy, WF. 1998. Should Computer Scientists Experiment More? Computer. 31:32–40.

Tufte, ER. 1986. The Visual Display of Quantitative Information.

Vitek, J, Kalibera T. 2011. Repeatability, reproducibility, and rigor in systems research. Proceedings of the ninth ACM international conference on Embedded software. :33–38.

Wieringa, R, Heerkens H, Regnell B. 2009. How to Write and Read a Scientific Evaluation Paper. Proceedings of the 2009 17th IEEE International Requirements Engineering Conference, RE. :361–364.

Wilson, EB. 1952. An Introduction to Scientific Research.

Wilson, G, Aranda J. 2011. Empirical Software Engineering. American Scientist. 99(6).

Zannier, C, Melnik G, Maurer F. 2006. On the success of empirical studies in the international conference on software engineering. Proceedings of the 28th international conference on Software engineering. :341–350.