AI helps you reading Science
AI generates interpretation videos
AI extracts and analyses the key points of the paper to generate videos automatically
AI parses the academic lineage of this thesis
AI extracts a summary of this paper
Our evaluation has provided strong evidence of equivalence modulo inputs’s impressive power: 147 confirmed, unique bug reports for GCC and LLVM in a short few months
Compiler validation via equivalence modulo inputs
PLDI, no. 6 (2014): 216-226
We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) statically compiling the program to work on all possible inputs. Indeed, the test inputs induce...More
PPT (Upload PPT)
- Compilers are among the most important, widely-used and complex software ever written.
- Perhaps less known to application programmers is that production compilers do contain bugs, and quite a few.
- When compiler bugs occur, they frustrate programmers and may lead to unintended application behavior and disasters, especially in safety-critical domains.
- Compiler verification has been an important and fruitful area for the verification grand challenge in computing research 
- Compilers are among the most important, widely-used and complex software ever written
- We introduce the novel, general concept of equivalence modulo inputs (EMI) for systematic compiler validation
- From an EMI variant of a Csmith test, we found a miscompilation in GCC and reported it as bug 59747.8 Later, others discovered that GCC miscompiled the movie player mplayer, and filed a new bug report 59824.9 The two bugs turned out to share the same root cause, and subsequently bug 59824 was marked as duplicate
- We have found bugs that result in issues like compiler segfaults, internal compiler errors (ICEs), performance issues at compilation, and wrong code generation, and with various levels of severity, from rejecting valid code to release-blocking miscompilations
- We have described a new validation methodology — equivalence modulo inputs (EMI) — for testing compilers
- Our evaluation has provided strong evidence of EMI’s impressive power: 147 confirmed, unique bug reports for GCC and LLVM in a short few months
- The authors present the extensive evaluation of Orion to demonstrate the practical effectiveness of the EMI methodology besides its conceptual elegance and generality.
Since January 2013, the authors have been experimenting with and refining Orion to find new bugs in three widely-used C compilers, namely GCC, LLVM, and ICC.
- It is worth mentioning that the authors have focused more extensive testing on GCC because of the very quick responses from the GCC developers and relatively slow responses from the LLVM developers.
- This partially explains why the authors have reported more bugs for GCC over LLVM
- The authors have described a new validation methodology — equivalence modulo inputs (EMI) — for testing compilers.
- The distinctive benefits of EMI are its simplicity and wide applicability.
- The authors have applied it to test optimizing C compilers.
- This work complements decades of extensive work on compiler validation and verification by opening up a fresh, exciting avenue of research.
- The authors are actively pursuing future work to refine the general approach and extend it to other languages and settings
- Table1: Bugs reported, marked duplicate, confirmed, and fixed
- Table2: Bug classification
- Table3: Bugs found categorized by modes
- Our work is related to the large body of research on compiler testing and verification. This section surveys some representative, closely related work, which we group into three categories: (1) compiler testing, (2) verified compilers, and (3) translation validation.
Compiler Testing The most directly related is compiler testing, which remains the dominant technique for validating production compilers in practice. One common approach is to maintain a compiler test suite. For example, each major compiler (such as GCC and LLVM) has its own regression test suite, which grows over the time of its development. In addition, there are a number of popular, commercial compiler test suites (e.g. Plum Hall  and SuperTest ) designed for compiler conformance checking and validation. Most of these test suites are written manually.
- This research was supported in part by NSF Grants 0917392, 1117603, 1319187, and 1349528
- ACE. SuperTest compiler test and validation suite. http://www.ace.nl/compiler/supertest.html.
- A. Balestrat. CCG: A random C code generator. https://github.com/Merkil/ccg/.
- S. Bansal and A. Aiken. Automatic generation of peephole superoptimizers. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 394–403, 2006.
- Y. Chen, A. Groce, C. Zhang, W.-K. Wong, X. Fern, E. Eide, and J. Regehr. Taming compiler fuzzers. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 197–208, 2013.
- P. Cuoq, B. Monate, A. Pacalet, V. Prevosto, J. Regehr, B. Yakobowski, and X. Yang. Testing static analyzers with randomly generated programs. In A. Goodloe and S. Person, editors, NASA Formal Methods, volume 7226 of Lecture Notes in Computer Science, pages 120–12Springer Berlin Heidelberg, 2012.
- C. Ellison and G. Rosu. An executable formal semantics of C with applications. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 533–544, 2012.
- GNU Compiler Collection. gcov. http://gcc.gnu.org/onlinedocs/gcc/Gcov.html.
- A. Groce, C. Zhang, E. Eide, Y. Chen, and J. Regehr. Swarm testing. In International Symposium on Software Testing and Analysis (ISSTA), pages 78–88, 2012.
-  C. Holler, K. Herzig, and A. Zeller. Fuzzing with code fragments. In Proceedings of the 21st USENIX Security Symposium, 2012.
-  R. Joshi, G. Nelson, and K. H. Randall. Denali: A goal-directed superoptimizer. In Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 304–314, 2002.
-  X. Leroy. Formal certification of a compiler back-end, or: programming a compiler with a proof assistant. In Proceedings of the 33rd ACM Symposium on Principles of Programming Languages (POPL), pages 42–54. ACM Press, 2006.
-  X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363–446, 2009.
-  G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 237–248, 2010.
-  H. Massalin. Superoptimizer – A look at the smallest program. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 122–126, 1987.
-  W. M. McKeeman. Differential testing for software. Digital Technical Journal, 10(1):100–107, 1998.
-  S. McPeak, D. S. Wilkerson, and S. Goldsmith. Berkeley Delta. http://delta.tigris.org/.
-  E. Nagai, H. Awazu, N. Ishiura, and N. Takeda. Random testing of C compilers targeting arithmetic optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012), pages 48–53, 2012.
-  E. Nagai, A. Hashimoto, and N. Ishiura. Scaling up size and number of expressions in random testing of arithmetic optimization of C compilers. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2013), pages 88–93, 2013.
-  G. C. Necula. Translation validation for an optimizing compiler. In Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 83–94, 2000.
-  Plum Hall, Inc. The Plum Hall Validation Suite for C. http://www.plumhall.com/stec.html.
-  A. Pnueli, M. Siegel, and E. Singerman. Translation validation. In Proceedings of the 4th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS), pages 151–166, London, UK, UK, 1998. Springer-Verlag.
-  J. Regehr. Embedded in academia. http://blog.regehr.org/.
-  J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang. Testcase reduction for C compiler bugs. In Proceedings of the 2012 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 335–346, 2012.
-  R. Tate, M. Stepp, Z. Tatlock, and S. Lerner. Equality saturation: a new approach to optimization. In Proceedings of the 36th ACM SIGPLANSIGACT symposium on Principles of Programming Languages, pages 264–276, 2009.
-  The Clang Team. Clang 3.4 documentation: LibTooling. http://clang.llvm.org/docs/LibTooling.html.
-  J.-B. Tristan, P. Govereau, and G. Morrisett. Evaluating value-graph translation validation for LLVM. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 295–305, 2011.
-  X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 283–294, 2011.
-  C. Zhao, Y. Xue, Q. Tao, L. Guo, and Z. Wang. Automated test program generation for an industrial optimizing compiler. In ICSE Workshop on Automation of Software Test (AST), pages 36–43, 2009.
-  J. Zhao, S. Nagarakatte, M. M. K. Martin, and S. Zdancewic. Formal verification of SSA-based optimizations for LLVM. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 175–186, 2013.