
Compiler validation via equivalence modulo inputs

PLDI 2014 (ACM SIGPLAN Notices 49(6)), pages 216-226


Abstract

We introduce equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers. Our key insight is to exploit the close interplay between (1) dynamically executing a program on some test inputs and (2) statically compiling the program to work on all possible inputs. Indeed, the test inputs induce a natural collection of the original program's EMI variants, which can help differentially test any compiler and specifically target the difficult-to-find miscompilations.
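To make the definition concrete, here is a minimal illustrative sketch (our own example, not one from the paper): a tiny program whose else branch is never executed under the chosen test input, so an EMI variant may delete it.

    #include <stdio.h>

    int main(void) {
        int input = 5;                     /* the profiling test input */
        if (input > 0)
            printf("%d\n", input + 1);     /* executed: prints 6 */
        else
            printf("never reached\n");     /* unexecuted for input = 5; an EMI
                                              variant may delete this branch */
        return 0;
    }

Deleting the else branch yields an EMI variant that is observationally identical to the original on input = 5. Compiling both with the same optimizations (say, -O2) and diffing their outputs on that input therefore differentially tests the compiler: any divergence is a compiler bug, since the source-level equivalence holds by construction.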

Introduction
  • Compilers are among the most important, widely-used and complex software ever written.
  • Perhaps less known to application programmers is that production compilers do contain bugs, and quite a few.
  • When compiler bugs occur, they frustrate programmers and may lead to unintended application behavior and disasters, especially in safety-critical domains.
  • Compiler verification has been an important and fruitful area for the verification grand challenge in computing research [9].
Highlights
  • Compilers are among the most important, widely-used and complex software ever written.
  • We introduce the novel, general concept of equivalence modulo inputs (EMI) for systematic compiler validation (a sketch of the resulting differential-testing loop follows this list).
  • From an EMI variant of a Csmith test, we found a miscompilation in GCC and reported it as bug 59747. Later, others discovered that GCC miscompiled the movie player mplayer and filed a new bug report, 59824. The two bugs turned out to share the same root cause, and bug 59824 was subsequently marked as a duplicate.
  • We have found bugs that manifest as compiler segfaults, internal compiler errors (ICEs), compile-time performance problems, and wrong code generation, at various levels of severity, from rejecting valid code to release-blocking miscompilations.
  • We have described a new validation methodology — equivalence modulo inputs (EMI) — for testing compilers.
  • Our evaluation has provided strong evidence of EMI's impressive power: 147 confirmed, unique bug reports for GCC and LLVM in just a few months.
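As a rough sketch of that validation loop (our own illustration, assuming a POSIX system with a cc driver on the PATH, and pre-generated files orig.c and variant.c, where variant.c is an EMI variant of orig.c):

    #include <stdio.h>
    #include <stdlib.h>

    /* Drive one round of EMI differential testing: compile the original test
     * program and one of its EMI variants at the same optimization level,
     * run both on the profiling input, and compare their outputs. */
    int main(void) {
        if (system("cc -O2 -o orig orig.c") != 0 ||
            system("cc -O2 -o variant variant.c") != 0)
            return 1;    /* a compile failure or crash is itself a bug candidate */
        if (system("./orig > orig.out") != 0 ||
            system("./variant > variant.out") != 0)
            return 1;
        /* The two programs are equivalent on this input by construction, so any
         * output difference indicates a miscompilation at this setting. */
        if (system("diff -q orig.out variant.out") != 0)
            puts("potential compiler bug: outputs differ");
        else
            puts("outputs agree");
        return 0;
    }

In practice this loop would be iterated over many randomly generated programs and many EMI variants per program, with each divergent case reduced before being reported.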
Results
  • The authors present an extensive evaluation of Orion to demonstrate the practical effectiveness of the EMI methodology, beyond its conceptual elegance and generality (a sketch of Orion's core pruning decision follows this list).

    Since January 2013, the authors have been experimenting with and refining Orion to find new bugs in three widely used C compilers, namely GCC, LLVM, and ICC.
  • It is worth mentioning that the authors focused their testing more heavily on GCC because GCC developers responded very quickly, while LLVM developers responded relatively slowly.
  • This partially explains why the authors reported more bugs for GCC than for LLVM.
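A minimal sketch of the pruning decision at the heart of Orion's variant generation, as the summary above describes it: statements covered by the profiling run are kept, and each unexecuted statement is deleted with some probability. The statement traversal and the coverage source (e.g., gcov-style line counts [7]) are elided, and P_DELETE is an assumed parameter, not the paper's exact policy.

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define P_DELETE 0.5   /* assumed deletion probability for unexecuted code */

    /* Decide whether to delete one statement while generating an EMI variant.
     * `executed` comes from profiling the program on its test input. */
    bool should_delete(bool executed) {
        if (executed)
            return false;                              /* never touch covered code */
        return (double)rand() / RAND_MAX < P_DELETE;   /* coin flip on dead code */
    }

    int main(void) {
        srand(42);                     /* fixed seed: variant generation is reproducible */
        for (int i = 0; i < 5; i++)    /* five hypothetical unexecuted statements */
            printf("statement %d: %s\n", i,
                   should_delete(false) ? "delete" : "keep");
        return 0;
    }

Because only unexecuted code is ever removed, every generated variant is equivalent to the original modulo the profiled input, which is exactly the invariant the differential comparison relies on.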
Conclusion
  • The authors have described a new validation methodology — equivalence modulo inputs (EMI) — for testing compilers.
  • The distinctive benefits of EMI are its simplicity and wide applicability.
  • The authors have applied it to test optimizing C compilers.
  • This work complements decades of extensive work on compiler validation and verification by opening up a fresh, exciting avenue of research.
  • The authors are actively pursuing future work to refine the general approach and extend it to other languages and settings.
Tables
  • Table1: Bugs reported, marked duplicate, confirmed, and fixed
  • Table2: Bug classification
  • Table3: Bugs found categorized by modes
Related work
  • Our work is related to the large body of research on compiler testing and verification. This section surveys some representative, closely related work, which we group into three categories: (1) compiler testing, (2) verified compilers, and (3) translation validation.

    Compiler Testing. The most directly related work is compiler testing, which remains the dominant technique for validating production compilers in practice. One common approach is to maintain a compiler test suite. For example, each major compiler (such as GCC and LLVM) has its own regression test suite, which grows over the course of its development. In addition, there are a number of popular commercial compiler test suites (e.g., Plum Hall [21] and SuperTest [1]) designed for compiler conformance checking and validation. Most of these test suites are written manually.
Funding
  • This research was supported in part by NSF Grants 0917392, 1117603, 1319187, and 1349528
References
  • [1] ACE. SuperTest compiler test and validation suite. http://www.ace.nl/compiler/supertest.html.
  • [2] A. Balestrat. CCG: A random C code generator. https://github.com/Merkil/ccg/.
  • [3] S. Bansal and A. Aiken. Automatic generation of peephole superoptimizers. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 394-403, 2006.
  • [4] Y. Chen, A. Groce, C. Zhang, W.-K. Wong, X. Fern, E. Eide, and J. Regehr. Taming compiler fuzzers. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 197-208, 2013.
  • [5] P. Cuoq, B. Monate, A. Pacalet, V. Prevosto, J. Regehr, B. Yakobowski, and X. Yang. Testing static analyzers with randomly generated programs. In A. Goodloe and S. Person, editors, NASA Formal Methods, volume 7226 of Lecture Notes in Computer Science, pages 120-125. Springer Berlin Heidelberg, 2012.
  • [6] C. Ellison and G. Rosu. An executable formal semantics of C with applications. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 533-544, 2012.
  • [7] GNU Compiler Collection. gcov. http://gcc.gnu.org/onlinedocs/gcc/Gcov.html.
  • [8] A. Groce, C. Zhang, E. Eide, Y. Chen, and J. Regehr. Swarm testing. In International Symposium on Software Testing and Analysis (ISSTA), pages 78-88, 2012.
  • [9] C. A. R. Hoare. The verifying compiler: A grand challenge for computing research. Journal of the ACM, 50(1):63-69, 2003.
  • [10] C. Holler, K. Herzig, and A. Zeller. Fuzzing with code fragments. In Proceedings of the 21st USENIX Security Symposium, 2012.
  • [11] R. Joshi, G. Nelson, and K. H. Randall. Denali: A goal-directed superoptimizer. In Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 304-314, 2002.
  • [12] X. Leroy. Formal certification of a compiler back-end, or: programming a compiler with a proof assistant. In Proceedings of the 33rd ACM Symposium on Principles of Programming Languages (POPL), pages 42-54, 2006.
  • [13] X. Leroy. A formally verified compiler back-end. Journal of Automated Reasoning, 43(4):363-446, 2009.
  • [14] G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 237-248, 2010.
  • [15] H. Massalin. Superoptimizer: A look at the smallest program. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 122-126, 1987.
  • [16] W. M. McKeeman. Differential testing for software. Digital Technical Journal, 10(1):100-107, 1998.
  • [17] S. McPeak, D. S. Wilkerson, and S. Goldsmith. Berkeley Delta. http://delta.tigris.org/.
  • [18] E. Nagai, H. Awazu, N. Ishiura, and N. Takeda. Random testing of C compilers targeting arithmetic optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012), pages 48-53, 2012.
  • [19] E. Nagai, A. Hashimoto, and N. Ishiura. Scaling up size and number of expressions in random testing of arithmetic optimization of C compilers. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2013), pages 88-93, 2013.
  • [20] G. C. Necula. Translation validation for an optimizing compiler. In Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 83-94, 2000.
  • [21] Plum Hall, Inc. The Plum Hall Validation Suite for C. http://www.plumhall.com/stec.html.
  • [22] A. Pnueli, M. Siegel, and E. Singerman. Translation validation. In Proceedings of the 4th International Conference on Tools and Algorithms for Construction and Analysis of Systems (TACAS), pages 151-166. Springer-Verlag, 1998.
  • [23] J. Regehr. Embedded in Academia. http://blog.regehr.org/.
  • [24] J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang. Test-case reduction for C compiler bugs. In Proceedings of the 2012 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 335-346, 2012.
  • [25] R. Tate, M. Stepp, Z. Tatlock, and S. Lerner. Equality saturation: a new approach to optimization. In Proceedings of the 36th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 264-276, 2009.
  • [26] The Clang Team. Clang 3.4 documentation: LibTooling. http://clang.llvm.org/docs/LibTooling.html.
  • [27] J.-B. Tristan, P. Govereau, and G. Morrisett. Evaluating value-graph translation validation for LLVM. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 295-305, 2011.
  • [28] X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 283-294, 2011.
  • [29] C. Zhao, Y. Xue, Q. Tao, L. Guo, and Z. Wang. Automated test program generation for an industrial optimizing compiler. In ICSE Workshop on Automation of Software Test (AST), pages 36-43, 2009.
  • [30] J. Zhao, S. Nagarakatte, M. M. K. Martin, and S. Zdancewic. Formal verification of SSA-based optimizations for LLVM. In Proceedings of the 2013 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 175-186, 2013.