This paper presents the first study attempting client-side temporal API mining with static analysis beyond trivial alias analysis and history abstractions

Static specification mining using automata-based abstractions

IEEE Transactions on Software Engineering, no. 5 (2008): 651-666


Abstract

We present a novel approach to client-side mining of temporal API specifications based on static analysis. Specifically, we present an interprocedural analysis over a combined domain that abstracts both aliasing and event sequences for individual objects. The analysis uses a new family of automata-based abstractions to represent unbounded...

Introduction
  • Specifications of program behavior play a central role in many software engineering technologies.
  • Most such research addresses dynamic analysis, inferring specifications from observed behavior of representative program runs.
  • Dynamic analysis requires someone to build, deploy, and set up an appropriate environment for a program run.
  • These tasks, difficult and time-consuming for a human, lie far beyond the reach of today’s automated technologies.
Highlights
  • There is only one thing more painful than learning from experience and that is not learning from experience. – Archibald MacLeish
  • Specifications of program behavior play a central role in many software engineering technologies.
  • The amount of code available for inspection vastly exceeds the amount of code amenable to automated dynamic analysis
  • We present a parameterized framework for history abstractions, based on intuition regarding the structure of API specifications
  • We have implemented a prototype of our analysis based on the WALA analysis framework [19] and the typestate analysis framework of [9]
  • When using Total merge, we only show results for Past history abstraction; results for Future would be similar under this aggressive merge criterion
  • This paper presents the first study attempting client-side temporal API mining with static analysis beyond trivial alias analysis and history abstractions
Methods
  • The naive approach outputs the union of all the input automata as the API specification, without any noise reduction.
  • This approach treats all traces uniformly, regardless of their frequency.
  • A better straightforward statistical approach uses a weighted union of the input automata to identify and eliminate infrequent behaviors.
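
The contrast between the two summarization approaches above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes each input automaton is given as a set of labeled transitions over shared state names, and the names `naive_union`, `weighted_union`, and `min_support`, as well as the example events, are hypothetical.

```python
from collections import Counter

# Assumption: an automaton is modeled as a set of labeled transitions
# (source_state, event, target_state) over shared state names.
Transition = tuple[str, str, str]


def naive_union(automata: list[set[Transition]]) -> set[Transition]:
    """Keep every transition seen in any input automaton (no noise reduction)."""
    result: set[Transition] = set()
    for automaton in automata:
        result |= automaton
    return result


def weighted_union(automata: list[set[Transition]], min_support: float) -> set[Transition]:
    """Keep only transitions that occur in at least `min_support` of the inputs."""
    counts = Counter(t for automaton in automata for t in automaton)
    threshold = min_support * len(automata)
    return {t for t, n in counts.items() if n >= threshold}


# Hypothetical input automata; the last one models a rare usage pattern.
inputs = [
    {("init", "open", "opened"), ("opened", "close", "closed")},
    {("init", "open", "opened"), ("opened", "close", "closed")},
    {("init", "open", "opened"), ("opened", "read", "opened")},
    {("init", "close", "closed")},
]

print(sorted(naive_union(inputs)))
print(sorted(weighted_union(inputs, min_support=0.5)))
```

In this sketch the naive union keeps all four distinct transitions, while the weighted union with `min_support=0.5` drops the two transitions that each appear in only one of the four inputs.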
Results
  • The authors have implemented a prototype of the analysis based on the WALA analysis framework [19] and the typestate analysis framework of [9].
  • The authors' analysis builds on a general Reps-Horwitz-Sagiv (RHS) IFDS tabulation solver implementation [17].
  • The authors extended the RHS solver to support dynamic changes and merges in the set of dataflow facts.
  • Some APIs appear in several separate benchmarks, while others appear in several programs contained within the same benchmark.
  • [Table residue, values not recovered. Configurations: Base/Future/Ext, APF/Past/Total, APF/Past/Ext, APF/Future/Ext; per-configuration columns: states, edges, avg. degree; API rows: Auth, Channel, ChannelMgr, Cipher, Connection.]
Conclusion
  • The authors' experiments indicate that both a precise-enough heap abstraction and a precise-enough history abstraction are required to mine a reasonable specification.

    Without such abstractions, the collected abstract histories might deteriorate to a point in which no summarization algorithm will recover the lost information.
  • The specification mined for the Photo API using the Base heap abstraction has a single state.
  • This means that the specification does not contain any temporal information on the ordering of events.
  • It is possible to employ the analysis with a predetermined timeout.
  • In such cases, the specification obtained using the analysis will not over-approximate code base behavior, but may still help understand some behaviors.
  • The authors plan to conduct further research into modular analysis techniques and improved summarization heuristics, to move closer to practical application of this technology.
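
The point above about a single-state specification carrying no temporal information can be illustrated with a small sketch. The event names and both automata are hypothetical and are not taken from the Photo API; the sketch only assumes that a specification is a deterministic automaton whose states are all accepting.

```python
def accepts(transitions, start, events):
    """Run a deterministic automaton; reject on any undefined transition."""
    state = start
    for event in events:
        key = (state, event)
        if key not in transitions:
            return False
        state = transitions[key]
    return True


# Hypothetical single-state specification: every event loops on the same state.
single_state = {("s", e): "s" for e in ("open", "read", "close")}

# Hypothetical multi-state specification that does constrain event order.
ordered = {
    ("init", "open"): "opened",
    ("opened", "read"): "opened",
    ("opened", "close"): "closed",
}

in_order = ["open", "read", "close"]
out_of_order = ["close", "read", "open"]

print(accepts(single_state, "s", in_order), accepts(single_state, "s", out_of_order))
print(accepts(ordered, "init", in_order), accepts(ordered, "init", out_of_order))
```

The single-state automaton accepts both sequences, so it cannot distinguish a correct ordering from an incorrect one, whereas the multi-state automaton rejects the out-of-order sequence.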
Tables
  • Table 1: Results of mining the running example with varying heap abstractions and merge algorithms
  • Table 2: Benchmarks
  • Table 3: Characteristics of our mined specifications with varying data collectors. For every mined specification DFA, we show the number of states, edges, and the density of the DFA
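
The per-DFA characteristics named in the Table 3 caption can be computed as in the sketch below. It assumes that "density" means the average number of outgoing edges per state (edges divided by states); the source does not define the term, and the function name `dfa_metrics` and the example automaton are hypothetical.

```python
def dfa_metrics(transitions):
    """Return (number of states, number of edges, average degree) for a DFA.

    `transitions` maps (source_state, event) to target_state.
    Average degree is taken to be edges / states, which is an assumption.
    """
    states = {source for (source, _event) in transitions} | set(transitions.values())
    edges = len(transitions)
    return len(states), edges, edges / len(states)


# Hypothetical mined specification with three states and three edges.
example = {
    ("init", "open"): "opened",
    ("opened", "read"): "opened",
    ("opened", "close"): "closed",
}

print(dfa_metrics(example))
```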
Related work
  • Dynamic Analysis. When it is feasible to run a program with adequate coverage, dynamic analysis represents the most attractive option for specification mining, since dynamic analysis does not suffer from the difficulties inherent to abstraction.

    Cook and Wolf [5] consider the general problem of extracting an FSM model from an event trace, and reduce the problem to the well-known grammar inference [11] problem. Cook and Wolf discuss algorithmic, statistical, and hybrid approaches, and present an excellent overview of the approaches and fundamental challenges. This work considers mining automata from uninterpreted event traces, attaching no semantic meaning to events.

    Ammons et al. [2] infer temporal and data dependence specifications based on dynamic trace data. This work applies sophisticated probabilistic learning techniques to boil traces down to collections of finite automata which characterize the behavior.
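
As a rough illustration of what extracting an automaton from uninterpreted event traces involves, the sketch below builds a prefix tree acceptor, a common starting point for state-merging grammar inference. It is not the method of Cook and Wolf or of Ammons et al.; the function name `build_prefix_tree` and the traces are hypothetical.

```python
def build_prefix_tree(traces):
    """Build a prefix tree acceptor from a list of event traces.

    States are integers, with 0 as the start state. Traces sharing a prefix
    share the corresponding states; the end of each trace is marked accepting.
    """
    transitions = {}
    accepting = set()
    next_state = 1
    for trace in traces:
        state = 0
        for event in trace:
            key = (state, event)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting.add(state)
    return transitions, accepting


# Hypothetical observed traces; events carry no semantic meaning here.
traces = [
    ["open", "read", "close"],
    ["open", "close"],
    ["open", "read", "read", "close"],
]

transitions, accepting = build_prefix_tree(traces)
print(len(transitions), sorted(accepting))
```

The resulting automaton accepts exactly the observed traces; inference techniques then generalize it, for example by merging states that behave alike.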
Reference
  • [1] R. Alur, P. Cerny, P. Madhusudan, and W. Nam. Synthesis of interface specifications for Java classes. SIGPLAN Not., 40(1):98–109, 2005.
  • [2] G. Ammons, R. Bodik, and J. R. Larus. Mining specifications. In POPL ’02: Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 4–16, New York, NY, USA, 2002. ACM Press.
  • [3] L. O. Andersen. Program Analysis and Specialization for the C Programming Language. PhD thesis, DIKU, Univ. of Copenhagen, May 1994. (DIKU report 94/19).
  • [4] D. Chase, M. Wegman, and F. Zadeck. Analysis of pointers and structures. In Proc. ACM Conf. on Programming Language Design and Implementation, pages 296–310, New York, NY, 1990. ACM Press.
  • [5] J. E. Cook and A. L. Wolf. Discovering models of software processes from event-based data. ACM Trans. Softw. Eng. Methodol., 7(3):215–249, 1998.
  • [6] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL ’77: Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, pages 238–252, New York, NY, USA, 1977. ACM Press.
  • [7] D. Engler, D. Y. Chen, S. Hallem, A. Chou, and B. Chelf. Bugs as deviant behavior: a general approach to inferring errors in systems code. In SOSP ’01: Proceedings of the eighteenth ACM symposium on Operating systems principles, pages 57–72, New York, NY, USA, 2001. ACM Press.
  • [8] M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE Transactions on Software Engineering, 27(2):99–123, Feb. 2001.
  • [9] S. Fink, E. Yahav, N. Dor, G. Ramalingam, and E. Geay. Effective typestate verification in the presence of aliasing. In ISSTA ’06: Proceedings of the 2006 international symposium on Software testing and analysis, pages 133–144, New York, NY, USA, 2006. ACM Press.
  • [10] Gallery of mined specification. http://tinyurl.com/23qct8 or http://docs.google.com/View?docid=ddhtqgv6 10hbczjd.
  • [11] E. M. Gold. Language identification in the limit. Information and Control, 10:447–474, 1967.
  • [12] S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection. May 2002.
  • [13] V. B. Livshits and T. Zimmermann. Dynamine: Finding common error patterns by mining software revision histories. In Proceedings of the 13th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE-13), pages 296–305, Sept. 2005.
  • [14] D. Mandelin, L. Xu, R. Bodik, and D. Kimelman. Jungloid mining: helping to navigate the API jungle. In PLDI ’05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, pages 48–61, New York, NY, USA, 2005. ACM Press.
  • [15] M. G. Nanda, C. Grothoff, and S. Chandra. Deriving object typestates in the presence of inter-object references. In OOPSLA ’05: Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming, systems, languages, and applications, pages 77–96, New York, NY, USA, 2005. ACM Press.
  • [16] M. Pistoia, D. Reller, D. Gupta, M. Nagnur, and A. K. Ramani. Java 2 Network Security. Prentice Hall PTR, Upper Saddle River, NJ, USA, second edition, August 1999.
  • [17] T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In Proc. ACM Symp. on Principles of Programming Languages, pages 49–61, 1995.
  • [18] A. Salcianu and M. Rinard. Purity and side effect analysis for Java programs. In VMCAI’05: Proceedings of the 6th International Conference on Verification, Model Checking, and Abstract Interpretation, 2005.
  • [19] WALA: The T. J. Watson Libraries for Analysis. http://wala.sourceforge.net.
  • [20] W. Weimer and G. Necula. Mining temporal specifications for error detection. In TACAS, 2005.
  • [21] J. Whaley, M. C. Martin, and M. S. Lam. Automatic extraction of object-oriented component interfaces. In Proceedings of the International Symposium on Software Testing and Analysis, pages 218–228. ACM Press, July 2002.
  • [22] J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das. Perracotta: mining temporal API rules from imperfect traces. In ICSE ’06: Proceeding of the 28th international conference on Software engineering, pages 282–291, New York, NY, USA, 2006. ACM Press.
  • [23] G. Yorsh, E. Yahav, and S. Chandra. Symbolic summarization with applications to typestate verification. Technical report, Tel Aviv University, 2007. www.cs.tau.ac.il/∼gretay.