Bug isolation via remote program sampling

PLDI '03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation (also ACM SIGPLAN Notices 38(5)), pages 141–154, 2003

Cited by: 738

Abstract

We propose a low-overhead sampling infrastructure for gathering information from the executions experienced by a program's user community. Several example applications illustrate ways to use sampled instrumentation to isolate bugs. Assertion-dense code can be transformed to share the cost of assertions among many users. Lacking assertions...

Introduction
  • It is an unfortunate fact that essentially all deployed software systems have bugs, and that users often encounter these bugs.
  • Given that deployed software has problems, perhaps the authors can speed up the process of identifying and eliminating those problems by learning something from the enormous number of executions performed by the software’s user community.
  • The data gathered from all executions is analyzed to extract information that helps engineers find and fix problems more quickly; the authors call this automatic bug isolation.
  • In the authors' view, such an infrastructure has several benefits.
Highlights
  • It is an unfortunate fact that essentially all deployed software systems have bugs, and that users often encounter these bugs
  • We present one approach to systematically gathering information about program runs from a large, distributed user community and performing subsequent automatic analysis of that information to help in isolating bugs
  • We show how to isolate deterministic bugs without the benefit of explicit assertions (a minimal sketch of the elimination step follows this list)
  • We have described a sampling infrastructure for gathering information about software from the set of runs produced by its user community
  • Five benchmarks have less than a 10% slowdown, and only one is below 5%
  • We have presented several sample applications: sharing the overhead of assertions, predicate guessing and elimination to isolate a deterministic bug, and regularized logistic regression to isolate a non-deterministic memory corruption error
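The elimination step mentioned above can be condensed into a short sketch. This is a minimal illustration, not the paper's implementation: it assumes each run is reported as the set of instrumented predicates that were sampled as true, plus a crash flag, and the function name and data layout are hypothetical. A predicate stays a candidate only if it was observed true in at least one failing run and never in a successful run.

    def isolate_deterministic_bug(runs):
        # runs: iterable of (observed_true, crashed) pairs, where
        # observed_true is the set of predicate labels sampled as true
        # during one run and crashed marks whether that run failed.
        true_in_failure, true_in_success = set(), set()
        for observed_true, crashed in runs:
            if crashed:
                true_in_failure.update(observed_true)
            else:
                true_in_success.update(observed_true)
        # For a deterministic bug, any predicate seen true in a successful
        # run has a counterexample and is eliminated as a candidate.
        return true_in_failure - true_in_success

    # Example: p3 is only ever observed true in crashing runs, so it survives.
    runs = [({"p1", "p3"}, True), ({"p1", "p2"}, False), ({"p3"}, True)]
    print(isolate_deterministic_bug(runs))  # {'p3'}

Because instrumentation is sampled, sparse sampling simply means more runs are needed before the surviving set shrinks toward the bug; for a deterministic bug, sampling can never eliminate the true predicate, since elimination requires actually observing it true in a successful run.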
Results
  • Five benchmarks have less than a 10% slowdown, and only one is below 5%.
  • Most benchmarks suffer less than a 10% penalty relative to uninstrumented code, and half are below 5%.
  • Performance is uniformly good: at 1/1000 sampling, 94% of site-containing functions incur less than 5% slowdown versus instrumentation-free code, while even the worst single function has less than a 12% penalty.
  • Using an experimental setup similar to that described earlier in Section 3.1.1, the authors find that the overhead for 1/1000 sampling is less than 4%, and progressively sparser sampling rates shrink this still further.
Conclusion
  • The authors have described a sampling infrastructure for gathering information about software from the set of runs produced by its user community.
  • To ensure that rare events are accurately represented, the authors use a Bernoulli process to do the sampling, and the authors have described an efficient implementation of that process.
  • The authors have presented several sample applications: sharing the overhead of assertions, predicate guessing and elimination to isolate a deterministic bug, and regularized logistic regression to isolate a non-deterministic memory corruption error.
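The regression step of that last application can be sketched in a few lines. The following is a minimal numpy illustration of the general idea rather than the paper's exact model: run outcome is regressed on per-run predicate counters, a regularization penalty (L2 here; the paper's penalty may differ) keeps the fit from chasing noise, and counters with the largest positive weights are flagged as most predictive of failure. All names and hyperparameters are illustrative.

    import numpy as np

    def rank_crash_predictors(X, y, l2=1.0, lr=0.1, steps=5000):
        # X: (runs x counters) matrix of sampled predicate counts per run.
        # y: 1.0 for a crashing run, 0.0 for a successful run.
        n, d = X.shape
        w, b = np.zeros(d), 0.0
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-(X @ w + b)))      # predicted P(crash)
            w -= lr * (X.T @ (p - y) / n + l2 * w / n)  # regularized gradient step
            b -= lr * np.mean(p - y)
        # Large positive weights mark counters most correlated with failure.
        return np.argsort(-w)

    # Toy data: counter 2 fires mainly in crashing runs, so it ranks first.
    X = np.array([[5., 0., 7.], [4., 2., 0.], [0., 3., 9.], [1., 4., 0.]])
    y = np.array([1., 0., 1., 0.])
    print(rank_crash_predictors(X, y))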
Tables
  • Table1: Static metrics for CCured benchmarks. Olden benchmarks are listed first, followed by SPECINT95
  • Table2: Relative performance of CCured benchmarks with unconditional or sampled instrumentation. Italics marks cases where sampled instrumentation outperforms unconditional instrumentation
Related work
  • Sampling has a long history, with most applications focusing on performance profiling and optimization. Any sampling system must define a trigger mechanism that signals when a sample is to be taken. Typical triggers include periodic hardware timers/interrupts [8, 25, 27], periodic software event counters (e.g., every nth function call) [3], or both. In most cases, the sampling interval is strictly periodic; this may suffice when hunting for large performance bottlenecks, but may systematically miss rare events.

    The Digital Continuous Profiling Infrastructure [1] is unusual in choosing sampling intervals randomly. However, the random distribution is uniform, such as one sample every 60K to 64K cycles. Samples thus extracted are not independent: once a sample is taken, no sample can occur during the next 59,999 cycles, and exactly one sample must occur within the following 60K–64K cycles. We trigger samples based on a geometric distribution, which correctly models the interval between successes of independent coin tosses. The resulting data is a statistically rigorous fair random sample, which in turn grants access to a large domain of powerful statistical analyses.
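The geometric trigger is simple to state in code. Below is a minimal sketch of the countdown idea (the class and method names are hypothetical, and the paper's real infrastructure compiles the check into fast and slow code paths in C rather than calling a function per site): the gap to the next sample is drawn from a geometric distribution, so each instrumentation site is an independent Bernoulli(p) trial while the common case costs only a decrement and a test.

    import math
    import random

    class GeometricSampler:
        def __init__(self, p=1.0 / 1000.0):
            self.p = p
            self.countdown = self._next_gap()

        def _next_gap(self):
            # Inverse-transform draw from a geometric distribution on
            # {1, 2, ...}: the gap between successes of independent
            # Bernoulli(p) trials. (1 - random()) lies in (0, 1], which
            # keeps the logarithm finite.
            u = 1.0 - random.random()
            return int(math.log(u) / math.log(1.0 - self.p)) + 1

        def should_sample(self):
            # Called once per instrumentation site: a cheap decrement-and-test,
            # with a fresh gap drawn only when a sample fires.
            self.countdown -= 1
            if self.countdown == 0:
                self.countdown = self._next_gap()
                return True
            return False

Unlike a strictly periodic trigger, every site has the same independent chance p of being sampled, so rare events are represented in proportion to their true frequency and the collected data remains a fair random sample amenable to the statistical analyses described above.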
Funding
  • This research was supported in part by NASA Grant No. NAG2-1210; NSF Grant Nos. …
References
  • J. M. Anderson, L. M. Berc, J. Dean, S. Ghemawat, M. R. Henzinger, S.-T. A. Leung, R. L. Sites, M. T. Vandevoorde, C. A. Waldspurger, and W. E. Weihl. Continuous profiling: Where have all the cycles gone? ACM Transactions on Computer Systems, 15(4):357–390, Nov. 1997.
  • M. Arnold and B. Ryder. A framework for reducing the cost of instrumented code. ACM SIGPLAN Notices, 36(5):168–179, May 2001.
  • M. Arnold and P. F. Sweeney. Approximating the calling context tree via sampling. Research report RC 21789 (98099), IBM T.J. Watson Research Center, Yorktown Heights, New York, July 7, 2000.
  • Association for Computing Machinery. Proceedings of the International Conference on Software Engineering, Orlando, Florida, May 2002.
  • J. Bowring, A. Orso, and M. J. Harrold. Monitoring deployed software using software tomography. In M. B. Dwyer, editor, Proceedings of the 2002 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE-02), volume 28(1) of ACM SIGSOFT Software Engineering Notes, pages 2–9. ACM Press, 2002.
  • L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Statistics/Probability Series. Wadsworth Publishing Company, Belmont, California, U.S.A., 1984.
  • P. Broadwell, M. Harren, and N. Sastry. Scrash: A system for generating secure crash information. In Proceedings of the 12th USENIX Security Symposium, Washington, DC, Aug. 4–8, 2003. To appear.
  • M. Burrows, Ú. Erlingsson, S.-T. Leung, M. Vandevoorde, C. Waldspurger, K. Walker, and B. Weihl. Efficient and flexible value sampling. ACM SIGPLAN Notices, 35(11):160–167, Nov. 2000.
  • J. Canny. Collaborative filtering with privacy. In Proceedings of the IEEE Symposium on Research in Security and Privacy, pages 45–57, Oakland, CA, May 2002. IEEE Computer Society Press.
  • M. C. Carlisle. Olden: Parallelizing Programs with Dynamic Data Structures on Distributed-Memory Machines. PhD thesis, Department of Computer Science, Princeton University, June 1996.
  • C. Dellarocas. Immunizing online reputation reporting systems against unfair ratings and discriminatory behavior. In Proceedings of the 2nd ACM Conference on Electronic Commerce (EC-00), pages 150–157. ACM, 2000.
  • B. Demsky and M. C. Rinard. Role-based exploration of object-oriented programs. In Proceedings of the International Conference on Software Engineering [4].
  • M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE Transactions on Software Engineering, 27(2):1–25, Feb. 2001.
  • D. Esler. Welcome to the virtual ramp. Overhaul & Maintenance, VII(2):55, Mar. 2001.
  • T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D. Bloomfield, and E. S. Lander. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286(5439):531–537, 1999.
  • S. Hangal and M. S. Lam. Tracking down software bugs using automatic anomaly detection. In Proceedings of the International Conference on Software Engineering [4], pages 291–301.
  • T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer Series in Statistics. Springer, 2001.
  • M. Hirzel and T. Chilimbi. Bursty tracing: A framework for low-overhead temporal profiling. In 4th ACM Workshop on Feedback-Directed and Dynamic Optimization, Austin, Texas, Dec. 1, 2001.
  • Microsoft Corp. Microsoft 2002 annual report and form 10-K. Available at <http://www.microsoft.com/msft/ar02/>, Redmond, Washington, 2002.
  • B. Miller, D. Koski, C. P. Lee, V. Maganty, R. Murthy, A. Natarajan, and J. Steidl. Fuzz revisited: A re-examination of the reliability of UNIX utilities and services. Technical report, Computer Science Department, University of Wisconsin, Madison, WI, 1995.
  • G. Necula, S. McPeak, and W. Weimer. CCured: Type-safe retrofitting of legacy code. In C. Norris and J. B. Fenwick, Jr., editors, Proceedings of the 2002 ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL-02), volume 37(1) of ACM SIGPLAN Notices, pages 128–139. ACM Press, 2002.
  • S. P. Reiss and M. Renieris. Encoding program executions. In Proceedings of the 23rd International Conference on Software Engineering (ICSE-01), pages 221–232. IEEE Computer Society, 2001.
  • SPEC 95. Standard Performance Evaluation Corporation benchmarks. <http://www.spec.org/osg/cpu95/CINT95/>, July 1995.
  • R. Tibshirani, T. Hastie, B. Narasimhan, and G. Chu. Diagnosis of multiple cancer types by shrunken centroids of gene expression. PNAS, 99(10):6567–6572, 2002.
  • O. Traub, S. Schechter, and M. D. Smith. Ephemeral instrumentation for lightweight program profiling. Unpublished technical report, Department of Electrical Engineering and Computer Science, Harvard University, Cambridge, Massachusetts, June 2000.
  • D. M. Volpano and G. Smith. A type-based approach to program security. In M. Bidoit and M. Dauchet, editors, TAPSOFT '97: Theory and Practice of Software Development, volume 1214 of Lecture Notes in Computer Science, pages 607–621. Springer-Verlag, 1997.
  • J. Whaley. A portable sampling-based profiler for Java virtual machines. In Proceedings of the ACM 2000 Conference on Java Grande, pages 78–87. ACM Press, 2000.
  • S. Zdancewic, L. Zheng, N. Nystrom, and A. C. Myers. Untrusted hosts and confidentiality: Secure program partitioning. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP '01), pages 1–14, Chateau Lake Louise, Banff, Alberta, Canada, Oct. 2001. Appeared as ACM Operating Systems Review, 35(5).