# Neural Bridge Sampling for Evaluating Safety-Critical Autonomous Systems

NeurIPS 2020, pp. 6402–6416

Abstract

Learning-based methodologies increasingly find applications in safety-critical domains like autonomous driving and medical robotics. Due to the rare nature of dangerous events, real-world testing is prohibitively expensive and unscalable. In this work, we employ a probabilistic approach to safety evaluation in simulation, where we are con…

Introduction

- Data-driven and learning-based approaches have the potential to enable robots and autonomous systems that intelligently interact with unstructured environments.
- Currently deployed safety-critical autonomous systems are limited to structured environments that allow mechanisms such as PID control, simple verifiable protocols, or convex optimization to enable guarantees for properties like stability, consensus, or recursive feasibility.
- The stylized settings of these problems and the limited expressivity of guaranteeable properties are barriers to solving unstructured, real-world tasks such as autonomous navigation, locomotion, and manipulation.
- The authors assume access to a simulator to test the system’s performance.
- Given a distribution X ∼ P0 of simulation parameters that describe typical environments for the system under test, the governing problem is to estimate the probability of an adverse event, pγ := P0(f(X) ≤ γ).
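As a point of reference, the estimation problem above can be attacked with plain Monte Carlo sampling from P0. The sketch below uses a toy stand-in for the simulator f and assumes a standard-normal P0 (both assumptions for illustration only); it shows the baseline the paper improves upon and why that baseline is unscalable for rare events.

```python
import numpy as np

def f(x):
    # Toy stand-in for the black-box simulator's safety score;
    # an adverse event is f(x) <= gamma.
    return np.linalg.norm(x, axis=-1)

def naive_mc_estimate(n, gamma, dim=2, seed=0):
    """Plain Monte Carlo estimate of p_gamma = P0(f(X) <= gamma)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, dim))  # X ~ P0 (standard normal here)
    return np.mean(f(x) <= gamma)

# The relative error of plain MC scales like 1/sqrt(n * p_gamma), so the
# required simulator budget explodes as failures become rarer (p_gamma -> 0).
p_hat = naive_mc_estimate(100_000, gamma=0.1)
```

This is why the paper needs variance-reduction machinery: at failure probabilities of 10⁻⁶ or smaller, plain MC would require billions of simulator queries for a usable estimate.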

Highlights

- Data-driven and learning-based approaches have the potential to enable robots and autonomous systems that intelligently interact with unstructured environments
- A major focus of this work is empirical, and Section 4 empirically demonstrates the superiority of neural bridge sampling over competing techniques in a variety of applications: (i) we evaluate the sensitivity of a formally verified system to domain shift, (ii) we consider design optimization for high-precision rockets, and (iii) we perform model comparisons for two learning-based approaches to autonomous navigation
- We describe a Markov-chain Monte Carlo (MCMC) method that combines exploration, exploitation, and optimization to draw samples X_i^k ∼ P_k
- We consider two examples of using neural bridge sampling as a tool for engineering design in high-dimensional settings: (a) comparing thruster sizes to safely land a rocket [12] in the presence of wind, and (b) comparing two algorithms on the OpenAI Gym CarRacing environment [54]
- We show only Monte Carlo (MC) and neural bridge sampling (NB) for clarity; comparisons with other methods are in Table 1
- We intend to investigate how efficiently sampling rare failures—like we propose here for evaluation—could enable the automated repair of safety-critical reinforcement-learning agents
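The exploration/exploitation/optimization recipe described in the highlights above belongs to the family of multilevel splitting methods with gradient-informed MCMC moves. The sketch below is a deliberately simplified caricature under toy assumptions (a synthetic f with an analytic gradient, a standard-normal P0, and a crude Metropolis step that ignores proposal asymmetry); it is not the paper's Algorithm 1, which additionally learns a normalizing-flow warp of the intermediate distributions.

```python
import numpy as np

def f(x):
    # Toy stand-in for the black-box simulator's safety score.
    return np.linalg.norm(x, axis=-1)

def grad_f(x):
    # Analytic gradient of the toy f; the paper assumes a gradient oracle.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), 1e-12)

def multilevel_splitting(gamma, n=2000, frac=0.5, step=0.05, dim=2, seed=0):
    """Crude adaptive multilevel splitting estimate of P0(f(X) <= gamma).

    Each round: lower the working threshold to the `frac`-quantile of the
    current population (exploration), record the surviving fraction,
    resample survivors (exploitation), and refresh them with a few
    gradient-informed Metropolis moves targeting N(0, I) restricted to
    the current level set (optimization).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, dim))  # X ~ P0
    log_p, level = 0.0, np.inf
    while level > gamma:
        level = max(np.quantile(f(x), frac), gamma)
        keep = f(x) <= level
        log_p += np.log(np.mean(keep))  # fraction surviving this level
        x = x[keep][rng.integers(0, keep.sum(), n)]  # resample survivors
        for _ in range(5):
            # Drift downhill along grad f, plus Gaussian exploration noise.
            prop = x - step * grad_f(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
            # Simplified Metropolis test w.r.t. the standard-normal prior
            # (the proposal's asymmetry is ignored here for brevity).
            log_acc = 0.5 * (np.sum(x**2, axis=-1) - np.sum(prop**2, axis=-1))
            ok = (f(prop) <= level) & (np.log(rng.uniform(size=n)) < log_acc)
            x[ok] = prop[ok]
    return np.exp(log_p)
```

The product of per-level survival fractions estimates the overall failure probability; the paper's contribution is making each of these pieces statistically efficient and correcting the resulting bias with bridge sampling.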

Methods

- The authors evaluate the approach on a variety of scenarios showcasing its use in efficiently evaluating the safety of autonomous systems.
- All methods are given the same computational budget as measured by evaluations of the simulator.
- The budget varies from 50,000 to 100,000 simulator queries for Algorithm 1, depending on pγ.
- Although Algorithm 1 is run with a fixed γ, the authors evaluate estimates pγtest for all γtest ≥ γ.
- The authors calculate the ground-truth values pγtest for non-synthetic problems using a fixed, very large number of MC queries
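The point above about evaluating every threshold γtest ≥ γ from a single run is possible whenever the sampler returns per-sample outcomes with importance weights; the helper below is a hypothetical post-processing sketch under that assumption, not an interface the paper specifies.

```python
import numpy as np

def estimates_for_thresholds(f_vals, weights, gammas):
    """Return p_hat(gamma_test) = sum_i w_i * 1{f_i <= gamma_test}
    for each requested threshold gamma_test, reusing the same
    simulator outcomes f_vals at no extra query cost."""
    f_vals, weights = np.asarray(f_vals), np.asarray(weights)
    return np.array([np.sum(weights * (f_vals <= g)) for g in gammas])

# With plain MC the weights are all 1/n; an importance sampler would
# supply non-uniform weights instead.
p_hats = estimates_for_thresholds([0.1, 0.2, 0.3, 0.4], [0.25] * 4, [0.15, 0.35])
```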

Conclusion

- There is a growing need for rigorous evaluation of safety-critical systems which contain components without formal guarantees.
- Evaluating the safety of such systems in the presence of rare, catastrophic events is a necessary component in enabling the development of trustworthy high-performance systems.
- Neural bridge sampling employs three concepts—exploration, exploitation, and optimization—to evaluate system safety with provable statistical and computational efficiency.
- The authors intend to investigate how efficiently sampling rare failures—like the authors propose here for evaluation—could enable the automated repair of safety-critical reinforcement-learning agents.

- Table 1: Relative mean-square error E[(p̂γ/pγ − 1)²] over 10 trials
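For concreteness, the Table 1 metric can be computed from repeated trials as follows (a straightforward sketch; p̂γ is the estimate and pγ the ground-truth value):

```python
import numpy as np

def relative_mse(estimates, p_true):
    """Relative mean-square error E[(p_hat / p - 1)^2] across repeated
    trials, i.e. squared error measured relative to the true probability."""
    estimates = np.asarray(estimates, dtype=float)
    return np.mean((estimates / p_true - 1.0) ** 2)
```

Normalizing by pγ is what makes the metric meaningful for rare events: an absolute error of 10⁻⁶ is negligible when pγ = 0.1 but catastrophic when pγ = 10⁻⁶.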

Related work

- Safety evaluation. Several communities [25] have attempted to evaluate the closed-loop performance of cyber-physical, robotic, and embodied agents, both with and without learning-based components. Existing solutions are predicated on the definition of the evaluation problem: verification, falsification, or estimation. In this paper we consider a method that utilizes interactions with a gradient oracle in order to solve the estimation problem (1). In contrast to our approach, the verification community has developed tools (e.g. [55, 22, 3]) to investigate whether any adverse or unsafe executions of the system exist. Such methods can certify that failures are impossible, but they require that the model be written in a formal language (a barrier for realistic systems) and they require white-box access to this formal model. Falsification approaches (e.g. [38, 29, 4, 104, 32, 79]) attempt to find any failure cases for the system (but not the overall probability of failure). Similar to our approach, some falsification approaches (e.g. [1, 103]) utilize gradient information, but their goal is simply to minimize f(x) rather than solve problem (1). Adversarial machine learning is closely related to falsification; the key difference is the domain over which the search for falsifying evidence is conducted. Adversarial examples (e.g. [59, 91, 52, 95]) are typically restricted to an ℓp-norm ball around a point from a dataset, whereas falsification considers all possible in-distribution examples. Both verification and falsification methods provide less information about the system under test than estimation-based methods: they return only whether or not the system satisfies a specification. When the system operates in an unstructured environment (e.g. driving in an urban setting), the mere existence of failures is trivial to demonstrate [89]. Several authors (e.g. [74, 100]) have proposed that it is more important in such settings to understand the overall frequency of failures as well as the relative likelihoods of different failure modes, motivating our approach.

References

- H. Abbas, A. Winn, G. Fainekos, and A. A. Julius. Functional gradient descent method for metric temporal logic specifications. In 2014 American Control Conference, pages 2312–2317. IEEE, 2014.
- N. Akhtar and A. Mian. Threat of adversarial attacks on deep learning in computer vision: A survey. IEEE Access, 6:14410–14430, 2018.
- M. Althoff. An introduction to cora 2015. In Proc. of the Workshop on Applied Verification for Continuous and Hybrid Systems, 2015.
- Y. Annpureddy, C. Liu, G. Fainekos, and S. Sankaranarayanan. S-taliro: A tool for temporal logic falsification for hybrid systems. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 254–257.
- K. J. Åström and P. Eykhoff. System identification—a survey. Automatica, 7(2):123–162, 1971.
- M. Bansal, A. Krizhevsky, and A. Ogale. Chauffeurnet: Learning to drive by imitating the best and synthesizing the worst. arXiv preprint arXiv:1812.03079, 2018.
- N. Baram, O. Anschel, I. Caspi, and S. Mannor. End-to-end differentiable adversarial imitation learning. In International Conference on Machine Learning, pages 390–399, 2017.
- Y. Benkler. Don’t let industry write the rules for ai. Nature, 569(7754):161–162, 2019.
- C. H. Bennett. Efficient estimation of free energy differences from monte carlo data. Journal of Computational Physics, 22(2):245–268, 1976.
- M. Betancourt. A conceptual introduction to hamiltonian monte carlo. arXiv preprint arXiv:1701.02434, 2017.
- C. M. Bishop. Mixture density networks. Technical report, Citeseer, 1994.
- L. Blackmore. Autonomous precision landing of space rockets. In Frontiers of Engineering: Reports on Leading-Edge Engineering from the 2016 Symposium. National Academies Press, 2017.
- F. Borrelli, A. Bemporad, and M. Morari. Predictive control for linear and hybrid systems. Cambridge University Press, 2017.
- J. Bowen and V. Stavridou. Safety-critical systems, formal methods and standards. Software Engineering Journal, 8(4):189–209, 1993.
- C.-E. Brehier, T. Lelievre, and M. Rousset. Analysis of adaptive multilevel splitting algorithms in an idealized case. ESAIM: Probability and Statistics, 19:361–394, 2015.
- G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba. Openai gym. arXiv preprint arXiv:1606.01540, 2016.
- M. Brundage, S. Avin, J. Wang, H. Belfield, G. Krueger, G. Hadfield, H. Khlaaf, J. Yang, H. Toner, R. Fong, et al. Toward trustworthy ai development: Mechanisms for supporting verifiable claims. arXiv preprint arXiv:2004.07213, 2020.
- J. Bucklew. Introduction to rare event simulation. Springer Science & Business Media, 2013.
- F. Cerou and A. Guyader. Adaptive multilevel splitting for rare event analysis. Stochastic Analysis and Applications, 25(2):417–443, 2007.
- V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection: A survey. ACM computing surveys (CSUR), 41(3):1–58, 2009.
- M.-H. Chen, Q.-M. Shao, and J. G. Ibrahim. Monte Carlo methods in Bayesian computation. Springer Science & Business Media, 2012.
- X. Chen, E. Abraham, and S. Sankaranarayanan. Flow*: An analyzer for non-linear hybrid systems. In Computer Aided Verification, pages 258–263.
- Y. Chen, R. Dwivedi, M. J. Wainwright, and B. Yu. Fast mixing of metropolized hamiltonian monte carlo: Benefits of multi-step gradients. arXiv preprint arXiv:1905.12247, 2019.
- H. Choi, E. Jang, and A. A. Alemi. Waic, but why? generative ensembles for robust anomaly detection. arXiv preprint arXiv:1810.01392, 2018.
- A. Corso, R. J. Moss, M. Koren, R. Lee, and M. J. Kochenderfer. A survey of algorithms for black-box safety validation. arXiv preprint arXiv:2005.02979, 2020.
- M. Deisenroth and C. E. Rasmussen. Pilco: A model-based and data-efficient approach to policy search. In Proceedings of the 28th International Conference on machine learning (ICML-11), pages 465–472, 2011.
- P. Del Moral. Feynman-kac formulae. In Feynman-Kac Formulae, pages 47–93.
- P. Del Moral, A. Doucet, and A. Jasra. Sequential monte carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3):411–436, 2006.
- [30] A. Doucet, N. De Freitas, and N. Gordon. An introduction to sequential monte carlo methods. In Sequential Monte Carlo methods in practice, pages 3–14.
- [31] J. C. Doyle, B. A. Francis, and A. R. Tannenbaum. Feedback control theory. Courier Corporation, 2013.
- [32] T. Dreossi, D. J. Fremont, S. Ghosh, E. Kim, H. Ravanbakhsh, M. Vazquez-Chanlatte, and S. A. Seshia. Verifai: A toolkit for the formal design and analysis of artificial intelligencebased systems. In International Conference on Computer Aided Verification, pages 432–442.
- [33] S. Duane, A. D. Kennedy, B. J. Pendleton, and D. Roweth. Hybrid monte carlo. Physics letters B, 195(2):216–222, 1987.
- [34] J. C. Duchi, M. I. Jordan, M. J. Wainwright, and A. Wibisono. Optimal rates for zeroorder convex optimization: The power of two function evaluations. IEEE Transactions on Information Theory, 61(5):2788–2806, 2015.
- [35] A. Durmus, E. Moulines, et al. Nonasymptotic convergence analysis for the unadjusted langevin algorithm. The Annals of Applied Probability, 27(3):1551–1587, 2017.
- [37] P. M. Esfahani and D. Kuhn. Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. arXiv:1505.05116 [math.OC], 2015.
- [38] J. M. Esposito, J. Kim, and V. Kumar. Adaptive rrts for validating hybrid robotic control systems. In Algorithmic Foundations of Robotics VI, pages 107–121.
- [39] A. Gelman and X.-L. Meng. Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Statistical science, pages 163–185, 1998.
- [40] M. Girolami and B. Calderhead. Riemann manifold langevin and hamiltonian monte carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(2): 123–214, 2011.
- [41] I. Goodfellow, Y. Bengio, and A. Courville. Deep learning. MIT press, 2016.
- [42] D. Ha and J. Schmidhuber. World models. arXiv preprint arXiv:1803.10122, 2018.
- [43] J. M. Hammersley and D. C. Handscomb. Monte carlo methods. 1964.
- [45] T. C. Hesterberg and B. L. Nelson. Control variates for probability and quantile estimation. Management Science, 44(9):1295–1312, 1998.
- [46] M. Hoffman, P. Sountsov, J. V. Dillon, I. Langmore, D. Tran, and S. Vasudevan. Neutralizing bad geometry in hamiltonian monte carlo using neural transport. arXiv preprint arXiv:1903.03704, 2019.
- [47] R. Ivanov, J. Weimer, R. Alur, G. J. Pappas, and I. Lee. Verisig: verifying safety properties of hybrid systems with neural network controllers. In Proceedings of the 22nd ACM International Conference on Hybrid Systems: Computation and Control, pages 169–178. ACM, 2019.
- [48] R. Ivanov, T. J. Carpenter, J. Weimer, R. Alur, G. J. Pappas, and I. Lee. Case study: verifying the safety of an autonomous racing car with a neural network controller. In Proceedings of the 23rd International Conference on Hybrid Systems: Computation and Control, pages 1–7, 2020.
- [49] H. Kahn and T. Harris. Estimation of particle transmission by random sampling. 1951.
- [50] N. Kalra. Challenges and Approaches to Realizing Autonomous Vehicle Safety. RAND, 2017.
- A. Karpathy. Software 2.0. Medium.com, 2017.
- [52] G. Katz, C. Barrett, D. Dill, K. Julian, and M. Kochenderfer. Reluplex: An efficient smt solver for verifying deep neural networks. arXiv:1702.01135 [cs.AI], 1:1, 2017.
- [53] D. P. Kingma, T. Salimans, R. Jozefowicz, X. Chen, I. Sutskever, and M. Welling. Improved variational inference with inverse autoregressive flow. In Advances in neural information processing systems, pages 4743–4751, 2016.
- [54] O. Klimov. Carracing-v0. 2016. URL https://gym.openai.com/envs/CarRacing-v0.
- [55] S. Kong, S. Gao, W. Chen, and E. Clarke. dreach: δ-reachability analysis for hybrid systems. In International Conference on TOOLS and Algorithms for the Construction and Analysis of Systems, pages 200–205.
- [56] B. Leimkuhler and S. Reich. Simulating Hamiltonian Dynamics, volume 14. Cambridge University Press, 2004.
- [57] K. Leung, N. Arechiga, and M. Pavone. Back-propagation through stl specifications: Infusing logical structure into gradient-based methods.
- [58] W. Luo, B. Yang, and R. Urtasun. Fast and furious: Real time end-to-end 3d detection, tracking and motion forecasting with a single convolutional net. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pages 3569–3577, 2018.
- [59] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
- [60] O. Mangoubi and A. Smith. Rapid mixing of hamiltonian monte carlo on strongly log-concave distributions. arXiv preprint arXiv:1708.07114, 2017.
- [61] A. W. Marshall. The use of multi-stage sampling schemes in monte carlo computations. Technical report, RAND CORP SANTA MONICA CALIF, 1954.
- [62] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pages 1273–1282, 2017.
- [63] X.-L. Meng and S. Schilling. Warp bridge sampling. Journal of Computational and Graphical Statistics, 11(3):552–586, 2002.
- [64] X.-L. Meng and W. H. Wong. Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica, pages 831–860, 1996.
- [65] A. W. Moore. Efficient memory-based learning for robot control. Technical report, University of Cambridge, Computer Laboratory, 1990.
- [66] B. Nachman and D. Shih. Anomaly detection with density estimation. Physical Review D, 101(7):075042, 2020.
- S. Nakamoto. Bitcoin: A peer-to-peer electronic cash system. URL: https://bitcoin.org/bitcoin.pdf, 2008.
- [68] H. Namkoong and J. C. Duchi. Stochastic gradient methods for distributionally robust optimization with f-divergences. In Advances in neural information processing systems, pages 2208–2216, 2016.
- [69] R. M. Neal. Annealed importance sampling. Statistics and computing, 11(2):125–139, 2001.
- [70] R. M. Neal. Estimating ratios of normalizing constants using linked importance sampling. arXiv preprint math/0511216, 2005.
- [71] R. M. Neal. Mcmc using hamiltonian dynamics. arXiv preprint arXiv:1206.1901, 2012.
- [72] J. Norden, M. O’Kelly, and A. Sinha. Efficient black-box assessment of autonomous vehicle safety. arXiv preprint arXiv:1912.03618, 2019.
- [73] M.-S. Oh and J. O. Berger. Adaptive importance sampling in monte carlo integration. Journal of Statistical Computation and Simulation, 41(3-4):143–168, 1992.
- [74] M. O’Kelly, A. Sinha, H. Namkoong, R. Tedrake, and J. C. Duchi. Scalable end-to-end autonomous vehicle testing via rare-event simulation. In Advances in Neural Information Processing Systems, pages 9827–9838, 2018.
- [75] M. O’Kelly, A. Sinha, J. Norden, and H. Namkoong. In-silico risk analysis of personalized artificial pancreas controllers via rare-event simulation. arXiv preprint arXiv:1812.00293, 2018.
- [76] Y. V. Pant, H. Abbas, and R. Mangharam. Smooth operator: Control using the smooth robustness of temporal logic. In 2017 IEEE Conference on Control Technology and Applications (CCTA), pages 1235–1240. IEEE, 2017.
- [77] G. Papamakarios, T. Pavlakou, and I. Murray. Masked autoregressive flow for density estimation. In Advances in Neural Information Processing Systems, pages 2338–2347, 2017.
- [78] G. Papamakarios, E. Nalisnick, D. J. Rezende, S. Mohamed, and B. Lakshminarayanan. Normalizing flows for probabilistic modeling and inference. arXiv preprint arXiv:1912.02762, 2019.
- [79] X. Qin, N. Arechiga, A. Best, and J. Deshmukh. Automatic testing and falsification with dynamically constrained reinforcement learning. arXiv preprint arXiv:1910.13645, 2019.
- [80] H. Rahimian and S. Mehrotra. Distributionally robust optimization: A review. arXiv preprint arXiv:1908.05659, 2019.
- [81] C. Re. Software 2.0 and snorkel: beyond hand-labeled data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2876–2876, 2018.
- [82] D. J. Rezende and S. Mohamed. Variational inference with normalizing flows. In Proceedings of the 32nd International Conference on International Conference on Machine LearningVolume 37, pages 1530–1538. JMLR. org, 2015.
- [83] J. Ridderhof and P. Tsiotras. Minimum-fuel powered descent in the presence of random disturbances. In AIAA Scitech 2019 Forum, page 0646, 2019.
- [84] G. O. Roberts and O. Stramer. Langevin diffusions and metropolis-hastings algorithms. Methodology and computing in applied probability, 4(4):337–357, 2002.
- [85] P. J. Rossky, J. Doll, and H. Friedman. Brownian dynamics as smart monte carlo simulation. The Journal of Chemical Physics, 69(10):4628–4633, 1978.
- [86] R. Y. Rubinstein and D. P. Kroese. The cross-entropy method: A unified approach to Monte Carlo simulation, randomized optimization and machine learning. Information Science & Statistics, Springer Verlag, NY, 2004.
- [87] R. Y. Rubinstein and R. Marcus. Efficiency of multivariate control variates in monte carlo simulation. Operations Research, 33(3):661–677, 1985.
- [88] B. Shahbaba, S. Lan, W. O. Johnson, and R. M. Neal. Split hamiltonian monte carlo. Statistics and Computing, 24(3):339–349, 2014.
- [89] S. Shalev-Shwartz, S. Shammah, and A. Shashua. On a formal model of safe and scalable self-driving cars. arXiv preprint arXiv:1708.06374, 2017.
- [90] D. Siegmund. Importance sampling in the monte carlo study of sequential tests. The Annals of Statistics, pages 673–684, 1976.
- [91] A. Sinha, H. Namkoong, and J. Duchi. Certifying some distributional robustness with principled adversarial training. arXiv preprint arXiv:1710.10571, 2017.
- [92] R. Sparrow. Killer robots. Journal of applied philosophy, 24(1):62–77, 2007.
- [93] R. Sparrow and M. Howard. When human beings are like drunk robots: Driverless vehicles, ethics, and the future of transport. Transportation Research Part C: Emerging Technologies, 80:206–215, 2017.
- [94] Y. Tang, D. Nguyen, and D. Ha. Neuroevolution of self-interpretable agents. arXiv preprint arXiv:2003.08165, 2020.
- [95] V. Tjeng and R. Tedrake. Verifying neural networks with mixed integer programming. arXiv:1711.07356 [cs.LG], 2017.
- [96] W. F. Tosney and P. G. Cheng. Space safety is no accident: How The Aerospace Corporation promotes space safety. In Space Safety is No Accident, pages 101–108.
- [97] J. Uesato, A. Kumar, C. Szepesvari, T. Erez, A. Ruderman, K. Anderson, N. Heess, P. Kohli, et al. Rigorous agent evaluation: An adversarial approach to uncover catastrophic failures. arXiv preprint arXiv:1812.01647, 2018.
- [98] A. F. Voter. A monte carlo method for determining free-energy differences and transition state theory rate constants. The Journal of chemical physics, 82(4):1890–1899, 1985.
- [99] D. Wang, C. Devin, Q.-Z. Cai, P. Krahenbuhl, and T. Darrell. Monocular plan view networks for autonomous driving. arXiv preprint arXiv:1905.06937, 2019.
- [100] S. Webb, T. Rainforth, Y. W. Teh, and M. P. Kumar. A statistical approach to assessing neural network robustness. 2018.
- [101] R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4):229–256, 1992.
- [102] J. M. Wing. Trustworthy ai. arXiv preprint arXiv:2002.06276, 2020.
- [103] S. Yaghoubi and G. Fainekos. Falsification of temporal logic requirements using gradient based local search in space and time. IFAC-PapersOnLine, 51(16):103–108, 2018.
- [104] A. Zutshi, J. V. Deshmukh, S. Sankaranarayanan, and J. Kapinski. Multiple shooting, CEGAR-based falsification for hybrid systems. In Proceedings of the 14th International Conference on Embedded Software, pages 1–10, 2014.

In this experiment we explore the effect of domain shift on a formally verified neural network. We utilize the neural network designed by Ivanov et al. [47]; it contains two hidden layers of 16 neurons each, for a total of 337 parameters. For our experiments we use the trained network parameters available at: https://github.com/Verisig/verisig. Ivanov et al. [47] describe a layer-by-layer approach to verification which over-approximates the reachable set of the combined dynamics of the environment and the neural network. An encoding of this system (network and environment) is developed for the tool Flow* [22], which constructs the (over-approximate) reachable set via a Taylor approximation of the combined dynamics.
