s semantics is computable, meaning that values can be computed to arbitrary precision, and we implement $\\lambda_S$ as an embedded language in Haskell. We use $\\lambda_S$ to construct novel differentiable libraries for representing probability distributions, implicit surfaces, and generalized parametric surfaces -- all as instances of higher-order datatypes -- and present case studies that rely on computing the derivatives of these higher-order functions and datatypes. In addition to modeling existing differentiable algorithms, such as a differentiable ray tracer for implicit surfaces, without requiring any user-level differentiation code, we demonstrate new differentiable algorithms, such as the Hausdorff distance of generalized parametric surfaces. ","authors":[{"name":"Benjamin Sherman"},{"name":"Jesse Michel"},{"id":"53f42b2fdabfaec09f0f132d","name":"Michael Carbin"}],"flags":[{"flag":"affirm_author","person_id":"53f42b2fdabfaec09f0f132d"}],"id":"5f116dc591e011264d4475c5","num_citation":0,"order":2,"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F20\u002F2007\u002F2007.08017.pdf","title":"$\\lambda_S$: Computable semantics for differentiable programming with higher-order functions and datatypes","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2007.08017"],"versions":[{"id":"5f116dc591e011264d4475c5","sid":"2007.08017","src":"arxiv","year":2020}],"year":2020},{"abstract":"Computer programs are increasingly being deployed in partially-observable environments. A partially observable environment is an environment whose state is not completely visible to the program, but from which the program receives partial observations. Developers typically deal with partial observability by writing a state estimator that, given observations, attempts to deduce the hidden state of the environment. In safety-critical domains, to formally verify safety properties developers may write an environment model. The model captures the relationship between observations and hidden states and is used to prove the software correct. \n\nIn this paper, we present a new methodology for writing and verifying programs in partially observable environments. We present belief programming, a programming methodology where developers write an environment model that the program runtime automatically uses to perform state estimation. A belief program dynamically updates and queries a belief state that captures the possible states the environment could be in. To enable verification, we present Epistemic Hoare Logic that reasons about the possible belief states of a belief program the same way that classical Hoare logic reasons about the possible states of a program. We develop these concepts by defining a semantics and a program logic for a simple core language called BLIMP. In a case study, we show how belief programming could be used to write and verify a controller for the Mars Polar Lander in BLIMP. We present an implementation of BLIMP called CBLIMP and evaluate it to determine the feasibility of belief programming.\n\n","authors":[{"name":"Eric Atkinson"},{"id":"53f42b2fdabfaec09f0f132d","name":"Michael Carbin"}],"doi":"10.1145\u002F3428268","id":"5fb0fe2ed4150a363c0a73f9","num_citation":0,"order":1,"pages":{"end":"28","start":"1"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Facm\u002F20\u002Fpacmpl\u002F10.1145\u002F3428268.pdf","title":"Programming and reasoning with partial observability","urls":["https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3428268","db\u002Fjournals\u002Fpacmpl\u002Fpacmpl4.html#AtkinsonC20","https:\u002F\u002Fdoi.org\u002F10.1145\u002F3428268","https:\u002F\u002Farxiv.org\u002Fabs\u002F2101.04742","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fjournals\u002Fpacmpl\u002Fpacmpl4.html#AtkinsonC20","https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3428268","https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3428268"],"venue":{"info":{"name":"Proceedings of the ACM on Programming Languages"},"issue":"OOPSLA","volume":"4"},"versions":[{"id":"5fb0fe2ed4150a363c0a73f9","sid":"10.1145\u002F3428268","src":"acm","vsid":"pacmpl","year":2020},{"id":"5ff7017ad4150a363c3b28e0","sid":"journals\u002Fpacmpl\u002FAtkinsonC20","src":"dblp","vsid":"journals\u002Fpacmpl","year":2020},{"id":"60000ca091e011b170d7be75","sid":"2101.04742","src":"arxiv","year":2021},{"id":"603769f6d3485cfff1df589e","sid":"3124211907","src":"mag","vsid":"0#dblp#journals\u002Fpacmpl","year":2020}],"year":2020},{"abstract":" In this paper, we demonstrate a compiler that can optimize sparse and recurrent neural networks, both of which are currently outside of the scope of existing neural network compilers (sparse neural networks here stand for networks that can be accelerated with sparse tensor algebra techniques). Our demonstration includes a mapping of sparse and recurrent neural networks to the polyhedral model along with an implementation of our approach in TIRAMISU, our state-of-the-art polyhedral compiler. We evaluate our approach on a set of deep learning benchmarks and compare our results with hand-optimized industrial libraries. Our results show that our approach at least matches Intel MKL-DNN and in some cases outperforms it by 5x (on multicore-CPUs). ","authors":[{"id":"53f4306edabfaedf4353d05f","name":"Baghdadi Riyadh"},{"name":"Debbagh Abdelkader Nadir"},{"name":"Abdous Kamel"},{"name":"Benhamida Fatima Zohra"},{"id":"61819b328672f199e5d95577","name":"Renda Alex"},{"name":"Frankle Jonathan Elliott"},{"id":"53f42b2fdabfaec09f0f132d","name":"Carbin Michael"},{"id":"53f439d0dabfaeecd697e422","name":"Amarasinghe Saman"}],"id":"5eb9222f91e0118cfef9826c","num_citation":0,"order":6,"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F20\u002F2005\u002F2005.04091.pdf","title":"TIRAMISU: A Polyhedral Compiler for Dense and Sparse Deep Learning","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2005.04091"],"versions":[{"id":"5eb9222f91e0118cfef9826c","sid":"2005.04091","src":"arxiv","year":2020}],"year":2020},{"abstract":"Neural network pruning is a popular technique for reducing inference costs by removing connections, neurons, or other structure from the network. In the literature, pruning typically follows a standard procedure: train the network, remove unwanted structure (pruning), and train the resulting network further to recover accuracy (fine-tuning). In this paper, we explore an alternative to fine-tuning: rewinding. Rather than continuing to train the resultant pruned network (fine-tuning), rewind the remaining weights to their values from earlier in training, and re-train the resultant network for the remainder of the original training process. We find that this procedure, which repurposes the strategy for finding lottery tickets presented by Frankle et al. (2019), makes it possible to prune networks further than is possible with fine-tuning for a given target accuracy, provided that the weights are rewound to a suitable point in training. We also find that there are wide ranges of suitable rewind points that achieve higher accuracy than fine-tuning across all tested networks. Based on these results, we argue that practitioners should explore rewinding as an alternative to fine-tuning for neural network pruning.","authors":[{"id":"61819b328672f199e5d95577","name":"Alex Renda"},{"name":"Jonathan Frankle"},{"id":"53f42b2fdabfaec09f0f132d","name":"Michael Carbin"}],"doi":"","id":"5e5e18c793d709897ce302ef","num_citation":0,"order":2,"pages":{"end":"","start":""},"title":"Comparing Fine-tuning and Rewinding in Neural Network Pruning","urls":["https:\u002F\u002Fopenreview.net\u002Fforum?id=S1gSj0NKvB","https:\u002F\u002Fopenreview.net\u002Fpdf?id=S1gSj0NKvB"],"venue":{"info":{"name":"international conference on learning representations"},"issue":"","volume":""},"versions":[{"id":"5e5e18c793d709897ce302ef","sid":"2995816250","src":"mag","vsid":"2584161585","year":2020},{"id":"5ecbc4ea9fced0a24b4e0975","sid":"3015634187","src":"mag","vsid":"2584161585","year":2020}],"year":2020},{"abstract":" We introduce \"instability analysis,\" a framework for assessing whether the outcome of optimizing a neural network is robust to SGD noise. It entails training two copies of a network on different random data orders. If error does not increase along the linear path between the trained parameters, we say the network is \"stable.\" Instability analysis reveals new properties of neural networks. For example, standard vision models are initially unstable but become stable early in training; from then on, the outcome of optimization is determined up to linear interpolation. We leverage instability analysis to examine iterative magnitude pruning (IMP), the procedure underlying the lottery ticket hypothesis. On small vision tasks, IMP finds sparse \"matching subnetworks\" that can train in isolation from initialization to full accuracy, but it fails to do so in more challenging settings. We find that IMP subnetworks are matching only when they are stable. In cases where IMP subnetworks are unstable at initialization, they become stable and matching early in training. We augment IMP to rewind subnetworks to their weights early in training, producing sparse subnetworks of large-scale networks, including Resnet-50 for ImageNet, that train to full accuracy. This submission subsumes 1903.01611 (\"Stabilizing the Lottery Ticket Hypothesis\" and \"The Lottery Ticket Hypothesis at Scale\"). ","authors":[{"name":"Frankle Jonathan"},{"id":"604b383101caee8440f2d028","name":"Dziugaite Gintare Karolina"},{"id":"53f431d4dabfaee43ebfe751","name":"Roy Daniel M."},{"id":"53f42b2fdabfaec09f0f132d","name":"Carbin Michael"}],"id":"5df371db3a55acfd20674a25","num_citation":75,"order":3,"pages":{"end":"3269","start":"3259"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F19\u002F1912\u002F1912.05671.pdf","title":"Linear Mode Connectivity and the Lottery Ticket Hypothesis","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.05671","https:\u002F\u002Ficml.cc\u002FConferences\u002F2020\u002FAcceptedPapersInitial","https:\u002F\u002Ficml.cc\u002FConferences\u002F2020\u002FAcceptedPapersInitial#934","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fjournals\u002Fcorr\u002Fcorr1912.html#abs-1912-05671","https:\u002F\u002Fwww.arxiv-vanity.com\u002Fpapers\u002F1912.05671\u002F","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Ficml\u002FFrankleD0C20","http:\u002F\u002Fproceedings.mlr.press\u002Fv119\u002Ffrankle20a.html"],"venue":{"info":{"name":"ICML"}},"versions":[{"id":"5df371db3a55acfd20674a25","sid":"1912.05671","src":"arxiv","year":2019},{"id":"5ede0553e06a4c1b26a841ae","sid":"icml2020#934","src":"conf_icml","year":2020},{"id":"5fae6dadd4150a363cec0ab6","sid":"3035081900","src":"mag","vsid":"1180662882","year":2020},{"id":"5ff881dd91e011c83266e49d","sid":"conf\u002Ficml\u002FFrankleD0C20","src":"dblp","vsid":"conf\u002Ficml","year":2020}],"year":2020},{"abstract":"Synchronous modeling is at the heart of programming languages like Lustre, Esterel, or Scade used routinely for implementing safety critical control software, e.g., fly-by-wire and engine control in planes. However, to date these languages have had limited modern support for modeling uncertainty --- probabilistic aspects of the software's environment or behavior --- even though modeling uncertainty is a primary activity when designing a control system. \n\n In this paper we present ProbZelus the first synchronous probabilistic programming language. ProbZelus conservatively provides the facilities of a synchronous language to write control software, with probabilistic constructs to model uncertainties and perform inference-in-the-loop. \n\n We present the design and implementation of the language. We propose a measure-theoretic semantics of probabilistic stream functions and a simple type discipline to separate deterministic and probabilistic expressions. We demonstrate a semantics-preserving compilation into a first-order functional language that lends itself to a simple presentation of inference algorithms for streaming models. We also redesign the delayed sampling inference algorithm to provide efficient streaming inference. Together with an evaluation on several reactive applications, our results demonstrate that ProbZelus enables the design of reactive probabilistic applications and efficient, bounded memory inference.\n\n","authors":[{"id":"53f43a81dabfaee02acf1f27","name":"Baudart Guillaume"},{"name":"Mandel Louis"},{"name":"Atkinson Eric"},{"name":"Sherman Benjamin"},{"id":"53f446cedabfaec09f1cac31","name":"Pouzet Marc"},{"id":"53f42b2fdabfaec09f0f132d","name":"Carbin Michael"}],"doi":"10.1145\u002F3385412.3386009","id":"5d5e6b943a55acfce79a1579","num_citation":2,"order":5,"pages":{"end":"912","start":"898"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fstorage\u002Fpdf\u002Farxiv\u002F19\u002F1908\u002F1908.07563.pdf","title":"Reactive Probabilistic Programming","urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F1908.07563","https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3385412.3386009","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fpldi\u002FBaudartMASPC20","https:\u002F\u002Fdoi.org\u002F10.1145\u002F3385412.3386009","http:\u002F\u002Fwww.webofknowledge.com\u002F"],"venue":{"info":{"name":"PLDI '20: 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation \n\t\t London\n\t\t UK\n\t\t June, 2020"}},"versions":[{"id":"5d5e6b943a55acfce79a1579","sid":"1908.07563","src":"arxiv","year":2019},{"id":"5ede11d39fced0a24bb2b50c","sid":"10.1145\u002F3385412.3386009","src":"acm","vsid":"pldi","year":2020},{"id":"5ee2062891e0116d10fcd30a","sid":"conf\u002Fpldi\u002FBaudartMASPC20","src":"dblp","vsid":"conf\u002Fpldi","year":2020},{"id":"619b9c9a1c45e57ce92c3c56","sid":"WOS:000614622300060","src":"wos","vsid":"PROCEEDINGS OF THE 41ST ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '20)","year":2020}],"year":2020},{"abstract":"Modern probabilistic programming languages aim to formalize and automate key aspects of probabilistic modeling and inference. Many languages provide constructs for programmable inference that enable developers to improve inference speed and accuracy by tailoring an algorithm for use with a particular model or dataset. Unfortunately, it is easy to use these constructs to write unsound programs that appear to run correctly but produce incorrect results. To address this problem, we present a denotational semantics for programmable inference in higher-order probabilistic programming languages, along with a type system that ensures that well-typed inference programs are sound by construction. A central insight is that the type of a probabilistic expression can track the space of its possible execution traces, not just the type of value that it returns, as these traces are often the objects that inference algorithms manipulate. We use our semantics and type system to establish soundness properties of custom inference programs that use constructs for variational, sequential Monte Carlo, importance sampling, and Markov chain Monte Carlo inference.\n\n","authors":[{"id":"617a679a60a9657d1c3f42f5","name":"Alexander K. Lew"},{"id":"6177d18a60a9653c6b9cbbb3","name":"Marco F. Cusumano-Towner"},{"name":"Benjamin Sherman"},{"id":"53f42b2fdabfaec09f0f132d","name":"Michael Carbin"},{"name":"Vikash K. Mansinghka"}],"doi":"10.1145\u002F3371087","id":"5dfdf62cdf1a9c0c4165926b","num_citation":0,"order":3,"pages":{"end":"32","start":"19"},"pdf":"https:\u002F\u002Fcz5waila03cyo0tux1owpyofgoryrooa.oss-cn-beijing.aliyuncs.com\u002F30\u002F46\u002FBF\u002F3046BF00D0B0584CD7FA4A7A691B9885.pdf","title":"Trace types and denotational semantics for sound programmable inference in probabilistic languages","urls":["https:\u002F\u002Fdl.acm.org\u002Fcitation.cfm?id=3371087&picked=prox&preflayout=flat","db\u002Fjournals\u002Fpacmpl\u002Fpacmpl4.html#LewCSCM20","https:\u002F\u002Fdoi.org\u002F10.1145\u002F3371087","https:\u002F\u002Fpopl20.sigplan.org\u002Fdetails\u002FPOPL-2020-Research-Papers\u002F34\u002FTrace-Types-and-Denotational-Semantics-for-Sound-Programmable-Inference-in-Probabilis"],"venue":{"info":{"name":"Proceedings of the ACM on Programming Languages"},"issue":"POPL","volume":"4"},"versions":[{"id":"5dfdf62cdf1a9c0c4165926b","sid":"3371087","src":"acm","vsid":"J1568","year":2019},{"id":"5e17015ddf1a9c0c41714217","sid":"journals\u002Fpacmpl\u002FLewCSCM20","src":"dblp","vsid":"journals\u002Fpacmpl","year":2020},{"id":"5e5e189d93d709897ce1fb90","sid":"2994785018","src":"mag","vsid":"1160032607","year":2020}],"year":2020},{"abstract":"Compilers and performance engineers use hardware performance models to simplify program optimizations. Performance models provide a necessary abstraction over complex modern processors. However, constructing and maintaining a performance model can be onerous, given the numerous microarchi-tectural optimizations employed by modern processors. Despite their complexity and reported inaccuracy (e.g., deviating from native measurement by more than 30%), existing performance models-such as IACA and llvm-mca-have not been systematically validated, because there is no scalable machine code profiler that can automatically obtain throughput of arbitrary basic blocks while conforming to common modeling assumptions. In this paper, we present a novel profiler that can profile arbitrary memory-accessing basic blocks without any user intervention. We used this profiler to build BHive, a benchmark for systematic validation of performance models of x86-64 basic blocks. We used BHive to evaluate four existing performance models: IACA, llvm-mca, Ithemal, and OSACA. We automatically cluster basic blocks in the benchmark suite based on their utilization of CPU resources. Using this clustering, our benchmark can give a detailed analysis of a performance modelu0027s strengths and weaknesses on different workloads (e.g., vectorized vs. scalar basic blocks). We additionally demonstrate that our dataset well captures basic properties of two Google applications: Spanner and Dremel.","authors":[{"name":"Yishen Chen"},{"name":"Ajay Brahmakshatriya"},{"name":"Charith Mendis"},{"id":"61819b328672f199e5d95577","name":"Alex Renda"},{"name":"Eric Atkinson"},{"name":"Ondrej Sýkora"},{"id":"53f439d0dabfaeecd697e422","name":"Saman P. Amarasinghe"},{"id":"53f42b2fdabfaec09f0f132d","name":"Michael Carbin"}],"doi":"10.1109\u002FIISWC47752.2019.9042166","flags":[{"flag":"affirm_author","person_id":"53f42b2fdabfaec09f0f132d"}],"id":"5e7b2bf391e011a9394ab45c","num_citation":0,"order":7,"pages":{"end":"177","start":"167"},"title":"BHive - A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models.","urls":["https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fiiswc\u002FChenBMRASAC19","https:\u002F\u002Fdoi.org\u002F10.1109\u002FIISWC47752.2019.9042166","http:\u002F\u002Fxplorestaging.ieee.org\u002Fielx7\u002F9027808\u002F9041927\u002F09042166.pdf?arnumber=9042166","http:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fconf\u002Fiiswc\u002Fiiswc2019.html#ChenBMRASAC19","http:\u002F\u002Fieeexplore.ieee.org\u002Fxpls\u002Fabs_all.jsp?arnumber=9042166"],"venue":{"info":{"name":"IISWC"},"issue":"","volume":""},"versions":[{"id":"5e7b2bf391e011a9394ab45c","sid":"conf\u002Fiiswc\u002FChenBMRASAC19","src":"dblp","vsid":"conf\u002Fiiswc","year":2019},{"id":"5ea013109fced0a24b9ce5a8","sid":"3011000655","src":"mag","vsid":"1203869711","year":2019}],"year":2019}],"profilePubsTotal":67,"profilePatentsPage":1,"profilePatents":[],"profilePatentsTotal":2,"profilePatentsEnd":true,"profileProjectsPage":0,"profileProjects":null,"profileProjectsTotal":null,"newInfo":null,"checkDelPubs":[]}};