Machine Learning Operations
Machine learning operations (MLOps) is the use of machine learning models by development/operations (DevOps) teams. MLOps seeks to add discipline to the development and deployment of machine learning models by defining processes to make ML development more reliable and productive.
Adopting principles from behavioral testing in software engineering, we propose CheckList, a model-agnostic and task-agnostic testing methodology that tests individual capabilities of the model using three different test types
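The three test types in CheckList are the Minimum Functionality Test (MFT), the Invariance test (INV), and the Directional Expectation test (DIR). The following is a minimal sketch of the idea in plain Python, not the CheckList library's API: the keyword-based `toy_sentiment` model, the word lists, and all function names are hypothetical and exist only to illustrate how each test type treats the model as a black box.

```python
# Hypothetical stand-in for any black-box sentiment model (an assumption,
# not part of CheckList): score = positive words minus negative words.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "terrible", "hate"}


def toy_sentiment(text: str) -> int:
    words = text.lower().replace(".", "").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def minimum_functionality_test(model) -> bool:
    # MFT: simple, targeted cases with known expected labels.
    cases = [("I love this flight.", 1), ("I hate this flight.", -1)]
    return all((model(text) > 0) == (label > 0) for text, label in cases)


def invariance_test(model) -> bool:
    # INV: a label-preserving perturbation (swapping a city name)
    # should leave the prediction unchanged.
    pairs = [("The food in Chicago was great.", "The food in Dallas was great.")]
    return all(model(a) == model(b) for a, b in pairs)


def directional_test(model) -> bool:
    # DIR: appending a negative phrase should not raise the score.
    base = "The service was good."
    perturbed = base + " The seat was terrible."
    return model(perturbed) <= model(base)


results = {
    "MFT": minimum_functionality_test(toy_sentiment),
    "INV": invariance_test(toy_sentiment),
    "DIR": directional_test(toy_sentiment),
}
print(results)
```

Because each test only calls `model(text)`, the same checks could in principle be pointed at any other model that maps text to a score, which is the sense in which the approach is model-agnostic.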
Due to the lack of a process model for machine learning applications, many project organizations rely on alternative models that are closely related to ML, such as the Cross-Industry Standard Process for Data Mining
We argue that in order to unlock the gains Machine Learning can bring, organizations should advance the maturity of their ML teams by investing in robust ML infrastructure and promoting ML Engineering education
Best practices are still being established in the MLOps community and we feel that in order to succeed, the open source tools need to be general enough to cover a majority of use cases whilst being flexible enough to allow for use case specific customization
Andrew Chen, Andy Chow, Aaron Davidson, Arjun DCunha, Ali Ghodsi, Sue Ann Hong, Andy Konwinski, Clemens Mewald, Siddharth Murching, Tomas Nykodym, Paul Ogilvie, Mani Parkhe
SIGMOD/PODS '20: International Conference on Management of Data, Portland, OR, ..., pp. 1-4 (2020)
MLflow is a popular open source platform for managing ML development, including experiment tracking, reproducibility, and deployment. In this paper, we discuss user feedback collected since MLflow was launched in 2018, as well as three major features we have introduced in respons...
In this paper we provide a description and classification of such tasks into high-level groups, namely data organization, data quality and feature engineering
This paper described a set of enabling technologies that help increase the level of automation during AI operations, reducing the human effort and cost required
To the best of our knowledge, the results presented in this paper represent the first milestone in training a deep neural network with a large memory footprint on a single-node server without hardware accelerators like GPUs
To fully demonstrate its feasibility and potential, we are conducting extensive offline profiling to cover all operations in popular frameworks and common hardware, and investigating the machine learning approach for op-level performance estimation
We present CodeReef - an open platform to share all the components necessary to enable cross-platform MLOps, i.e. automating the deployment of ML models across diverse systems in the most efficient way
We provide an overview of the engineering challenges surrounding ML/DL solutions and, through this, present a research agenda and overview of open items that need to be addressed by the research community at large
This paper reports on design considerations for a course that teaches software engineering techniques for building systems with artificial intelligence components and experience from teaching the course for the first time
We answered four research questions through a survey of software developers of ML systems and a systematic literature review of both academic and gray literature pertaining to ML systems and their software development
Machine learning technology is increasingly at the core of sophisticated functionality provided by smart devices, household appliances and online services, often unbeknownst to their users