MLOps: How does it differ from DevOps?

Written on March 16, 2021 by Jason Horwood

DevOps (a combination of Development and Operations) has changed traditional software development by making it possible for companies to ship software to production in minutes and keep it running reliably.

There is a new kid on the block, though, that is threatening to change everything again: MLOps. But what is it, and how can your software team help to achieve its goals?

What is MLOps?

MLOps has been defined by Nvidia as “a set of best practices for businesses to run AI successfully.”

We would expand on that to say that MLOps (or Machine Learning Operations) is a set of best practices that combines Data Engineering, DevOps and Machine Learning in order to reliably and efficiently deploy and maintain Machine Learning systems in production.

What are MLOps Best Practices?

Best Practice 1: Hybrid Teams

One of the biggest challenges faced by Machine Learning teams is actually getting their Machine Learning systems into production. A single MLOps Engineer could in theory have all the skills necessary to productionize an ML model, but at the moment a hybrid team made up of a Data Scientist, a Data Engineer, an ML Engineer and a DevOps Engineer is more likely to succeed.

The exact composition of the hybrid team can vary, but the important thing for business owners to remember is that achieving their MLOps goals takes much more than a single Data Scientist. It takes a team of people working closely together to get a Machine Learning model running in production.

Best Practice 2: Machine Learning Pipelines

As we have discussed before, one of the core concepts of Data Engineering is the data pipeline, i.e. the series of transformations applied to data between its source and its final destination. These pipelines are sometimes called ETL (Extract, Transform and Load) pipelines.

All Machine Learning models require some sort of data transformation along the way. This is usually handled in notebook cells or ad hoc scripts, which tend to be hard to run reliably and hard to manage. Switching to more robust Machine Learning pipelines helps not only with management and scalability but also with code reuse and run-time visibility.
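
As a rough illustration of what moving from notebook cells to a pipeline can look like, here is a minimal Python sketch. It is not tied to any particular pipeline framework: the step names (drop_missing, scale_amount), the "amount" column and the run_pipeline helper are all hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]
Step = Callable[[List[Row]], List[Row]]


def drop_missing(rows: List[Row]) -> List[Row]:
    """Discard rows that contain empty (None) values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def scale_amount(rows: List[Row]) -> List[Row]:
    """Rescale the hypothetical 'amount' column to the range [0, 1]."""
    if not rows:
        return []
    amounts = [r["amount"] for r in rows]
    low, high = min(amounts), max(amounts)
    span = (high - low) or 1.0  # avoid dividing by zero on a constant column
    return [{**r, "amount": (r["amount"] - low) / span} for r in rows]


def run_pipeline(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Apply each step in order and log row counts for run-time visibility."""
    for step in steps:
        rows = step(rows)
        print(f"{step.__name__}: {len(rows)} rows")
    return rows


raw = [{"amount": 10.0}, {"amount": None}, {"amount": 30.0}, {"amount": 20.0}]
clean = run_pipeline(raw, [drop_missing, scale_amount])
```

Because each step is an ordinary function with the same input and output shape, steps can be reused across training and prediction, tested in isolation, and reordered without touching the runner.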

Best Practice 3: Model and Data Versioning

Being able to track model versions is essential in Machine Learning, as is being able to track the data used to train each model and other metadata such as training hyperparameters.

Git is often used for tracking models and metadata because it is a standard version control system, but some companies have found it impractical for large amounts of data.

The ideal way to handle model and data versioning is with a purpose-built tool that ties each model to the exact version of the code, data and hyperparameters used to produce it. Tools such as DVC and MLflow have emerged to address parts of this problem, although the tooling is still maturing.
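
The underlying idea of tying a model to its code, data and hyperparameters can be sketched with nothing more than the Python standard library. The following is an illustrative sketch only, not how any specific tool works: build_model_record, fingerprint and the field names are hypothetical, and the code version string stands in for something like a Git commit hash.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Return a short, stable hash of some bytes."""
    return hashlib.sha256(payload).hexdigest()[:12]


def build_model_record(code_version: str, training_data: bytes, hyperparameters: dict) -> dict:
    """Tie a model to the code, data and hyperparameters that produced it."""
    record = {
        "code_version": code_version,  # assumed to be e.g. a Git commit hash
        "data_hash": fingerprint(training_data),
        "hyperparameters": hyperparameters,
    }
    # The model ID is derived from everything above, so changing any
    # input (code, data or hyperparameters) yields a different ID.
    serialised = json.dumps(record, sort_keys=True).encode("utf-8")
    record["model_id"] = fingerprint(serialised)
    return record


record_a = build_model_record("abc1234", b"age,label\n34,1\n", {"learning_rate": 0.1})
record_b = build_model_record("abc1234", b"age,label\n35,1\n", {"learning_rate": 0.1})
print(record_a["model_id"], record_b["model_id"])
```

Storing a record like this next to each trained model makes it possible to answer "which data and settings produced this model?" later on, even when the data itself is too large to keep in Git.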

Best Practice 4: Model Validation

Test automation is a standard DevOps practice, usually implemented through unit and integration tests that must pass before a change is deployed. Machine Learning models, however, are harder to test because no model gets 100% of its predictions right.

Model validation tests therefore need to be statistical in nature rather than relying on a simple pass/fail status. It is important to decide which metrics you are going to track and what thresholds count as acceptable values, usually set empirically by comparison with previous benchmarks or models.
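
To make this concrete, here is a minimal Python sketch of a validation gate that compares a candidate model against a baseline. The metric (accuracy), the threshold values and the function names are assumptions made for this example; in practice the metrics and thresholds would come from your own benchmarks.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the verified labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def validate_model(candidate_acc, baseline_acc, min_accuracy=0.80, max_regression=0.02):
    """Check a candidate against an absolute floor and a previous benchmark.

    The default thresholds are illustrative values, not recommendations.
    """
    checks = {
        "meets_minimum": candidate_acc >= min_accuracy,
        "no_regression": candidate_acc >= baseline_acc - max_regression,
    }
    return all(checks.values()), checks


labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
candidate_predictions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # one mistake out of ten

candidate_acc = accuracy(candidate_predictions, labels)
passed, checks = validate_model(candidate_acc, baseline_acc=0.85)
print(candidate_acc, passed, checks)
```

The gate still produces a yes/no answer for the deployment step, but that answer comes from metrics measured on held-out data rather than from exact expected outputs.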

Best Practice 5: Data Validation

Machine Learning training and prediction pipelines both rely on good validation of input data: checks on column types, empty values, file format and size, and invalid values. Without these checks you may end up with a misbehaving model and no idea why.

As well as the basic data validations outlined above, Machine Learning data pipelines should also validate higher-level statistical properties of the input, such as whether the mean or range of a feature has shifted.
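
Both kinds of check can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not a replacement for a dedicated validation library: the schema format, the column names and the validate_rows and mean_shift helpers are hypothetical.

```python
from statistics import mean


def validate_rows(rows, schema):
    """Basic checks: each column is present, non-empty and of the expected type."""
    errors = []
    for i, row in enumerate(rows):
        for column, expected_type in schema.items():
            value = row.get(column)
            if value is None:
                errors.append(f"row {i}: '{column}' is empty")
            elif not isinstance(value, expected_type):
                errors.append(f"row {i}: '{column}' is not {expected_type.__name__}")
    return errors


def mean_shift(reference, current):
    """Statistical check: relative change in the mean versus reference data."""
    ref_mean = mean(reference)
    if ref_mean == 0:
        raise ValueError("reference mean is zero; use an absolute threshold instead")
    return abs(mean(current) - ref_mean) / abs(ref_mean)


schema = {"age": int, "country": str}
rows = [
    {"age": 34, "country": "UK"},
    {"age": None, "country": "FR"},
    {"age": "41", "country": "DE"},
]
errors = validate_rows(rows, schema)
shift = mean_shift(reference=[30, 35, 40], current=[50, 55, 60])
print(errors)
print(f"mean shifted by {shift:.0%}")
```

A pipeline could refuse to train or predict when errors is non-empty, and raise an alert when the shift exceeds a tolerance agreed in advance.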

Best Practice 6: Monitoring Model Performance

As with any production system, monitoring Machine Learning systems is essential, as their performance depends not only on infrastructure and software (which you have some control over) but also on data (which you have less control over).

This brings its own challenges, as you probably won’t have a verified label to compare your model’s predictions against at prediction time, but there are solutions to this problem. It is also important to monitor metrics across slices of the data so that you can detect problems that only affect specific segments.
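
Monitoring across slices can be sketched as follows. The example assumes that verified labels eventually become available for logged predictions, which will not always be true; the record fields ("region", "prediction", "label"), the tolerance value and the helper names are hypothetical.

```python
from collections import defaultdict


def accuracy_by_slice(records, slice_key):
    """Compute accuracy separately for each value of slice_key."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        group = rec[slice_key]
        totals[group] += 1
        correct[group] += int(rec["prediction"] == rec["label"])
    return {group: correct[group] / totals[group] for group in totals}


def underperforming_slices(per_slice, overall, tolerance=0.10):
    """Return slices whose accuracy falls more than tolerance below the overall figure."""
    return sorted(g for g, acc in per_slice.items() if acc < overall - tolerance)


records = (
    [{"region": "UK", "prediction": 1, "label": 1}] * 4
    + [{"region": "FR", "prediction": 1, "label": 1}] * 1
    + [{"region": "FR", "prediction": 1, "label": 0}] * 3
)

overall = sum(r["prediction"] == r["label"] for r in records) / len(records)
per_slice = accuracy_by_slice(records, "region")
flagged = underperforming_slices(per_slice, overall)
print(overall, per_slice, flagged)
```

Here the overall accuracy hides the fact that one region performs far worse than the other, which is exactly the kind of problem a single aggregate metric can miss.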

MLOps is a new and exciting discipline that is evolving very quickly. To take advantage of the opportunities relating to MLOps, please get in touch with the friendly and experienced team at Agile Recruit.