The Rise of MLOps: What We Can All Learn from DevOps 

Written by jackie-dejesse | Published 2019/10/08
Tech Story Tags: awip | machinelearning | ml | product-management | devops | mlops | mlops-conference | latest-tech-stories

TLDR: The MLOps Conference took place earlier this week at Hudson Mercantile in New York City. Experts from the New York Times, Twitter, Netflix, and Iguazio, the host company, spoke about best practices and machine learning implementation. The challenge is that companies need the entire infrastructure underlying those technologies in order to implement machine learning. New solutions and best practices are coming onto the market to address these problems.

The MLOps Conference took place earlier this week at Hudson Mercantile in New York City. Experts from the New York Times, Twitter, Netflix and Iguazio, the host company, spoke about best practices and machine learning implementation throughout a variety of different organizations.
I learned about the technological void that data scientists face when they want to put machine learning into production. With this new context in mind, I can approach conversations with our data team from a new perspective and take the time to understand how we can implement new models on our team.
Machine learning as a technology has been around for more than 50 years, beginning with Arthur Samuel’s pioneering work at IBM in 1952, where his checkers program improved with each game it played. But despite this progress, the ability to deploy new models remains a challenge. In fact, the pipeline to deploy new models can take weeks or months, with many models never making it to production.
Hence the importance of applying DevOps methods to machine learning (MLOps). As Julie Pitt and Ashish Rastogi of Netflix explained, data scientists are most involved with the solution itself: the applicable models and technologies. The challenge is that companies need the entire infrastructure underlying those technologies in order to put machine learning into production.
Thankfully, new solutions and best practices are coming onto the market to address these problems. Brittany Wills explained that Twitter created a Feature Store (essentially a shared library) where data scientists can publish their latest work.
By sharing best practices and verified models across the organization, various teams no longer have to constantly build from scratch, enabling them to activate models in days rather than months.
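To make the idea concrete, here's a toy sketch in Python of the publish-and-reuse pattern behind a Feature Store. It's my own illustration of the concept, not Twitter's actual implementation; the names and the example feature are made up.

```python
# A toy feature store: a shared registry where teams publish reusable,
# versioned feature definitions instead of rebuilding them from scratch.
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class FeatureStore:
    _features: Dict[str, Callable] = field(default_factory=dict)

    def publish(self, name: str, version: str, fn: Callable) -> None:
        """Register a feature-computation function under a versioned name."""
        self._features[f"{name}:{version}"] = fn

    def get(self, name: str, version: str) -> Callable:
        """Fetch a published feature so another team can reuse it as-is."""
        return self._features[f"{name}:{version}"]


store = FeatureStore()

# One team publishes a feature...
store.publish(
    "user_7day_click_rate", "v1",
    lambda events: sum(e["clicked"] for e in events) / max(len(events), 1),
)

# ...and another team reuses it without reimplementing it.
click_rate = store.get("user_7day_click_rate", "v1")
print(click_rate([{"clicked": 1}, {"clicked": 0}, {"clicked": 1}]))
```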
Additionally, David Aronchick from Microsoft argued for the importance of regularly retraining your model. The data a model was originally trained on goes stale very quickly in production, and if you're not watching for this data drift, the model's predictions can quietly degrade.
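One simple way to keep an eye on this is to compare a feature's live distribution against the distribution it was trained on. Here's a minimal sketch using a two-sample Kolmogorov-Smirnov test; the feature, the synthetic data, and the alert threshold are all stand-ins, not anything prescribed in the talk.

```python
# Minimal drift check: does the feature the model sees in production still
# look like the feature it was trained on? A tiny p-value suggests the
# training data has gone stale and retraining may be due.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)    # what the model saw
production_feature = rng.normal(loc=0.4, scale=1.2, size=5_000)  # what it sees now

result = ks_2samp(training_feature, production_feature)
if result.pvalue < 0.01:  # illustrative threshold
    print(f"Possible drift (KS={result.statistic:.3f}, p={result.pvalue:.4g}); consider retraining.")
else:
    print("Feature distribution still matches the training data.")
```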
While a Feature Store may require organizational buy-in, focusing on the micro elements of one model can have monumental impacts on the way consumers experience your products.
One of the most interesting technologies shared during the conference was Iguazio’s Nuclio. Iguazio is creating the data science platform for production, enabling companies ranging from startups to large enterprises to introduce machine learning and AI into their products in a fast and scalable manner.
Orit Nissan-Messing, their VP of R&D, sat down with me to explain their latest innovation. 
Could you explain what it is that Iguazio does?
Iguazio builds architecture that streamlines the path from data science to production. We recognize that data scientists waste a majority of their time on “plumbing” rather than building actual models. We seek to solve that problem.
And how does your latest innovation, Nuclio, address this challenge?
Nuclio is an open source serverless platform that automates the delivery of code and models to production. Nuclio is unique in that it addresses ML workloads and provides high performance: it can increase data throughput or accelerate ML training while decreasing latency by scaling compute resources dynamically. Essentially, serverless technology enables companies to focus on the application without worrying about DevOps, performance, or scalability.
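For context (my addition, not part of the interview): Nuclio functions are written around a handler(context, event) entry point, so serving a model can look roughly like the sketch below. The model loading and request format are placeholder assumptions, not Iguazio's documented example.

```python
# Rough sketch of a model-serving function for a serverless platform such as
# Nuclio, which calls handler(context, event) for each request. The "model"
# here is a stand-in; in practice you would load a trained artifact once.
import json

MODEL = None  # loaded lazily so it is reused across invocations of the same worker


def load_model():
    """Placeholder for loading a trained model artifact."""
    return lambda features: sum(features)  # stand-in scoring function


def handler(context, event):
    """Score one request; the platform scales these invocations for you."""
    global MODEL
    if MODEL is None:
        MODEL = load_model()
    features = json.loads(event.body)["features"]
    return json.dumps({"score": MODEL(features)})


if __name__ == "__main__":
    # Local smoke test with a stand-in for the platform's event object.
    class _Event:
        body = json.dumps({"features": [1.0, 2.0, 3.0]})

    print(handler(context=None, event=_Event()))
```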
How does this complement the work that Iguazio is already doing?
Our data science platform is unique in the market because we are the only one that can run on any cloud, or even on-prem, and handle any data type. We work with AWS and Microsoft to pull in data from their cloud resources, enabling companies to build insightful models with data that would otherwise have been siloed. Adding Nuclio supercharges this data and simplifies the delivery of intelligent business applications.
What would you tell Product Managers who rely on machine learning or are interested in learning more about it?
The way in which you collect your data will ultimately have a large impact on the final model and application. It’s critical to ensure that your data is accessible and stored properly. When you have data from different sources, you must also normalize that data to prevent false quantitative bias. Ultimately, Product Managers are most responsible for driving the vision and bringing the business problems to the data scientists, but it’s helpful for them to have context around the ML infrastructure so that they have a sense of the cost to develop and the reliability of the model.
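Her point about normalizing data from different sources is easy to illustrate. In the sketch below, two sources report the same quantity in different units; the column names, the unit conversion, and the z-scoring are my own assumptions, not Iguazio's recipe.

```python
# Two sources report ride distance on different scales; put them on a common
# footing before modeling so neither source quietly dominates or skews results.
import pandas as pd

source_a = pd.DataFrame({"ride_distance_km": [1.2, 3.4, 0.8]})
source_b = pd.DataFrame({"ride_distance_miles": [0.9, 2.5, 4.1]})

combined = pd.concat(
    [
        source_a.rename(columns={"ride_distance_km": "distance_km"}),
        source_b.assign(distance_km=source_b["ride_distance_miles"] * 1.60934)[["distance_km"]],
    ],
    ignore_index=True,
)

# Z-score the shared column so downstream features are comparable across sources.
combined["distance_z"] = (
    combined["distance_km"] - combined["distance_km"].mean()
) / combined["distance_km"].std()
print(combined)
```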
Overall, it was an insightful glimpse into the infrastructure that underlies machine learning and how each component can still present an obstacle at a company that is keen to venture into ML.
For Product Managers, here are a few key takeaways:
  • Bring the business problem – Product Managers are not data scientists and should instead focus on providing the problem that needs to be solved and why it’s valuable to commit resources to addressing it.

    For example, a company like Lyft or Uber may ask: how many customers will adopt bike/scooter sharing in a new city?
  • Know your data – understand where the data comes from and how it could influence the application. Bias begins with the data, so be sure to consider a variety of use cases.

    In this example, let’s assume the company already has a city that has adopted bike/scooter sharing; Lyft and Uber both have bikes and scooters in San Francisco. With this in mind, consider how the usage data may be biased in SF. Bias could be introduced by the median age of the population, by the hilly terrain that may suppress usage in particular neighborhoods, by the large share of residents who work in tech (does that influence their adoption of new technology?), or even by the number of single households versus families. This is similar to the exercise a PM must do for any ambiguous opportunity, and it is highly critical when evaluating ML models. By understanding your data, you will be in a better position to understand the risk of the model and if and why it might fail. (A quick sketch of this kind of data slicing follows after this list.)
  • Learn how ML operates at your company – Ask your team how they approach building a model. Do they have to start from scratch? What blockers do they have? This will help you develop a better understanding of the cost and time to build so that you can make the appropriate trade-off decisions.
  • In a large company, it is likely that the organization and/or individual teams have a process for implementing new ML models. However, as I learned from this conference, even Twitter only recently implemented its Feature Store to help various teams collaborate. It’s always worth asking how things work at your organization; don’t assume that everything is organized behind the curtain!
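As promised in the “Know your data” bullet, here is a quick sketch of the kind of slicing a team might do to surface those biases before trusting a model trained on San Francisco data. The dataframe, columns, and numbers are hypothetical.

```python
# Slice usage data along dimensions that might not transfer to a new city:
# terrain and rider age. The data here is made up purely for illustration.
import pandas as pd

rides = pd.DataFrame({
    "neighborhood":   ["Mission", "Mission", "Nob Hill", "Nob Hill", "Sunset", "Sunset"],
    "terrain":        ["flat",    "flat",    "hilly",    "hilly",    "flat",   "flat"],
    "rider_age":      [24,        31,        45,         52,         29,       38],
    "rides_per_week": [6,         4,         1,          0,          5,        3],
})

# Hilly neighborhoods may be under-represented; a flatter new city could behave differently.
print(rides.groupby("terrain")["rides_per_week"].mean())

# A young, tech-heavy rider base may not generalize to the new city's demographics.
rides["age_bucket"] = pd.cut(rides["rider_age"], bins=[0, 30, 45, 100], labels=["<30", "30-45", "45+"])
print(rides.groupby("age_bucket", observed=False)["rides_per_week"].mean())
```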
Image credits
  1. Photo by Aleksejs Bergmanis from Pexels
  2. Human-centric Machine Learning Infrastructure @Netflix

Published by HackerNoon on 2019/10/08