
ML Production: A Model’s Journey Through Wallaroo



Training a model is becoming easier than ever. Data Scientists have several data management tools at their disposal to clean and prepare data before the model training and validation step. But once Data Scientists have built their best models, those models must be put into production before they can produce actionable insights. How does this happen? It’s the million-dollar question: even though 90% of Fortune 1000 companies are investing in data science-related efforts, only about half of those efforts make it to production, and only about 10% generate substantial ROI.

At Wallaroo, our mission is to help companies realize value from their AI efforts, by focusing on the last mile of the Machine Learning (ML) process: going to production. In this article, we’ll describe a model’s journey through Wallaroo.

The ML Model Lifecycle

Every ML project starts with a business problem. How does the business maximize revenue? How do we make our manufacturing process more efficient? How do we know what our customers want? Data Scientists excel at turning data into models that help solve these business problems and inform business decisions. But the skills required to build great models aren’t necessarily the skills needed to put those models into the real world and then monitor and maintain them. That phase requires different expertise and is usually taken on by engineers with titles such as ML Engineer or DevOps.

The skillset for ML Engineers is completely different from the Data Scientist skillset. Whereas the Data Scientist focuses on training accurate models, the ML Engineer has to consider issues like:

  • Data Pipelines: How the data is moved in and out of the system
  • Resource Management: How to best split and use resources for one or more models
  • Authorization and Authentication: How to verify that only authorized users can submit information and retrieve the results

Because of this mismatched expertise, Data Scientists often struggle with deploying and validating their ML models in production; they depend heavily on ML Engineers or DevOps to help them in testing the models against unseen data. Once a model is in production, Data Scientists and ML Engineers have to make sure that the model is performing correctly, and that it continues to meet business objectives. If a model could be performing better, then the team has to iterate to develop and deploy a new version.

Without an efficient interface between the Data Scientist and ML Engineer roles, it’s hard to get models up and running in a timely manner and to keep them up to date.

This is where Wallaroo comes in.

The Wallaroo platform provides a standardized process that ML Engineering and data science teams can use to deploy, run, and observe models across platforms and environments (in the cloud, on-premises, or at the edge). Our Connector Framework connects the platform to your incoming and outgoing data points and takes care of the integration, so you can get up and running quickly. Data Scientists and ML Engineers can then collaborate via the Wallaroo SDK, UI, and API to deploy, manage, and monitor their models.

Let’s look at what happens when a model is uploaded to Wallaroo.

The Data Scientist Uploads A Model

Wallaroo focuses on the ML last mile, rather than the entire process; our platform fits smoothly into your existing data science ecosystem. This means that Data Scientists can train their models using their preferred environment. Wallaroo supports the ONNX model format for maximal interoperability with a variety of frameworks; the platform directly supports TensorFlow, as well.

Once models have been uploaded to Wallaroo, they can be put into a model pipeline. Pipelines define an ML workflow. The simplest pipeline is a single model (with its sources and sinks); more sophisticated pipelines can chain multiple models together and can include the intermediate data processing steps needed to reshape or otherwise massage data into a form compatible with the models. Pipelines can also be used to easily set up experiments in the production environment, such as A/B tests or shadow deployments: all the models in an experiment pipeline receive data from the same endpoint, and the pipeline manages routing that data to the appropriate models and tracking which inferences came from which model.
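To make the experiment idea concrete, here is a minimal sketch in plain Python of how a single entry point might route requests for an A/B test and for a shadow deployment. Every name in it (ab_route, shadow_route, the two toy models) is hypothetical and is not part of the Wallaroo SDK; it only illustrates the routing and bookkeeping that a pipeline takes off your hands.

```python
import random
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], float]


def control(x: float) -> float:
    """Toy stand-in for the model currently in production."""
    return 2.0 * x


def challenger(x: float) -> float:
    """Toy stand-in for the candidate model under evaluation."""
    return 2.0 * x + 1.0


def ab_route(
    x: float,
    models: Dict[str, Model],
    weights: Dict[str, float],
    rng: random.Random,
) -> Tuple[str, float]:
    """A/B test: pick ONE model per request by weight and record which one answered."""
    names = list(models)
    chosen = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
    return chosen, models[chosen](x)


def shadow_route(
    x: float,
    primary: Tuple[str, Model],
    shadows: Dict[str, Model],
    log: List[dict],
) -> float:
    """Shadow deployment: every model sees the input, only the primary result is served."""
    name, model = primary
    result = model(x)
    log.append({"model": name, "input": x, "output": result, "served": True})
    for shadow_name, shadow_model in shadows.items():
        log.append(
            {"model": shadow_name, "input": x, "output": shadow_model(x), "served": False}
        )
    return result


models = {"control": control, "challenger": challenger}
rng = random.Random(0)  # seeded so the sketch is repeatable
ab_name, ab_prediction = ab_route(3.0, models, {"control": 0.9, "challenger": 0.1}, rng)

shadow_log: List[dict] = []
served = shadow_route(3.0, ("control", control), {"challenger": challenger}, shadow_log)
```

In the A/B case each request is answered by exactly one model, so the returned name is what lets you attribute the inference later. In the shadow case the caller only ever sees the primary model's answer, while the log keeps the challenger's output for offline comparison.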

A defined pipeline can then be deployed and undeployed.

A Data Scientist can do all of this in a few lines of Python.
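As a rough illustration of that flow (define a pipeline, add steps, deploy, run an inference, undeploy), here is a self-contained sketch. The Pipeline class below is a hypothetical in-memory stand-in written for this example, not the Wallaroo SDK; only the verbs deploy, undeploy, and infer echo the terms used in this article, and the toy model simply stands in for an uploaded model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Step = Callable[[list], list]


@dataclass
class Pipeline:
    """Hypothetical stand-in for a model pipeline: an ordered chain of steps."""

    name: str
    steps: List[Step] = field(default_factory=list)
    deployed: bool = False

    def add_step(self, step: Step) -> "Pipeline":
        self.steps.append(step)
        return self

    def deploy(self) -> "Pipeline":
        self.deployed = True
        return self

    def undeploy(self) -> "Pipeline":
        self.deployed = False
        return self

    def infer(self, rows: list) -> list:
        if not self.deployed:
            raise RuntimeError(f"pipeline {self.name!r} is not deployed")
        for step in self.steps:  # each step's output feeds the next step
            rows = step(rows)
        return rows


def scale_features(rows: list) -> list:
    """Intermediate processing step: reshape raw inputs into what the model expects."""
    return [[value / 100.0 for value in row] for row in rows]


def toy_model(rows: list) -> list:
    """Stand-in for an uploaded model: one prediction per input row."""
    return [sum(row) for row in rows]


pipeline = Pipeline("demo").add_step(scale_features).add_step(toy_model)
pipeline.deploy()
predictions = pipeline.infer([[50.0, 150.0], [100.0, 100.0]])
pipeline.undeploy()
```

The point of the sketch is the shape of the workflow: the pipeline is defined once, and deploying or undeploying it does not change its definition.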

What Happens Next?

Once a pipeline has been deployed, an inference REST API endpoint is generated. From a notebook, the Data Scientist can reach it over HTTP with the pipeline.infer() method, which is a quick way to verify that the pipeline has been deployed successfully.

And of course, those RESTful endpoints can be accessed by other systems and applications for batch, streaming, or request/response inferencing. The resulting model predictions are sent to the appropriate sink and are also made available for analysis by Wallaroo’s model insights functionality.
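For a sense of what a request/response call from another application could look like, here is a small sketch using only the Python standard library. The URL, the bearer-token header, and the JSON payload shape are all placeholders assumed for illustration; check your own deployment for the actual endpoint address, authentication scheme, and input format.

```python
import json
import urllib.request


def build_inference_request(url: str, token: str, rows: list) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an inference endpoint."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


def send(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_inference_request(
    "https://wallaroo.example.com/pipelines/demo/infer",  # placeholder URL
    "YOUR_TOKEN",  # placeholder credential
    [{"feature_a": 0.5, "feature_b": 1.5}],  # assumed payload shape
)
# result = send(request)  # uncomment once the URL and token point at a real endpoint
```

Keeping request construction separate from sending makes the client easy to test without a live endpoint.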

Once the pipeline is up and running, both Data Scientists and ML Engineers can monitor model performance via the Wallaroo dashboard. They can define anomaly detection and data validation checks on pipelines, and set up alerts so the appropriate people get notified if something goes wrong. Because Wallaroo tracks model versioning, it’s easy to update models or roll them back, as needed.
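Conceptually, a data validation check is just a named rule evaluated against each inference, with an alert raised when the rule fails. The sketch below shows that idea in plain Python; the Check class, the run_checks function, and the thresholds are hypothetical and do not reflect how checks are declared in Wallaroo.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    """A named rule that a prediction is expected to satisfy."""

    name: str
    passes: Callable[[float], bool]


def run_checks(predictions: List[float], checks: List[Check]) -> List[str]:
    """Return one alert message for every (prediction, check) pair that fails."""
    alerts = []
    for index, prediction in enumerate(predictions):
        for check in checks:
            if not check.passes(prediction):
                alerts.append(f"{check.name} failed for prediction #{index}: {prediction}")
    return alerts


# Example rules for a price-prediction model (thresholds are made up).
checks = [
    Check("non_negative", lambda p: p >= 0.0),
    Check("below_ceiling", lambda p: p <= 1_000_000.0),
]
alerts = run_checks([250_000.0, -5.0, 2_000_000.0], checks)
```

In a real deployment the alert list would feed a notification channel rather than stay in a local variable.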

And that’s it! Wallaroo’s high-performance Rust compute engine is designed to run your models at blazing speed, using fewer resources. The platform’s autoscaling capabilities let individual engines scale their resource consumption up and down within preset limits, so you are only expending those resources when they are needed.
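Scaling "within preset limits" can be pictured as computing a desired capacity from current load and then clamping it to a configured minimum and maximum. The sketch below uses a generic proportional rule, similar in spirit to common horizontal autoscalers; it is an assumption made for illustration and not a description of how Wallaroo's autoscaling is implemented. The function and parameter names are hypothetical.

```python
def desired_replicas(
    current: int,
    utilization_pct: int,
    target_pct: int,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Scale proportionally to load, then clamp to the preset limits."""
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    # Ceiling division with integers avoids floating-point rounding surprises.
    raw = -(-(current * utilization_pct) // target_pct)
    return max(min_replicas, min(max_replicas, raw))


scale_up = desired_replicas(2, 90, 60, min_replicas=1, max_replicas=4)
scale_down = desired_replicas(4, 10, 60, min_replicas=1, max_replicas=4)
capped = desired_replicas(4, 95, 60, min_replicas=1, max_replicas=4)
```

Here two replicas running at 90% against a 60% target grow to three, four replicas at 10% shrink to one, and a request for seven replicas is capped at the preset maximum of four.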

With Wallaroo, Data Scientists and ML Engineers can collaborate to get ML models into production quickly, efficiently, and painlessly. If you’d like to learn about how Wallaroo can streamline your ML production process, reach out to us for more information.
