Product Deep Dive: How Wallaroo Simplifies and Automates Recurring ML Workloads for Data Scientists and ML Engineers

June 8, 2023

Machine learning (ML) pipelines are complex workflows that involve multiple steps, such as data preprocessing, feature engineering, model training, evaluation, and deployment. Orchestrating these steps in a scalable, reliable, and efficient manner is a challenging problem for ML practitioners and engineers involved in MLOps.

We often see do-it-yourself solutions built on DevOps orchestration tools like Airflow or Conductor. However, ML differs from standard software development, particularly in its need for continuous feedback and optimization, so there are key differences between the Wallaroo Enterprise Edition’s Workload Orchestration and these tools.

That’s why early-access customers of the ML Workload Orchestration features in the Enterprise Edition of Wallaroo have seen data scientists and ML engineers free up as much as 40% of their weekly time, which they can redirect from low-level plumbing to delivering and scaling additional value-generating initiatives.

Full documentation, including tutorials, on our ML Workload Orchestration features is available here, but below is a brief overview of what makes using Wallaroo different from standard software development tools for ML and how it can help AI teams.

Unlock ML Production by Automating and Standardizing Key Functions

The new Workload Orchestration features let data scientists and ML engineers easily define, automate, and scale recurring production ML workloads: ingest data from predefined data sources, run inference through one or more ML pipelines, and send the results to predefined destinations, ensuring a tight feedback loop from data to business insights.

Here’s how it works:

  • You upload your models, define your ML Workload steps, and set your schedule (using just a few lines of Python)
  • Wallaroo takes care of orchestration, scheduling, infrastructure, resilience, data gathering and inference
  • Then you monitor the workloads and retrieve your results when you need them, efficiently and at scale.
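The three steps above can be sketched as a small Python pattern. The names here (`Workload`, `add_step`, `run_once`) are illustrative only, not Wallaroo's actual SDK; see the documentation linked above for the real calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical sketch of the define -> orchestrate -> monitor pattern;
# names are illustrative, not Wallaroo's actual SDK.

@dataclass
class Workload:
    name: str
    schedule: str = "@daily"  # cron-style schedule string
    steps: List[Callable] = field(default_factory=list)
    results: list = field(default_factory=list)

    def add_step(self, fn: Callable) -> "Workload":
        self.steps.append(fn)
        return self

    def run_once(self, payload):
        # The platform would handle scheduling, retries, and scaling;
        # here we just thread the payload through each step.
        for step in self.steps:
            payload = step(payload)
        self.results.append(payload)
        return payload

wl = Workload(name="daily-scoring", schedule="0 6 * * *")
wl.add_step(lambda rows: [r * 2 for r in rows])          # "preprocess"
wl.add_step(lambda rows: [{"score": r} for r in rows])   # "inference"
print(wl.run_once([1, 2, 3]))  # [{'score': 2}, {'score': 4}, {'score': 6}]
```

In the real platform, the orchestrator rather than your own code is responsible for firing `run_once` on the schedule and persisting the results.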
ML Production Process

We focused our efforts on helping ML practitioners who want to orchestrate ML pipelines to ingest and deposit live streams and batches of data across all environments. 

Our goal was to enable them to:

  • Easily run inferences on data across different modalities and across the environments in which the models are deployed, and
  • Easily and quickly deploy models to run without bottlenecks related to portability of features engineered in training

Automating and scaling ML workflows lets practitioners scale models with existing or new production pipelines. Pipelines that are portable from training to production, combined with the flexibility of being data source-agnostic, allow AI teams to drive business value and efficiently maintain business continuity.

Plus, there are several specific new capabilities, as shown below.

ML process with Wallaroo Model Operations Center and Workload Orchestration

Data Connectors

A Wallaroo platform user in a workspace can specify target data sources and connections that the pipeline can use to ingest data and store inference results. These include data platforms like BigQuery and Databricks as well as object storage like GCS, S3, and Azure cloud file storage.
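Conceptually, a connection is a named, workspace-scoped record of a source or sink plus its configuration. The sketch below illustrates that shape with hypothetical names and placeholder config values, not Wallaroo's actual SDK objects.

```python
# Illustrative sketch of workspace-scoped data connections
# (hypothetical shapes; see Wallaroo's docs for the real SDK calls).

class Workspace:
    def __init__(self, name: str):
        self.name = name
        self._connections = {}

    def create_connection(self, name: str, conn_type: str, details: dict):
        # conn_type might be e.g. "BIGQUERY", "GCS", "S3", ...
        self._connections[name] = {"type": conn_type, "details": details}

    def get_connection(self, name: str) -> dict:
        return self._connections[name]

ws = Workspace("fraud-detection")
ws.create_connection(
    "source-orders", "BIGQUERY",
    {"project": "acme", "dataset": "orders"},      # placeholder config
)
ws.create_connection(
    "sink-results", "GCS",
    {"bucket": "acme-inference-results"},          # placeholder config
)
print(ws.get_connection("source-orders")["type"])  # BIGQUERY
```

Because pipelines refer to connections by name, the underlying source or sink can be swapped without touching the workload definition itself.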

Run Pipelines Interactively with Simple Script Commands

Once data connections have been configured, Wallaroo users can run arbitrary SQL and Python processing scripts with their associated libraries to shape source data into the expected model inputs and outputs.

This lets users deploy and undeploy pipelines and schedule orchestrators to run against large data sources, whether push or pull, controlling the start and end dates, the number of runs (once, on a schedule, or until canceled), and the frequency (hourly, daily, etc.). Additionally, because ML workloads run as pipelines, data scientists can easily validate and hot-swap models without major reconfiguration of upstream data sources or downstream business applications.
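The scheduling controls just described (start/end dates, run count, frequency) can be reduced to a small planning function. This is a minimal sketch of the semantics under those assumptions, not Wallaroo's scheduler.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the scheduling controls described above:
# start/end dates, run count, and frequency (not Wallaroo's actual API).

def plan_runs(start: datetime, end: datetime, every: timedelta, max_runs=None):
    """Return the timestamps at which a workload would fire."""
    runs, t = [], start
    while t <= end and (max_runs is None or len(runs) < max_runs):
        runs.append(t)
        t += every
    return runs

runs = plan_runs(
    start=datetime(2023, 6, 1, 6, 0),
    end=datetime(2023, 6, 3, 6, 0),
    every=timedelta(days=1),  # daily frequency
)
print(len(runs))  # 3 runs: June 1, 2, and 3 at 06:00
```

Running "once" corresponds to `max_runs=1`, and "until canceled" to an open-ended `end` with cancellation handled by the platform.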

Easy Run Fail Detection and Identification

When an orchestration run, whether interactive or scheduled, fails, an orchestration error log can be retrieved via an API call or SDK function; the log includes the job ID and run date.

We’ve found this addresses one of the greatest time sinks for data scientists and ML engineers: figuring out when and where a job failed to run.
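As a rough illustration, scanning such a log for failed runs and surfacing their job IDs and run dates might look like the following. The record fields here are hypothetical; the real ones come from Wallaroo's API and SDK.

```python
# Illustrative sketch of scanning orchestration run logs for failures
# (hypothetical record shape; real fields come from Wallaroo's API/SDK).

run_log = [
    {"job_id": "run-101", "run_date": "2023-06-07", "status": "success"},
    {"job_id": "run-102", "run_date": "2023-06-08", "status": "failure",
     "error": "source connection timed out"},
]

def failed_runs(log):
    # Surface the job ID and run date for every failed run,
    # as the orchestration error log does.
    return [(r["job_id"], r["run_date"], r.get("error"))
            for r in log if r["status"] == "failure"]

for job_id, run_date, error in failed_runs(run_log):
    print(f"{job_id} failed on {run_date}: {error}")
```

Having the job ID and run date in one place means the failed run can be re-executed or debugged directly, instead of being reconstructed from scattered infrastructure logs.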

The Underlying Design

Of course, the new Workload Orchestration features are only possible because of the underlying design of the Wallaroo technology. For new readers, the Wallaroo Enterprise Edition is a unified, install-based platform built from the ground up to support end-to-end, enterprise-grade production ML workflows.

The platform consists of 3 key components:

  1. The Wallaroo Integration Toolkit: The Integration Toolkit enables Wallaroo to work with any data, ML, and infrastructure ecosystem and provides a seamless workflow for your AI teams
  2. The Wallaroo Model Ops Center: The Model Ops Center is the single place to deploy, observe, manage and optimize ML through an easy-to-use self-service toolkit (SDK, UI, API)
  3. The Wallaroo Inference Server: Written in Rust and highly optimized for performance, the serving engine can run any model and any use case, streaming or batch, and deploys in the cloud and at the edge, from a Raspberry Pi to large Kubernetes clusters


Interested in learning more? Speak to an expert.