Running notebooks remotely

Leveraging Jupyter notebooks with ZenML.

A Jupyter notebook is often the fastest way to prototype an ML experiment, but sooner or later you will want to execute heavyweight ZenML steps or pipelines on a remote stack. This tutorial shows how to:

  1. Understand the limitations of defining steps inside notebook cells;

  2. Execute a single step remotely from a notebook; and

  3. Promote your notebook code to a full pipeline that can run anywhere.


Why there are limitations

When you call a step or pipeline from a notebook, ZenML needs to export the cell code into a standalone Python module that gets packaged into a Docker image. Any magic commands, cross-cell references, or missing imports break that process. Keep your cells pure and self-contained, as summarized in the checklist (and illustrated in the example) below, and you are good to go.

Checklist for step cells

  • Only regular Python code – no Jupyter magics (%…) or shell commands (!…).

  • Do not access variables or functions defined in other notebook cells. Import from .py files instead.

  • Include all imports you need inside the cell (including from zenml import step).
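
For example, a hypothetical cell like the first one below would break the export because it violates all three rules (a Jupyter magic, names defined in other cells, and a missing import), while the second keeps everything the extraction needs inside the cell:

# ❌ Breaks extraction: Jupyter magic, cross-cell references, missing import
%matplotlib inline                   # magic command – not valid Python
@step                                # `step` was imported in another cell
def trainer() -> float:
    return train_model(DATASET)      # both names live in other cells

# ✅ Self-contained: all imports and logic inside the cell
from zenml import step
import pandas as pd

@step
def trainer(data: pd.DataFrame) -> float:
    """Toy step that returns the mean of a column (illustrative only)."""
    return float(data["target"].mean())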


Run a single step remotely

You can treat a ZenML @step like a normal Python function call. ZenML will automatically create a temporary pipeline with just this one step and run it on your active stack.

from zenml import step
import pandas as pd
from sklearn.base import ClassifierMixin
from sklearn.svm import SVC

@step(step_operator="<STEP_OPERATOR_NAME>")  # remove argument if not using a step operator
def svc_trainer(
    X_train: pd.DataFrame,
    y_train: pd.Series,
    gamma: float = 0.001,
) -> tuple[ClassifierMixin, float]:
    """Train an SVC model and return it together with its training accuracy."""
    model = SVC(gamma=gamma)
    model.fit(X_train.to_numpy(), y_train.to_numpy())
    acc = model.score(X_train.to_numpy(), y_train.to_numpy())
    print(f"Train accuracy: {acc}")
    return model, acc

# Prepare some data …
X_train = pd.DataFrame(...)
y_train = pd.Series(...)

# ☁️  This call executes remotely on the active stack
model, train_acc = svc_trainer(X_train=X_train, y_train=y_train)

Tip: If you prefer YAML, you can also pass a config_path when calling the step.
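
As a rough sketch of that tip, assuming a hypothetical svc_config.yaml whose keys follow ZenML's step configuration schema (the file name and values here are illustrative, and passing config_path through with_options mirrors how pipelines load config files, so check the configuration docs for the exact layout):

# svc_config.yaml – hypothetical step configuration
parameters:
  gamma: 0.01
settings:
  docker:
    requirements:
      - scikit-learn

# Load the file when invoking the step, per the tip above
model, train_acc = svc_trainer.with_options(config_path="svc_config.yaml")(
    X_train=X_train, y_train=y_train
)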


Next steps – from notebook to production

Once your logic stabilizes, it usually makes sense to move code out of the notebook and into regular Python modules so that it can be version-controlled and tested. At that point, just assemble the same steps inside a @pipeline function, as sketched below, and trigger it from the CLI or a CI workflow.
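
For instance, the svc_trainer step from above could be reused in a regular module like this (load_training_data is a hypothetical data-loading step you would define alongside it):

from zenml import pipeline

@pipeline
def svc_training_pipeline(gamma: float = 0.001):
    # `load_training_data` is a hypothetical step defined in a .py module
    X_train, y_train = load_training_data()
    svc_trainer(X_train=X_train, y_train=y_train, gamma=gamma)

if __name__ == "__main__":
    # Runs on whatever stack is active, e.g. `python run.py` from CI
    svc_training_pipeline()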


For a deeper dive into how ZenML packages notebook code, have a look at the Notebook Integration docs.