In this article, you will learn how to use Docker to package, run, and ship a complete machine learning prediction service, covering the workflow from training a model to serving it as an API and distributing it as a container image.
Topics we will cover include:
- Core Docker concepts (images, containers, layers, caching) for machine learning work.
- Training a simple classifier and serving predictions with FastAPI.
- Authoring an efficient Dockerfile, running the container locally, and pushing to Docker Hub.
Let’s get to it.
The Complete Guide to Docker for Machine Learning Engineers
Introduction
Machine learning models often behave differently across environments. A model that works on your laptop might fail on a colleague’s machine or in production due to version mismatches, missing dependencies, or system-level differences. This makes collaboration and deployment unnecessarily complicated.
Docker solves these problems by packaging your entire machine learning application — model, code, dependencies, and runtime environment — into a standardized container that runs identically everywhere. So you can build once and run anywhere without configuration mismatches or dependency conflicts.
This article shows you how to containerize machine learning models using a simple example. You’ll learn:
- Docker basics for machine learning
- Building and serving a machine learning model
- Containerizing machine learning applications using Docker
- Writing Dockerfiles optimized for machine learning applications
Let’s take the first steps towards shipping models that actually work everywhere.
Prerequisites
Before we learn about containerizing machine learning models with Docker, make sure you have the following.
Required:
- Python 3.11 (or a recent version) installed on your machine
- FastAPI and required dependencies (no worries, we’ll install them as we go!)
- Basic command line/terminal knowledge
- Docker Desktop installed (download it from docker.com)
- A text editor or IDE
Helpful but not required:
- Basic understanding of machine learning concepts
- Familiarity with Python virtual environments
- Experience with REST APIs
Check your Docker installation:
```
docker --version
docker run hello-world
```
If both of these commands work, you’re ready to go!
Docker Basics for Machine Learning Engineers
Before we build our first machine learning container, let’s understand the fundamental concepts. Docker might seem complex at first, but once you grasp these core ideas, everything clicks into place.
What is Docker and Why Should Machine Learning Engineers Care?
Docker is a platform that packages your application and all its dependencies into a standardized unit called a container. For machine learning engineers, Docker addresses several relevant challenges in development and deployment.
A common issue in machine learning workflows arises when code behaves differently across machines due to mismatched Python or library versions. Docker eliminates this variability by encapsulating the entire runtime environment, ensuring consistent behavior everywhere.
Machine learning projects often rely on complex software stacks with strict version requirements such as TensorFlow tied to specific CUDA releases, or PyTorch conflicting with certain NumPy versions. Docker containers isolate these dependencies cleanly, preventing version conflicts and simplifying setup.
Reproducibility is foundational in machine learning research and production. By packaging code, libraries, and system dependencies into a single image, Docker enables exact recreation of experiments and results.
Deploying models typically involves reconfiguring environments across different machines or cloud platforms. With Docker, an environment built once can run anywhere, minimizing setup time and deployment risk.
Docker Images vs Containers
This is the most important concept to understand. Many beginners confuse images and containers, but they’re fundamentally different.
A Docker image is like a blueprint or a recipe. It’s a read-only template that contains:
- The operating system (usually a lightweight Linux distribution)
- Your application code
- All dependencies and libraries
- Configuration files
- Instructions for running your app
Think of it like a class definition in programming: it defines everything about your application, but doesn’t do anything by itself.
A Docker container is a running instance of an image. It’s like an object instantiated from a class. You can create multiple containers from the same image, just like you can create multiple objects from the same class.
Here’s an example:
```
# This is an IMAGE – a template
docker build -t my-ml-model:v1 .

# These are CONTAINERS – running instances
docker run --name experiment-1 my-ml-model:v1
docker run --name experiment-2 my-ml-model:v1
docker run --name experiment-3 my-ml-model:v1
```
We haven’t covered Docker commands yet, but for now, know that you build an image with the docker build command and start containers from it with the docker run command. In the example above, you’ve created one image but three separate running containers. Each container runs independently with its own memory and processes, but they all started from the same image.
Dockerfile
The Dockerfile is where you write instructions for building an image. It’s a plain text file (literally named Dockerfile with no extension) that Docker reads from top to bottom.
Docker builds images in layers. Each instruction in your Dockerfile creates a new layer in your image. Docker caches these layers, which makes rebuilds faster if nothing changed.
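Once you have built an image (as we will later in this article), you can see these layers for yourself with docker history, which lists each instruction alongside the size of the layer it created. Here, using the example image from above:

```
docker history my-ml-model:v1
```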
Persisting Data with Volumes
Containers are ephemeral, meaning that when you delete a container, everything inside it disappears. This is a problem for machine learning engineers who need to save training logs, model checkpoints, and experimental results.
Volumes solve this by mounting directories from your host machine into the container:
```
docker run -v /path/on/host:/path/in/container my-model
```
Now files written to /path/in/container actually live on your host at /path/on/host. They survive even if you delete the container.
For machine learning workflows, you might mount:
```
docker run \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/models:/app/models \
  -v $(pwd)/logs:/app/logs \
  my-training-container
```
This way your trained models, datasets, and logs persist outside the container.
Networking and Port Mapping
When you run a container, it gets its own network namespace. To access services running inside, you need to map ports:
```
docker run -p 8000:8000 my-api
```
This maps port 8000 on your machine to port 8000 in the container. The format is host_port:container_port.
For machine learning APIs, this lets you run multiple model versions simultaneously:
```
# Run two versions side by side
docker run -d -p 8000:8000 --name wine-api-v1 yourusername/wine-predictor:v1
docker run -d -p 8001:8000 --name wine-api-v2 yourusername/wine-predictor:v2

# v1 served at http://localhost:8000, v2 at http://localhost:8001
```
Why Docker Over Virtual Environments?
You might wonder: “Why not just use venv or conda?” Here’s why Docker is better for machine learning:
Virtual environments only isolate Python packages. They do not isolate system libraries (like CUDA drivers), operating system differences (Windows vs Linux), or system-level dependencies (libgomp, libgfortran).
Docker isolates everything. Your container runs the same on your MacBook, your teammate’s Windows PC, and a Linux server in the cloud. Plus, Docker makes it trivial to run different Python versions simultaneously, which is painful with virtual environments.
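For example, you can check two different Python versions back to back without installing either on your host, using the official python images from Docker Hub:

```
docker run --rm python:3.10-slim python --version
docker run --rm python:3.12-slim python --version
```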
Containerizing a Machine Learning App with Docker
Now that we understand Docker basics, let’s build something practical. We’ll create a wine quality prediction model using scikit-learn’s wine dataset and deploy it as a production-ready API. Here’s what we’ll cover:
- Building and training a Random Forest classifier
- Creating a FastAPI application to serve predictions
- Writing an efficient Dockerfile
- Building and running the container locally
- Testing the API endpoints
- Pushing the image to Docker Hub for distribution
Let’s get started!
Step 1: Setting Up Your Project
First, create a project directory with the following recommended structure:
```
wine-predictor/
├── train_model.py
├── app.py
├── requirements.txt
├── Dockerfile
└── .dockerignore
```
Next, create and activate a virtual environment:
```
python3 -m venv v1
source v1/bin/activate
```
Then install the required packages:
```
pip install fastapi uvicorn pandas scikit-learn
```
Step 2: Building the Machine Learning Model
First, we need to create our machine learning model. We’ll use the wine dataset that’s built into scikit-learn.
Create a file called train_model.py:
```
import pickle

from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the wine dataset
wine = load_wine()
X, y = wine.data, wine.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train the model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train_scaled, y_train)

# Evaluate
accuracy = model.score(X_test_scaled, y_test)
print(f"Model accuracy: {accuracy:.2f}")

# Save both the model and scaler
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

with open('scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

print("Model and scaler saved successfully!")
```
Here’s what this code does: We load the wine dataset which contains 13 chemical features of different wines. After splitting our data into training and testing sets, we scale the features using StandardScaler. We train a Random Forest classifier and save both the model and the scaler. Why save the scaler? Because when we make predictions later, we need to scale new data the exact same way we scaled the training data.
Run this script to train and save your model:
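```
python train_model.py
```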
You should see output showing your model’s accuracy and confirmation that the files were saved.
Step 3: Creating the FastAPI Application
Now let’s create an API using FastAPI that loads our trained model and serves predictions.
Create a file called app.py:
```
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import pickle
import numpy as np

app = FastAPI(title="Wine Quality Predictor")

# Load model and scaler at startup
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

with open('scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)

# Wine class names for better output
wine_classes = ['Class 0', 'Class 1', 'Class 2']

class WineFeatures(BaseModel):
    alcohol: float
    malic_acid: float
    ash: float
    alcalinity_of_ash: float
    magnesium: float
    total_phenols: float
    flavanoids: float
    nonflavanoid_phenols: float
    proanthocyanins: float
    color_intensity: float
    hue: float
    od280_od315_of_diluted_wines: float
    proline: float

    # Pydantic v2-compatible schema example
    model_config = {
        "json_schema_extra": {
            "example": {
                "alcohol": 13.2,
                "malic_acid": 2.77,
                "ash": 2.51,
                "alcalinity_of_ash": 18.5,
                "magnesium": 96.0,
                "total_phenols": 2.45,
                "flavanoids": 2.53,
                "nonflavanoid_phenols": 0.29,
                "proanthocyanins": 1.54,
                "color_intensity": 5.0,
                "hue": 1.04,
                "od280_od315_of_diluted_wines": 3.47,
                "proline": 920.0
            }
        }
    }

@app.get("/")
def read_root():
    return {
        "message": "Wine Quality Prediction API",
        "endpoints": {
            "/predict": "POST - Make a prediction",
            "/health": "GET - Check API health",
            "/docs": "GET - API documentation"
        }
    }

@app.get("/health")
def health_check():
    return {
        "status": "healthy",
        "model_loaded": model is not None,
        "scaler_loaded": scaler is not None
    }

@app.post("/predict")
def predict(features: WineFeatures):
    try:
        # Convert input to array
        input_data = np.array([[
            features.alcohol, features.malic_acid, features.ash,
            features.alcalinity_of_ash, features.magnesium,
            features.total_phenols, features.flavanoids,
            features.nonflavanoid_phenols, features.proanthocyanins,
            features.color_intensity, features.hue,
            features.od280_od315_of_diluted_wines, features.proline
        ]])

        # Scale the input
        input_scaled = scaler.transform(input_data)

        # Make prediction
        prediction = model.predict(input_scaled)
        probabilities = model.predict_proba(input_scaled)[0]
        pred_index = int(prediction[0])

        return {
            "prediction": wine_classes[pred_index],
            "prediction_index": pred_index,
            "confidence": float(probabilities[pred_index]),
            "all_probabilities": {
                wine_classes[i]: float(p) for i, p in enumerate(probabilities)
            }
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
```
The /predict endpoint does the heavy lifting. It takes the input features, converts them to a NumPy array, scales them using our saved scaler, and makes a prediction. We return not just the prediction, but also the confidence score and probabilities for all classes, which is useful for understanding how certain the model is.
You can test this locally before containerizing:
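```
uvicorn app:app --reload
```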
You can also visit http://localhost:8000/docs to see the interactive API documentation.
Step 4: Creating the Requirements File
Before we containerize, we need to list all Python dependencies. Create a file called requirements.txt:
```
fastapi==0.115.5
uvicorn[standard]==0.30.6
scikit-learn==1.5.2
numpy==2.1.3
pydantic==2.9.2
```
We’re pinning specific versions because dependencies can be sensitive to version changes, and we want predictable, reproducible builds.
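If you prefer to capture the exact versions installed in your virtual environment, you can also generate this file with pip (note that this pins transitive dependencies as well, so the file will be longer):

```
pip freeze > requirements.txt
```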
Step 5: Writing the Dockerfile
Now let’s get to the interesting part: writing the Dockerfile. This file tells Docker how to build an image of our application. Create a file called Dockerfile (no extension) in your project root:
```
# Use official Python runtime as base image
FROM python:3.11-slim

# Set working directory in container
WORKDIR /app

# Copy requirements first (for better caching)
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code and artifacts
COPY app.py .
COPY model.pkl .
COPY scaler.pkl .

# Expose port 8000
EXPOSE 8000

# Command to run the application
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
Let’s break this down line by line.
FROM python:3.11-slim: We start with a lightweight Python 3.11 image. The “slim” variant excludes unnecessary packages, resulting in faster builds and smaller images.
WORKDIR /app: Sets /app as our working directory. All subsequent commands run from here, and it’s where our application lives inside the container.
COPY requirements.txt .: We copy requirements first, before application code. This is a Docker best practice. If you only change your code, Docker reuses the cached layer with installed dependencies, making rebuilds much faster.
RUN pip install --no-cache-dir -r requirements.txt: Installs Python packages. The --no-cache-dir flag prevents pip from storing download cache, reducing the final image size.
COPY app.py . / COPY model.pkl . / COPY scaler.pkl .: Copies our application files and trained artifacts into the container. Each COPY creates a new layer.
EXPOSE 8000: Documents that our container listens on port 8000. Note that this doesn’t actually publish the port. That happens when we run the container with -p.
CMD […]: The command that runs when the container starts.
Step 6: Building the Docker Image
Now let’s build our Docker image. Make sure you’re in the directory with your Dockerfile and run:
```
docker buildx build -t wine-predictor:v1 .
```
Here’s what this command does: docker buildx build tells Docker to build an image using BuildKit, -t wine-predictor:v1 tags the image with a name and version (v1), and . tells Docker to look for the Dockerfile in the current directory.
You’ll see Docker execute each step in your Dockerfile. The first build takes a few minutes because it downloads the base image and installs all dependencies. Subsequent builds are much faster thanks to Docker’s layer caching.
Check that your image was created:
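```
docker images
```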
You should see your wine-predictor image listed with its size.
Step 7: Running Your Container
Let’s run a container from our image:
```
docker run -d -p 8000:8000 --name wine-api wine-predictor:v1
```
Breaking down these flags:
- -d: Runs the container in detached mode (in the background)
- -p 8000:8000: Maps port 8000 on your machine to port 8000 in the container
- --name wine-api: Gives your container a friendly name
- wine-predictor:v1: The image to run
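You can confirm the container is up and view its startup logs before testing:

```
docker ps
docker logs wine-api
```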
Your API is now running in a container! Test it:
```
curl http://localhost:8000/health
```
You should get a response showing the API is healthy.
```
{
  "status": "healthy",
  "model_loaded": true,
  "scaler_loaded": true
}
```
Step 8: Making Predictions
Let’s test our model with a real prediction. You can use curl:
```
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "alcohol": 13.2,
    "malic_acid": 2.77,
    "ash": 2.51,
    "alcalinity_of_ash": 18.5,
    "magnesium": 96.0,
    "total_phenols": 2.45,
    "flavanoids": 2.53,
    "nonflavanoid_phenols": 0.29,
    "proanthocyanins": 1.54,
    "color_intensity": 5.0,
    "hue": 1.04,
    "od280_od315_of_diluted_wines": 3.47,
    "proline": 920.0
  }'
```
You should get back a JSON response with the prediction, confidence score, and probabilities for each class.
```
{
  "prediction": "Class 1",
  "prediction_index": 1,
  "confidence": 0.97,
  "all_probabilities": {
    "Class 0": 0.02,
    "Class 1": 0.97,
    "Class 2": 0.01
  }
}
```
Step 9: (Optional) Pushing to Docker Hub
You can share your image through Docker Hub. First, create a free account at hub.docker.com if you don’t have one.
Log in to Docker Hub:
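```
docker login
```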
Enter your Docker Hub username and password when prompted.
Tag your image with your Docker Hub username:
```
docker tag wine-predictor:v1 yourusername/wine-predictor:v1
```
Replace yourusername with your actual Docker Hub username.
Push the image:
```
docker push yourusername/wine-predictor:v1
```
The first push takes a few minutes as Docker uploads all layers. Subsequent pushes are faster because Docker only uploads changed layers.
You can now pull and run your image from anywhere:
```
docker pull yourusername/wine-predictor:v1
docker run -d -p 8000:8000 yourusername/wine-predictor:v1
```
Your model is now publicly available and anyone can pull your image and run the app!
Best Practices for Building Machine Learning Docker Images
1. Use multi-stage builds to keep images small
When building images for your machine learning models, consider using multi-stage builds.
```
# Build stage
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Runtime stage
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY app.py model.pkl scaler.pkl ./
ENV PATH=/root/.local/bin:$PATH
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
Using a dedicated build stage lets you install dependencies separately and copy only the necessary artifacts into the final image. This reduces size and attack surface.
2. Avoid training models inside Docker images
Model training should happen outside of Docker. Save the trained model files and copy them into the image. This keeps builds fast, reproducible, and focused on serving, not training.
3. Use a .dockerignore file
Exclude datasets, notebooks, test artifacts, and other large or unnecessary files. This keeps the build context small and avoids unintentionally bloating the image.
```
# .dockerignore
__pycache__/
*.pyc
*.pyo
.ipynb_checkpoints/
data/
models/
logs/
.env
.git
```
4. Version your models and images
Tag images with model versions so you can roll back easily. Here’s an example:
```
docker buildx build -t wine-predictor:v1.0 .
docker buildx build -t wine-predictor:v1.1 .
```
Wrapping Up
You’re now ready to containerize your machine learning models with Docker! In this article, you learned:
- Docker basics: images, containers, Dockerfiles, layers, and caching
- Serving model predictions using FastAPI
- Writing an efficient Dockerfile for machine learning apps
- Building and running containers smoothly
Docker ensures your machine learning model runs the same way everywhere — locally, in the cloud, or on any teammate’s machine. It removes the guesswork and makes deployment consistent and reliable.
Once you’re comfortable with the basics, you can take things further with CI/CD pipelines, Kubernetes, and monitoring tools to build a complete, scalable machine learning infrastructure.
Now go ahead and containerize your model. Happy coding!