Deploying ML Models with FastAPI and Docker

Mohammed Gamal Ragab

Overview

You've trained a great model. Now what? This tutorial shows how to serve it as a REST API with FastAPI and package the service in a Docker container for deployment.

Prerequisites

A trained model (we'll use a scikit-learn classifier as an example)
Python 3.10+
Docker installed

pip install fastapi uvicorn scikit-learn joblib

Step 1: Save Your Model

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

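# Train a small example classifier on the Iris dataset (4 features, 3 classes).
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier().fit(X, y)

# Serialize the fitted model so the API can load it at startup.
joblib.dump(model, 'model.joblib')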
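
Before building the API, it can be worth confirming that the saved file loads and reproduces the in-memory model's predictions. The following is a minimal sketch, assuming the same Iris classifier as above. The temporary directory and the `random_state=0` seed are illustrative choices, not part of the original steps.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
# random_state is an assumption added here so repeated runs train the same model.
model = RandomForestClassifier(random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.joblib"  # throwaway location used only for this check
    joblib.dump(model, path)
    reloaded = joblib.load(path)

# The reloaded model should give identical labels and class probabilities.
same_labels = np.array_equal(model.predict(X), reloaded.predict(X))
same_probas = np.allclose(model.predict_proba(X), reloaded.predict_proba(X))
print(same_labels, same_probas)
```

If either check fails, the usual suspect is a difference in library versions between the environment that saved the model and the one that loads it.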

Step 2: Create the FastAPI App

# app.py
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI(title='ML Model API')
model = joblib.load('model.joblib')

class PredictRequest(BaseModel):
    features: list[float]

class PredictResponse(BaseModel):
    prediction: int
    confidence: float

@app.post('/predict', response_model=PredictResponse)
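def predict(req: PredictRequest):
    # scikit-learn expects a 2D array of shape (n_samples, n_features).
    X = np.array(req.features).reshape(1, -1)
    pred = model.predict(X)[0]
    # Confidence is the highest class probability for this single sample.
    proba = model.predict_proba(X).max()
    return PredictResponse(prediction=int(pred), confidence=float(proba))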

@app.get('/health')
def health():
    return {'status': 'ok'}

Run locally (the server listens on http://127.0.0.1:8000 by default, with interactive API docs at /docs):

uvicorn app:app --reload

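Step 3: Write the Dockerfile

The Dockerfile below copies a requirements.txt file, so create one next to app.py that lists the packages from the Prerequisites (fastapi, uvicorn, scikit-learn, joblib), one per line. Pinning scikit-learn to the version used in Step 1 avoids version-mismatch warnings when the model is loaded inside the container.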

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY model.joblib .
COPY app.py .

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Step 4: Build and Run

docker build -t ml-api .
docker run -p 8000:8000 ml-api

Test with curl:

curl -X POST http://localhost:8000/predict \
  -H 'Content-Type: application/json' \
  -d '{"features": [5.1, 3.5, 1.4, 0.2]}'
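
The same request can be sent from Python using only the standard library. This is a sketch under the assumption that the container from Step 4 is reachable at http://localhost:8000. The helper names `build_predict_request` and `predict` are illustrative and do not come from any library.

```python
import json
import urllib.request


def build_predict_request(features, base_url="http://localhost:8000"):
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, base_url="http://localhost:8000", timeout=5.0):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request does not touch the network, so this runs without a server.
request = build_predict_request([5.1, 3.5, 1.4, 0.2])
print(request.get_method(), request.full_url)
```

With the container running, calling `predict([5.1, 3.5, 1.4, 0.2])` should return a dict with the `prediction` and `confidence` keys defined by `PredictResponse`.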

Step 5: Production Considerations

Add input validation and error handling
Use multi-stage Docker builds to reduce image size
Add logging and monitoring (Prometheus metrics)
Set up CI/CD to auto-build and deploy on push
Use GPU-enabled base images for deep learning models
Consider model versioning with a registry
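
The first item above, input validation, matters for this API because `PredictRequest` accepts a list of any length. A list with the wrong number of values makes scikit-learn raise a `ValueError`, which FastAPI reports as a 500 error. One possible check is sketched below, assuming the four-feature Iris model from Step 1. `validate_features` and `EXPECTED_FEATURES` are hypothetical names introduced for illustration.

```python
import math

EXPECTED_FEATURES = 4  # the Iris dataset has four measurements per sample


def validate_features(features, expected=EXPECTED_FEATURES):
    """Return the features as floats, or raise ValueError with a clear message."""
    if len(features) != expected:
        raise ValueError(f"expected {expected} features, got {len(features)}")
    values = [float(v) for v in features]
    # Reject NaN and infinity, which the model cannot score meaningfully.
    if not all(math.isfinite(v) for v in values):
        raise ValueError("features must be finite numbers (no NaN or infinity)")
    return values
```

Inside the `predict` endpoint, the call could be wrapped in `try`/`except ValueError` and converted to `fastapi.HTTPException(status_code=422, detail=str(exc))`, so clients receive a descriptive client error instead of a 500.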

Next Steps

Add authentication with API keys
Deploy to AWS ECS, GCP Cloud Run, or Kubernetes
Add batch prediction endpoints
Implement A/B testing between model versions
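
As a starting point for the first item, API key authentication, the core of the check is a constant-time string comparison. This is a minimal sketch rather than a complete scheme. The environment variable `ML_API_KEY` and the function name `is_valid_api_key` are hypothetical choices made for this example.

```python
import hmac
import os


def is_valid_api_key(provided, expected):
    """Compare keys in constant time so timing does not reveal partial matches."""
    if not provided or not expected:
        # A missing key on either side is never valid.
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


# ML_API_KEY is an assumed variable name; an empty default rejects every request.
EXPECTED_KEY = os.environ.get("ML_API_KEY", "")
```

In the FastAPI app, a dependency could read the key from a request header (for example `X-API-Key`, also an assumed name), call `is_valid_api_key(header_value, EXPECTED_KEY)`, and raise `HTTPException(status_code=401)` when it returns `False`.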