Overview
You've trained a great model. Now what? This tutorial shows how to serve it as a REST API with FastAPI and package the service in a Docker container for deployment.
Prerequisites
- A trained model (we'll use a scikit-learn classifier as an example)
- Python 3.10+
- Docker installed
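Install the Python dependencies, and list the same four packages in a requirements.txt file, which the Dockerfile in Step 3 installs from:
pip install fastapi uvicorn scikit-learn joblib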
Step 1: Save Your Model
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier().fit(X, y)
joblib.dump(model, 'model.joblib')
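Because the container built later copies model.joblib as an opaque file, it can be useful to record a fingerprint of the artifact at save time so you can tell which model a given image contains. A minimal sketch using only the standard library; file_sha256 is a hypothetical helper, not part of joblib or scikit-learn:

```python
import hashlib
from pathlib import Path


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks so a large model file is never fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling file_sha256('model.joblib') right after joblib.dump gives a string you could log or store next to the model file.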
Step 2: Create the FastAPI App
# app.py
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np
app = FastAPI(title='ML Model API')
model = joblib.load('model.joblib')
class PredictRequest(BaseModel):
    features: list[float]

class PredictResponse(BaseModel):
    prediction: int
    confidence: float

@app.post('/predict', response_model=PredictResponse)
def predict(req: PredictRequest):
    X = np.array(req.features).reshape(1, -1)
    pred = model.predict(X)[0]
    proba = model.predict_proba(X).max()
    return PredictResponse(prediction=int(pred), confidence=float(proba))

@app.get('/health')
def health():
    return {'status': 'ok'}
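The predict handler calls the model twice, once for the label and once for the probabilities. For the random forest used here, the predicted label is the class with the highest probability, so both values can come from a single probability row. A plain-Python sketch of that relationship; top_class is a hypothetical helper and the probabilities are made-up numbers, not output from the trained model:

```python
from typing import Sequence, Tuple


def top_class(probabilities: Sequence[float], classes: Sequence[int]) -> Tuple[int, float]:
    """Return (label, confidence) for one row of class probabilities."""
    if len(probabilities) != len(classes):
        raise ValueError("probabilities and classes must have the same length")
    # Index of the largest probability; ties resolve to the first maximum.
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return classes[best], float(probabilities[best])


# Made-up probability row for the three iris classes (0, 1, 2).
label, confidence = top_class([0.05, 0.90, 0.05], classes=[0, 1, 2])
print(label, confidence)  # 1 0.9
```

In app.py, the rough equivalent would be to take the argmax of model.predict_proba(X)[0] and use it to index model.classes_.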
Run locally:
uvicorn app:app --reload
Step 3: Write the Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model.joblib .
COPY app.py .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
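The Dockerfile installs from requirements.txt. One way to generate that file with versions pinned to your local environment, sketched with the standard library's importlib.metadata; pin_requirements is a hypothetical helper, and the package list is assumed to match the prerequisites:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Iterable, List


def pin_requirements(packages: Iterable[str]) -> List[str]:
    """Return requirement lines pinned to the versions installed locally."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed here: leave it unpinned rather than guess a version.
            lines.append(name)
    return lines


# The four packages installed in the prerequisites.
PACKAGES = ["fastapi", "uvicorn", "scikit-learn", "joblib"]
print("\n".join(pin_requirements(PACKAGES)))
```

Redirecting the script's output to requirements.txt gives the file the COPY instruction expects. Pinning keeps the image's library versions in line with the ones the model was trained and saved with.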
Step 4: Build and Run
docker build -t ml-api .
docker run -p 8000:8000 ml-api
Test with curl:
curl -X POST http://localhost:8000/predict \
  -H 'Content-Type: application/json' \
  -d '{"features": [5.1, 3.5, 1.4, 0.2]}'
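If you would rather test from Python than from curl, the same request can be built with the standard library. A minimal sketch; build_predict_request and predict are hypothetical helper names, and the URL assumes the container is published on localhost:8000 as above:

```python
import json
import urllib.request


def build_predict_request(features, base_url="http://localhost:8000"):
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, base_url="http://localhost:8000", timeout=5.0):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the container running, predict([5.1, 3.5, 1.4, 0.2]) should return a dict with the prediction and confidence keys defined by PredictResponse.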
Step 5: Production Considerations
- Add input validation and error handling
- Use multi-stage Docker builds to reduce image size
- Add logging and monitoring (Prometheus metrics)
- Set up CI/CD to auto-build and deploy on push
- Use GPU-enabled base images for deep learning models
- Consider model versioning with a registry
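As an illustration of the first item: the app as written accepts a feature list of any length, and a wrong-length list makes model.predict raise, which surfaces as a 500 error. A minimal validation sketch in plain Python; validate_features and EXPECTED_FEATURES are hypothetical names, and the count of four assumes the iris model from Step 1:

```python
import math
from typing import List, Sequence

EXPECTED_FEATURES = 4  # assumption: the iris dataset has four features per sample


def validate_features(features: Sequence[float], expected: int = EXPECTED_FEATURES) -> List[float]:
    """Return the features as floats, or raise ValueError describing the problem."""
    if len(features) != expected:
        raise ValueError(f"expected {expected} features, got {len(features)}")
    cleaned = [float(value) for value in features]
    # Reject NaN and infinity, which the model cannot score meaningfully.
    if not all(math.isfinite(value) for value in cleaned):
        raise ValueError("features must be finite numbers (no NaN or infinity)")
    return cleaned
```

Inside the predict handler, one option is to catch the ValueError and raise fastapi.HTTPException(status_code=422, detail=str(exc)) so that clients get a clear error message instead of a server error.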
Next Steps
- Add authentication with API keys
- Deploy to AWS ECS, GCP Cloud Run, or Kubernetes
- Add batch prediction endpoints
- Implement A/B testing between model versions
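As a starting point for the first item, the core of API-key authentication is a comparison that does not leak timing information. A minimal sketch with the standard library; is_valid_api_key is a hypothetical helper, and how keys are stored and which header carries them (for example X-API-Key) are assumptions left to your deployment:

```python
import hmac
from typing import Iterable, Optional


def is_valid_api_key(presented: Optional[str], valid_keys: Iterable[str]) -> bool:
    """Return True if the presented key matches one of the configured keys."""
    if not presented:
        return False
    presented_bytes = presented.encode("utf-8")
    matched = False
    for key in valid_keys:
        # compare_digest is designed to avoid revealing where the inputs differ via timing.
        if hmac.compare_digest(presented_bytes, key.encode("utf-8")):
            matched = True
    return matched
```

In the FastAPI app this check would typically sit in a dependency that reads the header and raises HTTPException(status_code=401) when it returns False.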