How to Cache Python Pip Requirements for Reliable Docker Builds

You’re working on a Python project, and every time you run docker compose up --build, the RUN pip install -r requirements.txt step fails halfway because your internet connection is slower than a dial-up modem. When you retry, Docker starts from scratch and redownloads every package. Frustrating? Absolutely.

Why does this happen?

  1. Docker’s Layer Caching: A layer is only cached once its step succeeds. When a step (like pip install) fails, nothing is committed, so the next build re-runs that step, and everything after it, from scratch.
  2. No Persisted Pip Cache: Pip does cache downloads (to /root/.cache/pip), but that cache lives inside the build container, so it’s discarded along with the failed layer. Every failure means starting over (see the sketch below).
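
To make the failure mode concrete, the kind of Dockerfile that triggers it usually looks something like this (a minimal sketch; app.py is a placeholder entry point):

FROM python:3.11
WORKDIR /app

# Copying the whole project first means any source change invalidates this layer
COPY . .

# ...so this step re-runs (and redownloads everything) far more often than it
# should, and a mid-download failure throws away all progress
RUN pip install -r requirements.txt

CMD ["python", "app.py"]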

Cache Packages or Go Offline

Here’s how I solved this for good, and how you can too.

Use Docker BuildKit Cache Mounts

Modern Docker builds (BuildKit) can persist pip’s cache across runs.

# syntax=docker/dockerfile:1.4
FROM python:3.11

ENV PYTHONUNBUFFERED=1
WORKDIR /app

# Copy requirements first to leverage Docker layer caching
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

COPY . .

CMD celery -A myapp worker -l info -Q ${CELERY_QUEUE}

What’s happening here?

  • --mount=type=cache,target=/root/.cache/pip: This tells Docker to reuse the pip cache directory across builds.
  • Even if the build fails, downloaded packages are retained for the next attempt.
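
One refinement worth knowing about: if several builds can run on the same machine at once (parallel CI jobs, for instance), BuildKit can serialize access to the cache mount via the sharing=locked option, which is arguably safer for pip’s cache. A small variation on the step above:

RUN --mount=type=cache,target=/root/.cache/pip,sharing=locked \
    pip install -r requirements.txt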

Run it with BuildKit enabled:

DOCKER_BUILDKIT=1 docker compose up --build

Why this works:

  • BuildKit’s cache mounts persist the pip cache between builds.
  • Subsequent builds skip downloading packages already in the cache.
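
If you build images directly instead of through Compose, the same idea applies to docker build (on Docker Engine 23.0 and later, BuildKit is already the default builder, so the variable is only needed on older versions; the image tag is a placeholder):

DOCKER_BUILDKIT=1 docker build -t myapp .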

Offline Installs with Pre-Downloaded Packages

No internet? No problem. Pre-download packages and install them offline.

Download packages locally
On your host machine:

pip download -r requirements.txt -d ./pip_packages

This creates a pip_packages folder with all dependencies.
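
One caveat before you rely on this: pip download fetches wheels for your host platform by default. If you develop on macOS or Windows but the container runs Linux, ask pip for Linux wheels explicitly. A sketch, assuming all your dependencies ship wheels (--only-binary=:all: will reject source-only packages):

pip download -r requirements.txt -d ./pip_packages \
    --platform manylinux2014_x86_64 \
    --python-version 311 \
    --implementation cp \
    --only-binary=:all: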

Modify the Dockerfile

FROM python:3.11

ENV PYTHONUNBUFFERED=1
WORKDIR /app

# Copy pre-downloaded packages
COPY pip_packages /pip_packages
COPY requirements.txt .

# Install from local directory (no internet!)
RUN pip install --no-index --find-links=/pip_packages -r requirements.txt

COPY . .

CMD celery -A myapp worker -l info -Q ${CELERY_QUEUE}

Key flags:

  • --no-index: Skip PyPI.
  • --find-links=/pip_packages: Use the local directory for packages.
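
Before baking the folder into an image, it’s worth a quick sanity check: install from it into a throwaway virtualenv on the host (paths here are just examples):

python -m venv /tmp/offline-test
/tmp/offline-test/bin/pip install --no-index --find-links=./pip_packages -r requirements.txt

If that succeeds on a Linux host, the Docker build should too.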

Perfect for:

  • Airplane coding.
  • Rural internet (or no internet).

Hybrid Caching for Best Results

Combine BuildKit caching with a local package fallback:

# syntax=docker/dockerfile:1.4
FROM python:3.11

ENV PYTHONUNBUFFERED=1
WORKDIR /app

COPY requirements.txt .

# Try a normal install first, reusing the BuildKit cache
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt || true  # Don't fail the build if the network flakes out

# Fall back to the pre-downloaded packages for anything still missing
COPY pip_packages /pip_packages
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --no-index --find-links=/pip_packages -r requirements.txt

COPY . .

CMD celery -A myapp worker -l info -Q ${CELERY_QUEUE}

Why this rocks:

  • The speed of BuildKit caching plus the reliability of local packages.
  • A flaky download no longer kills the build; the offline step finishes the install (proof below).
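
Want proof the fallback works? Build with networking disabled for the RUN steps (assuming the python:3.11 base image is already pulled locally; the tag is a placeholder):

DOCKER_BUILDKIT=1 docker build --network=none -t myapp .

The online install fails fast, || true keeps the build alive, and the offline step finishes the job.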

Final Thoughts

  1. Use BuildKit if you control the build environment. It’s seamless and fast.
  2. Pre-download packages for offline scenarios or flaky networks.
  3. The hybrid approach is gold for mission-critical builds.

Pro Tips:

  • Always pin versions in requirements.txt (e.g., requests==2.31.0) to avoid surprises.
  • For teams, set up a private PyPI mirror (like devpi) for blazing-fast, reliable builds; see the sketch below.
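
If you do stand up a mirror, pointing pip at it is a one-liner in the Dockerfile. A sketch assuming devpi-server’s defaults (port 3141 and the root/pypi mirror index); devpi.internal is a placeholder for your own host:

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --index-url http://devpi.internal:3141/root/pypi/+simple/ \
        -r requirements.txt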

By caching pip’s downloads or going fully offline, I turned my Docker builds from a hair-pulling ordeal into a smooth process. Now I can finally focus on coding, not waiting for packages to download.
