You’re working on a Python project, and every time you run docker compose up --build, the RUN pip install -r requirements.txt step fails halfway because your internet connection is slower than a dial-up modem. When you retry, Docker starts from scratch, re-downloading every package again. Frustrating? Absolutely.
Why does this happen?
- Docker’s Layer Caching: If a step like pip install fails, no layer is committed, so the next build has to re-run that step, and everything after it, from scratch.
- No Persisted Pip Cache: By default, nothing pip downloads survives between builds. Every failure means starting over. (A bare-bones Dockerfile that hits both problems is sketched below.)
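For reference, this is roughly the kind of Dockerfile the rest of this post assumes as a starting point; the Celery worker command and the myapp name are just placeholders mirroring the later examples, not anything specific to your project:

FROM python:3.11

ENV PYTHONUNBUFFERED=1
WORKDIR /app

COPY requirements.txt .

# Plain install: if this dies halfway through a download, nothing is kept
RUN pip install -r requirements.txt

COPY . .

CMD celery -A myapp worker -l info -Q ${CELERY_QUEUE}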
Cache Packages or Go Offline
Here’s how I solved this for good, and how you can too.
Use Docker BuildKit Cache Mounts
Modern Docker builds (BuildKit) can persist pip’s cache across runs.
# syntax=docker/dockerfile:1.4
FROM python:3.11

ENV PYTHONUNBUFFERED=1
WORKDIR /app

# Copy requirements first to leverage Docker layer caching
COPY requirements.txt .

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

COPY . .

CMD celery -A myapp worker -l info -Q ${CELERY_QUEUE}
What’s happening here?
- --mount=type=cache,target=/root/.cache/pip: tells BuildKit to mount a persistent cache directory at pip’s default cache location and reuse it across builds.
- Because that cache lives outside the image layers, downloaded packages are retained for the next attempt even if the build fails.
Run it with BuildKit enabled (more on making BuildKit the default below):
DOCKER_BUILDKIT=1 docker compose up --build
Why this works:
- BuildKit’s cache mounts persist the pip cache between builds.
- Subsequent builds skip downloading packages already in the cache.
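One practical note: recent Docker Engine releases (roughly 23.0 and newer) use BuildKit by default, so the DOCKER_BUILDKIT=1 prefix is often redundant. On older setups you can enable it per shell or daemon-wide; a rough sketch:

# Per-shell, for older Docker versions:
export DOCKER_BUILDKIT=1
# docker-compose v1 also needs the CLI builder:
export COMPOSE_DOCKER_CLI_BUILD=1

# Daemon-wide (Linux): add {"features": {"buildkit": true}} to /etc/docker/daemon.json,
# then restart Docker:
# sudo systemctl restart docker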
Offline Installs with Pre-Downloaded Packages
No internet? No problem. Pre-download packages and install them offline.
Download packages locally
On your host machine:
pip download -r requirements.txt -d ./pip_packages
This creates a pip_packages folder with all dependencies (with one platform caveat, covered below).
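The caveat: pip download resolves against the interpreter and platform it runs on, so wheels downloaded on, say, a macOS host may not install inside the Linux-based python:3.11 image. If your host differs from the image, you can target the image’s platform explicitly; a rough sketch (wheel-only, since pip requires --only-binary=:all: when a platform is specified, and the exact manylinux tag depends on your dependencies):

pip download -r requirements.txt -d ./pip_packages \
    --platform manylinux2014_x86_64 \
    --python-version 3.11 \
    --implementation cp \
    --only-binary=:all: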
Modify the Dockerfile
FROM python:3.11

ENV PYTHONUNBUFFERED=1
WORKDIR /app

# Copy pre-downloaded packages
COPY pip_packages /pip_packages
COPY requirements.txt .

# Install from local directory (no internet!)
RUN pip install --no-index --find-links=/pip_packages -r requirements.txt

COPY . .

CMD celery -A myapp worker -l info -Q ${CELERY_QUEUE}
Key flags:
- --no-index: Skip PyPI entirely.
- --find-links=/pip_packages: Use the local directory for packages. (You can sanity-check the offline install outside Docker first; see below.)
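Before baking this into an image, it’s worth a quick dry run on the host in a throwaway virtual environment; a minimal sketch, assuming python3.11 is available locally:

# Create a scratch venv and install purely from the local folder
python3.11 -m venv /tmp/offline-test
/tmp/offline-test/bin/pip install --no-index --find-links=./pip_packages -r requirements.txt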
Perfect for:
- Airplane coding.
- Rural internet (or no internet).
Hybrid Caching for Best Results
Combine BuildKit caching with a local package fallback:
# syntax=docker/dockerfile:1.4
FROM python:3.11

ENV PYTHONUNBUFFERED=1
WORKDIR /app

COPY requirements.txt .

# Try using BuildKit cache first
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt || true  # Don't fail if cache is incomplete

# Fallback to local packages
COPY pip_packages /pip_packages
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --no-index --find-links=/pip_packages -r requirements.txt

COPY . .

CMD celery -A myapp worker -l info -Q ${CELERY_QUEUE}
Why this rocks:
- Speed of BuildKit caching plus the reliability of local packages.
- If the online install fails partway, the offline fallback still finishes the build from pip_packages.
- The only maintenance is refreshing pip_packages whenever you’re online (a helper sketch follows).
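A tiny helper along these lines (hypothetical script name, nothing fancy) keeps the offline folder in sync whenever you do have connectivity:

#!/usr/bin/env bash
# refresh_pip_packages.sh -- re-download the offline package set while online
set -euo pipefail

# Start from a clean folder so it always mirrors requirements.txt exactly
rm -rf ./pip_packages
pip download -r requirements.txt -d ./pip_packages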
Final Thoughts
- Use BuildKit if you control the build environment. It’s seamless and fast.
- Pre-download packages for offline scenarios or flaky networks.
- The hybrid approach is gold for mission-critical builds.
Pro Tips:
- Always pin versions in requirements.txt (e.g., requests==2.31.0) to avoid surprises.
- For teams, set up a private PyPI mirror (like devpi) for blazing-fast, reliable builds; pointing pip at it is simple, as sketched below.
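A rough sketch of what that looks like, assuming a devpi instance already running at devpi.internal:3141 (a hypothetical host; /root/pypi/+simple/ is devpi’s default PyPI mirror index):

# One-off install against the mirror
# (plain http may also need --trusted-host devpi.internal)
pip install --index-url http://devpi.internal:3141/root/pypi/+simple/ -r requirements.txt

# Or set it for everything, since pip reads PIP_INDEX_URL from the environment
export PIP_INDEX_URL=http://devpi.internal:3141/root/pypi/+simple/

# To use it inside the Docker build, declare "ARG PIP_INDEX_URL" in the Dockerfile
# and pass it through at build time:
docker compose build --build-arg PIP_INDEX_URL=$PIP_INDEX_URL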
By caching pip’s downloads or going fully offline, I turned my Docker builds from a hair-pulling ordeal into a smooth process. Now I can finally focus on coding, not waiting for packages to download.