A Worked Example: Microservices Build Pipeline at Scale
This is a worked example based on patterns commonly seen in microservices teams running dozens of services across mixed runtimes. It is a generalized walkthrough, not the story of any specific company. Numbers are illustrative ranges that match what teams typically report when applying the techniques described here.
As organisations scale their microservices architectures, Docker build pipelines often become a bottleneck in the development process. The walkthrough shows how a team with dozens of services can rework its Docker build process to improve developer productivity and reduce infrastructure costs.
The Initial Situation
Picture an engineering organisation responsible for 50–100 microservices owned by many separate teams. The architecture typically includes services built with a mix of technologies, for example:
- A large group of Node.js services
- Java/Spring Boot services
- Python services
- Go services
- Other specialized services in mixed languages
Each service has its own repository and CI/CD pipeline, and the team deploys frequently across all services, with each deployment requiring a Docker build and push to a container registry.
The Breaking Point
Leadership recognises a problem when CI/CD costs balloon and Docker builds dominate build-minute consumption. Developers complain about waiting 15–20 minutes for pipelines to complete, even for minor changes.
Assessment and Planning
A small working group of DevOps engineers and senior developers audits the Docker build process. Common findings: unoptimised Dockerfiles, no shared caching, oversized base images, and inconsistent practices across teams.
Implementation
The team rolls out optimisations across services, starting with the most frequently updated ones. They publish standard Dockerfile templates for each runtime and update CI/CD pipelines to use BuildKit and remote caching.
Results and Ongoing Improvements
Once all services have moved to the optimised process, automation keeps new services aligned with best practices and the team monitors metrics to keep refining the approach.
Key Challenges and Solutions
Challenge #1: Slow Build Times
Build times of 8–15 minutes in CI are common in unoptimised microservices repos. Developers wait for feedback and deployments are delayed, especially for urgent fixes.
Typical underlying issues:
- Poor layer ordering causing unnecessary rebuilds
- No caching between builds in CI/CD
- Full rebuilds for minor code changes
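To see why layer ordering matters, here is a deliberately simplified Python model of the layer cache. It assumes a layer is reused unless one of its inputs changed or an earlier layer was rebuilt. The `Layer` type, the layer lists, and the file names are hypothetical, and real BuildKit caching is content-addressed and more nuanced than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One Dockerfile instruction and the source files it depends on (toy model)."""
    name: str
    inputs: frozenset


def layers_rebuilt(layers, changed_files):
    """Return the names of layers that must be rebuilt.

    Simplified rule: a layer is rebuilt if any of its inputs changed,
    and every layer after the first rebuilt one is rebuilt as well.
    """
    changed = set(changed_files)
    rebuilt = []
    invalidated = False
    for layer in layers:
        if not invalidated and layer.inputs & changed:
            invalidated = True
        if invalidated:
            rebuilt.append(layer.name)
    return rebuilt


ALL_FILES = frozenset({"package.json", "package-lock.json", "src/app.js"})
MANIFESTS = frozenset({"package.json", "package-lock.json"})

# Poor ordering: all sources are copied before dependencies are installed.
poor = [
    Layer("COPY . .", ALL_FILES),
    Layer("RUN npm ci", frozenset()),
    Layer("RUN npm run build", frozenset()),
]

# Better ordering: dependency manifests first, application code afterwards.
better = [
    Layer("COPY package*.json ./", MANIFESTS),
    Layer("RUN npm ci", frozenset()),
    Layer("COPY . .", ALL_FILES),
    Layer("RUN npm run build", frozenset()),
]

if __name__ == "__main__":
    print("poor:  ", layers_rebuilt(poor, ["src/app.js"]))
    print("better:", layers_rebuilt(better, ["src/app.js"]))
```

In this model a change to `src/app.js` alone rebuilds every layer under the poor ordering, including the dependency install, while the better ordering reuses the cached `npm ci` layer and rebuilds only the final copy and build steps.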
Solution: Optimized Dockerfile Templates
The team creates standardised Dockerfile templates for each technology stack with:
- Optimized layer ordering for better cache utilization
- Multi-stage builds to separate build and runtime dependencies
- BuildKit cache mounts for package manager caches
- Remote cache storage and retrieval in CI/CD
Example for Node.js services:
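# syntax=docker/dockerfile:1.4
# Stage 1: install all dependencies, including the devDependencies needed to build
FROM node:18-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci
# Stage 2: build the application, then prune devDependencies
FROM node:18-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build && npm prune --omit=dev
# Stage 3: minimal runtime image with production dependencies only
FROM node:18-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/dist ./dist
# Copy from builder (pruned), not deps, so devDependencies stay out of the final image
COPY --from=builder /app/node_modules ./node_modules
USER node
EXPOSE 3000
CMD ["node", "dist/main.js"]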
Challenge #2: Oversized Images
Container images are unnecessarily large, causing:
- Slower deployments due to large image pulls
- Higher storage costs in container registries
- Increased attack surface with unneeded tools
- Wasted resources in production
It is common for unoptimised Node.js images to weigh in around 1 GB and Java services to land in the high hundreds of MB before any cleanup work.
Solution: Image Size Optimization
The team implements several image size reduction techniques:
- Strict multi-stage builds with minimal final images
- Alpine-based images where appropriate
- Distroless images for Java services
- Production-only dependencies in final stage
- Removal of development tools and documentation
Example for Java services:
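# syntax=docker/dockerfile:1.4
# Build stage: full JDK plus the Gradle wrapper
FROM eclipse-temurin:17-jdk-alpine AS builder
WORKDIR /app
# Copy build scripts first so dependency resolution is cached separately from source changes
COPY gradle/ gradle/
COPY gradlew build.gradle settings.gradle ./
RUN --mount=type=cache,target=/root/.gradle \
    ./gradlew dependencies
COPY src/ src/
RUN --mount=type=cache,target=/root/.gradle \
    ./gradlew bootJar
# Runtime stage: distroless image without a shell or build tools
FROM gcr.io/distroless/java17-debian11
WORKDIR /app
# Assumes bootJar leaves a single jar in build/libs
COPY --from=builder /app/build/libs/*.jar app.jar
EXPOSE 8080
USER nonroot
ENTRYPOINT ["java", "-jar", "app.jar"]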
Challenge #3: CI/CD Inefficiencies
CI/CD pipelines are not optimised for Docker builds:
- No reuse of layer cache between pipeline runs
- Separate build and push steps causing duplication
- BuildKit features not enabled in CI
- High CI-minute consumption growing month over month
Solution: CI/CD Pipeline Optimization
The team redesigns its CI/CD pipelines to:
- Use BuildKit's remote caching in all pipelines
- Build on shared base images for common dependencies
- Store and retrieve layers through a distributed cache
- Combine the build and push steps to reduce overhead
Example GitHub Actions workflow excerpt:
steps:
  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v2
  - name: Login to Container Registry
    uses: docker/login-action@v2
    with:
      registry: ${{ env.REGISTRY }}
      username: ${{ secrets.REGISTRY_USERNAME }}
      password: ${{ secrets.REGISTRY_PASSWORD }}
  - name: Build and push
    uses: docker/build-push-action@v4
    with:
      context: .
      push: true
      tags: ${{ env.IMAGE_NAME }}:${{ env.TAG }}
      cache-from: type=registry,ref=${{ env.IMAGE_NAME }}:cache
      cache-to: type=registry,ref=${{ env.IMAGE_NAME }}:cache,mode=max
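The same registry-backed cache can be exercised outside GitHub Actions, which helps when debugging cache misses locally. The sketch below assembles a roughly equivalent `docker buildx build` invocation. The helper name `buildx_command` and the example image reference are hypothetical, and it assumes Buildx is installed and you are already logged in to the registry.

```python
import shlex


def buildx_command(image, tag, context="."):
    """Return the argv for a build-and-push that reads and writes a registry cache."""
    cache_ref = f"{image}:cache"
    return [
        "docker", "buildx", "build",
        "--push",
        "--tag", f"{image}:{tag}",
        # Import previously exported layers from the registry.
        "--cache-from", f"type=registry,ref={cache_ref}",
        # mode=max also exports layers from intermediate build stages.
        "--cache-to", f"type=registry,ref={cache_ref},mode=max",
        context,
    ]


if __name__ == "__main__":
    # registry.example.com/orders-service is a placeholder image name.
    cmd = buildx_command("registry.example.com/orders-service", "1.4.2")
    print(shlex.join(cmd))
    # To actually run the build: subprocess.run(cmd, check=True)
```

Printing the command instead of executing it keeps the sketch free of side effects. `mode=max` matters for multi-stage Dockerfiles because the default `mode=min` only exports layers of the final image.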
Results and Impact
When the optimisations described above are applied together, the typical improvements teams report fall into ranges like the following:
| Metric | Before (typical) | After (typical) | Improvement |
|---|---|---|---|
| Average build time in CI | 10–15 minutes | 3–5 minutes | ~70% |
| Average Node.js image size | ~1 GB | ~150 MB | ~85% |
| Average Java image size | ~850 MB | ~180 MB | ~80% |
| CI compute minutes | Baseline | ~30% of baseline | ~70% |
| Deployment time | 4–5 minutes | ~2 minutes | ~60% |
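The Improvement column is a plain percentage reduction. A short sketch of the arithmetic follows; it uses midpoints of the illustrative ranges and treats "~1 GB" as 1000 MB, both of which are assumptions made here for the calculation.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# (metric, before, after), using midpoints of the ranges in the table.
rows = [
    ("Build time (min)", 12.5, 4.0),
    ("Node.js image (MB)", 1000, 150),
    ("Java image (MB)", 850, 180),
    ("Deployment time (min)", 4.5, 2.0),
]

if __name__ == "__main__":
    for metric, before, after in rows:
        print(f"{metric}: {percent_reduction(before, after):.0f}% reduction")
```

With these midpoints the reductions come out at roughly 68%, 85%, 79%, and 56%, which the table rounds to approximate figures.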
The downstream effects are usually broader than the raw numbers suggest:
- Developer productivity: Faster feedback cycles lead to more iterations and fewer context switches.
- Incident response: Time to deploy critical fixes drops noticeably.
- Infrastructure costs: CI compute is one of the easiest line items to reduce once builds are tuned.
- Security posture: Smaller, minimal production images reduce attack surface.
- Deployment reliability: Faster, more reliable deployments with fewer timeout issues.
In short, the original pipeline suffers from long build times, many redundant steps, and no layer caching between runs, whereas the optimised pipeline combines BuildKit caching, parallel builds, and shared base images to cut build times substantially.
Key Learnings and Best Practices
Several key learnings from this kind of programme generalise well to other microservices environments:
Standardise across teams
Standardised Dockerfile templates for each runtime keep builds consistent and easier to maintain. A central repository of templates that teams adapt to their specific services pays dividends as new services are created.
Standardisation also makes it easier to roll out improvements across all services and lets new services start with optimised builds from day one.
Invest in shared base images
Custom base images for each runtime, bundling common dependencies and security configuration, reduce duplication. Rebuilding these base images on a regular cadence picks up upstream patches automatically.
This pattern improves security compliance and further reduces build times by giving every service an optimised starting point.
Measure everything
Comprehensive metrics collection for the Docker build process is essential:
- Build times for each pipeline stage
- Image sizes and layer counts
- Cache-hit ratios in CI/CD
- CI minutes consumed per service
These metrics make it possible to identify bottlenecks, prioritise optimisations, and quantify improvements over time.
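As a minimal sketch of how such metrics might be aggregated, the snippet below computes a per-service cache-hit ratio and average build time from build records. The `BuildRecord` fields and the sample values are hypothetical. In practice the data would come from your CI system's API or from parsed build logs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BuildRecord:
    """One CI build; field names are illustrative, not a real CI schema."""
    service: str
    duration_s: float
    cached_steps: int
    total_steps: int


def summarise(records):
    """Return {service: {"avg_duration_s": ..., "cache_hit_ratio": ...}}."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.service].append(rec)

    summary = {}
    for service, recs in grouped.items():
        total_steps = sum(r.total_steps for r in recs)
        cached_steps = sum(r.cached_steps for r in recs)
        summary[service] = {
            "avg_duration_s": sum(r.duration_s for r in recs) / len(recs),
            # Guard against builds that reported no steps at all.
            "cache_hit_ratio": cached_steps / total_steps if total_steps else 0.0,
        }
    return summary


if __name__ == "__main__":
    sample = [
        BuildRecord("orders", 240.0, 9, 12),
        BuildRecord("orders", 180.0, 11, 12),
        BuildRecord("billing", 600.0, 2, 10),
    ]
    for service, stats in sorted(summarise(sample).items()):
        print(service, stats)
```

A service with long builds and a low cache-hit ratio, like the hypothetical `billing` entry here, is usually a good first candidate for Dockerfile and caching work.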
Pipeline architecture matters
The design of CI/CD pipelines significantly affects build performance. Useful patterns include:
- Running tests in parallel with Docker builds where possible
- Using ephemeral environments for integration tests
- Smart skipping of stages when nothing relevant has changed
- Optimising cold starts with distributed caching
Pipeline architecture improvements complement Dockerfile optimisations for maximum effect.
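Smart skipping usually comes down to mapping changed files to the stages that depend on them. The following is a minimal sketch under stated assumptions: the stage names and glob patterns are hypothetical, and the list of changed files would typically come from something like `git diff --name-only` against the target branch.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping of pipeline stages to the paths that affect them.
# Note that fnmatch-style "*" also matches "/" and so spans subdirectories.
STAGE_PATTERNS = {
    "docker-build": ["Dockerfile", "package*.json", "src/*"],
    "unit-tests": ["src/*", "test/*", "package*.json"],
    "docs": ["docs/*", "*.md"],
}


def stages_to_run(changed_files, stage_patterns=STAGE_PATTERNS):
    """Return the stages with at least one changed file matching their patterns."""
    selected = []
    for stage, patterns in stage_patterns.items():
        if any(fnmatchcase(path, pat) for path in changed_files for pat in patterns):
            selected.append(stage)
    return selected


if __name__ == "__main__":
    print(stages_to_run(["README.md"]))
    print(stages_to_run(["src/api/handler.js", "package-lock.json"]))
```

A documentation-only change selects just the `docs` stage in this sketch, so the Docker build and unit tests can be skipped. A conservative pipeline would fall back to running everything when the change list cannot be determined.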
Lessons for other organisations
Based on the patterns above, here are key recommendations for teams looking to optimise their Docker build pipelines for microservices:
- Start with Measurement: Collect baseline metrics before making changes to quantify improvements
- Prioritize High-Impact Services: Begin with the most frequently built services or those with the longest build times
- Standardize for Scale: Create templates and standards that can be applied consistently across all services
- Educate Teams: Ensure all developers understand Docker build best practices through workshops and documentation
- Automate Compliance: Implement CI checks to ensure Dockerfile best practices are followed
- Consider Total Costs: Factor in both infrastructure costs and developer time when evaluating optimizations
- Iterate and Improve: Continuously monitor metrics and refine your approach based on real-world results
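An automated compliance check can start very small. The sketch below flags three issues discussed in this walkthrough: single-stage builds, unpinned base images, and a final stage without a `USER` instruction. The rule set and the function name `check_dockerfile` are illustrative assumptions, the parsing is a line-based heuristic, and dedicated linters such as hadolint cover far more cases.

```python
def check_dockerfile(text):
    """Return a list of human-readable findings for a Dockerfile's contents."""
    findings = []
    instructions = [
        line.strip() for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]

    stage_names = set()
    last_from_index = None
    from_count = 0
    for index, line in enumerate(instructions):
        tokens = line.split()
        if tokens[0].upper() != "FROM":
            continue
        from_count += 1
        last_from_index = index
        # Skip flags such as --platform=... to find the image reference.
        args = [t for t in tokens[1:] if not t.startswith("--")]
        image = args[0] if args else ""
        is_external = image and image != "scratch" and image not in stage_names
        if is_external and "@" not in image:  # digest-pinned images are fine
            tag = image.rsplit("/", 1)[-1].partition(":")[2]
            if tag in ("", "latest"):
                findings.append(f"unpinned base image: {image}")
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2])

    if from_count < 2:
        findings.append("single-stage build: consider a multi-stage build")

    final_stage = instructions[last_from_index + 1:] if last_from_index is not None else []
    if not any(line.split()[0].upper() == "USER" for line in final_stage):
        findings.append("final stage has no USER instruction (may run as root)")

    return findings


if __name__ == "__main__":
    sample = "FROM node\nCOPY . .\nRUN npm install\nCMD [\"node\", \"index.js\"]\n"
    for finding in check_dockerfile(sample):
        print(finding)
```

In a CI job, the same function could be run against each service's Dockerfile, failing the job when the returned list is non-empty.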
Conclusion
This worked example shows that careful analysis and disciplined application of best practices can substantially improve build performance, image size, and cost for microservices teams.
The approach works because it is holistic, addressing:
- Dockerfile structure and optimization
- CI/CD pipeline architecture
- Caching strategies at multiple levels
- Standardization across teams and services
- Continuous measurement and improvement
For organisations with growing microservices architectures, investing in Docker build optimisation can yield meaningful returns in both direct cost and developer productivity. The principles and techniques demonstrated above adapt to microservices environments of any size.