How I Slashed CI/CD Pipeline Times from 45 to 8 Minutes

Sep 22, 2023

Our developer productivity was grinding to a halt. Our main CI/CD pipeline was taking a staggering 45 minutes to run, and with 20-30 deployments a day, we were losing hours of valuable engineering time. I knew we had to fix it. Here’s the story of how I led the effort to analyze our pipeline, identify the bottlenecks, and optimize it down to just 8 minutes.

The problem had gotten so bad that developers would push code, then go get coffee, check Slack, attend a meeting, and come back to see if their build passed. Some were batching multiple changes together just to avoid waiting for the pipeline multiple times. This was destroying our ability to iterate quickly and ship features.

My Optimization Journey

graph LR
    Problem[45 Minutes<br/>Pipeline Time<br/>❌ My Problem]

    subgraph Analysis["My Analysis Phase"]
        Measure[Measure Each Stage]
        Identify[Identify Bottlenecks]
    end

    subgraph Optimization["My Optimization Strategies"]
        Parallel[Parallel Execution]
        Cache[Layer Caching]
        Smart[Smart Testing]
    end

    Result[8 Minutes<br/>Pipeline Time<br/>✅ My Result]

    Problem --> Analysis
    Analysis --> Optimization
    Optimization --> Result

    style Problem fill:#ff6b6b,color:#fff
    style Result fill:#51cf66,color:#fff

Finding the Bottlenecks

I couldn’t optimize what I couldn’t measure, so I spent a day adding timing instrumentation to every stage of our pipeline. I wanted hard data on where time was being spent.
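
The mechanics will vary by setup; as a rough sketch, in a scripted Jenkins pipeline you can wrap each stage body in a small timing helper. The timed() wrapper and the stage names below are illustrative, not our exact pipeline.

// Sketch of per-stage timing in a scripted Jenkins pipeline.
// The timed() helper and stage names are illustrative.
def timed(String name, Closure body) {
    def start = System.currentTimeMillis()
    try {
        body()
    } finally {
        def seconds = (System.currentTimeMillis() - start) / 1000
        echo "TIMING ${name}: ${seconds}s"
    }
}

node {
    stage('Install') { timed('install') { sh 'npm ci' } }
    stage('Unit Tests') { timed('test:unit') { sh 'npm run test:unit' } }
    stage('Build') { timed('build') { sh 'npm run build' } }
}

Grepping the console output for the TIMING prefix across a day of builds gives a sortable picture of where the minutes actually go.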

The results were eye-opening. Our test stage was taking 20 minutes. Our build stage was taking 18 minutes. Together, these two stages accounted for 38 of the 45 total minutes. The remaining 7 minutes were split across checkout, linting, security scanning, and deployment.

The test stage breakdown was particularly revealing. Unit tests took 6 minutes, integration tests took 8 minutes, and end-to-end tests took 6 minutes. They were running sequentially, one after another, even though they were completely independent. That was low-hanging fruit.

The build stage was spending 15 of its 18 minutes just running npm install. Every single build, we were downloading and installing the exact same dependencies. No caching. Just brute force reinstalling everything. That had to change.

My Optimization Strategies

Parallelize Everything That Can Be Parallelized

The test bottleneck had an obvious fix. Unit tests, integration tests, and E2E tests don’t depend on each other. They can run simultaneously. So why were we running them one after another?

Legacy. That’s why. Someone had written the pipeline years ago when we had one test suite, and nobody had ever questioned it as we added more suites. I modified our Jenkins pipeline to run all three test suites in parallel.

Before:

stage('Tests') {
    steps {
        sh 'npm run test:unit'
        sh 'npm run test:integration'
        sh 'npm run test:e2e'
    }
}

After:

stage('Tests') {
    parallel {
        stage('Unit') { steps { sh 'npm run test:unit' } }
        stage('Integration') { steps { sh 'npm run test:integration' } }
        stage('E2E') { steps { sh 'npm run test:e2e' } }
    }
}

This simple change dropped the test stage from 20 minutes to 8 minutes. The total time was now limited by the longest suite (integration tests at 8 minutes), not the sum of all three. The first time I saw the pipeline complete in 8 minutes instead of 20, I actually checked if something had failed. Nope. It just worked.

Cache Everything Aggressively

Our 18-minute build time was almost entirely npm install. Every single pipeline run, we’d download hundreds of megabytes of dependencies from npm, even though 99% of the time our package.json hadn’t changed.

The fix was restructuring our Dockerfile to leverage Docker layer caching properly. I moved the dependency installation step before copying the application code.

Before:

FROM node:18
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build

After:

FROM node:18
WORKDIR /app
# Copy only the manifests first so the install layer stays cached
# until package.json or package-lock.json actually changes
COPY package*.json ./
# Install from the lockfile (dev dependencies included, since npm run build needs them)
RUN npm ci
# Copying the source last means code-only changes don't invalidate the install layer
COPY . .
RUN npm run build

Now Docker would cache the npm ci layer and only re-run it when package.json or package-lock.json changed. Combined with BuildKit’s cache mounts in our CI environment, this brought our build stage from 18 minutes down to 4 minutes when dependencies hadn’t changed. On the rare occasion they did change, it still took 18 minutes, but that was maybe 5% of builds.
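
The BuildKit piece can be wired in from the CI side. Here is a rough sketch of what that looks like as a Jenkins stage, with registry.example.com/app standing in for the real image name; treat it as an illustration rather than our exact config.

stage('Docker Build') {
    environment {
        DOCKER_BUILDKIT = '1'   // enable BuildKit for the docker CLI
    }
    steps {
        // Build with inline cache metadata and seed the layer cache from
        // the previously pushed image, so fresh runners can reuse layers.
        sh '''
            docker build \
              --build-arg BUILDKIT_INLINE_CACHE=1 \
              --cache-from registry.example.com/app:latest \
              -t registry.example.com/app:latest .
            docker push registry.example.com/app:latest
        '''
    }
}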

Only Test What Changed

Even with parallelization and caching, I knew we could do better for developer branches. Running the full test suite on every commit to a feature branch was overkill. If you changed one file in the authentication module, you didn’t need to run tests for the payment module.

I configured Jest to run only tests affected by changed files on pull request branches.

# Run tests only on changed files
jest --changedSince=origin/main

This was huge for developer iteration speed. On feature branches, test time often dropped to under a minute. We still ran the full suite on main branch merges to catch any integration issues, but developers got fast feedback during development.
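
One way to wire that branch split into the Jenkinsfile is a simple check on the branch name. This is a sketch rather than our exact configuration; it assumes a multibranch pipeline (which sets BRANCH_NAME) and an npm test script that runs the full suite.

stage('Tests') {
    steps {
        script {
            // BRANCH_NAME is provided by multibranch pipelines
            if (env.BRANCH_NAME == 'main') {
                // run everything before anything lands on main
                sh 'npm test'
            } else {
                // feature branches: only tests affected by files changed vs main
                sh 'npx jest --changedSince=origin/main'
            }
        }
    }
}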

Spend Money to Save Time

I ran the numbers on our CI runner costs and had an uncomfortable realization. We were using the smallest, cheapest runners available. A build would sit in queue for 30 seconds waiting for a runner, then run slowly on an underpowered machine. We were being penny-wise and pound-foolish.

I upgraded our runners to machines with 4x the CPU and memory. The cost went up by about $200/month, but at 25-30 builds a day, the time saved paid for the upgrade within the first week. The bigger runners also meant we could run more tests in parallel, which compounded the speedup.
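
Switching runners was mostly an infrastructure change; in the Jenkinsfile it comes down to pointing the pipeline at the bigger agent pool. The ci-xlarge label below is a hypothetical example, not our actual node label.

pipeline {
    // 'ci-xlarge' is a hypothetical label for the larger runner pool
    agent { label 'ci-xlarge' }
    stages {
        stage('Build') {
            steps { sh 'npm run build' }
        }
    }
}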

What We Achieved

After implementing all these optimizations over about two weeks, our pipeline went from 45 minutes to 8 minutes. That’s an 82% reduction in wait time.

The impact on developer productivity was immediate and dramatic. We were running 25-30 builds per day. At 45 minutes each, that was 18-22 hours of total wait time. At 8 minutes each, it dropped to 3-4 hours. We got back 15+ hours of productive engineering time every single day.

pie title My Pipeline Time - After (8 min, 82% Reduction)
    "Build (Optimized)" : 3
    "Test (Parallel)" : 2.5
    "Docker Build (Cached)" : 1.5
    "Checkout (Cached)" : 0.5
    "Deploy" : 0.5

What I Learned

Measure everything before optimizing anything. I see teams try to optimize CI/CD based on gut feel, and it never works. Instrument your pipeline, collect data, find the actual bottlenecks. I was surprised that linting and security scanning (which I thought were slow) only took 2 minutes total. I would have wasted time optimizing the wrong things.

Parallelize ruthlessly. If stages don’t depend on each other, run them in parallel. This was the single biggest win for us. Most pipelines have lots of parallelization opportunities that nobody has taken advantage of.

Caching is not optional at scale. Every layer, every dependency, every build artifact that can be cached should be cached. The speedup compounds over time.

Don’t cheap out on CI infrastructure. Fast runners cost money, but slow pipelines cost more in lost developer productivity. Do the math. The ROI on upgrading runners is usually obvious.

Test smarter on feature branches, test everything on main. Developers need fast feedback during iteration, but you still need comprehensive testing before merging. This balance works well.

The psychological impact of fast pipelines is underestimated. Developers stopped batching changes. They pushed code more frequently because the feedback loop was fast. This led to smaller, easier-to-review pull requests and fewer merge conflicts. The benefits cascaded beyond just time savings.
