I love the power of Kubernetes, but I don’t always love the operational overhead of managing clusters. For many of our services (APIs, webhooks, and event-driven workers), I found myself wanting a simpler solution. That’s when I discovered Google Cloud Run. For me, it’s the perfect middle ground: all the power of running containers, but without any of the cluster management. It’s like a fully managed, serverless Kubernetes.
Let me be honest about what drove me to Cloud Run. We had a GKE cluster running 15 microservices. Most of them handled maybe 10 requests per minute. We were paying for multiple nodes 24/7 to run services that were idle 95% of the time. The ops overhead was killing us: Kubernetes upgrades, node pool management, and monitoring all took time. And for what? Services that could run just as well on a simpler platform.
When Cloud Run Makes Sense
I’ve developed a simple decision tree for choosing compute platforms. For simple, event-triggered functions under 512MB of memory, I use Cloud Functions. For complex microservices that need StatefulSets, DaemonSets, or fine-grained network policies, I stick with GKE. But for the vast majority of containerized, stateless web services, Cloud Run is my default choice.
The biggest advantage is the massive reduction in operational work. Google handles all the scaling, all the infrastructure, all the maintenance. They provide zero-downtime deployments out of the box. And the pricing model is compelling: you pay only for actual request time, not idle capacity. I’ve seen services that cost $200/month on GKE cost $8/month on Cloud Run.
```mermaid
graph TB
    Decision{My Use Case?}
    Decision -->|APIs, Webhooks<br/>Event-driven| CloudRun[Cloud Run<br/>✓ My default choice]
    Decision -->|Complex microservices<br/>StatefulSets, DaemonSets| GKE[GKE<br/>I use full Kubernetes]
    Decision -->|Simple functions<br/>< 512MB, Node/Python| CloudFunctions[Cloud Functions<br/>A simpler option]
    CloudRun --> Benefits[My Gains:<br/>- No cluster management<br/>- Scale to zero<br/>- Pay per use<br/>- Auto HTTPS]
    style CloudRun fill:#4285f4,color:#fff
    style Benefits fill:#e8f5e9,stroke:#4caf50
```
The downsides exist but are manageable. You give up some control compared to running your own Kubernetes cluster. Cold start latency can be an issue, though I mitigate this by setting minScale to at least 1 for production services. And you’re locked into GCP, but honestly, we were already committed to GCP anyway.
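That cold-start mitigation is a single Knative autoscaling annotation on the service spec; a minimal fragment:

```yaml
# Keep at least one warm instance so requests never wait on container startup.
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: '1'
```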
My Production Template
I created a standard service template after deploying my tenth Cloud Run service and realizing I was copy-pasting the same configuration over and over. This template is my production-ready starting point with all the lessons I’ve learned baked in.
```yaml
# template-service.yml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: ${SERVICE_NAME}
  annotations:
    run.googleapis.com/ingress: internal-and-cloud-load-balancing
spec:
  template:
    metadata:
      annotations:
        run.googleapis.com/sessionAffinity: 'true'
        run.googleapis.com/vpc-access-egress: all-traffic
        run.googleapis.com/execution-environment: gen2
        autoscaling.knative.dev/minScale: '1'
        autoscaling.knative.dev/maxScale: '100'
        run.googleapis.com/vpc-access-connector: >-
          projects/my-project/locations/asia-southeast2/connectors/my-vpc-connector
    spec:
      containerConcurrency: 80
      timeoutSeconds: 300
      serviceAccountName: ${SERVICEACCOUNT_NAME}
      containers:
        - image: '${IMAGE_NAME}'
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: '2'
              memory: 2Gi
          livenessProbe:
            httpGet:
              path: ${HEALTHCHECK_ENDPOINT}
```
The key decisions in this template:

- Gen2 execution environment for better performance (it’s noticeably faster than Gen1).
- minScale of 1 for production to avoid cold starts (users don’t wait for your container to start).
- maxScale of 100 to prevent runaway costs if something goes wrong.
- VPC connector so services can securely access our Cloud SQL databases and other private resources.
I learned the VPC connector lesson the hard way. I deployed a service that needed database access but forgot to configure the VPC connector. The service couldn’t reach the database and just returned 500 errors. Took me an embarrassingly long time to figure out why.
Automation Because Manual Deployment Is Pain
I automated deployments after manually deploying services five times and making mistakes every time. I created shell scripts that read service definitions from a CSV file and generate manifests from the template.
```csv
# services.csv
service-name,image-name,serviceaccount,healthcheck-endpoint
api-service,asia.gcr.io/my-project/api-service:1.0.0,cloudrun-sa@my-project.iam.gserviceaccount.com,/health
worker-service,asia.gcr.io/my-project/worker:1.2.0,cloudrun-sa@my-project.iam.gserviceaccount.com,/healthz
```
The generate-manifests.sh script reads the CSV and uses envsubst to fill in the template variables. The deploy-services.sh script takes those manifests and deploys them with gcloud run services replace. It also sets IAM policies to control who can invoke the services.
I run all of this in CI/CD, obviously. GitHub Actions builds the container, pushes it to GCR, then deploys it to Cloud Run. The whole process takes about 3 minutes from git push to new version live in production.
```yaml
# .github/workflows/deploy.yml
name: Deploy to Cloud Run
on:
  push:
    branches: [ main ]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup Cloud SDK
        uses: google-github-actions/setup-gcloud@v0
        with:
          service_account_key: ${{ secrets.GCP_SA_KEY }}
          project_id: ${{ secrets.GCP_PROJECT_ID }}
      - name: Build and Deploy
        run: |
          gcloud builds submit --tag gcr.io/${{ secrets.GCP_PROJECT_ID }}/api-service:${{ github.sha }}
          gcloud run deploy api-service \
            --image gcr.io/${{ secrets.GCP_PROJECT_ID }}/api-service:${{ github.sha }} \
            --region asia-southeast2 \
            --platform managed \
            --allow-unauthenticated
```
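For deployments outside CI, the deploy-services.sh loop can be sketched roughly like this; the region and invoker service account are placeholders, not our real values:

```shell
#!/usr/bin/env bash
# deploy-services.sh -- rough sketch of the manifest deploy loop.
set -euo pipefail

deploy_services() {
  local region="$1" manifest_dir="$2"
  local manifest service
  for manifest in "$manifest_dir"/*.yml; do
    service="$(basename "$manifest" .yml)"
    # Apply the rendered Knative manifest to Cloud Run.
    gcloud run services replace "$manifest" --region "$region"
    # Restrict invocation to a single service account (placeholder member).
    gcloud run services add-iam-policy-binding "$service" \
      --region "$region" \
      --member "serviceAccount:invoker@my-project.iam.gserviceaccount.com" \
      --role "roles/run.invoker"
  done
}

# Usage: deploy_services asia-southeast2 manifests
```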
What I Learned
Cloud Run simplified our infrastructure dramatically. We went from managing GKE clusters for 15 low-traffic services to just deploying containers and letting Google handle everything else. The operational burden dropped to almost zero.
Always use Gen2 execution environment. The performance difference is real. Always set minScale to at least 1 for production services unless you’re okay with cold start delays. Always use a VPC connector if you need to access private resources like databases.
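Those same defaults can be applied with `gcloud` flags instead of a YAML template; a sketch, with placeholder project, connector, and region names:

```shell
# Sketch: the template's key settings expressed as gcloud run deploy flags.
deploy_with_defaults() {
  local service="$1" image="$2"
  gcloud run deploy "$service" \
    --image "$image" \
    --execution-environment gen2 \
    --min-instances 1 \
    --max-instances 100 \
    --vpc-connector my-vpc-connector \
    --region asia-southeast2 \
    --no-allow-unauthenticated
}

# Usage: deploy_with_defaults api-service gcr.io/my-project/api-service:1.0.0
```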
Treat Cloud Run configurations as code. My template-and-CSV approach lets me deploy new services in minutes and ensures consistency across all services. The automation through CI/CD means deployments are repeatable and low-risk.
For stateless applications that run in containers, Cloud Run is now the first tool I reach for. It’s not appropriate for everything, but for the 80% use case of “I have a containerized API that needs to scale,” it’s perfect.
Related Articles
- GKE Cluster Operations - When you need full Kubernetes
- DevOps Utilities - Cloud Run CI/CD automation
- GCP KMS Setup - Secure secrets with KMS