Deploy to staging automatically on every merge. Gate production with a human approval. Blue/green, rolling, canary strategies. Rollback in 30 seconds. Week 3 capstone — a production-grade CD pipeline from scratch.
Stop all old instances → start new. Simple but causes downtime.
v1 ●●●● → STOP → v2 ●●●●
↑ downtime here
Rollback: redeploy v1. Use for: dev/staging, batch jobs, where downtime is acceptable.
Replace instances one by one. Mix of v1 + v2 briefly running.
v1●●●● → v1●●●v2 → v1●●v2v2 → v1v2v2v2 → v2v2v2v2
Rollback: slow (roll back instance by instance). Use for: most production deployments. K8s default.
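The rolling sequence in the diagram can be simulated in a few lines of shell. This is a toy sketch, not a deployment tool: `rolling_update` is a hypothetical helper that only prints the fleet after each replacement and touches no real instances.

```shell
#!/usr/bin/env bash
# Simulate a rolling update: replace one instance per step and print the fleet.
# Illustrative only; the function name and arguments are assumptions.
rolling_update() {
  local total=$1 old=$2 new=$3
  local replaced slot line
  for ((replaced = 0; replaced <= total; replaced++)); do
    line=""
    for ((slot = 0; slot < total; slot++)); do
      if ((slot < total - replaced)); then
        line+="$old "
      else
        line+="$new "
      fi
    done
    echo "${line% }"   # strip the trailing space
  done
}

rolling_update 4 v1 v2
```

Each printed line is one step of the rollout, so the middle lines show the mixed v1 + v2 state that the application has to tolerate.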
Two identical environments. Switch traffic instantly.
BLUE (v1) ← traffic GREEN (v2) staging
Deploy v2 to green, test it
BLUE (v1) idle GREEN (v2) ← traffic
→ Rollback: flip LB back to blue
Rollback: instant — flip load balancer. Cost: 2× infrastructure. Use for: critical services.
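One way to see why blue/green rollback is instant: cutover and rollback are the same operation. The sketch below assumes the live color is tracked in a plain variable, and `idle_color` is a hypothetical helper; a real setup would change a load balancer target instead of echoing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the live color, print the idle one.
idle_color() {
  case "$1" in
    blue)  echo "green" ;;
    green) echo "blue" ;;
    *)     echo "unknown color: $1" >&2; return 1 ;;
  esac
}

live="blue"
target=$(idle_color "$live")   # v2 is deployed and tested on the idle side
echo "deploy v2 to $target"
live="$target"                 # cutover: traffic now goes to green
echo "traffic -> $live"
live=$(idle_color "$live")     # rollback: the same flip in reverse
echo "rollback -> $live"
```

Because the old environment stays up after cutover, rolling back needs no build or redeploy, only the flip.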
Route small % of traffic to new version. Monitor. Increase gradually.
5% → v2, 95% → v1 (watch metrics)
20% → v2, 80% → v1 (still good?)
100% → v2 (full rollout)
Use for: high-risk releases, new features. Netflix, Google, Amazon use this.
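The "monitor, then increase" loop can be sketched as a small decision function. The traffic steps 5, 20 and 100 come from the example above; the 1% error budget, the function name `next_canary_step`, and the integer error rate are assumptions made for illustration, and a real rollout would read the error rate from a monitoring system.

```shell
#!/usr/bin/env bash
# Decide the next canary action from the current traffic share and error rate.
STEPS=(5 20 100)        # traffic percentages routed to v2
MAX_ERROR_PERCENT=1     # assumed error budget

next_canary_step() {
  local current=$1 error_percent=$2 step
  if ((error_percent > MAX_ERROR_PERCENT)); then
    echo "rollback"     # metrics look bad: send all traffic back to v1
    return
  fi
  for step in "${STEPS[@]}"; do
    if ((step > current)); then
      echo "$step"      # metrics look fine: move to the next traffic share
      return
    fi
  done
  echo "done"           # already at 100%
}

next_canary_step 5 0     # prints 20
next_canary_step 20 3    # prints rollback
next_canary_step 100 0   # prints done
```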
    name: CD Pipeline

    on:
      push:
        branches: [ main ]

    jobs:
      # ── Stage 1: Run tests ──────────────────
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with: { node-version: '20', cache: 'npm' }
          - run: npm ci && npm test

      # ── Stage 2: Deploy to staging (AUTO) ───
      deploy-staging:
        needs: test
        runs-on: ubuntu-latest
        environment: staging   # links to GitHub Environment
        steps:
          - uses: actions/checkout@v4
          - name: Deploy to staging
            run: |
              echo "🚀 Deploying ${{ github.sha }} to STAGING"
              # docker pull ghcr.io/... && docker run ...
              # or: kubectl set image deployment/app app=ghcr.io/...:SHA
              # or: helm upgrade app chart/ --set image.tag=SHA

      # ── Stage 3: Production (MANUAL GATE) ───
      deploy-production:
        needs: deploy-staging
        runs-on: ubuntu-latest
        environment:
          name: production   # requires approval ← configured in GitHub
          url: https://myapp.example.com
        steps:
          - uses: actions/checkout@v4
          - name: Deploy to production
            run: |
              echo "🌐 Deploying ${{ github.sha }} to PRODUCTION"

      # ── Stage 4: Notify team ────────────────
      notify:
        needs: [ deploy-production ]
        if: always()
        runs-on: ubuntu-latest
        steps:
          - name: Slack notification
            uses: slackapi/slack-github-action@v1.26.0
            with:
              # Report the deploy job's result; job.status here would be the notify job's own status
              payload: |
                { "text": "Deploy ${{ needs.deploy-production.result }}: ${{ github.sha }}" }
            env:
              SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
              SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK   # needed when the URL is an Incoming Webhook
Create two environments, staging and production. On production, enable Required reviewers.
When the deploy-production job runs, GitHub pauses and sends a notification to reviewers. They must click Approve before the deploy proceeds.
    # ── Option 1: Redeploy previous SHA (Docker) ───
    # Every image is tagged with its commit SHA
    # Rollback = deploy the previous SHA tag
    PREVIOUS_SHA=$(git log --format="%H" -n 2 | tail -1)
    docker pull ghcr.io/user/myapp:sha-${PREVIOUS_SHA:0:7}
    docker run -p 3000:3000 ghcr.io/user/myapp:sha-${PREVIOUS_SHA:0:7}

    # ── Option 2: Kubernetes rollout undo ──────────
    kubectl rollout history deployment/myapp
    # REVISION  CHANGE-CAUSE
    # 1         sha-a3f7c2d
    # 2         sha-b4e8d3e  ← current, broken
    kubectl rollout undo deployment/myapp
    # → rolls back to revision 1 instantly
    kubectl rollout undo deployment/myapp --to-revision=1

    # ── Option 3: Feature flag (no redeploy) ───────
    # Feature flags: toggle feature off in LaunchDarkly/Unleash
    # Bug is in new feature → turn flag off → bug hidden
    # Fix, test, turn flag back on → zero downtime

    # ── Option 4: Git revert + redeploy ────────────
    git revert HEAD        # creates revert commit
    git push origin main
    # → triggers CI/CD → new deploy with code reverted
    # slower: requires full CI run

    # ── Option 5: Database rollback (careful!) ─────
    # RULE: always make backward-compatible schema changes
    # ADD nullable column first (both old + new app work)
    # REMOVE column only after all instances are updated
    # NEVER remove a column the old app still reads!
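Option 1 depends on every image carrying a predictable tag. A minimal sketch of that naming rule follows, assuming the `sha-` prefix plus the first 7 characters of the commit SHA used in the commands above; `image_ref` is a hypothetical helper and the SHA is a placeholder.

```shell
#!/usr/bin/env bash
# Build the image reference for a commit using the sha-<first 7 chars> tag scheme.
image_ref() {
  local repo=$1 sha=$2
  echo "ghcr.io/${repo}:sha-${sha:0:7}"
}

# Placeholder SHA for illustration; inside a repo you could pass "$(git rev-parse HEAD~1)".
image_ref "user/myapp" "a3f7c2d4e5f60718293a4b5c6d7e8f9012345678"
# prints ghcr.io/user/myapp:sha-a3f7c2d
```

Keeping the tag derivation in one place makes it harder for the build step and the rollback step to disagree about which image a commit maps to.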
| Method | Speed | Downtime |
|---|---|---|
| Blue/Green flip | Instant | Zero |
| K8s rollout undo | <30 sec | Zero |
| Feature flag off | Instant | Zero |
| Redeploy previous SHA | 5-10 min | Brief |
| Git revert + CI | 10-20 min | Longer |
GitHub Environments → deploy.yml → staging auto-deploy → production approval gate → rollback demo
Create two environments: staging (no protection) and production (add yourself as required reviewer).
If a deploy goes wrong, run git revert HEAD && git push; the pipeline redeploys the reverted code automatically.

    name: CD - Deploy

    on:
      push:
        branches: [ main ]

    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with: { node-version: '20', cache: 'npm' }
          - run: npm ci && npm test

      deploy-staging:
        needs: test
        runs-on: ubuntu-latest
        environment: staging
        steps:
          - uses: actions/checkout@v4
          - name: Deploy to Staging
            run: |
              echo "Deploying ${{ github.sha }} to STAGING"
              # ${GITHUB_SHA::7} is the 7-character short SHA, matching the sha-xxxxxxx image tags
              echo "Image: ghcr.io/${{ github.repository }}:sha-${GITHUB_SHA::7}"
              echo "✅ Staging deploy complete"

      deploy-production:
        needs: deploy-staging
        runs-on: ubuntu-latest
        environment:
          name: production
          url: https://myapp.example.com
        env:
          APP_URL: https://myapp.example.com   # read by the smoke test step below
        steps:
          - uses: actions/checkout@v4
          - name: Deploy to Production
            run: |
              echo "Deploying ${{ github.sha }} to PRODUCTION"
              echo "Image: ghcr.io/${{ github.repository }}:sha-${GITHUB_SHA::7}"
              echo "🌐 Production deploy complete"
          - name: Smoke test
            run: |
              echo "Running smoke tests..."
              echo "✅ App healthy at ${{ env.APP_URL }}"

      notify:
        needs: [ deploy-production ]
        if: always()
        runs-on: ubuntu-latest
        steps:
          - run: |
              echo "Deploy status: ${{ needs.deploy-production.result }}"
              echo "SHA: ${{ github.sha }}"
              echo "Actor: ${{ github.actor }}"
Before pushing the workflow, set up environments:
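staging → Save (no protection needed)
production → Required reviewers: add yourself → Deployment branches: main only

Real deploy commands to use in place of the echo placeholders:
Kubernetes: kubectl set image deployment/app app=ghcr.io/.../app:sha
Helm: helm upgrade app chart/ --set image.tag=sha
Docker: docker pull img:sha && docker run img:sha
AWS ECS: aws ecs update-service --cluster prod --service app --task-definition app:v2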
    ──── git push origin feat/x ─────────────────
    ci.yml:
      ├─ lint (ESLint, Husky)
      └─ test (matrix Node 18+20)

    ──── git merge → main ───────────────────────
    test.yml:
      └─ jest --coverage (70% gate)
    docker.yml:
      └─ docker build + push to GHCR
           ghcr.io/user/app:sha-abc1234
           ghcr.io/user/app:latest
    deploy.yml:
      ├─ test (regression check)
      ├─ deploy-staging ← AUTO (no gate)
      │    └─ "✅ sha-abc1234 → staging"
      ├─ [⏸ WAITING FOR APPROVAL]
      ├─ deploy-production ← GATED
      │    └─ "🌐 sha-abc1234 → production"
      └─ notify
           └─ Slack: "Deploy success: sha-abc1234"

    ──── Total time: ~8-12 minutes ──────────────
    # (excluding approval wait time)
    .github/workflows/
    ├── ci.yml            ← lint + test (on PR)
    ├── ci-advanced.yml   ← matrix tests
    ├── test.yml          ← Jest coverage gate
    ├── docker.yml        ← GHCR push on merge
    └── deploy.yml        ← staging + prod
| Strategy | Downtime | Rollback | Cost | Best for |
|---|---|---|---|---|
| Recreate | Yes | Redeploy old | 1× | Dev/staging, batch jobs |
| Rolling | No | Slow | 1× | Most services — K8s default |
| Blue/Green | No | Instant | 2× | Critical services, financial |
| Canary | No | Fast | ~1.1× | High-risk releases, new features |
| Feature Flag | No | Instant | 1× | Feature rollouts, A/B testing |
    kubectl rollout status deployment/myapp
    kubectl rollout history deployment/myapp
    kubectl rollout undo deployment/myapp
    kubectl rollout undo deployment/myapp --to-revision=2
| Day | Topic | Key Skills | File |
|---|---|---|---|
| 11 | CI/CD Concepts | CI vs CD vs Deployment, pipeline anatomy, Jenkins vs GHA | ci.yml ✅ |
| 12 | Pipeline Deep Dive | needs:, matrix, secrets, cache, withCredentials | ci-advanced.yml ✅ |
| 13 | Testing in CI | Jest, coverage threshold, quality gate fail+fix | test.yml ✅ |
| 14 | Artifacts & Docker | Multi-stage Dockerfile, GHCR push, npm audit | docker.yml ✅ |
| 15 ✅ | Continuous Deployment | Environments, approval gates, strategies, rollback | deploy.yml ✅ |
    # Environment with all protection rules
    jobs:
      deploy-prod:
        environment:
          name: production
          url: https://app.example.com
          # ↑ URL shown in GitHub UI after deploy
          # ↑ required reviewers set in GitHub Settings

        # Environment-specific secrets
        # (different from repository secrets)
        # production DB_URL ≠ staging DB_URL
        env:
          DB_URL: ${{ secrets.DB_URL }}
          # → uses the DB_URL from 'production' environment
          #   not from the repo-level secrets

        # Limit which branches can deploy to production
        # Configured in: Settings → Environments → production
        # → Deployment branches: main only
        # → Prevents feature branches deploying to prod!

        # Job timeout: caps how long the job may run once it starts
        # (this limits execution time, not the wait for approval)
        timeout-minutes: 1440   # 24 hours

        # Concurrency: prevent simultaneous deploys
        concurrency:
          group: production
          cancel-in-progress: false
          # → queue, don't cancel running deploys
Configure in Settings → Environments → [name]: required reviewers, deployment branches, and environment secrets.
Environment secrets: production DB_URL ≠ staging DB_URL. Configure environment-level secrets so staging code can never accidentally hit the production database, even if someone misconfigures the pipeline.
Concurrency: set cancel-in-progress: false. Never cancel a running production deployment, because it can leave the system in a partial state. Queue it instead.
    notify:
      needs: [ deploy-production ]
      if: always()   # notify even on failure
      runs-on: ubuntu-latest
      steps:
        - name: Notify Slack (success)
          if: needs.deploy-production.result == 'success'
          uses: slackapi/slack-github-action@v1.26.0
          with:
            payload: |
              {
                "text": "✅ Deployed to production",
                "blocks": [
                  {
                    "type": "section",
                    "text": {
                      "type": "mrkdwn",
                      "text": "✅ *Production deploy successful*\n*SHA:* `${{ github.sha }}`\n*By:* ${{ github.actor }}\n*Repo:* ${{ github.repository }}"
                    }
                  }
                ]
              }
          env:
            SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
            SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK   # needed for Incoming Webhooks so "blocks" are sent as-is

        - name: Notify Slack (failure)
          if: needs.deploy-production.result == 'failure'
          uses: slackapi/slack-github-action@v1.26.0
          with:
            payload: |
              { "text": "❌ Production deploy FAILED: ${{ github.sha }}" }
          env:
            SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
            SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
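Slack setup: create an Incoming Webhook at api.slack.com/apps and store its URL as the SLACK_WEBHOOK_URL secret.
Other channels listed for this step: microsoft/teams-deployments@v1 and pagerduty/pagerduty-send-event-action@v2.

| Problem | Cause | Fix |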
|---|---|---|
| deploy-production never starts (no gate UI) | Environment not created or no reviewer set | Settings → Environments → create production → add Required reviewers. |
| "Environment not found" error | Environment name in YAML doesn't match GitHub | Name is case-sensitive. environment: production must match the environment name exactly. |
| deploy-staging skipped (yellow ○) | test job failed or if: condition not met | Fix the failing test job first. Check if: expressions. |
| Approval notification not received | GitHub notification settings | GitHub → Settings → Notifications → Actions → enable environment deployment notifications. |
| Deployment branch restriction blocking | Branch not allowed for this environment | Settings → Environments → production → Deployment branches → add branch pattern. |
| Slack notification fails | Invalid or expired webhook URL | Regenerate webhook in Slack API. Update the GitHub secret. |
| Production deploys cancel each other | Missing concurrency setting | Add concurrency: group: production, cancel-in-progress: false. |
| Staging uses prod secrets | Using repo-level secrets instead of env-level | Move environment-specific secrets to the environment (not repo-level) in GitHub Settings. |
ci.yml

    # Runs on: PR + push to main/feat/**
    # Jobs: lint, test (matrix Node 18+20)
    # Gate: ESLint errors block merge
    # Purpose: quality gate on every PR

test.yml

    # Runs on: push to main + feat/**
    # Jobs: jest --coverage
    # Gate: 70% coverage threshold
    # Purpose: no code without tests

docker.yml

    # Runs on: push to main + git tags
    # Jobs: npm audit + docker build + push
    # Output: ghcr.io/user/app:sha-xxx
    # Purpose: deployable artifact

deploy.yml

    # Runs on: push to main
    # Jobs: test → staging → (gate) → prod → notify
    # Gate: production requires approval
    # Purpose: full CD
ci.yml runs (lint + test)
test.yml runs (coverage gate)
docker.yml runs (image push)
deploy.yml runs (staging → prod)
docker.yml runs (semver-tagged image)

ci.yml: lint + test on PRs
test.yml: Jest + 70% coverage gate
docker.yml: image in GHCR
deploy.yml: staging auto + prod gated

Commit message: ci: add full cd pipeline

Week 4 Docker mastery feeds directly into Week 5: Kubernetes. What you built in Week 3 (Docker images, CD pipeline) connects to what you'll deploy in Week 5 (AKS, Helm, kubectl).
The full 35-day arc: Git → CI/CD → Docker → Kubernetes → Observability → Security → Capstone