Shipping vs. shipping fast: a senior engineer's three-variable heuristic

1. The three variables — velocity is just one

The trap: optimizing only for velocity. You ship 10× faster, but every bad deploy costs you 8 hours of production firefighting because you can't roll back cleanly and you don't know what's wrong. Net velocity is lower than before.

The fix: invest in reversibility and observability first, then velocity follows naturally. Once a deploy is one-click reversible and you can detect a regression in 2 minutes, you can ship 20 times a day without fear.

Velocity: how fast a change gets from idea to production. Measured as lead time for changes (a DORA metric).
Reversibility: how cheaply a change can be undone if it goes wrong. Measured as mean time to recovery (a DORA metric).
Observability: how quickly you know a change went wrong. Measured as mean time to detection (MTTD).
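
The trap can be put in rough arithmetic. The sketch below is a back-of-the-envelope model, not a measurement: `TeamProfile`, the 40-hour week, and both sample teams are assumptions made up for illustration. Only the 8 hours lost per bad deploy comes from the scenario above.

```typescript
// Back-of-the-envelope model of the velocity trap.
// Every name and number here is an illustrative assumption, not measured data.
interface TeamProfile {
  deploysPerWeek: number;        // raw shipping speed
  changeFailureRate: number;     // fraction of deploys that go wrong (0..1)
  hoursLostPerBadDeploy: number; // detection + rollback + cleanup
}

const WORK_HOURS_PER_WEEK = 40; // assumed team capacity for this sketch

// Hours per week spent firefighting instead of building.
function firefightingHours(team: TeamProfile): number {
  return team.deploysPerWeek * team.changeFailureRate * team.hoursLostPerBadDeploy;
}

// Share of the week left for real work, clamped to the 0..1 range.
function productiveFraction(team: TeamProfile): number {
  const remaining = WORK_HOURS_PER_WEEK - firefightingHours(team);
  return Math.max(0, Math.min(1, remaining / WORK_HOURS_PER_WEEK));
}

// Fast but fragile: 8 hours lost per bad deploy, as in the trap above.
const fastButFragile: TeamProfile = {
  deploysPerWeek: 50,
  changeFailureRate: 0.1,
  hoursLostPerBadDeploy: 8,
};

// Same speed, but a bad deploy is caught and reverted in about 6 minutes.
const fastAndSafe: TeamProfile = {
  deploysPerWeek: 50,
  changeFailureRate: 0.1,
  hoursLostPerBadDeploy: 0.1,
};

console.log("fragile:", productiveFraction(fastButFragile));
console.log("safe:", productiveFraction(fastAndSafe));
```

With these made-up inputs the fragile team spends essentially the whole week firefighting, while the safe team loses about half an hour. The deploy count is identical; only the cost of being wrong changed.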

2. Reversibility — the cheap insurance you don't take out

Reversibility is binary: either you can undo a change in under 5 minutes, or you can't. The investment to get to 'yes' is engineering work most teams skip until after their first major incident.

Feature flags. Every meaningful behavior change ships behind a flag. Flag toggle = 30 seconds to revert. No redeploy needed. LaunchDarkly, Statsig, Vercel Edge Config, or build your own with Postgres.
Database migrations: backwards-compatible by default. Add columns nullable, populate, then deploy code that reads them. Don't drop columns until two deploys later.
Canary deploys. Roll out to 1% of traffic first. Promote to 10%, 50%, 100% based on error rates. If something's wrong, only 1% saw it.
Blue-green deploys. New version runs alongside old; traffic switches atomically. Rollback is the same switch in reverse.
Atomic infra changes. Use Pulumi/Terraform with proper state. A failed apply does not roll itself back, but tracked state shows exactly what changed, so you can re-apply the last known-good config instead of debugging a half-broken environment by hand.
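
The canary ladder described above (1%, 10%, 50%, 100%) can be sketched as a small decision function. This is a minimal sketch, not any deploy tool's API: `decideCanary`, `CanaryHealth`, and the `maxRelativeIncrease` tolerance are hypothetical names, and a real system would also want a minimum sample size before trusting the error rates.

```typescript
// Canary promotion sketch. Stages mirror the 1% -> 10% -> 50% -> 100% ladder.
const STAGES: readonly number[] = [1, 10, 50, 100];

interface CanaryHealth {
  canaryErrorRate: number;   // errors / requests on the new version
  baselineErrorRate: number; // errors / requests on the old version
}

interface CanaryDecision {
  action: "promote" | "hold" | "rollback";
  trafficPercent: number;
}

// maxRelativeIncrease is a hypothetical tolerance: 0.5 means the canary may be
// up to 50% worse than baseline before we roll back. Note that a baseline of
// exactly 0 makes any canary error trigger a rollback.
function decideCanary(
  currentPercent: number,
  health: CanaryHealth,
  maxRelativeIncrease = 0.5
): CanaryDecision {
  const allowed = health.baselineErrorRate * (1 + maxRelativeIncrease);
  if (health.canaryErrorRate > allowed) {
    return { action: "rollback", trafficPercent: 0 };
  }
  const idx = STAGES.indexOf(currentPercent);
  // Unknown stage or already at 100%: stay where we are.
  if (idx === -1 || idx === STAGES.length - 1) {
    return { action: "hold", trafficPercent: currentPercent };
  }
  return { action: "promote", trafficPercent: STAGES[idx + 1] };
}

// Healthy canary at 1% moves to the next stage.
console.log(decideCanary(1, { canaryErrorRate: 0.002, baselineErrorRate: 0.002 }));
// Canary clearly worse than baseline: traffic goes back to 0%.
console.log(decideCanary(10, { canaryErrorRate: 0.05, baselineErrorRate: 0.002 }));
```

The point of the design is that rollback is the default answer to a bad signal at every stage, so the blast radius is whatever percentage the canary had reached.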

Feature flag pattern: ship behind a flag, validate, then promote

```typescript
import { Flags } from "@/lib/flags";
// Assumed location of the two pricing helpers; adjust to your codebase.
import { computeDynamicPrice, getStaticPrice } from "@/lib/pricing";

export async function getProductPrice(productId: string, userId: string) {
  // Evaluate the flag per user so the rollout can be widened gradually.
  const useDynamicPricing = await Flags.isEnabled(
    "dynamic-pricing-v2",
    { userId, default: false }
  );

  if (useDynamicPricing) {
    return await computeDynamicPrice(productId, userId);
  }
  // Flag off: the existing code path, unchanged.
  return await getStaticPrice(productId);
}

// Day 1: ship with flag off (default false). 0% impact.
// Day 2: enable for internal users. Validate.
// Day 3: 5% rollout. Watch metrics.
// Day 5: 50% rollout.
// Day 7: 100% rollout. Remove flag in next sprint.

// If anything breaks at any percentage:
//   Flag toggle off = instant rollback. No redeploy needed.
```
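
The percentage steps in that schedule only work if each user stays on the same side of the flag as the rollout widens. One common way to get that is deterministic bucketing: hash the flag key and user ID into a bucket from 0 to 99 and compare it to the rollout percentage. The sketch below assumes that approach; `rolloutBucket` and `isInRollout` are hypothetical helpers, not part of any flag vendor's SDK, and the hash is an FNV-1a-style hash over UTF-16 code units chosen only because it needs no dependencies.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (no dependencies).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Stable bucket in 0..99 for a given flag + user pair.
// Including the flag key means different flags slice users differently.
function rolloutBucket(flagKey: string, userId: string): number {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

// A user is in the rollout when their bucket is below the current percentage.
// Raising the percentage only ever adds users; nobody flips back and forth.
function isInRollout(flagKey: string, userId: string, percent: number): boolean {
  return rolloutBucket(flagKey, userId) < percent;
}

// Count how many of 10,000 synthetic users land in a 5% rollout.
let enabledCount = 0;
for (let i = 0; i < 10000; i++) {
  if (isInRollout("dynamic-pricing-v2", `user-${i}`, 5)) enabledCount++;
}
console.log(`enabled for ${enabledCount} of 10000 users at 5%`);
```

Because the bucket depends only on the flag key and user ID, no per-user state has to be stored, and setting the percentage to 0 turns the feature off for everyone at once.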

3. Observability — knowing when you're wrong (in minutes, not days)

Observability is the difference between 'a customer emailed us about an outage' and 'PagerDuty paged us 90 seconds after deploy'. The infrastructure investment pays for itself the first time it catches a bug before a customer does.

Three pillars: logs (what happened), metrics (how often, how much), traces (request flow across services). All three, instrumented end-to-end.
OpenTelemetry as the default. Vendor-agnostic, supports all three pillars, ships with most modern frameworks. Datadog, Honeycomb, Grafana Cloud as the storage/UI.
Per-deploy comparison. Vercel Speed Insights, Sentry deploy markers — every deploy is annotated in your dashboards. Spike in error rate? You can see which deploy did it.
Real-user monitoring (RUM) for frontend. CrUX is sampled and slow; RUM is real-time and per-user. Vercel Speed Insights or Datadog RUM.
SLOs + error budgets. Define what 'available' means (e.g., 99.9% of requests succeed in under 500ms). Alert when the error budget burns faster than expected.
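
"Burns faster than expected" has a simple arithmetic form: divide the observed bad-request fraction by the fraction the SLO allows. A burn rate of 1 means the budget lasts exactly the SLO window; higher means it runs out early. The sketch below assumes the 99.9% example target; `burnRate`, `shouldPage`, and the paging threshold of 10 are hypothetical and would need tuning per service.

```typescript
interface WindowCounts {
  total: number; // requests seen in the alerting window
  bad: number;   // requests that failed or took longer than the latency target
}

// Burn rate = observed bad fraction / allowed bad fraction.
// 1.0 spends the budget exactly on schedule; 10 spends it ten times too fast.
function burnRate(counts: WindowCounts, sloTarget: number): number {
  if (counts.total === 0) return 0; // no traffic, nothing burned
  const errorBudget = 1 - sloTarget; // e.g. 0.001 for a 99.9% SLO
  return counts.bad / counts.total / errorBudget;
}

// pageThreshold is an assumed value for this sketch, not a recommendation.
function shouldPage(counts: WindowCounts, sloTarget: number, pageThreshold = 10): boolean {
  return burnRate(counts, sloTarget) >= pageThreshold;
}

const SLO_TARGET = 0.999; // the 99.9% example from the text

// On-budget window: 0.1% of requests are bad.
console.log(shouldPage({ total: 100000, bad: 100 }, SLO_TARGET));
// Post-deploy regression: 2% of requests are bad.
console.log(shouldPage({ total: 100000, bad: 2000 }, SLO_TARGET));
```

Alerting on burn rate rather than raw error count keeps the alert meaningful at both low and high traffic, since it is always relative to what the SLO permits.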

MTTD before MTTR

You can't recover (MTTR) what you don't detect (MTTD). Most teams optimize MTTR with on-call training and runbooks before they've fixed MTTD. Detection is the harder problem; fix it first.
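
To see how much of an outage is detection rather than repair, split each incident's timeline at the moment someone (or something) noticed. The sketch below assumes recovery time is measured from when the problem started, so it includes detection time; the `Incident` shape and the sample incidents are invented for illustration.

```typescript
interface Incident {
  startedAt: number;  // epoch ms: when the bad change began affecting users
  detectedAt: number; // epoch ms: first alert or customer report
  resolvedAt: number; // epoch ms: service restored
}

const MINUTE_MS = 60000;

function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Mean time to detection, in minutes.
function mttdMinutes(incidents: Incident[]): number {
  return mean(incidents.map((i) => (i.detectedAt - i.startedAt) / MINUTE_MS));
}

// Mean time to recovery, in minutes, measured from incident start
// (an assumption of this sketch; some teams measure from detection).
function mttrMinutes(incidents: Incident[]): number {
  return mean(incidents.map((i) => (i.resolvedAt - i.startedAt) / MINUTE_MS));
}

// Fraction of total outage time that passed before anyone knew.
function detectionShare(incidents: Incident[]): number {
  const mttr = mttrMinutes(incidents);
  return mttr === 0 ? 0 : mttdMinutes(incidents) / mttr;
}

// Two invented incidents: detected after 90 and 30 minutes, fixed 30 minutes later.
const sampleIncidents: Incident[] = [
  { startedAt: 0, detectedAt: 90 * MINUTE_MS, resolvedAt: 120 * MINUTE_MS },
  { startedAt: 0, detectedAt: 30 * MINUTE_MS, resolvedAt: 60 * MINUTE_MS },
];

console.log(mttdMinutes(sampleIncidents), mttrMinutes(sampleIncidents), detectionShare(sampleIncidents));
```

In this made-up sample, two thirds of the outage time elapsed before detection, which is the situation the section describes: faster runbooks would only shave the remaining third.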

4. DORA metrics: the calibration tool

The two metrics that correlate most strongly with high-functioning teams: deploy frequency and mean time to recovery. The fast teams aren't fast because they're cowboys — they're fast because reversibility + observability let them deploy without fear.

Metric	Elite	High	Medium	Low
Lead time for changes	< 1 hour	1 day – 1 week	1 week – 1 month	1 month +
Deploy frequency	On demand (multi/day)	Once/day–once/week	Once/week–once/month	Less than once/month
Mean time to recovery	< 1 hour	< 1 day	1 day – 1 week	1 week +
Change failure rate	0–15%	16–30%	16–30%	16–30%
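
The four metrics in the table can be computed from a plain deploy log. The sketch below is one way to do that under stated assumptions: the `DeployRecord` shape is hypothetical, lead time is taken as commit-to-deploy, and means are used for simplicity where a real report might prefer medians.

```typescript
interface DeployRecord {
  committedAt: number; // epoch ms: change was committed
  deployedAt: number;  // epoch ms: change reached production
  failed: boolean;     // deploy caused a degradation that needed remediation
  restoredAt?: number; // epoch ms: set once service was restored after a failure
}

interface DoraSnapshot {
  deploysPerDay: number;
  meanLeadTimeHours: number | null;
  changeFailureRate: number | null;
  meanTimeToRecoveryHours: number | null;
}

const HOUR_MS = 3600000;

function average(values: number[]): number | null {
  if (values.length === 0) return null;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function doraSnapshot(deploys: DeployRecord[], windowDays: number): DoraSnapshot {
  const leadTimes = deploys.map((d) => (d.deployedAt - d.committedAt) / HOUR_MS);
  const failures = deploys.filter((d) => d.failed);
  // Only failures that have been restored contribute to recovery time.
  const recoveries = failures
    .filter((d): d is DeployRecord & { restoredAt: number } => d.restoredAt !== undefined)
    .map((d) => (d.restoredAt - d.deployedAt) / HOUR_MS);

  return {
    deploysPerDay: deploys.length / windowDays,
    meanLeadTimeHours: average(leadTimes),
    changeFailureRate: deploys.length === 0 ? null : failures.length / deploys.length,
    meanTimeToRecoveryHours: average(recoveries),
  };
}
```

Usage is a single call such as `doraSnapshot(lastThirtyDaysOfDeploys, 30)`, where `lastThirtyDaysOfDeploys` is whatever your CI/CD system can export. Comparing the result against the table is then a manual calibration step, not something the code decides for you.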

5. Feature flags as a velocity multiplier

Cost: ~$200–$2000/month for LaunchDarkly or Statsig at typical scale. ROI: deploy frequency can go from weekly to multi-daily, and lead time can drop 5–10×. Change failure rate can halve because new code ships dark and is validated on a small slice first, and the rollback strategy is one click.

Deploy code at any time. Code is dormant behind 'off' flags until the launch.
Marketing announces; you flip a flag. No 2am deploy windows.
Bad launch? Flip the flag back. No emergency rollback.
Per-customer flags. Beta features for specific accounts. Gradual rollouts to enterprise tiers first.
A/B testing emerges from the same infrastructure. Half-and-half assignment + metrics.
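
Per-customer flags and tier-first rollouts come down to evaluating a small set of targeting rules in a fixed order. The sketch below is a hand-rolled illustration, not the rule model of LaunchDarkly, Statsig, or any other vendor: `FlagRule`, `FlagContext`, `evaluateFlag`, and the account IDs are all hypothetical.

```typescript
type Tier = "free" | "pro" | "enterprise";

interface FlagContext {
  accountId: string;
  tier: Tier;
}

interface FlagRule {
  killSwitch: boolean;     // true = off for everyone, overrides every other rule
  everyone: boolean;       // true = fully launched
  allowAccounts: string[]; // beta customers enabled individually
  allowTiers: Tier[];      // tiers enabled as a group
}

// Rules are checked from most to least specific; the kill switch always wins.
function evaluateFlag(rule: FlagRule, ctx: FlagContext): boolean {
  if (rule.killSwitch) return false;
  if (rule.everyone) return true;
  if (rule.allowAccounts.includes(ctx.accountId)) return true;
  return rule.allowTiers.includes(ctx.tier);
}

// Beta phase: two named accounts plus the whole enterprise tier.
const betaRule: FlagRule = {
  killSwitch: false,
  everyone: false,
  allowAccounts: ["acct-101", "acct-202"],
  allowTiers: ["enterprise"],
};

// Launch day: marketing announces, one field changes.
const launchRule: FlagRule = { ...betaRule, everyone: true };

// Bad launch: one field changes back, no redeploy.
const rolledBackRule: FlagRule = { ...launchRule, killSwitch: true };

console.log(evaluateFlag(betaRule, { accountId: "acct-999", tier: "free" }));
console.log(evaluateFlag(launchRule, { accountId: "acct-999", tier: "free" }));
console.log(evaluateFlag(rolledBackRule, { accountId: "acct-101", tier: "enterprise" }));
```

Launch and rollback are both single-field data changes here, which is the property that removes the 2am deploy window.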

6. When to slow down deliberately

Senior engineers know when fast is wrong. Four signals to slow down:

The change is one-way. Database deletions, third-party API contracts, customer-visible URL changes. Once shipped, undoing is expensive. Slow is the right speed.
The change is high-stakes. Payment processing, authentication, anything regulatory. Reversibility doesn't help if the bug already cost you a SOC 2 finding.
The change is poorly understood. If two engineers can't agree on what the change does, fast just means breaking faster. Spend the time to align first.
The team is tired. Velocity is a function of energy. If the team has been on-call for two weeks, going fast is going to break things even with all the right infrastructure.

7. The decision table

The table is intentionally tactical. Most engineering decisions don't need a meeting; they need a heuristic that says 'flag this, don't flag that, observe this hard, observe that lightly.' The best engineering teams have internalized this table to the point that it's reflex.

Change type	Velocity priority	Reversibility need	Observability need
UI tweak	High — ship it	Low (CSS revert)	Low
New feature, low-risk	High — feature flag it	High (flag-gated)	Medium
DB schema change	Low — go slow	Critical (backwards-compat)	High
Auth / payment	Low — go slow	Critical (canary 1%)	Critical (instant alerting)
Marketing copy	High — ship it	Low (revert via CMS)	Low
3rd-party integration	Medium	High (circuit breaker)	High (per-vendor SLO)
Infra config	Low — go slow	Critical (state-tracked)	Critical
Performance optimization	Medium	High (flag-gated)	Critical (regression detect)
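
For teams that want the heuristic enforced rather than remembered, the table can be encoded as data, for example to drive a PR checklist. The sketch below transcribes the levels from the table; the `ChangeType` keys, the `Guidance` shape, and the `needsExtraCare` helper are hypothetical names, and the parenthetical tactics (flag-gated, canary 1%, and so on) are left out for brevity.

```typescript
type Level = "low" | "medium" | "high" | "critical";
type VelocityPriority = "low" | "medium" | "high";

type ChangeType =
  | "ui-tweak"
  | "new-feature-low-risk"
  | "db-schema-change"
  | "auth-payment"
  | "marketing-copy"
  | "third-party-integration"
  | "infra-config"
  | "performance-optimization";

interface Guidance {
  velocity: VelocityPriority;
  reversibility: Level;
  observability: Level;
}

// Levels copied row by row from the decision table.
const DECISION_TABLE: Record<ChangeType, Guidance> = {
  "ui-tweak": { velocity: "high", reversibility: "low", observability: "low" },
  "new-feature-low-risk": { velocity: "high", reversibility: "high", observability: "medium" },
  "db-schema-change": { velocity: "low", reversibility: "critical", observability: "high" },
  "auth-payment": { velocity: "low", reversibility: "critical", observability: "critical" },
  "marketing-copy": { velocity: "high", reversibility: "low", observability: "low" },
  "third-party-integration": { velocity: "medium", reversibility: "high", observability: "high" },
  "infra-config": { velocity: "low", reversibility: "critical", observability: "critical" },
  "performance-optimization": { velocity: "medium", reversibility: "high", observability: "critical" },
};

// Hypothetical policy: anything with a critical need gets a second reviewer
// or a rollout plan before merge.
function needsExtraCare(change: ChangeType): boolean {
  const g = DECISION_TABLE[change];
  return g.reversibility === "critical" || g.observability === "critical";
}

console.log(needsExtraCare("ui-tweak"));
console.log(needsExtraCare("db-schema-change"));
```

Using `Record<ChangeType, Guidance>` means the compiler rejects the table if a change type is added without guidance, which keeps the heuristic from silently going stale.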
