Northwind · B2B SaaS · Field-service · 8 months · ongoing

Took over a stalled platform. 99.99% uptime, 22 features shipped, technical debt halved — in 6 months.

Item: SERP Axis
Rating: 5
Author: Helena Brodie

Northwind builds field-service-management software for HVAC, plumbing, and electrical contractors (1,400 active customers). Their lead engineer left abruptly in summer 2025, and the platform stalled — incidents up 4×, deploys down 60%, customer-success tickets up 2.7×. They needed an agency to take ownership of operations AND ship the product roadmap, not just fix bugs. We took over in 6 weeks, cleared the incident backlog in 90 days, shipped 22 customer-facing features in 6 months, and halved their technical debt while maintaining 99.99% uptime.

Started: Aug 2025
Region: United States, Canada, UK
Team: 5 people
Stack: 7 technologies

Software Management · Software Development · Performance Engineering · Technical Debt Recovery

99.99%

Uptime · 6 months

0 P1 incidents

22

Customer-facing features shipped

vs 8 promised

−54%

Technical debt

measured by SonarQube

11/wk

Deploy frequency

from 0.4/wk · +2,650%

The problem

Northwind came to us with…

When their lead engineer left, Northwind had a 14-page Notion handover doc, two junior engineers who'd never owned production, and 1,400 customers who didn't know any of this. Within 6 weeks: P1 incidents went from 0.3/month to 1.2/month, dependency upgrades stopped, and the customer-success team was fielding 4× their usual ticket volume. The CTO needed to either rebuild the engineering team (12-month process) or hand off operations to a senior agency.

The four core challenges

CHALLENGE 01

Inherited a stalled platform

47 known bugs in the backlog (some open for 14 months). 8 dependency-vulnerability alerts. Test coverage at 22%. CI/CD broken. Deployment wiki was 18 months stale.

CHALLENGE 02

Two junior engineers, no senior

Both juniors were strong but had never owned production incidents. We needed to handle on-call AND mentor them up to mid-level competence.

CHALLENGE 03

1,400 customers, zero notice

Couldn't take downtime windows. Migration had to happen in-flight without customer-visible disruption.

CHALLENGE 04

Roadmap commitments

8 features had been promised to customers with committed delivery dates. Pushing those dates would erode customer trust further. We had to ship AND clean up the platform simultaneously.

How we shipped it

The approach

Weeks 1–6

Discovery + emergency stabilization

Two staff engineers + one SRE shadowed every part of the platform for 2 weeks. Wrote 47 pages of runbooks. Set up OpenTelemetry + PagerDuty. Cleared the P1 backlog (4 critical bugs, all production-blocking) in week 5. Took over on-call in week 6.

Deliverables

47-page runbook library
OpenTelemetry instrumentation across 12 services
PagerDuty rotation + escalation
P1 backlog cleared (4 bugs)
Dependency upgrade plan (8 vulns)

Months 2–3

Test coverage + CI/CD recovery

Got CI/CD back to green. Wrote tests for the 12 highest-risk modules. Test coverage went 22% → 64%. Deploy frequency went from 0.4/week to 11/week. Mean time to deploy a 1-line change went from 3 days to 22 minutes.

Deliverables

Test coverage 22% → 64%
CI/CD pipeline (GitHub Actions)
Canary deployment infrastructure
Feature flags (LaunchDarkly)
Automated dependency upgrades (Renovate)
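
The canary and feature-flag deliverables above both rest on one idea: expose a change to a small, stable slice of accounts before everyone gets it. Below is a minimal sketch of that bucketing, assuming a hash-based percentage rollout. It is not LaunchDarkly's SDK, and `fnv1a`, `bucketFor`, `isEnabled`, the flag key, and the account IDs are hypothetical names used only for illustration.

```typescript
// FNV-1a 32-bit hash over UTF-16 code units: small and deterministic,
// which is all that bucketing needs. (Illustrative helper, not a library API.)
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map an account to a stable bucket in [0, 100) for a given flag.
function bucketFor(flagKey: string, accountId: string): number {
  return fnv1a(`${flagKey}:${accountId}`) % 100;
}

// True when the account falls inside the current rollout percentage.
function isEnabled(
  flagKey: string,
  accountId: string,
  rolloutPercent: number,
): boolean {
  return bucketFor(flagKey, accountId) < rolloutPercent;
}

// Hypothetical usage: start a flag at 5% of accounts.
console.log(isEnabled("new-dispatch-board", "acct-1042", 5));
```

Because the bucket depends only on the flag key and the account ID, raising the percentage from 5 to 25 keeps the first 5% enabled and only adds accounts, so no customer flips back and forth during a rollout.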

Months 3–6

Roadmap + customer-facing features

Shipped 22 customer-facing features against the 8 originally promised on the roadmap, plus 6 unplanned features driven by data from the new observability stack (the bugs we found there revealed unmet customer needs). Customer-success ticket volume halved.

Deliverables

22 customer-facing features shipped
6 unplanned features (from observability data)
Mobile app v3 (React Native)
API platform v2 (rate-limited, versioned)
Performance audit + optimization
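
The case study does not say how API platform v2 enforces its rate limits. One common approach is a token bucket per API key, sketched below as an assumption rather than a description of the shipped implementation. `TokenBucket`, `tryRemove`, and the capacity and refill numbers are hypothetical.

```typescript
// Token bucket: each request spends one token; tokens refill at a steady
// rate up to a fixed capacity, which allows short bursts but caps the average.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so the first burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true and spends one token if the request is allowed.
  tryRemove(nowMs: number = Date.now()): boolean {
    const elapsedMs = Math.max(0, nowMs - this.lastRefillMs);
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (elapsedMs / 1000) * this.refillPerSecond,
    );
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens < 1) {
      return false; // caller would typically respond with HTTP 429
    }
    this.tokens -= 1;
    return true;
  }
}

// Hypothetical limit: bursts of 10 requests, 5 requests/second sustained.
const bucket = new TokenBucket(10, 5);
console.log(bucket.tryRemove());
```

Passing the clock in as a parameter keeps the limiter easy to test. In a multi-instance deployment the counters would need to live in shared storage (the stack lists Redis) rather than in process memory.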

Months 6–8 · ongoing

Steady-state operations + roadmap velocity

Now in steady state: 11 deploys/week, 99.99% uptime, and customer-success ticket volume at 60% of the pre-stall level. Both junior engineers have been promoted to mid-level and own discrete services. We're 8 weeks into the year-2 roadmap with zero incidents.

Deliverables

Steady-state operations playbook
Junior engineer mentorship track
Quarterly SLO + error-budget review
Customer-facing status page
Year-2 roadmap (scoped + estimated)
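
The quarterly SLO + error-budget review comes down to simple arithmetic: an availability target implies a fixed allowance of downtime per window. Here is a small sketch, assuming a 30-day window (the case study does not state the window used). `errorBudgetMinutes` and `budgetRemaining` are hypothetical helper names.

```typescript
// Downtime allowed by an availability target over a window, in minutes.
function errorBudgetMinutes(sloPercent: number, windowDays: number): number {
  if (sloPercent <= 0 || sloPercent > 100) {
    throw new RangeError("sloPercent must be in (0, 100]");
  }
  const windowMinutes = windowDays * 24 * 60;
  return windowMinutes * (1 - sloPercent / 100);
}

// Fraction of the budget still unspent after some observed downtime (0 to 1).
function budgetRemaining(
  sloPercent: number,
  windowDays: number,
  downtimeMinutes: number,
): number {
  const budget = errorBudgetMinutes(sloPercent, windowDays);
  return budget === 0 ? 0 : Math.max(0, 1 - downtimeMinutes / budget);
}

// Assumed 30-day window: compare the 99.99% target with the 99.2% baseline.
console.log(errorBudgetMinutes(99.99, 30).toFixed(2), "minutes at 99.99%");
console.log((errorBudgetMinutes(99.2, 30) / 60).toFixed(2), "hours at 99.2%");
```

Under that assumption, 99.99% leaves roughly 4.3 minutes of downtime per 30 days, against about 5.8 hours at the 99.2% baseline. This is why risky deploys are usually gated on how much budget remains.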

The receipts

Before / after — every metric

Numbers verifiable with the client. Audit trail available on request.

Metric	Before	After	Change
P1 incidents (monthly)	1.2	0	−100%
Mean time-to-resolve P1	8.4 hours	1.1 hours	−87%
Test coverage	22%	64%	+191%
Deploy frequency (per week)	0.4	11	+2,650%
Mean time-to-deploy 1-line change	3 days	22 min	−99.5%
Open dependency vulnerabilities	8	0	−100%
Customer-success ticket volume	240/wk	118/wk	−51%
Features shipped (6 months)	—	22	—
Technical-debt score (SonarQube)	47.2	21.8	−54%
Uptime (6 months)	99.2%	99.99%	+0.79pp
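
The Change column mixes two kinds of arithmetic: relative change for most rows and percentage points (pp) for uptime. Below is a short sketch for rechecking a few rows, using values copied from the table. The function names `percentChange` and `pointChange` are illustrative.

```typescript
// Relative change from a baseline, as a percentage: (after - before) / before.
function percentChange(before: number, after: number): number {
  if (before === 0) {
    throw new Error("percent change is undefined when the baseline is 0");
  }
  return ((after - before) / before) * 100;
}

// Absolute difference between two percentages, in percentage points (pp).
function pointChange(beforePct: number, afterPct: number): number {
  return afterPct - beforePct;
}

// Rows copied from the table: [metric, before, after].
const rows: Array<[string, number, number]> = [
  ["Deploy frequency (per week)", 0.4, 11],
  ["Mean time-to-resolve P1 (hours)", 8.4, 1.1],
  ["Customer-success tickets (per week)", 240, 118],
  ["Technical-debt score (SonarQube)", 47.2, 21.8],
];

for (const [metric, before, after] of rows) {
  console.log(`${metric}: ${percentChange(before, after).toFixed(0)}%`);
}
console.log(`Uptime: ${pointChange(99.2, 99.99).toFixed(2)}pp`);
```

Uptime is reported in percentage points because a relative change on a value already near 100% would understate the difference between the two levels.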

What we ran it on

Stack, team, and tools

Tech stack

· Node.js + TypeScript
· Postgres
· Redis
· React Native (mobile)
· Next.js (web)
· AWS (ECS + RDS)
· OpenTelemetry

Team

· 1 engineering manager (lead)
· 2 staff engineers
· 1 SRE / on-call lead
· 1 mobile engineer (RN)
· 1 QA engineer

Tools

· GitHub Actions
· PagerDuty
· Datadog
· SonarQube
· LaunchDarkly
· Renovate
· Sentry

“When our lead engineer left, I had three options: rebuild the team (12 months), accept slower delivery (board wouldn't), or find a senior agency to operate the platform. SERP Axis was option three. Six months later we have 99.99% uptime, 22 features shipped, and our two junior engineers have been promoted to mid-level. They didn't just operate it — they made our team better.”

Helena Brodie

CTO, Northwind

“I went from 4× ticket volume back to under our pre-stall baseline. The customer-success team noticed the difference within a month. The retention math alone paid for the engagement.”

Tom Aldrich

VP Customer Success, Northwind

What's next

Year-2 plan: AI-assisted scheduling for field technicians (RAG over historical work-order data), plus a Power BI dashboard for ops + customer-success. Both scoped, kicking off month 9.
