CASE STUDY · 06.01

Doubling engineering output without doubling headcount.

How Livehopper re-engineered a healthcare SaaS provider's software delivery lifecycle into a measurable value engine in 16 weeks, and handed it back to the client's own teams to run.

  • VERTICAL: Healthcare SaaS
  • STATUS: In production
  • DURATION: 16-week transformation
  • PUBLISHED

The mandate. Roughly double shipped output ("Cargo Shipped") without proportional headcount growth, on a legacy ASP.NET / Classic ASP / SQL Server stack, inside a HIPAA and PHI compliance envelope.

The method. Treat the SDLC as a value stream, find where it actually slows, and re-engineer those points with AI-augmented agents, hard metrics, and a leaner operating model.

The outcome. A metrics-instrumented, AI-augmented SDLC, delivered in 16 weeks and owned by the client's own teams, with no ongoing vendor dependency.

The headline numbers

  • 50% increase in Cargo Shipped on the pilot teams.
  • 100 to 140% design-capacity uplift measured on the pilot team.
  • 17+ product pods unlocked for an AI-native operating model (pod size 8 to 12 down to about 7; engineering core 4 to 5 down to 3).
  • Translation layer eliminated. Product Owner role removed across the org; management spans widened by up to 50%.
  • Six fragmented knowledge sources collapsed into one semantic index, removing the per-role, every-day tax of stitching context by hand.
  • Rollout enabled for 45+ teams, with the playbook and enablement kit transferred to the client.
01

Client overview

Our client is a leading healthcare SaaS provider, building electronic health record (EHR) and practice-management software for behavioral-health, substance-use, and human-services providers across the United States. Its multiproduct portfolio serves thousands of provider organizations and end-clinicians, on workflows that must satisfy HIPAA, PHI handling, and state-level behavioral-health rules.

The engagement centered on the flagship clinical platform, a long-tenured product carrying a real legacy footprint: ASP.NET application code, a meaningful surface of Classic ASP pages, and core business logic concentrated in MS SQL Server stored procedures. Delivery ran across 17+ product pods, with engineering managers, QA leads, designers, and release management split between U.S. and India sites.

The trigger was a private-equity-backed growth mandate: double the rate at which value reaches customers, without doubling the cost base. Our job was to make that economically and operationally real, not as a slogan, but as a number the board could watch move.

02

The challenge: where the lifecycle was losing value

Phase 1 was discovery, not building: six-plus hours of structured workflow interviews with five pilot teams across Product, Design, Engineering, QA, and Release Management. We walked the real end-to-end feature lifecycle and found four systemic value leaks.

  1. The context-stitching tax. Product context lived in Confluence; tickets in Azure DevOps; strategy and long-range plans in SharePoint and Excel; test cases in TestRail; customer voice in Salesforce; code in GitHub. Before anyone could start work, they first reassembled context by hand, a quiet, daily drag on every role in the lifecycle.
  2. Key-person concentration. Critical product, architectural, and regulatory knowledge sat with a handful of senior people. When they were unavailable, work stalled. Throughput was capped by a small number of calendars.
  3. Translation-layer overhead. Strategy passed from PMs to Product Owners to Scrum Masters to Engineers, each handoff adding rework. Product strategy was spread thin, roughly 0.33 PM per pod, while an entire execution layer existed only to bridge that gap. Paid headcount, no direct customer value.
  4. QA and release bottlenecks. Around 52,000 test cases in TestRail (only about 30,000 of them current), automation trailing sprint development, and release night demanding manual re-testing of some 300 tickets. Risk was pushed to the end of every cycle, every cycle.

The constraint, in other words, was rarely engineering talent. It was that the knowledge needed to do the work (real, current, and somewhere in the building) was scattered, permission-bound, and slow to reach. Engineering was further constrained to Visual Studio with GitHub Copilot on the legacy codebase, and any AI had to honor strict PHI guardrails and ACL-based filtering.

03

Our approach: engineer the value, not just the process

Livehopper extends lean transformation into the AI era. We map the lifecycle as a value stream, find where scattered knowledge has become the flow constraint, and re-engineer exactly those points: first by grounding the knowledge, then by building agents and a leaner operating model on top of it.

The grounding came first. We installed JIG, Livehopper's grounded knowledge layer, so that every agent and every assistant answered from the organization's real, current, permission-safe knowledge, with citations, never guesswork. JIG ingests each source with its access lists, redacts PHI and secrets before anything is stored, and retrieves answers tagged with the permissions they came from, so a person only ever sees what they are already cleared to see. JIG is engine-agnostic; here, because the client's operational data and AI-native flows already ran on AWS and Amazon Bedrock, the business index under JIG was Amazon Q Business. With that foundation in place, the rest of the program had something true to build on. (The build and connector design behind that foundation is a subject of its own; this study stays on throughput.)
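To make the ingestion and retrieval order concrete, here is a minimal sketch of the pattern described above: scrub before storing, keep each source's access list with the content, and filter by the caller's permissions at query time. It is not JIG's or Amazon Q Business's actual implementation. The class and function names, the two regex patterns, and the keyword match standing in for semantic retrieval are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns standing in for a real PHI/secret scrubber.
REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED-SSN]"),
    (re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE), "[REDACTED-MRN]"),
]


@dataclass(frozen=True)
class Chunk:
    source: str                # where the text came from, kept for citation
    text: str                  # already-redacted content
    allowed_groups: frozenset  # access list carried over from the source


def redact(text: str) -> str:
    """Replace every match of the hypothetical patterns with a label."""
    for pattern, label in REDACTIONS:
        text = pattern.sub(label, text)
    return text


class GroundedIndex:
    """Toy in-memory index: redact on ingest, filter by ACL on retrieve."""

    def __init__(self):
        self._chunks = []

    def ingest(self, source, text, allowed_groups):
        # Redaction runs before anything is stored.
        self._chunks.append(Chunk(source, redact(text), frozenset(allowed_groups)))

    def retrieve(self, query, user_groups):
        terms = set(query.lower().split())
        user_groups = set(user_groups)
        hits = []
        for chunk in self._chunks:
            # Skip anything the caller is not already cleared to see.
            if not (chunk.allowed_groups & user_groups):
                continue
            # Naive keyword overlap stands in for semantic retrieval.
            if terms & set(chunk.text.lower().split()):
                hits.append(chunk)
        return hits
```

Each returned chunk carries its `source` and `allowed_groups`, which is what lets an answer cite where it came from and show the permissions it was retrieved under.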

Three principles governed the build.

  • Observe before automating. Nothing shipped until we had walked the real feature lifecycle with the people living it. Real workflows drove the design, not theoretical ones.
  • Metrics-gated rollout. Each wave scaled only when measurable outcomes, primarily Cargo Shipped, earned it. The board and the program team watched the same number.
  • Build for handoff from day one. Every agent prompt, connector, configuration, and runbook was engineered for the client's own owners to operate after we left. Zero ongoing vendor dependency was a design constraint, not an afterthought.

Phase plan

Phase | Duration | Value delivered
Phase 1: Discover | Weeks 1 to 4 | Baseline metrics established, lifecycle value leaks mapped, knowledge foundation and developer-experience platform assessed.
Phase 2: Build | Weeks 4 to 9 | Grounded knowledge foundation live; three Product agents in production; live metrics dashboard.
Phase 3: Pilot | Weeks 9 to 13 | QA and DevOps agents; AI-augmented SDLC and full enablement kit operating on two pilot teams.
Phase 4: Transfer | Weeks 13 to 16 | Technical transfer, operational handoff, Wave 1 rollout, impact report, plan for Waves 2 to 4.

The operating model rested on three instruments: DX Core 4 as the developer-experience and engineering-metrics platform, wired into Azure DevOps; a Five Core Metrics framework (Cargo Shipped, Feature Velocity, PR Throughput, Spec Coverage, Bug Escape Rate), each metric with a named executive co-owner; and a four-wave change-management rollout (2 pilot teams, then 5 to 8, then 15 to 20, then the remainder, around 50 teams in all), each wave gated on metrics and carried by internal champions.
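A metrics gate of this kind can be sketched in a few lines. The five metric names come from the framework above; the threshold values, the `Gate` structure, and the function names are hypothetical, chosen only to show how a wave might be held back until the numbers earn it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    metric: str
    min_change_pct: float        # required improvement vs. baseline, in percent
    higher_is_better: bool = True


# Hypothetical thresholds; the study does not publish the actual gate values.
GATES = [
    Gate("cargo_shipped", 10.0),
    Gate("feature_velocity", 0.0),
    Gate("pr_throughput", 0.0),
    Gate("spec_coverage", 0.0),
    Gate("bug_escape_rate", 0.0, higher_is_better=False),
]


def pct_change(baseline, current):
    """Percentage change from baseline to current."""
    return (current - baseline) / baseline * 100.0


def wave_may_proceed(baseline, current, gates=GATES):
    """Return (ok, failed_metrics) for one wave's gate review."""
    failures = []
    for gate in gates:
        change = pct_change(baseline[gate.metric], current[gate.metric])
        if not gate.higher_is_better:
            change = -change  # a drop counts as improvement for this metric
        if change < gate.min_change_pct:
            failures.append(gate.metric)
    return (not failures, failures)
```

Returning the list of failed metrics, rather than a bare boolean, gives each metric's co-owner something specific to act on before the next wave is reconsidered.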

04

Across the lifecycle: where the value was captured

Requirements and product definition: weeks of stitching, compressed to minutes

We replaced the multi-tool, multi-handoff intake with a structured chain of three Product Management agents, deployed into the client's Teams and Microsoft 365 environment. A Feature Intake Agent turns any new request into a one-page Decision Brief (Why, Who, What We Know, What Constrains Us, What We Don't Know). A PRD Writer Agent expands that brief into a full PRD with an explicit gap-review checkpoint and inline citations to source. A Prototyping Agent turns the PRD into an interactive design, collapsing the product, design, and engineering loop. Because all three answer from the grounded foundation, every PRD draws on the company's actual Confluence specs, ADO history, SharePoint strategy, and Salesforce customer voice, with clickable citations. Time-to-PRD fell sharply while traceability rose. Three PM agents were live in production within eight weeks, each validated through structured pilot evaluation.
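As an illustration of the Decision Brief and its gap-review checkpoint, the five sections can be modelled as a small data structure. This is a sketch under assumptions, not the agents' actual schema: the field names simply mirror the five headings, and the rule that empty sections block the handoff to the PRD stage while open questions are surfaced to a reviewer is a hypothetical reading of the checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionBrief:
    why: str = ""
    who: str = ""
    what_we_know: list = field(default_factory=list)
    what_constrains_us: list = field(default_factory=list)
    what_we_dont_know: list = field(default_factory=list)


# Sections that must be filled before a brief moves on (assumed rule).
REQUIRED = ("why", "who", "what_we_know", "what_constrains_us")


def gap_review(brief):
    """Summarise what a reviewer would need to resolve or acknowledge."""
    missing = [name for name in REQUIRED if not getattr(brief, name)]
    return {
        "missing_sections": missing,
        "open_questions": list(brief.what_we_dont_know),
    }


def ready_for_prd(brief):
    """A brief passes the checkpoint once no required section is empty."""
    return not gap_review(brief)["missing_sections"]
```

Keeping "What We Don't Know" as an explicit list means unknowns travel with the brief into the PRD instead of being lost at the handoff.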

Design: capacity doubled on the scarcest resource

Design was a hard structural constraint: fewer than five FTE designers for the whole organization. Rather than replace them, we kept designer ownership of mockups and used the Prototyping Agent plus an AI-enhanced workflow to compress iteration. The pilot design lead reported a 100 to 140% capacity increase, and the binding constraint moved off design entirely, a direct unlock of the most expensive bottleneck in the lifecycle.

Development: context in the editor, not the meeting

Once we confirmed engineering lived exclusively in Visual Studio with GitHub Copilot, we built the path that lets the editor query the grounded foundation in-context, and packaged the client's stored-procedure conventions, Classic ASP patterns, and code standards into reusable Copilot skills loaded automatically into the coding flow. An engineer can now ask, mid-task, "What are the architectural constraints on the auth module?" or "Look up story 54321 and tell me what tests exist for it," and get a cited answer drawn from Confluence, ADO, SharePoint, and TestRail without leaving the IDE. Context-switching collapses; ramp-up on the legacy stack collapses; institutional knowledge becomes a query rather than a meeting.

Quality assurance: cost taken out of release night

A dedicated QA agent lead ran a deep-dive with the client's QA lead on test-case authoring, release-night re-testing, and the TestRail/ADO interaction. The Phase 3 QA agents target the two highest-leverage cost areas: generating test cases from user stories and identifying regressions automatically. A Release Cargo Cleanup Agent takes over the monthly manual cleanup of stale Target-Release fields on cloned tickets, returning senior-QA time to higher-value work. When the client pulled QA agents forward by a full phase to hit a non-negotiable sprint date, we onboarded the lead, ran an accelerated deep-dive, and shipped a thin-but-functional QA agent in time to refine it live during the sprint. A schedule shock became a real validation cycle.
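The stale-field cleanup lends itself to a small sketch. The study does not define "stale", so the rule below is an assumption: a cloned ticket that is still open but whose Target-Release names a release that has already shipped. The `Ticket` fields and the `"Closed"` state value are hypothetical and do not reflect the client's Azure DevOps schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ticket:
    id: int
    state: str
    target_release: Optional[str]
    cloned_from: Optional[int] = None  # id of the original ticket, if a clone


def stale_target_releases(tickets, shipped_releases):
    """Return open clones whose Target-Release points at a shipped release."""
    shipped = set(shipped_releases)
    return [
        t for t in tickets
        if t.cloned_from is not None       # only clones inherit the field
        and t.state != "Closed"            # still open, so the value is wrong
        and t.target_release in shipped    # release has already gone out
    ]
```

A function like this only produces a review list; whether the agent clears the field itself or proposes the change to a QA lead is a separate design decision.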

Deployment and operations: durability without dependency

Phase 3 DevOps agents target CI/CD support and automated documentation and runbook generation. To make the model last, we packaged every connector and configuration with operational runbooks, set 30- and 60-day guardrail-review cadences, and built a dedicated module of the Training Enablement Kit for the internal technical team, covering index administration, connector monitoring, persona and PHI-guardrail management, and ongoing maintenance.

05

Organizational design: efficiency without firing anyone

The new tooling pays an organizational dividend, captured in an AI-Native Team Structure recommendation.

  • Translation layer collapsed. The Product Owner role is removed; one dedicated PM per pod operates the PRD and ticket-generation agents.
  • Pod size compressed. From 8 to 12 members down to about 7.
  • Engineering core compressed. From 4 to 5 engineers down to 3, backed by AI coding assistants.
  • Management spans widened. Engineering Managers cover 3 pods, up from 2; QA Leads cover 4, up from 3.

The result is a smaller, denser, more autonomous unit of delivery, and a cost-per-feature ratio that improves and then compounds across 17+ pods.
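The span figures above reduce to simple arithmetic. The sketch below works it through for illustration only: it takes 17 pods as a floor for "17+", assumes one manager or lead covers exactly the stated number of pods, and rounds up. The resulting counts are not staffing figures reported by the client.

```python
import math


def span_widening_pct(before, after):
    """Percentage increase in pods covered per manager or lead."""
    return (after - before) / before * 100.0


def leads_needed(pods, span):
    """Whole number of leads needed to cover `pods` at a given span."""
    return math.ceil(pods / span)


PODS = 17  # the study says "17+"; 17 is used here as a floor

em_widening = span_widening_pct(2, 3)  # Engineering Managers: 50.0
qa_widening = span_widening_pct(3, 4)  # QA Leads: about 33.3

em_before, em_after = leads_needed(PODS, 2), leads_needed(PODS, 3)  # 9 -> 6
qa_before, qa_after = leads_needed(PODS, 3), leads_needed(PODS, 4)  # 6 -> 5
```

The effect compounds with pod count: the more pods there are, the more coverage each one-pod increase in span frees up.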

06

Value created

Baselines were set in Phase 1 and improvement measured against them in pilot. The value falls into five categories.

Throughput. Baselines existed for the first time: PR Throughput at 1.2 PRs per engineer per week, Issue Cycle Time at 4.7 days against industry P75 references. That gave the client a defensible benchmark to manage against. A 50% rise in Cargo Shipped on the pilot teams turned the PE growth mandate from aspiration into measured fact, alongside the 100 to 140% design-capacity uplift and three PM agents live inside eight weeks.

Cost. Removing the Product Owner role freed recurring headcount for redeployment into direct-value work. Smaller pods and wider management spans compound into real cost-per-pod efficiency across 17+ pods. Release-night re-testing (around 300 tickets) and stale-ticket cleanup were put on a path to automation. And six knowledge sources collapsed into one query surface, removing the cognitive and onboarding overhead of six fragmented systems.

Risk. Key-person risk fell: critical product, architectural, and regulatory knowledge is now queryable by any cleared user, so institutional memory is no longer a single point of failure. Compliance posture strengthened: PHI guardrails, ACL filtering, Entra ID federation, persona-scoped responses, and pre-ingestion data scrubbing were first-class constraints, not bolt-ons.

Strategic. A repeatable rollout playbook, four waves, around 45 teams beyond pilot, an internal-champion model, and a full enablement kit transferred to the client's training team. A shared cross-functional vocabulary for the lifecycle, anchored by a C-suite alignment session. And a governance model with named executive co-owners for every metric, so the value keeps compounding after we step back.

Future optionality. An AI-native operating model that scales sideways as the client launches products, acquires teams, or absorbs PE-driven integrations, on a modern AI and data foundation that other use cases, from analytics to customer support, can build on without re-platforming.

07

Why this matters for prospects

This is a clean proof of what Livehopper does best: turn a sprawling, multi-team, regulated SDLC into a measurably more valuable delivery engine, without ripping out the toolchain, without compromising compliance, and without leaving the client dependent on us.

  1. We monetize the lifecycle. Every phase ties to a value lever: throughput, cost, risk, strategic, optionality. Process change is the mechanism; value is the outcome.
  2. We do the work before we build the tools. Six-plus hours of senior-led workflow discovery drove every agent and every architectural decision.
  3. We instrument before we scale. Five Core Metrics, DX Core 4, executive co-owners, and a metrics-gated rollout meant no wave moved on hope. The board saw the number we saw.
  4. We respect regulated environments. PHI guardrails, ACL filtering, Entra ID federation, persona-scoped responses, and pre-ingestion scrubbing were design constraints from day one. Compliance became a moat, not a tax.
  5. We transfer the keys, and the value compounds. Code, configs, runbooks, prompt playbooks, a six-module training kit, and a train-the-trainer model, engineered so the client's own champions lead the next 45+ teams themselves.

For any healthcare, regulated-SaaS, or legacy-stack-heavy organization that needs materially higher engineering output without materially higher headcount, and needs the numbers to prove it, this is the playbook.
