Outcome-based, milestone-gated. We agree on a success number before week one. You pay when we hit it. Risk sits with DayTwoAI.

03 · Gen AI Co-Development

Your engineers already use AI.
You just can't see what it's costing — or earning.

Vibe coding is already inside your org. The question is whether it shows up as a measured edge or a runaway invoice. We turn AI co-development into a governed practice — visible cost, repeatable process, the right tool for the right job — in 8–12 weeks.

Start a conversation · See the cost shape

12×

Cheaper for the same work — base levers alone

400×

Cost spread between autocomplete and autonomous

8 axes

Readiness score — most enterprise repos land 50–70

Zero

Lock-in to any single AI vendor or tool

The problem

The bill is non-linear. The savings are uneven. The blast radius is real.

Three things every engineering leader we've worked with discovers within ninety days of taking AI co-dev seriously.

The bill is non-linear

Autocomplete is ~$25 / dev / week. Autonomous agentic work is ~$10,000 / dev / week, a 400× spread. A single supervised agentic run that costs $0.64 in the best case can extrapolate to $430k / year at 50 engineers if it runs uncontrolled. The 5-minute cache TTL on most providers makes it worse: context that falls out of cache is billed again at the full rate, so the bill compounds fast.

Cheap tokens fix half the bill

Most teams pay the premium rate by default. Base levers — longer cache TTL, smart routing between models, a visible cost status line — typically deliver 12× cheaper output for the same work. Advanced levers add another 20–40%. The savings are real and immediate, but only if someone is measuring.

Pockets of value exist — and pockets of risk

Some repos respond well to AI co-dev. Others won't — and pushing AI into them creates rework, regressions, and review fatigue. Without a readiness score, you're betting blind on which teams to fund and which to slow down.

Shadow tools are already inside the building

Engineers don't wait for procurement. By the time you're running the RFP, four IDE plugins, two CLI agents, and one personal API key are already in your codebase. The question isn't whether to allow it — it's how to make it safe, measurable, and repeatable.

The approach

Four modes of AI co-development. A 400× spread in cost. One coherent practice.

Most teams talk about "using AI for coding" as if it were one thing. It isn't. The cost, the value, and the risk are wildly different across four distinct modes — and the trick is matching the mode to the work, not standardising on one.

Mode

Shape, and what it costs

Autocomplete

Editor suggests the next line. Cheapest mode. ≈ $25 / dev / week.

Turn-taking

Chat in the IDE. Engineer prompts, AI replies, engineer pastes. ≈ $200 / dev / week.

Supervised agentic

AI runs a small task end-to-end while the engineer reviews. ≈ $1,200 / dev / week.

Autonomous

AI takes a ticket and ships a PR with minimal hand-holding. ≈ $10,000 / dev / week. Left uncontrolled, agentic runs can extrapolate to $430k / yr at 50 engineers.
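To make the spread concrete, here is a minimal sketch that turns the table above into multiples of the cheapest mode and a weekly team cost. The per-mode figures come from the table; the 50-engineer headcount split (`EXAMPLE_MIX`) and the function names are assumptions made up for illustration, not data from an engagement.

```python
# Weekly cost per developer for each mode, in USD, copied from the table above.
WEEKLY_COST_PER_DEV = {
    "autocomplete": 25,
    "turn-taking": 200,
    "supervised agentic": 1_200,
    "autonomous": 10_000,
}

# Hypothetical 50-engineer team; this split is invented for illustration.
EXAMPLE_MIX = {
    "autocomplete": 30,
    "turn-taking": 15,
    "supervised agentic": 4,
    "autonomous": 1,
}


def cost_multiples(costs: dict[str, float]) -> dict[str, float]:
    """Return each mode's cost as a multiple of the cheapest mode."""
    cheapest = min(costs.values())
    return {mode: cost / cheapest for mode, cost in costs.items()}


def weekly_team_cost(headcount_by_mode: dict[str, int],
                     costs: dict[str, float] = WEEKLY_COST_PER_DEV) -> float:
    """Sum the weekly cost of a team given how many engineers work in each mode."""
    return sum(costs[mode] * count for mode, count in headcount_by_mode.items())


if __name__ == "__main__":
    for mode, multiple in cost_multiples(WEEKLY_COST_PER_DEV).items():
        print(f"{mode:<20} {multiple:>5.0f}x")
    print(f"example mix: ${weekly_team_cost(EXAMPLE_MIX):,} / week")
```

In this invented mix, the single engineer in autonomous mode costs more per week than the 45 engineers on autocomplete and turn-taking combined, which is why the mode matters more than the headcount.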

Three tiers of cost levers apply on top: base (cache TTL, smart router, status line — 12× cheaper), advanced (Redis memory, auto-prune, loop guard — 20–40% more), and experimental (checkpoint summaries, think cap — 50–70%, pilot only).
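A rough sketch of how the first two lever tiers might stack. It assumes "12× cheaper" means dividing the baseline by 12, and that "20–40% more" means a further 20–40% reduction applied to the post-base cost. Both readings are assumptions; `apply_levers` is a hypothetical helper, and experimental levers are left out because they are pilot-only.

```python
def apply_levers(baseline: float,
                 base_factor: float = 12.0,
                 advanced_saving: float = 0.0) -> float:
    """Estimate cost after base levers (a divisor) and advanced levers (a fractional cut).

    base_factor:      how many times cheaper the base levers make the work (assumed 12).
    advanced_saving:  further fractional reduction on the post-base cost (assumed 0.2 to 0.4).
    """
    if base_factor <= 0:
        raise ValueError("base_factor must be positive")
    if not 0.0 <= advanced_saving < 1.0:
        raise ValueError("advanced_saving must be in [0, 1)")
    return baseline / base_factor * (1.0 - advanced_saving)


if __name__ == "__main__":
    baseline = 1_200.0  # supervised agentic, USD / dev / week, from the table
    print(f"base levers only:     ${apply_levers(baseline):,.2f}")
    print(f"plus advanced at 20%: ${apply_levers(baseline, advanced_saving=0.2):,.2f}")
    print(f"plus advanced at 40%: ${apply_levers(baseline, advanced_saving=0.4):,.2f}")
```

Under these assumptions most of the saving comes from the base tier, which is the argument for pulling those levers first.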

How we're different

We're integrators. We blend into your engineering rituals instead of sitting on top of them.

Within your existing engineering culture, we connect the dots — find the trim tab, turn the ship a few degrees. No parallel process, no new approval chain, no AI platform you license forever.

No vendor lock-in

Copilot, Claude Code, Cursor, Windsurf, your own gateway — we're tool-agnostic. The governance and cost controls live in your stack, not behind someone else's license. Swap the tool, keep the practice.

Best tool for the engineer

Different work needs different tools. Frontend prototyping and refactoring legacy services don't want the same model or the same harness. We help you match the tool to the task instead of standardising on one for the wrong reasons.

Leverage what your team already runs

Your IDE, your CI, your code-review rituals, your security review — we blend AI co-dev into what's already working. No parallel process, no new approval chain, no rip-and-replace of the engineering org.

Integrator, not platform vendor

We don't sell a co-dev platform you have to license forever. We assemble the practice from open building blocks — your IDE plugins, your gateway, your dashboards, your governance — and we walk away when it's running clean.

How we deliver

From shadow tools to measured practice in 8–12 weeks.

01 · Wk 1–2

Listen

Sit with the engineers who are actually using AI today. Find the shadow workflows, the favourite prompts, the tools that quietly entered the stack. No surveys. No vendor pitches. Just listening.

02 · Wk 2–4

Map & baseline

Build the picture you don't have today: which teams use which tools, what they cost, where they help, where they hurt. One spreadsheet — trust on one tab, cost on the other.

03 · Wk 3–5

Score the repos

Run the 8-axis readiness score against the codebases that matter — tests, docs, deps, bus factor, build, complexity, CI, structure. Most enterprise repos land 50–70 / 100. That number tells you where AI works and where it won't.

04 · Wk 4–8

Pull the levers

Apply the base levers first: 1-hour cache TTL, smart router, status line. That is often 12× cheaper for the same work. Then add the advanced levers (Redis memory, auto-prune, loop guard) for another 20–40%. Experimental levers stay pilot-only.

05 · Wk 6–10

Blend into rituals

AI co-dev gets folded into your existing engineering rituals (stand-ups, code review, retros) instead of becoming a parallel process. Hackathons and learnathons spread the patterns that worked.

06 · Wk 10–12 + 12 mo

Hand off

We leave behind the playbook, the dashboards, the governance, and the people who own them. Then we stay on for twelve months while it sets.

Tools we built

Two tools that make AI co-development measurable and safe.

Prefex

API spend control. 40–70% savings, one config change.

• Lightweight proxy between your AI tools and the API, so savings start immediately.
• Automatically routes simple requests to cheaper models; most sessions are 60–80% simple questions.
• Manages prompt caching so repeated context costs 10% instead of 100%.
• Detects and kills runaway agent loops before they generate surprise bills.
• Shows spend, cache efficiency, and projected cost inline in your terminal, when it matters, not in last month's report.
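Two of the ideas in the list above, cached-context pricing and loop detection, can be sketched in a few lines. This is not Prefex's implementation. `input_cost`, `LoopGuard`, the repeat threshold, and the window size are hypothetical choices, and the only figure taken from the list is the 10% rate for repeated context.

```python
from collections import deque

# Repeated context billed at 10% of the normal rate (figure from the list above).
CACHE_READ_RATE = 0.10


def input_cost(total_tokens: int, cached_tokens: int, price_per_token: float) -> float:
    """Input cost when `cached_tokens` of the prompt are served from cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    fresh_tokens = total_tokens - cached_tokens
    return (fresh_tokens + cached_tokens * CACHE_READ_RATE) * price_per_token


class LoopGuard:
    """Trips when the same action appears `max_repeats` times in the last `window` calls."""

    def __init__(self, max_repeats: int = 3, window: int = 10) -> None:
        self.max_repeats = max_repeats
        self.recent: deque[str] = deque(maxlen=window)

    def should_stop(self, action: str) -> bool:
        """Record an action and report whether the agent looks stuck."""
        self.recent.append(action)
        return self.recent.count(action) >= self.max_repeats


if __name__ == "__main__":
    # 100k-token prompt, 90k of it repeated context, at a notional price of 1 unit per token.
    print(input_cost(100_000, 90_000, 1.0), "vs uncached", input_cost(100_000, 0, 1.0))

    guard = LoopGuard(max_repeats=3)
    for step in ["read config.yaml", "read config.yaml", "read config.yaml"]:
        if guard.should_stop(step):
            print("loop detected, stopping agent")
```

A real guard would likely compare normalised tool calls rather than raw strings; exact matching keeps the sketch short.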

ReadyBase

Repo readiness. Know the blast radius before you edit.

• Deterministic scan: no LLM calls, no opinions. Signals from git history and code structure.
• Maps blast radius so you see every downstream dependency before editing a file.
• Surfaces co-change pairs (files that always move together) so AI tools don't introduce silent drift.
• Flags bus factor risk on files where one person holds all the knowledge.
• Surfaces everything in-editor through MCP, so AI and engineer both know the stakes before the edit.
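As an illustration of the kind of deterministic signal described above, the sketch below derives co-change pairs and bus-factor flags from commit data that has already been extracted from git history. It is a simplified stand-in, not ReadyBase's algorithm: the input shapes, the `min_count` cut-off, and the 80% ownership threshold are all assumptions.

```python
from collections import Counter
from itertools import combinations


def co_change_pairs(commits: list[set[str]],
                    min_count: int = 2) -> dict[tuple[str, str], int]:
    """Count how often each pair of files appears in the same commit."""
    counts: Counter = Counter()
    for files in commits:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for pair in combinations(sorted(files), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


def bus_factor_risks(commits_by_author: dict[str, Counter],
                     threshold: float = 0.8) -> list[str]:
    """Return files where one author made at least `threshold` of the commits."""
    risky = []
    for path, authors in commits_by_author.items():
        total = sum(authors.values())
        if total and max(authors.values()) / total >= threshold:
            risky.append(path)
    return sorted(risky)


if __name__ == "__main__":
    history = [
        {"api/schema.py", "api/client.py"},
        {"api/schema.py", "api/client.py", "README.md"},
        {"README.md"},
    ]
    print(co_change_pairs(history))

    ownership = {
        "billing/ledger.py": Counter(alice=9, bob=1),
        "api/client.py": Counter(alice=2, bob=2),
    }
    print(bus_factor_risks(ownership))
```

Because both functions are pure counting over git metadata, the same input always gives the same output, which is the property the "no LLM calls" bullet is pointing at.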

What changes

A measured AI engineering practice. Cost you can see. Outcomes you can name.

Every engagement has a success number signed before week one. Milestone-gated payment. You pay for results, not for hours.

A measurable cost profile — and savings

Two living artifacts: the infrastructure map (who's using what, where), and the budget (what it's costing, by team, by mode). Refreshed automatically. The savings show up in the same view.

A process outline blended into your rituals

Stand-ups, code review, retros — AI co-dev folded into what your team already does, not parked next to it. Playbooks live in your repo, not in our deck.

Three to five projects in motion

By the time we leave, three to five projects are in motion, and two or three high-readiness repos are already running AI-augmented workflows with measurable lift. The pattern is documented; the next teams self-serve.

Readiness score · 8 axes

Score each repo before you fund AI work in it. Most enterprise repos land 50–70 / 100 — which tells you where AI works today and where to invest in the foundation first.

Tests

Coverage, speed, reliability. The AI is only as safe as the tests it writes against.

Docs

READMEs and ADRs the model can actually read. Tribal knowledge is invisible to AI.

Deps

Out-of-date dependencies break the AI's mental model of the API surface.

Bus factor

If only one human understands a module, the AI inherits that single point of failure.

Build

Reproducible builds. The AI needs to verify its own work locally before opening a PR.

Complexity

Cyclomatic complexity, function length, file size. High numbers = bad AI suggestions.

CI

Fast, clear signal. The AI loop is bounded by how quickly CI tells it "wrong, try again."

Structure

Module boundaries. The AI works best inside small, well-named, well-bounded units.
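One simple way to roll the eight axes into a single number out of 100 is an unweighted average, sketched below. The equal weighting, the 0–100 scale per axis, and the names `readiness_score` and `weakest_axes` are assumptions for illustration; the scoring method used in practice may weight axes differently.

```python
AXES = ("tests", "docs", "deps", "bus_factor", "build", "complexity", "ci", "structure")


def readiness_score(axis_scores: dict[str, float]) -> float:
    """Average the eight axis scores (each 0-100) into one readiness score."""
    missing = set(AXES) - axis_scores.keys()
    if missing:
        raise ValueError(f"missing axes: {sorted(missing)}")
    for axis in AXES:
        if not 0 <= axis_scores[axis] <= 100:
            raise ValueError(f"{axis} must be between 0 and 100")
    return sum(axis_scores[axis] for axis in AXES) / len(AXES)


def weakest_axes(axis_scores: dict[str, float], n: int = 2) -> list[str]:
    """Return the n lowest-scoring axes, i.e. where to fix the foundation first."""
    return sorted(AXES, key=lambda axis: axis_scores[axis])[:n]


if __name__ == "__main__":
    # Invented scores for a hypothetical repo.
    repo = {
        "tests": 40, "docs": 55, "deps": 70, "bus_factor": 50,
        "build": 80, "complexity": 45, "ci": 75, "structure": 65,
    }
    print(f"readiness: {readiness_score(repo):.0f} / 100")
    print("fix first:", weakest_axes(repo))
```

The aggregate tells you whether to fund AI work in the repo; the lowest axes tell you what to fix before you do.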

Common questions

What engineering leaders ask first.

We already standardised on one AI coding tool. Does this still apply?

Yes. Standardising on one tool gets you one mode of AI co-dev. The cost levers, the readiness score, the rituals blend — all of that still applies on top. And if the tool stops being the right fit in eighteen months, you're not stuck.

How do you handle code and IP leaving the building?

We work inside your gateway and your data-residency boundary. Whichever tools you use route through your control plane — your prompt logs, your retention policies, your masking rules. Nothing about the practice requires sending source code to a vendor you haven't already approved.

Our repos are messy. Will AI even help us?

Some will, some won't. That's exactly what the readiness score is for — it tells you which repos to invest in now and which to fix the foundation in first. We'd rather you skip a repo than burn engineering trust pushing AI into it.

Who owns the dashboards, the playbook, the rituals?

You do, all of it. The cost dashboards run in your observability stack. The playbook lives in your repo. The rituals belong to your engineering org. We don't resell your usage data, hold dashboards hostage behind a license, or charge per-seat for the practice.

How is this priced?

Outcome-based, milestone-gated. We agree on a success number — cost reduction, throughput, repos with measurable lift — before week one. You pay when we hit it. Twelve months of operating support is included.

Start with one team

Pick the engineering team you'd trust to pilot this. We'll show you the cost shape in two weeks.

One team, one repo, one signed success number. That's the on-ramp.

Start a conversation · Read the field notes
