buchecha

Stop watching a single agent drift through a long spec. bcc splits the job across four roles that plan, brief, execute, and review under audit.

Status: early development, not stable.

The problem

Driving a single agent through a long Markdown spec with claude -p or codex breaks in predictable ways:

Focus drifts halfway through. The agent forgets phase boundaries and acceptance criteria.
It declares done early. One success message ends the session before the spec is actually covered.
Scope creep. A small fix sprawls into adjacent files the spec never mentioned.
Retries redo approved work. There is no granularity below "rerun the whole thing".
No audit trail. When something goes wrong, you have a transcript, not a record of decisions.

How bcc solves it

bcc replaces the single-loop pattern with four cognitive roles, each with its own context, coordinated through an in-process MCP server. The Planner produces a typed DAG of phases and tasks. The Briefer picks the next sub-DAG. The Executor edits the working tree for that scope only. The Reviewer audits per-task outcomes and decides what advances, retries, or escalates. Every role call is logged.

flowchart LR
    S[Markdown spec] --> P[Planner]
    P -->|plan_emit| M[(MCP server)]
    M --> B[Briefer]
    B -->|briefing_emit| M
    M --> E[Executor]
    E -->|task_progress<br/>iteration_finished| M
    M --> R[Reviewer]
    R -->|task_approved<br/>task_needs_fix| M
    M -.audit log.-> J[".bcc/sessions/&lt;id&gt;/mcp-log"]

Demo: Web UI

Quickstart in 60 seconds

Prerequisites: the bcc binary on $PATH (see Install) and an agent CLI (claude, codex, or gemini) reachable on $PATH.

# 1. scaffold .bcc.toml in the current project
# (optional if you want to override any settings)
bcc init

# 2. write a minimal spec
cat > spec.md <<'EOF'
# Add a /healthz endpoint

The HTTP server must expose `GET /healthz` returning `{"ok": true}`.

## Acceptance
- New route registered in the router.
- A test covers the 200 response with the expected body.
EOF

# 3. run it (live TUI)
bcc run spec.md

# 4. inspect what happened
bcc sessions list
bcc sessions show <id>

Use cases

Long refactors guided by a spec. Break a multi-file migration into phases the Planner sequences, retry only the tasks the Reviewer rejects.
Retroactive test coverage. Hand bcc a spec describing the surface to cover, let the Executor add cases per task while the Reviewer enforces the contract.
Multi-phase architectural changes. Schema migrations, port-and-adapter extractions, transport swaps. Phases that depend on each other land in order with verifiable boundaries.
Large feature prototyping. Describe the target state once, let the loop converge instead of hand-holding a single agent through 20 prompts.

Features

Four-role pipeline (Planner, Briefer, Executor, Reviewer) coordinated through an in-process MCP server with per-connection authorization.
Flexible input: a Markdown spec, a free-form --prompt/-p directive, or both composed together. The spec argument is optional when --prompt is set.
Typed, executable DAG with cycle validation; retries target only needs_fix tasks, never redo approved work.
Multi-vendor agnostic: adapters for claude, codex, and gemini; --planner and --provider pin roles to one vendor.
Multiple observability surfaces in a single run: live TUI (default), embedded Web UI (--webui), HTTP API + OpenAPI (--api), headless text or JSON (--output).
Persistent, resumable sessions under .bcc/sessions/<id>/ with manifest, plan, DAG, briefings, MCP log, and event stream; --resume[=<id>] retakes a run.
Target-project agnostic: embedded prompts stay neutral, the Executor picks the toolchain (build, lint, test) from the project under work.
Interactive escalation when the Reviewer rejects or the retry budget runs out: resume_with_hint, force_approve, skip, or abort. Every choice lands in the MCP log.
Full journal and audit: every MCP method call (role, input, result) is recorded for regression and debugging.

Why not just run `claude -p` directly?

Challenge	Single-agent shell wrapper	buchecha
Focus across a long spec	One context window, drifts past phase 2	Per-role contexts, Planner DAG enforces phase boundaries
Premature `done`	Agent decides when it is finished	Reviewer is the gate, verdict per task
Retry granularity	Rerun the whole spec	Retry only tasks the Reviewer marked `needs_fix`
Auditability	Terminal transcript	Per-call MCP log, persisted session, event stream
Observability	stdout	TUI, Web UI, HTTP API, JSON output, all in one run
Vendor lock	Tied to one CLI	`claude`, `codex`, and `gemini` adapters, pinned per role

Output and observability

# Live TUI dashboard (default)
bcc run spec.md

# Free-form directive instead of a spec file
bcc run -p "add a /healthz endpoint returning {\"ok\": true} with a test"

# Compose spec and prompt: the directive layers on top of the spec
bcc run -p "focus only on phase 2 for now" spec.md

# Headless text: progress to stderr, result to stdout
bcc run --output text spec.md

# Structured result, scriptable
bcc run --output json spec.md

# Embedded Web UI: SSE timeline, DAG view, MCP log
bcc run --webui spec.md      # serve only
bcc run -W spec.md           # serve and open the browser

# HTTP API + OpenAPI document at /api/v1/openapi.json
bcc run --api spec.md

--webui and --api share one listener. The Web UI implies the API. The dashboard is read-only in V1; mutations go through the TUI or the MCP log.

Configuration

bcc init writes a .bcc.toml. Key fields:

Field	Purpose
`project.language`	Localized vocabulary (`en`, `pt-BR`). The wire protocol stays canonical English.
`providers.<name>`	Vendor adapter config (binary path, flags, sandbox mode).
`roles.planner` / `roles.briefer` / `roles.executor` / `roles.reviewer`	Per-role capability menus and provider selection.
`journal.store`	Where to keep session artifacts (defaults to `.bcc/sessions/`).
`loop.max_iterations`	Per-task retry budget before the loop escalates.
`git.branch_prefix`	Prefix applied when the loop creates a working branch.

Pin a vendor at the CLI level with --provider claude (all roles) or --planner codex (one role). See docs/guides/director.md (pt-BR) for the full reference: sessions, MCP method surface, escalation flows, troubleshooting.

Install

# macOS or Linux Homebrew
brew install fgmacedo/tap/buchecha

# Linux: detect OS/arch, verify checksum, install to /usr/local/bin
curl -sSL https://raw.githubusercontent.com/fgmacedo/buchecha/main/scripts/install.sh | sh

# Debian/Ubuntu: download the .deb from the latest release and install
curl -LO https://github.com/fgmacedo/buchecha/releases/latest/download/bcc_<version>_linux_amd64.deb
sudo apt install ./bcc_<version>_linux_amd64.deb

# Any OS with a Go toolchain
go install github.com/fgmacedo/buchecha/cmd/bcc@latest

When installed via tar.gz or install.sh on macOS, Gatekeeper may quarantine the binary on first run. Either install via brew (which avoids quarantine), or clear the attribute manually: xattr -d com.apple.quarantine $(command -v bcc).

Deeper documentation

CLAUDE.md: architecture, layer boundaries, conventions, and the assistant contract.
docs/specs/director/index.md: current state of the project and pointers to the normative specs.
docs/guides/director.md (pt-BR): operator reference for bcc run, sessions, MCP method surface, escalation, and troubleshooting.
docs/how-to/event-stream-troubleshooting.md: end-to-end SSE debugging and the anti-drift contract.
docs/how-to/webui-dev-loop.md: develop the Web UI against a replayed session, no agents required.

Contributing

Prerequisites are pinned in .mise.toml. Install mise and let it manage them, or match versions yourself.

mise install            # Go 1.25.x and Node 22.x at pinned versions
git clone https://github.com/fgmacedo/buchecha.git
cd buchecha
go mod download
make build              # produces ./bcc with the embedded SPA bundle
./bcc --help

Inner loop:

make test               # unit suite
make test-race          # race detector; required before commits touching concurrent code
make vet                # go vet ./...
make lint               # golangci-lint
make fmt                # gofmt -w .

Pre-commit hooks run the same gates locally. Install once:

make precommit-install

See CLAUDE.md and AGENTS.md for layer rules, naming, error conventions, and the testing model. Read those before editing any package.

Roadmap

The current architecture is captured in docs/specs/director/index.md. Open enhancements live as GitHub issues tagged enhancement.

License

MIT. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 243 Commits
.claude		.claude
.github/workflows		.github/workflows
cmd/bcc		cmd/bcc
docs		docs
internal		internal
scripts		scripts
testdata		testdata
.bcc.smoke.toml		.bcc.smoke.toml
.gitignore		.gitignore
.golangci.yml		.golangci.yml
.goreleaser.yaml		.goreleaser.yaml
.mise.toml		.mise.toml
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
go.mod		go.mod
go.sum		go.sum

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

buchecha

The problem

How bcc solves it

Demo: Web UI

Quickstart in 60 seconds

Use cases

Features

Why not just run `claude -p` directly?

Output and observability

Configuration

Install

Deeper documentation

Contributing

Roadmap

License

About

Uh oh!

Releases 8

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

buchecha

The problem

How bcc solves it

Demo: Web UI

Quickstart in 60 seconds

Use cases

Features

Why not just run claude -p directly?

Output and observability

Configuration

Install

Deeper documentation

Contributing

Roadmap

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 8

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Why not just run `claude -p` directly?

Packages