Pipeline
Pipeline Overview
Section titled “Pipeline Overview”Karajan orchestrates 15 specialized roles through a three-phase pipeline. Each role defines what to do; you choose which AI agent (Claude, Codex, Gemini, Aider, OpenCode) does it.
intent? → discover? → hu-reviewer? → triage → researcher? → architect? → planner? → [coder → refactorer? → guards → TDD → sonar? → impeccable? → reviewer] → tester? → security? → audit? → commiter? └─── iteration loop (1..N) ──────────────────────────────────────┘| Role | Description | Default |
|---|---|---|
| discover | Pre-execution gap detection — analyzes tasks for missing info, ambiguities, and assumptions | Off |
| triage | Pipeline director — analyzes task complexity, classifies taskType, and activates roles dynamically | On |
| researcher | Investigates codebase context before planning | Off |
| architect | Designs solution architecture — layers, patterns, data model, API contracts, tradeoffs | Off |
| planner | Generates structured implementation plans informed by architecture | Off |
| coder | Writes code and tests following TDD | Always on |
| refactorer | Improves code clarity without changing behavior | Off |
| sonar | Runs SonarQube static analysis and quality gates | On (if configured) |
| impeccable | Automated UI/UX design audit — accessibility, performance, theming, responsive, anti-patterns. Includes WebPerf Quality Gate (v1.45.0): Core Web Vitals (LCP, CLS, INP) via Chrome DevTools MCP + WebPerf Snippets, with configurable thresholds | Off |
| reviewer | Code review with configurable strictness profiles | Always on |
| tester | Test quality gate and coverage verification | On |
| security | OWASP security audit | On |
| solomon | Pipeline Boss & Arbiter — evaluates every reviewer rejection, classifies issues, can override style-only blocks. 6 rules including smart iteration control | On |
| commiter | Git commit, push, and PR automation after approval | Off |
| hu-reviewer | HU story certification — evaluates user stories across 6 quality dimensions, detects 7 antipatterns, rewrites weak stories, supports dependency graphs | Auto (medium/complex) |
| audit | Mandatory post-approval quality gate — runs after reviewer+tester+security pass. Checks generated code for critical/high issues; if found, loops coder back to fix. If clean, pipeline is CERTIFIED | On (v1.32.0+) |
Roles marked with ? are optional and can be enabled per-run or via config.
Deterministic Guards
Section titled “Deterministic Guards”Guards are regex/pattern-based checks that run between coder and quality gates — no LLM cost, 100% deterministic:
| Guard | What it checks | Default |
|---|---|---|
| output-guard | Destructive operations (rm -rf, DROP TABLE, git push —force), exposed credentials (15 patterns: AWS keys, private keys, GitHub/npm/PyPI/Slack tokens, Google OAuth, JWTs, generic secrets), protected file modifications (.env, serviceAccountKey.json). All 15 credential patterns are blocking — secrets never pass (v1.38.2) | On (blocks on critical) |
| perf-guard | Frontend performance anti-patterns — images without dimensions/lazy, render-blocking scripts, missing font-display, document.write, heavy deps (moment, lodash, jquery) | On (advisory, configurable to block) |
| intent-classifier | Pre-triage keyword classification — skips LLM triage for obvious task types (doc, add-tests, refactor, infra, trivial-fix) | Off |
| injection-guard | Prompt injection scanner — detects directive overrides (“ignore previous instructions”), invisible Unicode characters (zero-width spaces, bidi overrides), and oversized comment block payloads in diffs before passing them to AI reviewers. Also available as a GitHub Action | On |
Guards are configurable via kj.config.yml:
guards: output: enabled: true on_violation: block # block | warn | skip # patterns: [{ id: "custom", pattern: "DANGEROUS", severity: "critical", message: "..." }] # protected_files: [secrets.yml] perf: enabled: true block_on_warning: false # patterns: [{ id: "custom-perf", pattern: "eval\\(", severity: "warning" }] intent: enabled: false confidence_threshold: 0.85 # patterns: [{ keywords: ["readme", "docs"], taskType: "doc", confidence: 0.9 }]Pipeline Stage Tracker
Section titled “Pipeline Stage Tracker”During kj_run, Karajan emits a cumulative pipeline:tracker event after every stage transition. This gives MCP hosts (Claude Code, Codex, etc.) a single event with the full state of all stages:
┌ Pipeline │ ✓ triage → medium │ ✓ planner → 5 steps │ ▶ coder (claude/sonnet) │ · sonar │ · reviewer └Status icons: ✓ done, ▶ running, · pending, ✗ failed.
Each stage includes an optional summary — the provider name while running, or a result summary when done.
For single-agent tools (kj_code, kj_review, kj_plan), tracker start/end logs are also emitted so hosts can show which agent is active.
Solomon — Pipeline Boss (v1.11.0+, enhanced v1.25.0)
Section titled “Solomon — Pipeline Boss (v1.11.0+, enhanced v1.25.0)”Solomon runs after each iteration as the Pipeline Boss & Arbiter. It evaluates every reviewer rejection, classifies issues as critical vs. style-only, and can override style-only blocks to keep the pipeline moving. 6 built-in rules:
| Rule | What it checks |
|---|---|
max_files_per_iteration | Too many files changed in one iteration (default: 15) |
max_stale_iterations | Same issues repeating across iterations (default: 3) |
dependency_guard | New dependencies added without explicit approval |
scope_guard | Changes outside the task’s expected scope |
reviewer_overreach | Reviewer blocking on files outside the diff scope (v1.12.0) |
style_only_block | Reviewer blocking exclusively on style/formatting issues — auto-escalates to human review (v1.24.1) |
When a critical alert fires, Solomon pauses the session and asks for human input via elicitInput.
Since v1.12.0, Solomon also mediates reviewer stalls. Instead of immediately stopping the pipeline when the reviewer raises blocking issues, Solomon evaluates whether those issues are legitimate scope concerns or reviewer overreach, and acts accordingly.
Since v1.25.0, Solomon evaluates every reviewer rejection, classifying issues and applying smart iteration logic. Style-only blocks are overridden automatically, allowing the pipeline to continue without human intervention for subjective concerns.
Reviewer Scope Filter (v1.12.0)
Section titled “Reviewer Scope Filter (v1.12.0)”The scope filter automatically detects when a reviewer raises blocking issues about files that are not part of the current diff. Instead of stalling the pipeline, these out-of-scope issues are:
- Auto-deferred — removed from the blocking issue list so the pipeline continues
- Tracked as tech debt — stored in the session’s
deferredIssuesfield for later resolution - Fed back to the coder — deferred issues are included in the coder prompt on subsequent iterations, so the coder is aware of them without being blocked
This prevents a common failure mode where a reviewer flags pre-existing problems in untouched files, causing the pipeline to loop indefinitely on issues the coder cannot resolve within the current task’s scope.
Rate-Limit Standby (v1.11.0)
Section titled “Rate-Limit Standby (v1.11.0)”When a coder or reviewer hits a rate limit, Karajan:
- Parses the cooldown time from the error message (5 patterns supported)
- Waits with exponential backoff (default 5min, max 30min)
- Emits
coder:standby,coder:standby_heartbeat(every 30s), andcoder:standby_resumeevents - Retries the same iteration automatically
- After 5 consecutive standby retries, pauses for human intervention
Preflight Handshake (v1.11.0)
Section titled “Preflight Handshake (v1.11.0)”Before kj_run or kj_code executes, Karajan requires a kj_preflight call to confirm the agent configuration. This prevents AI agents from silently overriding your coder/reviewer assignments.
The preflight supports natural language: "use gemini as coder", "coder: claude", or "set reviewer to codex".
Session overrides are stored in-memory and die when the MCP server restarts.
BecarIA Gateway (v1.13.0)
Section titled “BecarIA Gateway (v1.13.0)”BecarIA Gateway adds full CI/CD integration with GitHub PRs as the single source of truth. Instead of running agents locally and manually creating PRs, the gateway makes PRs the central coordination point.
How it works
Section titled “How it works”- Early PR creation: After the first coder iteration, Karajan creates a draft PR automatically
- Agent comments on PRs: All agents (Coder, Reviewer, Sonar, Solomon, Tester, Security, Planner) post their results as PR comments or reviews
- Configurable dispatch events: The
becariaconfig section defines which GitHub Actions workflows to trigger at each pipeline stage - Embedded workflow templates:
kj init --scaffold-becariagenerates ready-to-use workflow files (becaria-gateway.yml,automerge.yml,houston-override.yml)
Standalone review with PR diff
Section titled “Standalone review with PR diff”kj review now supports reviewing an existing PR’s diff directly, making it usable as a standalone code review tool outside the full pipeline.
kj init --scaffold-becariakj doctor # verifies BecarIA configurationEnable via CLI:
kj run --enable-becaria --task "Add input validation"Or via MCP:
{ "tool": "kj_run", "params": { "task": "Add input validation", "enableBecaria": true }}kj doctor includes BecarIA-specific checks to verify that workflow templates are present and the GitHub token has the required permissions.
Architect Stage (v1.17.0+)
Section titled “Architect Stage (v1.17.0+)”The architect role runs between researcher and planner to define the solution architecture before coding begins. It produces explicit design decisions: layers, patterns, data model, API contracts, and tradeoffs.
Activation
Section titled “Activation”Triage auto-activates the architect when the task creates new modules/apps, affects data models or APIs, has medium/complex complexity, or has ambiguous design. You can also enable it explicitly:
kj run --enable-architect --task "Build user authentication system"Interactive clarification
Section titled “Interactive clarification”When the architect detects design ambiguity, it returns verdict: "needs_clarification" with targeted questions. The pipeline pauses and asks the human via askQuestion. After receiving answers, the architect re-evaluates with the additional context.
If no interactive input is available (non-interactive CLI), the architect continues with best-effort decisions and emits a warning.
Auto ADR generation
Section titled “Auto ADR generation”When a Planning Game card is linked (pgTaskId + pgProject), each architectural tradeoff is automatically persisted as an Architecture Decision Record (ADR) via the Planning Game MCP.
Policy-Driven Pipeline (v1.14.0+)
Section titled “Policy-Driven Pipeline (v1.14.0+)”The policy-resolver module maps each taskType to a set of pipeline policies, determining which stages are enabled or disabled:
| taskType | TDD | SonarQube | Reviewer | Tests Required |
|---|---|---|---|---|
sw | ✓ | ✓ | ✓ | ✓ |
infra | ✗ | ✗ | ✓ | ✗ |
doc | ✗ | ✗ | ✓ | ✗ |
add-tests | ✗ | ✓ | ✓ | ✓ |
refactor | ✓ | ✓ | ✓ | ✗ |
Unknown or missing taskType defaults to sw (most conservative).
Policies can be overridden per-project in kj.config.yml:
policies: sw: tdd: false infra: sonar: trueThe orchestrator emits a policies:resolved event and applies policy gates using shallow copies — never mutating the caller’s configuration.
Auto-Simplify Pipeline (v1.25.1+)
Section titled “Auto-Simplify Pipeline (v1.25.1+)”When triage classifies a task as level 1-2 (trivial or simple), the pipeline automatically runs a lightweight coder-only flow — reviewer, tester, and other post-coder stages are auto-disabled. Level 3+ tasks (medium, complex) get the full pipeline with all enabled stages. This reduces overhead for simple tasks while preserving rigor for complex ones.
Disable with --no-auto-simplify (CLI) or autoSimplify: false (MCP).
Auto-Detect TDD (v1.25.0+)
Section titled “Auto-Detect TDD (v1.25.0+)”TDD methodology is now auto-detected based on the project’s test framework. If the project has a test runner configured (Vitest, Jest, Mocha, etc.), the pipeline automatically enables TDD without requiring --methodology tdd. You can still override with --methodology standard if needed.
Since v1.38.1, TDD supports 12 languages beyond JavaScript/TypeScript: Java (JUnit/Maven/Gradle), Python (pytest/unittest), Go (go test), Rust (cargo test), C# (dotnet test/NUnit/xUnit), Ruby (RSpec/Minitest), PHP (PHPUnit), Swift (XCTest), Dart (dart test/Flutter), and Kotlin (JUnit/Gradle). The TDD enforcer auto-detects the project’s language and test framework.
Mandatory Triage (v1.15.0+)
Section titled “Mandatory Triage (v1.15.0+)”Starting from v1.15.0, triage always runs to classify the task’s taskType. The classification priority is:
- Explicit
--taskTypeflag (highest priority) taskTypeinkj.config.yml- Triage AI classification
- Default:
sw(most conservative)
Triage can activate additional roles but cannot deactivate roles explicitly enabled in pipeline config.
Discovery Stage (v1.16.0+)
Section titled “Discovery Stage (v1.16.0+)”The discover stage runs before triage as an opt-in pre-pipeline stage. It analyzes the task specification for gaps, ambiguities, and missing information before any code is written.
Enabling discovery
Section titled “Enabling discovery”Via CLI:
kj run --enable-discover --task "Add user authentication"Via MCP:
{ "tool": "kj_run", "params": { "task": "Add user authentication", "enableDiscover": true }}Or permanently in kj.config.yml:
pipeline: discover: enabled: true mode: gaps # or: momtest, wendel, classify, jtbd5 Discovery Modes
Section titled “5 Discovery Modes”| Mode | What it does |
|---|---|
gaps | Default — identifies missing requirements, ambiguities, implicit assumptions, and contradictions |
momtest | Generates validation questions following The Mom Test principles (past behavior, not hypotheticals) |
wendel | Evaluates 5 behavior change adoption conditions: CUE, REACTION, EVALUATION, ABILITY, TIMING |
classify | Classifies task impact as START (new behavior), STOP (remove behavior), or DIFFERENT (change existing) |
jtbd | Generates reinforced Jobs-to-be-Done with functional, emotional, and behavioral layers |
Standalone usage with kj_discover
Section titled “Standalone usage with kj_discover”Discovery is also available as a standalone MCP tool:
{ "tool": "kj_discover", "params": { "task": "Add dark mode toggle to settings page", "mode": "wendel" }}Non-blocking behavior
Section titled “Non-blocking behavior”Discovery is non-blocking: if it fails, the pipeline logs a warning and continues execution. When verdict is needs_validation, the pipeline emits a warning with the detected gaps but proceeds normally.
Integrated HU Manager (v1.38.0+)
Section titled “Integrated HU Manager (v1.38.0+)”When triage classifies a task as medium or complex, the hu-reviewer role auto-activates and decomposes the task into 2-5 formal user stories (HUs) with dependency ordering. Each HU then runs as its own iteration through the pipeline (coder → sonar → reviewer), with individual state tracking (pending → coding → reviewing → done/failed/blocked).
How it works
Section titled “How it works”- Triage classification: triage analyzes the task and determines complexity level
- Auto-activation: for medium/complex tasks, hu-reviewer activates automatically (no
--enable-hu-reviewerneeded) - AI-driven decomposition: hu-reviewer decomposes the task into 2-5 formal HUs with acceptance criteria and dependency graphs
- Sub-pipeline execution: each HU runs its own coder → sonar → reviewer cycle with state tracking
- PG adapter: when a Planning Game card is linked, card data is fed to hu-reviewer for richer context
- History records: pipeline history records are generated for all task runs, providing full traceability
Parallel HU execution (v1.46.0+)
Section titled “Parallel HU execution (v1.46.0+)”Independent HUs (those without unresolved dependencies) run concurrently via git worktrees. Each HU gets its own worktree so coders can work in parallel without conflicting on the same working directory. Dependent HUs wait until their blockers complete before starting.
State tracking
Section titled “State tracking”Each HU progresses through states independently:
pending → coding → reviewing → done → failed → blockedThe pipeline tracks which HUs are complete and which remain, ensuring dependencies are respected during execution.
Manual override
Section titled “Manual override”You can still enable hu-reviewer explicitly for simple tasks:
kj run --enable-hu-reviewer --task "Add input validation"Or disable the auto-activation:
kj run --no-hu-reviewer --task "Complex task without HU decomposition"