Skip to content

kj discover

kj discover points the analysis at the task, not the codebase. It asks: is this request actually specified well enough to build? It surfaces the gaps, ambiguities and unstated assumptions — the questions someone should have answered before “implement X” was ever handed over.

kj discover takes a task and analyses it (not the code) for what’s missing or under-specified: unstated requirements, ambiguous terms, implicit assumptions, undefined edge cases, success criteria nobody wrote down. The default gaps mode produces a structured gap list with a verdict (ready / needs-validation). It writes nothing and runs no coder.

Five modes change the questioning lens:

  • gaps (default) — generic missing-information analysis.
  • momtest — Mom-Test-style questions (probe for real evidence, not validation).
  • wendel — a behaviour-change checklist framing.
  • classify — START / STOP / DIFFERENT classification of the ask.
  • jtbd — Jobs-to-be-Done framing of the underlying need.

It’s the discover role of kj run (--enable-discover) standalone. Where kj researcher explores reality in the code, kj discover interrogates the request before reality is touched.

  • Vague or second-hand taskskj discover "make the dashboard better" exposes that “better” is undefined.
  • Spec from a non-developer — surface the implicit requirements before they become wrong code.
  • Before planning a big feature — close gaps first, so kj plan decomposes a complete task.
  • Product framing--mode jtbd / momtest to interrogate the underlying need, not just the stated feature.
  • Triaging an ask--mode classify to decide if this is a START, a STOP, or a DIFFERENT thing entirely.
  • Well-specified tasks — a precise spec with acceptance criteria has no gaps to find; discover is latency.
  • You’ll run the pipeline anywaykj run --enable-discover; standalone only helps to gate before committing to a run.
  • Codebase questions — “what pattern does this project use?” is kj researcher, not discover.
  • Sizing — “how complex is this?” is kj triage.
FlagDefaultWhen to flip itInteraction
[task] (arg)The task to interrogate. Required unless --task-file.
--task-file <path>noneLong task / spec file.Supersedes the inline [task].
--mode <name>gapsmomtest / wendel / classify / jtbd — pick the lens that matches why you’re questioning the task.Each mode changes the output shape, not just the tone.
--discover <name>configOverride the discover agent.
--discover-model <name>tier-drivenPin the model.
--jsonoffCI gating / tooling.
Terminal window
kj discover "Add notifications to the app"

What happens: returns the unanswered questions — which events? which channels (push/email/in-app)? user preferences? — with a needs-validation verdict. Answer them, then plan.

Terminal window
kj discover "Users want a dark mode" --mode jtbd

What happens: reframes around the underlying job (“reduce eye strain at night” / “match OS”) — which may change what you build, not just how.

Terminal window
verdict=$(kj discover --task-file task.md --json | jq -r .verdict)
[ "$verdict" = ready ] || { echo "Task underspecified"; exit 1; }

What happens: the pipeline refuses to spend a kj run on a task that discover flags as underspecified.

kj discover encodes a hard truth from kj run’s own guidance: the pipeline works as well as the task description; vague tasks produce vague code. Most “AI wrote bad code” outcomes are actually “the task was underspecified and nobody noticed until the diff”. Discover moves that noticing before the tokens are spent — the cheapest possible place to catch a missing requirement is before any agent has acted on its absence.

The five modes exist because “what’s missing from this task?” is not one question. A generic gap list (gaps) is right for an engineering ticket; but a feature request often needs to be challenged at the need level, not the spec level — hence jtbd and momtest, which deliberately refuse to take the stated feature at face value, and classify, which questions whether the task should exist at all (START/STOP/DIFFERENT). Bundling these into one mode would force a single interrogation style onto questions that need different ones; separating them makes the kind of doubt explicit and selectable.

  • kj researcher — the complement: interrogates the codebase, not the task.
  • kj triage — sizes the task; discover checks it’s complete enough to size.
  • kj plan — feed a discover-hardened task in so the decomposition is built on solid ground.
  • Pipeline roles → discover — the role this command runs standalone.