The Case for a Layered Event Matcher
1. Where the Current Flat Approach Breaks Down
The current path from webhook to agent execution is:
HTTP POST → parseEvent() → WebhookEvent → matchPipelines() → assembleContext() → executePipeline()
matchPipelines() is the entire routing brain:
// pipeline.ts:55–67
const triggers: string[] = pipeline.trigger[source] ?? [];
if (triggers.some((t) => event.event === t || event.event.startsWith(t + "."))) {
  matches.push([name, pipeline]);
}
That is a string prefix test. It is the only filter in the system. Every crack in the current design flows from that single fact.
1.1 Multi-repo is structurally impossible
When GitHub sends pull_request.opened, the payload contains event.payload.repository.full_name — e.g. acme/frontend or acme/payments. The current matcher never looks at it. You cannot write two pr-review pipelines that differ only by repo without naming them differently and duplicating every other field. There is no filter.repos concept anywhere in PipelineTrigger.
Concretely: if you add a second webhook source (a second GitHub App, a monorepo), every pull_request.opened will match every pr-review-type pipeline simultaneously. With N such events and M such pipelines, you get N×M agent invocations.
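To make the failure concrete, here is a minimal, self-contained sketch of the same prefix test. The WebhookEvent and PipelineConfig shapes are trimmed stand-ins inferred from the excerpt above, and the two pipeline entries are hypothetical.

```typescript
// Trimmed stand-ins for the real types: only the fields the matcher reads.
interface WebhookEvent {
  source: "github" | "linear" | "webhook";
  event: string;
  payload: Record<string, unknown>;
}
type PipelineTrigger = Partial<Record<WebhookEvent["source"], string[]>>;
interface PipelineConfig {
  trigger: PipelineTrigger;
  agent: string;
}

// Same prefix test as the pipeline.ts excerpt; the payload is never inspected.
function matchPipelines(
  evt: WebhookEvent,
  pipelines: Record<string, PipelineConfig>,
): string[] {
  const matches: string[] = [];
  for (const [pipelineName, pipeline] of Object.entries(pipelines)) {
    const triggers: string[] = pipeline.trigger[evt.source] ?? [];
    if (triggers.some((t) => evt.event === t || evt.event.startsWith(t + "."))) {
      matches.push(pipelineName);
    }
  }
  return matches;
}

// Hypothetical config: two review pipelines intended for different repos.
const reviewPipelines: Record<string, PipelineConfig> = {
  "pr-review-frontend": { trigger: { github: ["pull_request.opened"] }, agent: "reviewer" },
  "pr-review-payments": { trigger: { github: ["pull_request"] }, agent: "security-agent" },
};

const prOpened = (repo: string): WebhookEvent => ({
  source: "github",
  event: "pull_request.opened",
  payload: { repository: { full_name: repo } },
});

// Both repos match both pipelines, because the repo name plays no part.
const frontendMatches = matchPipelines(prOpened("acme/frontend"), reviewPipelines);
const paymentsMatches = matchPipelines(prOpened("acme/payments"), reviewPipelines);
```

Both calls return the same two pipeline names, which is exactly the cross-firing described above.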
1.2 Context assembly can't do its job without project scope
assembleContext() in context.ts has two stubs that deliberately return null:
async function gatherGitHistory(_event: WebhookEvent): Promise<string | null> {
  // TODO: use git log on the affected files
  return null;
}
async function gatherRelatedPRs(_event: WebhookEvent): Promise<string | null> {
  // TODO: query GitHub API for related PRs
  return null;
}
These are not stubs because the implementation is hard. They are stubs because the WebhookEvent passed in does not carry the information needed to do the work: no repo URL, no installation token, no default branch, no clone path. The context layer would have to re-parse event.payload.repository, which is buried in an untyped Record<string, unknown>, with no guarantee that the field exists (e.g. Linear events have no repo at all).
The context builder is structurally blocked until there is a typed project scope object passed alongside the event.
1.3 Agent selection is static and cannot be rule-driven
PipelineConfig has a single agent: string field. Every invocation of that pipeline uses the same agent. You cannot express:
- "use security-agent for anything in acme/payments"
- "use fast-reviewer for draft PRs, thorough-reviewer for PRs targeting main"
- "fall back to reviewer when no specialist matches"
The only workaround is to create separate pipelines with separate agent fields, duplicating all prompt and guardrail config. That duplication grows O(repos × agent-types).
1.4 The trigger schema can't compose filters
PipelineTrigger is { github?: string[], linear?: string[], webhook?: string[] }. It is a flat list of event name strings. There is no place to express:
- "only for PRs targeting main or release/*"
- "except when the author is dependabot[bot]"
- "only for repos in the acme org"
- "and only when the PR has the needs-review label"
Adding Slack would require modifying the PipelineTrigger interface, the config type, the matching code, and the context switch in formatEvent() — all in separate files with no clear ownership boundary.
1.5 formatEvent() has implicit source knowledge baked in
// context.ts:60–76
function formatEvent(event: WebhookEvent): string {
  if (event.source === "github") {
    const pr = payload.pull_request as Record<string, unknown>;
    // ...
  }
}
Every new domain source (Slack, Sentry, JIRA) requires a new if branch here, and another in assembleContext(), and another in matchPipelines(). The domain logic is fractured across three files with no extraction point.
2. The Layered Architecture
The key insight: matching, context, and execution have fundamentally different information needs. Mixing them prevents each from working correctly.
Layer 1: Domain Adapter     HTTP → normalized WebhookEvent (exists)
Layer 2: Event Classifier   WebhookEvent → typed ClassifiedEvent<T>
Layer 3: Project Router     ClassifiedEvent → ProjectContext
Layer 4: Pipeline Matcher   event + project → [MatchedPipeline]
Layer 5: Context Builder    event + project + pipeline → EnrichedContext
Layer 6: Executor           agent + prompt + context → spawn (exists)
Layers 2–5 are the missing infrastructure. Each has a clear, narrow contract. None leaks into the others.
3. Concrete Interface Proposal
Layer 2 — Event Classifier
// packages/core/src/classifier.ts
export type EventDomain = 'github' | 'linear' | 'slack' | 'generic';

export type GitHubEventType =
  | 'pull_request'
  | 'push'
  | 'issue'
  | 'issue_comment'
  | 'pull_request_review'
  | 'check_run'
  | 'deployment';

export type LinearEventType = 'Issue' | 'Comment' | 'Project' | 'Cycle';

// Typed, normalized GitHub PR payload: no more Record<string, unknown> casting
export interface GitHubPRData {
  number: number;
  title: string;
  url: string;
  author: string;
  authorType: 'user' | 'bot';   // detect dependabot, renovate, etc.
  base: string;
  head: string;
  body: string;
  draft: boolean;
  labels: string[];
  repoFullName: string;         // 'acme/frontend', the critical routing key
  repoDefaultBranch: string;
  installationId?: string;      // for per-repo GitHub App auth
  orgLogin: string;             // 'acme'
}

export interface GitHubPushData {
  repoFullName: string;
  ref: string;
  branch: string;
  pusher: string;
  commitCount: number;
  compareUrl: string;
  installationId?: string;
}

export interface LinearIssueData {
  id: string;
  title: string;
  description: string;
  teamKey: string;              // 'HAR', the critical routing key
  projectId?: string;
  priority: number;
  assigneeId?: string;
  workspaceId: string;
}

export interface ClassifiedEvent<T = unknown> {
  raw: WebhookEvent;            // always available for fallback
  domain: EventDomain;
  type: string;                 // 'pull_request', 'Issue', etc.
  action: string;               // 'opened', 'create', 'synchronize'
  qualifiedName: string;        // 'pull_request.opened', the current event string
  data: T;                      // typed normalized payload
  receivedAt: Date;
}

export interface EventClassifier {
  /**
   * Classify a raw webhook event into a typed, normalized form.
   * Returns null if the event is unrecognized or malformed.
   */
  classify(event: WebhookEvent): ClassifiedEvent | null;
}
Why this matters: ClassifiedEvent<GitHubPRData> has .data.repoFullName as a typed string. There is no more event.payload.repository?.full_name as string scattered across the codebase. The classifier owns all the payload-parsing risk in one place.
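As a rough illustration of "owning the payload-parsing risk in one place", the sketch below classifies a GitHub pull request event. It is not the proposed implementation: PRData is a trimmed version of GitHubPRData, the WebhookEvent shape is assumed from the fields the text mentions, and classifyGitHubPR, asRecord and asString are hypothetical helper names. The payload paths it reads (pull_request.user.login, repository.full_name, and so on) follow GitHub's webhook payload format.

```typescript
// Assumed shape of the existing WebhookEvent (source, event, payload).
interface WebhookEvent {
  source: string;
  event: string; // qualified name, e.g. 'pull_request.opened'
  payload: Record<string, unknown>;
}

// Trimmed version of GitHubPRData: just enough fields to show the idea.
interface PRData {
  number: number;
  title: string;
  author: string;
  authorType: "user" | "bot";
  draft: boolean;
  labels: string[];
  repoFullName: string;
}

interface ClassifiedEvent<T = unknown> {
  raw: WebhookEvent;
  domain: "github" | "linear" | "slack" | "generic";
  type: string;
  action: string;
  qualifiedName: string;
  data: T;
  receivedAt: Date;
}

// Narrowing helpers: the only code that touches untyped payload values.
function asRecord(v: unknown): Record<string, unknown> | null {
  return typeof v === "object" && v !== null && !Array.isArray(v)
    ? (v as Record<string, unknown>)
    : null;
}
function asString(v: unknown): string | null {
  return typeof v === "string" ? v : null;
}

// Returns null for anything that is not a well-formed GitHub PR event.
function classifyGitHubPR(evt: WebhookEvent): ClassifiedEvent<PRData> | null {
  if (evt.source !== "github") return null;
  const [type, action = ""] = evt.event.split(".");
  if (type !== "pull_request") return null;

  const pr = asRecord(evt.payload.pull_request);
  const repo = asRecord(evt.payload.repository);
  if (!pr || !repo) return null;

  const user = asRecord(pr.user);
  const prNumber = pr.number;
  const repoFullName = asString(repo.full_name);
  const author = asString(user?.login);
  if (typeof prNumber !== "number" || !repoFullName || !author) return null;

  const rawLabels = pr.labels;
  const labels: string[] = Array.isArray(rawLabels)
    ? rawLabels
        .map((l: unknown) => asString(asRecord(l)?.name))
        .filter((n): n is string => n !== null)
    : [];

  return {
    raw: evt,
    domain: "github",
    type,
    action,
    qualifiedName: evt.event,
    data: {
      number: prNumber,
      title: asString(pr.title) ?? "",
      author,
      authorType: asString(user?.type) === "Bot" ? "bot" : "user",
      draft: pr.draft === true,
      labels,
      repoFullName,
    },
    receivedAt: new Date(),
  };
}
```

Downstream layers then read fields such as data.repoFullName or data.authorType without any casts, and a malformed payload is rejected once, at the classifier.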
Layer 3 — Project Router
// packages/core/src/project.ts
export interface ProjectContext {
  // Identity
  domain: EventDomain;
  org?: string;               // 'acme'
  repo?: string;              // 'frontend'
  repoFullName?: string;      // 'acme/frontend'
  repoCloneUrl?: string;      // for git operations in context builder
  defaultBranch?: string;
  // Linear
  teamKey?: string;           // 'HAR'
  linearWorkspaceId?: string;
  // Auth
  installationId?: string;    // GitHub App installation for this repo
  tokenRef?: string;          // secret reference for API calls
  // Metadata
  tags?: string[];            // inherited from bento.yaml project config
}

export interface ProjectRouter {
  /**
   * Extract project scope from a classified event.
   * Never throws; returns a partial context if info is unavailable.
   */
  extractProject(event: ClassifiedEvent): ProjectContext;
}
Why this matters: assembleContext() currently receives a WebhookEvent and must do event.payload.repository as Record<string,unknown> before it can fetch git history. With ProjectContext, the context builder receives project.repoCloneUrl directly. The two stubs become implementable immediately.
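One possible shape for extractProject is sketched below, under stated assumptions: the types are trimmed to the fields used, readString is a hypothetical helper, and the clone URL is derived by assuming repositories live on github.com (a GitHub Enterprise host would need a different base URL). The "never throws" contract is met by treating every missing field as undefined.

```typescript
type EventDomain = "github" | "linear" | "slack" | "generic";

// Only the fields of ClassifiedEvent this sketch reads.
interface ClassifiedEvent<T = unknown> {
  domain: EventDomain;
  data: T;
}

// Subset of the proposed ProjectContext.
interface ProjectContext {
  domain: EventDomain;
  org?: string;
  repo?: string;
  repoFullName?: string;
  repoCloneUrl?: string;
  defaultBranch?: string;
  teamKey?: string;
  linearWorkspaceId?: string;
  installationId?: string;
}

// Reads a non-empty string property, tolerating null or non-object data.
function readString(data: unknown, key: string): string | undefined {
  if (typeof data !== "object" || data === null) return undefined;
  const value = (data as Record<string, unknown>)[key];
  return typeof value === "string" && value.length > 0 ? value : undefined;
}

// Never throws: unknown domains and missing fields yield a partial context.
function extractProject(evt: ClassifiedEvent): ProjectContext {
  const project: ProjectContext = { domain: evt.domain };

  if (evt.domain === "github") {
    const fullName = readString(evt.data, "repoFullName");
    if (fullName) {
      const [org, repo] = fullName.split("/");
      project.repoFullName = fullName;
      project.org = org;
      project.repo = repo;
      // Assumption: github.com hosting; adjust the base URL for other hosts.
      project.repoCloneUrl = `https://github.com/${fullName}.git`;
    }
    project.defaultBranch = readString(evt.data, "repoDefaultBranch");
    project.installationId = readString(evt.data, "installationId");
  } else if (evt.domain === "linear") {
    project.teamKey = readString(evt.data, "teamKey");
    project.linearWorkspaceId = readString(evt.data, "workspaceId");
  }
  return project;
}
```

A Linear event simply produces a context with teamKey set and no repo fields, so the context builder can check project.repoCloneUrl instead of guessing whether the payload has a repository.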
Layer 4 — Pipeline Matcher (replaces matchPipelines())
// services/daemon/src/matcher.ts
// Extended trigger config (backwards compatible)
export interface PipelineTriggerFilter {
  repos?: string[];      // ['acme/frontend', 'acme/*'], glob patterns
  branches?: string[];   // ['main', 'release/*']
  authors?: string[];    // ['!dependabot[bot]'], '!' prefix = exclude
  labels?: string[];     // PR must have at least one of these
  draft?: boolean;       // true = only drafts, false = only non-drafts
  teams?: string[];      // Linear team keys: ['HAR', 'ENG']
}

// This extends PipelineConfig.trigger; no breaking change
export interface PipelineTriggerV2 extends PipelineTrigger {
  filter?: PipelineTriggerFilter;
}

export interface MatchedPipeline {
  name: string;
  config: PipelineConfig;
  agent: DiscoveredAgent;   // resolved here; executor never looks up by name again
  project: ProjectContext;
  matchScore: number;       // higher = more specific match (repo-filter beats no-filter)
}

export interface PipelineMatcher {
  /**
   * Find all pipelines that match this event and project context.
   * Returns matches sorted by specificity (most specific first).
   */
  match(
    event: ClassifiedEvent,
    project: ProjectContext,
    pipelines: Record<string, PipelineConfig>,
    agents: Map<string, DiscoveredAgent>,
  ): MatchedPipeline[];
}
Match scoring example: A pipeline with filter.repos: ['acme/payments'] scores higher than one with no filter when the event comes from acme/payments. This prevents a "catch-all" pipeline from shadowing a specialized one, and makes the tie-breaking rule explicit rather than dependent on YAML ordering.
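The filter semantics in the comments above can be sketched as a pure predicate. Several details here are assumptions rather than part of the proposal: '*' is treated as the only wildcard, the '!' exclusion prefix is honored in every list (the text only shows it for authors), all clauses are combined with AND, and MatchFacts, globMatches, listAllows and filterMatches are hypothetical names.

```typescript
interface PipelineTriggerFilter {
  repos?: string[];
  branches?: string[];
  authors?: string[];
  labels?: string[];
  draft?: boolean;
}

// The facts a filter is checked against, as Layers 2 and 3 would supply them.
interface MatchFacts {
  repoFullName: string;
  baseBranch: string;
  author: string;
  labels: string[];
  draft: boolean;
}

// Assumption: '*' matches any run of characters; everything else is literal.
function globMatches(pattern: string, value: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$").test(value);
}

// '!'-prefixed patterns exclude; the rest include.
// A list containing only exclusions admits everything not excluded.
function listAllows(patterns: string[] | undefined, value: string): boolean {
  if (!patterns || patterns.length === 0) return true;
  const excludes = patterns.filter((p) => p.startsWith("!")).map((p) => p.slice(1));
  const includes = patterns.filter((p) => !p.startsWith("!"));
  if (excludes.some((p) => globMatches(p, value))) return false;
  return includes.length === 0 || includes.some((p) => globMatches(p, value));
}

// No I/O and no hidden state: the result depends only on the arguments.
function filterMatches(
  filter: PipelineTriggerFilter | undefined,
  facts: MatchFacts,
): boolean {
  if (!filter) return true; // trigger-only pipelines behave as they do today
  if (!listAllows(filter.repos, facts.repoFullName)) return false;
  if (!listAllows(filter.branches, facts.baseBranch)) return false;
  if (!listAllows(filter.authors, facts.author)) return false;
  if (filter.draft !== undefined && filter.draft !== facts.draft) return false;
  if (filter.labels && filter.labels.length > 0) {
    // "At least one of these": a single shared label is enough.
    if (!filter.labels.some((l) => facts.labels.includes(l))) return false;
  }
  return true;
}
```

For example, a filter of { repos: ['acme/*'], authors: ['!dependabot[bot]'] } accepts a PR by alice in acme/frontend and rejects the same PR when the author is dependabot[bot]. The brackets in the bot name are escaped before the pattern becomes a regular expression, so they are compared literally.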
Layer 5 — Context Builder (replaces assembleContext())
// services/daemon/src/context.ts (enhanced)
export interface ContextBuilderOptions {
  pipeline: PipelineConfig;
  event: ClassifiedEvent;
  project: ProjectContext;
  logger: Logger;
}

export interface EnrichedContext {
  // Existing
  summary: string;
  sections: ContextSection[];
  // New: available to executor for prompt rendering
  project: ProjectContext;
  // Structured event data for typed template vars (not just payload paths)
  eventData: Record<string, unknown>;
}

export interface ContextBuilder {
  /**
   * Assemble context for a pipeline run.
   * Receives typed event + project; no payload-parsing needed here.
   */
  assemble(options: ContextBuilderOptions): Promise<EnrichedContext>;
}
Why event-aware context building is necessary: Git history for a PR requires:
- Repo clone URL (project.repoCloneUrl), from Layer 3
- PR head SHA (event.data.head for GitHub), from Layer 2
- Auth token (project.installationId), from Layer 3
None of these were available before this refactor. Context building was fundamentally blocked by the flat architecture. With these layers, gatherGitHistory() becomes:
async function gatherGitHistory(
  event: ClassifiedEvent<GitHubPRData>,
  project: ProjectContext,
): Promise<string | null> {
  if (!project.repoCloneUrl) return null;
  const token = await resolveInstallationToken(project.installationId);
  // git log --oneline -20 HEAD..event.data.head
  // now actually implementable
}
4. How Multi-Project Support Works End-to-End
With these layers, bento.yaml can express per-repo pipelines without duplication:
pipelines:
  # Catches all PRs: low specificity
  pr-review-default:
    trigger:
      github: [pull_request.opened, pull_request.synchronize]
    agent: reviewer
    prompt: "Review this PR..."

  # Catches only acme/payments PRs targeting main: higher specificity, wins over default
  pr-review-payments:
    trigger:
      github: [pull_request.opened, pull_request.synchronize]
      filter:
        repos: ['acme/payments']
        branches: ['main']
    agent: security-reviewer   # different agent
    prompt: "Review this payment-service PR with security focus..."

  # Only non-bot, non-draft PRs on frontend
  pr-review-frontend:
    trigger:
      github: [pull_request.opened]
      filter:
        repos: ['acme/frontend']
        authors: ['!dependabot[bot]', '!renovate[bot]']
        draft: false
    agent: reviewer
    context:
      git_history: true   # now actually works via ProjectContext
The PipelineMatcher scores pr-review-payments above pr-review-default for acme/payments events because its filter is more specific. The correct agent is resolved at match time and passed to the executor, with no secondary lookup.
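The proposal does not pin down how matchScore is computed, so the following is only one plausible rule, stated as an assumption: one point per filter clause that is set, with ties broken alphabetically by pipeline name so the order never depends on YAML ordering. Candidate, specificity and rankMatches are hypothetical names.

```typescript
interface PipelineTriggerFilter {
  repos?: string[];
  branches?: string[];
  authors?: string[];
  labels?: string[];
  draft?: boolean;
  teams?: string[];
}

// A pipeline that has already passed the trigger and filter checks.
interface Candidate {
  name: string;
  agent: string;
  filter?: PipelineTriggerFilter;
}

// Assumed scoring rule: one point per filter clause that is actually set.
function specificity(filter?: PipelineTriggerFilter): number {
  if (!filter) return 0;
  const lists = [filter.repos, filter.branches, filter.authors, filter.labels, filter.teams];
  const listScore = lists.filter((l) => l !== undefined && l.length > 0).length;
  return listScore + (filter.draft !== undefined ? 1 : 0);
}

// Most specific first; the name tie-break keeps the order deterministic.
function rankMatches(candidates: Candidate[]): Array<Candidate & { matchScore: number }> {
  return candidates
    .map((c) => ({ ...c, matchScore: specificity(c.filter) }))
    .sort((a, b) => b.matchScore - a.matchScore || a.name.localeCompare(b.name));
}

// The two pipelines from the config above that match a PR to main in acme/payments.
const ranked = rankMatches([
  { name: "pr-review-default", agent: "reviewer" },
  {
    name: "pr-review-payments",
    agent: "security-reviewer",
    filter: { repos: ["acme/payments"], branches: ["main"] },
  },
]);

console.log(ranked.map((m) => `${m.name}=${m.matchScore}`).join(", "));
// prints "pr-review-payments=2, pr-review-default=0"
```

Whether the executor runs only the top-ranked match or every match in order is a policy decision the text leaves open; the ranking supports either.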
5. The Abstraction Boundary
The critical boundary is between Layers 4 and 5 — between matching and context building.
Matching (Layers 1–4) must be:
- Cheap — runs for every event, most will not match
- Stateless — no I/O, no API calls
- Deterministic — same event always produces the same matches
Context building (Layer 5) must be:
- Rich — fetch git history, query APIs, recall sessions
- Event-type-aware — PRs need diff context, issues need related ticket context
- Project-scope-aware — needs repo URL, auth tokens, branch info
The current code blurs this boundary: handleEvent() calls matchPipelines() (cheap), then immediately calls assembleContext() (expensive, I/O-bound) — and passes the same untyped WebhookEvent to both. The expensive work can't use the information it needs because that information was never extracted from the event at the matching stage.
The clean rule: anything that requires I/O lives in Layers 5 and 6 (context building and execution). Anything that only reads the event lives in Layers 1 through 4.
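One way to make this boundary visible in code is through the function signatures: the matching layers return plain values, while the context and execution layers return promises. The sketch below is a simplified wiring under that assumption. The Layers interface and its trimmed types are hypothetical, and it runs matches inline rather than through PipelineQueue.

```typescript
interface WebhookEvent {
  source: string;
  event: string;
  payload: Record<string, unknown>;
}
interface ClassifiedEvent {
  raw: WebhookEvent;
  qualifiedName: string;
}
interface ProjectContext {
  repoFullName?: string;
}
interface MatchedPipeline {
  name: string;
  agent: string;
  project: ProjectContext;
}
interface EnrichedContext {
  summary: string;
  project: ProjectContext;
}

// Synchronous return types for Layers 2-4, promises for Layers 5-6.
interface Layers {
  classify(e: WebhookEvent): ClassifiedEvent | null;
  extractProject(e: ClassifiedEvent): ProjectContext;
  match(e: ClassifiedEvent, p: ProjectContext): MatchedPipeline[];
  assemble(e: ClassifiedEvent, m: MatchedPipeline): Promise<EnrichedContext>;
  execute(m: MatchedPipeline, c: EnrichedContext): Promise<void>;
}

// Returns the number of pipelines run for this event.
async function handleEvent(raw: WebhookEvent, layers: Layers): Promise<number> {
  // Cheap, pure part: runs for every event.
  const classified = layers.classify(raw);
  if (!classified) return 0;
  const project = layers.extractProject(classified);
  const matches = layers.match(classified, project);

  // Expensive, I/O-bound part: runs only for events that matched.
  for (const m of matches) {
    const context = await layers.assemble(classified, m);
    await layers.execute(m, context);
  }
  return matches.length;
}
```

Because classify, extractProject and match cannot await anything, slipping an API call into the matching path would require changing a signature, which is easy to spot in review.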
6. What Does Not Change
- WebhookEvent interface: Layer 2 wraps it, not replaces it
- WebhookServer: still parses and emits WebhookEvent
- PipelineQueue / executePipeline(): executor is clean, unchanged
- DiscoveredAgent / discoverAgents(): SOUL.md discovery is fine
- bento.yaml structure: the existing pipeline format is valid; filter is additive
- All existing tests: matchPipelines() behavior is preserved for trigger-only pipelines
The refactor is additive. Layers 2 and 3 are new files. Layer 4 is a replacement for the 12-line matchPipelines() method. Layer 5 is an enhancement to the existing assembleContext() signature. Nothing is deleted until the old code can be shadowed by the new.
7. Summary
| Failing scenario | Root cause in current code | Resolved by layer |
|---|---|---|
| Two repos, different agents | matchPipelines() ignores payload | Layer 3 + Layer 4 filter |
| Git history is always null | assembleContext() lacks repo URL | Layer 3 → Layer 5 |
| Adding Slack breaks 3 files | Domain logic split across files | Layer 2 classifier |
| Can't exclude bots from triggers | No filter clause in PipelineTrigger | Layer 4 filter schema |
| Agent can't vary by repo | agent is a static string field | Layer 4 match scoring |
| Prompt template can't use typed fields | renderPrompt only walks raw payload | Layer 5 eventData |
The layered architecture does not make the code more complex. It makes the current hidden complexity explicit, gives each concern a named owner, and unblocks the features — git history, multi-repo routing, per-repo agents — that are already stubbed out or missing from the codebase.

