AI TOOLS

Every question gets the right expert. Or an honest ‘I don’t know.’

These four tools externalize different parts of the developer’s brain. Technical judgment that usually lives in senior engineers’ heads. Memory and identity context that resets every session. The fresh perspective you lose after months on the same codebase. The usability instinct that fades when you stop being a first-time user. Two MCP servers and two agents, each handling a job that shouldn’t be done manually.

EXPERT COUNCIL

Technical decisions, grounded in ingested sources.

Every team has that one person who just knows how things work. "Use Zustand here, not Context." "Server component for that page." Nobody wrote it down - it lives in Slack threads and the heads of senior engineers. I wanted a shared mental model I could actually query.

So I built six domain knowledge bases - React architecture, AI/LLM patterns, backend infrastructure, Vercel, frontend performance, and marketing strategy. Each one has its own ingested corpus of docs, architecture decisions, and real project notes. When a question comes in, it scores every relevant match and routes through a confidence gate that determines how much the system actually knows.

That confidence gate changes everything. Most AI tools answer whether or not they have anything useful to say. This one is honest about what it knows - full expert response when confident, answer with caveats when partial, raw search results when uncertain. Sounds obvious. Almost nobody does it.

React Designer

react

Component architecture, server vs. client components, hooks patterns, and accessibility standards

AI Architect

ai

Knowledge system design, tool orchestration, token cost optimization, and reliability patterns

Backend Infrastructure

backend

Database design, API architecture, deployment pipelines, and production observability

Vercel Expert

vercel

Platform optimization, Core Web Vitals, edge functions, and incremental static regeneration

Frontend Performance

frontend

Rendering performance, animation compositing, GPU layer management, and paint optimization

Marketing Strategist

marketing

Product positioning, growth strategy, jobs-to-be-done analysis, and category design
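The council above can be pictured as a small typed registry plus a router. This is a sketch only: the domain keys mirror the cards above, but the `focus` terms and the keyword-based `routeQuestion` helper are illustrative stand-ins for the server's actual semantic matching over ingested sources.

```typescript
// Hypothetical registry of the six experts; focus terms are illustrative.
type DomainKey =
  | 'react' | 'ai' | 'backend' | 'vercel' | 'frontend' | 'marketing'

interface Expert {
  name: string
  focus: string[]
}

const experts: Record<DomainKey, Expert> = {
  react:     { name: 'React Designer',         focus: ['components', 'hooks', 'accessibility'] },
  ai:        { name: 'AI Architect',           focus: ['orchestration', 'tokens', 'reliability'] },
  backend:   { name: 'Backend Infrastructure', focus: ['database', 'api', 'observability'] },
  vercel:    { name: 'Vercel Expert',          focus: ['edge', 'isr', 'web vitals'] },
  frontend:  { name: 'Frontend Performance',   focus: ['rendering', 'compositing', 'paint'] },
  marketing: { name: 'Marketing Strategist',   focus: ['positioning', 'growth', 'category'] },
}

// Naive keyword routing: pick the domain whose focus terms best match
// the question. The real system scores ingested sources instead.
function routeQuestion(question: string): DomainKey {
  const q = question.toLowerCase()
  let best: DomainKey = 'react'
  let bestHits = -1
  for (const key of Object.keys(experts) as DomainKey[]) {
    const hits = experts[key].focus.filter(term => q.includes(term)).length
    if (hits > bestHits) { best = key; bestHits = hits }
  }
  return best
}
```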

CONFIDENCE ROUTING

It knows what it knows. And tells you when it doesn’t.

Confidence: HIGH

[React Designer] useEffect is for side effects - data fetching, subscriptions, DOM mutations. useMemo is for expensive computations derived from props/state. If you're computing a value, use useMemo. If you're syncing with an external system, use useEffect.

scoreConfidence()
// Illustrative similarity thresholds - tune per corpus
const HIGH_THRESHOLD = 0.8
const MEDIUM_THRESHOLD = 0.5

// Partial: several moderate matches but no single strong one
const hasPartialMatch = (results) =>
  results.filter(r => r.score >= MEDIUM_THRESHOLD).length >= 2

async function scoreConfidence(results, query) {
  if (results.length === 0)
    return { level: 'low', reason: 'No results' }

  const topScore = results[0].score
  const strongMatches = results.filter(r => r.score > HIGH_THRESHOLD)
  // An ingested architecture decision that covers the question directly
  const hasExistingDecision = results.some(r => r.type === 'decision')

  // High: strong match found or multiple supporting results
  if (hasExistingDecision || topScore > HIGH_THRESHOLD || strongMatches.length >= 3)
    return { level: 'high' }

  // Medium: partial match - answer with caveats
  if (topScore >= MEDIUM_THRESHOLD || hasPartialMatch(results))
    return { level: 'medium' }

  // Low: below threshold - return raw results
  return { level: 'low' }
}
consult_experts - auto mode
// Auto mode: confidence-based routing
const results = await searchDomain(question, expert.domain)
const confidence = await scoreConfidence(results, question)

if (confidence.level === 'high') {
  // Full expert response - persona-wrapped
  return askExpert(expert, question)
}

if (confidence.level === 'medium') {
  // Expert response WITH caveats appended
  const caveat = '\n\nCaveat: partial source coverage - verify against current docs.'
  return (await askExpert(expert, question)) + caveat
}

// Low confidence: raw results + flag for human
const ingestionSuggestion = '\n\nNo confident answer - consider ingesting sources on this topic.'
return formatRawResults(results) + ingestionSuggestion
PERSONA ENGINE

Persistent identity and memory across every session.

Claude is stateless by default. Every session starts fresh - no memory of the decision you made last week, no idea how you prefer to work, no accumulated context from the last 40 conversations. That drove me crazy enough to build a fix.

The Persona Engine has two layers. The identity layer analyzes your conversation history and distills a stable profile across seven dimensions - tech preferences, architecture opinions, communication style, decision patterns. It loads as a system prompt at zero extra cost per query. The memory layer handles the dynamic stuff: when something needs past context, hybrid search with temporal decay pulls it in, weighting recent decisions higher than old ones.

Now I have Claude across multiple terminals that all share the same context through PostgreSQL. What I decided in one session is available in the next. It's the difference between working with a collaborator who's been here from the start and a contractor who just walked in.

1
Communication Style

How direct, casual, or technical you prefer AI responses to be

2
Questioning Technique

Whether you guide through Socratic questions or direct requests

3
Tech Preferences

Your default stack choices - languages, frameworks, and tools you reach for

4
Decision Patterns

How you weigh tradeoffs and choose between valid approaches

5
Architecture Opinions

Your philosophy on system boundaries, data flow, and coupling

6
Workflow Style

How you use AI tooling - parallel terminals, focused sessions, delegation style

7
Core Beliefs

What you believe about shipping, quality, testing, and technical debt
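Those seven dimensions distill into a static profile that ships once as a system prompt, which is why it adds no retrieval cost per query. A minimal sketch, assuming one flat string per dimension; `IdentityProfile` and `toSystemPrompt` are illustrative names, not the engine's actual schema.

```typescript
// Hypothetical shape of the seven-dimension identity profile.
interface IdentityProfile {
  communicationStyle: string
  questioningTechnique: string
  techPreferences: string
  decisionPatterns: string
  architectureOpinions: string
  workflowStyle: string
  coreBeliefs: string
}

// Rendered once into a system prompt - loaded every session, never retrieved.
function toSystemPrompt(p: IdentityProfile): string {
  return [
    `Communication style: ${p.communicationStyle}`,
    `Questioning technique: ${p.questioningTechnique}`,
    `Tech preferences: ${p.techPreferences}`,
    `Decision patterns: ${p.decisionPatterns}`,
    `Architecture opinions: ${p.architectureOpinions}`,
    `Workflow style: ${p.workflowStyle}`,
    `Core beliefs: ${p.coreBeliefs}`,
  ].join('\n')
}
```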

Stable identity profile baked into every session - no retrieval cost, no added latency per query
Conversation memory via search finds relevant past decisions on demand
Recent conversations rank higher through 180-day temporal decay - last week matters more than last year
Claude instances in different terminals share context through PostgreSQL - knowledge transfers between sessions
Indexes 1GB+ of Claude Code configurations from GitHub repos for pattern matching across projects
Incremental ingestion tracks file modification times so unchanged conversations are never re-processed
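The 180-day temporal decay can be sketched as an exponential weight applied on top of the hybrid-search score. The half-life framing and constants here are assumptions for illustration, not the actual ranking formula.

```typescript
// Sketch: a memory's effective rank halves every DECAY_WINDOW_DAYS.
const DECAY_WINDOW_DAYS = 180

interface MemoryHit {
  content: string
  score: number   // hybrid-search relevance score
  date: Date      // when the conversation happened
}

function decayWeight(date: Date, now: Date = new Date()): number {
  const ageDays = (now.getTime() - date.getTime()) / 86_400_000
  // Exponential decay - last week matters more than last year
  return Math.pow(0.5, Math.max(ageDays, 0) / DECAY_WINDOW_DAYS)
}

function rankWithDecay(hits: MemoryHit[], now: Date = new Date()): MemoryHit[] {
  return [...hits].sort(
    (a, b) => b.score * decayWeight(b.date, now) - a.score * decayWeight(a.date, now)
  )
}
```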
retrieveMemory()
async function retrieveMemory(query, limit = 10) {
  const results = await hybridSearch(query, limit)

  return results.map((r, i) =>
    `[${i + 1}] From "${r.project}" (${r.date}):\n    "${r.content}"`
  ).join('\n\n')
}
QA AGENT

Fresh eyes find the bugs you can’t see.

Developers test their own code the way they wrote it. Same paths, same inputs, same muscle memory skipping past the rough edges. You already know where the happy path lives - so you never actually find out what breaks outside it.

This agent comes in with zero context. No codebase knowledge, no assumption about how navigation is supposed to work. It uses MCP browser tools to click around the live app - real DOM interactions, real form submissions, real page navigations. It tries the things you'd never think to try because you already know they're "wrong." That's exactly why it finds bugs you missed.

Approaches every application like a first-time user with no knowledge of intended workflows or architecture
Interacts through MCP browser tools - real DOM clicks, form inputs, and page navigations, not simulated events
Uncovers edge-case bugs that developers skip because they already know which paths work
Runs exploration autonomously - no test scripts to maintain, no manual QA sessions to schedule
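The exploration loop behind those bullets might look something like this. `BrowserTools` is a stand-in interface, not the actual MCP browser tool API; the point is the shape of the loop - snapshot what is clickable, pick something with no notion of a "right" path, click, repeat.

```typescript
// Hypothetical interface mirroring the general shape of MCP browser tooling.
interface BrowserTools {
  snapshot(): Promise<string[]>          // clickable element selectors on the page
  click(selector: string): Promise<void>
  currentUrl(): Promise<string>
}

// Autonomous exploration: no test script, no assumed workflow.
async function explore(tools: BrowserTools, maxSteps: number): Promise<string[]> {
  const log: string[] = []
  for (let step = 0; step < maxSteps; step++) {
    const targets = await tools.snapshot()
    if (targets.length === 0) break
    // Pick any visible element - including ones a developer "knows" are wrong
    const next = targets[Math.floor(Math.random() * targets.length)]
    log.push(`${await tools.currentUrl()} -> click ${next}`)
    await tools.click(next)
  }
  return log
}
```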

Read the full writeup: Fresh Eyes Don’t Assume

UX ANALYSIS AGENT

Automated usability evaluation against Nielsen’s 10.

After months on the same product, you stop seeing the rough edges. The error message below the fold. The loading spinner with no progress indication. The search page with no empty state. They're not bugs, so nobody files a ticket. They just stay broken.

This agent navigates live sites through browser tools and evaluates what it finds against Jakob Nielsen's 10 usability heuristics - the actual principles that separate polished products from frustrating ones. It's checking the running app, not the source code or mockups. Findings come back prioritized by severity, and approved fixes get implemented automatically. No UX audit spreadsheet. No scheduled review. Just a list of what's actually broken and a fix ready to ship.

Evaluates the rendered, running application through browser tools - not source code or design mockups
Checks every interaction against all 10 of Nielsen's usability heuristics with specific violation evidence
Prioritizes findings by severity so the most impactful usability issues get fixed first
Implements approved fixes directly - from identifying the issue to shipping the code change
1. Visibility of system status
2. Match between system and real world
3. User control and freedom
4. Consistency and standards
5. Error prevention
6. Recognition rather than recall
7. Flexibility and efficiency of use
8. Aesthetic and minimalist design
9. Help users recognize and recover from errors
10. Help and documentation
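Severity-first prioritization can be sketched with a small findings type. The `Severity` scale and field names here are assumptions for illustration, not the agent's actual output schema.

```typescript
// Hypothetical finding shape: which heuristic was violated, how badly, and proof.
type Severity = 'critical' | 'major' | 'minor'

interface Finding {
  heuristic: number   // 1-10, index into Nielsen's list above
  severity: Severity
  evidence: string    // what was observed in the running app
}

const severityRank: Record<Severity, number> = { critical: 0, major: 1, minor: 2 }

// Most impactful usability issues first, so fixes land where they matter.
function prioritize(findings: Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) => severityRank[a.severity] - severityRank[b.severity]
  )
}
```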
6 knowledge domains · 7 identity dimensions · 10 heuristics checked
UNDER THE HOOD
TypeScript · MCP SDK · Claude Agent SDK · PostgreSQL · Drizzle ORM · Local ML Processing · Semantic Search · Confidence Scoring · Inter-terminal Messaging · Nielsen's 10 Heuristics