Challenge Framing
MindMirror predictions and scores were useful but difficult to interpret conversationally, while unconstrained assistants would introduce hallucination and out-of-scope risk in a sensitive domain.
A guarded AI explanation layer for psychology-focused product experiences.
Rather than building a general chatbot, this service was intentionally constrained around psychology interpretation. That tradeoff improves relevance, trust, and production safety.
Overview
Built a memory-aware AI service that turns structured MindMirror predictions into grounded conversational explanations using user-specific retrieval, cache reuse, and corrective web search with explicit domain restrictions.
Approach
I designed a LangGraph-based orchestration flow with domain routing, user-scoped retrieval, hybrid memory, semantic cache reuse, and corrective search rules that stay inside psychology.
Project Highlights
Agent orchestration, guarded retrieval, personalized memory, domain safety, and operational observability for AI-backed products.
Core Stack
Key Features
The router enforces domain restriction before retrieval or generation begins.
Prediction data is resolved against canonical user identity before context assembly.
Short-term conversation state and long-term facts are recalled under a strict token budget.
SSE streaming and observability hooks expose graph progress, latency, and debugging context.
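As a rough illustration of how graph progress could be exposed over SSE, the sketch below formats one event frame per finished node plus a closing frame. The event names (`node_finished`, `done`) and the payload fields are hypothetical and are not taken from the MindMirror service; only the frame layout (an `event:` line, a `data:` line, a blank line) follows the Server-Sent Events format.

```python
import json
from typing import Iterable, Iterator, Tuple


def sse_event(event: str, data: dict) -> str:
    """Format one SSE frame: event name, JSON data line, blank-line terminator."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_progress(node_timings: Iterable[Tuple[str, float]]) -> Iterator[str]:
    """Yield one frame per finished node, then a final frame with total latency."""
    total_ms = 0.0
    for node, latency_ms in node_timings:
        total_ms += latency_ms
        # Hypothetical payload: node name and its latency in milliseconds.
        yield sse_event("node_finished", {"node": node, "latency_ms": latency_ms})
    yield sse_event("done", {"total_ms": total_ms})


if __name__ == "__main__":
    for frame in stream_progress([("recall", 12.0), ("route", 3.5)]):
        print(frame, end="")
```

A web framework would typically send these strings as the body of a `text/event-stream` response, so a client can render node-by-node progress while the answer is still being generated.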
Each layer stays explicit so reviewers can quickly understand where ingestion, orchestration, persistence, and model-serving responsibilities live.
Chat, streaming, health, graph, and memory endpoints expose the service to MindMirror clients and developers.
LangGraph coordinates recall, routing, cache, retrieval, synthesis, and memory write-back.
Supabase stores operational memory and user data while Chroma supports semantic cache lookup.
The pipeline section keeps the most important engineering steps visible without collapsing them into generic bullet lists.
Load recent turns and long-term facts under a bounded context budget.
Classify the query into a safe execution path and reject non-psychology requests.
Resolve user identity, fetch prediction context, or expand into corrective search when internal evidence is weak.
Generate grounded explanations, then persist cacheable answers and memory-worthy facts for future turns.
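The steps above can be sketched as a chain of small functions over a shared state dictionary. This is a plain-Python stand-in, not the LangGraph API and not the project's actual graph: the node names follow the stages listed earlier (recall, routing, cache, retrieval, synthesis, memory write-back), while the node bodies, the keyword check in `route`, and the reject message are placeholder assumptions.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def recall(state: State) -> State:
    state["context"] = ["(recent turns and long-term facts)"]  # placeholder
    return state


def route(state: State) -> State:
    # Placeholder classifier: a real router would be far more careful.
    state["route"] = "rag" if "score" in state["query"].lower() else "reject"
    return state


def cache_lookup(state: State) -> State:
    state["cached"] = None  # placeholder: always a cache miss
    return state


def retrieve(state: State) -> State:
    state["evidence"] = ["(prediction context for this user)"]  # placeholder
    return state


def synthesize(state: State) -> State:
    state["answer"] = f"Explanation grounded in {len(state['evidence'])} evidence item(s)."
    return state


def write_back(state: State) -> State:
    state["persisted"] = True  # placeholder for cache and memory persistence
    return state


PIPELINE: List[Callable[[State], State]] = [
    recall, route, cache_lookup, retrieve, synthesize, write_back,
]


def run(query: str) -> State:
    """Run nodes in order, stopping early when the router rejects the query."""
    state: State = {"query": query, "trace": []}
    for node in PIPELINE:
        state = node(state)
        state["trace"].append(node.__name__)
        if state.get("route") == "reject":
            state["answer"] = "This assistant only covers psychology and MindMirror topics."
            break
    return state
```

The early exit is the point of the sketch: a rejected query never reaches retrieval, synthesis, or write-back, so out-of-scope requests cannot pull in evidence or leave anything in memory.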
This timeline keeps the implementation story concise: what was framed first, what was hardened next, and what ultimately made the project production-ready.
Constrained the assistant to psychology and MindMirror contexts before retrieval logic was added.
Introduced short-term recall, long-term fact distillation, and semantic cache reuse for repeated queries.
Added streaming graph visibility and observability hooks to support debugging and cost inspection.
This section is intentionally recruiter-friendly and engineer-friendly at the same time: each challenge is tied to a concrete design choice and a specific outcome.
Challenge
Solution
Added layered routing constraints, a fixed safe route set, and reject paths before retrieval and synthesis.
Outcome
Improved trust and reduced generic-chatbot drift.
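A minimal sketch of what a fixed safe route set with a reject path might look like. The five route names mirror the route types listed on this page; the keyword-based domain gate and the `DOMAIN_TERMS` vocabulary are illustrative assumptions, standing in for whatever classifier the service actually uses.

```python
import re
from enum import Enum


class Route(str, Enum):
    DIRECT = "direct"
    RAG = "rag"
    MEMORY = "memory"
    WEB_SEARCH = "web_search"
    REJECT = "reject"


# Hypothetical domain vocabulary, used only for this illustration.
DOMAIN_TERMS = {
    "anxiety", "mood", "personality", "score", "prediction",
    "stress", "emotion", "mindmirror",
}


def in_domain(query: str) -> bool:
    """First layer: cheap gate that runs before any retrieval or generation."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(words & DOMAIN_TERMS)


def choose_route(query: str, proposed: str) -> Route:
    """Second layer: accept a proposed route only if it is in the fixed set."""
    if not in_domain(query):
        return Route.REJECT
    try:
        return Route(proposed)
    except ValueError:
        # Anything outside the fixed route set falls through to reject.
        return Route.REJECT
```

Because the proposed route is validated against a closed enum, a misbehaving classifier or model output cannot invent a new execution path; the worst case is a rejection.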
Challenge
Solution
Separated short-term memory from distilled long-term facts and enforced a strict recall budget.
Outcome
Maintained personalization without degrading answer quality.
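One way to picture a strict recall budget is the sketch below. The 500-token default comes from the recall budget stated on this page; everything else is assumed for illustration, including the whitespace-based token estimate, the choice to admit long-term facts before turns, and the newest-first selection of turns.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_recall(
    turns: List[str], facts: List[str], budget: int = 500
) -> Tuple[List[Tuple[str, str]], int]:
    """Select long-term facts first, then the newest turns, within the budget."""
    selected: List[Tuple[str, str]] = []
    used = 0

    for fact in facts:
        cost = estimate_tokens(fact)
        if used + cost > budget:
            break
        selected.append(("fact", fact))
        used += cost

    recent: List[Tuple[str, str]] = []
    for turn in reversed(turns):  # walk from newest to oldest
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break
        recent.append(("turn", turn))
        used += cost

    selected.extend(reversed(recent))  # restore chronological order
    return selected, used
```

Keeping the two memory kinds in separate lists makes the trade-off explicit: distilled facts are short and cheap, so they can be admitted first, and the remaining budget goes to the most recent conversation turns.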
Challenge
Solution
Combined exact cache keys with semantic similarity matching scoped by user, route, and context hash.
Outcome
Improved repeat-query efficiency while staying personalized.
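The combination of exact keys and scoped semantic matching could look roughly like this. The sketch is self-contained and uses a toy bag-of-words cosine similarity in place of real embeddings and a vector store; the `ScopedCache` class, the 0.8 threshold, and the key layout are assumptions made for illustration.

```python
import hashlib
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def scope_key(user_id: str, route: str, context_hash: str) -> str:
    """Entries are only ever compared within the same user, route, and context."""
    return f"{user_id}|{route}|{context_hash}"


def exact_key(scope: str, query: str) -> str:
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(f"{scope}|{normalized}".encode()).hexdigest()


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a vector model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ScopedCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self.exact: Dict[str, str] = {}
        self.by_scope: Dict[str, List[Tuple[Counter, str]]] = {}

    def put(self, scope: str, query: str, answer: str) -> None:
        self.exact[exact_key(scope, query)] = answer
        self.by_scope.setdefault(scope, []).append((embed(query), answer))

    def get(self, scope: str, query: str) -> Optional[str]:
        hit = self.exact.get(exact_key(scope, query))
        if hit is not None:
            return hit  # exact match after normalization
        q = embed(query)
        best_score, best = 0.0, None
        for vec, answer in self.by_scope.get(scope, []):
            score = cosine(q, vec)
            if score > best_score:
                best_score, best = score, answer
        return best if best_score >= self.threshold else None
```

Scoping both lookups by user, route, and context hash means a semantically similar question from another user, or from the same user after their prediction context has changed, is treated as a miss rather than served a stale or foreign answer.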
The emphasis here is signal, not decoration: key numbers, verifiable outcomes, and the context needed to interpret them responsibly.
Execution Graph
7 nodes
Recall through memory write-back in a reusable LangGraph flow.
Route Types
5
Direct, RAG, memory, web search, and reject.
Recall Budget
500 tokens
Memory context stays constrained and intentional.
Memory Policy
30 / 90d
Short-term turns plus 90-day long-term fact retention.
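A small sketch of how such a retention policy might be applied. It assumes that "30" means a cap of 30 short-term turns, which the page does not state explicitly; the 90-day window for long-term facts is taken from the description above. Function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

MAX_TURNS = 30                    # assumption: "30" is a cap on short-term turns
FACT_TTL = timedelta(days=90)     # 90-day long-term fact retention


def apply_policy(
    turns: List[str],
    facts: List[Tuple[str, datetime]],
    now: datetime,
) -> Tuple[List[str], List[Tuple[str, datetime]]]:
    """Keep the newest turns and drop facts older than the retention window."""
    kept_turns = turns[-MAX_TURNS:]
    kept_facts = [(text, created) for text, created in facts if now - created <= FACT_TTL]
    return kept_turns, kept_facts


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    turns = [f"turn {i}" for i in range(40)]
    facts = [("old fact", now - timedelta(days=120)), ("fresh fact", now - timedelta(days=5))]
    kept_turns, kept_facts = apply_policy(turns, facts, now)
    print(len(kept_turns), [text for text, _ in kept_facts])
```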
Key Results
Research + Business Impact
Turns raw behavioral inference tables into an explainable conversational experience that feels more useful to end users.
Shows a mature AI system design mindset: scoped behavior, explicit routing, memory policy, observability, and fallback control.