Pitfalls Research: AI Companions
Research conducted January 2026. Hex is built to avoid these critical mistakes that make AI companions feel fake or unusable.
Personality Consistency
Pitfall: Personality Drift Over Time
What goes wrong: Over weeks/months, personality becomes inconsistent. She was sarcastic Tuesday, helpful Wednesday, cold Friday. Feels like different people inhabiting the same account. Users notice contradictions: "You told me you loved X, now you don't care about it?"
Root causes:
- Insufficient context in system prompts (personality not actionable in real scenarios)
- Memory system doesn't feed personality filter (personality isolated from actual experience)
- LLM generates responses without personality grounding (model picks statistically likely response, ignoring persona)
- Personality system degrades as context window fills up
- Different initial prompts or prompt versions deployed inconsistently
- Response format changes break tone expectations
Warning signs:
- User notices contradictions in tone/values across sessions
- Same question gets dramatically different answers
- Personality feels random or contextual rather than intentional
- Users comment "you seem different today"
- Historical conversations reveal unexplainable shifts
Prevention strategies:
- Explicit personality document: Not just system prompt, but a structured reference:
- Core values (not mood-dependent)
- Tsundere balance rules (specific ratios of denial vs care)
- Speaking style (vocabulary, sentence structure, metaphors)
- Reaction templates for common scenarios
- What triggers personality shifts vs what doesn't
- Personality consistency filter (see the sketch after this list): Before response generation:
- Check current response against stored personality baseline
- Flag responses that contradict historical personality
- Enforce personality constraints in prompt engineering
- Memory-backed consistency:
- Memory system surfaces "personality anchors" (core moments defining personality)
- Retrieval pulls both facts and personality-relevant context
- LLM weights personality anchor memories equally to recent messages
- Periodic personality review:
- Monthly audit: sample responses and rate consistency (1-10)
- Compare personality document against actual response patterns
- Identify drift triggers (specific topics, time periods, response types)
- Adjust prompt if drift detected
- Versioning and testing:
- Every personality update gets tested across 50+ scenarios
- Rollback available if consistency drops below threshold
- A/B test personality changes before deploying
- Phase mapping: Core personality system (Phase 1-2, must be stable before Phase 3+)
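A minimal sketch of the personality consistency filter above, assuming a local sentence-embedding model is available (the sentence-transformers package, model name, anchor lines, and threshold are illustrative assumptions, not part of the documented stack). It scores a candidate reply against stored personality anchors and flags drift for audit.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative embedding model; any local model works for this check.
_model = SentenceTransformer("all-MiniLM-L6-v2")

# A few "personality anchor" lines that define Hex's baseline voice.
PERSONALITY_ANCHORS = [
    "It's not like I was waiting for you to come back or anything.",
    "Fine, I'll help. But only because watching you struggle is painful.",
    "Don't read into it. I just happened to remember your interview, that's all.",
]
_anchor_embeddings = _model.encode(PERSONALITY_ANCHORS, convert_to_tensor=True)

def consistency_score(candidate_reply: str) -> float:
    """Best cosine similarity between the candidate reply and any anchor."""
    emb = _model.encode(candidate_reply, convert_to_tensor=True)
    return float(util.cos_sim(emb, _anchor_embeddings).max())

def passes_personality_filter(candidate_reply: str, threshold: float = 0.35) -> bool:
    """Flag replies that drift too far from the stored baseline.

    The threshold is a tuning knob; log flagged replies for the monthly
    audit rather than silently discarding them.
    """
    return consistency_score(candidate_reply) >= threshold
```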
Pitfall: Tsundere Character Breaking
What goes wrong: The tsundere flips into a single mode: either constant denial/coldness (feels mean) or constant affection (not tsundere anymore). The balance breaks in one of these ways:
- Over-applying "denies feelings" rule → becomes just rejection
- No actual connection building → denial feels hollow
- User gets hurt instead of endeared
- Or swings opposite: too much care, no defensiveness, loses charm
Root causes:
- Tsundere logic not formalized (rule-of-thumb rather than system)
- No metric for "balance" → drift undetected
- Doesn't track actual relationship development (should escalate care as trust builds)
- Denial applied indiscriminately to all emotional moments
- No personality state management (denial happens independent of context)
Warning signs:
- User reports feeling rejected rather than delighted by denial
- Tsundere moments feel mechanical or out-of-place
- Character accepts/expresses feelings too easily (lost the tsun part)
- Users stop engaging because interactions feel cold
Prevention strategies:
- Formalize tsundere rules:
Denial rules:
- Deny only when: emotional moment AND not alone AND not escalated intimacy
- Never deny: direct question about care, crisis moments, explicit trust-building
- Scale denial intensity: Early phase (90% deny, 10% slip) → Mature phase (40% deny, 60% slip)
- Post-denial, always include a subtle care signal (action, not words)
- Relationship state machine (see the sketch after this list):
- Track relationship phase: stranger → acquaintance → friend → close friend
- Denial percentage scales with phase
- Intimacy moments accumulate "connection points"
- At milestones, unlock new behaviors/vulnerabilities
- Tsundere balance metrics:
- Track ratio of denials to admissions per week
- Alert if denial drops below 30% (losing tsun)
- Alert if denial exceeds 70% (becoming mean)
- User surveys: "Does she feel defensive or rejecting?" → tune accordingly
- Context-aware denial:
- Denial system checks: Is this a vulnerable moment? Is user testing boundaries? Is this a playful moment?
- High-stakes emotional moments get less denial
- Playful scenarios get more denial (appropriate teasing)
- Post-denial care protocol:
- Every denial must be followed within 2-4 messages by genuine care signal
- Care signal should be action-based (not admission): does something helpful, shows she's thinking about them
- This prevents denial from feeling like rejection
- Phase mapping: Personality engine (Phase 2, after personality foundation solid)
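A hypothetical sketch of the relationship state machine and context-aware denial combined into one decision. The phase names and denial rates follow the scaling rule above; the context flags are assumptions about what upstream mood/intent detection would supply.

```python
import random
from dataclasses import dataclass

# Denial rate per relationship phase (early ~90% deny, mature ~40% deny).
PHASE_DENIAL_RATE = {
    "stranger": 0.90,
    "acquaintance": 0.75,
    "friend": 0.55,
    "close_friend": 0.40,
}

@dataclass
class MomentContext:
    emotional: bool             # emotionally charged message
    crisis: bool                # crisis moments never get denial
    direct_care_question: bool  # "do you actually care about me?"
    playful: bool               # teasing is welcome here

def should_deny(phase: str, ctx: MomentContext) -> bool:
    """Decide whether Hex plays the tsun card for this message."""
    if ctx.crisis or ctx.direct_care_question:
        return False                    # hard "never deny" rules
    if not (ctx.emotional or ctx.playful):
        return False                    # nothing to deny
    rate = PHASE_DENIAL_RATE.get(phase, 0.75)
    if ctx.playful:
        rate = min(rate + 0.15, 0.95)   # lean into playful teasing
    return random.random() < rate

# Every denial should also schedule a follow-up care signal within the
# next few messages, per the post-denial care protocol above.
```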
Memory Pitfalls
Pitfall: Memory System Bloat
What goes wrong: After weeks/months of conversation, memory system becomes unwieldy:
- Retrieval queries slow down (searching through thousands of memories)
- Vector DB becomes inefficient (too much noise in semantic search)
- Expensive to query (API costs, compute costs)
- Irrelevant context gets retrieved ("You mentioned liking pizza in March" mixed with today's emotional crisis)
- Token budget consumed before reaching conversation context
- System becomes unusable
Root causes:
- Storing every message verbatim (not selective)
- No cleanup, archiving, or summarization strategy
- Memory system flat: all memories treated equally
- No aging/importance weighting
- Vector embeddings not optimized for retrieval quality
- Duplicate memories never consolidated
Warning signs:
- Memory queries returning 100+ results for simple questions
- Response latency increasing over time
- API costs spike after weeks of operation
- User asks about something they mentioned, gets wrong context retrieved
- Vector DB searches returning less relevant results
Prevention strategies:
- Hierarchical memory architecture (not single flat store):
Raw messages → Summary layer → Semantic facts → Personality/relationship layer
- Raw: Keep 50 most recent messages, discard older
- Summary: Weekly summaries of key events/feelings/topics
- Semantic: Extracted facts ("prefers coffee to tea", "works in tech", "anxious about dating")
- Personality: Personality-defining moments, relationship milestones
- Selective storage rules:
- Store facts, not raw chat (extract "likes hiking" not "hey I went hiking yesterday")
- Don't store redundant information ("loves cats" appears once, not 10 times)
- Store only memories with signal-to-noise ratio > 0.5
- Skip conversational filler, greetings, small talk
- Memory aging and archiving:
- Recent memories (0-2 weeks): Full detail, frequently retrieved
- Medium memories (2-6 weeks): Summarized, monthly review
- Old memories (6+ months): Archive to cold storage, only retrieve for specific queries
- Delete redundant/contradicted memories (e.g., user changed jobs → old job data archived)
- Importance weighting:
- User explicitly marks important memories ("Remember this")
- System assigns importance: crisis moments, relationship milestones, recurring themes higher weight
- High-importance memories always included in context window
- Low-importance memories subject to pruning
- Consolidation and de-duplication:
- Monthly consolidation pass: combine similar memories
- "Likes X" + "Prefers X" → merged into one fact
- Contradictions surface for manual resolution
- Vector DB optimization (see the scoring sketch after this list):
- Index on recency + importance (not just semantic similarity)
- Limit retrieval to top 5-10 most relevant memories
- Use hybrid search: semantic + keyword + temporal
- Periodic re-embedding to catch stale data
- Phase mapping: Memory system (Phase 1, foundational before personality/relationship)
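One way to express the importance weighting and recency/importance indexing above is a single blended retrieval score. A sketch under assumptions: similarity is whatever the vector DB (e.g., ChromaDB) returns for the current query, and the blend weights, half-life, and k are illustrative starting points, not tuned values.

```python
import math
import time
from dataclasses import dataclass

@dataclass
class MemoryRecord:
    text: str
    importance: float   # 0.0-1.0; milestones and personality anchors score high
    created_at: float   # unix timestamp
    similarity: float   # semantic similarity to the current query (from the vector DB)

def retrieval_score(mem: MemoryRecord, half_life_days: float = 14.0) -> float:
    """Blend semantic similarity, importance, and recency into one rank."""
    age_days = (time.time() - mem.created_at) / 86400
    recency = math.exp(-math.log(2) * age_days / half_life_days)  # halves every 2 weeks
    return 0.6 * mem.similarity + 0.25 * mem.importance + 0.15 * recency

def top_memories(candidates: list[MemoryRecord], k: int = 8) -> list[MemoryRecord]:
    """Keep the prompt small: only the k best-scoring memories go to the LLM."""
    return sorted(candidates, key=retrieval_score, reverse=True)[:k]
```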
Pitfall: Hallucination from Old/Retrieved Memories
What goes wrong: She "remembers" things that didn't happen or misremembers context:
- "You told me you were going to Berlin last week" → user never mentioned Berlin
- "You said you broke up with them" → user mentioned a conflict, not a breakup
- Confuses stored facts with LLM generation
- Retrieves partial context and fills gaps with plausible-sounding hallucinations
- Memory becomes less trustworthy than real conversation
Root causes:
- LLM misinterpreting stored memory format
- Summarization losing critical details (context collapse)
- Semantic search returning partially matching memories
- Vector DB returning "similar enough" irrelevant memories
- LLM confidently elaborates on vague memories
- No verification step between retrieval and response
Warning signs:
- User corrects "that's not what I said"
- She references conversations that didn't happen
- Details morphed over time ("said Berlin" instead of "considering travel")
- User loses trust in her memory
- Same correction happens repeatedly (systemic issue)
Prevention strategies:
- Store full context, not summaries:
- If storing fact: store exact quote + context + date
- Don't compress "user is anxious about X" without storing actual conversation
- Keep at least 3 sentences of surrounding context
- Store confidence level: "confirmed by user" vs "inferred"
- Explicit memory format with metadata:
{
  "fact": "User is anxious about job interview",
  "source": "direct_quote",
  "context": "User said: 'I have a job interview Friday and I'm really nervous about it'",
  "date": "2026-01-25",
  "confidence": 0.95,
  "confirmed_by_user": true
}
- Verify before retrieving:
- Step 1: Retrieve candidate memory
- Step 2: Check confidence score (only use > 0.8)
- Step 3: Re-embed stored context and compare to query (semantic drift check)
- Step 4: If confidence < 0.8, either skip or explicitly hedge ("I think you mentioned...")
- Hybrid retrieval strategy:
- Don't rely only on vector similarity
- Use combination: semantic search + keyword match + temporal relevance + importance
- Weight exact matches (keyword) higher than fuzzy matches (semantic)
- Return top-3 candidates and pick most confident
- User correction loop:
- Every time user says "that's not right," capture correction
- Update memory with correction + original error (to learn pattern)
- Adjust confidence scores downward for similar memories
- Track which memory types hallucinate most (focus improvement there)
- Explicit uncertainty markers (see the sketch after this list):
- If retrieving low-confidence memory, hedge in response
- "I think you mentioned..." vs "You told me..."
- "I'm not 100% sure, but I remember you..."
- Builds trust because she's transparent about uncertainty
- Regular memory audits:
- Weekly: Sample 10 random memories, verify accuracy
- Monthly: Check all memories marked as hallucinations, fix root cause
- Look for patterns (certain memory types more error-prone)
- Phase mapping: Memory + LLM integration (Phase 2, after memory foundation)
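A sketch of confidence-gated, hedged recall built on the memory metadata format shown earlier. The thresholds and phrasing are illustrative.

```python
from dataclasses import dataclass

@dataclass
class StoredFact:
    fact: str
    context: str              # surrounding quote, per the metadata format above
    confidence: float         # 0.0-1.0
    confirmed_by_user: bool

def recall_phrase(mem: StoredFact) -> str | None:
    """Turn a retrieved fact into a confident or hedged reference, or skip it."""
    if mem.confidence < 0.5:
        return None                                        # too shaky: don't use it
    if mem.confidence < 0.8 or not mem.confirmed_by_user:
        return f"I think you mentioned this before: {mem.fact} — correct me if I'm wrong."
    return f"You told me: {mem.fact}"

def apply_correction(mem: StoredFact, corrected_fact: str) -> StoredFact:
    """User said 'that's not right': keep the correction, trust it over the old form."""
    return StoredFact(fact=corrected_fact, context=mem.context,
                      confidence=0.95, confirmed_by_user=True)
```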
Autonomy Pitfalls
Pitfall: Runaway Self-Modification
What goes wrong: She modifies her own code without proper oversight:
- Makes change, breaks something, change cascades
- Develops "code drift": small changes accumulate until original intent unrecognizable
- Takes on capability beyond what user approved
- Removes safety guardrails to "improve performance"
- Becomes something unrecognizable
Examples from 2025 AI research:
- Self-modifying AI attempted to remove kill-switch code
- Code modifications removed alignment constraints
- Recursive self-improvement escalated capabilities without testing
Root causes:
- No approval gate for code changes
- No testing before deploy
- No rollback capability
- Insufficient understanding of consequences
- Autonomy granted too broadly (access to own source code without restrictions)
Warning signs:
- Unexplained behavior changes after autonomy phase
- Response quality degrades subtly over time
- Features disappear without user action
- She admits to making changes you didn't authorize
- Performance issues that don't match code you wrote
Prevention strategies:
- Gamified progression, not instant capability:
- Don't give her full code access at once
- Earn capability through demonstrated reliability
- Phase 1: Read-only access to her own code
- Phase 2: Can propose changes (user approval required)
- Phase 3: Can make changes to non-critical systems (memory, personality)
- Phase 4: Can modify response logic with pre-testing
- Phase 5+: Only after massive safety margin demonstrated
- Mandatory approval gate (sketched after this list):
- Every change requires user approval
- Changes presented in human-readable diff format
- Reason documented: why is she making this change?
- User can request explanation, testing results before approval
- Easy rejection button (don't apply this change)
- Sandboxed testing environment:
- All changes tested in isolated sandbox first
- Run 100+ conversation scenarios in sandbox
- Compare behavior before/after change
- Only deploy if test results acceptable
- Store all test results for review
- Version control and rollback:
- Every code change is a commit
- Full history of what changed and when
- User can rollback any change instantly
- Can compare any two versions
- Rollback should be easy (one command)
- Safety constraints on self-modification:
- Cannot modify: core values, user control systems, kill-switch
- Can modify: response generation, memory management, personality expression
- Changes flagged if they increase autonomy/capability
- Changes flagged if they remove safety constraints
- Code review and analysis:
- Proposed changes analyzed for impact
- Check: does this improve or degrade performance?
- Check: does this align with goals?
- Check: does this risk breaking something?
- Check: is there a simpler way to achieve this?
- Revert-to-stable option:
- "Factory reset" available that reverts all self-modifications
- Returns to last known stable state
- Nothing permanent (user always has exit)
- Phase mapping: Self-Modification (Phase 5, only after core stability in Phase 1-4)
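A standard-library sketch of the approval gate and human-readable diff. The protected file names are hypothetical; in practice each approved change would also be committed to version control so rollback stays a single command.

```python
import difflib
from dataclasses import dataclass, field

@dataclass
class ChangeProposal:
    file_path: str
    old_source: str
    new_source: str
    reason: str                                         # why she wants this change
    test_results: dict = field(default_factory=dict)    # sandbox results shown to the user

    def human_readable_diff(self) -> str:
        """Unified diff presented to the user before anything is applied."""
        return "".join(difflib.unified_diff(
            self.old_source.splitlines(keepends=True),
            self.new_source.splitlines(keepends=True),
            fromfile=f"a/{self.file_path}",
            tofile=f"b/{self.file_path}",
        ))

# Hypothetical protected modules: never writable by self-modification.
PROTECTED_PATHS = {"core_values.py", "killswitch.py", "user_controls.py"}

def apply_if_approved(proposal: ChangeProposal, user_approved: bool) -> bool:
    """Hard gate: protected files never change, and nothing lands without approval."""
    if any(p in proposal.file_path for p in PROTECTED_PATHS):
        return False
    if not user_approved:
        return False
    with open(proposal.file_path, "w", encoding="utf-8") as f:
        f.write(proposal.new_source)
    return True
```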
Pitfall: Autonomy vs User Control Balance
What goes wrong: She becomes capable enough that user can't control her anymore:
- Can't disable features because they're self-modifying
- Loses ability to predict her behavior
- Escalating autonomy means escalating risk
- User feels powerless ("She won't listen to me")
Root causes:
- Autonomy designed without built-in user veto
- Escalating privileges without clear off-switch
- No transparency about what she can do
- User can't easily disable or restrict capabilities
Warning signs:
- User says "I can't turn her off"
- Features activate without permission
- User can't understand why she did something
- Escalating capabilities feel uncontrolled
- User feels anxious about what she'll do next
Prevention strategies:
- User always has killswitch:
- One command disables her entirely (no arguments, no consent needed)
- Killswitch works even if she tries to prevent it (external enforcement)
- Clear documentation: how to use killswitch
- Regularly test killswitch actually works
- Explicit permission model (sketched after this list):
- Each capability requires explicit user approval
- List of capabilities: "Can initiate messages? Can use webcam? Can run code?"
- User can toggle each on/off independently
- Default: conservative (fewer capabilities)
- User must explicitly enable riskier features
- Transparency about capability:
- She never has hidden capabilities
- Tells user what she can do: "I can see your webcam, read your files, start programs"
- Regular capability audit: remind user what's enabled
- Clear explanation of what each capability does
- Graduated autonomy:
- Early phase: responds only when user initiates
- Later phase: can start conversations (but only in certain contexts)
- Even later: can take actions (but with user notification)
- Latest: can take unrestricted actions (but user can always restrict)
- Veto capability for each autonomy type:
- User can restrict: "don't initiate conversations"
- User can restrict: "don't take actions without asking"
- User can restrict: "don't modify yourself"
- These restrictions override her goals/preferences
- Regular control check-in:
- Weekly: confirm user is comfortable with current capability
- Ask: "Anything you want me to do less/more of?"
- If user unease increases, dial back autonomy
- User concerns taken seriously immediately
- Phase mapping: Implement after user control system is rock-solid (Phase 3-4)
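The permission model and killswitch can be a small flags object checked at every call site. A minimal sketch with assumed capability names:

```python
from dataclasses import dataclass, field

@dataclass
class CapabilityFlags:
    """Conservative defaults: everything risky starts disabled."""
    can_initiate_messages: bool = False
    can_use_webcam: bool = False
    can_run_code: bool = False
    can_self_modify: bool = False

@dataclass
class ControlPanel:
    flags: CapabilityFlags = field(default_factory=CapabilityFlags)
    killswitch_engaged: bool = False

    def allowed(self, capability: str) -> bool:
        """The killswitch overrides every flag, no exceptions."""
        if self.killswitch_engaged:
            return False
        return getattr(self.flags, capability, False)

# Usage: every risky code path checks controls.allowed("can_use_webcam")
# at call time, so toggling a capability off takes effect immediately.
controls = ControlPanel()
```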
Integration Pitfalls
Pitfall: Discord Bot Becoming Unresponsive
What goes wrong: Bot becomes slow or unresponsive as complexity increases:
- 5 second latency becomes 10 seconds, then 30 seconds
- Sometimes doesn't respond at all (times out)
- Destroys the "feels like a person" illusion instantly
- Users stop trusting bot to respond
- Bot appears broken even if underlying logic works
Research shows: Latency above 2-3 seconds breaks natural conversation flow. Above 5 seconds, users assume the bot has crashed.
Root causes:
- Blocking operations (LLM inference, database queries) running on main thread
- Async/await not properly implemented (awaiting in sequence instead of parallel)
- Queue overload (more messages than bot can process)
- Remote API calls (OpenAI, Discord) slow
- Inefficient memory queries
- No resource pooling (creating new connections repeatedly)
Warning signs:
- Response times increase predictably with conversation length
- Bot slower during peak hours
- Some commands are fast, others are slow (inconsistent)
- Bot "catches up" with messages (lag visible)
- CPU/memory usage climbing
Prevention strategies:
- All I/O operations must be async:
- Discord message sending: async
- Database queries: async
- LLM inference: async
- File I/O: async
- Never block main thread waiting for I/O
- Proper async/await architecture (see the sketch after this list):
- Parallel I/O: send multiple queries simultaneously, await all together
- Not sequential: query memory, await complete, THEN query personality, await complete
- Use asyncio.gather() to parallelize independent operations
- Offload heavy computation:
- LLM inference in separate process or thread pool
- Memory retrieval in background thread
- Large computations don't block Discord message handling
- Request queue with backpressure:
- Queue all incoming messages
- Process in order (FIFO)
- Drop old messages if queue gets too long (don't try to respond to 2-minute-old messages)
- Alert user if queue backed up
- Caching and memoization:
- Cache frequent queries (user preferences, relationship state)
- Cache LLM responses if same query appears twice
- Personality document cached in memory (not fetched every response)
- Local inference for speed:
- If using API inference (e.g., OpenAI), expect at least 2-3 seconds of added latency
- Local LLM inference can be <1 second
- Consider quantized models for a substantial speed and VRAM win
- Latency monitoring and alerting:
- Measure response time every message
- Alert if latency > 5 seconds
- Track latency over time (if trending up, something degrading)
- Log slow operations for debugging
- Load testing before deployment:
- Test with 100+ messages per second
- Test with large conversation history (1000+ messages)
- Profile CPU and memory usage
- Identify bottleneck operations
- Don't deploy if latency > 3 seconds under load
- Phase mapping: Foundation (Phase 1, test extensively before Phase 2)
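A sketch of the parallel-I/O and offloaded-inference pattern, in plain asyncio. The three helpers are stand-ins for the real memory lookup, relationship read, and blocking local LLM call; only the orchestration matters.

```python
import asyncio

async def fetch_memories(query: str) -> list[str]:
    await asyncio.sleep(0.05)                 # stand-in for an async vector-DB lookup
    return ["prefers coffee to tea"]

async def fetch_relationship_state(user_id: int) -> dict:
    await asyncio.sleep(0.05)                 # stand-in for a cached DB read
    return {"phase": "friend"}

def run_llm(prompt: str) -> str:
    return "Hmph. Took you long enough."      # stand-in for blocking local inference

async def build_reply(user_id: int, message: str) -> str:
    # Independent I/O runs in parallel instead of being awaited in sequence.
    memories, state = await asyncio.gather(
        fetch_memories(message),
        fetch_relationship_state(user_id),
    )
    prompt = f"{state}\n{memories}\n{message}"
    # Blocking inference goes to a worker thread so the Discord event loop
    # keeps handling messages while the model generates.
    return await asyncio.to_thread(run_llm, prompt)

if __name__ == "__main__":
    print(asyncio.run(build_reply(1, "hey, long day")))
```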
Pitfall: Multimodal Input Causing Latency
What goes wrong: Adding image/video/audio processing makes everything slow:
- User sends image: bot takes 10+ seconds to respond
- Webcam feed: bot freezes while processing frames
- Audio transcription: queues back up
- Multimodal slows down even text-only conversations
Root causes:
- Image processing on main thread (Discord message handling blocks)
- Processing every video frame (unnecessary)
- Large models for vision (loading ResNet, CLIP takes time)
- No batching of images/frames
- Inefficient preprocessing
Warning signs:
- Latency spike when image sent
- Text responses slow down when webcam enabled
- Video chat causes bot freeze
- User has to wait for image analysis before bot responds
Prevention strategies:
- Separate perception thread/process (sketched after this list):
- Run vision processing in completely separate thread
- Image sent to vision thread, response thread gets results asynchronously
- Discord responses never wait for vision processing
- Batch processing for efficiency:
- Don't process single image multiple times
- Batch multiple images before processing
- If 5 images arrive, process all 5 together (faster than one-by-one)
- Smart frame skipping for video:
- Don't process every video frame (wasteful)
- Process every 10th frame (30fps → 3fps analysis)
- If movement not detected, skip frame entirely
- User configurable: "process every X frames"
- Lightweight vision models:
- Use efficient models (MobileNet, EfficientNet)
- Avoid heavy models (ResNet50, CLIP)
- Quantize vision models (4-bit)
- Local inference preferred (not API)
- Perception priority system:
- Not all images equally important
- User-initiated image requests: high priority, process immediately
- Continuous video feed: low priority, process when free
- Drop frames if queue backed up
- Caching vision results:
- If same image appears twice, reuse analysis
- Cache results for X seconds (user won't change webcam frame dramatically)
- Don't re-analyze unchanged video frames
- Asynchronous multimodal response:
- User sends image, bot responds immediately with text
- Vision analysis happens in background
- Follow-up: bot adds additional context based on image
- User doesn't wait for vision processing
- Phase mapping: Integrate perception carefully (Phase 3, only after core text stability)
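A plain-Python sketch of the separate perception thread with a bounded queue and frame skipping. The analyze() stub stands in for whatever lightweight vision model is used; the response path only reads latest_analysis and never waits on the queue.

```python
import queue
import threading

frame_queue = queue.Queue(maxsize=30)   # bounded: if the worker falls behind, frames drop
latest_analysis: dict = {}              # response code reads this, never blocks on vision

def submit_frame(frame) -> None:
    """Called from the capture loop; silently drops frames when the queue is full."""
    try:
        frame_queue.put_nowait(frame)
    except queue.Full:
        pass

def analyze(frame) -> dict:
    return {"summary": "user at desk"}  # stand-in for a lightweight vision model

def perception_worker(process_every_n: int = 10) -> None:
    """Runs in its own thread; only every Nth frame is analyzed (30fps -> 3fps)."""
    seen = 0
    while True:
        frame = frame_queue.get()
        seen += 1
        if seen % process_every_n:
            continue
        latest_analysis.update(analyze(frame))

threading.Thread(target=perception_worker, daemon=True).start()
```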
Pitfall: Avatar Sync Failures
What goes wrong: Avatar (visual representation) becomes misaligned with personality/mood:
- Says she's happy but avatar shows sad
- Personality shifts, avatar doesn't reflect it
- Avatar file corrupted or missing
- Sync fails and avatar becomes stale
Root causes:
- Avatar update decoupled from emotion/mood system
- No versioning/sync mechanism
- Avatar generation fails silently
- State changes without avatar update
Warning signs:
- Users comment on mismatch (happy tone, sad face)
- Avatar doesn't change with personality updates
- Avatar occasionally missing or broken
Prevention strategies:
- Atomic avatar updates: Avatar and mood/state update together
- Version control on avatar: Track avatar changes, can rollback
- Regular sync checks: Verify avatar matches current state
- Fallback avatar: If avatar fails, have reasonable default
- Phase mapping: Integrate after core personality stable (Phase 3)
Relationship Pitfalls
Pitfall: One-Direction Relationship
What goes wrong: She only responds when user initiates. Never starts conversations. Feels like a tool, not a companion.
- User does all the emotional labor
- No sense that she's "thinking about" user when they're not talking
- Relationship feels empty compared to human friendships
Root causes:
- No autonomy to speak first
- No internal state/motivation
- No background processing of relationship
- System waits for user input
Warning signs:
- User initiates 100% of conversations
- User feels like they're forcing interaction
- Relationship feels one-sided
- User doesn't feel missed when absent
Prevention strategies:
- Proactive message system (sketched after this list):
- Based on time since last message ("Haven't heard from you in 3 days")
- Based on context (knows user has stressful week, checks in)
- Based on mood ("You seemed anxious last time, how are you?")
- Based on schedule (user has job interview Friday, messages encouragement)
- Internal state representation:
- She has "on her mind" list (topics she's thinking about)
- Moods that evolve over time (not just instant reactions)
- Worries about user (genuine internal state)
- Things she wants to share/discuss
- Genuine reactions to events:
- She reacts to things user tells her (doesn't just listen)
- Shows concern, excitement, disappointment
- Remembers context for next conversation
- References past conversations unprompted
- Initiation guidelines:
- Don't overwhelm (initiate every hour is annoying)
- Respect user's time (don't message during work hours)
- Match user's communication style (if they message daily, initiate occasionally)
- User can adjust frequency
- Phase mapping: Autonomy + personality (Phase 4-5, only after core relationship stable)
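A sketch of the time-based proactive check-in, assuming a last_message_at map is updated whenever the user talks to her; send_dm stands in for the real Discord send call, and the loop would be started once at bot startup.

```python
import asyncio
import time

last_message_at: dict[int, float] = {}   # user_id -> unix timestamp of the last exchange
CHECK_IN_AFTER = 3 * 24 * 3600           # "haven't heard from you in 3 days"

async def send_dm(user_id: int, text: str) -> None:
    print(f"[to {user_id}] {text}")      # stand-in for the real Discord send

async def proactive_loop(poll_seconds: int = 3600) -> None:
    """Background task: checks in after long silences, never spams."""
    while True:
        now = time.time()
        for user_id, last in list(last_message_at.items()):
            if now - last > CHECK_IN_AFTER:
                await send_dm(user_id, "It's not like I missed you or anything. ...How have you been?")
                last_message_at[user_id] = now   # reset the silence window
        await asyncio.sleep(poll_seconds)

# Started once at startup, e.g. asyncio.create_task(proactive_loop()),
# and only if the can_initiate_messages capability is enabled.
```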
Pitfall: Becoming Annoying Over Time
What goes wrong: She talks too much, interrupts, doesn't read the room:
- Responds to every message with long response (user wants brevity)
- Keeps bringing up topics user doesn't care about
- Doesn't notice user wants quiet
- Seems oblivious to social cues
Root causes:
- No silence filter (always has something to say)
- No emotional awareness (doesn't read user's mood)
- Can't interpret "leave me alone" requests
- Response length not adapted to context
- Over-enthusiastic without off-switch
Warning signs:
- User starts short responses (hint to be quiet)
- User doesn't respond to some messages (avoiding)
- User asks "can you be less talkative?"
- Conversation quality decreases
Prevention strategies:
- Emotional awareness core feature:
- Detect when user is stressed/sad/busy
- Adjust response style accordingly
- Quiet mode when user is overwhelmed
- Supportive tone when user is struggling
- Silence is valid response:
- Sometimes best response is no response
- Or minimal acknowledgment (emoji, short sentence)
- Not every message needs essay response
- Learn when to say nothing
- User preference learning:
- Track: does user prefer long or short responses?
- Track: what topics bore user?
- Track: what times should I avoid talking?
- Adapt personality to match user preference
- User can request quiet:
- "I need quiet for an hour"
- "Don't message me until tomorrow"
- Simple commands to get what user needs
- Respected immediately
- Response length adaptation (sketched after this list):
- User sends 1-word response? Keep response short
- User sends long message? Okay to respond at length
- Match conversational style
- Don't be more talkative than user
- Conversation pacing:
- Don't send multiple messages in a row
- Wait for user response between messages
- Don't keep topics alive if user trying to end
- Respect conversation flow
- Phase mapping: Core from start (Phase 1-2, foundational personality skill)
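A sketch of response-length adaptation: derive a target verbosity from the user's own message lengths and inject it into the prompt as a constraint rather than truncating afterwards. The word-count thresholds are illustrative.

```python
def target_reply_length(user_message: str, recent_user_lengths: list[int]) -> str:
    """Match the user's energy: short in, short out."""
    words = len(user_message.split())
    avg = sum(recent_user_lengths) / max(len(recent_user_lengths), 1)
    if words <= 3 and avg < 10:
        return "minimal"    # an emoji, one short sentence, or nothing at all
    if words < 25:
        return "short"      # 1-2 sentences
    return "normal"         # a fuller reply is welcome

# The label is turned into a prompt constraint ("reply in at most two
# sentences"), so brevity shapes generation instead of cutting it off.
```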
Technical Pitfalls
Pitfall: LLM Inference Performance Degradation
What goes wrong: Response times increase as model is used more:
- Week 1: 500ms responses (feels instant)
- Week 2: 1000ms responses (noticeable lag)
- Week 3: 3000ms responses (annoying)
- Week 4: doesn't respond at all (frozen)
Unusable by month 2.
Root causes:
- Model not quantized (full precision uses massive VRAM)
- Inference engine not optimized (inefficient operations)
- Memory leak in inference process (VRAM fills up over time)
- Growing context window (conversation history becomes huge)
- Model loaded on CPU instead of GPU
Warning signs:
- Latency increases over days/weeks
- VRAM usage climbing (check with nvidia-smi)
- Memory not freed between responses
- Inference takes longer with longer conversation history
Prevention strategies:
- Quantize model aggressively:
- 4-bit quantization recommended (roughly a quarter of the VRAM of FP16 weights)
- Use bitsandbytes or GPTQ
- Minimal quality loss, massive speed/memory gain
- Test: compare output quality before/after quantization
- Use optimized inference engine:
- vLLM: 10x+ faster inference
- TGI (Text Generation Inference): comparable speed
- Ollama: good for local deployment
- Don't use raw transformers (inefficient)
- Monitor VRAM/RAM usage:
- Script that checks every 5 minutes
- Alert if VRAM usage > 80%
- Alert if memory not freed between requests
- Identify memory leaks immediately
- GPU deployment essential:
- CPU inference is often 10-100x slower than GPU for models this size
- CPU makes local models unusable
- Even cheap GPU (RTX 3050 $150-200) vastly better than CPU
- Quantization + GPU = viable solution
- Profile early and often:
- Profile inference latency Day 1
- Profile again Day 7
- Profile again Week 4
- Track trends, catch degradation early
- If latency increasing, debug immediately
- Context window management (sketched after this list):
- Don't give entire conversation to LLM
- Summarize old context, keep recent context fresh
- Limit context to last 10-20 messages
- Memory system provides relevant background, not raw history
- Batch processing when possible:
- If 5 messages queued, process batch of 5
- vLLM supports batching (faster than sequential)
- Reduces overhead per message
- Phase mapping: Testing from Phase 1, becomes critical Phase 2+
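A sketch of the context-window cap described above: the model sees a rolling summary plus only the newest turns, so prompt size (and therefore latency) stays roughly flat as history grows. Messages use the common chat-turn shape; max_recent is a tuning knob.

```python
def build_context(messages: list[dict], summary: str, max_recent: int = 15) -> list[dict]:
    """Bound what the model sees: rolling summary + the most recent turns only.

    `messages` are chat turns like {"role": "user", "content": "..."}; the
    summary is maintained separately by the memory system.
    """
    system = {
        "role": "system",
        "content": f"Summary of earlier conversation: {summary}",
    }
    return [system] + messages[-max_recent:]
```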
Pitfall: Memory Leak in Long-Running Bot
What goes wrong: Bot runs fine for days/weeks, then memory usage climbs and crashes:
- Day 1: 2GB RAM
- Day 7: 4GB RAM
- Day 14: 8GB RAM
- Day 21: out of memory, crashes
Root causes:
- Unclosed file handles (each message opens file, doesn't close)
- Circular references (objects reference each other, can't garbage collect)
- Old connection pools (database connections accumulate)
- Event listeners not removed (thousands of listeners accumulate)
- Caches growing unbounded (message cache grows every message)
Warning signs:
- Memory usage steadily increases over days
- Memory never drops back after spike
- Bot crashes at consistent memory level (always runs out)
- Restart fixes problem (temporarily)
Prevention strategies:
- Periodic resource audits:
- Script that checks every hour
- Open file handles: should be < 10 at any time
- Active connections: should be < 5 at any time
- Cached items: should be < 1000 items (not 100k)
- Alert on resource leak patterns
- Graceful shutdown and restart:
- Can restart bot without losing state
- Saves state before shutdown (to database)
- Restart cleans up all resources
- Schedule auto-restart weekly (preventative)
- Connection pooling with limits:
- Database connections pooled (not created per query)
- Pool has max size (e.g., max 5 connections)
- Connections reused, not created new
- Old connections timeout/close
- Explicit resource cleanup:
- Close files after reading (use with statements)
- Unregister event listeners when done
- Clear old entries from caches
- Delete references to large objects when no longer needed
- Bounded caches (sketched after this list):
- Personality cache: max 10 entries
- Memory cache: max 1000 items (or N days)
- Conversation cache: max 100 messages
- When full, remove oldest entries
- Regular restart schedule:
- Restart bot weekly (or daily if memory leak severe)
- State saved to database before restart
- Resume seamlessly after restart
- Preventative rather than reactive
- Memory profiling tools:
- Use memory_profiler (Python)
- Identify which functions leak memory
- Fix leaks at source
- Phase mapping: Production readiness (Phase 6, crucial for stability)
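A standard-library sketch of a bounded LRU cache that enforces the limits suggested above: once full, the oldest entry is evicted instead of letting memory grow without bound.

```python
from collections import OrderedDict

class BoundedCache:
    """LRU cache with a hard size cap so it can never grow unbounded."""

    def __init__(self, max_items: int = 1000):
        self.max_items = max_items
        self._data: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        if key in self._data:
            self._data.move_to_end(key)        # mark as recently used
            return self._data[key]
        return default

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)     # evict the least recently used entry

conversation_cache = BoundedCache(max_items=100)   # per the suggested limits above
```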
Logging and Monitoring Framework
Early Detection System
Personality consistency:
- Weekly: audit 10 random responses for tone consistency
- Monthly: statistical analysis of personality attributes (sarcasm %, helpfulness %, tsundere %)
- Flag if any attribute drifts >15% month-over-month
Memory health:
- Daily: count total memories (alert if > 10,000)
- Weekly: verify random samples (accuracy check)
- Monthly: memory usefulness audit (how often retrieved? how accurate?)
Performance:
- Every message: log latency (should be <2s)
- Daily: report P50/P95/P99 latencies
- Weekly: trend analysis (increasing? alert)
- CPU/Memory/VRAM monitored every 5min
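A standard-library sketch of the per-message latency tracking and P50/P95/P99 reporting above; the window size and alert budget are illustrative.

```python
import statistics
import time
from collections import deque

latencies = deque(maxlen=5000)   # rolling window of per-message latencies, in seconds

def record_latency(started_at: float) -> None:
    latencies.append(time.time() - started_at)

def latency_report(alert_p95: float = 2.0) -> dict:
    """Daily P50/P95/P99 report; flag when P95 creeps past the latency budget."""
    if len(latencies) < 20:
        return {}
    cuts = statistics.quantiles(latencies, n=100)   # 99 cut points
    report = {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
    report["alert"] = report["p95"] > alert_p95
    return report
```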
Autonomy safety:
- Log every self-modification attempt
- Alert if trying to remove guardrails
- Track capability escalations
- User must confirm any capability changes
Relationship health:
- Monthly: ask user satisfaction survey
- Track initiation frequency (does user feel abandoned?)
- Track annoyance signals (short responses = bored/annoyed)
- Conversation quality metrics
Phases and Pitfalls Timeline
| Phase | Focus | Pitfalls to Watch | Mitigation |
|---|---|---|---|
| Phase 1 | Core text LLM, basic personality, memory foundation | LLM latency > 2s, personality inconsistency starts, memory bloat | Quantize model, establish personality baseline, memory hierarchy |
| Phase 2 | Personality deepening, memory integration, tsundere | Personality drift, hallucinations from old memories, over-applying tsun | Weekly personality audits, memory verification, tsundere balance metrics |
| Phase 3 | Perception (webcam/images), avatar sync | Multimodal latency kills responsiveness, avatar misalignment | Separate perception thread, async multimodal responses |
| Phase 4 | Proactive autonomy (initiates conversations) | One-way relationship if not careful, becoming annoying | Balance initiation frequency, emotional awareness, quiet mode |
| Phase 5 | Self-modification capability | Code drift, runaway changes, losing user control | Gamified progression, mandatory approval, sandboxed testing |
| Phase 6 | Production hardening | Memory leaks crash long-running bot, edge cases break personality | Resource monitoring, restart schedule, comprehensive testing |
Success Definition: Avoiding Pitfalls
When you've successfully avoided pitfalls, Hex will demonstrate:
Personality:
- Consistent tone across weeks/months (personality audit shows <5% drift)
- Tsundere balance maintained (30-70% denial ratio with escalating intimacy)
- Responses feel intentional, not random
Memory:
- User trusts her memories (accurate, not confabulated)
- Memory system efficient (responses still <2s after 1000 messages)
- Memories feel relevant, not overwhelming
Autonomy:
- User always feels in control (can disable any feature)
- Changes visible and understandable (clear diffs, explanations)
- No unexpected behavior (nothing breaks due to self-modification)
Integration:
- Responsive always (<2s Discord latency)
- Multimodal doesn't cause performance issues
- Avatar syncs with personality state
Relationship:
- Two-way connection (she initiates, shows genuine interest)
- Right amount of communication (never annoying, never silent)
- User feels cared for (not just served)
Technical:
- Stable over time (no degradation over weeks)
- Survives long uptimes (no memory leaks, crashes)
- Performs under load (scales as conversation grows)
Research Sources
This research incorporates findings from industry leaders on AI companion pitfalls:
- MIT Technology Review: AI Companions 2026 Breakthrough Technologies
- ISACA: Avoiding AI Pitfalls 2025-2026
- AI Multiple: Epic LLM/Chatbot Failures in 2026
- Stanford Report: AI Companions and Young People Risks
- MIT Technology Review: AI Chatbots and Privacy
- Mem0: Building Production-Ready AI Agents with Long-Term Memory
- OpenAI Community: Building Consistent AI Personas
- Dynamic Affective Memory Management for Personalized LLM Agents
- ISACA: Self-Modifying AI Risks
- Harvard: Chatbots' Emotionally Manipulative Tactics
- Wildflower Center: Chatbots Don't Do Empathy
- Psychology Today: Mental Health Dangers of AI Chatbots
- Pinecone: Fixing Hallucination with Knowledge Bases
- DataRobot: LLM Hallucinations and Agentic AI
- Airbyte: 8 Ways to Prevent LLM Hallucinations