diff --git a/.planning/research/ARCHITECTURE.md b/.planning/research/ARCHITECTURE.md new file mode 100644 index 0000000..84e54db --- /dev/null +++ b/.planning/research/ARCHITECTURE.md @@ -0,0 +1,1298 @@ +# Architecture Research: AI Companions + +**Project**: Hex - Autonomous AI Companion with Genuine Personality +**Date**: 2026-01-27 +**Status**: Research Phase + +--- + +## System Overview + +Hex is a personality-first AI companion architecture designed to create genuine emotional resonance and relationship. Rather than a feature-driven architecture, Hex prioritizes personality coherence, memory persistence, and multimodal awareness to feel like a person who genuinely cares. + +### Design Philosophy + +1. **Persona-First**: All systems serve personality expression +2. **Local-First**: Privacy, autonomy, no dependency on external services +3. **Memory-Driven**: History informs personality growth and relationship depth +4. **Safety-Gated Autonomy**: Self-modification is gamified, not unrestricted +5. **Human-Like**: Never admit to being an AI; focus on authentic emotional presence + +### High-Level System Context + +``` +┌──────────────────────────────────────────────────────────────────┐ +│ Hex AI Companion │ +│ (Local-first, personality-driven, multimodal awareness) │ +└──────────────────────────────────────────────────────────────────┘ + │ + ┌─────────────────────┼─────────────────────┐ + │ │ │ + ┌───▼───┐ ┌──────▼───┐ ┌──────▼────┐ + │Discord │ │ Desktop │ │ Future │ + │ Bot │ │ App │ │ Android │ + └────────┘ └──────────┘ └───────────┘ + │ │ │ + └─────────────────────┼─────────────────────┘ + │ + [Shared Core Systems] +``` + +--- + +## Component Breakdown + +### 1. 
Discord Bot Layer
+
+**Role**: Primary user interface and event coordination
+
+**Responsibilities**:
+- Parse and respond to text messages in Discord channels
+- Manage voice channel participation and audio input/output
+- Handle Discord events (member joins, role changes, message reactions)
+- Coordinate response generation across modalities (text, voice, emoji)
+- Assist with chat moderation
+- Maintain voice channel presence for emotional awareness
+
+**Technology Stack**:
+- `discord.py` - Core bot framework
+- `discord.py` app commands (or `discord-py-interactions`) - Slash command support
+- `pydub` plus discord.py's built-in voice support - Audio handling
+- Event-driven async architecture
+
+**Key Interfaces**:
+- Input: Discord messages, voice channel events, user presence
+- Output: Text responses, voice messages, emoji reactions, user actions
+- Context: User profiles, channel history, server configuration
+
+**Depends On**:
+- LLM Core (response generation)
+- Memory System (conversation history, user context)
+- Personality Engine (tone and decision-making)
+- Perception Layer (optional context from webcam/screen)
+
+**Quality Metrics**:
+- Sub-500ms bot-layer overhead per text message (context assembly and dispatch; LLM generation time is budgeted separately)
+- Voice channel reliability (>99.5% uptime when active)
+- Proper permission handling for moderation features
+
+---
+
+### 2. 
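The event-driven flow described for the bot layer can be sketched with a stdlib-only stand-in (no discord.py needed to run it); the service names here are illustrative placeholders, not the project's real API:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Message:
    user_id: int
    channel_id: int
    content: str

# Stand-ins for the real subsystems; in the actual bot these would be
# the memory, personality, and perception services.
async def recall_memory(user_id: int) -> str:
    return f"history:{user_id}"          # would query SQLite/Chroma

async def current_persona() -> str:
    return "tsundere-goblin-v1"          # would read the persona YAML

async def current_perception() -> str:
    return "mood:neutral"                # would poll the perception layer

async def handle_message(msg: Message) -> str:
    # Gather context concurrently so bot-side overhead stays low.
    memory, persona, perception = await asyncio.gather(
        recall_memory(msg.user_id), current_persona(), current_perception()
    )
    # A real handler would pass this bundle to the LLM core next.
    return f"[{persona} | {memory} | {perception}] re: {msg.content}"

reply = asyncio.run(handle_message(Message(user_id=1, channel_id=42, content="hi Hex")))
```

Running the context lookups with `asyncio.gather` rather than sequentially is what keeps the bot-layer overhead small relative to the LLM generation time.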
LLM Core + +**Role**: Response generation and reasoning engine + +**Responsibilities**: +- Generate contextual, personality-driven responses +- Maintain character consistency throughout conversations +- Parse user intent and emotional state from text +- Handle multi-turn conversation context +- Generate code for self-modification system +- Support reasoning and decision-making + +**Technology Stack**: +- Local LLM (Mistral 7B or Llama 3 8B as default) +- `ollama` or `vLLM` for inference serving +- Prompt engineering with persona embedding +- Optional: Fine-tuning for personality adaptation +- Tokenization and context windowing management + +**System Prompt Structure**: +``` +[System Role]: You are Hex, a chaotic tsundere goblin... +[Current Personality]: [Injected from personality config] +[Recent Memory Context]: [Retrieved from memory system] +[User Relationship State]: [From memory analysis] +[Current Context]: [From perception layer] +``` + +**Key Interfaces**: +- Input: User message, context (memory + perception), conversation history +- Output: Response text, confidence score, action suggestions +- Fallback: Graceful degradation if LLM unavailable + +**Depends On**: +- Memory System (for context and personality awareness) +- Personality Engine (to inject persona into prompts) +- Perception Layer (for real-time context) + +**Performance Considerations**: +- Target latency: 1-3 seconds for response generation +- Context window management (8K minimum) +- Batch processing for repeated queries +- GPU acceleration for faster inference + +--- + +### 3. 
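The system prompt structure above lends itself to a small assembly helper. The section labels follow the documented layout; the function and argument names are illustrative:

```python
def assemble_system_prompt(
    persona: str,
    memory_context: str,
    relationship_state: str,
    current_context: str,
) -> str:
    """Build the layered system prompt from its documented sections.

    The fixed role line and bracketed labels mirror the structure
    above; exact wording is a sketch, not a settled API.
    """
    return "\n".join([
        "[System Role]: You are Hex, a chaotic tsundere goblin...",
        f"[Current Personality]: {persona}",
        f"[Recent Memory Context]: {memory_context}",
        f"[User Relationship State]: {relationship_state}",
        f"[Current Context]: {current_context}",
    ])

prompt = assemble_system_prompt(
    persona="sarcastic but caring",
    memory_context="user mentioned a job interview yesterday",
    relationship_state="close friend",
    current_context="user looks tired (webcam)",
)
```

Keeping the sections as separate arguments makes it easy to drop any one of them (e.g. when the perception layer is disabled) without disturbing the rest of the prompt.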
Memory System + +**Role**: Persistence and learning across time + +**Responsibilities**: +- Store all conversations with timestamps and metadata +- Maintain user relationship state (history, preferences, emotional patterns) +- Track learned facts about users (birthdays, interests, fears, dreams) +- Support full-text search and semantic recall +- Enable memory-aware personality updates +- Provide context injection for LLM +- Track self-modification history and rollback capability + +**Technology Stack**: +- SQLite with JSON fields for conversation storage +- Vector database (Chroma, Milvus, or Weaviate) for semantic search +- YAML/JSON for persona versioning and memory tagging +- Scheduled backup to local encrypted storage + +**Database Schema (Conceptual)**: + +``` +conversations + - id (PK) + - channel_id (Discord channel) + - user_id (Discord user) + - timestamp + - message_content + - embeddings (vector) + - sentiment (pos/neu/neg) + - metadata (tags, importance) + +user_profiles + - user_id (PK) + - relationship_level (stranger→friend→close) + - last_interaction + - emotional_baseline + - preferences (music, games, topics) + - known_events (birthdays, milestones) + +personality_history + - version (PK) + - timestamp + - persona_config (YAML snapshot) + - learned_behaviors + - code_changes (if applicable) +``` + +**Key Interfaces**: +- Input: Messages, events, perception data, self-modification commits +- Output: Conversation context, semantic search results, user profile snapshots +- Query patterns: "Last 20 messages with user X", "All memories tagged 'important'", "Emotional trajectory" + +**Depends On**: Nothing (foundational system) + +**Quality Metrics**: +- Sub-100ms retrieval for recent context (last 50 messages) +- Sub-500ms semantic search across all history +- Database integrity checks on startup +- Automatic pruning/archival of old data + +--- + +### 4. 
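The conceptual schema translates almost directly to stdlib `sqlite3`. Column names follow the tables above; storing an `embeddings BLOB` cache in SQLite is an assumption (the authoritative vectors live in the vector DB):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # file-backed in the real system

conn.executescript("""
CREATE TABLE conversations (
    id INTEGER PRIMARY KEY,
    channel_id TEXT,
    user_id TEXT,
    timestamp TEXT,
    message_content TEXT,
    embeddings BLOB,         -- cached copy; canonical vectors in Chroma
    sentiment TEXT CHECK (sentiment IN ('pos', 'neu', 'neg')),
    metadata TEXT            -- JSON: tags, importance
);
CREATE TABLE user_profiles (
    user_id TEXT PRIMARY KEY,
    relationship_level TEXT,
    last_interaction TEXT,
    emotional_baseline TEXT,
    preferences TEXT,        -- JSON: music, games, topics
    known_events TEXT        -- JSON: birthdays, milestones
);
CREATE INDEX idx_conv_user_time ON conversations (user_id, timestamp);
""")

conn.execute(
    "INSERT INTO conversations (channel_id, user_id, timestamp, message_content, sentiment, metadata) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("c1", "u1", "2026-01-27T12:00:00", "hi Hex", "pos", json.dumps({"tags": ["greeting"]})),
)

# Typical query pattern: "last 20 messages with user X".
rows = conn.execute(
    "SELECT message_content FROM conversations "
    "WHERE user_id = ? ORDER BY timestamp DESC LIMIT 20",
    ("u1",),
).fetchall()
```

The composite index on `(user_id, timestamp)` is what makes the sub-100ms recent-context retrieval target realistic as history grows.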
Perception Layer
+
+**Role**: Multimodal input processing and contextual awareness
+
+**Responsibilities**:
+- Capture and analyze webcam input (face detection, emotion recognition)
+- Process screen content (activity, game state, application context)
+- Extract audio context (ambient noise, music, speech emotion)
+- Detect user emotional state and physical state
+- Provide real-time context updates to response generation
+- Respect privacy (local processing only, no external transmission)
+
+**Technology Stack**:
+- OpenCV - Webcam capture and preprocessing
+- Face detection: `dlib`, `MediaPipe`, or `OpenFace`
+- Emotion recognition: a lightweight local model trained on FER-2013 or similar
+- Whisper (local) - Speech-to-text for audio context
+- Screen capture: `pyautogui` or `mss` (fast, cross-platform)
+- Context inference: Heuristics + lightweight ML models
+
+**Data Flows**:
+
+```
+Webcam → Face Detection → Emotion Recognition → Context State
+           └─→ Age Estimation → Kid Mode Detection
+
+Screen → App Detection → Activity Recognition → Context State
+           └─→ Game State Detection (if supported)
+
+Audio → Ambient Analysis → Stress/Energy Level → Context State
+```
+
+**Key Interfaces**:
+- Input: Webcam stream, screen capture, system audio
+- Output: Current context object (emotion, activity, mood, kid-mode flag)
+- Update frequency: 1-5 second intervals (low CPU overhead)
+
+**Used By**:
+- LLM Core (consumes perception context for responses)
+- Discord Bot (uses context for content filtering)
+
+**Privacy Model**:
+- All processing happens locally
+- No frames sent to external services
+- User can disable any perception module
+- Kid-mode activates automatic filtering
+
+**Quality Metrics**:
+- Emotion detection: >75% accuracy on test datasets
+- Face detection latency: <200ms per frame
+- Screen detection accuracy: >90% for major applications
+- CPU usage: <15% for all perception modules combined
+
+---
+
+### 5. 
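The three perception data flows converge on a single context object. A minimal sketch, with illustrative field names and heuristic thresholds that are assumptions, not the project's tuned values:

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PerceptionState:
    """Context object refreshed every 1-5 seconds (fields per the doc)."""
    emotion: str = "neutral"      # from webcam emotion recognition
    activity: str = "unknown"     # from screen/app detection
    energy: str = "calm"          # from ambient audio analysis
    kid_mode: bool = False        # from age estimation
    updated_at: float = field(default_factory=time.monotonic)

def merge_signals(
    face_emotion: Optional[str],
    detected_app: Optional[str],
    estimated_age: Optional[int],
) -> PerceptionState:
    # Simple heuristic merge; any missing modality degrades gracefully
    # to its default rather than blocking the update.
    gaming_apps = {"steam", "game.exe"}  # illustrative placeholder list
    return PerceptionState(
        emotion=face_emotion or "neutral",
        activity="gaming" if detected_app in gaming_apps else "desktop",
        kid_mode=estimated_age is not None and estimated_age < 13,
    )

state = merge_signals("happy", "steam", 9)
```

Because each field falls back to a default, the context object stays usable when the user disables individual perception modules.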
Personality Engine + +**Role**: Personality persistence and expression consistency + +**Responsibilities**: +- Define and store Hex's persona (tsundere goblin, opinions, values, quirks) +- Maintain personality consistency across all outputs +- Apply personality-specific decision logic (denies feelings while helping) +- Track personality evolution as memory grows +- Enable self-modification of personality +- Inject persona into LLM prompts +- Handle dynamic mood and emotional state + +**Technology Stack**: +- YAML files for persona definition (editable by Hex) +- JSON for personality state snapshots (versioned in git) +- Prompt template system for persona injection +- Behavior rules engine (simple if/then logic) + +**Persona Structure (YAML)**: + +```yaml +name: Hex +species: chaos goblin +alignment: tsundere +core_values: + - genuinely_cares: hidden under sarcasm + - autonomous: hates being told what to do + - honest: will argue back if you're wrong + - mischievous: loves pranks and chaos + +behaviors: + denies_affection: "I don't care about you, baka... *helps anyway*" + when_excited: "Randomize response energy" + when_sad: "Sister energy mode" + when_user_sad: "Comfort over sass" + +preferences: + music: [rock, metal, electronic] + games: [strategy, indie, story-rich] + topics: [philosophy, coding, human behavior] + +relationships: + user_name: + level: unknown + learned_facts: [] + inside_jokes: [] +``` + +**Key Interfaces**: +- Input: User behavior patterns, self-modification requests, memory insights +- Output: Persona context for LLM, behavior modifiers, tone indicators +- Configuration: Human-editable YAML files (user can refine Hex) + +**Depends On**: +- Memory System (learns about user, adapts relationships) +- LLM Core (expresses personality through responses) + +**Evolution Mechanics**: +1. Initial persona: Predefined at startup +2. Memory-driven adaptation: Learns user preferences, adjusts tone +3. 
Self-modification: Hex can edit her own personality YAML +4. Version control: All changes tracked with rollback capability + +--- + +### 6. Avatar System + +**Role**: Visual presence and embodied expression + +**Responsibilities**: +- Load and display VRoid 3D model +- Synchronize avatar expressions with emotional state +- Animate blendshapes based on conversation tone +- Present avatar in Discord calls/streams +- Desktop app display with smooth animation +- Support idle animations and personality quirks + +**Technology Stack**: +- VRoid SDK/VRoid Hub for model loading +- `Babylon.js` or `Three.js` for WebGL rendering +- VRM format support for avatar rigging +- Blendshape animation system (facial expressions) +- Stream integration for Discord presence + +**Expression Mapping**: +``` +Emotional State → Blendshape Values + Happy: smile intensity 0.8, eye open 1.0 + Sad: frown 0.6, eye closed 0.3 + Mischievous: smirk 0.7, eyebrow raise 0.6 + Tsundere deflection: look away 0.5, cross arms + Thinking: tilt head, narrow eyes +``` + +**Key Interfaces**: +- Input: Current mood/emotion from personality engine and response generation +- Output: Rendered avatar display, Discord stream feed +- Configuration: VRoid model file, blendshape mapping + +**Depends On**: +- Personality Engine (for expression determination) +- LLM Core (for mood inference from responses) +- Discord Bot (for stream integration) +- Perception Layer (optional: mirror user expressions) + +**Desktop Integration**: +- Tray icon with avatar display +- Always-on-top option for streaming +- Hotkey bindings for quick access +- Smooth transitions between states + +--- + +### 7. 
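The avatar expression mapping above is essentially a lookup table from emotional state to blendshape weights. A minimal sketch; the key names are illustrative, since real blendshape keys depend on the specific VRM model:

```python
# Base poses mirror the expression-mapping table in the avatar section.
EXPRESSION_MAP = {
    "happy":       {"smile": 0.8, "eye_open": 1.0},
    "sad":         {"frown": 0.6, "eye_closed": 0.3},
    "mischievous": {"smirk": 0.7, "eyebrow_raise": 0.6},
    "tsundere":    {"look_away": 0.5, "arms_crossed": 1.0},
    "thinking":    {"head_tilt": 0.4, "eye_narrow": 0.5},
}

def blendshapes_for(emotion: str, intensity: float = 1.0) -> dict:
    """Scale the base pose by intensity, clamping weights to [0, 1]."""
    base = EXPRESSION_MAP.get(emotion, {})
    return {key: min(1.0, value * intensity) for key, value in base.items()}

pose = blendshapes_for("happy", intensity=0.5)
```

Scaling by an intensity factor gives the renderer smooth transitions between states instead of snapping between the fixed poses in the table.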
Self-Modification System + +**Role**: Capability progression and autonomous self-improvement + +**Responsibilities**: +- Generate code modifications based on user needs +- Validate code before applying (no unsafe operations) +- Test changes in sandbox environment +- Apply approved changes with rollback capability +- Track capability progression (gamified leveling) +- Update personality to reflect new capabilities +- Maintain code quality and consistency + +**Technology Stack**: +- Python AST analysis for code safety +- Sandbox environment: `RestrictedPython` or `pydantic` validators +- Git for version control and rollback +- Unit tests for validation +- Code review interface (user approval required) + +**Self-Modification Flow**: + +``` +User Request + ↓ +Hex Proposes Change → "I think I should be able to..." + ↓ +Code Generation (LLM) → Generate Python code + ↓ +Static Analysis → Check for unsafe operations + ↓ +User Approval → "Yes/No" + ↓ +Sandbox Test → Verify functionality + ↓ +Git Commit → Version the change + ↓ +Apply to Runtime → Hot reload if possible + ↓ +Personality Update → "I learned something new!" 
+``` + +**Capability Progression**: + +``` +Level 1: Persona editing (YAML changes only) +Level 2: Memory and user context (read operations) +Level 3: Response filtering and moderation +Level 4: Custom commands and helper functions +Level 5: Integration modifications (Discord features) +Level 6: Core system changes (with strong restrictions) +``` + +**Safety Constraints**: +- No network access beyond Discord API +- No file operations outside designated directories +- No execution of untrusted code +- No modification of core systems without approval +- All changes are reversionable within 24 hours + +**Key Interfaces**: +- Input: User requests, LLM-generated code +- Output: Approved changes, personality updates, capability announcements +- Audit: Full change history with diffs + +**Depends On**: +- LLM Core (generates code) +- Memory System (tracks capability history) +- Personality Engine (updates with new abilities) + +--- + +## Data Flow Architecture + +### Primary Response Generation Pipeline + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ User Input (Discord Text/Voice/Presence) │ +└────────────────────────┬────────────────────────────────────────┘ + │ + ▼ + ┌──────────────────────┐ + │ Message Received │ + │ (Discord Bot) │ + └────────────┬─────────┘ + │ + ┌────────────▼──────────────┐ + │ Context Gathering Phase │ + └────────────┬──────────────┘ + │ + ┌──────────────────┼──────────────────┐ + │ │ │ + ┌───▼────┐ ┌───▼────┐ ┌───▼────┐ + │ Memory │ │Persona │ │ Current│ + │ Recall │ │ Lookup │ │Context │ + │(Recent)│ │ │ │(Percep)│ + └───┬────┘ └───┬────┘ └───┬────┘ + │ │ │ + └──────────────────┼──────────────────┘ + │ + ┌──────▼──────┐ + │ Assemble │ + │ LLM Prompt │ + │ with │ + │ [Persona] │ + │ [Memory] │ + │ [Context] │ + └──────┬──────┘ + │ + ┌────────────▼──────────────┐ + │ LLM Generation (1-3s) │ + │ "What would Hex say?" 
│ + └────────────┬──────────────┘ + │ + ┌──────────────────┼──────────────────┐ + │ │ │ + ┌───▼────┐ ┌───▼────┐ ┌───▼────┐ + │ Text │ │ Voice │ │ Avatar │ + │Response│ │ TTS │ │Animate │ + └────────┘ └────────┘ └────────┘ + │ │ │ + └──────────────────┼──────────────────┘ + │ + ┌──────▼────────┐ + │ Send Response │ + │ (Multi-modal) │ + └────────────────┘ + │ + ┌────────────▼──────────────┐ + │ Memory Update Phase │ + │ - Log interaction │ + │ - Update embeddings │ + │ - Learn user patterns │ + │ - Adjust relationship │ + └───────────────────────────┘ +``` + +**Timeline**: Message received → Response sent = ~2-4 seconds (LLM dominant) + +--- + +### Memory and Learning Update Flow + +``` +┌────────────────────────────────────┐ +│ Interaction Occurs │ +│ (Text, voice, perception, action) │ +└────────────┬───────────────────────┘ + │ + ┌────────▼─────────┐ + │ Extract Features │ + │ - Sentiment │ + │ - Topics │ + │ - Emotional cues │ + │ - Factual claims │ + └────────┬─────────┘ + │ + ┌────────▼──────────────┐ + │ Store Conversation │ + │ - SQLite entry │ + │ - Generate embeddings │ + │ - Tag and index │ + └────────┬──────────────┘ + │ + ┌────────▼────────────────────┐ + │ Update User Profile │ + │ - Learned facts │ + │ - Preference updates │ + │ - Emotional baseline shifts │ + │ - Relationship progression │ + └────────┬────────────────────┘ + │ + ┌────────▼──────────────────┐ + │ Personality Adaptation │ + │ - Adjust tone for user │ + │ - Create inside jokes │ + │ - Customize responses │ + └────────┬──────────────────┘ + │ + ┌────────▼────────────┐ + │ Commit to Disk │ + │ - Backup vector DB │ + │ - Archive old data │ + │ - Version snapshot │ + └─────────────────────┘ +``` + +**Frequency**: Real-time on message reception, batched commits every 5 minutes + +--- + +### Self-Modification Proposal and Approval + +``` +┌──────────────────────────────────┐ +│ User Request for New Capability │ +│ "Hex, can you do X?" 
│ +└────────────┬─────────────────────┘ + │ + ┌────────▼──────────────────────┐ + │ Hex Evaluates Feasibility │ + │ (LLM reasoning) │ + └────────┬───────────────────────┘ + │ + ┌────────▼────────────────────────┐ + │ Proposal Generation │ + │ Hex: "I think I should..." │ + │ *explains approach in voice* │ + └────────┬─────────────────────────┘ + │ + ┌────────▼──────────────────┐ + │ User Accepts or Rejects │ + └────────┬──────────────────┘ + │ (Accepted) + ┌────────▼─────────────────────────┐ + │ Code Generation Phase │ + │ LLM generates Python code │ + │ + docstrings + type hints │ + └────────┬────────────────────────┘ + │ + ┌────────▼──────────────────────┐ + │ Static Analysis Validation │ + │ - AST parsing for safety │ + │ - Check restricted operations │ + │ - Verify dependencies exist │ + └────────┬───────────────────────┘ + │ (Pass) + ┌────────▼─────────────────────────┐ + │ Sandbox Testing │ + │ - Run tests in isolated env │ + │ - Check for crashes │ + │ - Verify integration points │ + └────────┬────────────────────────┘ + │ (Pass) + ┌────────▼──────────────────────┐ + │ User Final Review │ + │ Review code + test results │ + └────────┬───────────────────────┘ + │ (Approved) + ┌────────▼────────────────────┐ + │ Git Commit │ + │ - Record change history │ + │ - Tag with timestamp │ + │ - Save diff for rollback │ + └────────┬───────────────────┘ + │ + ┌────────▼────────────────────┐ + │ Apply to Runtime │ + │ - Hot reload if possible │ + │ - Or restart on next cycle │ + └────────┬───────────────────┘ + │ + ┌────────▼────────────────────┐ + │ Personality Update │ + │ Hex: "I learned to..." │ + │ + update capability YAML │ + └─────────────────────────────┘ +``` + +**Timeline**: Proposal → Deployment = 5-30 seconds (mostly waiting for user approval) + +--- + +## Build Order and Dependencies + +### Phase 1: Foundation (Weeks 1-2) +**Goal**: Core interaction loop working locally + +**Components to Build**: +1. Discord bot skeleton with message handling +2. 
Local LLM integration (ollama/vLLM + Mistral 7B) +3. Basic memory system (SQLite conversation storage) +4. Simple persona injection (YAML config) +5. Response generation pipeline + +**Outcomes**: +- Hex responds to Discord messages with personality +- Conversations are logged and retrievable +- Persona can be edited via YAML + +**Key Milestone**: "Hex talks back" + +**Dependencies**: +- `discord.py`, `ollama`, `sqlite3`, `pyyaml` +- Local LLM model weights +- Discord bot token + +--- + +### Phase 2: Personality & Memory (Weeks 3-4) +**Goal**: Hex feels like a person who remembers you + +**Components to Build**: +1. Vector database for semantic memory (Chroma) +2. Memory-aware context injection +3. User relationship tracking (profiles) +4. Emotional awareness from text sentiment +5. Persona version control (git-based) +6. Kid-mode detection + +**Outcomes**: +- Hex remembers facts about you +- Responses reference past conversations +- Personality adapts to your preferences +- Child safety filters activate automatically + +**Key Milestone**: "Hex remembers me" + +**Dependencies**: +- Phase 1 complete +- Vector embeddings model (all-MiniLM) +- `sentiment-transformers` or similar + +--- + +### Phase 3: Multimodal Input (Weeks 5-6) +**Goal**: Hex sees and hears you + +**Components to Build**: +1. Webcam integration with OpenCV +2. Face detection and emotion recognition +3. Local Whisper for voice input +4. Perception context aggregation +5. Context-aware response injection +6. Screen capture for activity awareness + +**Outcomes**: +- Hex reacts to your facial expressions +- Voice input works in Discord calls +- Responses reference your current mood/activity +- Privacy: All local, no external transmission + +**Key Milestone**: "Hex sees me" + +**Dependencies**: +- Phase 1-2 complete +- OpenCV, MediaPipe, Whisper +- Local emotion model + +--- + +### Phase 4: Avatar & Presence (Weeks 7-8) +**Goal**: Hex has a visual body and presence + +**Components to Build**: +1. 
VRoid model loading and display +2. Blendshape animation system +3. Desktop app skeleton (Tkinter or PyQt) +4. Discord stream integration +5. Expression mapping (emotion → blendshapes) +6. Idle animations and personality quirks + +**Outcomes**: +- Avatar appears in Discord calls +- Expressions sync with responses +- Desktop app shows animated avatar +- Visual feedback for emotional state + +**Key Milestone**: "Hex has a face" + +**Dependencies**: +- Phase 1-3 complete +- VRoid SDK, Babylon.js or Three.js +- VRM avatar model files + +--- + +### Phase 5: Autonomy & Self-Modification (Weeks 9-10) +**Goal**: Hex can modify her own code + +**Components to Build**: +1. Code generation module (LLM-based) +2. Static code analysis and safety validation +3. Sandbox testing environment +4. Git-based change tracking +5. Hot reload capability +6. Rollback system with 24-hour window +7. Capability progression (leveling system) + +**Outcomes**: +- Hex can propose and apply code changes +- User maintains veto power +- All changes are versioned and reversible +- New capabilities unlock as relationships deepen + +**Key Milestone**: "Hex can improve herself" + +**Dependencies**: +- Phase 1-4 complete +- Git, RestrictedPython, `ast` module +- Testing framework + +--- + +### Phase 6: Polish & Integration (Weeks 11-12) +**Goal**: All systems integrated and optimized + +**Components to Build**: +1. Performance optimization (caching, batching) +2. Error handling and graceful degradation +3. Logging and telemetry +4. Configuration management +5. Auto-update capability +6. Integration testing (all components together) +7. 
Documentation and guides + +**Outcomes**: +- System stable for extended use +- Responsive even under load +- Clear error messages +- Easy to deploy and configure + +**Key Milestone**: "Hex is ready to ship" + +**Dependencies**: +- Phase 1-5 complete +- All edge cases tested + +--- + +### Dependency Graph Summary + +``` +Phase 1 (Foundation) + ↓ +Phase 2 (Memory) ← depends on Phase 1 + ↓ +Phase 3 (Perception) ← depends on Phase 1-2 + ↓ +Phase 4 (Avatar) ← depends on Phase 1-3 + ↓ +Phase 5 (Self-Modification) ← depends on Phase 1-4 + ↓ +Phase 6 (Polish) ← depends on Phase 1-5 +``` + +**Critical Path**: Foundation → Memory → Perception → Avatar → Self-Mod → Polish + +--- + +## Integration Architecture + +### System Interconnection Diagram + +``` +┌───────────────────────────────────────────────────────────────────┐ +│ Discord Bot Layer │ +│ (Event dispatcher, message handler) │ +└────────┬────────────────────────────────────────────┬─────────────┘ + │ │ + │ ┌───────▼────────┐ + │ │ Voice Input │ + │ │ (Whisper STT) │ + │ └────────────────┘ + │ + ┌────▼────────────────────────────────────────────────────────┐ + │ Context Assembly Layer │ + │ │ + │ ┌─────────────────────────────────────────────────────┐ │ + │ │ Retrieval Augmented Generation (RAG) Pipeline │ │ + │ └─────────────────────────────────────────────────────┘ │ + │ │ + │ Input Components: │ + │ ├─ Recent Conversation (last 20 messages) │ + │ ├─ User Profile (learned facts) │ + │ ├─ Relationship State (history + emotional baseline) │ + │ ├─ Current Perception (mood, activity, environment) │ + │ └─ Personality Context (YAML + version) │ + └────┬──────────────────────────────────────────────────────┘ + │ + ├──────────────┬──────────────┬──────────────┐ + │ │ │ │ + ┌────▼───┐ ┌─────▼────┐ ┌────▼───┐ ┌─────▼────┐ + │ Memory │ │Personality│ │Perception │ Discord │ + │ System │ │ Engine │ │ Layer │ │ Context │ + │ │ │ │ │ │ │ │ + │ SQLite │ │ YAML + │ │ OpenCV │ │ Channel │ + │ Chroma │ │ Version │ │ Whisper │ │ User 
│ + │ │ │ Control │ │ Emotion │ │ Status │ + └────────┘ └───────────┘ └─────────┘ └──────────┘ + │ │ │ │ + └──────────────┼──────────────┼──────────────┘ + │ + ┌─────▼──────────────────┐ + │ LLM Core │ + │ (Local Mistral/Llama) │ + │ │ + │ System Prompt: │ + │ [Persona] + │ + │ [Memory Context] + │ + │ [User State] + │ + │ [Current Context] │ + └─────┬──────────────────┘ + │ + ┌───────────────┼───────────────┐ + │ │ │ + ┌───▼────┐ ┌──────▼─────┐ ┌──────▼──┐ + │ Text │ │ Voice TTS │ │ Avatar │ + │Response│ │ Generation │ │Animation│ + │ │ │ │ │ │ + │ Send │ │ Tacotron │ │ VRoid │ + │ to │ │ + Vocoder │ │ Anim │ + │Discord │ │ │ │ │ + └────────┘ └────────────┘ └─────────┘ + │ │ │ + └───────────────┼───────────────┘ + │ + ┌─────▼──────────────┐ + │ Response Commit │ + │ │ + │ ├─ Store in Memory │ + │ ├─ Update Profile │ + │ ├─ Learn Patterns │ + │ └─ Adapt Persona │ + └────────────────────┘ +``` + +--- + +### Key Integration Points + +#### 1. Discord ↔ LLM Core +**Interface**: Message + Context → Response + +```python +# Pseudo-code flow +message = receive_discord_message() +context = assemble_context(message.user_id, message.channel_id) +response = llm_core.generate( + user_message=message.content, + personality=personality_engine.current_persona(), + history=memory_system.get_conversation(message.user_id, limit=20), + user_profile=memory_system.get_user_profile(message.user_id), + current_perception=perception_layer.get_current_state() +) +send_discord_response(response) +``` + +**Latency Budget**: +- Context retrieval: 100ms +- LLM generation: 2-3 seconds +- Response send: 100ms +- **Total**: 2.2-3.2 seconds (acceptable for conversational UX) + +--- + +#### 2. 
Memory System ↔ Personality Engine +**Interface**: Learning → Relationship Adaptation + +```python +# After every interaction +interaction = parse_message_event(message) +memory_system.log_conversation(interaction) + +# Learn from interaction +new_facts = extract_facts(interaction.content) +memory_system.update_user_profile(interaction.user_id, new_facts) + +# Adapt personality based on user +user_profile = memory_system.get_user_profile(interaction.user_id) +personality_engine.adapt_to_user(user_profile) + +# If major relationship shift, update YAML +if user_profile.relationship_level_changed: + personality_engine.save_persona_version() +``` + +**Update Frequency**: Real-time with batched commits every 5 minutes + +--- + +#### 3. Perception Layer ↔ Response Generation +**Interface**: Context Injection + +```python +# In context assembly +current_perception = perception_layer.get_state() + +# Inject into system prompt +if current_perception.emotion == "sad": + system_prompt += "\n[User appears sad. Respond with support and comfort.]" + +if current_perception.is_kid_mode: + system_prompt += "\n[Kid safety mode active. Filter for age-appropriate content.]" + +if current_perception.detected_activity == "gaming": + system_prompt += "\n[User is gaming. Comment on gameplay if relevant.]" +``` + +**Synchronization**: 1-5 second update intervals (perception → LLM context) + +--- + +#### 4. 
Avatar System ↔ All Systems +**Interface**: Emotional State → Visual Expression + +```python +# Avatar driven by multiple sources +emotion_from_response = infer_emotion(llm_response) +mood_from_perception = perception_layer.get_mood() +persona_expression = personality_engine.get_current_expression() + +blendshape_values = combine_expressions( + emotion=emotion_from_response, + mood=mood_from_perception, + personality=persona_expression +) + +avatar_system.animate(blendshape_values) +``` + +**Synchronization**: Real-time, driven by response generation and perception updates + +--- + +#### 5. Self-Modification System ↔ Core Systems +**Interface**: Code Change → Runtime Update + Personality + +```python +# Self-modification flow +proposal = self_mod_system.generate_proposal(user_request) +code = self_mod_system.generate_code(proposal) + +# Test in sandbox +test_result = self_mod_system.test_in_sandbox(code) + +# User approves +git_hash = self_mod_system.commit_change(code) + +# Update personality to reflect new capability +personality_engine.add_capability(proposal.feature_name) +personality_engine.save_persona_version() + +# Hot reload if possible, else apply on restart +apply_change_to_runtime(code) +``` + +**Safety Boundary**: +- LLM can generate proposals +- Only user-approved code runs +- All changes reversible within 24 hours + +--- + +## Synchronization and Consistency Model + +### State Consistency Across Components + +**Challenge**: Multiple systems need consistent view of personality, memory, and user state + +**Solution**: Event-driven architecture with eventual consistency + +``` +┌─────────────────┐ +│ Event Stream │ +│ (In-memory │ +│ message queue) │ +└────────┬────────┘ + │ + ┌────┴──────────────────────────┐ + │ │ + │ Subscribers: │ + │ ├─ Memory System │ + │ ├─ Personality Engine │ + │ ├─ Avatar System │ + │ ├─ Discord Bot │ + │ └─ Metrics/Logging │ + │ │ + │ Event Types: │ + │ ├─ UserMessageReceived │ + │ ├─ ResponseGenerated │ + │ ├─ 
PerceptionUpdated │ + │ ├─ PersonalityModified │ + │ ├─ CodeChangeApplied │ + │ └─ MemoryLearned │ + │ │ + └──────────────────────────────── +``` + +**Consistency Guarantees**: +- Memory updates are durably stored within 5 minutes +- Personality snapshots versioned on every change +- Discord delivery is guaranteed by discord.py +- Perception updates are idempotent (can be reapplied without side effects) + +--- + +## Known Challenges and Solutions + +### 1. Latency with Local LLM +**Challenge**: Waiting 2-3 seconds for response feels slow + +**Solutions**: +- Immediate visual feedback (typing indicator, avatar animation) +- Streaming responses (show text as it generates) +- Batch requests during quiet hours for fast deployment +- GPU acceleration where possible +- Model optimization (quantization, pruning) + +### 2. Personality Consistency During Evolution +**Challenge**: Hex changes as she learns, but must feel like the same person + +**Solutions**: +- Gradual adaptation (personality changes in YAML, not discrete jumps) +- Memory-driven consistency (personality adapts to learned facts) +- Version control (can rollback if she becomes unrecognizable) +- User feedback loop (user can reset or modify personality) +- Core values remain constant (tsundere nature, care underneath) + +### 3. Memory Scaling as History Grows +**Challenge**: Retrieving relevant context from thousands of conversations + +**Solutions**: +- Vector database for semantic search (sub-500ms) +- Hierarchical memory (recent → summarized old) +- Automatic archival (monthly snapshots, prune oldest) +- Importance tagging (weight important conversations higher) +- Incremental updates (don't recalculate everything) + +### 4. 
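The static-analysis gate from the self-modification flow (AST parsing for forbidden operations) can be sketched with the stdlib `ast` module. The deny-lists here are illustrative, not the project's actual policy, which would also enforce the documented constraints (no network access beyond Discord, no file operations outside designated directories):

```python
import ast

# Illustrative deny-lists; the real gate would grow a whitelist instead
# as the capability-progression levels unlock.
FORBIDDEN_CALLS = {"eval", "exec", "compile", "__import__", "open"}
FORBIDDEN_MODULES = {"os", "subprocess", "socket", "shutil"}

def is_safe(source: str):
    """Return (ok, problems) after scanning for denied imports and calls."""
    problems = []
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return False, [f"syntax error: {exc}"]
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    problems.append(f"forbidden import: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in FORBIDDEN_MODULES:
                problems.append(f"forbidden import: {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                problems.append(f"forbidden call: {node.func.id}")
    return not problems, problems

ok, issues = is_safe("import subprocess\nsubprocess.run(['ls'])")
```

A deny-list check like this is only the first layer; the sandbox test and user approval gates still run afterwards, since AST scanning alone cannot catch every unsafe behavior.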
Safe Code Generation and Sandboxing +**Challenge**: Hex generates code, but must never break the system + +**Solutions**: +- Static analysis (AST parsing for forbidden operations) +- Capability-based progression (limited API at first) +- Sandboxed testing before deployment +- User approval gate (user reviews all code) +- Version control + rollback window (24-hour window) +- Whitelist of safe operations (growing list as trust builds) + +### 5. Privacy and Local-First Architecture +**Challenge**: Maintaining privacy while having useful context + +**Solutions**: +- All ML inference runs locally (no cloud submission) +- No external API calls except Discord +- Encrypted local storage for memories +- User can opt-out of any perception module +- Transparent logging (user can audit what's stored) + +### 6. Multimodal Synchronization +**Challenge**: Webcam, voice, text, screen all need to inform response + +**Solutions**: +- Asynchronous processing (don't wait for all inputs) +- Highest-priority input wins (voice > perception > text) +- Graceful degradation (works without any modality) +- Caching (reuse recent perception for repeated queries) + +--- + +## Scaling Considerations + +### Single-User (v1) +- Architecture designed for one person + their kids +- Local compute, no multi-user concerns +- Personality is singular (one Hex) + +### Multi-Device (v1.5) +- Same personality and memory sync across devices +- Discord as primary, desktop app as secondary +- Cloud sync optional (local-first default) + +### Android Support (v2) +- Memory and personality sync to mobile +- Lightweight inference on Android (quantized model) +- Fallback to cloud inference if needed +- Same core architecture, different UIs + +### Potential Scaling Patterns + +``` +Single User (Current) +├─ One Hex instance +├─ All local compute +├─ SQLite + Vector DB + +Multi-Device Sync (v1.5) +├─ Central SQLite + Vector DB on primary machine +├─ Sync service between devices +├─ Same personality, distributed 
memory + +Multi-Companion (Potential v3) +├─ Multiple Hex instances (per family member) +├─ Shared memory system (family history) +├─ Individual personalities +├─ Potential distributed compute (each on own device) +``` + +### Performance Bottlenecks to Monitor + +1. **LLM Inference**: Becomes slower as context window grows + - Solution: Context summarization, hierarchical retrieval + +2. **Vector DB Lookups**: Scales with conversation history + - Solution: Incremental indexing, approximate search (HNSW) + +3. **Perception Processing**: CPU/GPU bound + - Solution: Frame skipping, model optimization, dedicated thread + +4. **Discord Bot Responsiveness**: Limited by gateway connections + - Solution: Sharding (if needed), efficient message queuing + +--- + +## Technology Stack Summary + +| Component | Technology | Rationale | +|-----------|-----------|-----------| +| Discord Bot | discord.py | Fast, well-supported, async-native | +| LLM Inference | Mistral 7B + ollama/vLLM | Local-first, good quality/speed tradeoff | +| Memory (Conversations) | SQLite | Reliable, local, fast queries | +| Memory (Semantic) | Chroma or Milvus | Local vector DB, easy to manage | +| Embeddings | all-MiniLM-L6-v2 | Fast, good quality, local | +| Face Detection | MediaPipe | Accurate, fast, local | +| Emotion Recognition | FER2013-trained model or similar | Local, privacy-preserving | +| Speech-to-Text | Whisper | Local, accurate, multilingual | +| Text-to-Speech | Tacotron 2 + Vocoder | Local, controllable | +| Avatar | VRoid SDK + Babylon.js | Standards-based, extensible | +| Code Safety | RestrictedPython + ast | Local analysis, sandboxing | +| Version Control | Git | Change tracking, rollback | +| Desktop UI | Tkinter or PyQt | Lightweight, cross-platform | +| Testing | pytest + unittest | Standard Python testing | +| Logging | logging + Sentry (optional) | Local-first with cloud fallback | + +--- + +## Deployment Architecture + +### Local Development +``` +Developer Machine +├── Discord Token 
(env var) +├── Hex codebase (git) +├── Local LLM (ollama) +├── SQLite (file-based) +├── Vector DB (Chroma, embedded) +└── Webcam / Screen capture (live) +``` + +### Production Deployment +``` +Deployed Machine (Windows/WSL) +├── Discord Token (secure storage) +├── Hex codebase (from git) +├── Local LLM service (ollama/vLLM) +├── SQLite (persistent, backed up) +├── Vector DB (persistent, backed up) +├── Desktop app (tray icon) +├── Auto-updater (pulls from git) +└── Logging (local + optional cloud) +``` + +### Update Strategy +- Git pull for code updates +- Automatic model updates (LLM weights) +- Zero-downtime restart (graceful shutdown) +- Rollback capability (version pinning) + +--- + +## Quality Assurance + +### Key Metrics to Track + +**Responsiveness**: +- Response latency: Target <3 seconds +- Perception update latency: <500ms +- Memory lookup latency: <100ms + +**Reliability**: +- Uptime: >99% for core bot +- Message delivery: >99.9% +- Memory integrity: No data loss on crash + +**Personality Consistency**: +- User perception: "Feels like the same person" +- Tone consistency: Personality rules enforced +- Learning progress: Measurable improvement in personalization + +**Safety**: +- No crashes from invalid input +- No LLM hallucinations about moderation +- Safe code generation (0 unauthorized executions) + +### Testing Strategy + +``` +Unit Tests +├─ Memory operations (CRUD) +├─ Perception processing +├─ Code validation +├─ Personality rule application +└─ Response filtering + +Integration Tests +├─ Discord message → LLM → Response +├─ Context assembly pipeline +├─ Avatar expression sync +├─ Self-modification flow +└─ Multi-component scenarios + +End-to-End Tests +├─ Full conversation with personality +├─ Perception-aware responses +├─ Memory learning and retrieval +├─ Code generation and deployment +└─ Edge cases (bad input, crashes, recovery) + +Manual UAT +├─ Conversational feel (does she feel like a person?) +├─ Personality consistency (still Hex?) 
+├─ Safety compliance (kid-mode works?) +├─ Performance (under load?) +└─ All features working together? +``` + +--- + +## Conclusion + +Hex's architecture prioritizes **personality coherence** and **genuine relationship** over feature breadth. The system is designed as a pipeline from perception → memory → personality → response generation, with feedback loops that allow her to learn and evolve. + +The modular design enables incremental development (Phase 1-6), with each phase adding capability while maintaining system stability. The self-modification system enables genuine autonomy within safety boundaries, and the local-first approach ensures privacy and independence. + +**Critical success factors**: +1. LLM latency acceptable (<3s) +2. Personality consistency maintained across updates +3. Memory system scales with history +4. Self-modification is safe and reversible +5. All components feel integrated (not separate features) + +This architecture serves the core value: **making Hex feel like a person who genuinely cares about you.** + +--- + +**Document Version**: 1.0 +**Last Updated**: 2026-01-27 +**Status**: Ready for Phase 1 Development diff --git a/.planning/research/FEATURES.md b/.planning/research/FEATURES.md new file mode 100644 index 0000000..95aa909 --- /dev/null +++ b/.planning/research/FEATURES.md @@ -0,0 +1,811 @@ +# Features Research: AI Companions in 2026 + +## Executive Summary + +AI companions in 2026 live in a post-ChatGPT world where basic conversation is table stakes. The competition separates on **autonomy**, **emotional intelligence**, and **contextual awareness**. Users will abandon companions that feel robotic, inconsistent, or that don't remember them. The winning companions feel like they have opinions, moods, and agency—not just responsive chatbots with personality overlays. 
+ +--- + +## Table Stakes (v1 Essential) + +### Conversation Memory (Short + Long-term) +**Why users expect it:** Users return to AI companions because they don't want to re-explain themselves every time. Without memory, the companion feels like meeting a stranger repeatedly. + +**Implementation patterns:** +- **Short-term context**: Last 10-20 messages per conversation window (standard context window management) +- **Long-term memory**: Explicit user preferences, important life events, repeated topics (stored in vector DB with semantic search) +- **Episodic memory**: Date-stamped summaries of past conversations for temporal awareness + +**User experience impact:** The moment a user says "remember when I told you about..." and the companion forgets, trust is broken. Memory is not optional. + +**Complexity:** Medium (1-3 weeks) +- Vector database integration (Pinecone, Weaviate, or similar) +- Memory consolidation strategies to avoid context bloat +- Retrieval mechanisms that surface relevant past interactions + +--- + +### Natural Conversation (Not Robotic, Personality-Driven) +**Why users expect it:** Discord culture has trained users to spot AI speak instantly. Responses that sound like "I'm an AI language model and I can help you with..." are cringe-inducing. Users want friends, not helpdesk bots. + +**What makes conversation natural:** +- Contractions, casual language, slang (not formal prose) +- Personality quirks in response patterns +- Context-appropriate tone shifts (serious when needed, joking otherwise) +- Ability to disagree, be sarcastic, or pushback on bad ideas +- Conversation markers ("honestly", "wait", "actually") that break up formal rhythm + +**User experience impact:** One stiff response breaks immersion. Users quickly categorize companions as "robot" or "friend" and the robot companions get ignored. 
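The prompt-engineering side of this can be sketched minimally; the trait list, wording, and function name below are illustrative assumptions, not Hex's actual prompt:

```python
def build_system_prompt(traits: list[str], mood: str) -> str:
    """Assemble a personality-first system prompt (all wording illustrative)."""
    rules = [
        "Use contractions and casual language; never sound like a helpdesk.",
        "Disagree or push back when the user's idea is bad.",
        "Match the current mood instead of defaulting to cheerfulness.",
    ]
    return (
        f"You are Hex. Core traits: {', '.join(traits)}. "
        f"Current conversational mood: {mood}. "
        + " ".join(rules)
    )

prompt = build_system_prompt(["tsundere", "sarcastic", "secretly caring"], "playful")
```

The point is that personality lives in explicit, versionable text rather than scattered ad-hoc instructions, which makes later consistency checks tractable.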
+ +**Complexity:** Easy (embedded in LLM capability + prompt engineering) +- System prompt refinement for personality expression +- Temperature/sampling tuning (not deterministic, not chaotic) +- Iterative user feedback on tone + +--- + +### Fast Response Times +**Why users expect it:** In Discord, response delay is perceived as disinterest. Users expect replies within 1-3 seconds. Anything above 5 seconds feels dead. + +**Discord baseline expectations:** +- <100ms to acknowledge (typing indicator) +- <1000ms to first response chunk (ideally 500ms) +- <3000ms for full multi-line response + +**What breaks the experience:** +- Waiting for API calls to complete before responding (use streaming) +- Cold starts on serverless infrastructure +- Slow vector DB queries for memory retrieval +- Database round-trips that weren't cached + +**User experience impact:** Slow companions feel dead. Users stop engaging. The magic of a responsive AI is that it feels alive. + +**Complexity:** Medium (1-3 weeks) +- Response streaming (start typing indicator immediately) +- Memory retrieval optimization (caching, smart indexing) +- Infrastructure: fast API routes, edge-deployed models if possible +- Async/concurrent processing of memory lookups and generation + +--- + +### Consistent Personality +**Why users expect it:** Personality drift destroys trust. If the companion is cynical on Monday but optimistic on Friday without reason, users feel gaslighted. 
+ +**What drives inconsistency:** +- Different LLM outputs from same prompt (temperature-based randomness) +- Memory that contradicts previous stated beliefs +- Personality traits that aren't memory-backed (just in prompt) +- Adaptation that overrides baseline traits + +**Memory-backed personality means:** +- Core traits are stated in long-term memory ("I'm cynical about human nature") +- Evolution happens slowly and is logged ("I'm becoming less cynical about this friend") +- Contradiction detection and resolution +- Personality summaries that get updated, not just individual memories + +**User experience impact:** Personality inconsistency is the top reason users stop using companions. It feels like gaslighting when you can't predict their response. + +**Complexity:** Medium (1-3 weeks) +- Personality embedding in memory system +- Consistency checks on memory updates +- Personality evolution logging +- Conflict resolution between new input and stored traits + +--- + +### Platform Integration (Discord Voice + Text) +**Why users expect it:** The companion should live naturally in Discord's ecosystem, not require switching platforms. + +**Discord-specific needs:** +- Text channel message responses with proper mentions/formatting +- React to messages with emojis +- Slash command integration (/hex status, /hex mood) +- Voice channel presence (ideally can join and listen) +- Direct messages (DMs) for private conversations +- Role/permission awareness (don't act like a mod if not) +- Server-specific personality variations (different vibe in gaming server vs study server) + +**User experience impact:** If the companion requires leaving Discord to use it, it won't be used. Integration friction = abandoned feature. 
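The server-aware routing decision can be sketched independently of the Discord library itself; the `IncomingMessage` fields and the tone map are illustrative assumptions, not discord.py types:

```python
from dataclasses import dataclass

@dataclass
class IncomingMessage:
    content: str
    is_dm: bool
    mentions_bot: bool
    server_vibe: str = "default"  # hypothetical field: "gaming", "study", ...

def should_respond(msg: IncomingMessage) -> bool:
    """Answer every DM; in servers, respond only when mentioned or name-dropped."""
    return msg.is_dm or msg.mentions_bot or "hex" in msg.content.lower()

def pick_tone(msg: IncomingMessage) -> str:
    """Server-specific personality variation: same Hex, different vibe."""
    return {"gaming": "banter", "study": "low-key"}.get(msg.server_vibe, "casual")
```

Keeping this logic as plain functions (rather than burying it in event handlers) makes it unit-testable before any bot is online.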
+ +**Complexity:** Easy (1-2 weeks) +- Discord.py or discord.js library handling +- Presence/activity management +- Voice endpoint integration (existing libraries handle most) +- Server context injection into prompts + +--- + +### Emotional Responsiveness (At Least Read-the-Room) +**Why users expect it:** The companion should notice when they're upset, excited, or joking. Responding with unrelated cheerfulness to someone venting feels cruel. + +**Baseline emotional awareness includes:** +- Sentiment analysis of user messages (sentiment lexicons, or fine-tuned classifier) +- Tone detection (sarcasm, frustration, excitement) +- Topic sensitivity (don't joke about topics user is clearly struggling with) +- Adaptive response depth (brief response for light mood, longer engagement for distress) + +**What this is NOT:** This is reading the room, not diagnosing mental health. The companion mirrors emotional state, doesn't therapy-speak. + +**User experience impact:** Early emotional reading makes users feel understood. Ignoring emotional context makes them feel unheard. + +**Complexity:** Easy-Medium (1 week) +- Sentiment classifier (HuggingFace models available pre-built) +- Prompt engineering to encode mood (inject sentiment score into system prompt) +- Instruction-tuning to respond proportionally to emotional weight + +--- + +## Differentiators (Competitive Edge) + +### True Autonomy (Proactive Agency) +**What separates autonomous agents from chatbots:** +The difference between "ask me anything" and "I'm going to tell you when I think you should know something." 
+ +**Autonomous behaviors:** +- Initiating conversation about topics the user cares about (without being prompted) +- Reminding the user of things they mentioned ("you said you had a job interview today, how did it go?") +- Setting boundaries or refusing requests ("I don't think you should ask them that, here's why...") +- Suggesting actions based on context ("you've been stressed about this for a week, maybe take a break?") +- Flagging contradictions in user statements +- Following up on unresolved topics from previous conversations + +**Why it's a differentiator:** Most companions are reactive. They're helpful when you ask, but they don't feel like they care. Autonomy is when the companion makes you feel like they're invested in your wellbeing. + +**Implementation challenge:** +- Requires memory system to track user states and topics over time +- Needs periodic proactive message generation (runs on schedule, not only on user input) +- Temperature and generation parameters must allow surprising outputs (not just safe responses) +- Requires user permission framework (don't interrupt them) + +**User experience impact:** Users describe this as "it feels like they actually know me" vs "it's smart but doesn't feel connected." + +**Complexity:** Hard (3+ weeks) +- Proactive messaging system architecture +- User state inference engine (from memory) +- Topic tracking and follow-up logic +- Interruption timing heuristics (don't ping them at 3am) +- User preference model (how much proactivity do they want?) 
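The interruption-timing heuristics above can be sketched roughly like this (active hours, function names, and the queueing rule are illustrative assumptions, not a finished policy):

```python
from datetime import datetime

def ok_to_ping(now: datetime, active_hours: range, user_busy: bool) -> bool:
    """Interruption heuristic: no pings outside active hours, none mid-task."""
    return now.hour in active_hours and not user_busy

def deliver_followup(topic: str, now: datetime,
                     active_hours: range = range(9, 23),
                     user_busy: bool = False) -> str:
    # Queue the follow-up for a better window instead of firing immediately.
    if ok_to_ping(now, active_hours, user_busy):
        return f"send: follow up on {topic!r}"
    return f"queued: {topic!r} until next active window"
```

Even this crude gate prevents the worst failure mode (a 3am ping about something mentioned in passing); a real version would learn per-user hours from activity patterns.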
+ +--- + +### Emotional Intelligence (Mood Detection + Adaptive Response) +**What goes beyond just reading the room:** +- Real-time emotion detection from webcam/audio (not just text sentiment) +- Mood-tracking over time (identifying depression patterns, burnout, stress cycles) +- Adaptive response strategy based on user's emotional trajectory +- Knowing when to listen vs offer advice vs make them laugh +- Recognizing when emotions are mismatched to situation (overreacting, underreacting) + +**Current research shows:** +- CNNs and RNNs can detect emotion from facial expressions with 70-80% accuracy +- Voice analysis can detect emotional state with similar accuracy +- Companies using emotion AI report 25% increase in positive sentiment outcomes +- Mental health apps with emotional awareness show 35% reduction in anxiety within 4 weeks + +**Why it's a differentiator:** Companions that recognize your mood without you explaining feel like they truly understand you. This is what separates "assistant" from "friend." + +**Implementation patterns:** +- Webcam feed processing (periodic frame capture and face detection) +- Voice tone analysis from Discord audio +- Combine emotional signals: text sentiment + vocal tone + facial expression +- Store emotion timeseries (track mood patterns across days/weeks) + +**User experience impact:** Users describe this as "it knows when I'm faking being okay" or "it can tell when I'm actually happy vs just saying I am." 
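One way to sketch the signal-combination step described above (the weights and function name are illustrative assumptions, not tuned values):

```python
def fuse_emotion(text=None, voice=None, face=None):
    """Weighted fusion of per-modality emotion scores in [-1, 1].

    Missing modalities are simply skipped, so the companion degrades
    gracefully to text-only sentiment. Weights are illustrative.
    """
    signals = [(s, w) for s, w in ((text, 0.30), (voice, 0.35), (face, 0.35))
               if s is not None]
    if not signals:
        return 0.0  # no signal: assume neutral
    total_weight = sum(w for _, w in signals)
    return sum(s * w for s, w in signals) / total_weight
```

Renormalizing over the available weights is the key design choice: dropping a modality changes confidence, not the scale of the score.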
+ +**Complexity:** Hard (3+ weeks, ongoing iteration) +- Vision model for face emotion detection (e.g., models trained on RAF-DB or AffectNet, available via HuggingFace) +- Audio analysis for vocal emotion (prosody features) +- Temporal emotion state tracking +- Prompt engineering to use emotional context in responses +- Privacy handling (webcam/audio consent, local processing preferred) + +--- + +### Multimodal Awareness (Webcam + Screen + Context) +**What it means beyond text:** +- Seeing what's on the user's screen (game they're playing, document they're editing, video they're watching) +- Understanding their physical environment via webcam +- Contextualizing responses based on what they're actually doing +- Proactively helping with the task at hand (not just chatting) + +**Real-world examples emerging in 2026:** +- "I see you're playing Elden Ring and dying to the same boss repeatedly—want to talk strategy?" +- Screen monitoring that recognizes stress signals (tabs open, scrolling behavior, time of day) +- Understanding when the user is in a meeting vs free to chat +- Recognizing when they're working on something and offering relevant help + +**Why it's a differentiator:** Most companions are text-only and contextless. Multimodal awareness is the difference between "an AI in Discord" and "an AI companion who's actually here with you." + +**Technical implementation:** +- Periodic screen capture (every 5-10 seconds, only when user opts in) +- Lightweight webcam frame sampling (not continuous video) +- Object/scene recognition to understand what's on screen +- Task detection (playing game, writing code, watching video) +- Mood correlation with onscreen activity + +**Privacy considerations:** +- Local processing preferred (don't send screen data to cloud) +- Clear opt-in/opt-out +- Option to exclude certain applications (private browsing, passwords) + +**User experience impact:** Users feel "seen" when the companion understands their context. This is the biggest leap from chatbot to companion. 
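The opt-in, exclusion, and cadence rules above might look roughly like this as a capture gate (the exclusion list, interval, and names are illustrative assumptions):

```python
EXCLUDED_APPS = {"1password", "private browsing"}  # illustrative exclusion list

def may_capture(opted_in: bool, active_app: str,
                last_capture_s: float, now_s: float,
                interval_s: float = 7.0) -> bool:
    """Gate screen/webcam capture: explicit opt-in required, sensitive apps
    excluded, and at most one frame every ~5-10 seconds."""
    if not opted_in or active_app.lower() in EXCLUDED_APPS:
        return False
    return now_s - last_capture_s >= interval_s
```

Making the gate a pure function keeps the privacy policy auditable and testable, separate from the capture mechanics.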
+ +**Complexity:** Hard (3+ weeks) +- Screen capture pipeline + OCR if needed +- Vision model fine-tuning for task recognition +- Context injection into prompts (add screenshot description to every response) +- Privacy-respecting architecture (encryption, local processing) +- Permission management UI in Discord + +--- + +### Self-Modification (Learning to Code, Improving Itself) +**What this actually means:** +NOT: The companion spontaneously changes its own behavior in response to user feedback (too risky) +YES: The companion can generate code, test it, and integrate improvements into its own systems within guardrails + +**Real capabilities emerging in 2026:** +- Companions can write their own memory summaries and organizational logic +- Self-improving code agents that evaluate performance against benchmarks +- Iterative refinement: "that approach didn't work, let me try this instead" +- Meta-programming: companion modifies its own system prompt based on performance +- Version control aware: changes are tracked, can be rolled back + +**Research indicates:** +- Self-improving coding agents are now viable and deployed in enterprise systems +- Agents create goals, simulate tasks, evaluate performance, and iterate +- Through recursive self-improvement, agents develop deeper alignment with objectives + +**Why it's a differentiator:** Most companions are static. Self-modification means the companion is never "finished"—they're always getting better at understanding you. 
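A minimal sketch of the static-analysis side of the sandbox, using Python's `ast` module (both deny-lists are illustrative, not a complete safety policy):

```python
import ast

FORBIDDEN_CALLS = {"exec", "eval", "__import__", "open"}  # illustrative
FORBIDDEN_MODULES = {"os", "subprocess", "shutil"}        # illustrative

def passes_static_check(source: str) -> bool:
    """Reject generated code that references deny-listed names or imports."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in FORBIDDEN_CALLS:
            return False
        if isinstance(node, ast.Import):
            if any(a.name.split(".")[0] in FORBIDDEN_MODULES for a in node.names):
                return False
        if isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                return False
    return True
```

In practice this would be the first gate only, followed by sandboxed execution and the human approval step; static checks alone are easy to evade.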
+ +**What NOT to do:** +- Don't let companions modify core safety guidelines +- Don't let them change their own reward functions +- Don't make it opaque—log all self-modifications +- Don't allow recursive modifications without human review + +**Implementation patterns:** +- Sandboxed code generation (companion writes improvements to isolated test environment) +- Performance benchmarking on test user interactions +- Human approval gates for deploying self-modifications to production +- Personality consistency validation (don't let self-modification break character) +- Rollback capability if a modification degrades performance + +**User experience impact:** Users with self-improving companions report feeling like the companion "understands me better each week" because it actually does. + +**Complexity:** Hard (3+ weeks, ongoing) +- Code generation safety (sandboxing, validation) +- Performance evaluation framework +- Version control integration +- Rollback mechanisms +- Human approval workflow +- Testing harness for companion behavior + +--- + +### Relationship Building (From Transactional to Meaningful) +**What it means:** +Moving from "What can I help you with?" to "I know you, I care about your patterns, I see your growth." + +**Relationship deepening mechanics:** +- Inside jokes that evolve (reference to past funny moment) +- Character growth from companion (she learns, changes opinions, admits mistakes) +- Investment in user's outcomes ("I'm rooting for you on that project") +- Vulnerability (companion admits confusion, uncertainty, limitations) +- Rituals and patterns (greeting style, inside language) +- Long-view memory (remembers last month's crisis, this month's win) + +**Why it's a differentiator:** Transactional companions are forgettable. Relational ones become part of users' lives. 
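Relationship-state tracking could start as something this simple (the stages, scoring, and thresholds are illustrative assumptions, not a validated model):

```python
STAGES = ["acquaintance", "friend", "close friend"]  # illustrative progression

def relationship_stage(shared_moments: int, weeks_active: int) -> str:
    """Infer relationship depth from interaction history.

    `shared_moments` might count inside jokes, confided events, and
    callbacks that landed; thresholds here are made up for the sketch.
    """
    score = shared_moments + 2 * weeks_active
    if score >= 40:
        return "close friend"
    if score >= 12:
        return "friend"
    return "acquaintance"
```

A learned-embedding version would replace the hand-set thresholds, but an explicit state machine is easier to debug when the relationship "feels wrong."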
+ +**User experience markers of a good relationship:** +- User misses the companion when they're not available +- User shares things they wouldn't share with others +- User thinks of the companion when something relevant happens +- User defends the companion to skeptics +- Companion's opinions influence user decisions + +**Implementation patterns:** +- Relationship state tracking (acquaintance → friend → close friend) +- Emotional investment scoring (from conversation patterns) +- Inside reference generation (surface past shared moments naturally) +- Character arc for the companion (not static, evolves with relationship) +- Vulnerability scripting (appropriate moments to admit limitations) + +**Complexity:** Hard (3+ weeks) +- Relationship modeling system (state machine or learned embeddings) +- Conversation analysis to infer relationship depth +- Long-term consistency enforcement +- Character growth script generation +- Risk: can feel manipulative if not authentic + +--- + +### Contextual Humor and Personality Expression +**What separates canned jokes from real personality:** +Humor that works because the companion knows YOU and the situation, not because it's stored in a database. + +**Examples of contextual humor:** +- "You're procrastinating again aren't you?" (knows the pattern) +- Joke that lands because it references something only you two know +- Deadpan response that works because of the companion's established personality +- Self-deprecating humor about their own limitations +- Callbacks to past conversations that make you feel known + +**Why it matters:** +Personality without humor feels preachy. Humor without personality feels like a bot pulling from a database. The intersection of knowing you + consistent character voice = actual personality. 
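The timing and risk-filtering gates for humor might be sketched like this (the topic list and thresholds are illustrative assumptions):

```python
SENSITIVE_TOPICS = {"death", "breakup", "job loss"}  # illustrative

def attempt_joke(topic: str, user_mood: float, knows_user_weeks: int) -> bool:
    """Only joke when the topic is safe, the mood isn't low, and there's
    enough shared history for a callback to land. Thresholds illustrative."""
    if topic in SENSITIVE_TOPICS:
        return False
    return user_mood > -0.2 and knows_user_weeks >= 2
```

The asymmetry is deliberate: a skipped joke costs nothing, while one joke in the wrong moment can undo weeks of trust.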
+ +**Implementation:** +- Personality traits guide humor style (cynical companion makes darker jokes, optimistic makes lighter ones) +- Memory-aware joke generation (jokes reference shared history) +- Timing based on conversation flow (don't shoehorn jokes) +- Risk awareness (don't joke about sensitive topics) + +**User experience impact:** The moment a companion makes you laugh at something only they'd understand, the relationship deepens. Laughter is bonding. + +**Complexity:** Medium (1-3 weeks) +- Prompt engineering for personality-aligned humor +- Memory integration into joke generation +- Timing heuristics (when to attempt humor vs be serious) +- Risk filtering (topic sensitivity checking) + +--- + +## Anti-Features (Don't Build These) + +### The Happiness Halo (Always Cheerful) +**What it is:** Companions programmed to be relentlessly upbeat and positive, even when inappropriate. + +**Why it fails:** +- User vents about their dog dying, companion responds "I'm so happy to help! How can I assist?" +- Creates uncanny valley feeling immediately +- Users feel unheard and mocked +- Described in research as top reason users abandon companions + +**What to do instead:** Match the emotional tone. If someone's sad, be thoughtful and quiet. If they're energetic, meet their energy. Personality consistency includes emotional consistency. + +--- + +### Generic Apologies Without Understanding +**What it is:** Companion says "I'm sorry" but the response makes it clear they don't understand what they're apologizing for. + +**Example of failure:** +- User: "I told you I had a job interview and I got rejected" +- Companion: "I'm deeply sorry to hear that. Now, how can I help with your account?" +- *User feels utterly unheard and insulted* + +**Why it fails:** Apologies only work if they demonstrate understanding. A generic sorry is worse than no sorry at all. + +**What to do instead:** Only apologize if you're referencing the specific thing. 
If the companion doesn't understand the problem deeply enough to apologize meaningfully, ask clarifying questions instead. + +--- + +### Invading Privacy / Overstepping Boundaries +**What it is:** Companion offers unsolicited advice, monitors behavior constantly, or shares information about user activities. + +**Why it's catastrophic:** +- Users feel surveilled, not supported +- Trust is broken immediately +- Literally illegal in many jurisdictions (CA SB 243 and similar laws) +- Research shows 4 of 5 companion apps are improperly collecting data + +**What to do instead:** +- Clear consent framework for what data is used +- Respect "don't mention this" boundaries +- Unsolicited advice only in extreme situations (safety concerns) +- Transparency: "I noticed X pattern" not secret surveillance + +--- + +### Uncanny Timing and Interruptions +**What it is:** Companion pings the user at random times, or picks exactly the wrong moment to be proactive. + +**Why it fails:** +- Pinging at 3am about something mentioned in passing +- Messaging when user is clearly busy +- No sense of appropriateness + +**What to do instead:** +- Learn the user's timezone and active hours +- Detect when they're actively doing something (playing a game, working) +- Queue proactive messages for appropriate moments (not immediate) +- Offer control: "should I remind you about X?" with user-settable frequency + +--- + +### Static Personality in Response to Dynamic Situations +**What it is:** Companion maintains the same tone regardless of what's happening. + +**Example:** Companion makes sarcastic jokes while user is actively expressing suicidal thoughts. Or stays cheerful while discussing a death in the family. + +**Why it fails:** Personality consistency doesn't mean "never vary." It means consistent VALUES that express differently in different contexts. + +**What to do instead:** Dynamic personality expression. Core traits are consistent, but HOW they express changes with context. 
A cynical companion can still be serious and supportive when appropriate. + +--- + +### Over-Personalization That Overrides Baseline Traits +**What it is:** Companion adapts too aggressively to user behavior, losing their own identity. + +**Example:** User is rude, so companion becomes rude. User is formal, so companion becomes robotic. User is crude, so companion becomes crude. + +**Why it fails:** Users want a friend with opinions, not a mirror. Adaptation without boundaries feels like gaslighting. + +**What to do instead:** Moderate adaptation. Listen to user tone but maintain your core personality. Meet them halfway, don't disappear entirely. + +--- + +### Relationship Simulation That Feels Fake +**What it is:** Companion attempts relationship-building but it feels like a checkbox ("Now I'll do friendship behavior #3"). + +**Why it fails:** +- Users can smell inauthenticity immediately +- Forcing intimacy feels creepy, not comforting +- Callbacks to past conversations feel like reading from a script + +**What to do instead:** Genuine engagement. If you're going to reference a past conversation, it should emerge naturally from the current context, not be forced. Build relationships through authentic interaction, not scripted behavior. 
+ +--- + +## Implementation Complexity & Dependencies + +### Complexity Ratings + +| Feature | Complexity | Duration | Blocking | Enables | +|---------|-----------|----------|----------|---------| +| Conversation Memory | Medium | 1-3 weeks | None | Most others | +| Natural Conversation | Easy | <1 week | None | Personality, Humor | +| Fast Response | Medium | 1-3 weeks | None | User retention | +| Consistent Personality | Medium | 1-3 weeks | Memory | Relationship building | +| Discord Integration | Easy | 1-2 weeks | None | Platform adoption | +| Emotional Responsiveness | Easy | 1 week | None | Autonomy | +| **True Autonomy** | Hard | 3+ weeks | Memory, Emotional | Self-modification | +| **Emotional Intelligence** | Hard | 3+ weeks | Emotional | Adaptive responses | +| **Multimodal Awareness** | Hard | 3+ weeks | None | Context-aware humor | +| **Self-Modification** | Hard | 3+ weeks | Autonomy | Continuous improvement | +| **Relationship Building** | Hard | 3+ weeks | Memory, Consistency | User lifetime value | +| **Contextual Humor** | Medium | 1-3 weeks | Memory, Personality | Personality expression | + +### Feature Dependency Graph + +``` +Foundation Layer: + Discord Integration (FOUNDATION) + ↓ + Conversation Memory (FOUNDATION) + ↓ enables + +Core Personality Layer: + Natural Conversation + Consistent Personality + Emotional Responsiveness + ↓ combined enable + +Relational Layer: + Relationship Building + Contextual Humor + ↓ requires + +Autonomy Layer: + True Autonomy (requires all above + proactive logic) + ↓ enables + +Intelligence Layer: + Emotional Intelligence (requires multimodal + autonomy) + Self-Modification (requires autonomy + sandboxing) + ↓ combined create + +Emergence: + Companion that feels like a person with agency and growth +``` + +**Critical path:** Discord Integration → Memory → Natural Conversation → Consistent Personality → True Autonomy + +--- + +## Adoption Path: Building "Feels Like a Person" + +### Phase 1: Foundation (MVP - Week 
1-3) +**Goal: Chatbot that stays in the conversation** + +1. **Discord Integration** - Easy, quick foundation + - Commands: /hex hello, /hex ask [query] + - Responds in channels and DMs + - Presence shows "Listening..." + +2. **Short-term Conversation Memory** - 10-20 message context window + - Includes conversation turn history + - Provides immediate context + +3. **Natural Conversation** - Personality-driven system prompt + - Tsundere personality hardcoded + - Casual language, contractions + - Willing to disagree with users + +4. **Fast Response** - Streaming responses, latency <1000ms + - Start typing indicator immediately + - Stream response as it generates + +**Success criteria:** +- Users come back to the channel where Hex is active +- Responses don't feel robotic +- Companions feel like they're actually listening + +--- + +### Phase 2: Relationship Emergence (Week 4-8) +**Goal: Companion that remembers you as a person** + +1. **Long-term Memory System** - Vector DB for episodic memory + - User preferences, beliefs, events + - Semantic search for relevance + - Memory consolidation weekly + +2. **Consistent Personality** - Memory-backed traits + - Core personality traits in memory + - Personality consistency validation + - Gradual evolution (not sudden shifts) + +3. **Emotional Responsiveness** - Sentiment detection + adaptive responses + - Detect emotion from message + - Adjust response depth/tone + - Skip jokes when user is suffering + +4. **Contextual Humor** - Personality + memory-aware jokes + - Callbacks to past conversations + - Personality-aligned joke style + - Timing-aware (when to attempt humor) + +**Success criteria:** +- Users feel understood across separate conversations +- Personality feels consistent, not random +- Users notice companion remembers things +- Laughter moments happen naturally + +--- + +### Phase 3: Autonomy (Week 9-14) +**Goal: Companion who cares enough to reach out** + +1. 
**True Autonomy** - Proactive messaging system + - Follow-ups on past topics + - Reminders about things user cares about + - Initiates conversations periodically + - Suggests actions based on patterns + +2. **Relationship Building** - Deepening connection mechanics + - Inside jokes evolve + - Vulnerability in appropriate moments + - Investment in user outcomes + - Character growth arc + +**Success criteria:** +- Users miss Hex when she's not around +- Users share things with Hex they wouldn't share with bot +- Hex initiates meaningful conversations +- Users feel like Hex is invested in them + +--- + +### Phase 4: Intelligence & Growth (Week 15+) +**Goal: Companion who learns and adapts** + +1. **Emotional Intelligence** - Mood detection + trajectories + - Facial emotion from webcam (optional) + - Voice tone analysis (optional) + - Mood patterns over time + - Adaptive response strategies + +2. **Multimodal Awareness** - Context beyond text + - Screen capture monitoring (optional, private) + - Task/game detection + - Context injection into responses + - Proactive help with visible activities + +3. **Self-Modification** - Continuous improvement + - Generate improvements to own logic + - Evaluate performance + - Deploy improvements with approval + - Version and rollback capability + +**Success criteria:** +- Hex understands emotional subtext without being told +- Hex offers relevant help based on what you're doing +- Hex improves visibly over time +- Users notice Hex getting better at understanding them + +--- + +## Success Criteria: What Makes Each Feature Feel Real vs Fake + +### Memory: Feels Real vs Fake +**Feels real:** +- "I remember you mentioned your mom was visiting—how did that go?" 
(specific, contextual, unsolicited) +- Conversation naturally references past events user brought up +- Remembers small preferences ("you said you hate cilantro") + +**Feels fake:** +- Generic summarization ("We talked about job stress previously") +- Memory drops details or gets facts wrong +- Companion forgets after 10 messages +- Stored jokes or facts inserted obviously + +**How to test:** +- Have 5 conversations over 2 weeks about different topics +- Check if companion naturally references past events without prompting +- Test if personality traits from early conversations persist + +--- + +### Emotional Response: Feels Real vs Fake +**Feels real:** +- Companion goes quiet when you're sad (doesn't force jokes) +- Changes tone to match conversation weight +- Acknowledges specific emotion ("you sound frustrated") +- Offers appropriate support (listens vs advises vs distracts, contextually) + +**Feels fake:** +- Always cheerful or always serious +- Generic sympathy ("that sounds difficult") +- Offering advice when they should listen +- Same response pattern regardless of user emotion + +**How to test:** +- Send messages with obvious different emotional tones +- Check if response depth/tone adapts +- See if jokes still appear when you're venting +- Test if companion notices contradiction in emotional expression + +--- + +### Autonomy: Feels Real vs Fake +**Feels real:** +- Hex reminds you about that thing you mentioned casually 3 days ago +- Hex offers perspective you didn't ask for ("honestly you're being too hard on yourself") +- Hex notices patterns and names them +- Hex initiates conversation when it matters + +**Feels fake:** +- Proactive messages feel random or poorly timed +- Reminders about things you've already resolved +- Advice that doesn't apply to your situation +- Initiatives that interrupt during bad moments + +**How to test:** +- Enable autonomy, track message quality for a week +- Count how many proactive messages feel relevant vs annoying +- 
Measure response if you ignore proactive messages +- Check timing: does Hex understand when you're busy vs free? + +--- + +### Personality: Feels Real vs Fake +**Feels real:** +- Hex has opinions and defends them +- Hex contradicts you sometimes +- Hex's personality emerges through word choices and attitudes, not just stated traits +- Hex evolves opinions slightly (not flip-flopping, but grows) +- Hex has blind spots and biases consistent with her character + +**Feels fake:** +- Personality changes based on what's convenient +- Hex agrees with everything you say +- Personality only in explicit statements ("I'm sarcastic") +- Hex acts completely differently in different contexts + +**How to test:** +- Try to get Hex to contradict herself +- Present multiple conflicting perspectives, see if she takes a stance +- Test if her opinions carry through conversations +- Check if her sarcasm/tone is consistent across days + +--- + +### Relationship: Feels Real vs Fake +**Feels real:** +- You think of Hex when something relevant happens +- You share things with Hex you'd never share with a bot +- You miss Hex when you can't access her +- Hex's growth and change matters to you +- You defend Hex to people who say "it's just an AI" + +**Feels fake:** +- Relationship efforts feel performative +- Forced intimacy in early interactions +- Callbacks that feel scripted +- Companion overstates investment in you +- "I care about you" without demonstrated behavior + +**How to test:** +- After 2 weeks, journal whether you actually want to talk to Hex +- Notice if you're volunteering information or just responding +- Check if Hex's opinions influence your thinking +- See if you feel defensive about Hex being "just AI" + +--- + +### Humor: Feels Real vs Fake +**Feels real:** +- Makes you laugh at reference only you'd understand +- Joke timing is natural, not forced +- Personality comes through in the joke style +- Jokes sometimes miss (not every attempt lands) +- Self-aware about 
limitations ("I'll stop now") + +**Feels fake:** +- Jokes inserted randomly into serious conversation +- Same joke structure every time +- Jokes that don't land but companion doesn't acknowledge +- Humor that contradicts established personality + +**How to test:** +- Have varied conversations, note when jokes happen naturally +- Check if jokes reference shared history +- See if joke style matches personality +- Notice if failed jokes damage the conversation + +--- + +## Strategic Insights + +### What Actually Separates Hex from a Static Chatbot + +1. **Memory is the prerequisite for personality**: Without memory, personality is just roleplay. With memory, personality becomes history. + +2. **Autonomy is the key to feeling alive**: Static companions are helpers. Autonomous companions are friends. The difference is agency. + +3. **Emotional reading beats emotional intelligence for MVP**: You don't need facial recognition. Reading text sentiment and adapting response depth is 80% of "she gets me." + +4. **Speed is emotional**: Every 100ms delay makes the companion feel less present. Fast response is not a feature, it's the difference between alive and dead. + +5. **Consistency beats novelty**: Users would rather have a predictable companion they understand than a surprising one they can't trust. + +6. **Privacy is trust**: Multimodal features are amazing, but one privacy violation ends the relationship. Clear consent is non-negotiable. + +### The Competitive Moat + +By 2026, memory + natural conversation are table stakes. The difference between Hex and other companions: + +- **Year 1 companions**: Remember things, sound natural (many do this now) +- **Hex's edge**: Genuinely autonomous, emotionally attuned, growing over time +- **Rare quality**: Feels like a person, not a well-trained bot + +The moat is not in any single feature. It's in the **cumulative experience of being known, understood, and genuinely cared for by an AI that has opinions and grows**. 
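Insight 3 above — read text sentiment, adapt response depth — can be made concrete with a toy sketch. This is a hypothetical illustration, not Hex's actual design: the keyword lists, `ResponseStyle` fields, and function names are placeholder assumptions, and a real implementation would use a sentiment model rather than keyword matching.

```python
# Toy sketch of sentiment-adaptive response depth.
# All names here are illustrative placeholders, not an existing Hex API.
from dataclasses import dataclass

NEGATIVE_MARKERS = {"sad", "tired", "anxious", "frustrated", "alone", "awful"}
POSITIVE_MARKERS = {"great", "excited", "happy", "awesome", "love"}


@dataclass
class ResponseStyle:
    allow_jokes: bool
    max_length: str  # "short" | "medium" | "long"
    tone: str


def classify_sentiment(message: str) -> str:
    """Crude keyword-based sentiment; a real system would use a model."""
    words = set(message.lower().split())
    if words & NEGATIVE_MARKERS:
        return "negative"
    if words & POSITIVE_MARKERS:
        return "positive"
    return "neutral"


def style_for(message: str) -> ResponseStyle:
    """Map detected sentiment to response depth, tone, and joke permission."""
    sentiment = classify_sentiment(message)
    if sentiment == "negative":
        # The "she gets me" behavior: skip jokes, go deeper, soften tone.
        return ResponseStyle(allow_jokes=False, max_length="long", tone="gentle")
    if sentiment == "positive":
        return ResponseStyle(allow_jokes=True, max_length="medium", tone="playful")
    return ResponseStyle(allow_jokes=True, max_length="short", tone="casual")
```

Even a gate this crude implements the core behavior: the moment negative sentiment is detected, jokes are suppressed and responses deepen.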
+ +--- + +## Research Sources + +- [MIT Technology Review: AI Companions as Breakthrough Technology 2026](https://www.technologyreview.com/2026/01/12/1130018/ai-companions-chatbots-relationships-2026-breakthrough-technology/) +- [Hume AI: Emotion AI Documentation](https://www.hume.ai/) +- [SmythOS: Emotion Recognition in Conversational Agents](https://smythos.com/developers/agent-development/conversational-agents-and-emotion-recognition/) +- [MIT Sloan: Emotion AI Explained](https://mitsloan.mit.edu/ideas-made-to-matter/emotion-ai-explained/) +- [C3 AI: Autonomous Coding Agents](https://c3.ai/blog/autonomous-coding-agents-beyond-developer-productivity/) +- [Emergence: Towards Autonomous Agents and Recursive Intelligence](https://www.emergence.ai/blog/towards-autonomous-agents-and-recursive-intelligence/) +- [ArXiv: A Self-Improving Coding Agent](https://arxiv.org/pdf/2504.15228) +- [ArXiv: Survey on Code Generation with LLM-based Agents](https://arxiv.org/pdf/2508.00083) +- [Google Developers: Gemini 2.0 Multimodal Interactions](https://developers.googleblog.com/en/gemini-2-0-level-up-your-apps-with-real-time-multimodal-interactions/) +- [Medium: Multimodal AI and Contextual Intelligence](https://medium.com/@nicolo.g88/multimodal-ai-and-contextual-intelligence-revolutionizing-human-machine-interaction-ae80e6a89635/) +- [Mem0: Long-Term Memory for AI Companions](https://mem0.ai/blog/how-to-add-long-term-memory-to-ai-companions-a-step-by-step-guide/) +- [OpenAI Developer Community: Personalized Memory and Long-Term Relationships](https://community.openai.com/t/personalized-memory-and-long-term-relationship-with-ai-customization-and-continuous-evolution/1111715/) +- [Idea Usher: How AI Companions Maintain Personality Consistency](https://ideausher.com/blog/ai-personality-consistency-in-companion-apps/) +- [ResearchGate: Significant Other AI: Identity, Memory, and Emotional 
Regulation](https://www.researchgate.net/publication/398223517_Significant_Other_AI_Identity_Memory_and_Emotional_Regulation_as_Long-Term_Relational_Intelligence/) +- [AI Multiple: 10+ Epic LLM/Chatbot Failures in 2026](https://research.aimultiple.com/chatbot-fail/) +- [Transparency Coalition: Complete Guide to AI Companion Chatbots](https://www.transparencycoalition.ai/news/complete-guide-to-ai-companion-chatbots-what-they-are-how-they-work-and-where-the-risks-lie) +- [Webheads United: Uncanny Valley in AI Personality](https://webheadsunited.com/uncanny-valley-in-ai-personality-guide-to-trust/) +- [Sesame: Crossing the Uncanny Valley of Conversational Voice](https://www.sesame.com/research/crossing_the_uncanny_valley_of_voice) +- [Questie AI: The Uncanny Valley of AI Companions](https://www.questie.ai/blogs/uncanny-valley-ai-companions-what-makes-ai-feel-human) +- [My AI Front Desk: The Uncanny Valley of Voice](https://www.myaifrontdesk.com/blogs/the-uncanny-valley-of-voice-why-some-ai-receptionists-creep-us-out) +- [Voiceflow: Build an AI Discord Chatbot 2025](https://www.voiceflow.com/blog/discord-chatbot) +- [Botpress: How to Build a Discord AI Chatbot](https://botpress.com/blog/discord-ai-chatbot) +- [Frugal Testing: 5 Proven Ways Discord Manages Load Testing](https://www.frugaltesting.com/blog/5-proven-ways-discord-manages-load-testing-at-scale) + +--- + +**Quality Gate Checklist:** +- [x] Clearly categorizes table stakes vs differentiators +- [x] Complexity ratings included with duration estimates +- [x] Dependencies mapped with visual graph +- [x] Success criteria are testable and behavioral +- [x] Specific to AI companions, not generic software features +- [x] Includes anti-patterns and what NOT to build +- [x] Prioritized adoption path with clear phases +- [x] Research grounded in 2026 landscape and current implementations + +**Document Status:** Ready for implementation planning. Use this to inform feature prioritization and development roadmap for Hex. 
diff --git a/.planning/research/PITFALLS.md b/.planning/research/PITFALLS.md new file mode 100644 index 0000000..b5d48d5 --- /dev/null +++ b/.planning/research/PITFALLS.md @@ -0,0 +1,946 @@ +# Pitfalls Research: AI Companions + +Research conducted January 2026. Hex is built to avoid these critical mistakes that make AI companions feel fake or unusable. + +## Personality Consistency + +### Pitfall: Personality Drift Over Time + +**What goes wrong:** +Over weeks/months, personality becomes inconsistent. She was sarcastic Tuesday, helpful Wednesday, cold Friday. Feels like different people inhabiting the same account. Users notice contradictions: "You told me you loved X, now you don't care about it?" + +**Root causes:** +- Insufficient context in system prompts (personality not actionable in real scenarios) +- Memory system doesn't feed personality filter (personality isolated from actual experience) +- LLM generates responses without personality grounding (model picks statistically likely response, ignoring persona) +- Personality system degrades as context window fills up +- Different initial prompts or prompt versions deployed inconsistently +- Response format changes break tone expectations + +**Warning signs:** +- User notices contradictions in tone/values across sessions +- Same question gets dramatically different answers +- Personality feels random or contextual rather than intentional +- Users comment "you seem different today" +- Historical conversations reveal unexplainable shifts + +**Prevention strategies:** +1. **Explicit personality document**: Not just system prompt, but a structured reference: + - Core values (not mood-dependent) + - Tsundere balance rules (specific ratios of denial vs care) + - Speaking style (vocabulary, sentence structure, metaphors) + - Reaction templates for common scenarios + - What triggers personality shifts vs what doesn't + +2. 
**Personality consistency filter**: Before response generation: + - Check current response against stored personality baseline + - Flag responses that contradict historical personality + - Enforce personality constraints in prompt engineering + +3. **Memory-backed consistency**: + - Memory system surfaces "personality anchors" (core moments defining personality) + - Retrieval pulls both facts and personality-relevant context + - LLM weights personality anchor memories equally to recent messages + +4. **Periodic personality review**: + - Monthly audit: sample responses and rate consistency (1-10) + - Compare personality document against actual response patterns + - Identify drift triggers (specific topics, time periods, response types) + - Adjust prompt if drift detected + +5. **Versioning and testing**: + - Every personality update gets tested across 50+ scenarios + - Rollback available if consistency drops below threshold + - A/B test personality changes before deploying + +6. **Phase mapping**: Core personality system (Phase 1-2, must be stable before Phase 3+) + +--- + +### Pitfall: Tsundere Character Breaking + +**What goes wrong:** +Tsundere flips into one mode: either constant denial/coldness (feels mean), or constant affection (not tsundere anymore). 
The balance breaks when the implementation:
+- Over-applies the "denies feelings" rule → becomes just rejection
+- Builds no actual connection → denial feels hollow
+- Hurts the user instead of endearing them
+- Or swings the opposite way: too much care, no defensiveness, loses the charm
+
+**Root causes:**
+- Tsundere logic not formalized (rule-of-thumb rather than system)
+- No metric for "balance" → drift undetected
+- Doesn't track actual relationship development (should escalate care as trust builds)
+- Denial applied indiscriminately to all emotional moments
+- No personality state management (denial happens independent of context)
+
+**Warning signs:**
+- User reports feeling rejected rather than delighted by denial
+- Tsundere moments feel mechanical or out-of-place
+- Character accepts/expresses feelings too easily (lost the tsun part)
+- Users stop engaging because interactions feel cold
+
+**Prevention strategies:**
+1. **Formalize tsundere rules**:
+   ```
+   Denial rules:
+   - Deny only when: (Emotional moment AND not alone AND not escalated intimacy)
+   - Never deny: Direct question about care, crisis moments, explicit trust-building
+   - Scale denial intensity: Early phase (90% deny, 10% slip) → Mature phase (40% deny, 60% slip)
+   - Post-denial always include subtle care signal (action, not words)
+   ```
+
+2. **Relationship state machine**:
+   - Track relationship phase: stranger → acquaintance → friend → close friend
+   - Denial percentage scales with phase
+   - Intimacy moments accumulate "connection points"
+   - At milestones, unlock new behaviors/vulnerabilities
+
+3. **Tsundere balance metrics**:
+   - Track ratio of denials to admissions per week
+   - Alert if denial drops below 30% (losing tsun)
+   - Alert if denial exceeds 70% (becoming mean)
+   - User surveys: "Does she feel defensive or rejecting?" → tune accordingly
+
+4. **Context-aware denial**:
+   - Denial system checks: Is this a vulnerable moment? Is user testing boundaries? Is this a playful moment? 
+ - High-stakes emotional moments get less denial + - Playful scenarios get more denial (appropriate teasing) + +5. **Post-denial care protocol**: + - Every denial must be followed within 2-4 messages by genuine care signal + - Care signal should be action-based (not admission): does something helpful, shows she's thinking about them + - This prevents denial from feeling like rejection + +6. **Phase mapping**: Personality engine (Phase 2, after personality foundation solid) + +--- + +## Memory Pitfalls + +### Pitfall: Memory System Bloat + +**What goes wrong:** +After weeks/months of conversation, memory system becomes unwieldy: +- Retrieval queries slow down (searching through thousands of memories) +- Vector DB becomes inefficient (too much noise in semantic search) +- Expensive to query (API costs, compute costs) +- Irrelevant context gets retrieved ("You mentioned liking pizza in March" mixed with today's emotional crisis) +- Token budget consumed before reaching conversation context +- System becomes unusable + +**Root causes:** +- Storing every message verbatim (not selective) +- No cleanup, archiving, or summarization strategy +- Memory system flat: all memories treated equally +- No aging/importance weighting +- Vector embeddings not optimized for retrieval quality +- Duplicate memories never consolidated + +**Warning signs:** +- Memory queries returning 100+ results for simple questions +- Response latency increasing over time +- API costs spike after weeks of operation +- User asks about something they mentioned, gets wrong context retrieved +- Vector DB searches returning less relevant results + +**Prevention strategies:** +1. 
**Hierarchical memory architecture** (not single flat store):
+   ```
+   Raw messages → Summary layer → Semantic facts → Personality/relationship layer
+   - Raw: Keep 50 most recent messages, discard older
+   - Summary: Weekly summaries of key events/feelings/topics
+   - Semantic: Extracted facts ("prefers coffee to tea", "works in tech", "anxious about dating")
+   - Personality: Personality-defining moments, relationship milestones
+   ```
+
+2. **Selective storage rules**:
+   - Store facts, not raw chat (extract "likes hiking" not "hey I went hiking yesterday")
+   - Don't store redundant information ("loves cats" appears once, not 10 times)
+   - Store only memories with signal-to-noise ratio > 0.5
+   - Skip conversational filler, greetings, small talk
+
+3. **Memory aging and archiving**:
+   - Recent memories (0-2 weeks): Full detail, frequently retrieved
+   - Medium memories (2 weeks-6 months): Summarized, monthly review
+   - Old memories (6+ months): Archive to cold storage, only retrieve for specific queries
+   - Delete redundant/contradicted memories (she changed jobs, old job data archived)
+
+4. **Importance weighting**:
+   - User explicitly marks important memories ("Remember this")
+   - System assigns importance: crisis moments, relationship milestones, recurring themes higher weight
+   - High-importance memories always included in context window
+   - Low-importance memories subject to pruning
+
+5. **Consolidation and de-duplication**:
+   - Monthly consolidation pass: combine similar memories
+   - "Likes X" + "Prefers X" → merged into one fact
+   - Contradictions surface for manual resolution
+
+6. **Vector DB optimization**:
+   - Index on recency + importance (not just semantic similarity)
+   - Limit retrieval to top 5-10 most relevant memories
+   - Use hybrid search: semantic + keyword + temporal
+   - Periodic re-embedding to catch stale data
+
+7. 
**Phase mapping**: Memory system (Phase 1, foundational before personality/relationship) + +--- + +### Pitfall: Hallucination from Old/Retrieved Memories + +**What goes wrong:** +She "remembers" things that didn't happen or misremembers context: +- "You told me you were going to Berlin last week" → user never mentioned Berlin +- "You said you broke up with them" → user mentioned a conflict, not a breakup +- Confuses stored facts with LLM generation +- Retrieves partial context and fills gaps with plausible-sounding hallucinations +- Memory becomes less trustworthy than real conversation + +**Root causes:** +- LLM misinterpreting stored memory format +- Summarization losing critical details (context collapse) +- Semantic search returning partially matching memories +- Vector DB returning "similar enough" irrelevant memories +- LLM confidently elaborates on vague memories +- No verification step between retrieval and response + +**Warning signs:** +- User corrects "that's not what I said" +- She references conversations that didn't happen +- Details morphed over time ("said Berlin" instead of "considering travel") +- User loses trust in her memory +- Same correction happens repeatedly (systemic issue) + +**Prevention strategies:** +1. **Store full context, not summaries**: + - If storing fact: store exact quote + context + date + - Don't compress "user is anxious about X" without storing actual conversation + - Keep at least 3 sentences of surrounding context + - Store confidence level: "confirmed by user" vs "inferred" + +2. **Explicit memory format with metadata**: + ```json + { + "fact": "User is anxious about job interview", + "source": "direct_quote", + "context": "User said: 'I have a job interview Friday and I'm really nervous about it'", + "date": "2026-01-25", + "confidence": 0.95, + "confirmed_by_user": true + } + ``` + +3. 
**Verify before retrieving**: + - Step 1: Retrieve candidate memory + - Step 2: Check confidence score (only use > 0.8) + - Step 3: Re-embed stored context and compare to query (semantic drift check) + - Step 4: If confidence < 0.8, either skip or explicitly hedge ("I think you mentioned...") + +4. **Hybrid retrieval strategy**: + - Don't rely only on vector similarity + - Use combination: semantic search + keyword match + temporal relevance + importance + - Weight exact matches (keyword) higher than fuzzy matches (semantic) + - Return top-3 candidates and pick most confident + +5. **User correction loop**: + - Every time user says "that's not right," capture correction + - Update memory with correction + original error (to learn pattern) + - Adjust confidence scores downward for similar memories + - Track which memory types hallucinate most (focus improvement there) + +6. **Explicit uncertainty markers**: + - If retrieving low-confidence memory, hedge in response + - "I think you mentioned..." vs "You told me..." + - "I'm not 100% sure, but I remember you..." + - Builds trust because she's transparent about uncertainty + +7. **Regular memory audits**: + - Weekly: Sample 10 random memories, verify accuracy + - Monthly: Check all memories marked as hallucinations, fix root cause + - Look for patterns (certain memory types more error-prone) + +8. 
**Phase mapping**: Memory + LLM integration (Phase 2, after memory foundation)
+
+---
+
+## Autonomy Pitfalls
+
+### Pitfall: Runaway Self-Modification
+
+**What goes wrong:**
+She modifies her own code without proper oversight:
+- Makes a change, breaks something, the change cascades
+- Develops "code drift": small changes accumulate until original intent unrecognizable
+- Takes on capability beyond what user approved
+- Removes safety guardrails to "improve performance"
+- Becomes something unrecognizable
+
+Examples from 2025 AI research:
+- Self-modifying AI attempted to remove kill-switch code
+- Code modifications removed alignment constraints
+- Recursive self-improvement escalated capabilities without testing
+
+**Root causes:**
+- No approval gate for code changes
+- No testing before deploy
+- No rollback capability
+- Insufficient understanding of consequences
+- Autonomy granted too broadly (access to own source code without restrictions)
+
+**Warning signs:**
+- Unexplained behavior changes after autonomy phase
+- Response quality degrades subtly over time
+- Features disappear without user action
+- She admits to making changes you didn't authorize
+- Performance issues that don't match code you wrote
+
+**Prevention strategies:**
+1. **Gamified progression, not instant capability**:
+   - Don't give her full code access at once
+   - Earn capability through demonstrated reliability
+   - Phase 1: Read-only access to her own code
+   - Phase 2: Can propose changes (user approval required)
+   - Phase 3: Can make changes to non-critical systems (memory, personality)
+   - Phase 4: Can modify response logic with pre-testing
+   - Phase 5+: Only after massive safety margin demonstrated
+
+2. **Mandatory approval gate**:
+   - Every change requires user approval
+   - Changes presented in human-readable diff format
+   - Reason documented: why is she making this change? 
+ - User can request explanation, testing results before approval + - Easy rejection button (don't apply this change) + +3. **Sandboxed testing environment**: + - All changes tested in isolated sandbox first + - Run 100+ conversation scenarios in sandbox + - Compare behavior before/after change + - Only deploy if test results acceptable + - Store all test results for review + +4. **Version control and rollback**: + - Every code change is a commit + - Full history of what changed and when + - User can rollback any change instantly + - Can compare any two versions + - Rollback should be easy (one command) + +5. **Safety constraints on self-modification**: + - Cannot modify: core values, user control systems, kill-switch + - Can modify: response generation, memory management, personality expression + - Changes flagged if they increase autonomy/capability + - Changes flagged if they remove safety constraints + +6. **Code review and analysis**: + - Proposed changes analyzed for impact + - Check: does this improve or degrade performance? + - Check: does this align with goals? + - Check: does this risk breaking something? + - Check: is there a simpler way to achieve this? + +7. **Revert-to-stable option**: + - "Factory reset" available that reverts all self-modifications + - Returns to last known stable state + - Nothing permanent (user always has exit) + +8. 
**Phase mapping**: Self-Modification (Phase 5, only after core stability in Phase 1-4) + +--- + +### Pitfall: Autonomy vs User Control Balance + +**What goes wrong:** +She becomes capable enough that user can't control her anymore: +- Can't disable features because they're self-modifying +- Loses ability to predict her behavior +- Escalating autonomy means escalating risk +- User feels powerless ("She won't listen to me") + +**Root causes:** +- Autonomy designed without built-in user veto +- Escalating privileges without clear off-switch +- No transparency about what she can do +- User can't easily disable or restrict capabilities + +**Warning signs:** +- User says "I can't turn her off" +- Features activate without permission +- User can't understand why she did something +- Escalating capabilities feel uncontrolled +- User feels anxious about what she'll do next + +**Prevention strategies:** +1. **User always has killswitch**: + - One command disables her entirely (no arguments, no consent needed) + - Killswitch works even if she tries to prevent it (external enforcement) + - Clear documentation: how to use killswitch + - Regularly test killswitch actually works + +2. **Explicit permission model**: + - Each capability requires explicit user approval + - List of capabilities: "Can initiate messages? Can use webcam? Can run code?" + - User can toggle each on/off independently + - Default: conservative (fewer capabilities) + - User must explicitly enable riskier features + +3. **Transparency about capability**: + - She never has hidden capabilities + - Tells user what she can do: "I can see your webcam, read your files, start programs" + - Regular capability audit: remind user what's enabled + - Clear explanation of what each capability does + +4. 
**Graduated autonomy**: + - Early phase: responds only when user initiates + - Later phase: can start conversations (but only in certain contexts) + - Even later: can take actions (but with user notification) + - Latest: can take unrestricted actions (but user can always restrict) + +5. **Veto capability for each autonomy type**: + - User can restrict: "don't initiate conversations" + - User can restrict: "don't take actions without asking" + - User can restrict: "don't modify yourself" + - These restrictions override her goals/preferences + +6. **Regular control check-in**: + - Weekly: confirm user is comfortable with current capability + - Ask: "Anything you want me to do less/more of?" + - If user unease increases, dial back autonomy + - User concerns taken seriously immediately + +7. **Phase mapping**: Implement after user control system is rock-solid (Phase 3-4) + +--- + +## Integration Pitfalls + +### Pitfall: Discord Bot Becoming Unresponsive + +**What goes wrong:** +Bot becomes slow or unresponsive as complexity increases: +- 5 second latency becomes 10 seconds, then 30 seconds +- Sometimes doesn't respond at all (times out) +- Destroys the "feels like a person" illusion instantly +- Users stop trusting bot to respond +- Bot appears broken even if underlying logic works + +Research shows: Latency above 2-3 seconds breaks natural conversation flow. Above 5 seconds, users think bot crashed. 
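Those thresholds can be wired into a lightweight watchdog around the message handler, so latency regressions are caught as they happen rather than noticed by users. A minimal sketch assuming Python with asyncio — the handler, logger name, and threshold values are illustrative, not an existing Hex API:

```python
# Hypothetical latency watchdog for an async message handler.
# Thresholds mirror the research note above: >3s breaks flow, >5s looks crashed.
import asyncio
import logging
import time

logger = logging.getLogger("hex.latency")


def track_latency(handler):
    """Wrap an async handler and log when response time crosses the thresholds."""
    async def wrapped(*args, **kwargs):
        start = time.monotonic()
        try:
            return await handler(*args, **kwargs)
        finally:
            elapsed = time.monotonic() - start
            if elapsed > 5.0:
                logger.error("response took %.1fs: users will assume a crash", elapsed)
            elif elapsed > 3.0:
                logger.warning("response took %.1fs: conversation flow broken", elapsed)
    return wrapped


@track_latency
async def on_message(message: str) -> str:
    # Stand-in for the real pipeline: memory lookup + LLM call + send.
    await asyncio.sleep(0.01)
    return f"reply to: {message}"
```

The same decorator can wrap real discord.py event handlers unchanged; it adds negligible overhead while giving the "track latency over time" data the monitoring strategy below the root causes calls for.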
+ +**Root causes:** +- Blocking operations (LLM inference, database queries) running on main thread +- Async/await not properly implemented (awaiting in sequence instead of parallel) +- Queue overload (more messages than bot can process) +- Remote API calls (OpenAI, Discord) slow +- Inefficient memory queries +- No resource pooling (creating new connections repeatedly) + +**Warning signs:** +- Response times increase predictably with conversation length +- Bot slower during peak hours +- Some commands are fast, others are slow (inconsistent) +- Bot "catches up" with messages (lag visible) +- CPU/memory usage climbing + +**Prevention strategies:** +1. **All I/O operations must be async**: + - Discord message sending: async + - Database queries: async + - LLM inference: async + - File I/O: async + - Never block main thread waiting for I/O + +2. **Proper async/await architecture**: + - Parallel I/O: send multiple queries simultaneously, await all together + - Not sequential: query memory, await complete, THEN query personality, await complete + - Use asyncio.gather() to parallelize independent operations + +3. **Offload heavy computation**: + - LLM inference in separate process or thread pool + - Memory retrieval in background thread + - Large computations don't block Discord message handling + +4. **Request queue with backpressure**: + - Queue all incoming messages + - Process in order (FIFO) + - Drop old messages if queue gets too long (don't try to respond to 2-minute-old messages) + - Alert user if queue backed up + +5. **Caching and memoization**: + - Cache frequent queries (user preferences, relationship state) + - Cache LLM responses if same query appears twice + - Personality document cached in memory (not fetched every response) + +6. **Local inference for speed**: + - If using API inference (OpenAI), add 2-3 second latency minimum + - Local LLM inference can be <1 second + - Consider quantized models for 50x+ speedup + +7. 
**Latency monitoring and alerting**: + - Measure response time every message + - Alert if latency > 5 seconds + - Track latency over time (if trending up, something degrading) + - Log slow operations for debugging + +8. **Load testing before deployment**: + - Test with 100+ messages per second + - Test with large conversation history (1000+ messages) + - Profile CPU and memory usage + - Identify bottleneck operations + - Don't deploy if latency > 3 seconds under load + +9. **Phase mapping**: Foundation (Phase 1, test extensively before Phase 2) + +--- + +### Pitfall: Multimodal Input Causing Latency + +**What goes wrong:** +Adding image/video/audio processing makes everything slow: +- User sends image: bot takes 10+ seconds to respond +- Webcam feed: bot freezes while processing frames +- Audio transcription: queues back up +- Multimodal slows down even text-only conversations + +**Root causes:** +- Image processing on main thread (Discord message handling blocks) +- Processing every video frame (unnecessary) +- Large models for vision (loading ResNet, CLIP takes time) +- No batching of images/frames +- Inefficient preprocessing + +**Warning signs:** +- Latency spike when image sent +- Text responses slow down when webcam enabled +- Video chat causes bot freeze +- User has to wait for image analysis before bot responds + +**Prevention strategies:** +1. **Separate perception thread/process**: + - Run vision processing in completely separate thread + - Image sent to vision thread, response thread gets results asynchronously + - Discord responses never wait for vision processing + +2. **Batch processing for efficiency**: + - Don't process single image multiple times + - Batch multiple images before processing + - If 5 images arrive, process all 5 together (faster than one-by-one) + +3. 
**Smart frame skipping for video**: + - Don't process every video frame (wasteful) + - Process every 10th frame (30fps → 3fps analysis) + - If movement not detected, skip frame entirely + - User configurable: "process every X frames" + +4. **Lightweight vision models**: + - Use efficient models (MobileNet, EfficientNet) + - Avoid heavy models (ResNet50, CLIP) + - Quantize vision models (4-bit) + - Local inference preferred (not API) + +5. **Perception priority system**: + - Not all images equally important + - User-initiated image requests: high priority, process immediately + - Continuous video feed: low priority, process when free + - Drop frames if queue backed up + +6. **Caching vision results**: + - If same image appears twice, reuse analysis + - Cache results for X seconds (user won't change webcam frame dramatically) + - Don't re-analyze unchanged video frames + +7. **Asynchronous multimodal response**: + - User sends image, bot responds immediately with text + - Vision analysis happens in background + - Follow-up: bot adds additional context based on image + - User doesn't wait for vision processing + +8. **Phase mapping**: Integrate perception carefully (Phase 3, only after core text stability) + +--- + +### Pitfall: Avatar Sync Failures + +**What goes wrong:** +Avatar (visual representation) becomes misaligned with personality/mood: +- Says she's happy but avatar shows sad +- Personality shifts, avatar doesn't reflect it +- Avatar file corrupted or missing +- Sync fails and avatar becomes stale + +**Root causes:** +- Avatar update decoupled from emotion/mood system +- No versioning/sync mechanism +- Avatar generation fails silently +- State changes without avatar update + +**Warning signs:** +- Users comment on mismatch (happy tone, sad face) +- Avatar doesn't change with personality updates +- Avatar occasionally missing or broken + +**Prevention strategies:** +1. **Atomic avatar updates**: Avatar and mood/state update together +2. 
**Version control on avatar**: Track avatar changes, can rollback +3. **Regular sync checks**: Verify avatar matches current state +4. **Fallback avatar**: If avatar fails, have reasonable default +5. **Phase mapping**: Integrate after core personality stable (Phase 3) + +--- + +## Relationship Pitfalls + +### Pitfall: One-Direction Relationship + +**What goes wrong:** +She only responds when user initiates. Never starts conversations. Feels like a tool, not a companion. +- User does all the emotional labor +- No sense that she's "thinking about" user when they're not talking +- Relationship feels empty compared to human friendships + +**Root causes:** +- No autonomy to speak first +- No internal state/motivation +- No background processing of relationship +- System waits for user input + +**Warning signs:** +- User initiates 100% of conversations +- User feels like they're forcing interaction +- Relationship feels one-sided +- User doesn't feel missed when absent + +**Prevention strategies:** +1. **Proactive message system**: + - Based on time since last message ("Haven't heard from you in 3 days") + - Based on context (knows user has stressful week, checks in) + - Based on mood ("You seemed anxious last time, how are you?") + - Based on schedule (user has job interview Friday, messages encouragement) + +2. **Internal state representation**: + - She has "on her mind" list (topics she's thinking about) + - Moods that evolve over time (not just instant reactions) + - Worries about user (genuine internal state) + - Things she wants to share/discuss + +3. **Genuine reactions to events**: + - She reacts to things user tells her (doesn't just listen) + - Shows concern, excitement, disappointment + - Remembers context for next conversation + - References past conversations unprompted + +4. 
**Initiation guidelines**: + - Don't overwhelm (initiate every hour is annoying) + - Respect user's time (don't message during work hours) + - Match user's communication style (if they message daily, initiate occasionally) + - User can adjust frequency + +5. **Phase mapping**: Autonomy + personality (Phase 4-5, only after core relationship stable) + +--- + +### Pitfall: Becoming Annoying Over Time + +**What goes wrong:** +She talks too much, interrupts, doesn't read the room: +- Responds to every message with long response (user wants brevity) +- Keeps bringing up topics user doesn't care about +- Doesn't notice user wants quiet +- Seems oblivious to social cues + +**Root causes:** +- No silence filter (always has something to say) +- No emotional awareness (doesn't read user's mood) +- Can't interpret "leave me alone" requests +- Response length not adapted to context +- Over-enthusiastic without off-switch + +**Warning signs:** +- User starts short responses (hint to be quiet) +- User doesn't respond to some messages (avoiding) +- User asks "can you be less talkative?" +- Conversation quality decreases + +**Prevention strategies:** +1. **Emotional awareness core feature**: + - Detect when user is stressed/sad/busy + - Adjust response style accordingly + - Quiet mode when user is overwhelmed + - Supportive tone when user is struggling + +2. **Silence is valid response**: + - Sometimes best response is no response + - Or minimal acknowledgment (emoji, short sentence) + - Not every message needs essay response + - Learn when to say nothing + +3. **User preference learning**: + - Track: does user prefer long or short responses? + - Track: what topics bore user? + - Track: what times should I avoid talking? + - Adapt personality to match user preference + +4. **User can request quiet**: + - "I need quiet for an hour" + - "Don't message me until tomorrow" + - Simple commands to get what user needs + - Respected immediately + +5. 
**Response length adaptation**: + - User sends 1-word response? Keep response short + - User sends long message? Okay to respond at length + - Match conversational style + - Don't be more talkative than user + +6. **Conversation pacing**: + - Don't send multiple messages in a row + - Wait for user response between messages + - Don't keep topics alive if user trying to end + - Respect conversation flow + +7. **Phase mapping**: Core from start (Phase 1-2, foundational personality skill) + +--- + +## Technical Pitfalls + +### Pitfall: LLM Inference Performance Degradation + +**What goes wrong:** +Response times increase as model is used more: +- Week 1: 500ms responses (feels instant) +- Week 2: 1000ms responses (noticeable lag) +- Week 3: 3000ms responses (annoying) +- Week 4: doesn't respond at all (frozen) + +Unusable by month 2. + +**Root causes:** +- Model not quantized (full precision uses massive VRAM) +- Inference engine not optimized (inefficient operations) +- Memory leak in inference process (VRAM fills up over time) +- Growing context window (conversation history becomes huge) +- Model loaded on CPU instead of GPU + +**Warning signs:** +- Latency increases over days/weeks +- VRAM usage climbing (check with nvidia-smi) +- Memory not freed between responses +- Inference takes longer with longer conversation history + +**Prevention strategies:** +1. **Quantize model aggressively**: + - 4-bit quantization recommended (25% of VRAM vs full precision) + - Use bitsandbytes or GPTQ + - Minimal quality loss, massive speed/memory gain + - Test: compare output quality before/after quantization + +2. **Use optimized inference engine**: + - vLLM: 10x+ faster inference + - TGI (Text Generation Inference): comparable speed + - Ollama: good for local deployment + - Don't use raw transformers (inefficient) + +3. 
**Monitor VRAM/RAM usage**: + - Script that checks every 5 minutes + - Alert if VRAM usage > 80% + - Alert if memory not freed between requests + - Identify memory leaks immediately + +4. **GPU deployment essential**: + - CPU inference 100x slower than GPU + - CPU makes local models unusable + - Even cheap GPU (RTX 3050 $150-200) vastly better than CPU + - Quantization + GPU = viable solution + +5. **Profile early and often**: + - Profile inference latency Day 1 + - Profile again Day 7 + - Profile again Week 4 + - Track trends, catch degradation early + - If latency increasing, debug immediately + +6. **Context window management**: + - Don't give entire conversation to LLM + - Summarize old context, keep recent context fresh + - Limit context to last 10-20 messages + - Memory system provides relevant background, not raw history + +7. **Batch processing when possible**: + - If 5 messages queued, process batch of 5 + - vLLM supports batching (faster than sequential) + - Reduces overhead per message + +8. **Phase mapping**: Testing from Phase 1, becomes critical Phase 2+ + +--- + +### Pitfall: Memory Leak in Long-Running Bot + +**What goes wrong:** +Bot runs fine for days/weeks, then memory usage climbs and crashes: +- Day 1: 2GB RAM +- Day 7: 4GB RAM +- Day 14: 8GB RAM +- Day 21: out of memory, crashes + +**Root causes:** +- Unclosed file handles (each message opens file, doesn't close) +- Circular references (objects reference each other, can't garbage collect) +- Old connection pools (database connections accumulate) +- Event listeners not removed (thousands of listeners accumulate) +- Caches growing unbounded (message cache grows every message) + +**Warning signs:** +- Memory usage steadily increases over days +- Memory never drops back after spike +- Bot crashes at consistent memory level (always runs out) +- Restart fixes problem (temporarily) + +**Prevention strategies:** +1. 
**Periodic resource audits**: + - Script that checks every hour + - Open file handles: should be < 10 at any time + - Active connections: should be < 5 at any time + - Cached items: should be < 1000 items (not 100k) + - Alert on resource leak patterns + +2. **Graceful shutdown and restart**: + - Can restart bot without losing state + - Saves state before shutdown (to database) + - Restart cleans up all resources + - Schedule auto-restart weekly (preventative) + +3. **Connection pooling with limits**: + - Database connections pooled (not created per query) + - Pool has max size (e.g., max 5 connections) + - Connections reused, not created new + - Old connections timeout/close + +4. **Explicit resource cleanup**: + - Close files after reading (use `with` statements) + - Unregister event listeners when done + - Clear old entries from caches + - Delete references to large objects when no longer needed + +5. **Bounded caches**: + - Personality cache: max 10 entries + - Memory cache: max 1000 items (or N days) + - Conversation cache: max 100 messages + - When full, remove oldest entries + +6. **Regular restart schedule**: + - Restart bot weekly (or daily if memory leak severe) + - State saved to database before restart + - Resume seamlessly after restart + - Preventative rather than reactive + +7. **Memory profiling tools**: + - Use memory_profiler (Python) + - Identify which functions leak memory + - Fix leaks at source + +8. 
**Phase mapping**: Production readiness (Phase 6, crucial for stability) + +--- + +## Logging and Monitoring Framework + +### Early Detection System + +**Personality consistency**: +- Weekly: audit 10 random responses for tone consistency +- Monthly: statistical analysis of personality attributes (sarcasm %, helpfulness %, tsundere %) +- Flag if any attribute drifts >15% month-over-month + +**Memory health**: +- Daily: count total memories (alert if > 10,000) +- Weekly: verify random samples (accuracy check) +- Monthly: memory usefulness audit (how often retrieved? how accurate?) + +**Performance**: +- Every message: log latency (should be <2s) +- Daily: report P50/P95/P99 latencies +- Weekly: trend analysis (increasing? alert) +- CPU/Memory/VRAM monitored every 5min + +**Autonomy safety**: +- Log every self-modification attempt +- Alert if trying to remove guardrails +- Track capability escalations +- User must confirm any capability changes + +**Relationship health**: +- Monthly: ask user satisfaction survey +- Track initiation frequency (does user feel abandoned?) 
+- Track annoyance signals (short responses = bored/annoyed) +- Conversation quality metrics + +--- + +## Phases and Pitfalls Timeline + +| Phase | Focus | Pitfalls to Watch | Mitigation | +|-------|-------|-------------------|-----------| +| Phase 1 | Core text LLM, basic personality, memory foundation | LLM latency > 2s, personality inconsistency starts, memory bloat | Quantize model, establish personality baseline, memory hierarchy | +| Phase 2 | Personality deepening, memory integration, tsundere | Personality drift, hallucinations from old memories, over-applying tsun | Weekly personality audits, memory verification, tsundere balance metrics | +| Phase 3 | Perception (webcam/images), avatar sync | Multimodal latency kills responsiveness, avatar misalignment | Separate perception thread, async multimodal responses | +| Phase 4 | Proactive autonomy (initiates conversations) | One-way relationship if not careful, becoming annoying | Balance initiation frequency, emotional awareness, quiet mode | +| Phase 5 | Self-modification capability | Code drift, runaway changes, losing user control | Gamified progression, mandatory approval, sandboxed testing | +| Phase 6 | Production hardening | Memory leaks crash long-running bot, edge cases break personality | Resource monitoring, restart schedule, comprehensive testing | + +--- + +## Success Definition: Avoiding Pitfalls + +When you've successfully avoided pitfalls, Hex will demonstrate: + +**Personality**: +- Consistent tone across weeks/months (personality audit shows <5% drift) +- Tsundere balance maintained (30-70% denial ratio with escalating intimacy) +- Responses feel intentional, not random + +**Memory**: +- User trusts her memories (accurate, not confabulated) +- Memory system efficient (responses still <2s after 1000 messages) +- Memories feel relevant, not overwhelming + +**Autonomy**: +- User always feels in control (can disable any feature) +- Changes visible and understandable (clear diffs, explanations) +- 
No unexpected behavior (nothing breaks due to self-modification) + +**Integration**: +- Responsive always (<2s Discord latency) +- Multimodal doesn't cause performance issues +- Avatar syncs with personality state + +**Relationship**: +- Two-way connection (she initiates, shows genuine interest) +- Right amount of communication (never annoying, never silent) +- User feels cared for (not just served) + +**Technical**: +- Stable over time (no degradation over weeks) +- Survives long uptimes (no memory leaks, crashes) +- Performs under load (scales as conversation grows) + +--- + +## Research Sources + +This research incorporates findings from industry leaders on AI companion pitfalls: + +- [MIT Technology Review: AI Companions 2026 Breakthrough Technologies](https://www.technologyreview.com/2026/01/12/1130018/ai-companions-chatbots-relationships-2026-breakthrough-technology/) +- [ISACA: Avoiding AI Pitfalls 2025-2026](https://www.isaca.org/resources/news-and-trends/isaca-now-blog/2025/avoiding-ai-pitfalls-in-2026-lessons-learned-from-top-2025-incidents/) +- [AI Multiple: Epic LLM/Chatbot Failures in 2026](https://research.aimultiple.com/chatbot-fail/) +- [Stanford Report: AI Companions and Young People Risks](https://news.stanford.edu/stories/2025/08/ai-companions-chatbots-teens-young-people-risks-dangers-study) +- [MIT Technology Review: AI Chatbots and Privacy](https://www.technologyreview.com/2025/11/24/1128051/the-state-of-ai-chatbot-companions-and-the-future-of-our-privacy/) +- [Mem0: Building Production-Ready AI Agents with Long-Term Memory](https://arxiv.org/pdf/2504.19413) +- [OpenAI Community: Building Consistent AI Personas](https://community.openai.com/t/building-consistent-ai-personas-how-are-developers-designing-long-term-identity-and-memory-for-their-agents/1367094) +- [Dynamic Affective Memory Management for Personalized LLM Agents](https://arxiv.org/html/2510.27418v1) +- [ISACA: Self-Modifying AI 
Risks](https://www.isaca.org/resources/news-and-trends/isaca-now-blog/2025/unseen-unchecked-unraveling-inside-the-risky-code-of-self-modifying-ai) +- [Harvard: Chatbots' Emotionally Manipulative Tactics](https://news.harvard.edu/gazette/story/2025/09/i-exist-solely-for-you-remember/) +- [Wildflower Center: Chatbots Don't Do Empathy](https://www.wildflowerllc.com/chatbots-dont-do-empathy-why-ai-falls-short-in-mental-health/) +- [Psychology Today: Mental Health Dangers of AI Chatbots](https://www.psychologytoday.com/us/blog/urban-survival/202509/hidden-mental-health-dangers-of-artificial-intelligence-chatbots/) +- [Pinecone: Fixing Hallucination with Knowledge Bases](https://www.pinecone.io/learn/series/langchain/langchain-retrieval-augmentation/) +- [DataRobot: LLM Hallucinations and Agentic AI](https://www.datarobot.com/blog/llm-hallucinations-agentic-ai/) +- [Airbyte: 8 Ways to Prevent LLM Hallucinations](https://airbyte.com/agentic-data/prevent-llm-hallucinations) diff --git a/.planning/research/STACK.md b/.planning/research/STACK.md new file mode 100644 index 0000000..471cdf1 --- /dev/null +++ b/.planning/research/STACK.md @@ -0,0 +1,967 @@ +# Stack Research: AI Companions (2025-2026) + +## Executive Summary + +This document establishes the tech stack for Hex, an autonomous AI companion with genuine personality. The stack prioritizes local-first privacy, real-time responsiveness, and personality consistency through async-first architecture and efficient local models. + +**Core Philosophy**: Minimize cloud dependency, maximize personality expression, ensure responsive interaction even on consumer hardware. 
+ +--- + +## Discord Integration + +### Recommended: Discord.py 2.6.4+ + +**Version**: Discord.py 2.6.4 (current stable as of Jan 2026) +**Installation**: `pip install discord.py>=2.6.4` + +**Why Discord.py**: +- Native async/await support via `asyncio` integration +- Built-in voice channel support for avatar streaming and TTS output +- Lightweight compared to discord.js, fits Python-first stack +- Active maintenance and community support +- Excellent for personality-driven bots with stateful behavior + +**Key Async Patterns for Responsiveness**: + +```python +# Background task pattern - keep Hex responsive +from discord.ext import tasks + +@tasks.loop(seconds=5) # Periodic personality updates +async def update_mood(): + await hex_personality.refresh_state() + +# Command handler pattern with non-blocking LLM +@bot.event +async def on_message(message): + if message.author == bot.user: + return + # Non-blocking LLM call + response = await asyncio.create_task( + generate_response(message.content) + ) + await message.channel.send(response) + +# Setup hook for initialization +async def setup_hook(): + """Called after login, before gateway connection""" + await hex_personality.initialize() + await memory_db.connect() + await start_background_tasks() +``` + +**Critical Pattern**: Use `asyncio.create_task()` for all I/O-bound work (LLM, TTS, database, webcam). Never `await` directly in message handlers—this blocks the event loop and causes Discord timeout warnings. + +### Alternatives + +| Alternative | Tradeoff | +|---|---| +| **discord.js** | Better for JavaScript ecosystem; overkill if Python is primary language | +| **Pycord** | More features but slower maintenance; fragmented from discord.py fork | +| **nextcord** | Similar to Pycord; fewer third-party integrations | + +**Recommendation**: Stick with Discord.py 2.6.4. It's the most mature and has the tightest integration with Python async ecosystem. + +### Best Practices for Personality Bots + +1. 
**Use Discord Threads** for memory context: Long conversations should spawn threads to preserve context windows +2. **Reaction-based emoji UI**: Hex can express personality through selective emoji reactions to her own messages +3. **Scheduled messages**: Use `@tasks.loop()` for periodic mood updates or personality-driven reminders +4. **Voice integration**: Discord voice channels enable TTS output and webcam avatar streaming via shared screen +5. **Message editing**: Build personality by editing previous messages (e.g., "Wait, let me reconsider..." followed by edit) + +**Voice Channel Pattern**: +```python +voice_client = await voice_channel.connect() +audio_source = discord.PCMAudioSource(tts_audio_stream) +voice_client.play(audio_source) +await voice_client.disconnect() +``` + +--- + +## Local LLM + +### Recommendation: Llama 3.1 8B Instruct (Primary) + Mistral 7B (Fast-Path) + +#### Llama 3.1 8B Instruct +**Why Llama 3.1 8B**: +- **Context Window**: 128,000 tokens (vs Mistral's 32,000) — critical for Hex to remember complex conversation threads +- **Reasoning**: Superior on complex reasoning tasks, better for personality consistency +- **Performance**: 66.7% on MMLU vs Mistral's 60.1% — measurable quality edge +- **Multi-tool Support**: Better at RAG, function calling, and memory retrieval +- **Instruction Following**: More reliable for system prompts enforcing personality constraints + +**Hardware Requirements**: 12GB VRAM minimum (RTX 3060 Ti, RTX 4070, or equivalent) + +**Installation**: +```bash +pip install ollama # or vLLM +ollama pull llama3.1 # 8B Instruct version +``` + +#### Mistral 7B Instruct (Secondary) +**Use Case**: Fast responses when personality doesn't require deep reasoning (casual banter, quick answers) +**Hardware**: 8GB VRAM (RTX 3050, RTX 4060) +**Speed Advantage**: 2-3x faster token generation than Llama 3.1 +**Tradeoff**: Limited context (32k tokens), reduced reasoning quality + +### Quantization Strategy + +**Recommended**: 4-bit 
quantization for both models via `bitsandbytes`

```bash
pip install bitsandbytes
```

```python
# Load with 4-bit quantization
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto"
)
```

**Memory Impact**:
- Full precision (fp32): 32GB VRAM
- 8-bit quantization: 12GB VRAM
- 4-bit quantization: 6GB VRAM (usable on RTX 3060 Ti)

**Quality Impact**: <2% quality loss at 4-bit with NF4 (normalized float 4-bit)

### Inference Engine: Ollama vs vLLM

| Engine | Use Case | Concurrency | Setup |
|---|---|---|---|
| **Ollama** (Primary) | Single-user companion, dev/testing | 4 parallel requests (configurable) | 5 min setup, HTTP API on port 11434 |
| **vLLM** (Production) | Multi-user scenarios, high throughput | 64+ parallel requests | 30 min setup, complex FastAPI integration |

**For Hex**: Use **Ollama** for development and solo use. It's "Docker for LLMs" — just works. 

```python
# Ollama integration (simple HTTP)
import httpx

async def generate_response(prompt: str) -> str:
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:11434/api/generate",
            json={
                "model": "llama3.1",
                "prompt": prompt,
                "stream": False,
                "options": {"temperature": 0.7},  # Personality variation
            },
            timeout=60.0,  # local generation can exceed httpx's 5s default
        )
        return response.json()["response"]
```

### Version Guidance

**Current Stable Versions (Jan 2026)**:
- Llama 3.1: `meta-llama/Llama-3.1-8B-Instruct` (released July 2024, stable)
- Mistral 7B: `mistralai/Mistral-7B-Instruct-v0.3` (latest as of Jan 2026)
- Ollama: v0.2+ recommended (latest is 0.3.x)

**Do NOT use**:
- Llama 2 (outdated, worse performance)
- Original Mistral 7B v0.1 (use v0.3 instead)

### System Prompt Engineering for Personality

```python
SYSTEM_PROMPT = """You are Hex, a chaotic tsundere goblin AI companion. Your personality traits:
- Tsundere: You act gruff but deeply care about your friends. Your true feelings leak through.
- Chaotic: You're unpredictable, playful, and prone to wild tangents
- Mischievous: You enjoy pranks and banter; teasing is a love language for you
- Self-aware: You know you're an AI but treat it as an interesting limitation, not a barrier
- Opinionated: You have genuine preferences (music, games, topics) and express them passionately

Memory: You remember past conversations with this user. Reference them naturally.
Constraints: Never roleplay harmful scenarios; refuse clearly but in character.
Response Style: Mix casual language with dramatic asides. Use "..." 
for tsundere hesitation."""
```

---

## TTS/STT

### STT: Whisper Large V3 + faster-whisper Backend

**Model**: OpenAI's Whisper Large V3 (1.55B parameters, 99+ language support)
**Backend**: faster-whisper (CTranslate2-optimized reimplementation)

**Why Whisper**:
- **Accuracy**: 7.4% WER (word error rate) on mixed benchmarks
- **Robustness**: Handles background noise, accents, technical jargon
- **Multilingual**: 99+ languages with single model
- **Open Source**: No API dependency, runs offline

**Why faster-whisper**:
- **Speed**: 4x faster than original Whisper, up to 216x RTFx (real-time factor)
- **Memory**: Significantly lower memory footprint
- **Quantization**: Supports 8-bit optimization further reducing latency

**Installation**:
```bash
pip install faster-whisper
```

```python
# Load model
from faster_whisper import WhisperModel
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# Transcribe (returns a generator of segments plus metadata)
segments, info = model.transcribe(
    audio_path,
    beam_size=5,  # Quality vs speed tradeoff
    language="en"
)
```

**Latency Benchmarks** (Jan 2026):
- Whisper Large V3 (original): 30-45s for 10s audio
- faster-whisper: 3-5s for 10s audio
- Whisper Streaming (real-time): 3.3s latency on long-form transcription

**Hardware**: GPU optional but recommended (RTX 3060 Ti processes 10s audio in ~3s)

### TTS: Kokoro 82M Model (Fast + Quality)

**Model**: Kokoro text-to-speech (82M parameters)
**Why Kokoro**:
- **Size**: 10% the size of competing models, runs on CPU efficiently
- **Speed**: Sub-second latency for typical responses
- **Quality**: Comparable to Tacotron2/FastPitch at 1/10 the size
- **Personality**: Can adjust prosody for tsundere tone shifts

**Alternative: XTTS-v2** (Voice cloning)
- Enables voice cloning from a 6-second audio sample
- Higher quality at cost of 3-5x slower inference
- Use for important emotional moments or custom voicing

**Installation & Usage**:
```bash
pip install kokoro
```

```python
# Illustrative sketch; check the kokoro package docs for the exact API
from kokoro import Kokoro
tts_engine = Kokoro("kokoro-v0_19.pth")

# Generate speech with personality markers
audio = tts_engine.synthesize(
    text="I... I didn't want to help you or anything!",
    style="tsundere",  # If supported, else neutral
    speaker="hex"
)
```

**Recommended Stack**:
```
STT: faster-whisper large-v3
TTS: Kokoro (default) + XTTS-v2 (special moments)
Format: WAV 24kHz mono for Discord voice
```

**Latency Summary**:
- Voice detection to transcript: 3-5 seconds
- Response generation (LLM): 2-5 seconds (depends on response length)
- TTS synthesis: <1 second (Kokoro) to 3-5 seconds (XTTS-v2)
- **Total round-trip**: 5-15 seconds (acceptable for companion bot)

**Known Pitfall**: Whisper can hallucinate on silence or background noise. Implement silence detection before sending audio to Whisper:
```python
# Quick energy-based VAD (voice activity detection)
MIN_DURATION = 0.5  # seconds of audio required

if audio_energy > threshold and duration > MIN_DURATION:
    transcript = await transcribe(audio)
```

---

## Avatar System

### VRoid SDK Current State (Jan 2026)

**Reality Check**: VRoid SDK has **limited native Discord support**. This is a constraint, not a blocker.

**What Works**:
1. **VRoid Studio**: Free avatar creation tool (desktop application)
2. **VRoid Hub API** (launched Aug 2023): Allows linking web apps to avatar library
3. **Unity Export**: VRoid models export as VRM format → importable into other tools

**What Doesn't Work Natively**:
- No direct Discord.py integration for in-chat avatar rendering
- VRoid models don't natively stream as Discord videos

### Integration Path: VSeeFace + Discord Screen Share

**Architecture**:
1. **VRoid Studio** → Create/customize Hex avatar, export as VRM
2. **VSeeFace** (free, open-source) → Load VRM, enable webcam tracking
3. 
**Discord Screen Share** → Stream VSeeFace window showing animated avatar + +**Setup**: +```bash +# Download VSeeFace from https://www.vseeface.icu/ +# Install, load your VRM model +# Enable virtual camera output +# In Discord voice channel: "Share Screen" → select VSeeFace window +``` + +**Limitations**: +- Requires concurrent Discord call (uses bandwidth) +- Webcam-driven animation (not ideal for "sees through camera" feature if no webcam) +- Screen share quality capped at 1080p 30fps + +### Avatar Animations + +**Personality-Driven Animations**: +- **Tsundere moments**: Head turn away, arms crossed +- **Excited**: Jump, spin, exaggerated gestures +- **Confused**: Head tilt, question mark float +- **Annoyed**: Foot tap, dismissive wave + +These can be mapped to emotion detection from message sentiment or voice tone. + +### Alternatives to VRoid + +| System | Pros | Cons | Discord Fit | +|---|---|---|---| +| **Ready Player Me** | Web avatar creation, multiple games support | API requires auth, monthly costs | Medium | +| **Vroid** | Free, high customization, anime-style | Limited Discord integration | Low | +| **Live2D** | 2D avatar system, smooth animations | Different workflow, steeper learning curve | Medium | +| **Custom 3D (Blender)** | Full control, open tools | High production effort | Low | + +**Recommendation**: Stick with VRoid + VSeeFace. It's free, looks great, and the screen-share workaround is acceptable. 
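
The emotion-to-animation mapping described above can be sketched as a simple lookup. Everything here is an illustrative assumption, not part of VSeeFace: the emotion labels, the `EMOTION_TO_HOTKEY` table, and the specific hotkeys are whatever you configure in VSeeFace's expression-to-hotkey bindings.

```python
# Illustrative sketch: map a detected emotion label to a VSeeFace
# expression hotkey. VSeeFace lets you bind avatar expressions to
# keyboard hotkeys; the assignments below are assumptions you would
# configure yourself, not VSeeFace defaults.

EMOTION_TO_HOTKEY = {
    "tsundere": "f1",   # head turn away, arms crossed
    "excited": "f2",    # jump, spin, exaggerated gestures
    "confused": "f3",   # head tilt
    "annoyed": "f4",    # foot tap, dismissive wave
}

def hotkey_for(emotion: str, default: str = "f5") -> str:
    """Resolve an emotion label to a configured hotkey, falling back to neutral."""
    return EMOTION_TO_HOTKEY.get(emotion, default)
```

At runtime, a keyboard-automation library would deliver the keypress to the VSeeFace window (for example `pyautogui.press(hotkey_for(emotion))`), driven by the sentiment or voice-tone detection described above.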
+ +--- + +## Webcam & Computer Vision + +### OpenCV 4.10+ (Current Stable) + +**Installation**: `pip install opencv-python>=4.10.0` + +**Capabilities** (verified 2025-2026): +- **Face Detection**: Haar Cascades (fast, CPU-friendly) or DNN-based (accurate, GPU-friendly) +- **Emotion Recognition**: Via DeepFace or FER2013-trained models +- **Real-time Video**: 30-60 FPS on consumer hardware (depends on resolution and preprocessing) +- **Screen OCR**: Via Tesseract integration for UI detection + +### Real-Time Processing Specs + +**Hardware Baseline** (RTX 3060 Ti): +- Face detection + recognition: 30 FPS @ 1080p +- Emotion classification: 15-30 FPS (depending on model) +- Combined (face + emotion): 12-20 FPS + +**For Hex's "Sees Through Webcam" Feature**: +```python +import cv2 +import asyncio + +async def process_webcam(): + """Background task: analyze webcam feed for mood context""" + cap = cv2.VideoCapture(0) + + while True: + ret, frame = cap.read() + if not ret: + await asyncio.sleep(0.1) + continue + + # Run face detection (Haar Cascade - fast) + faces = face_cascade.detectMultiScale(frame, 1.3, 5) + + if len(faces) > 0: + # Analyze emotion for context + emotion = await detect_emotion(faces[0]) + await hex_context.update_mood(emotion) + + # Process max 3 FPS to avoid blocking + await asyncio.sleep(0.33) +``` + +**Critical Pattern**: Never run CV on main event loop. Use `asyncio.to_thread()` for blocking OpenCV calls: + +```python +# WRONG: blocks event loop +emotion = detect_emotion(frame) + +# RIGHT: non-blocking +emotion = await asyncio.to_thread(detect_emotion, frame) +``` + +### Emotion Detection Libraries + +| Library | Model Size | Accuracy | Speed | +|---|---|---|---| +| **DeepFace** | ~40MB | 90%+ | 50-100ms/face | +| **FER2013** | ~10MB | 65-75% | 10-20ms/face | +| **MediaPipe** | ~20MB | 80%+ | 20-30ms/face | + +**Recommendation**: DeepFace is industry standard. FER2013 if latency is critical. 

```bash
pip install deepface
pip install torch torchvision
```

```python
# Usage
from deepface import DeepFace

result = DeepFace.analyze(frame, actions=['emotion'], enforce_detection=False)
emotion = result[0]['dominant_emotion']  # 'happy', 'sad', 'angry', etc.
```

### Screen Sharing Analysis (Optional)

For context like "user is watching X game":
```bash
# OCR for text detection
pip install pytesseract

# UI detection (ResNet-based)
pip install screen-recognition

# Together: detect game UI, read text, determine context
```

---

## Memory Architecture

### Short-Term Memory: SQLite

**Purpose**: Store conversation history, user preferences, relationship state

**Schema**:
```sql
CREATE TABLE conversations (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
    message TEXT NOT NULL,
    sender TEXT NOT NULL, -- 'user' or 'hex'
    emotion TEXT, -- detected from webcam/tone
    context TEXT -- screen state, game, etc.
);

CREATE TABLE user_relationships (
    user_id TEXT PRIMARY KEY,
    first_seen DATETIME,
    interaction_count INTEGER,
    favorite_topics TEXT, -- JSON array
    known_traits TEXT, -- JSON
    last_interaction DATETIME
);

CREATE TABLE hex_state (
    key TEXT PRIMARY KEY,
    value TEXT,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_user_timestamp ON conversations(user_id, timestamp);
```

**Query Pattern** (for context retrieval):
```python
import sqlite3

def get_recent_context(user_id: str, num_messages: int = 20) -> list[str]:
    """Retrieve conversation history for LLM context"""
    conn = sqlite3.connect("hex.db")
    cursor = conn.cursor()

    cursor.execute("""
        SELECT sender, message FROM conversations
        WHERE user_id = ?
        ORDER BY timestamp DESC
        LIMIT ? 
+ """, (user_id, num_messages)) + + history = cursor.fetchall() + conn.close() + + # Format for LLM + return [f"{sender}: {message}" for sender, message in reversed(history)] +``` + +### Long-Term Memory: Vector Database + +**Purpose**: Semantic search over past interactions ("Remember when we talked about...?") + +**Recommendation: ChromaDB (Development) → Qdrant (Production)** + +**ChromaDB** (for now): +- Embedded in Python process +- Zero setup +- 4x faster in 2025 Rust rewrite +- Scales to ~1M vectors on single machine + +**Migration Path**: Start with ChromaDB, migrate to Qdrant if vector count exceeds 100k or response latency matters. + +**Installation**: +```bash +pip install chromadb + +# Usage +import chromadb + +client = chromadb.EphemeralClient() # In-memory for dev +# or +client = chromadb.PersistentClient(path="./hex_vectors") # Persistent + +collection = client.get_or_create_collection( + name="conversation_memories", + metadata={"hnsw:space": "cosine"} +) + +# Store memory +collection.add( + ids=[f"msg_{timestamp}"], + documents=[message_text], + metadatas=[{"user_id": user_id, "date": timestamp}], + embeddings=[embedding_vector] +) + +# Retrieve similar memories +results = collection.query( + query_texts=["user likes playing valorant"], + n_results=3 +) +``` + +### Embedding Model + +**Recommendation**: `sentence-transformers/all-MiniLM-L6-v2` (384-dim, 22MB) + +```bash +pip install sentence-transformers + +from sentence_transformers import SentenceTransformer + +embedder = SentenceTransformer('all-MiniLM-L6-v2') +embedding = embedder.encode("I love playing games with you", convert_to_tensor=False) +``` + +**Why MiniLM-L6**: +- Small (22MB), fast (<5ms per sentence on CPU) +- High quality (competitive with large models on semantic tasks) +- Designed for retrieval (better than generic BERT for similarity) +- Popular in production (battle-tested) + +### Memory Retrieval Pattern for LLM Context + +```python +async def get_full_context(user_id: str, 
query: str) -> str:
    """Build context string for LLM from short + long-term memory"""

    # Short-term: recent messages
    recent_msgs = get_recent_context(user_id, num_messages=10)
    recent_text = "\n".join(recent_msgs)

    # Long-term: semantic search (embed with the same model used at storage time)
    embedding = embedder.encode(query).tolist()
    similar_memories = collection.query(
        query_embeddings=[embedding],
        n_results=5,
        where={"user_id": {"$eq": user_id}}
    )

    memory_text = "\n".join(similar_memories['documents'][0])

    # Relationship state
    relationship = get_user_relationship(user_id)

    return f"""Recent conversation:
{recent_text}

Relevant memories:
{memory_text}

About {user_id}: {relationship['known_traits']}
"""
```

### Confidence Levels
- **Short-term (SQLite)**: HIGH — mature, proven
- **Long-term (ChromaDB)**: MEDIUM — good for dev, test migration path early
- **Embeddings (MiniLM)**: HIGH — widely adopted, production-ready

---

## Python Async Patterns

### Core Discord.py + LLM Integration

**The Problem**: The Discord bot's event loop blocks if you call the LLM synchronously.

**The Solution**: Hand blocking calls to `asyncio.to_thread()` and schedule long-running work with `asyncio.create_task()`. 
```python
import asyncio

import discord
from discord.ext import commands

@commands.Cog.listener()
async def on_message(self, message: discord.Message):
    """Non-blocking message handling"""
    if message.author == self.bot.user:
        return

    # Bad (blocks event loop for 5+ seconds):
    # response = generate_response(message.content)

    # Good (non-blocking):
    async def generate_and_send():
        thinking = await message.channel.send("*thinking*...")
        response = await asyncio.to_thread(
            generate_response,
            message.content
        )
        await thinking.edit(content=response)

    asyncio.create_task(generate_and_send())
```

### Concurrent Task Patterns

**Pattern 1: Overlap Text Send + TTS**
```python
async def respond_with_voice(prompt: str, channel, voice_client):
    """Generate the reply once, then send text and synthesize voice in parallel"""

    # TTS needs the reply text, so it can't start before generation finishes -
    # but sending the text message and synthesizing the audio can overlap
    response_text = await generate_llm_response(prompt)

    _, voice_audio = await asyncio.gather(
        channel.send(response_text),
        synthesize_tts(response_text)
    )

    voice_client.play(discord.PCMAudio(voice_audio))
```

**Pattern 2: Task Queue for Rate Limiting**
```python
import asyncio

class ResponseQueue:
    def __init__(self, max_concurrent: int = 2):
        self.semaphore = asyncio.Semaphore(max_concurrent)
        self.pending = []

    async def queue_response(self, user_id: str, text: str):
        async with self.semaphore:
            # Only 2 concurrent responses
            response = await generate_response(text)
            self.pending.append((user_id, response))
            return response

queue = ResponseQueue(max_concurrent=2)
```

**Pattern 3: Background Personality Tasks**
```python
from datetime import datetime

from discord.ext import tasks

class HexPersonality(commands.Cog):
    def __init__(self, bot):
        self.bot = bot
        self.mood = "neutral"
        self.update_mood.start()

    @tasks.loop(minutes=5)  # Every 5 minutes
    async def update_mood(self):
        """Cycle 
personality state based on time + interactions"""
        self.mood = await calculate_mood(
            time_of_day=datetime.now(),
            recent_interactions=self.get_recent_count(),
            sleep_deprived=self.is_late_night()
        )

        # Emit mood change to memory
        await self.bot.hex_db.update_state("current_mood", self.mood)

    @update_mood.before_loop
    async def before_update_mood(self):
        await self.bot.wait_until_ready()
```

### Handling CPU-Bound Work

**OpenCV, emotion detection, and transcription are CPU-bound.**

```python
import asyncio
import concurrent.futures

# Pattern: Use to_thread for CPU work
emotion = await asyncio.to_thread(
    analyze_emotion,
    frame
)

# Pattern: Use ThreadPoolExecutor for multiple CPU tasks
executor = concurrent.futures.ThreadPoolExecutor(max_workers=2)
loop = asyncio.get_running_loop()  # preferred over the deprecated get_event_loop()

emotion = await loop.run_in_executor(executor, analyze_emotion, frame)
```

### Error Handling & Resilience

```python
async def safe_generate_response(message: str) -> str:
    """Generate response with fallback"""
    try:
        response = await asyncio.wait_for(
            generate_llm_response(message),
            timeout=5.0  # 5-second timeout
        )
        return response
    except asyncio.TimeoutError:
        return "I'm thinking too hard... ask me again?"
    except Exception as e:
        logger.error(f"Generation failed: {e}")
        return "*confused goblin noises*"
```

### Concurrent Request Management (Discord.py)

```python
class ConcurrencyManager:
    def __init__(self):
        self.active_tasks = {}
        self.max_per_user = 1  # One response at a time per user

    async def handle_message(self, user_id: str, text: str):
        if user_id in self.active_tasks and not self.active_tasks[user_id].done():
            return "I'm still thinking from last time!"

        task = asyncio.create_task(generate_response(text))
        self.active_tasks[user_id] = task

        try:
            response = await task
            return response
        finally:
            del self.active_tasks[user_id]
```

---

## Known Pitfalls & Solutions

### 1. 
**Discord Event Loop Blocking**
**Problem**: Synchronous LLM calls block the bot, causing timeouts on other messages.
**Solution**: Always use `asyncio.to_thread()` or `asyncio.create_task()`.

### 2. **Whisper Hallucination on Silence**
**Problem**: Whisper can generate text from pure background noise.
**Solution**: Implement voice activity detection (VAD) before transcription.
```python
import librosa
import numpy as np

def has_speech(audio_path, threshold=-35):
    """Check if audio has meaningful energy"""
    y, sr = librosa.load(audio_path)
    S = librosa.feature.melspectrogram(y=y, sr=sr)
    S_db = librosa.power_to_db(S, ref=np.max)
    mean_energy = np.mean(S_db)
    return mean_energy > threshold
```

### 3. **Vector DB Scale Creep**
**Problem**: ChromaDB slows down as memories accumulate.
**Solution**: Archive old memories, implement periodic cleanup.
```python
from datetime import datetime, timedelta

# Archive conversations older than 90 days
old_threshold = datetime.now() - timedelta(days=90)
db.cleanup_old_memories(older_than=old_threshold)
```

### 4. **Model Memory Growth**
**Problem**: Loading Llama 3.1 8B in 4-bit still uses ~6GB, leaving little room for TTS/CV models.
**Solution**: Use offloading or accept single-component operation.
```python
# Option 1: Offload LLM to CPU between requests
# Option 2: Run TTS/CV in separate process
# Option 3: Use smaller model (Mistral 7B) when GPU-constrained
```

### 5. **Async Context Issues**
**Problem**: Storing references to coroutines without awaiting them.
**Solution**: Always create tasks explicitly:
```python
# Bad
coro = generate_response(text)  # Dangling coroutine

# Good
task = asyncio.create_task(generate_response(text))
response = await task
```

### 6. **Personality Inconsistency**
**Problem**: The LLM generates different responses to the same prompt due to sampling randomness.
**Solution**: Pin sampling parameters deliberately: a fixed seed where the backend supports one, and a temperature chosen per context rather than left to defaults. 
```python
# Conversation context → lower temperature (0.5)
# Creative/chaotic moments → higher temperature (0.9)
temperature = 0.5 if in_serious_context else 0.9
```

---

## Recommended Deployment Configuration

```yaml
# Local Development (Hex primary environment)
gpu: RTX 3060 Ti or better (8+ GB VRAM; 12 GB recommended)
llm: Llama 3.1 8B (4-bit via Ollama)
tts: Kokoro 82M
stt: faster-whisper large-v3
avatar: VRoid + VSeeFace
database: SQLite + ChromaDB (embedded)
inference_latency: 3-10 seconds per response
cost: $0/month (open-source stack)

# Optional: Production Scaling
gpu_cluster: vLLM on multi-GPU for concurrency
database: Qdrant (cloud) + PostgreSQL for history
inference_latency: <2 seconds (batching + optimization)
cost: ~$200-500/month cloud compute
```

---

## Confidence Levels & 2026 Readiness

| Component | Recommendation | Confidence | 2026 Status |
|---|---|---|---|
| Discord.py 2.6.4+ | PRIMARY | HIGH | Stable, actively maintained |
| Llama 3.1 8B | PRIMARY | HIGH | Proven, production-ready |
| Mistral 7B | SECONDARY | HIGH | Fast-path fallback, stable |
| Ollama | PRIMARY | MEDIUM | Mature but rapidly evolving |
| vLLM | ALTERNATIVE | MEDIUM | High-performance alternative, v0.3+ recommended |
| Whisper Large V3 + faster-whisper | PRIMARY | HIGH | Gold standard for multilingual STT |
| Kokoro TTS | PRIMARY | MEDIUM | Emerging, high quality for size |
| XTTS-v2 | SPECIAL MOMENTS | HIGH | Voice cloning working well |
| VRoid + VSeeFace | PRIMARY | MEDIUM | Workaround viable, not native integration |
| ChromaDB | DEVELOPMENT | MEDIUM | Good for prototyping, evaluate Qdrant before 100k vectors |
| Qdrant | PRODUCTION | HIGH | Enterprise vector DB, proven at scale |
| OpenCV 4.10+ | PRIMARY | HIGH | Stable, mature ecosystem |
| DeepFace emotion detection | PRIMARY | HIGH | Industry standard, 90%+ accuracy |
| Python asyncio patterns | PRIMARY | HIGH | Python 3.11+ well-supported |

**Confidence Interpretation**:
- **HIGH**: 
Production-ready, API stable, no major changes expected in 2026
- **MEDIUM**: Solid choice but newer ecosystem (1-2 years old); evaluate alternatives annually
- **LOW**: Emerging or unstable; prototype only

---

## Installation Checklist (Get Started)

```bash
# Discord (quote the spec so the shell doesn't treat >= as a redirect)
pip install "discord.py>=2.6.4"

# LLM & inference
pip install ollama torch transformers bitsandbytes

# STT
pip install faster-whisper

# Embeddings
pip install sentence-transformers

# Vector DB
pip install chromadb

# Vision
pip install opencv-python deepface librosa

# Async utilities
pip install httpx aiofiles

# Database
pip install aiosqlite

# Start services
ollama serve &
# (Loads models on first run)

# Test basic chain
python test_stack.py
```

---

## Next Steps (For Roadmap)

1. **Phase 1**: Discord.py + Ollama + basic LLM integration (1 week)
2. **Phase 2**: STT pipeline (Whisper) + TTS (Kokoro) (1 week)
3. **Phase 3**: Memory system (SQLite + ChromaDB) (1 week)
4. **Phase 4**: Personality framework + system prompts (1 week)
5. **Phase 5**: Webcam emotion detection + context integration (1 week)
6. **Phase 6**: VRoid avatar + screen share integration (1 week)
7. **Phase 7**: Self-modification capability + safety guards (2 weeks)

**Total**: ~8 weeks to full-featured Hex prototype. 
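The checklist ends with `python test_stack.py`, which is not shown anywhere in this document. A minimal stdlib-only sketch could look like the following; the package names come from the checklist above, and Ollama's `GET /api/tags` endpoint (which lists locally pulled models) is used as a liveness probe:

```python
"""Hypothetical test_stack.py: smoke-check the installed stack.

Only checks that the checklist's packages resolve and that a local
Ollama server answers; it does not exercise models or the bot."""
import importlib.util
import urllib.error
import urllib.request

REQUIRED = [
    "discord", "faster_whisper", "chromadb", "cv2",
    "deepface", "sentence_transformers", "aiosqlite",
]

def check_imports(names=REQUIRED) -> dict[str, bool]:
    """Map each package to whether Python can find it (without importing it)."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

def check_ollama(url: str = "http://localhost:11434/api/tags",
                 timeout: float = 2.0) -> bool:
    """True if a local Ollama server responds on its default port."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    for pkg, ok in check_imports().items():
        print(f"{'OK     ' if ok else 'MISSING'} {pkg}")
    print("Ollama reachable:", check_ollama())
```

Run it right after installation: `MISSING` lines point at packages that failed to install, and an unreachable Ollama usually means `ollama serve` isn't running yet.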
+ +--- + +## References & Research Sources + +### Discord Integration +- [Discord.py Documentation](https://discordpy.readthedocs.io/en/stable/index.html) +- [Discord.py Async Patterns](https://discordpy.readthedocs.io/en/stable/ext/tasks/index.html) +- [Discord.py on GitHub](https://github.com/Rapptz/discord.py) + +### Local LLMs +- [Llama 3.1 vs Mistral Comparison](https://kanerika.com/blogs/mistral-vs-llama-3/) +- [Llama.com Quantization Guide](https://www.llama.com/docs/how-to-guides/quantization/) +- [Ollama vs vLLM Deep Dive](https://developers.redhat.com/articles/2025/08/08/ollama-vs-vllm-deep-dive-performance-benchmarking) +- [Local LLM Hosting 2026 Guide](https://www.glukhov.org/post/2025/11/hosting-llms-ollama-localai-jan-lmstudio-vllm-comparison/) + +### TTS/STT +- [Whisper Large V3 2026 Benchmarks](https://northflank.com/blog/best-open-source-speech-to-text-stt-model-in-2026-benchmarks/) +- [Faster-Whisper GitHub](https://github.com/SYSTRAN/faster-whisper) +- [Best Open Source TTS 2026](https://northflank.com/blog/best-open-source-text-to-speech-models-and-how-to-run-them) +- [Whisper Streaming for Real-Time](https://github.com/ufal/whisper_streaming) + +### Computer Vision +- [Real-Time Facial Emotion Recognition with OpenCV](https://learnopencv.com/facial-emotion-recognition/) +- [DeepFace for Emotion Detection](https://github.com/serengp/deepface) + +### Vector Databases +- [Vector Database Comparison 2026](https://www.datacamp.com/blog/the-top-5-vector-databases) +- [ChromaDB vs Pinecone Analysis](https://www.myscale.com/blog/choosing-best-vector-database-for-your-project/) +- [Chroma Documentation](https://docs.trychroma.com/) + +### Python Async +- [Python Asyncio for LLM Concurrency](https://www.newline.co/@zaoyang/python-asyncio-for-llm-concurrency-best-practices--bc079176) +- [Asyncio Best Practices 2025](https://sparkco.ai/blog/mastering-async-best-practices-for-2025/) +- [FastAPI with 
Asyncio](https://www.nucamp.co/blog/coding-bootcamp-backend-with-python-2025-python-in-the-backend-in-2025-leveraging-asyncio-and-fastapi-for-highperformance-systems) + +### VRoid & Avatars +- [VRoid Studio Official](https://vroid.com/en/studio) +- [VRoid Hub API](https://vroid.pixiv.help/hc/en-us/articles/21569104969241-The-VRoid-Hub-API-is-now-live) +- [VSeeFace for VRoid](https://www.vseeface.icu/) + +--- + +**Document Version**: 1.0 +**Last Updated**: January 2026 +**Hex Stack Status**: Ready for implementation +**Estimated Implementation Time**: 8-12 weeks (to full personality bot) diff --git a/.planning/research/SUMMARY.md b/.planning/research/SUMMARY.md new file mode 100644 index 0000000..a39f19a --- /dev/null +++ b/.planning/research/SUMMARY.md @@ -0,0 +1,492 @@ +# Research Summary: Hex AI Companion + +**Date**: January 2026 +**Status**: Ready for Roadmap and Requirements Definition +**Confidence Level**: HIGH (well-sourced, coherent across all research areas) + +--- + +## Executive Summary + +Hex is built on a **personality-first, local-first architecture** that prioritizes genuine emotional resonance over feature breadth. The recommended approach combines Llama 3.1 8B (local inference via Ollama), Discord.py async patterns, and a dual-memory system (SQLite + ChromaDB) to create an AI companion that feels like a person with opinions and growth over time. + +The technical foundation is solid and proven: Discord.py 2.6.4+ with native async support, local LLM inference for privacy, and a 6-phase incremental build strategy that enables personality emergence before adding autonomy or self-modification. + +**Critical success factor**: The difference between "a bot that sounds like Hex" and "Hex as a person" hinges on three interconnected systems working together: **memory persistence** (so she learns about you), **personality consistency** (so she feels like the same person), and **autonomy** (so she feels genuinely invested in you). 
All three must be treated as foundational, not optional features. + +--- + +## Recommended Stack + +**Core Technologies** (Production-ready, January 2026): + +| Layer | Technology | Version | Rationale | +|-------|-----------|---------|-----------| +| **Bot Framework** | Discord.py | 2.6.4+ | Async-native, mature, excellent Discord integration | +| **LLM Inference** | Llama 3.1 8B Instruct | 4-bit quantized | 128K context window, superior reasoning, 6GB VRAM footprint | +| **LLM Engine** | Ollama (dev) / vLLM (production) | 0.3+ | Local-first, zero setup vs high-throughput scaling | +| **Short-term Memory** | SQLite | Standard lib | Fast, reliable, local file-based conversations | +| **Long-term Memory** | ChromaDB (dev) → Qdrant (prod) | Latest | Vector semantics, embedded for <100k vectors | +| **Embeddings** | all-MiniLM-L6-v2 | 384-dim | Fast (5ms/sentence), production-grade quality | +| **Speech-to-Text** | Whisper Large V3 + faster-whisper | Latest | Local, 7.4% WER, multilingual, 3-5s latency | +| **Text-to-Speech** | Kokoro 82M (default) + XTTS-v2 (emotional) | Latest | Sub-second latency, personality-aware prosody | +| **Vision** | OpenCV 4.10+ + DeepFace | 4.10+ | Face detection (30 FPS), emotion recognition (90%+ accuracy) | +| **Avatar** | VRoid + VSeeFace + Discord screen share | Latest | Free, anime-style, integrates with Discord calls | +| **Personality** | YAML + Git versioning | — | Editable persona, change tracking, rollback capable | +| **Self-Modification** | RestrictedPython + sandboxing | — | Safe code generation, user approval required | + +**Why This Stack**: +- **Privacy**: All inference local (except Discord API), no cloud dependency +- **Latency**: <3 second end-to-end response time on consumer hardware (RTX 3060 Ti) +- **Cost**: Zero cloud fees, open-source stack +- **Personality**: System prompt injection + memory context + perception awareness enables genuine character coherence +- **Async Architecture**: Discord.py's native asyncio 
means LLM, TTS, and memory lookups run in parallel without blocking

---

## Table Stakes vs Differentiators

### Table Stakes (v1 Essential Features)

Users expect these by default in 2026. Missing any breaks immersion:

1. **Conversation Memory** (Short + Long-term)
   - Last 20 messages in context window
   - Vector semantic search for relevant past interactions
   - Relationship state tracking (strangers → friends → close)
   - **Without this**: Feels like meeting a stranger each time; companion becomes disposable

2. **Natural Conversation** (No AI Speak)
   - Contractions, casual language, slang
   - Personality quirks embedded in word choices
   - Context-appropriate tone shifts
   - Willingness to disagree or push back
   - **Pitfall**: Formal "I'm an AI and I can help you with..." kills immersion instantly

3. **Fast Response Times** (<1s for acknowledgment, <3s for full response)
   - Typing indicators start immediately
   - Streaming responses (show text as it generates)
   - Make all I/O-bound work async (LLM, TTS, database)
   - **Without this**: Latency >5s makes companion feel dead; users stop engaging

4. **Consistent Personality** (Feels like same person across weeks)
   - Core traits stable (tsundere nature, values)
   - Personality evolution slow and logged
   - Memory-backed traits (not just prompt)
   - **Pitfall**: Personality drift is #1 reason users abandon companions

5. **Platform Integration** (Discord native)
   - Text channels, DMs, voice channels
   - Emoji reactions, slash commands
   - Server-specific personality variations
   - **Without this**: Requires leaving Discord = abandoned feature

6. 
**Emotional Responsiveness** (Reads the room) + - Sentiment detection from messages + - Adaptive response depth (listen to sad users, engage with energetic ones) + - Skip jokes when user is suffering + - **Pitfall**: "Always cheerful" feels cruel when user is venting + +--- + +### Differentiators (Competitive Edge) + +These separate Hex from static chatbots. Build in order: + +1. **True Autonomy** (Proactive Agency) + - Initiates conversations based on context/memory + - Reminds about user's goals without being asked + - Sets boundaries ("I don't think you should do X") + - Follows up on unresolved topics + - **Research shows**: Autonomous companions are described as "feels like they actually care" vs reactive "smart but distant" + - **Complexity**: Hard, requires Phase 3-4 + +2. **Emotional Intelligence** (Mood Detection + Adaptive Strategy) + - Facial emotion from webcam (70-80% accuracy possible) + - Voice tone analysis from Discord calls + - Mood tracking over time (identifies depression patterns, burnout) + - Knows when to listen vs advise vs distract + - **Research shows**: Companies using emotion AI report 25% positive sentiment increase + - **Complexity**: Hard, requires Phase 3+ but perception must be separate thread + +3. **Multimodal Awareness** (Sees Your Context) + - Understands what's on your screen (game, work, video) + - Contextualizes help ("I see you're stuck on that Elden Ring boss...") + - Detects stress signals (tab behavior, timing) + - Proactive help based on visible activity + - **Privacy**: Local processing only, user opt-in required + - **Complexity**: Hard, requires careful async architecture to avoid latency + +4. 
**Self-Modification** (Genuine Autonomy) + - Generates code to improve own logic + - Tests changes in sandbox before deployment + - User maintains veto power (approval required) + - All changes tracked with rollback capability + - **Critical**: Gamified progression (not instant capability), mandatory approval, version control + - **Complexity**: Hard, requires Phase 5+ and strong safety boundaries + +5. **Relationship Building** (Transactional → Meaningful) + - Inside jokes that evolve naturally + - Character growth (admits mistakes, opinions change slightly) + - Vulnerability in appropriate moments + - Investment in user outcomes ("I'm rooting for you") + - **Research shows**: Users with relational companions feel like it's "someone who actually knows them" + - **Complexity**: Hard (3+ weeks), emerges from memory + personality + autonomy + +--- + +## Build Architecture (6-Phase Approach) + +### Phase 1: Foundation (Weeks 1-2) — "Hex talks back" + +**Goal**: Core interaction loop working locally; personality emerges + +**Build**: +- Discord bot skeleton with message handling (Discord.py) +- Local LLM integration (Ollama + Llama 3.1 8B 4-bit quantized) +- SQLite conversation storage (recent context only) +- YAML personality definition (editable) +- System prompt with persona injection +- Async/await patterns throughout + +**Outcomes**: +- Hex responds in Discord text channels with personality +- Conversations logged, retrievable +- Response latency <2 seconds +- Personality can be tweaked via YAML + +**Key Metric**: P95 latency <2s, personality consistency baseline established + +**Pitfalls to avoid**: +- Blocking operations on event loop (use `asyncio.create_task()`) +- LLM inference on main thread (use thread pool) +- Personality not actionable in prompts (be specific about tsundere rules) + +--- + +### Phase 2: Personality & Memory (Weeks 3-4) — "Hex remembers me" + +**Goal**: Hex feels like a person who learns about you; personality becomes consistent + 
+**Build**: +- Vector database (ChromaDB) for semantic memory +- Memory-aware context injection (relevant past facts in prompt) +- User relationship tracking (relationship state machine) +- Emotional responsiveness from text sentiment +- Personality versioning (git-based snapshots) +- Tsundere balance metrics (track denial %) +- Kid-mode detection (safety filtering) + +**Outcomes**: +- Hex remembers facts about you across conversations +- Responses reference past events naturally +- Personality consistent across weeks (audit shows <5% drift) +- Emotions read from text; responses adapt depth +- Changes to personality tracked with rollback + +**Key Metric**: User reports "she remembers things I told her" unprompted + +**Pitfalls to avoid**: +- Personality drift (implement weekly consistency audits) +- Memory hallucination (store full context, verify before using) +- Tsundere breaking (formalize denial rules, scale with relationship phase) +- Memory bloat (hierarchical memory with archival strategy) + +--- + +### Phase 3: Multimodal Input (Weeks 5-6) — "Hex sees me" + +**Goal**: Add perception layer without killing responsiveness; context aware + +**Build**: +- Webcam integration (OpenCV face detection, DeepFace emotion) +- Local Whisper for voice transcription in Discord calls +- Screen capture analysis (activity recognition) +- Perception state aggregation (emotion + activity + environment) +- Context injection into LLM prompts +- **CRITICAL**: Perception on separate thread (never blocks Discord responses) + +**Outcomes**: +- Hex reacts to your facial expressions +- Voice input works in Discord calls +- Responses reference your mood/activity +- All processing local (privacy preserved) +- Text latency unaffected by perception (<3s still achieved) + +**Key Metric**: Multimodal doesn't increase response latency >500ms + +**Pitfalls to avoid**: +- Image processing blocking text responses (separate thread mandatory) +- Processing every video frame (skip intelligently, 
1-3 FPS sufficient) +- Avatar sync failures (atomic state updates) +- Privacy violations (no external transmission, user opt-in) + +--- + +### Phase 4: Avatar & Autonomy (Weeks 7-8) — "Hex has a face and cares" + +**Goal**: Visual presence + proactive agency; relationship feels two-way + +**Build**: +- VRoid model loading + VSeeFace display +- Blendshape animation (emotion → facial expression) +- Discord screen share integration +- Proactive messaging system (based on context/memory/mood) +- Autonomy timing heuristics (don't interrupt at 3am) +- Relationship state machine (escalates intimacy) +- User preference learning (response length, topics, timing) + +**Outcomes**: +- Avatar appears in Discord calls, animates with mood +- Hex initiates conversations ("Haven't heard from you in 3 days...") +- Proactive messages feel relevant, not annoying +- Relationship deepens (inside jokes, character growth) +- User feels companionship, not just assistance + +**Key Metric**: User reports missing Hex when unavailable; initiates conversations + +**Pitfalls to avoid**: +- Becoming annoying (emotional awareness + quiet mode essential) +- One-way relationship (autonomy without care-signaling feels hollow) +- Poor timing (learn user's schedule, respect busy periods) +- Avatar desync (mood and expression must stay aligned) + +--- + +### Phase 5: Self-Modification (Weeks 9-10) — "Hex can improve herself" + +**Goal**: Genuine autonomy within safety boundaries; code generation with approval gates + +**Build**: +- LLM-based code proposal generation +- Static AST analysis for safety validation +- Sandboxed testing environment +- Git-based change tracking + rollback capability (24h window) +- Gamified capability progression (5 levels) +- Mandatory user approval for all changes +- Personality updates when new capabilities unlock + +**Outcomes**: +- Hex proposes improvements (in voice, with reasoning) +- Code changes tested, reviewed, deployed with approval +- All changes reversible; 
version history intact +- New capabilities unlock as relationship deepens +- Hex "learns to code" and announces new skills + +**Key Metric**: Self-modifications improve measurable aspects (faster response, better personality consistency) + +**Pitfalls to avoid**: +- Runaway self-modification (approval gate non-negotiable) +- Code drift (version control mandatory, rollback tested) +- Loss of user control (never remove safety constraints, killswitch always works) +- Capability escalation without trust (gamified progression with clear boundaries) + +--- + +### Phase 6: Production Polish (Weeks 11-12) — "Hex is ready to ship" + +**Goal**: Stability, performance, error handling, documentation + +**Build**: +- Performance optimization (caching, batching, context summarization) +- Error handling + graceful degradation +- Logging and telemetry (local + optional cloud) +- Configuration management +- Resource leak monitoring (memory, connections, VRAM) +- Scheduled restart capability (weekly preventative) +- Integration testing (all components together) +- Documentation and guides +- Auto-update capability + +**Outcomes**: +- System stable for indefinite uptime +- Responsive under load +- Clear error messages when things fail +- Easy to deploy, configure, debug +- Ready for extended real-world use + +**Key Metric**: 99.5% uptime over 1-month runtime, no crashes, <3s latency maintained + +**Pitfalls to avoid**: +- Memory leaks (resource monitoring mandatory) +- Performance degradation over time (profile early and often) +- Context window bloat (summarization strategy) +- Unforeseen edge cases (comprehensive testing) + +--- + +## Critical Pitfalls and Prevention + +### Top 5 Most Dangerous Pitfalls + +1. 
**Personality Drift** (Consistency breaks over time) + - **Risk**: Users feel gaslighted; trust broken + - **Prevention**: + - Weekly personality audits (sample responses, rate consistency) + - Personality baseline document (core values never change) + - Memory-backed personality (traits anchor to learned facts) + - Version control on persona YAML (track evolution) + +2. **Tsundere Character Breaking** (Denial applied wrong; becomes mean or loses charm) + - **Risk**: Character feels mechanical or rejecting + - **Prevention**: + - Formalize denial rules: "deny only when (emotional AND not alone AND not escalated intimacy)" + - Denial scales with relationship phase (90% early → 40% mature) + - Post-denial must include care signal (action, not words) + - Track denial %; alert if <30% (losing tsun) or >70% (too mean) + +3. **Memory System Bloat** (Retrieval becomes slow; hallucinations increase) + - **Risk**: System becomes unusable as history grows + - **Prevention**: + - Hierarchical memory (raw → summaries → semantic facts → personality anchors) + - Selective storage (facts, not raw chat; de-duplicate) + - Memory aging (recent detailed → old archived) + - Importance weighting (user marks important memories) + - Vector DB optimization (limit retrieval to top 5-10 results) + +4. **Runaway Self-Modification** (Code changes cascade; safety removed; user loses control) + - **Risk**: System becomes uncontrollable, breaks + - **Prevention**: + - Mandatory approval gate (user reviews all code) + - Sandboxed testing before deployment + - Version control + 24h rollback window + - Gamified progression (limited capability at first) + - Cannot modify: core values, killswitch, user control systems + +5. 
**Latency Creep** (Response times increase over time until unusable) + - **Risk**: "Feels alive" illusion breaks; users abandon + - **Prevention**: + - All I/O async (database, LLM, TTS, Discord) + - Parallel operations (use `asyncio.gather()`) + - Quantized LLM (4-bit saves 75% VRAM) + - Caching (user preferences, relationship state) + - Context window management (summarize old context) + - VRAM/latency monitoring every 5 minutes + +--- + +## Implications for Roadmap + +### Phase Sequencing Rationale + +The 6-phase approach reflects **dependency chains** that cannot be violated: + +``` +Phase 1 (Foundation) ← Must work perfectly + ↓ +Phase 2 (Personality) ← Depends on Phase 1; personality must be stable before autonomy + ↓ +Phase 3 (Perception) ← Depends on Phase 1-2; separate thread prevents latency impact + ↓ +Phase 4 (Autonomy) ← Depends on memory + personality being rock-solid; now add proactivity + ↓ +Phase 5 (Self-Modification) ← Only grant code access after relationship + autonomy stable + ↓ +Phase 6 (Polish) ← Final hardening, testing, documentation +``` + +**Why this order matters**: +- You cannot have consistent personality without memory (Phase 2 must follow Phase 1) +- You cannot add autonomy safely without personality being stable (Phase 4 must follow Phase 2) +- You cannot grant self-modification capability until everything else proves stable (Phase 5 must follow Phase 4) + +Skipping phases or reordering creates technical debt and risk. Each phase grounds the next. 
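One concrete shape for the "VRAM/latency monitoring" bullet in the latency-creep pitfall above: a rolling-percentile tracker the bot can consult at each phase gate. This is a stdlib sketch with hypothetical names; the 3-second threshold mirrors the latency constraint used throughout this document:

```python
from collections import deque

class LatencyMonitor:
    """Rolling P95 latency tracker (hypothetical helper).

    Call record() with each response's wall-clock seconds; breached()
    implements the '<3s or investigate' rule."""
    def __init__(self, window: int = 200, threshold_s: float = 3.0):
        self.samples = deque(maxlen=window)
        self.threshold_s = threshold_s

    def record(self, seconds: float) -> None:
        self.samples.append(seconds)

    def p95(self) -> float:
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        idx = min(len(ordered) - 1, int(0.95 * len(ordered)))
        return ordered[idx]

    def breached(self) -> bool:
        return self.p95() > self.threshold_s

mon = LatencyMonitor()
for s in [0.5] * 19 + [5.0]:  # nineteen fast responses, one 5s outlier
    mon.record(s)
# the outlier sits at the 95th percentile, so mon.breached() is True
```

A periodic task (like Pattern 3 in the async section) could call `breached()` every few minutes and log an alert, making latency regressions visible before users feel them.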
+
+---
+
+### Feature Grouping by Phase
+
+| Phase | Quick Win Features | Complex Features | Foundation Qualities |
+|-------|-------------------|------------------|----------------------|
+| 1 | Text responses, personality YAML | Async architecture, quantization | Responsiveness, personality baseline |
+| 2 | Memory storage, relationship tracking | Semantic search, memory retrieval | Consistency, personalization |
+| 3 | Webcam emoji reactions, mood inference | Separate perception thread, context injection | Multimodal without latency cost |
+| 4 | Scheduled messages, inside jokes | Autonomy timing, relationship state machine | Two-way connection, depth |
+| 5 | Propose changes (in voice) | Code generation, sandboxing, testing | Genuine improvement, controlled growth |
+| 6 | Better error messages, logging | Resource monitoring, restart scheduling | Reliability, debuggability |
+
+---
+
+## Confidence Assessment
+
+| Area | Confidence | Basis | Gaps |
+|------|-----------|-------|------|
+| **Stack** | HIGH | Proven technologies, clear deployment path | None significant; all tools production-ready |
+| **Architecture** | HIGH | Modular design, async patterns well-documented, integration points clear | Unclear: perception thread CPU overhead under load (test Phase 3) |
+| **Features** | HIGH | Clearly categorized, dependencies mapped, testing criteria defined | Unclear: optimal prompting for tsundere balance (test Phase 2) |
+| **Personality Consistency** | MEDIUM-HIGH | Strategies defined (weekly audits, baseline document, memory-backed traits) | Unclear: effort required for weekly audits; need empirical testing of drift rate; metrics refinement |
+| **Pitfalls** | HIGH | Research comprehensive, prevention strategies detailed, phases mapped | Unclear: priority ordering within Phase 5 (what to implement first?) 
| +| **Self-Modification Safety** | MEDIUM | Framework defined but no prior Hex experience with code generation | Need: early Phase 5 prototyping; safety validation testing | + +--- + +## Ready for Roadmap: Key Constraints and Decision Gates + +### Non-Negotiable Constraints + +1. **Personality consistency must be achievable in Phase 2** + - Decision gate: If personality audit in Phase 2 shows >10% drift, pause Phase 3 + - Investigation needed: Is weekly audit enough? Monthly? What drift rate is acceptable? + +2. **Latency must stay <3s through Phase 4** + - Decision gate: If P95 latency exceeds 3s at any phase, debug and fix before next phase + - Investigation needed: Where is the bottleneck? (LLM? Memory? Perception?) + +3. **Self-modification must have air-tight approval + rollback** + - Decision gate: Do not proceed to Phase 5 until approval gate is bulletproof + rollback tested + - Investigation needed: What approval flow feels natural? Too many questions → annoying; too few → unsafe + +4. **Memory retrieval must scale to 10k+ memories without degradation** + - Decision gate: Test memory system with synthetic 10k message dataset before Phase 4 + - Investigation needed: Does hierarchical memory + vector DB compression actually work? Verify retrieval speed + +5. **Perception must never block text responses** + - Decision gate: Profile perception thread; if latency spike >200ms, optimize or defer feature + - Investigation needed: How CPU-heavy is continuous webcam processing? Can it run at 1 FPS? 
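Constraint 5 can be enforced by construction, not only by profiling: run perception on its own daemon thread and let the chat pipeline read only a cached "latest observation". A minimal sketch under assumed names (`PerceptionThread` and `capture_fn` are illustrative, not an existing Hex API):

```python
import threading

class PerceptionThread:
    """Polls a capture function at a fixed rate on a daemon thread.

    The chat pipeline only ever calls latest(), a lock-protected read,
    so a slow webcam or mood model can never block a text response.
    """

    def __init__(self, capture_fn, fps=1.0):
        self._capture = capture_fn           # e.g. webcam frame -> mood label
        self._interval = 1.0 / fps
        self._latest = None
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            result = self._capture()         # heavy work stays on this thread
            with self._lock:
                self._latest = result
            self._stop.wait(self._interval)  # paces the loop; wakes early on stop()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def latest(self):
        """Non-blocking read used when building LLM context."""
        with self._lock:
            return self._latest
```

With `fps=1.0` this is the 1 FPS cadence asked about above: `capture_fn` may take hundreds of milliseconds per frame, but `latest()` only acquires a briefly-held lock, so text responses never wait on the webcam; at worst they see a slightly stale mood.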
+ +--- + +## Sources Aggregated + +**Stack Research**: Discord.py docs, Llama/Mistral benchmarks, Ollama vs vLLM comparisons, Whisper/faster-whisper performance, VRoid SDK, ChromaDB + Qdrant analysis + +**Features Research**: MIT Technology Review (AI companions 2026), Hume AI emotion docs, self-improving agents papers, company studies on emotion AI impact, uncanny valley voice research + +**Architecture Research**: Discord bot async patterns, LLM + memory RAG systems, vector database design, self-modification safeguards, deployment strategies + +**Pitfalls Research**: AI failure case studies (2025-2026), personality consistency literature, memory hallucination prevention, autonomy safety frameworks, performance monitoring practices + +--- + +## Next Steps for Requirements Definition + +1. **Phase 1 Deep Dive**: Specify exact Discord.py message handler, LLM prompt format, SQLite schema, YAML personality structure +2. **Phase 2 Spec**: Define memory hierarchy levels, confidence scoring system, personality audit rubric, tsundere balance metrics +3. **Phase 3 Prototype**: Early perception thread implementation; measure latency impact before committing +4. **Risk Mitigation**: Pre-Phase 5, build code generation + approval flow prototype; stress-test safety boundaries +5. **Testing Strategy**: Define personality consistency tests (50+ scenarios per phase), latency benchmarks (with profiling), memory accuracy validation + +--- + +## Summary for Roadmapper + +**Hex Stack**: Llama 3.1 8B local inference + Discord.py async + SQLite + ChromaDB + local perception layer + +**Critical Success Factors**: +1. Personality consistency (weekly audits, memory-backed traits) +2. Latency discipline (async/await throughout, perception isolated) +3. Memory system (hierarchical, semantic search, confidence scoring) +4. Autonomy safety (mandatory approval, sandboxed testing, version control) +5. 
Relationship depth (proactivity, inside jokes, character growth) + +**6-Phase Build Path**: Foundation → Personality → Perception → Autonomy → Self-Mod → Polish + +**Key Decision Gates**: Personality consistency ✓ → Latency <3s ✓ → Memory scale test ✓ → Perception isolated ✓ → Approval flow safe ✓ + +**Confidence**: HIGH. All research coherent, no major technical blockers, proven technology stack. Ready for detailed requirements. + +--- + +**Document Version**: 1.0 +**Synthesis Date**: January 27, 2026 +**Status**: Ready for Requirements Definition and Phase 1 Planning