Initial commit: Desktop Waifu MVP foundation

- Project structure with modular architecture
- State management system (emotion states, conversation history)
- Transparent, draggable PyQt6 window
- OpenGL rendering widget with placeholder cube
- Discord bot framework (commands, event handlers)
- Complete documentation (README, project plan, research findings)
- Environment configuration template
- Dependencies defined in requirements.txt

MVP features working:
- Transparent window appears at bottom-right
- Window can be dragged around
- Placeholder 3D cube renders and rotates
- Emotion state changes on interaction
- Event-driven state management

Next steps: VRM model loading and rendering
2025-09-30 18:42:54 -04:00
commit a657979bfd
20 changed files with 1178 additions and 0 deletions
--- a/PROJECT_PLAN.md
+++ b/PROJECT_PLAN.md
@@ -0,0 +1,130 @@
+# Desktop Waifu Project
+
+## Overview
+A desktop companion application, controlled by an LLM, that interacts with the user both on the desktop and on Discord.
+
+## Tech Stack
+- **Language**: Python
+- **Character Model**: VRM format
+- **LLM**: Local (TBD which model)
+- **Distribution**: .exe packaging
+- **Platforms**: Desktop app + Discord bot
+
+## Core Features
+
+### Desktop Visuals
+- VRM model rendering in transparent window
+- Draggable character
+- Sound effects on interaction (squeaks, touch sounds)
+- Multiple poses/expressions controlled by LLM
+- Always on top unless asked to hide
+- VRM animations (TBD - see notes below)
+
+### AI & Interaction
+- Local LLM (custom/TBD)
+- Memory/context persistence
+- AI-chosen personality
+- Always-on chat interface
+- Voice input (STT)
+- Voice output (TTS - local)
+
+### Discord Integration
+- Respond in servers/channels and DMs
+- Proactive messaging
+- Desktop-Discord state sync
+
+### System Integration
+- System access (notifications, apps, searches, etc.)
+- Designed for cross-platform deployment
+- Future: OS-level integration
+
+## VRM Animation Notes
+VRM models support:
+- **Blend shapes** (facial expressions): smile, blink, surprised, angry, sad, etc.
+- **Bone animations**: waving, pointing, head tilts, body movements
+- **Presets**: embedded animation clips, if the VRM file includes them
+- **IK (Inverse Kinematics)**: dynamic movements like looking at cursor
+
+We can trigger these based on:
+- LLM emotional state (happy → smile + wave)
+- User interaction (grabbed → surprised expression + squeak)
+- Idle states (occasional blinks, breathing animation)
+- Context (thinking → hand on chin pose)
+
+## MVP (Phase 1)
+1. VRM model renders on screen
+2. Transparent, draggable window
+3. Basic sound on interaction
+4. Simple chat interface (text only)
+5. Basic LLM connection (local)
+6. Simple expression changes (happy/neutral/sad)
+7. **Discord bot integration** (respond in servers/DMs, basic sync with desktop)
+
+## Post-MVP Features
+- Voice I/O (STT/TTS)
+- Advanced animations
+- Full system integration (notifications, app control, etc.)
+- Memory persistence (database)
+- Proactive messaging
+- .exe packaging
+- Cross-platform support
+
+## Architecture
+
+### Components
+1. **VRM Renderer** (PyOpenGL + VRM loader)
+2. **LLM Backend** (local inference)
+3. **Audio System** (TTS, STT, sound effects)
+4. **Discord Client** (discord.py)
+5. **State Manager** (sync between desktop/Discord)
+6. **System Interface** (OS interactions)
+7. **GUI Framework** (PyQt/tkinter with transparency)
+
+### Data Flow
+```
+User Input (voice/text/click)
+    ↓
+State Manager
+    ↓
+LLM Processing
+    ↓
+Output (animation + voice + text + actions)
+    ↓
+Desktop Display + Discord Bot
+```
+
+## Tech Stack Candidates
+
+### VRM Rendering
+- PyVRM (if available)
+- PyOpenGL + custom VRM parser
+- Unity Python wrapper (heavy)
+- Godot Python binding (alternative)
+
+### LLM
+- llama.cpp Python bindings
+- Ollama API
+- Custom model server
+- Transformers library
+
+### Voice
+- **TTS**: pyttsx3, Coqui TTS, XTTS
+- **STT**: Whisper (local), Vosk
+
+### Discord
+- discord.py
+
+### Packaging
+- PyInstaller
+- Nuitka (compiles Python to C; may give better runtime performance)
+
+## Current Status
+- Phase: Planning
+- Last Updated: 2025-09-30
+
+## Next Steps
+1. Research VRM rendering in Python
+2. Create basic window with transparency
+3. Load and display VRM model
+4. Implement dragging
+5. Add sound effects
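
The plan's data flow (user input, then State Manager, then output to the desktop display and the Discord bot) could be sketched roughly as below. This is a minimal illustration, not the code in this commit: `StateManager`, `Emotion`, `EVENT_TO_EMOTION`, `REACTIONS`, and the event names are all hypothetical, and the LLM step is replaced by a fixed lookup table so the example stays self-contained.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Emotion(Enum):
    NEUTRAL = "neutral"
    HAPPY = "happy"
    SAD = "sad"
    SURPRISED = "surprised"


# Hypothetical stand-in for the LLM: maps an input event to an emotion.
EVENT_TO_EMOTION = {
    "grabbed": Emotion.SURPRISED,
    "released": Emotion.NEUTRAL,
    "praised": Emotion.HAPPY,
}

# Hypothetical mapping from emotion to (expression, gesture),
# following the plan's "happy -> smile + wave" example.
REACTIONS = {
    Emotion.NEUTRAL: ("neutral", None),
    Emotion.HAPPY: ("smile", "wave"),
    Emotion.SAD: ("sad", None),
    Emotion.SURPRISED: ("surprised", None),
}

# A listener receives the new emotion, an expression name, and an optional gesture.
Listener = Callable[[Emotion, str, Optional[str]], None]


@dataclass
class StateManager:
    """Holds the shared state and notifies subscribers when the emotion changes."""

    emotion: Emotion = Emotion.NEUTRAL
    history: list[str] = field(default_factory=list)
    _listeners: list[Listener] = field(default_factory=list)

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def handle_event(self, event: str) -> None:
        # Every event is recorded, even if it does not change the emotion.
        self.history.append(event)
        new_emotion = EVENT_TO_EMOTION.get(event)
        if new_emotion is None or new_emotion is self.emotion:
            return
        self.emotion = new_emotion
        expression, gesture = REACTIONS[new_emotion]
        for listener in self._listeners:
            listener(new_emotion, expression, gesture)


if __name__ == "__main__":
    manager = StateManager()
    # Two subscribers stand in for the desktop renderer and the Discord bot.
    manager.subscribe(lambda e, expr, g: print(f"desktop: {expr}, gesture={g}"))
    manager.subscribe(lambda e, expr, g: print(f"discord: now {e.value}"))
    manager.handle_event("grabbed")
    manager.handle_event("praised")
```

Routing every change through one manager is one way to keep the desktop window and the Discord bot in sync: both subscribe to the same object and neither needs to know about the other.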