5 Commits

Author SHA1 Message Date
10ccdc2420 feat: add training data collection for Rosie
Personality Dataset (300+ examples):
- Greetings and farewells
- Emotions and reactions
- Physical interactions (pats, drags, touches)
- Questions and answers
- Help and support
- Jokes and entertainment
- Mood-based responses
- Conversation fillers
- Various user intents

Data Download Script:
- Download Project Gutenberg books (public domain)
- Instructions for OpenWebText (~8B tokens)
- Instructions for The Pile (~300B tokens)
- Automatic dataset combination
- Token counting and statistics
- Download progress bars
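
A rough sketch of the combination and token-counting steps listed above. The helper name `combine_datasets`, the record layout, and whitespace splitting as a stand-in for the real BPE tokenizer are all assumptions for illustration; the actual download script may work differently.

```python
import json


def combine_datasets(sources):
    """Merge {name: [text, ...]} into one record list plus per-source stats.

    Hypothetical helper: whitespace splitting only approximates a BPE token count.
    """
    records, stats = [], {}
    for name, texts in sources.items():
        documents = 0
        tokens = 0
        for text in texts:
            text = text.strip()
            if not text:
                continue  # skip empty documents
            records.append({"source": name, "text": text})
            documents += 1
            tokens += len(text.split())
        stats[name] = {"documents": documents, "approx_tokens": tokens}
    return records, stats


if __name__ == "__main__":
    demo_records, demo_stats = combine_datasets({
        "gutenberg": ["Call me Ishmael.", "   "],
        "personality": ["Hello there, friend!"],
    })
    print(json.dumps(demo_stats, indent=2))
```

The combined records could then be written with `json.dump` to a file such as the `data/combined_training.json` path used in the training command.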

Ready to train:
1. Run: python scripts/download_training_data.py --all
2. Download additional datasets as needed
3. Run: python train_rosie.py --data_path data/combined_training.json

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 23:44:36 -04:00
c7ce0085fb feat: implement custom Rosie transformer model from scratch
Architecture:
- Custom GPT-style decoder-only transformer (500M params)
- 768 hidden size, 12 layers, 12 attention heads
- 32k vocabulary with BPE tokenizer
- Built-in emotion classification head
- 2048 token context window
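
A back-of-the-envelope parameter estimate for the dimensions listed above. It assumes a standard GPT-2-style block (4x feed-forward expansion, biases on all linear layers, two layer norms per block, learned positional embeddings, output projection tied to the token embedding) and ignores the emotion head; none of these details are confirmed by the commit. Under these assumptions the listed dimensions come to roughly 111M parameters, so the 500M figure presumably reflects a wider or deeper configuration than the one listed.

```python
def estimate_gpt_params(d_model, n_layers, vocab_size, context_len, ffn_mult=4):
    """Estimate parameters of a GPT-2-style decoder (assumed layout, tied output head)."""
    # Q, K, V and output projections, each d_model x d_model with a bias.
    attention = 4 * d_model * d_model + 4 * d_model
    # Two linear layers: d_model -> d_ff -> d_model, each with a bias.
    d_ff = ffn_mult * d_model
    feed_forward = d_model * d_ff + d_ff + d_ff * d_model + d_model
    # Two layer norms per block, each with a scale and a shift vector.
    norms = 2 * 2 * d_model
    per_layer = attention + feed_forward + norms
    # Token embeddings plus learned positional embeddings.
    embeddings = vocab_size * d_model + context_len * d_model
    final_norm = 2 * d_model
    return n_layers * per_layer + embeddings + final_norm


if __name__ == "__main__":
    total = estimate_gpt_params(d_model=768, n_layers=12,
                                vocab_size=32_000, context_len=2048)
    print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```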

Components:
- Multi-head self-attention mechanism
- Feed-forward networks with GELU
- Layer normalization and residual connections
- Custom tokenizer with special tokens for emotions/actions
- Generation with temperature, top-k, and nucleus sampling
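
A minimal sketch of how temperature, top-k, and nucleus (top-p) sampling can be combined, written with the standard library only. The function names and the order of the filters (top-k first, then top-p on the renormalized survivors) are assumptions; the model's actual generation code is not shown in this commit.

```python
import math
import random


def sampling_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return per-token probabilities after temperature, top-k, and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least likely; top_k=0 disables the cut.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    k_mass = sum(probs[i] for i in order)

    # Nucleus step: keep the smallest prefix whose renormalized mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens; everything else gets probability 0.
    mass = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample_token(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    probs = sampling_distribution(logits, **kwargs)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Lower temperatures sharpen the distribution before filtering, so a small `top_k` or `top_p` combined with a low temperature makes generation close to greedy decoding.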

Training Infrastructure:
- Full training script with data loading
- Gradient clipping and mixed precision support
- Checkpoint management
- Training guide with 3-phase approach:
  * Phase 1: Base language (10-50B tokens, 3-7 days)
  * Phase 2: Personality fine-tuning (100k-500k examples, 1-2 days)
  * Phase 3: Emotion training (50k-100k examples, 6-12 hours)

Integration:
- Inference engine for real-time generation
- Emotion detection from responses
- Conversation history management
- Ready for desktop app and Discord bot integration

No external model dependencies - architecture, tokenizer, and training code are fully custom

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 22:46:15 -04:00
ae1a349dd8 feat: add Discord bot integration
- Discord bot runs in background thread alongside desktop app
- State synchronization between Discord and desktop waifu
- Commands: !hello, !status
- Responds to mentions and DMs
- Complete setup guide in DISCORD_SETUP.md
- Graceful fallback if no token configured
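
The background-thread startup and the no-token fallback described above could look roughly like this. The function name `start_bot_in_background`, the `DISCORD_TOKEN` variable name, and the `run_bot` callable are hypothetical; the sketch uses only the standard library and leaves the Discord client itself out.

```python
import os
import threading


def start_bot_in_background(run_bot, env=None):
    """Start run_bot(token) on a daemon thread, or return None if no token is set.

    Hypothetical helper: run_bot stands in for whatever blocking call
    starts the Discord client.
    """
    env = os.environ if env is None else env
    token = (env.get("DISCORD_TOKEN") or "").strip()
    if not token:
        # Graceful fallback: the desktop app keeps running without the bot.
        print("Discord token not configured; running desktop app only.")
        return None
    thread = threading.Thread(
        target=run_bot, args=(token,), name="discord-bot", daemon=True
    )
    thread.start()
    return thread
```

A daemon thread is used here so that closing the desktop window does not leave the process waiting on the bot.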

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 22:24:22 -04:00
337a681df3 feat: add VRM model loading functionality
This commit introduces the core VRM loading capabilities for the 3D rendering system. It implements a dedicated VRMLoader class that parses VRM files (which use the binary glTF container format) and extracts essential data including meshes, materials, textures, and VRM-specific extension metadata. The implementation handles both binary and JSON-formatted VRM files and supports standard glTF accessor data extraction for vertices, normals, UV coordinates, and indices. This enables the application to load and render 3D models from VRM files with proper material and texture support.
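
For orientation, a minimal sketch of reading the binary glTF (GLB) container that VRM files use: a 12-byte header followed by length-prefixed chunks, normally one JSON chunk and one binary buffer. This is not the VRMLoader class from the commit; `parse_glb` and `vrm_extension` are hypothetical names, and accessor decoding is left out.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON"
CHUNK_BIN = 0x004E4942   # ASCII "BIN\0"


def parse_glb(data):
    """Split a GLB/VRM byte string into (gltf_json_dict, binary_buffer_or_None)."""
    if len(data) < 12:
        raise ValueError("file too short for a GLB header")
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a binary glTF file")
    if version != 2:
        raise ValueError(f"unsupported glTF version {version}")

    gltf, binary = None, None
    offset = 12
    end = min(length, len(data))
    # Each chunk is: uint32 byte length, uint32 type, then the payload.
    while offset + 8 <= end:
        chunk_len, chunk_type = struct.unpack_from("<II", data, offset)
        start = offset + 8
        payload = data[start:start + chunk_len]
        if chunk_type == CHUNK_JSON and gltf is None:
            gltf = json.loads(payload.decode("utf-8"))
        elif chunk_type == CHUNK_BIN and binary is None:
            binary = payload
        offset = start + chunk_len
    if gltf is None:
        raise ValueError("GLB has no JSON chunk")
    return gltf, binary


def vrm_extension(gltf):
    """Return the VRM metadata block: "VRM" in VRM 0.x, "VRMC_vrm" in VRM 1.0."""
    extensions = gltf.get("extensions", {})
    return extensions.get("VRM") or extensions.get("VRMC_vrm")
```

Mesh data would then be read from the binary buffer through the `accessors` and `bufferViews` entries of the JSON chunk.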
2025-09-30 19:24:18 -04:00
a657979bfd Initial commit: Desktop Waifu MVP foundation
- Project structure with modular architecture
- State management system (emotion states, conversation history)
- Transparent, draggable PyQt6 window
- OpenGL rendering widget with placeholder cube
- Discord bot framework (commands, event handlers)
- Complete documentation (README, project plan, research findings)
- Environment configuration template
- Dependencies defined in requirements.txt

MVP features working:
- Transparent window appears at bottom-right
- Window can be dragged around
- Placeholder 3D cube renders and rotates
- Emotion state changes on interaction
- Event-driven state management
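
One way the event-driven state management above could be structured is a small observer-style manager that notifies listeners when the emotion or conversation history changes. The class name, event names, and payload shapes below are assumptions for illustration, not the project's actual API.

```python
from collections import defaultdict


class StateManager:
    """Hypothetical state holder that notifies listeners about changes."""

    def __init__(self, emotion="neutral"):
        self.emotion = emotion
        self.history = []
        self._listeners = defaultdict(list)

    def on(self, event, callback):
        """Register callback(payload) for the named event."""
        self._listeners[event].append(callback)

    def _emit(self, event, payload):
        # Iterate over a copy so callbacks may register further listeners.
        for callback in list(self._listeners[event]):
            callback(payload)

    def set_emotion(self, emotion):
        if emotion == self.emotion:
            return  # no change, so no event
        previous, self.emotion = self.emotion, emotion
        self._emit("emotion_changed", {"from": previous, "to": emotion})

    def add_message(self, speaker, text):
        entry = {"speaker": speaker, "text": text}
        self.history.append(entry)
        self._emit("message_added", entry)
```

The window and renderer would subscribe with `on("emotion_changed", ...)` instead of polling, which keeps UI code decoupled from whatever triggers the change.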

Next steps: VRM model loading and rendering
2025-09-30 18:42:54 -04:00