Rosie

5 Commits 1 Branch 0 Tags

Author	SHA1	Message	Date
Dani	10ccdc2420	feat: add training data collection for Rosie Personality Dataset (300+ examples): - Greetings and farewells - Emotions and reactions - Physical interactions (pats, drags, touches) - Questions and answers - Help and support - Jokes and entertainment - Mood-based responses - Conversation fillers - Various user intents Data Download Script: - Download Project Gutenberg books (public domain) - Instructions for OpenWebText (~8B tokens) - Instructions for The Pile (~300B tokens) - Automatic dataset combination - Token counting and statistics - Download progress bars Ready to train: 1. Run: python scripts/download_training_data.py --all 2. Download additional datasets as needed 3. Run: python train_rosie.py --data_path data/combined_training.json 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-09-30 23:44:36 -04:00
Dani	c7ce0085fb	feat: implement custom Rosie transformer model from scratch Architecture: - Custom GPT-style decoder-only transformer (500M params) - 768 hidden size, 12 layers, 12 attention heads - 32k vocabulary with BPE tokenizer - Built-in emotion classification head - 2048 token context window Components: - Multi-head self-attention mechanism - Feed-forward networks with GELU - Layer normalization and residual connections - Custom tokenizer with special tokens for emotions/actions - Generation with temperature, top-k, and nucleus sampling Training Infrastructure: - Full training script with data loading - Gradient clipping and mixed precision support - Checkpoint management - Training guide with 3-phase approach: * Phase 1: Base language (10-50B tokens, 3-7 days) * Phase 2: Personality fine-tuning (100k-500k examples, 1-2 days) * Phase 3: Emotion training (50k-100k examples, 6-12 hours) Integration: - Inference engine for real-time generation - Emotion detection from responses - Conversation history management - Ready for desktop app and Discord bot integration No external model dependencies - 100% custom and unbiased 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-09-30 22:46:15 -04:00
Dani	ae1a349dd8	feat: add Discord bot integration - Discord bot runs in background thread alongside desktop app - State synchronization between Discord and desktop waifu - Commands: !hello, !status - Responds to mentions and DMs - Complete setup guide in DISCORD_SETUP.md - Graceful fallback if no token configured 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-09-30 22:24:22 -04:00
Dani	337a681df3	feat: add VRM model loading functionality This commit introduces the core VRM loading capabilities for the 3D rendering system. It implements a dedicated VRMLoader class that can parse VRM files (which are binary glTF format) and extract essential data including meshes, materials, textures, and VRM-specific extension metadata. The implementation handles both binary and JSON formatted VRM files while supporting standard glTF accessor data extraction for vertices, normals, UV coordinates, and indices. This enables the application to load and render 3D models with proper material and texture support from VRM files.	2025-09-30 19:24:18 -04:00
Dani	a657979bfd	Initial commit: Desktop Waifu MVP foundation - Project structure with modular architecture - State management system (emotion states, conversation history) - Transparent, draggable PyQt6 window - OpenGL rendering widget with placeholder cube - Discord bot framework (commands, event handlers) - Complete documentation (README, project plan, research findings) - Environment configuration template - Dependencies defined in requirements.txt MVP features working: - Transparent window appears at bottom-right - Window can be dragged around - Placeholder 3D cube renders and rotates - Emotion state changes on interaction - Event-driven state management Next steps: VRM model loading and rendering	2025-09-30 18:42:54 -04:00
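
Commit 337a681df3 describes VRM files as binary glTF and mentions extracting VRM-specific extension metadata. The repository's actual VRMLoader class is not shown in this log, so the following is only a minimal sketch of how a binary glTF (GLB) container can be split into its JSON document and binary buffer. It assumes the standard glTF 2.0 container layout: a 12-byte header followed by length-prefixed chunks. The names parse_glb, build_glb, and vrm_extension are hypothetical helpers for illustration, and the extension keys "VRM" (VRM 0.x) and "VRMC_vrm" (VRM 1.0) are assumptions about which VRM versions matter here.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON"
CHUNK_BIN = 0x004E4942   # ASCII "BIN\0"


def parse_glb(data: bytes):
    """Split a GLB/VRM byte string into (gltf_json_dict, binary_buffer_or_None)."""
    if len(data) < 12:
        raise ValueError("too short for a GLB header")
    magic, version, total_length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a binary glTF container")
    if version != 2:
        raise ValueError(f"unsupported GLB version: {version}")
    if total_length > len(data):
        raise ValueError("declared length exceeds the data provided")

    gltf, binary = None, None
    offset = 12
    # Each chunk is: uint32 length, uint32 type, then `length` bytes of payload.
    while offset + 8 <= total_length:
        chunk_length, chunk_type = struct.unpack_from("<II", data, offset)
        start = offset + 8
        end = start + chunk_length
        if end > total_length:
            raise ValueError("chunk runs past the end of the file")
        payload = data[start:end]
        if chunk_type == CHUNK_JSON and gltf is None:
            gltf = json.loads(payload.decode("utf-8"))
        elif chunk_type == CHUNK_BIN and binary is None:
            binary = payload
        offset = end  # unknown chunk types are skipped
    if gltf is None:
        raise ValueError("no JSON chunk found")
    return gltf, binary


def vrm_extension(gltf: dict):
    """Return the VRM extension block if present (assumed keys, see lead-in)."""
    extensions = gltf.get("extensions", {})
    return extensions.get("VRMC_vrm") or extensions.get("VRM")


def build_glb(gltf: dict, binary: bytes = b"") -> bytes:
    """Hypothetical helper that builds a tiny GLB in memory for testing."""
    json_bytes = json.dumps(gltf).encode("utf-8")
    json_bytes += b" " * (-len(json_bytes) % 4)   # JSON chunk is space-padded
    chunks = struct.pack("<II", len(json_bytes), CHUNK_JSON) + json_bytes
    if binary:
        padded = binary + b"\x00" * (-len(binary) % 4)  # BIN chunk is zero-padded
        chunks += struct.pack("<II", len(padded), CHUNK_BIN) + padded
    header = struct.pack("<III", GLB_MAGIC, 2, 12 + len(chunks))
    return header + chunks
```

To try it on a real model, read the file as bytes (for example with pathlib.Path(path).read_bytes()) and pass the result to parse_glb. Mesh vertices, normals, UV coordinates, and indices would then be read from the binary buffer through the accessors and bufferViews listed in the JSON document, which this sketch leaves out.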