- Added comprehensive tier definitions for low_end, mid_range, high_end
- Configurable thresholds for RAM, CPU cores, GPU requirements
- Model size recommendations per tier (1B-70B parameter range)
- Performance characteristics and scaling thresholds
- Global settings for model selection and scaling behavior
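
A minimal sketch of how such tiers might be applied at runtime. The threshold numbers and tuple layout below are assumptions for illustration, not the values shipped in the config:

```python
import psutil

# Illustrative thresholds only; the real definitions live in the YAML config.
TIER_THRESHOLDS = {
    # tier: (min_ram_gb, min_cpu_cores, min_gpu_vram_gb)
    "low_end":   (0, 2, 0),
    "mid_range": (16, 6, 6),
    "high_end":  (32, 12, 16),
}

def classify_tier(ram_gb: float, cpu_cores: int, gpu_vram_gb: float) -> str:
    """Return the highest tier whose minimums the hardware satisfies."""
    best = "low_end"
    for tier, (min_ram, min_cores, min_vram) in TIER_THRESHOLDS.items():
        if ram_gb >= min_ram and cpu_cores >= min_cores and gpu_vram_gb >= min_vram:
            best = tier  # dict order is low -> high, so keep the last match
    return best

if __name__ == "__main__":
    ram = psutil.virtual_memory().total / 1024**3
    cores = psutil.cpu_count(logical=False) or 1
    print(classify_tier(ram, cores, gpu_vram_gb=0.0))
```
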
- Created src/resource/__init__.py with module docstring
- Exported HardwareTierDetector (to be implemented)
- Established resource management module foundation
- Added caching for GPU info to avoid repeated pynvml initialization
- Added pynvml failure tracking to skip repeated failed attempts
- Optimized CPU measurement interval from 1.0s to 0.05s
- Reduced monitoring overhead from ~1000ms to ~50ms per call
- Maintained accuracy while significantly improving performance
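
A sketch of the caching-and-failure-tracking shape this describes. The TTL and names are assumptions, and `_query_gpu_via_pynvml()` is the hypothetical helper sketched after the pynvml commit below:

```python
import time
from typing import Optional

import psutil

_GPU_CACHE: dict = {"data": None, "ts": 0.0}
_GPU_CACHE_TTL = 5.0     # seconds between pynvml queries; assumed value
_PYNVML_FAILED = False   # once init fails, stop retrying on every call

def get_gpu_info_cached() -> Optional[dict]:
    """Return cached GPU info, refreshing at most once per TTL window."""
    global _PYNVML_FAILED
    if _PYNVML_FAILED:
        return None
    now = time.monotonic()
    if _GPU_CACHE["data"] is not None and now - _GPU_CACHE["ts"] < _GPU_CACHE_TTL:
        return _GPU_CACHE["data"]
    try:
        _GPU_CACHE["data"] = _query_gpu_via_pynvml()  # see the sketch below
        _GPU_CACHE["ts"] = now
        return _GPU_CACHE["data"]
    except Exception:
        _PYNVML_FAILED = True
        return None

# A 0.05s sample keeps the call cheap; psutil's 1.0s interval blocks for a
# full second, which accounts for most of the ~1000ms overhead noted above.
cpu_percent = psutil.cpu_percent(interval=0.05)
```
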
- Added pynvml import with graceful fallback handling
- Enhanced _get_gpu_info() method using pynvml for NVIDIA GPUs
- Added detailed GPU metrics: total/used/free VRAM, utilization, temperature
- Updated get_current_resources() to include comprehensive GPU info
- Maintained backward compatibility with existing gpu_vram_gb field
- Added gpu-tracker fallback for AMD/Intel GPUs
- Proper error handling for pynvml initialization failures
- Ensured pynvml shutdown (nvmlShutdown) is always called via try/finally logic
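
A hedged sketch of what the pynvml path might look like. The returned field names (other than the legacy gpu_vram_gb) are assumptions:

```python
try:
    import pynvml
    PYNVML_AVAILABLE = True
except ImportError:
    PYNVML_AVAILABLE = False

def _query_gpu_via_pynvml() -> dict:
    """Query the first NVIDIA GPU; raises on any NVML failure."""
    if not PYNVML_AVAILABLE:
        raise RuntimeError("pynvml is not installed")
    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        gib = 1024 ** 3
        return {
            "gpu_vram_gb": mem.total / gib,      # legacy field, kept for compat
            "gpu_vram_used_gb": mem.used / gib,
            "gpu_vram_free_gb": mem.free / gib,
            "gpu_utilization_pct": util.gpu,
            "gpu_temperature_c": temp,
        }
    finally:
        pynvml.nvmlShutdown()  # always shut down, even if a query raised
```
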
Phase 01: Model Interface & Switching
- All 3 plans executed and verified
- LM Studio connectivity, resource monitoring, and intelligent switching implemented
Tasks completed: 3/3
- ModelManager with intelligent selection and switching
- Core Mai orchestration class
- CLI interface for testing and monitoring
SUMMARY: .planning/phases/01-model-interface/01-03-SUMMARY.md
Phase 1 complete - model interface foundation ready for Phase 2: Safety & Sandboxing
- Implement __main__.py with argparse command-line interface
- Add interactive chat loop for testing model switching
- Include status commands to show current model and resources
- Support models listing and manual model switching
- Add proper signal handling for graceful shutdown
- Include help text and usage examples
- Fix import issues for relative imports in package
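
A minimal sketch of the kind of argparse entry point described above; the command names, help text, and dispatch are assumptions:

```python
import argparse
import signal
import sys

def main() -> int:
    parser = argparse.ArgumentParser(prog="mai", description="Mai test CLI")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("chat", help="interactive chat loop")
    sub.add_parser("status", help="show current model and resources")
    sub.add_parser("models", help="list available models")
    switch = sub.add_parser("switch", help="manually switch models")
    switch.add_argument("model_name")

    # Graceful shutdown on Ctrl+C / SIGTERM
    signal.signal(signal.SIGINT, lambda *_: sys.exit(0))
    signal.signal(signal.SIGTERM, lambda *_: sys.exit(0))

    args = parser.parse_args()
    print(f"dispatching {args.command} ...")  # real code would call into Mai
    return 0

if __name__ == "__main__":
    sys.exit(main())
```
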
- Initialize ModelManager, ContextManager, and subsystems
- Provide main conversation interface with process_message
- Support both synchronous and async operations
- Add system status monitoring and conversation history
- Include graceful shutdown with signal handlers
- Run background resource monitoring and maintenance tasks
- Expose model-switching commands and model-information methods
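
A skeleton sketch of such an orchestration class; everything beyond process_message and the sync/async split is an assumption, and the method names called on the injected managers are hypothetical:

```python
import asyncio

class Mai:
    def __init__(self, model_manager, context_manager) -> None:
        # ModelManager / ContextManager stand in for the classes described
        # in the surrounding commits.
        self.model_manager = model_manager
        self.context_manager = context_manager
        self._shutdown = asyncio.Event()

    async def process_message_async(self, text: str) -> str:
        context = self.context_manager.build_context(text)  # assumed helper
        return await self.model_manager.generate(context)   # assumed helper

    def process_message(self, text: str) -> str:
        """Synchronous convenience wrapper around the async path."""
        return asyncio.run(self.process_message_async(text))

    async def _monitor_resources(self, interval_s: float = 5.0) -> None:
        """Background task: periodically re-evaluate the model choice."""
        while not self._shutdown.is_set():
            self.model_manager.maybe_switch()                # assumed helper
            await asyncio.sleep(interval_s)

    def shutdown(self) -> None:
        self._shutdown.set()
```
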
- Load model configuration from config/models.yaml
- Intelligent model selection based on system resources and context
- Dynamic model switching with silent behavior (no user notifications)
- Fallback chains for model failures
- Proper resource cleanup and error handling
- Background preloading capability
- Auto-retry on model failures with graceful degradation
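
A sketch of the fallback-chain shape this implies, assuming a config layout like the dict shown further below; `adapter.load_model` and its raise-on-failure contract are assumptions:

```python
import yaml  # PyYAML, assumed available since the config is YAML

def load_model_config(path: str = "config/models.yaml") -> dict:
    with open(path) as f:
        return yaml.safe_load(f)

def load_with_fallback(adapter, primary: str, fallback_chains: dict) -> str:
    """Try the primary model, then each model in its fallback chain."""
    for name in [primary, *fallback_chains.get(primary, [])]:
        try:
            adapter.load_model(name)  # assumed to raise on failure
            return name
        except Exception:
            continue                   # degrade gracefully to the next model
    raise RuntimeError(f"all models in the fallback chain for {primary!r} failed")
```
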
Tasks completed: 2/2
- Created conversation data structures with Pydantic validation
- Implemented intelligent context manager with hybrid compression
SUMMARY: .planning/phases/01-model-interface/01-02-SUMMARY.md
STATE: Updated to reflect Plan 2 completion
ROADMAP: Updated Plan 2 as complete
- Add ConversationMetadata to imports
- Fix metadata initialization in create_conversation()
- Resolve type error for conversation metadata
File: src/models/context_manager.py
- Define Message, Conversation, ContextBudget, and ContextWindow classes
- Implement MessageRole and MessageType enums for classification
- Add Pydantic models for validation and serialization
- Include importance scoring and token estimation utilities
- Support system, user, assistant, and tool message types
File: src/models/conversation.py (147 lines)
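
A hedged sketch of the shapes these classes might take; field names beyond the listed class and enum names are assumptions, and ContextBudget, ContextWindow, and MessageType are omitted for brevity:

```python
from datetime import datetime, timezone
from enum import Enum
from pydantic import BaseModel, Field

class MessageRole(str, Enum):
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"

class Message(BaseModel):
    role: MessageRole
    content: str
    importance: float = Field(default=0.5, ge=0.0, le=1.0)  # scoring hook
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

    def estimated_tokens(self) -> int:
        # Crude heuristic: roughly 4 characters per token
        return max(1, len(self.content) // 4)

class Conversation(BaseModel):
    messages: list[Message] = Field(default_factory=list)

    def total_tokens(self) -> int:
        return sum(m.estimated_tokens() for m in self.messages)
```
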
Tasks completed: 4/4
- Created Python project foundation with dependencies
- Implemented LM Studio adapter with model discovery
- Implemented system resource monitoring with trend analysis
- Created model configuration system with fallback chains
SUMMARY: .planning/phases/01-model-interface/01-01-SUMMARY.md
STATE: Updated to reflect plan completion
- Created comprehensive model definitions in config/models.yaml
- Defined model categories: small, medium, large
- Specified resource requirements for each model
- Added context window sizes and capability lists
- Configured fallback chains for graceful degradation
- Included selection rules and switching triggers
- Added context management compression settings
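
For orientation, this is roughly what such a file might parse to via yaml.safe_load; every key and value below is an illustrative assumption, not the shipped config:

```python
EXAMPLE_MODELS_CONFIG = {
    "models": {
        "small-model": {
            "category": "small",
            "context_window": 8192,
            "requires": {"ram_gb": 8, "gpu_vram_gb": 0},
            "capabilities": ["chat"],
        },
        "large-model": {
            "category": "large",
            "context_window": 32768,
            "requires": {"ram_gb": 32, "gpu_vram_gb": 16},
            "capabilities": ["chat", "code", "long-context"],
        },
    },
    "fallback_chains": {"large-model": ["small-model"]},
    "switching": {"cpu_high_pct": 90, "mem_low_gb": 2},
    "context": {"compression": {"strategy": "hybrid", "target_ratio": 0.5}},
}
```
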
- Created ResourceMonitor class with psutil integration
- Added monitoring of CPU usage, memory availability, and GPU VRAM
- Added resource trend analysis for load prediction
- Implemented should_switch_model() logic based on thresholds
- Added can_load_model() method with safety margins
- Followed Pattern 2 from research: Resource-Aware Model Selection
- Added graceful handling for a missing gpu-tracker dependency
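
A sketch of the two checks under stated assumptions; the thresholds and the 20% safety margin below are illustrative, not the real ResourceMonitor defaults:

```python
import psutil

SAFETY_MARGIN = 1.2  # require 20% headroom beyond a model's stated needs

def should_switch_model(cpu_high_pct: float = 90.0, mem_low_gb: float = 2.0) -> bool:
    """Suggest a downgrade when the system is under sustained pressure."""
    mem = psutil.virtual_memory()
    return (psutil.cpu_percent(interval=0.05) >= cpu_high_pct
            or mem.available / 1024**3 <= mem_low_gb)

def can_load_model(required_ram_gb: float, required_vram_gb: float,
                   free_vram_gb: float) -> bool:
    """Check a model's requirements against current resources with margin."""
    free_ram_gb = psutil.virtual_memory().available / 1024**3
    return (free_ram_gb >= required_ram_gb * SAFETY_MARGIN
            and free_vram_gb >= required_vram_gb * SAFETY_MARGIN)
```
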
- Created LMStudioAdapter class using lmstudio-python SDK
- Added context manager get_client() for safe client handling
- Implemented list_available_models() with size estimation
- Added load_model(), unload_model(), get_model_info() methods
- Created mock_lmstudio.py as a graceful fallback when lmstudio is not installed
- Included error handling for LM Studio being unavailable and for model-loading failures
- Implemented Pattern 1 from research: Model Client Factory
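
A sketch of the import-fallback plus context-manager pattern; the exact lmstudio-python client API is not shown in the commit, so the Client() call and close() cleanup are assumptions:

```python
from contextlib import contextmanager

try:
    import lmstudio
except ImportError:
    from . import mock_lmstudio as lmstudio  # graceful fallback described above

class LMStudioAdapter:
    @contextmanager
    def get_client(self):
        """Yield a client and ensure it is cleaned up afterwards."""
        client = lmstudio.Client()  # assumed constructor; check the SDK docs
        try:
            yield client
        finally:
            close = getattr(client, "close", None)
            if callable(close):
                close()  # release the connection if the client supports it
```
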
- Created pyproject.toml with lmstudio, psutil, pydantic dependencies
- Created requirements.txt as fallback for pip install
- Created src/models/__init__.py with proper imports
- Set up PEP 518 compliant package structure
- Fixed .gitignore to allow src/models/ directory
Organizes 15 phases into three major milestones:
- v1.0 Core (Phases 1-5): Foundation systems with models, safety, memory
- v1.1 Interfaces (Phases 6-10): CLI, self-improvement, approval, personality, Discord
- v1.2 Presence (Phases 11-15): Offline, voice visualization, avatar, Android, sync
Maps all 99 requirements to phases with success criteria per milestone.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Create comprehensive MCP.md documenting all available tools:
* Hugging Face Hub (models, datasets, papers, spaces, docs)
* Web search and fetch for research
* Code tools (Bash, Git, file ops)
* Claude Code (GSD) workflow agents
- Map MCP usage to specific phases:
* Phase 1: Model discovery (Mistral, Llama, quantized options)
* Phase 2: Safety research (sandboxing, verification papers)
* Phase 5: Conversation datasets and papers
* Phase 12: Voice visualization models and spaces
* Phase 13: Avatar generation tools and research
* Phase 14: Mobile inference frameworks and patterns
- Update config.json with MCP settings:
* Enable Hugging Face (mystiatech authenticated)
* Enable WebSearch for current practices
* Set default result limits
- Update PROJECT.md constraints to document MCP enablement
Research phases will leverage MCPs extensively for optimal
library/model selection, architecture patterns, and best practices.
- Enable git push.autoSetupRemote for automatic tracking setup
- Add push.followTags to include tags in pushes
- Install post-commit hook for automatic push after each commit
- Update config.json to document auto-push behavior
- Remote: master (giteas.fullmooncyberworks.com/mystiatech/Mai)
All commits will now automatically push to the remote branch.
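
For reference, a hook like the one described might look like this; the real hook's contents are not shown in the commit, and git hooks are usually two-line shell scripts (Python is used here only for consistency with the other sketches):

```python
#!/usr/bin/env python3
# .git/hooks/post-commit: push automatically after every commit.
# Plain `git push` suffices because push.autoSetupRemote (Git 2.37+) is set.
import subprocess
import sys

result = subprocess.run(["git", "push"], capture_output=True, text=True)
if result.returncode != 0:
    # Never fail the commit itself; just surface the push problem.
    sys.stderr.write(result.stderr)
```
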
- Update PROJECT.md: Add Android, visualizer, and avatar to v1
- Update REQUIREMENTS.md: 99 requirements across 15 phases (fresh slate)
- Add comprehensive README.md with setup, architecture, and usage
- Add PROGRESS.md for Discord forum sharing
- Add .gitignore for Python/.venv and project artifacts
- Note: All development via Claude Code/OpenCode workflow
- Note: Python deps managed via .venv virtual environment
Core value: Mai is a real collaborator, not a tool. She learns from you,
improves herself, has boundaries and opinions, and becomes more *her* over time.
v1 includes: Model interface, Safety, Resources, Memory, Conversation,
CLI, Self-Improvement, Approval, Personality, Discord, Offline, Voice
Visualization, Avatar, Android App, Device Sync.