Phase 03: resource-management
- Enhanced GPU detection with pynvml support
- Hardware tier detection and management system
- Proactive scaling with hybrid monitoring
- Personality-driven resource communication
- All phase goals verified
Tasks completed: 2/2
- Implemented ResourcePersonality with dere-tsun gremlin persona
- Integrated personality-aware model switching with degradation notifications
SUMMARY: .planning/phases/03-resource-management/03-04-SUMMARY.md
- Added ResourcePersonality import and initialization to ModelManager
- Created personality_aware_model_switch() method for graceful degradation notifications
- Only notifies users about capability downgrades, not upgrades (per requirements)
- Includes optional technical tips for resource optimization
- Updated proactive scaling callbacks to use personality-aware switching
- Enhanced failure handling with personality-driven resource requests
- Added _is_capability_downgrade() helper for capability comparison
- Created ResourcePersonality class with Drowsy Dere-Tsun Onee-san Hex-Mentor Gremlin personality
- Includes mood system with sleepy, grumpy, helpful, gremlin, and mentor states
- Personality-specific vocabularies for different emotional responses
- Optional technical tips with hexadecimal/coding references
- generate_resource_message() for contextual resource communications
- Support for resource requests, degradation notices, system status, and scaling recommendations
- Added ProactiveScaler integration with HardwareTierDetector
- Implemented pre-flight resource checks before model inference
- Enhanced model selection with scaling recommendations
- Added graceful degradation handling for resource constraints
- Integrated performance metrics tracking for scaling decisions
- Added proactive upgrade execution with stabilization periods
- Enhanced status reporting with scaling information
- Maintained silent switching behavior per Phase 1 decisions
- Created ProactiveScaler class for proactive resource management
- Implemented continuous background monitoring with configurable intervals
- Added pre-flight resource checks before operations
- Implemented graceful degradation cascades with stabilization periods
- Added trend analysis for predictive scaling decisions
- Included hysteresis to prevent model switching thrashing
- Provided callbacks for integration with ModelManager
- Thread-safe implementation with proper shutdown handling
- Created comprehensive hardware tier detection system
- Loads configurable tier definitions from YAML
- Classifies systems based on RAM, CPU cores, and GPU capabilities
- Provides model recommendations and performance characteristics
- Includes caching for performance and error handling
- Integrates with ResourceMonitor for real-time data
- Added comprehensive tier definitions for low_end, mid_range, high_end
- Configurable thresholds for RAM, CPU cores, GPU requirements
- Model size recommendations per tier (1B-70B parameter range)
- Performance characteristics and scaling thresholds
- Global settings for model selection and scaling behavior
- Created src/resource/__init__.py with module docstring
- Exported HardwareTierDetector (to be implemented)
- Established resource management module foundation
- Added caching for GPU info to avoid repeated pynvml initialization
- Added pynvml failure tracking to skip repeated failed attempts
- Optimized CPU measurement interval from 1.0s to 0.05s
- Reduced monitoring overhead from ~1000ms to ~50ms per call
- Maintained accuracy while significantly improving performance
- Added pynvml import with graceful fallback handling
- Enhanced _get_gpu_info() method using pynvml for NVIDIA GPUs
- Added detailed GPU metrics: total/used/free VRAM, utilization, temperature
- Updated get_current_resources() to include comprehensive GPU info
- Maintained backward compatibility with existing gpu_vram_gb field
- Added gpu-tracker fallback for AMD/Intel GPUs
- Proper error handling for pynvml initialization failures
- Ensured pynvmlShutdown() always called in finally-style logic
Phase 01: Model Interface & Switching
- All 3 plans executed and verified
- LM Studio connectivity, resource monitoring, and intelligent switching implemented
Tasks completed: 3/3
- ModelManager with intelligent selection and switching
- Core Mai orchestration class
- CLI interface for testing and monitoring
SUMMARY: .planning/phases/01-model-interface/01-03-SUMMARY.md
Phase 1 complete - model interface foundation ready for Phase 2: Safety & Sandboxing
- Implement __main__.py with argparse command-line interface
- Add interactive chat loop for testing model switching
- Include status commands to show current model and resources
- Support models listing and manual model switching
- Add proper signal handling for graceful shutdown
- Include help text and usage examples
- Fix import issues for relative imports in package
- Initialize ModelManager, ContextManager, and subsystems
- Provide main conversation interface with process_message
- Support both synchronous and async operations
- Add system status monitoring and conversation history
- Include graceful shutdown with signal handlers
- Background resource monitoring and maintenance tasks
- Model switching commands and information methods
- Load model configuration from config/models.yaml
- Intelligent model selection based on system resources and context
- Dynamic model switching with silent behavior (no user notifications)
- Fallback chains for model failures
- Proper resource cleanup and error handling
- Background preloading capability
- Auto-retry on model failures with graceful degradation
Tasks completed: 2/2
- Created conversation data structures with Pydantic validation
- Implemented intelligent context manager with hybrid compression
SUMMARY: .planning/phases/01-model-interface/01-02-SUMMARY.md
STATE: Updated to reflect Plan 2 completion
ROADMAP: Updated Plan 2 as complete
- Add ConversationMetadata to imports
- Fix metadata initialization in create_conversation()
- Resolve type error for conversation metadata
File: src/models/context_manager.py
- Define Message, Conversation, ContextBudget, and ContextWindow classes
- Implement MessageRole and MessageType enums for classification
- Add Pydantic models for validation and serialization
- Include importance scoring and token estimation utilities
- Support system, user, assistant, and tool message types
File: src/models/conversation.py (147 lines)