- Created src/resource/__init__.py with module docstring
- Exported HardwareTierDetector (to be implemented)
- Established resource management module foundation
- Added caching for GPU info to avoid repeated pynvml initialization
- Added pynvml failure tracking to skip repeated failed attempts
- Optimized CPU measurement interval from 1.0s to 0.05s
- Reduced monitoring overhead from ~1000ms to ~50ms per call
- Maintained accuracy while significantly improving performance
- Added pynvml import with graceful fallback handling
- Enhanced _get_gpu_info() method using pynvml for NVIDIA GPUs
- Added detailed GPU metrics: total/used/free VRAM, utilization, temperature
- Updated get_current_resources() to include comprehensive GPU info
- Maintained backward compatibility with existing gpu_vram_gb field
- Added gpu-tracker fallback for AMD/Intel GPUs
- Proper error handling for pynvml initialization failures
- Ensured pynvmlShutdown() always called in finally-style logic
- Implement __main__.py with argparse command-line interface
- Add interactive chat loop for testing model switching
- Include status commands to show current model and resources
- Support models listing and manual model switching
- Add proper signal handling for graceful shutdown
- Include help text and usage examples
- Fix import issues for relative imports in package
- Initialize ModelManager, ContextManager, and subsystems
- Provide main conversation interface with process_message
- Support both synchronous and async operations
- Add system status monitoring and conversation history
- Include graceful shutdown with signal handlers
- Background resource monitoring and maintenance tasks
- Model switching commands and information methods
- Load model configuration from config/models.yaml
- Intelligent model selection based on system resources and context
- Dynamic model switching with silent behavior (no user notifications)
- Fallback chains for model failures
- Proper resource cleanup and error handling
- Background preloading capability
- Auto-retry on model failures with graceful degradation
- Add ConversationMetadata to imports
- Fix metadata initialization in create_conversation()
- Resolve type error for conversation metadata
File: src/models/context_manager.py
- Define Message, Conversation, ContextBudget, and ContextWindow classes
- Implement MessageRole and MessageType enums for classification
- Add Pydantic models for validation and serialization
- Include importance scoring and token estimation utilities
- Support system, user, assistant, and tool message types
File: src/models/conversation.py (147 lines)
- Created ResourceMonitor class with psutil integration
- Monitor CPU usage, memory availability, and GPU VRAM
- Added resource trend analysis for load prediction
- Implemented should_switch_model() logic based on thresholds
- Added can_load_model() method with safety margins
- Follow Pattern 2 from research: Resource-Aware Model Selection
- Graceful handling of missing gpu-tracker dependency
- Created LMStudioAdapter class using lmstudio-python SDK
- Added context manager get_client() for safe client handling
- Implemented list_available_models() with size estimation
- Added load_model(), unload_model(), get_model_info() methods
- Created mock_lmstudio.py for graceful fallback when lmstudio not installed
- Included error handling for LM Studio not running and model loading failures
- Implemented Pattern 1 from research: Model Client Factory
- Created pyproject.toml with lmstudio, psutil, pydantic dependencies
- Created requirements.txt as fallback for pip install
- Created src/models/__init__.py with proper imports
- Set up PEP 518 compliant package structure
- Fixed .gitignore to allow src/models/ directory