- Completed Task 2: Context-aware and timeline search
- ContextAwareSearch class with topic classification and result prioritization
- TimelineSearch class with date-range filtering and temporal proximity
- Enhanced MemoryManager with unified search interface
- Supports semantic, keyword, context-aware, timeline, and hybrid search
- Added search result dataclasses with relevance scoring
- Integrated all search strategies into MemoryManager.search() method
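A minimal sketch of what the unified interface could look like; the mode names, dataclass fields, and strategy wiring are illustrative assumptions, not the actual MemoryManager API:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SearchResult:
    """Illustrative result shape: matched content plus a relevance score."""
    content: str
    relevance: float
    context: dict = field(default_factory=dict)


class MemoryManager:
    """Sketch of a unified search() that dispatches on a mode argument."""

    def __init__(self) -> None:
        # Each strategy maps a query string to scored results.
        self._strategies: Dict[str, Callable[[str], List[SearchResult]]] = {
            "semantic": self._semantic_search,
            "keyword": self._keyword_search,
        }

    def search(self, query: str, mode: str = "hybrid") -> List[SearchResult]:
        if mode == "hybrid":
            # Hybrid mode merges every strategy and re-ranks by relevance.
            results = [r for s in self._strategies.values() for r in s(query)]
        else:
            results = self._strategies[mode](query)
        return sorted(results, key=lambda r: r.relevance, reverse=True)

    def _semantic_search(self, query: str) -> List[SearchResult]:
        return []  # placeholder for the embedding-based strategy

    def _keyword_search(self, query: str) -> List[SearchResult]:
        return []  # placeholder for the keyword strategy
```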
All search modes are operational:
- Semantic search with sentence-transformers embeddings
- Context-aware search with topic-based prioritization
- Timeline search with date filtering and recency weighting
- Hybrid search combining multiple strategies
Search results include conversation context and relevance scores, as required.
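As a hedged illustration of how recency weighting in the timeline mode could combine with a base relevance score, a simple half-life decay works; the decay constant and field names here are assumptions:

```python
from datetime import datetime, timezone


def recency_weighted_score(base_score: float, created_at: datetime,
                           half_life_days: float = 30.0) -> float:
    """Decay a base relevance score by the age of the memory entry."""
    age_days = (datetime.now(timezone.utc) - created_at).total_seconds() / 86400
    decay = 0.5 ** (age_days / half_life_days)  # exponential half-life decay
    return base_score * decay
```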
- Added sentence-transformers to requirements.txt for semantic embeddings
- Created src/memory/retrieval/ module with search capabilities
- Implemented SemanticSearch class with embedding generation and vector similarity
- Added SearchResult and SearchQuery dataclasses for structured search results
- Included hybrid search combining semantic and keyword matching
- Added conversation indexing for semantic search
- Followed lazy loading pattern for embedding model performance
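A minimal sketch of the lazy-loading pattern for the embedding model; the model name and method names are assumptions, not the actual SemanticSearch implementation:

```python
from typing import List


class SemanticSearch:
    """Defer loading the sentence-transformers model until first use."""

    def __init__(self, model_name: str = "all-MiniLM-L6-v2") -> None:
        self._model_name = model_name
        self._model = None  # loaded lazily on the first embed() call

    def _get_model(self):
        if self._model is None:
            # Import here so the heavy dependency is only paid when needed.
            from sentence_transformers import SentenceTransformer
            self._model = SentenceTransformer(self._model_name)
        return self._model

    def embed(self, texts: List[str]):
        return self._get_model().encode(texts, normalize_embeddings=True)

    def similarity(self, query: str, documents: List[str]) -> List[float]:
        # With normalized embeddings, the dot product equals cosine similarity.
        query_vec = self.embed([query])[0]
        doc_vecs = self.embed(documents)
        return [float(query_vec @ d) for d in doc_vecs]
```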
Files created:
- src/memory/retrieval/__init__.py
- src/memory/retrieval/search_types.py
- src/memory/retrieval/semantic_search.py
- Updated src/memory/__init__.py with enhanced MemoryManager
Note: installing sentence-transformers requires a properly configured virtual environment in production
Phase 03: Resource Management
- Enhanced GPU detection with pynvml support
- Hardware tier detection and management system
- Proactive scaling with hybrid monitoring
- Personality-driven resource communication
- All phase goals verified
Tasks completed: 2/2
- Implemented ResourcePersonality with dere-tsun gremlin persona
- Integrated personality-aware model switching with degradation notifications
SUMMARY: .planning/phases/03-resource-management/03-04-SUMMARY.md
- Added ResourcePersonality import and initialization to ModelManager
- Created personality_aware_model_switch() method for graceful degradation notifications
- Only notifies users about capability downgrades, not upgrades (per requirements)
- Includes optional technical tips for resource optimization
- Updated proactive scaling callbacks to use personality-aware switching
- Enhanced failure handling with personality-driven resource requests
- Added _is_capability_downgrade() helper for capability comparison
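A hedged sketch of how a downgrade check like _is_capability_downgrade() could compare models; the size-based ranking and model names are illustrative assumptions:

```python
# Illustrative ranking of model capability, smallest to largest.
_CAPABILITY_ORDER = ["1b", "3b", "7b", "13b", "70b"]


def is_capability_downgrade(current_model: str, new_model: str) -> bool:
    """Return True only when the new model is strictly less capable."""
    def rank(name: str) -> int:
        for i, size in enumerate(_CAPABILITY_ORDER):
            if size in name.lower():
                return i
        return -1  # unknown size: treat as lowest capability

    return rank(new_model) < rank(current_model)


# Per the requirement above, a notification fires only on downgrades:
if is_capability_downgrade("llama-13b", "llama-7b"):
    print("notify user about reduced capability")
```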
- Created ResourcePersonality class with Drowsy Dere-Tsun Onee-san Hex-Mentor Gremlin personality
- Includes mood system with sleepy, grumpy, helpful, gremlin, and mentor states
- Personality-specific vocabularies for different emotional responses
- Optional technical tips with hexadecimal/coding references
- generate_resource_message() for contextual resource communications
- Support for resource requests, degradation notices, system status, and scaling recommendations
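A minimal sketch of how mood-keyed vocabularies might feed generate_resource_message(); the mood openers, templates, and tip flag are illustrative, not the actual class:

```python
MOOD_OPENERS = {
    "sleepy": "mmn... fine, I'll deal with it...",
    "grumpy": "ugh, not this again.",
    "helpful": "okay! here's what's going on:",
    "gremlin": "hehehe, guess what broke~",
    "mentor": "listen closely, this matters:",
}


def generate_resource_message(event: str, mood: str = "sleepy",
                              include_tip: bool = False) -> str:
    """Wrap a resource event in the current mood's voice."""
    opener = MOOD_OPENERS.get(mood, MOOD_OPENERS["helpful"])
    message = f"{opener} {event}"
    if include_tip:
        # Optional hex-flavored technical tip, per the personality spec above.
        message += " (tip: free up VRAM before retrying, ~0x200 MB should do)"
    return message


print(generate_resource_message("switching to a smaller model", mood="grumpy"))
```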
- Added ProactiveScaler integration with HardwareTierDetector
- Implemented pre-flight resource checks before model inference
- Enhanced model selection with scaling recommendations
- Added graceful degradation handling for resource constraints
- Integrated performance metrics tracking for scaling decisions
- Added proactive upgrade execution with stabilization periods
- Enhanced status reporting with scaling information
- Maintained silent switching behavior per Phase 1 decisions
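A hedged sketch of a pre-flight check before inference; the thresholds and the resource-snapshot shape are assumptions:

```python
from dataclasses import dataclass


@dataclass
class ResourceSnapshot:
    free_ram_gb: float
    free_vram_gb: float


def preflight_check(snapshot: ResourceSnapshot, required_ram_gb: float,
                    required_vram_gb: float) -> bool:
    """Return True if inference can proceed without degrading the model."""
    return (snapshot.free_ram_gb >= required_ram_gb
            and snapshot.free_vram_gb >= required_vram_gb)


# If the check fails, the caller would fall back to a smaller model
# (silently, per the Phase 1 decision noted above).
ok = preflight_check(ResourceSnapshot(free_ram_gb=6.0, free_vram_gb=3.5),
                     required_ram_gb=8.0, required_vram_gb=6.0)
print("proceed" if ok else "degrade to smaller model")
```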
- Created ProactiveScaler class for proactive resource management
- Implemented continuous background monitoring with configurable intervals
- Added pre-flight resource checks before operations
- Implemented graceful degradation cascades with stabilization periods
- Added trend analysis for predictive scaling decisions
- Included hysteresis to prevent thrashing between model switches (sketched after this list)
- Provided callbacks for integration with ModelManager
- Thread-safe implementation with proper shutdown handling
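As a sketch of the hysteresis idea (separate upgrade and downgrade thresholds plus a stabilization window); the thresholds and window length are illustrative assumptions:

```python
import time


class ScalingHysteresis:
    """Avoid thrashing: upgrade and downgrade at different utilization levels,
    and refuse to switch again within a stabilization window."""

    def __init__(self, upgrade_below: float = 0.55, downgrade_above: float = 0.85,
                 stabilization_s: float = 120.0) -> None:
        self.upgrade_below = upgrade_below
        self.downgrade_above = downgrade_above
        self.stabilization_s = stabilization_s
        self._last_switch = 0.0

    def decide(self, utilization: float) -> str:
        if time.monotonic() - self._last_switch < self.stabilization_s:
            return "hold"  # still stabilizing after the previous switch
        if utilization > self.downgrade_above:
            self._last_switch = time.monotonic()
            return "downgrade"
        if utilization < self.upgrade_below:
            self._last_switch = time.monotonic()
            return "upgrade"
        return "hold"
```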
- Created comprehensive hardware tier detection system
- Loads configurable tier definitions from YAML
- Classifies systems based on RAM, CPU cores, and GPU capabilities
- Provides model recommendations and performance characteristics
- Includes caching for performance and error handling
- Integrates with ResourceMonitor for real-time data
- Added comprehensive tier definitions for low_end, mid_range, and high_end
- Configurable thresholds for RAM, CPU cores, GPU requirements
- Model size recommendations per tier (1B-70B parameter range)
- Performance characteristics and scaling thresholds
- Global settings for model selection and scaling behavior
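A minimal sketch of classifying a system against YAML-defined tiers; the inline config, field names, and thresholds are assumptions about the schema, not the project's actual tier file:

```python
import yaml

# Illustrative config in the same spirit as the tier definitions above.
TIER_YAML = """
tiers:
  low_end:   {min_ram_gb: 0,  min_cpu_cores: 2,  min_vram_gb: 0}
  mid_range: {min_ram_gb: 16, min_cpu_cores: 6,  min_vram_gb: 6}
  high_end:  {min_ram_gb: 32, min_cpu_cores: 12, min_vram_gb: 16}
"""


def classify(ram_gb: float, cpu_cores: int, vram_gb: float) -> str:
    """Return the highest tier whose minimum requirements are all met."""
    tiers = yaml.safe_load(TIER_YAML)["tiers"]
    best = "low_end"
    for name, req in tiers.items():
        if (ram_gb >= req["min_ram_gb"] and cpu_cores >= req["min_cpu_cores"]
                and vram_gb >= req["min_vram_gb"]):
            best = name  # dict order is low -> high, so keep the last match
    return best


print(classify(ram_gb=24, cpu_cores=8, vram_gb=8))  # -> mid_range
```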
- Created src/resource/__init__.py with module docstring
- Exported HardwareTierDetector (to be implemented)
- Established resource management module foundation
- Added caching for GPU info to avoid repeated pynvml initialization
- Added pynvml failure tracking to skip repeated failed attempts
- Optimized CPU measurement interval from 1.0s to 0.05s
- Reduced monitoring overhead from ~1000ms to ~50ms per call
- Maintained accuracy while significantly improving performance
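A hedged sketch of the caching and failure-tracking idea described above; the TTL and class shape are simplified assumptions:

```python
import time


class CachedGPUInfo:
    """Cache GPU queries briefly and stop retrying once the backend has failed."""

    def __init__(self, query_fn, ttl_s: float = 5.0) -> None:
        self._query_fn = query_fn   # e.g. a pynvml-backed callable
        self.ttl_s = ttl_s
        self._cached: dict = {}
        self._cached_at = 0.0
        self._failed = False

    def get(self) -> dict:
        if self._failed:
            return {}  # skip repeated failed attempts entirely
        if self._cached and time.monotonic() - self._cached_at < self.ttl_s:
            return self._cached  # serve cached data within the TTL
        try:
            self._cached = self._query_fn()
            self._cached_at = time.monotonic()
        except Exception:
            self._failed = True
            self._cached = {}
        return self._cached
```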
- Added pynvml import with graceful fallback handling
- Enhanced _get_gpu_info() method using pynvml for NVIDIA GPUs
- Added detailed GPU metrics: total/used/free VRAM, utilization, temperature
- Updated get_current_resources() to include comprehensive GPU info
- Maintained backward compatibility with existing gpu_vram_gb field
- Added gpu-tracker fallback for AMD/Intel GPUs
- Proper error handling for pynvml initialization failures
- Ensured pynvmlShutdown() is always called via try/finally cleanup
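A minimal sketch of the pynvml pattern described above, with an import fallback and shutdown guaranteed in a finally block; the returned field names are illustrative:

```python
try:
    import pynvml
except ImportError:              # graceful fallback when the package is absent
    pynvml = None


def get_gpu_info() -> dict:
    """Return basic NVIDIA GPU metrics, or an empty dict on any failure."""
    if pynvml is None:
        return {}
    try:
        pynvml.nvmlInit()
        try:
            handle = pynvml.nvmlDeviceGetHandleByIndex(0)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            return {
                "vram_total_gb": mem.total / 1e9,
                "vram_used_gb": mem.used / 1e9,
                "vram_free_gb": mem.free / 1e9,
                "gpu_utilization_pct": util.gpu,
            }
        finally:
            pynvml.nvmlShutdown()   # always shut down, even on errors
    except Exception:
        return {}                   # e.g. no NVIDIA driver present
```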
Phase 01: Model Interface & Switching
- All 3 plans executed and verified
- LM Studio connectivity, resource monitoring, and intelligent switching implemented
Tasks completed: 3/3
- ModelManager with intelligent selection and switching
- Core Mai orchestration class
- CLI interface for testing and monitoring
SUMMARY: .planning/phases/01-model-interface/01-03-SUMMARY.md
Phase 1 complete; the model interface foundation is ready for Phase 2: Safety & Sandboxing
- Implement __main__.py with argparse command-line interface
- Add interactive chat loop for testing model switching
- Include status commands to show current model and resources
- Support models listing and manual model switching
- Add proper signal handling for graceful shutdown
- Include help text and usage examples
- Fix import issues for relative imports in package
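A hedged sketch of what such a __main__.py could look like; the subcommand names and handler wiring are assumptions, not the project's actual CLI:

```python
import argparse
import signal
import sys


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mai", description="Test CLI for model switching")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("status", help="show current model and resource usage")
    sub.add_parser("models", help="list available models")
    switch = sub.add_parser("switch", help="manually switch models")
    switch.add_argument("model", help="name of the model to switch to")
    sub.add_parser("chat", help="interactive chat loop")
    return parser


def main() -> None:
    # Graceful shutdown on Ctrl+C / SIGTERM.
    signal.signal(signal.SIGINT, lambda *_: sys.exit(0))
    signal.signal(signal.SIGTERM, lambda *_: sys.exit(0))

    parser = build_parser()
    args = parser.parse_args()
    if args.command is None:
        parser.print_help()
        return
    # A real implementation would dispatch to ModelManager / Mai handlers here.
    print(f"selected command: {args.command}")


if __name__ == "__main__":
    main()
```

Running it as `python -m <package>` keeps the relative imports inside the package resolving correctly, which is the usual fix for the import issue noted above.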