---
phase: 03-resource-management
plan: 03
type: execute
wave: 2
depends_on: [03-01, 03-02]
files_modified: [src/resource/scaling.py, src/models/model_manager.py]
autonomous: true
user_setup: []
must_haves:
  truths:
    - "Proactive scaling prevents performance degradation before it impacts users"
    - "Hybrid monitoring combines continuous checks with pre-flight validation"
    - "Graceful degradation completes current tasks before model switching"
  artifacts:
    - path: "src/resource/scaling.py"
      provides: "Proactive scaling algorithms with hybrid monitoring"
      min_lines: 150
    - path: "src/models/model_manager.py"
      provides: "Enhanced model manager with proactive scaling integration"
      contains: "ProactiveScaler"
      min_lines: 650
  key_links:
    - from: "src/resource/scaling.py"
      to: "src/models/resource_monitor.py"
      via: "Resource monitoring for scaling decisions"
      pattern: "ResourceMonitor"
    - from: "src/resource/scaling.py"
      to: "src/resource/tiers.py"
      via: "Hardware tier-based scaling thresholds"
      pattern: "HardwareTierDetector"
    - from: "src/models/model_manager.py"
      to: "src/resource/scaling.py"
      via: "Proactive scaling integration"
      pattern: "ProactiveScaler"
---

Implement proactive scaling algorithms that combine continuous background monitoring with pre-flight checks to prevent performance degradation before it impacts users, with graceful degradation cascades and stabilization periods.

Purpose: Enable Mai to anticipate resource constraints and scale models proactively while maintaining smooth user experience.

Output: Proactive scaling system with hybrid monitoring, graceful degradation, and intelligent stabilization.
@~/.opencode/get-shit-done/workflows/execute-plan.md
@~/.opencode/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md

# Enhanced components from previous plans
@src/models/resource_monitor.py
@src/resource/tiers.py

# Research-based scaling patterns
@.planning/phases/03-resource-management/03-RESEARCH.md

Implement ProactiveScaler class

src/resource/scaling.py

Create the ProactiveScaler class implementing hybrid monitoring and proactive scaling:

1. **Hybrid Monitoring Architecture:**
   - Continuous background monitoring thread/task
   - Pre-flight checks before each model operation
   - Resource trend analysis with configurable windows
   - Performance metrics tracking (response times, failure rates)

2. **Proactive Scaling Logic:**
   - Scale at 80% resource usage (configurable per tier)
   - Consider overall system load context
   - Implement stabilization periods (5 minutes for upgrades)
   - Prevent thrashing with hysteresis

3. **Graceful Degradation Cascade:**
   - Complete current task at lower quality
   - Switch to smaller model after completion
   - Notify user of capability changes
   - Suggest resource optimizations

4. **Key Methods:**
   - start_continuous_monitoring(): Background monitoring loop
   - check_preflight_resources(): Quick validation before operations
   - analyze_resource_trends(): Predictive scaling decisions
   - initiate_graceful_degradation(): Controlled capability reduction
   - should_upgrade_model(): Check if resources allow upgrade

5. **Integration Points:**
   - Use enhanced ResourceMonitor for accurate metrics
   - Use HardwareTierDetector for tier-specific thresholds
   - Provide callbacks for model switching
   - Log scaling decisions with context

Include proper async handling for background monitoring and thread-safe state management.
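The hysteresis and stabilization rules in the Proactive Scaling Logic list can be sketched as a small decision gate. This is a minimal illustration under stated assumptions, not the planned implementation: `HysteresisGate`, its `decide` method, and the 60% upgrade bound are hypothetical names and values. Only the 80% trigger and the 5-minute stabilization period come from the plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HysteresisGate:
    """Hypothetical helper: maps one usage sample to 'degrade', 'upgrade', or 'hold'."""

    degrade_at: float = 80.0        # from the plan: scale at 80% resource usage
    upgrade_below: float = 60.0     # assumed value; the 60-80 gap is the hysteresis band
    stabilization_s: float = 300.0  # from the plan: 5 minutes before upgrades
    _calm_since: Optional[float] = None  # start of the current low-usage streak

    def decide(self, usage_pct: float, now: float) -> str:
        """`now` is a monotonic timestamp in seconds (e.g. time.monotonic())."""
        if usage_pct >= self.degrade_at:
            # Degrading is immediate and cancels any pending upgrade.
            self._calm_since = None
            return "degrade"
        if usage_pct >= self.upgrade_below:
            # Inside the hysteresis band: change nothing, restart the calm timer.
            self._calm_since = None
            return "hold"
        if self._calm_since is None:
            self._calm_since = now
        if now - self._calm_since >= self.stabilization_s:
            # Usage stayed low for the whole window; restart it after upgrading.
            self._calm_since = now
            return "upgrade"
        return "hold"


gate = HysteresisGate()
print(gate.decide(85.0, now=0.0))    # above the trigger
print(gate.decide(50.0, now=10.0))   # calm streak starts
print(gate.decide(50.0, now=310.0))  # calm for 300 s
```

Passing the timestamp in explicitly keeps the gate deterministic, so the thrashing scenarios can be tested without sleeping. A background monitoring loop would call `decide` with each new sample while holding whatever lock guards the scaler state.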
`python -c "from src.resource.scaling import ProactiveScaler; ps = ProactiveScaler(); print('ProactiveScaler initialized:', hasattr(ps, 'check_preflight_resources'))"` confirms the class structure

ProactiveScaler implements hybrid monitoring with graceful degradation

Integrate proactive scaling into ModelManager

src/models/model_manager.py

Enhance ModelManager to integrate proactive scaling:

1. **Add ProactiveScaler Integration:**
   - Import and initialize ProactiveScaler in __init__
   - Start continuous monitoring on initialization
   - Pass resource monitor and tier detector references

2. **Enhance generate_response with Proactive Scaling:**
   - Add pre-flight resource check before generation
   - Implement graceful degradation if resources constrained
   - Use proactive scaling recommendations for model selection
   - Track performance metrics for scaling decisions

3. **Update Model Selection Logic:**
   - Incorporate tier-based preferences
   - Use scaling thresholds from HardwareTierDetector
   - Factor in trend analysis predictions
   - Apply stabilization periods for upgrades

4. **Add Resource-Constrained Handling:**
   - Complete current response with smaller model if needed
   - Switch models proactively based on scaling predictions
   - Handle resource exhaustion gracefully
   - Maintain conversation context through switches

5. **Performance Tracking:**
   - Track response times and failure rates
   - Monitor resource usage during generation
   - Feed metrics back to ProactiveScaler
   - Adjust scaling behavior based on observed performance

6. **Cleanup and Shutdown:**
   - Stop continuous monitoring in shutdown()
   - Clean up scaling state and resources
   - Log scaling decisions and outcomes

Ensure backward compatibility and maintain silent switching behavior per Phase 1 decisions.
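The pre-flight check and the "complete current response with smaller model if needed" rule could combine into a single selection step before generation. The sketch below is an assumption-laden illustration: `pick_model_for_request`, the model names, and the memory figures are all hypothetical and do not reflect the existing ModelManager API.

```python
from typing import Dict, List, Optional


def pick_model_for_request(
    preferred: str,
    ordered_models: List[str],
    required_gb: Dict[str, float],
    available_gb: float,
) -> Optional[str]:
    """Return the preferred model if it fits in memory, else the next smaller one.

    `ordered_models` must be sorted from largest to smallest and must contain
    `preferred` (otherwise list.index raises ValueError). Returns None when
    nothing fits, which the caller should treat as resource exhaustion.
    """
    start = ordered_models.index(preferred)
    for name in ordered_models[start:]:
        if required_gb[name] <= available_gb:
            return name
    return None


# Illustrative values only; real requirements would come from the model registry.
models = ["large", "medium", "small"]
needs = {"large": 12.0, "medium": 6.0, "small": 2.0}

print(pick_model_for_request("large", models, needs, available_gb=16.0))
print(pick_model_for_request("large", models, needs, available_gb=7.5))
print(pick_model_for_request("large", models, needs, available_gb=1.0))
```

Starting the search at the preferred model means a pre-flight check never upgrades on its own. Upgrades stay gated behind the stabilization period, which keeps switching silent and avoids thrashing.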
`python -c "from src.models.model_manager import ModelManager; mm = ModelManager(); print('Proactive scaling integrated:', hasattr(mm, '_proactive_scaler'))"` confirms integration

ModelManager integrates proactive scaling for intelligent resource management

Test proactive scaling behavior under various scenarios:
- Gradual increase in available resources (should detect and upgrade after stabilization)
- Sudden drop in available resources (should immediately degrade gracefully)
- Stable resource usage (should not trigger unnecessary switches)
- Mixed workload patterns (should adapt scaling thresholds appropriately)

Verify that stabilization periods prevent thrashing and that graceful degradation maintains user experience.

ProactiveScaler successfully combines continuous monitoring with pre-flight checks, implements graceful degradation cascades, respects stabilization periods, and integrates seamlessly with ModelManager for intelligent resource management.

After completion, create `.planning/phases/03-resource-management/03-03-SUMMARY.md`