Mai/.planning/phases/03-resource-management/03-03-PLAN.md
Mai Development 1e071398ff
docs(03): create phase plan
Phase 3: Resource Management
- 4 plan(s) in 2 wave(s)
- 2 parallel, 2 sequential
- Ready for execution
2026-01-27 17:58:09 -05:00


phase: 03-resource-management
plan: 03
type: execute
wave: 2
depends_on: [03-01, 03-02]
files_modified: [src/resource/scaling.py, src/models/model_manager.py]
autonomous: true
user_setup: (none listed)
must_haves:
  truths:
    - Proactive scaling prevents performance degradation before it impacts users
    - Hybrid monitoring combines continuous checks with pre-flight validation
    - Graceful degradation completes current tasks before model switching
  artifacts:
    - path: src/resource/scaling.py
      provides: Proactive scaling algorithms with hybrid monitoring
      min_lines: 150
    - path: src/models/model_manager.py
      provides: Enhanced model manager with proactive scaling integration
      contains: ProactiveScaler
      min_lines: 650
  key_links:
    - from: src/resource/scaling.py
      to: src/models/resource_monitor.py
      via: Resource monitoring for scaling decisions
      pattern: ResourceMonitor
    - from: src/resource/scaling.py
      to: src/resource/tiers.py
      via: Hardware tier-based scaling thresholds
      pattern: HardwareTierDetector
    - from: src/models/model_manager.py
      to: src/resource/scaling.py
      via: Proactive scaling integration
      pattern: ProactiveScaler

Implement proactive scaling algorithms that combine continuous background monitoring with pre-flight checks to prevent performance degradation before it impacts users, with graceful degradation cascades and stabilization periods.

Purpose: Enable Mai to anticipate resource constraints and scale models proactively while maintaining a smooth user experience.

Output: Proactive scaling system with hybrid monitoring, graceful degradation, and intelligent stabilization.

<execution_context> @/.opencode/get-shit-done/workflows/execute-plan.md @/.opencode/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md

Enhanced components from previous plans

@src/models/resource_monitor.py @src/resource/tiers.py

Research-based scaling patterns

@.planning/phases/03-resource-management/03-RESEARCH.md

Task: Implement ProactiveScaler class
Files: src/resource/scaling.py
Action: Create the ProactiveScaler class implementing hybrid monitoring and proactive scaling:
  1. Hybrid Monitoring Architecture:

    • Continuous background monitoring thread/task
    • Pre-flight checks before each model operation
    • Resource trend analysis with configurable windows
    • Performance metrics tracking (response times, failure rates)
  2. Proactive Scaling Logic:

    • Scale at 80% resource usage (configurable per tier)
    • Consider overall system load context
    • Implement stabilization periods (5 minutes for upgrades)
    • Prevent thrashing with hysteresis
  3. Graceful Degradation Cascade:

    • Complete current task at lower quality
    • Switch to smaller model after completion
    • Notify user of capability changes
    • Suggest resource optimizations
  4. Key Methods:

    • start_continuous_monitoring(): Background monitoring loop
    • check_preflight_resources(): Quick validation before operations
    • analyze_resource_trends(): Predictive scaling decisions
    • initiate_graceful_degradation(): Controlled capability reduction
    • should_upgrade_model(): Check if resources allow upgrade
  5. Integration Points:

    • Use enhanced ResourceMonitor for accurate metrics
    • Use HardwareTierDetector for tier-specific thresholds
    • Provide callbacks for model switching
    • Log scaling decisions with context
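The proactive scaling logic in item 2 (80% threshold, stabilization period, hysteresis) could be sketched roughly as follows. This is only an illustration under stated assumptions: `ScalingDecider`, its field names, and the 60% lower bound are hypothetical and do not come from the plan; only the 80% trigger and the 5-minute upgrade stabilization are taken from the text above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScalingDecider:
    """Hypothetical helper: returns 'downgrade', 'upgrade', or None."""

    downgrade_at: float = 80.0      # plan's 80% trigger (configurable per tier)
    upgrade_below: float = 60.0     # assumed lower bound; the gap is the hysteresis band
    stabilization_s: float = 300.0  # plan's 5-minute stabilization for upgrades
    _calm_since: Optional[float] = field(default=None, init=False, repr=False)

    def decide(self, usage_pct: float, now: float) -> Optional[str]:
        # High usage: degrade right away and discard any upgrade countdown.
        if usage_pct >= self.downgrade_at:
            self._calm_since = None
            return "downgrade"
        # Low usage: upgrade only after it has stayed low for the full period.
        if usage_pct < self.upgrade_below:
            if self._calm_since is None:
                self._calm_since = now
            if now - self._calm_since >= self.stabilization_s:
                self._calm_since = None  # restart the window after an upgrade
                return "upgrade"
            return None
        # Inside the hysteresis band: hold the current model, reset the countdown.
        self._calm_since = None
        return None
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the decision logic deterministic, which makes the stabilization behavior easy to test without sleeping.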

Include proper async handling for background monitoring and thread-safe state management.

Verify: `python -c "from src.resource.scaling import ProactiveScaler; ps = ProactiveScaler(); print('ProactiveScaler initialized:', hasattr(ps, 'check_preflight_resources'))"` confirms the class structure.
Done: ProactiveScaler implements hybrid monitoring with graceful degradation.
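One possible shape for the background monitoring loop with thread-safe state, using only the standard library, is sketched below. The class name `BackgroundMonitor`, the `read_usage` callable (a stand-in for whatever the enhanced ResourceMonitor exposes), and the polling interval are all assumptions for illustration; the real ProactiveScaler may use asyncio or a different structure.

```python
import threading
from typing import Callable, Optional


class BackgroundMonitor:
    """Hypothetical polling loop; read_usage stands in for ResourceMonitor."""

    def __init__(self, read_usage: Callable[[], float], interval_s: float = 1.0):
        self._read_usage = read_usage
        self._interval_s = interval_s
        self._lock = threading.Lock()    # guards _latest
        self._stop = threading.Event()   # signals the loop to exit
        self._latest: Optional[float] = None
        self._thread: Optional[threading.Thread] = None

    def start(self) -> None:
        # Starting twice is a no-op while the loop is still running.
        if self._thread is not None and self._thread.is_alive():
            return
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while not self._stop.is_set():
            sample = self._read_usage()
            with self._lock:
                self._latest = sample
            # Event.wait doubles as an interruptible sleep.
            self._stop.wait(self._interval_s)

    def latest(self) -> Optional[float]:
        # Cheap read for pre-flight checks; None until the first sample arrives.
        with self._lock:
            return self._latest

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join(timeout=5.0)
```

Waiting on the stop event instead of calling `time.sleep` lets `stop()` return promptly even with a long polling interval, which matters for a clean shutdown.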

Task: Integrate proactive scaling into ModelManager
Files: src/models/model_manager.py
Action: Enhance ModelManager to integrate proactive scaling:
  1. Add ProactiveScaler Integration:

    • Import and initialize ProactiveScaler in `__init__`
    • Start continuous monitoring on initialization
    • Pass resource monitor and tier detector references
  2. Enhance generate_response with Proactive Scaling:

    • Add pre-flight resource check before generation
    • Implement graceful degradation if resources constrained
    • Use proactive scaling recommendations for model selection
    • Track performance metrics for scaling decisions
  3. Update Model Selection Logic:

    • Incorporate tier-based preferences
    • Use scaling thresholds from HardwareTierDetector
    • Factor in trend analysis predictions
    • Apply stabilization periods for upgrades
  4. Add Resource-Constrained Handling:

    • Complete current response with smaller model if needed
    • Switch models proactively based on scaling predictions
    • Handle resource exhaustion gracefully
    • Maintain conversation context through switches
  5. Performance Tracking:

    • Track response times and failure rates
    • Monitor resource usage during generation
    • Feed metrics back to ProactiveScaler
    • Adjust scaling behavior based on observed performance
  6. Cleanup and Shutdown:

    • Stop continuous monitoring in shutdown()
    • Clean up scaling state and resources
    • Log scaling decisions and outcomes

Ensure backward compatibility and maintain silent switching behavior per Phase 1 decisions.

Verify: `python -c "from src.models.model_manager import ModelManager; mm = ModelManager(); print('Proactive scaling integrated:', hasattr(mm, '_proactive_scaler'))"` confirms integration.
Done: ModelManager integrates proactive scaling for intelligent resource management.
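The pre-flight check and the fallback to a smaller model described in this task might look something like the sketch below. It is deliberately detached from the real ModelManager: `generate_with_preflight`, the `(name, required_gb)` model list, and the `available_gb` and `run_model` callables are hypothetical stand-ins, and the sketch does not cover conversation context or metrics feedback.

```python
import logging
from typing import Callable, List, Tuple

logger = logging.getLogger(__name__)


def generate_with_preflight(
    prompt: str,
    models: List[Tuple[str, float]],
    available_gb: Callable[[], float],
    run_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Pick the largest model that fits, then generate.

    models is ordered largest-first as (name, required memory in GB).
    Returns (model name used, response text).
    """
    free = available_gb()  # pre-flight: one quick reading before generation
    for index, (name, needed) in enumerate(models):
        if needed <= free:
            if index > 0:
                # Silent switch: recorded in the log, not surfaced to the user.
                logger.debug("degraded to %s (%.1f GB free)", name, free)
            return name, run_model(name, prompt)
    # Resource exhaustion: fail with a clear error instead of crashing mid-run.
    raise RuntimeError("no model fits in available memory")
```

For example, with `models = [("large", 8.0), ("small", 2.0)]` and 4 GB free, the call skips "large" and runs "small".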

Test proactive scaling behavior under various scenarios:
- Gradual resource increase (should detect and upgrade after stabilization)
- Sudden resource decrease (should immediately degrade gracefully)
- Stable resource usage (should not trigger unnecessary switches)
- Mixed workload patterns (should adapt scaling thresholds appropriately)

Verify stabilization periods prevent thrashing and graceful degradation maintains user experience.
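To tell the gradual, sudden, and stable scenarios apart, trend analysis over a window of recent samples could be sketched as below. The function name `classify_trend`, the label strings, and both default thresholds are assumptions chosen for illustration, not values from the plan; the slope is an ordinary least-squares fit over equally spaced samples.

```python
from typing import Sequence


def classify_trend(
    samples: Sequence[float],
    sudden_jump: float = 15.0,  # assumed: step size that counts as "sudden"
    slope_eps: float = 0.5,     # assumed: slope per sample below which it is "stable"
) -> str:
    """Classify a window of equally spaced readings, oldest first."""
    if len(samples) < 2:
        return "stable"
    # A large step between the last two readings is treated as sudden.
    step = samples[-1] - samples[-2]
    if abs(step) >= sudden_jump:
        return "sudden_increase" if step > 0 else "sudden_decrease"
    # Otherwise fit a least-squares line and look at its slope.
    n = len(samples)
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope = cov / var
    if slope > slope_eps:
        return "gradual_increase"
    if slope < -slope_eps:
        return "gradual_decrease"
    return "stable"
```

A test harness could feed synthetic windows for each scenario into a function like this and assert that only the sudden-decrease case triggers immediate degradation.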

<success_criteria> ProactiveScaler successfully combines continuous monitoring with pre-flight checks, implements graceful degradation cascades, respects stabilization periods, and integrates seamlessly with ModelManager for intelligent resource management. </success_criteria>

After completion, create `.planning/phases/03-resource-management/03-03-SUMMARY.md`