Mai/.planning/phases/01-model-interface/01-01-PLAN.md
Mai Development 1d9f19b8c2
docs(01): create phase plan
Phase 01-model-interface: Foundation systems
- 3 plan(s) in 2 wave(s)
- 2 parallel, 1 sequential
- Ready for execution
2026-01-27 10:45:52 -05:00


---
phase: 01-model-interface
plan: "01"
type: execute
wave: 1
depends_on: []
files_modified:
  - src/models/__init__.py
  - src/models/lmstudio_adapter.py
  - src/models/resource_monitor.py
  - config/models.yaml
  - requirements.txt
  - pyproject.toml
autonomous: true
must_haves:
  truths:
    - LM Studio client can connect and list available models
    - System resources (CPU/RAM/GPU) are monitored in real-time
    - Configuration defines models and their resource requirements
  artifacts:
    - path: src/models/lmstudio_adapter.py
      provides: LM Studio client and model discovery
      min_lines: 50
    - path: src/models/resource_monitor.py
      provides: System resource monitoring
      min_lines: 40
    - path: config/models.yaml
      provides: Model definitions and resource profiles
      contains: "models:"
  key_links:
    - from: src/models/lmstudio_adapter.py
      to: LM Studio server
      via: lmstudio-python SDK
      pattern: import lmstudio
    - from: src/models/resource_monitor.py
      to: system APIs
      via: psutil library
      pattern: import psutil
---
Establish LM Studio connectivity and resource monitoring foundation.

Purpose: Create the core infrastructure for model discovery and system resource tracking, enabling intelligent model selection in later plans.

Output: Working LM Studio client, resource monitor, and model configuration system.

<execution_context> @/.opencode/get-shit-done/workflows/execute-plan.md @/.opencode/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/phases/01-model-interface/01-RESEARCH.md @.planning/phases/01-model-interface/01-CONTEXT.md @.planning/codebase/ARCHITECTURE.md @.planning/codebase/STRUCTURE.md @.planning/codebase/STACK.md

Task 1: Create project foundation and dependencies
Files: requirements.txt, pyproject.toml, src/models/__init__.py

Create the Python project structure with required dependencies:

1. Create pyproject.toml with project metadata and the lmstudio, psutil, and pydantic dependencies
2. Create requirements.txt as a fallback for pip install
3. Create src/models/__init__.py with proper imports and version info
4. Create the basic src/ directory structure if it does not exist
5. Set up the Python package structure following PEP 518

Dependencies from research:

  • lmstudio >= 1.0.1 (official LM Studio SDK)
  • psutil >= 6.1.0 (system resource monitoring)
  • pydantic >= 2.10 (configuration validation)
  • gpu-tracker >= 5.0.1 (GPU monitoring, optional)

Follow packaging best practices with proper metadata, authors, and optional dependencies.

Verify: `pip install -e .` succeeds and imports work: `python -c "import lmstudio, psutil, pydantic"`
Done: Project structure created with all dependencies installable via pip.
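A minimal pyproject.toml satisfying Task 1 might look like the sketch below; the project name and version are assumptions (not stated in this plan), and the pins come from the dependency list above.

```toml
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "mai"            # assumed project name; adjust to the repo's actual package name
version = "0.1.0"       # assumed initial version
requires-python = ">=3.10"
dependencies = [
    "lmstudio>=1.0.1",  # official LM Studio SDK
    "psutil>=6.1.0",    # system resource monitoring
    "pydantic>=2.10",   # configuration validation
]

[project.optional-dependencies]
gpu = ["gpu-tracker>=5.0.1"]  # optional GPU monitoring
```

Mirroring the `dependencies` list into requirements.txt keeps the pip-only fallback in sync.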

Task 2: Implement LM Studio adapter and model discovery
Files: src/models/lmstudio_adapter.py

Create the LM Studio client following research patterns:

1. Implement an LMStudioAdapter class using the lmstudio-python SDK
2. Add a context manager for safe client handling: get_client()
3. Implement list_available_models() using lms.list_downloaded_models()
4. Add a load_model() method with error handling and fallback logic
5. Include model validation and capability detection
6. Follow Pattern 1 from research: Model Client Factory

Key methods:

  • __init__(): Initialize client configuration
  • list_models(): Return list of (model_key, display_name, size_gb)
  • load_model(model_key): Load model with timeout and error handling
  • unload_model(model_key): Clean up model resources
  • get_model_info(model_key): Get model metadata and context window

Use proper error handling for LM Studio not running, model loading failures, and network issues.

Verify: `python -c "from src.models.lmstudio_adapter import LMStudioAdapter; adapter = LMStudioAdapter(); print(len(adapter.list_models()) >= 0)"`
Done: LM Studio adapter can connect and list available models, and handles errors gracefully.
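The adapter surface described in Task 2 can be sketched roughly as follows. The `lmstudio` calls (`lms.Client`, `lms.list_downloaded_models`) come from the research notes, but their exact signatures are assumptions to verify against the installed SDK; the default host is likewise assumed.

```python
# Hedged sketch of Pattern 1 (Model Client Factory), not the SDK reference.
from contextlib import contextmanager


class LMStudioAdapter:
    def __init__(self, host: str = "localhost:1234"):
        self.host = host  # assumed default LM Studio server address

    @contextmanager
    def get_client(self):
        """Context manager so the client is always closed, even on errors."""
        import lmstudio as lms  # imported lazily so a missing SDK fails here, not at module import
        client = lms.Client(self.host)  # constructor signature assumed
        try:
            yield client
        finally:
            client.close()

    def list_models(self):
        """Return downloaded models, or [] when the SDK or server is unavailable."""
        try:
            import lmstudio as lms
            return list(lms.list_downloaded_models())
        except (ImportError, ConnectionError, OSError):
            return []  # degrade gracefully: LM Studio not installed or not running
```

Returning an empty list instead of raising keeps the plan's verification command (`len(adapter.list_models()) >= 0`) passing even when LM Studio is down, which is the graceful-degradation behavior the task asks for.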

Task 3: Implement system resource monitoring
Files: src/models/resource_monitor.py

Create a ResourceMonitor class following research patterns:

1. Monitor CPU usage (psutil.cpu_percent)
2. Track available memory (psutil.virtual_memory)
3. Monitor GPU VRAM if available (gpu-tracker library)
4. Provide a resource snapshot with current usage and availability
5. Add resource trend tracking for load prediction
6. Implement should_switch_model() logic based on thresholds

Key methods:

  • get_current_resources(): Return dict with memory_percent, cpu_percent, available_memory_gb, gpu_vram_gb
  • get_resource_trend(window_minutes=5): Return resource usage trend
  • can_load_model(model_size_gb): Check if enough resources available
  • is_system_overloaded(): Return True if resources exceed thresholds

Follow Pattern 2 from research: Resource-Aware Model Selection. Set sensible thresholds: 80% memory/CPU usage triggers model downgrading.

Verify: `python -c "from src.models.resource_monitor import ResourceMonitor; monitor = ResourceMonitor(); print('memory_percent' in monitor.get_current_resources())"`
Done: Resource monitor provides real-time system metrics and trend analysis.
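A minimal sketch of the monitor, using the key names from the method list above and the plan's 80% threshold. The psutil calls (`cpu_percent`, `virtual_memory`) are standard; `gpu_vram_gb` is left `None` here because gpu-tracker integration is optional.

```python
# Hedged sketch of Pattern 2 (resource-aware selection); thresholds per the plan.
import time
from collections import deque

import psutil


class ResourceMonitor:
    def __init__(self, overload_threshold: float = 80.0, history_size: int = 300):
        self.overload_threshold = overload_threshold  # percent; 80% per the plan
        self._history = deque(maxlen=history_size)    # (timestamp, snapshot) pairs

    def get_current_resources(self) -> dict:
        """Snapshot CPU/memory usage; also feeds the trend history."""
        mem = psutil.virtual_memory()
        snapshot = {
            "cpu_percent": psutil.cpu_percent(interval=None),
            "memory_percent": mem.percent,
            "available_memory_gb": mem.available / 1024**3,
            "gpu_vram_gb": None,  # fill in via gpu-tracker when available
        }
        self._history.append((time.time(), snapshot))
        return snapshot

    def get_resource_trend(self, window_minutes: int = 5):
        """Average memory_percent over the window; None with no samples yet."""
        cutoff = time.time() - window_minutes * 60
        samples = [s["memory_percent"] for t, s in self._history if t >= cutoff]
        return sum(samples) / len(samples) if samples else None

    def can_load_model(self, model_size_gb: float) -> bool:
        return self.get_current_resources()["available_memory_gb"] >= model_size_gb

    def is_system_overloaded(self) -> bool:
        snap = self.get_current_resources()
        return (snap["cpu_percent"] >= self.overload_threshold
                or snap["memory_percent"] >= self.overload_threshold)
```

Keeping the trend history inside the monitor (rather than sampling on demand) is what makes should_switch_model() decisions cheap later: the data is already collected as a side effect of normal snapshots.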

Task 4: Create model configuration system
Files: config/models.yaml

Create model configuration following the research architecture:

1. Define model categories by capability tier (small, medium, large)
2. Specify resource requirements for each model
3. Set context window sizes and token limits
4. Define model switching rules and fallback chains
5. Include model metadata (display names, descriptions)

Example structure:

```yaml
models:
  - key: "qwen/qwen3-4b-2507"
    display_name: "Qwen3 4B"
    category: "medium"
    min_memory_gb: 4
    min_vram_gb: 2
    context_window: 8192
    capabilities: ["chat", "reasoning"]
  - key: "qwen/qwen2.5-7b-instruct"
    display_name: "Qwen2.5 7B Instruct"
    category: "large"
    min_memory_gb: 8
    min_vram_gb: 4
    context_window: 32768
    capabilities: ["chat", "reasoning", "analysis"]
```

Include fallback chains for graceful degradation when resources are constrained.

Verify: YAML validation passes: `python -c "import yaml; yaml.safe_load(open('config/models.yaml'))"`
Done: Model configuration defines available models with resource requirements and fallback chains.
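The YAML check above only confirms the file parses. A stdlib-only sketch of the structural validation the plan delegates to pydantic might look like this; the helper name is hypothetical, and the required fields are taken from the example structure in Task 4.

```python
# Hypothetical helper: checks the shape of a parsed models.yaml (e.g. the
# output of yaml.safe_load). Real validation is planned via pydantic models;
# this sketch only documents the contract those models would enforce.
REQUIRED_FIELDS = {
    "key", "display_name", "category",
    "min_memory_gb", "min_vram_gb", "context_window", "capabilities",
}


def validate_models_config(config: dict) -> dict:
    """Raise ValueError unless config defines a non-empty, well-formed models list."""
    models = config.get("models")
    if not isinstance(models, list) or not models:
        raise ValueError("config must define a non-empty 'models' list")
    for entry in models:
        missing = REQUIRED_FIELDS - set(entry)
        if missing:
            raise ValueError(f"model {entry.get('key', '?')}: missing {sorted(missing)}")
    return config
```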

Verify core connectivity and monitoring:

1. LM Studio adapter can list available models
2. Resource monitor returns valid system metrics
3. Model configuration loads without errors
4. All dependencies import correctly
5. Error handling works when LM Studio is not running

<success_criteria> Core infrastructure ready for model management:

  • LM Studio client connects and discovers models
  • System resources are monitored in real-time
  • Model configuration defines resource requirements
  • Foundation supports intelligent model switching </success_criteria>
After completion, create `.planning/phases/01-model-interface/01-01-SUMMARY.md`