Mai/.planning/phases/01-model-interface/01-01-PLAN.md
Mai Development 1d9f19b8c2
docs(01): create phase plan
Phase 01-model-interface: Foundation systems
- 3 plan(s) in 2 wave(s)
- 2 parallel, 1 sequential
- Ready for execution
2026-01-27 10:45:52 -05:00


---
phase: 01-model-interface
plan: "01"
type: execute
wave: 1
depends_on: []
files_modified:
  - src/models/__init__.py
  - src/models/lmstudio_adapter.py
  - src/models/resource_monitor.py
  - config/models.yaml
  - requirements.txt
  - pyproject.toml
autonomous: true
must_haves:
  truths:
    - LM Studio client can connect and list available models
    - System resources (CPU/RAM/GPU) are monitored in real-time
    - Configuration defines models and their resource requirements
  artifacts:
    - path: src/models/lmstudio_adapter.py
      provides: LM Studio client and model discovery
      min_lines: 50
    - path: src/models/resource_monitor.py
      provides: System resource monitoring
      min_lines: 40
    - path: config/models.yaml
      provides: Model definitions and resource profiles
      contains: "models:"
  key_links:
    - from: src/models/lmstudio_adapter.py
      to: LM Studio server
      via: lmstudio-python SDK
      pattern: import lmstudio
    - from: src/models/resource_monitor.py
      to: system APIs
      via: psutil library
      pattern: import psutil
---
Establish LM Studio connectivity and resource monitoring foundation.

Purpose: Create the core infrastructure for model discovery and system resource tracking, enabling intelligent model selection in later plans.

Output: Working LM Studio client, resource monitor, and model configuration system.

<execution_context> @/.opencode/get-shit-done/workflows/execute-plan.md @/.opencode/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/phases/01-model-interface/01-RESEARCH.md @.planning/phases/01-model-interface/01-CONTEXT.md @.planning/codebase/ARCHITECTURE.md @.planning/codebase/STRUCTURE.md @.planning/codebase/STACK.md

Task 1: Create project foundation and dependencies
Files: requirements.txt, pyproject.toml, src/models/__init__.py

Create the Python project structure with required dependencies:

1. Create pyproject.toml with project metadata and the lmstudio, psutil, and pydantic dependencies
2. Create requirements.txt as a fallback for pip install
3. Create src/models/__init__.py with proper imports and version info
4. Create the basic src/ directory structure if it does not exist
5. Set up the Python package structure following PEP 518

Dependencies from research:

  • lmstudio >= 1.0.1 (official LM Studio SDK)
  • psutil >= 6.1.0 (system resource monitoring)
  • pydantic >= 2.10 (configuration validation)
  • gpu-tracker >= 5.0.1 (GPU monitoring, optional)

Follow packaging best practices with proper metadata, authors, and optional dependencies.

Verify: `pip install -e .` succeeds and imports work: `python -c "import lmstudio, psutil, pydantic"`
Done: Project structure created with all dependencies installable via pip.
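A minimal pyproject.toml satisfying Task 1 might look like the sketch below; the project name and version are assumptions (not stated in this plan), and the pins come from the dependency list above.

```toml
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "mai"            # assumed project name; adjust to the repo's actual package name
version = "0.1.0"       # assumed initial version
requires-python = ">=3.10"
dependencies = [
    "lmstudio>=1.0.1",  # official LM Studio SDK
    "psutil>=6.1.0",    # system resource monitoring
    "pydantic>=2.10",   # configuration validation
]

[project.optional-dependencies]
gpu = ["gpu-tracker>=5.0.1"]  # optional GPU monitoring
```

Mirroring the `dependencies` list into requirements.txt keeps the pip-only fallback in sync.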

Task 2: Implement LM Studio adapter and model discovery
Files: src/models/lmstudio_adapter.py

Create the LM Studio client following research patterns:

1. Implement an LMStudioAdapter class using the lmstudio-python SDK
2. Add a context manager for safe client handling: get_client()
3. Implement list_available_models() using lms.list_downloaded_models()
4. Add a load_model() method with error handling and fallback logic
5. Include model validation and capability detection
6. Follow Pattern 1 from research: Model Client Factory

Key methods:

  • __init__(): Initialize client configuration
  • list_models(): Return list of (model_key, display_name, size_gb)
  • load_model(model_key): Load model with timeout and error handling
  • unload_model(model_key): Clean up model resources
  • get_model_info(model_key): Get model metadata and context window

Use proper error handling for LM Studio not running, model loading failures, and network issues.

Verify: `python -c "from src.models.lmstudio_adapter import LMStudioAdapter; adapter = LMStudioAdapter(); print(len(adapter.list_models()) >= 0)"`
Done: LM Studio adapter can connect and list available models, and handles errors gracefully.
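The adapter surface described in Task 2 can be sketched roughly as follows. The `lmstudio` calls (`lms.Client`, `lms.list_downloaded_models`) come from the research notes, but their exact signatures are assumptions to verify against the installed SDK; the default host is likewise assumed.

```python
# Hedged sketch of Pattern 1 (Model Client Factory), not the SDK reference.
from contextlib import contextmanager


class LMStudioAdapter:
    def __init__(self, host: str = "localhost:1234"):
        self.host = host  # assumed default LM Studio server address

    @contextmanager
    def get_client(self):
        """Context manager so the client is always closed, even on errors."""
        import lmstudio as lms  # imported lazily so a missing SDK fails here, not at module import
        client = lms.Client(self.host)  # constructor signature assumed
        try:
            yield client
        finally:
            client.close()

    def list_models(self):
        """Return downloaded models, or [] when the SDK or server is unavailable."""
        try:
            import lmstudio as lms
            return list(lms.list_downloaded_models())
        except (ImportError, ConnectionError, OSError):
            return []  # degrade gracefully: LM Studio not installed or not running
```

Returning an empty list instead of raising keeps the plan's verification command (`len(adapter.list_models()) >= 0`) passing even when LM Studio is down, which is the graceful-degradation behavior the task asks for.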

Task 3: Implement system resource monitoring
Files: src/models/resource_monitor.py

Create a ResourceMonitor class following research patterns:

1. Monitor CPU usage (psutil.cpu_percent)
2. Track available memory (psutil.virtual_memory)
3. Monitor GPU VRAM if available (gpu-tracker library)
4. Provide a resource snapshot with current usage and availability
5. Add resource trend tracking for load prediction
6. Implement should_switch_model() logic based on thresholds

Key methods:

  • get_current_resources(): Return dict with memory_percent, cpu_percent, available_memory_gb, gpu_vram_gb
  • get_resource_trend(window_minutes=5): Return resource usage trend
  • can_load_model(model_size_gb): Check if enough resources available
  • is_system_overloaded(): Return True if resources exceed thresholds

Follow Pattern 2 from research: Resource-Aware Model Selection. Set sensible thresholds: 80% memory/CPU usage triggers model downgrading.

Verify: `python -c "from src.models.resource_monitor import ResourceMonitor; monitor = ResourceMonitor(); print('memory_percent' in monitor.get_current_resources())"`
Done: Resource monitor provides real-time system metrics and trend analysis.
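A minimal sketch of the monitor, using the key names from the method list above and the plan's 80% threshold. The psutil calls (`cpu_percent`, `virtual_memory`) are standard; `gpu_vram_gb` is left `None` here because gpu-tracker integration is optional.

```python
# Hedged sketch of Pattern 2 (resource-aware selection); thresholds per the plan.
import time
from collections import deque

import psutil


class ResourceMonitor:
    def __init__(self, overload_threshold: float = 80.0, history_size: int = 300):
        self.overload_threshold = overload_threshold  # percent; 80% per the plan
        self._history = deque(maxlen=history_size)    # (timestamp, snapshot) pairs

    def get_current_resources(self) -> dict:
        """Snapshot CPU/memory usage; also feeds the trend history."""
        mem = psutil.virtual_memory()
        snapshot = {
            "cpu_percent": psutil.cpu_percent(interval=None),
            "memory_percent": mem.percent,
            "available_memory_gb": mem.available / 1024**3,
            "gpu_vram_gb": None,  # fill in via gpu-tracker when available
        }
        self._history.append((time.time(), snapshot))
        return snapshot

    def get_resource_trend(self, window_minutes: int = 5):
        """Average memory_percent over the window; None with no samples yet."""
        cutoff = time.time() - window_minutes * 60
        samples = [s["memory_percent"] for t, s in self._history if t >= cutoff]
        return sum(samples) / len(samples) if samples else None

    def can_load_model(self, model_size_gb: float) -> bool:
        return self.get_current_resources()["available_memory_gb"] >= model_size_gb

    def is_system_overloaded(self) -> bool:
        snap = self.get_current_resources()
        return (snap["cpu_percent"] >= self.overload_threshold
                or snap["memory_percent"] >= self.overload_threshold)
```

Keeping the trend history inside the monitor (rather than sampling on demand) is what makes should_switch_model() decisions cheap later: the data is already collected as a side effect of normal snapshots.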

Task 4: Create model configuration system
Files: config/models.yaml

Create model configuration following the research architecture:

1. Define model categories by capability tier (small, medium, large)
2. Specify resource requirements for each model
3. Set context window sizes and token limits
4. Define model switching rules and fallback chains
5. Include model metadata (display names, descriptions)

Example structure:

```yaml
models:
  - key: "qwen/qwen3-4b-2507"
    display_name: "Qwen3 4B"
    category: "medium"
    min_memory_gb: 4
    min_vram_gb: 2
    context_window: 8192
    capabilities: ["chat", "reasoning"]
  - key: "qwen/qwen2.5-7b-instruct"
    display_name: "Qwen2.5 7B Instruct"
    category: "large"
    min_memory_gb: 8
    min_vram_gb: 4
    context_window: 32768
    capabilities: ["chat", "reasoning", "analysis"]
```

Include fallback chains for graceful degradation when resources are constrained.

Verify: YAML validation passes: `python -c "import yaml; yaml.safe_load(open('config/models.yaml'))"`
Done: Model configuration defines available models with resource requirements and fallback chains.
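The YAML check above only confirms the file parses. A stdlib-only sketch of the structural validation the plan delegates to pydantic might look like this; the helper name is hypothetical, and the required fields are taken from the example structure in Task 4.

```python
# Hypothetical helper: checks the shape of a parsed models.yaml (e.g. the
# output of yaml.safe_load). Real validation is planned via pydantic models;
# this sketch only documents the contract those models would enforce.
REQUIRED_FIELDS = {
    "key", "display_name", "category",
    "min_memory_gb", "min_vram_gb", "context_window", "capabilities",
}


def validate_models_config(config: dict) -> dict:
    """Raise ValueError unless config defines a non-empty, well-formed models list."""
    models = config.get("models")
    if not isinstance(models, list) or not models:
        raise ValueError("config must define a non-empty 'models' list")
    for entry in models:
        missing = REQUIRED_FIELDS - set(entry)
        if missing:
            raise ValueError(f"model {entry.get('key', '?')}: missing {sorted(missing)}")
    return config
```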

Verify core connectivity and monitoring:

1. LM Studio adapter can list available models
2. Resource monitor returns valid system metrics
3. Model configuration loads without errors
4. All dependencies import correctly
5. Error handling works when LM Studio is not running

<success_criteria> Core infrastructure ready for model management:

  • LM Studio client connects and discovers models
  • System resources are monitored in real-time
  • Model configuration defines resource requirements
  • Foundation supports intelligent model switching </success_criteria>
After completion, create `.planning/phases/01-model-interface/01-01-SUMMARY.md`