Research files synthesized:
- STACK.md: Flutter + Supabase + Riverpod recommended stack
- FEATURES.md: 7 table stakes, 6 differentiators, 7 anti-features identified
- ARCHITECTURE.md: Offline-first sync with optimistic locking, RLS multi-tenancy
- PITFALLS.md: 5 critical pitfalls (v1), 8 moderate (v1.5), 3 minor (v2+)
- SUMMARY.md: Executive synthesis with 3-phase roadmap implications

Key findings:
- Stack: Flutter + Supabase free tier + mobile_scanner + Open Food Facts
- Critical pitfalls: Barcode mismatches, timezone bugs, sync conflicts, setup complexity, notification fatigue
- Phase structure: MVP (core) → expansion (usage tracking) → differentiation (prediction + sales)
- All research grounded in ecosystem analysis (12+ competitors), official documentation, and production incidents

Confidence: HIGH
Ready for roadmap creation: YES

# Architecture Patterns: Household Food Inventory Apps
**Domain:** Multi-user household inventory management
**Researched:** 2026-01-27
**Confidence Level:** MEDIUM-HIGH
## Executive Summary
Household food inventory apps require careful separation of concerns across multi-user households, real-time synchronization, offline-first resilience, and barcode/product lookup. The recommended architecture uses a **tenant-per-household model with PostgreSQL multi-tenancy**, Supabase Realtime for synchronization, client-side barcode scanning with cached product lookup, and local-first conflict resolution with server arbitration.
This document defines component responsibilities, data flow, build order implications, and scaling thresholds.
---
## Recommended Architecture: System Overview
```
┌─────────────────────────────────────────────────────────────────┐
│ FLUTTER MOBILE APP │
│ (iOS/Android - runs on 2-10 household members' phones) │
│ │
│ ┌──────────────┐ ┌─────────────┐ ┌──────────────────┐ │
│ │ Barcode │ │ Local │ │ UI/Sync │ │
│ │ Scanner │ │ Cache/DB │ │ Controllers │ │
│ │ (ML Kit) │ │ (SQLite) │ │ (Riverpod) │ │
│ └──────────────┘ └─────────────┘ └──────────────────┘ │
│ │ │ │ │
│ └───────────────────┼──────────────────┘ │
│ │ │
│ Queue of local │ Periodic sync (5-30s) │
│ changes before │ via websocket │
│ connection back │ │
└─────────────────────────────────────────────────────────────────┘
│ HTTPS + WSS
(auth token, household_id)
┌─────────────────────────────────────────────────────────────────┐
│ SUPABASE/BACKEND │
│ (Postgres + Realtime + Auth + Storage) │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ API Layer (PostgREST) │ │
│ │ - REST endpoints for CRUD │ │
│ │ - Row-Level Security (RLS) enforces household isolation│ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌──────────────────────────┼──────────────────────────────┐ │
│ │ │ │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ │ │
│ │ │ Postgres DB │ │ Realtime │ │ │
│ │ │ (households, │ │ Broadcast │ │ │
│ │ │ items, │ ◄─┤ (websocket) │ │ │
│ │ │ transactions) │ │ - subscriptions │ │ │
│ │ │ │ │ - broadcasts │ │ │
│ │ └──────────────────┘ └──────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Business Logic (Edge Functions / HTTP API) │ │ │
│ │ │ - Conflict resolution │ │ │
│ │ │ - Notification triggers │ │ │
│ │ │ - Barcode lookup (fallback if not in local DB) │ │ │
│ │ │ - Prediction pipeline scheduling │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Storage (S3-compatible) │ │ │
│ │ │ - Receipt images │ │ │
│ │ │ - Product photos (cached from barcode API) │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌───────────┼───────────┐
│ │ │
┌────────▼────┐ ┌───▼────┐ ┌──▼──────────┐
│ Barcode API │ │Notif │ │ ML Pipeline │
│(Go-UPC, │ │Service │ │ (async job) │
│BarcodeLook) │ │(FCM, │ │ │
└─────────────┘ └────────┘ └─────────────┘
```
---
## Component Boundaries and Responsibilities
### 1. **Flutter Frontend (Mobile App)**
**Responsibility:** Present UI, capture input, manage local state, handle offline operation.
**Sub-components:**
| Component | Responsibility | Technology |
|-----------|--------------|-----------|
| **Barcode Scanner Module** | Capture barcode images via camera, decode with ML Kit, query local cache | Google ML Kit via `mobile_scanner` (or `google_ml_kit`), `camera` plugin |
| **Local SQLite Database** | Store inventory items, transactions, pending sync queue, barcode cache | SQLite via `sqflite` or `drift` |
| **Realtime Listener** | Subscribe to household changes via Supabase Realtime, merge remote changes | `supabase_flutter`, websocket handling |
| **Sync Engine** | Queue local changes, merge with remote state, handle conflicts, persist to server | Custom sync logic with exponential backoff |
| **UI Controllers** | Inventory list, add item form, household members, notifications | Flutter widgets, Riverpod for state |
| **Auth Manager** | Login, household selection, token refresh | `supabase_flutter` auth module |
**Key Patterns:**
- **Offline-First Local Storage:** All writes happen to SQLite first, async sync to server.
- **Optimistic Concurrency for Critical Operations:** Quantity changes carry the version the client last saw; the server validates it before applying the write, preventing overselling (e.g., "remove 5 items but only 3 exist").
- **Periodic Full Sync:** Every 30-60s, fetch server version and merge (handles missed websocket messages).
- **Notification Queue:** Local cache of notifications prevents duplicate alerts when reconnecting.
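The offline-first and periodic-sync patterns above hinge on a durable queue of pending changes flushed with exponential backoff. A minimal sketch follows, written in TypeScript for brevity (the Flutter client would implement the same loop in Dart against SQLite); `loadPendingChanges`, `pushChange`, and `markSynced` are hypothetical helpers around the local `pending_changes` table and the REST API.
```typescript
// Sketch of the sync-queue flush loop. Hypothetical helpers:
// loadPendingChanges / pushChange / markSynced wrap SQLite and the backend API.
interface PendingChange {
  pendingId: string;                 // client-generated UUID
  table: string;                     // e.g. 'inventory_items'
  payload: Record<string, unknown>;
  attempts: number;
}

declare function loadPendingChanges(): Promise<PendingChange[]>;
declare function pushChange(change: PendingChange): Promise<void>; // POST to backend
declare function markSynced(pendingId: string): Promise<void>;     // flag local row

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

export async function flushPendingChanges(maxAttempts = 5): Promise<void> {
  const queue = await loadPendingChanges();
  for (const change of queue) {
    let delayMs = 1_000; // 1s, 2s, 4s, ... exponential backoff
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        await pushChange(change);          // server deduplicates on pendingId
        await markSynced(change.pendingId);
        break;
      } catch {
        if (attempt === maxAttempts - 1) break; // leave queued for the next run
        await sleep(delayMs);
        delayMs *= 2;
      }
    }
  }
}
```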
**Scaling Implications:**
- At 1000+ items in a household, SQLite performance degrades on full sync. Consider pagination.
- Barcode cache should be limited to 10K most-recent scans per household (local storage constraint: ~200MB).
---
### 2. **Supabase Backend: PostgreSQL Database**
**Responsibility:** Single source of truth, enforce consistency, isolate households, track history.
**Schema Overview:**
```sql
-- Households (tenants)
CREATE TABLE households (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
created_at TIMESTAMP DEFAULT now(),
created_by UUID REFERENCES auth.users(id)
);
-- Household members (authorization)
CREATE TABLE household_members (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
household_id UUID NOT NULL REFERENCES households(id),
user_id UUID NOT NULL REFERENCES auth.users(id),
role TEXT CHECK (role IN ('owner', 'member', 'guest')),
joined_at TIMESTAMP DEFAULT now(),
UNIQUE(household_id, user_id)
);
-- Inventory items (what's in the fridge)
CREATE TABLE inventory_items (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
household_id UUID NOT NULL REFERENCES households(id),
-- Product info (either from barcode lookup or manual entry)
name TEXT NOT NULL,
barcode TEXT,
category TEXT,
quantity DECIMAL(10, 2) NOT NULL DEFAULT 0,
unit TEXT DEFAULT 'count',
-- Expiration tracking
expiration_date DATE,
purchase_date DATE,
purchase_price DECIMAL(10, 2),
-- Sync metadata
created_by UUID NOT NULL REFERENCES auth.users(id),
created_at TIMESTAMP DEFAULT now(),
updated_at TIMESTAMP DEFAULT now(),
-- For conflict resolution
last_modified_by UUID REFERENCES auth.users(id),
version_vector TEXT, -- serialized clock for CRDT (optional)
UNIQUE(household_id, barcode) -- prevent duplicates
);
-- Transaction log (audit trail + sales data)
CREATE TABLE transactions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
household_id UUID NOT NULL REFERENCES households(id),
item_id UUID REFERENCES inventory_items(id),
type TEXT CHECK (type IN ('add', 'remove', 'consume', 'discard')),
quantity_change DECIMAL(10, 2),
notes TEXT, -- "found at Costco $12.99", "expired"
receipt_image_path TEXT, -- object path in the 'receipts' storage bucket
created_by UUID NOT NULL REFERENCES auth.users(id),
created_at TIMESTAMP DEFAULT now(),
-- For community sales data
store_name TEXT,
sale_price DECIMAL(10, 2),
confidence_score DECIMAL(3, 2) -- 0-1: trust level for crowd-sourced prices
);
-- Barcode lookup cache (local cache of product info)
CREATE TABLE barcode_cache (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
barcode TEXT UNIQUE NOT NULL,
product_name TEXT,
product_category TEXT,
image_url TEXT,
expiration_estimate_days INTEGER,
-- Metadata
source TEXT, -- 'go_upc', 'user_submitted', 'crowdsourced'
confidence_score DECIMAL(3, 2) DEFAULT 0.5, -- trust level used by the community data flow (see section 7)
last_updated TIMESTAMP DEFAULT now(),
lookup_count INTEGER DEFAULT 0 -- track popularity
);
-- Notifications (generated by triggers)
CREATE TABLE notifications (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
household_id UUID NOT NULL REFERENCES households(id),
user_id UUID NOT NULL REFERENCES auth.users(id),
type TEXT CHECK (type IN (
'item_expiring_soon',
'item_expired',
'low_stock',
'member_added_item',
'household_shared',
'sale_alert'
)),
item_id UUID REFERENCES inventory_items(id),
title TEXT,
message TEXT,
read_at TIMESTAMP,
created_at TIMESTAMP DEFAULT now(),
expires_at TIMESTAMP, -- auto-delete old notifications
-- Delivery status
delivered_via TEXT, -- 'in_app', 'push', 'email'
delivery_status TEXT DEFAULT 'pending' -- 'delivered', 'failed'
);
-- Household settings & preferences
CREATE TABLE household_settings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
household_id UUID UNIQUE NOT NULL REFERENCES households(id),
-- Notification preferences
notify_expiry_days INTEGER DEFAULT 3,
notify_low_stock_threshold DECIMAL(10, 2),
enable_push_notifications BOOLEAN DEFAULT true,
enable_email_summary BOOLEAN DEFAULT false,
-- Privacy & sharing
share_sales_data BOOLEAN DEFAULT false,
allow_community_lookup BOOLEAN DEFAULT true,
updated_at TIMESTAMP DEFAULT now()
);
-- Enable Row-Level Security
ALTER TABLE households ENABLE ROW LEVEL SECURITY;
ALTER TABLE inventory_items ENABLE ROW LEVEL SECURITY;
ALTER TABLE transactions ENABLE ROW LEVEL SECURITY;
ALTER TABLE notifications ENABLE ROW LEVEL SECURITY;
-- RLS Policies: Users can only see their own households
CREATE POLICY households_rls ON households
FOR SELECT USING (
EXISTS (
SELECT 1 FROM household_members
WHERE household_id = households.id
AND user_id = auth.uid()
)
);
CREATE POLICY items_rls ON inventory_items
FOR SELECT USING (
EXISTS (
SELECT 1 FROM household_members
WHERE household_id = inventory_items.household_id
AND user_id = auth.uid()
)
);
```
**Key Features:**
- **Household Isolation via RLS:** Row-Level Security policies ensure users can only access their household's data.
- **Version Vector (CRDT-ready):** The `version_vector` column stores logical clocks for eventual consistency (not required for MVP, but supports offline-first at scale).
- **Transaction Log for Audit & Community Data:** Every change is logged, enabling replay, sales data aggregation, and fraud detection.
- **Notification Triggers:** PostgreSQL triggers or Edge Functions auto-generate notifications on state changes.
---
### 3. **Realtime Sync Layer (Supabase Realtime)**
**Responsibility:** Push changes from server to clients in real-time via websockets.
**Patterns:**
| Pattern | How It Works | When to Use |
|---------|------------|-----------|
| **Postgres Changes** | Subscribe to table changes (INSERT, UPDATE, DELETE). Server broadcasts to all subscribers. | Default for inventory items, transactions |
| **Broadcast** | Ephemeral messages for non-persistent state (typing indicators, user presence) | Optional: "User X is editing this item" |
| **Presence** | Track which users are currently online in a household | Nice-to-have: show "3 members active now" |
**Architecture:**
```
[Client A] --websocket--> [Realtime Server]
[Client B] --websocket--> [Realtime Server] <--logical replication (WAL)-- [Postgres]
[Client C] --websocket--> [Realtime Server]
When item added by Client A:
1. Client A sends REST POST /inventory_items
2. Realtime picks up the insert from Postgres's logical replication stream (WAL)
3. Realtime server broadcasts to all clients subscribed to that household
4. Clients B & C update local SQLite and UI in <500ms
```
**Limitations (Medium confidence from Supabase docs):**
- Realtime has connection limits per project (check Supabase pricing tier).
- If a client misses a message (network blip), no automatic replay.
- **Solution:** Periodic full-sync every 30-60s acts as a catch-up mechanism.
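A minimal supabase-js sketch of this pattern: a `postgres_changes` subscription filtered to one household, plus the periodic full-sync catch-up. The project URL/key are placeholders, and `applyChange` / `replaceLocalSnapshot` are hypothetical helpers that write into the client's local SQLite mirror (the Flutter app would use the `supabase_flutter` equivalents).
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient('https://YOUR-PROJECT.supabase.co', 'YOUR_ANON_KEY');

declare function applyChange(payload: unknown): void;                 // merge one change
declare function replaceLocalSnapshot(rows: unknown[]): Promise<void>; // merge full fetch

export function watchHousehold(householdId: string) {
  // 1. Live changes pushed over the websocket (Postgres Changes)
  const channel = supabase
    .channel(`household:${householdId}:inventory`)
    .on(
      'postgres_changes',
      {
        event: '*',
        schema: 'public',
        table: 'inventory_items',
        filter: `household_id=eq.${householdId}`,
      },
      (payload) => applyChange(payload), // payload.new / payload.old
    )
    .subscribe();

  // 2. Periodic full sync: catches anything missed during a network blip
  const timer = setInterval(async () => {
    const { data, error } = await supabase
      .from('inventory_items')
      .select('*')
      .eq('household_id', householdId);
    if (!error && data) await replaceLocalSnapshot(data);
  }, 45_000); // within the 30-60s window suggested above

  return () => {
    clearInterval(timer);
    supabase.removeChannel(channel);
  };
}
```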
---
### 4. **Barcode Scanning & Product Lookup**
**Responsibility:** Decode barcodes, find product info, cache results.
**Architecture:**
```
Barcode Scan (on device)
├─> ML Kit decodes barcode to UPC (instant, offline)
└─> Query local cache (SQLite)
├─> Hit: Return product name, category, est. expiry
└─> Miss: Query server cache
├─> Hit: Return & cache locally
└─> Miss: Query external API (Go-UPC, BarcodeAPI)
└─> Cache result + show "Add item" form
```
**Component Details:**
| Layer | Technology | Responsibility |
|-------|-----------|-----------------|
| **On-Device Barcode Decode** | Google ML Kit (`google_ml_kit` Flutter plugin) | Fast, offline, works with UPC/EAN/QR codes |
| **Local Cache (SQLite)** | `sqflite` with `barcode_cache` table | Last 10K lookups, keyed by UPC |
| **Server-Side Cache** | Postgres `barcode_cache` table | Shared across household (saves API calls) |
| **External Product DB** | Go-UPC, BarcodeAPI, EAN-Search (fallback) | Authoritative product data, images, nutrition info |
**Decision Logic:**
```dart
// Dart sketch: `localDb` is the sqflite handle, `supabase` the supabase_flutter
// client, and `goUpcApi` wraps the external product API.
Future<ProductInfo?> lookupBarcode(String upc) async {
  // 1. Try the local cache (instant, works offline)
  final local = await localDb.query(
    'barcode_cache',
    where: 'barcode = ?',
    whereArgs: [upc],
  );
  if (local.isNotEmpty) return ProductInfo.fromLocal(local.first);

  // 2. Try the server cache (previous lookups, shared across households)
  if (hasNetwork) {
    final remote = await supabase
        .from('barcode_cache')
        .select()
        .eq('barcode', upc)
        .maybeSingle(); // null instead of an error when there is no row
    if (remote != null) {
      await localDb.insert('barcode_cache', remote); // cache locally
      return ProductInfo.fromRemote(remote);
    }
  }

  // 3. Query the external API (rate-limited, may fail)
  if (hasNetwork && apiCredits > 0) {
    final external = await goUpcApi.lookup(upc);
    // Cache on the server so the next scan by any household is a cache hit
    await supabase.from('barcode_cache').insert({
      'barcode': upc,
      'product_name': external.name,
      // ... other fields
      'source': 'go_upc',
    });
    await localDb.insert('barcode_cache', external.toMap());
    return ProductInfo.fromExternal(external);
  }

  // 4. Give up and let the user enter the product manually
  return null;
}
```
**Scaling Implications:**
- External API calls cost money ($0.01-0.05 per lookup). Cache aggressively.
- At 10K households with 50 items each (500K items), barcode API would cost $5K-25K/month if not cached.
- **Solution:** Server-side cache reduces API calls by 80-90% (most households share products).
---
### 5. **Offline Sync & Conflict Resolution**
**Responsibility:** Queue local changes, sync when online, resolve conflicts.
**Offline-First Sync Flow:**
```
User is offline:
1. Add item "Milk" to inventory
└> Write to local SQLite immediately
- Mark as `synced=false, pending_id=uuid1`
2. User takes multiple actions (add bread, remove eggs, update quantity)
└> All queued in local `pending_changes` table
User comes back online:
3. App detects network (can reach backend)
└> Sync begins with exponential backoff
4. For each pending change:
a) Send to server: POST /inventory_items + pending_id
b) Server applies change + returns new version
c) If conflict: apply conflict resolution rule
d) Update local record: mark synced=true, store server version
5. Receive broadcast of other members' changes
└> Merge into local state (Last-Write-Wins by default)
```
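One way to make step 4 safe to retry is to let the client generate the row id up front (the `pending_id` above) and use an upsert keyed on it, so a change replayed after a failed or ambiguous request never creates a duplicate. A hedged sketch (column list abbreviated; `syncPendingAdd` is a hypothetical helper):
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient('https://YOUR-PROJECT.supabase.co', 'YOUR_ANON_KEY');

// The client generates the row id when the change is queued and stores it with
// the pending change. Replaying the same change later is then idempotent:
// upsert on the primary key cannot create a second row.
export async function syncPendingAdd(householdId: string, name: string, quantity: number) {
  const pendingId = crypto.randomUUID(); // persisted in the local pending_changes queue

  const { error } = await supabase.from('inventory_items').upsert(
    {
      id: pendingId,            // same id on every retry => no duplicate rows
      household_id: householdId,
      name,
      quantity,
    },
    { onConflict: 'id' },
  );
  if (error) throw error;       // leave queued; retry with backoff later
  return pendingId;
}
```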
**Conflict Resolution Strategies:**
| Strategy | Pros | Cons | When to Use |
|----------|------|------|-----------|
| **Server Wins** | Simple, no lost updates | Local work discarded | Settings, shared state |
| **Last-Write-Wins (LWW)** | Preserves data, deterministic | Requires accurate time sync, may lose recent edits | Quantity updates (last edit timestamp wins) |
| **Custom: Unit + User** | Domain-aware | Complex to implement | Quantity: "If both edited, sum the changes" |
| **CRDT** | No conflicts, eventual consistency | High complexity, storage overhead | Collaborative document editing (not MVP) |
**Recommendation for Sage (MVP):**
Use **Last-Write-Wins with server arbitration**:
```sql
-- When conflict detected:
-- Server has: item.quantity = 5, updated_at = 2026-01-27 10:00:00
-- Client submits: quantity = 3, with parent_version_id = old_version_id
-- If parent version matches current, apply update
UPDATE inventory_items SET
quantity = 3,
last_modified_by = auth.uid(),
updated_at = now()
WHERE id = item_id
AND parent_version_id = submitted_parent_version_id;
-- If conflict (parent mismatch), return conflict and let client choose:
-- Option A: Accept server state (discard local change)
-- Option B: Rebase local change on server state (re-apply user's delta)
-- Option C: Manual merge (show user both versions)
```
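On the client, the same check can be expressed with supabase-js by making the update conditional on the version the client last saw. This sketch assumes a plain integer `version` column (rather than the `version_vector` field in the schema above); zero affected rows signals a conflict.
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient('https://YOUR-PROJECT.supabase.co', 'YOUR_ANON_KEY');

// Optimistic update: only apply if the row still has the version the client
// last saw. Assumes an integer `version` column the server increments on write.
export async function updateQuantity(itemId: string, newQuantity: number, parentVersion: number) {
  const { data, error } = await supabase
    .from('inventory_items')
    .update({ quantity: newQuantity, version: parentVersion + 1 })
    .eq('id', itemId)
    .eq('version', parentVersion) // matches 0 rows if someone else updated first
    .select();

  if (error) throw error;
  if (!data || data.length === 0) {
    // Conflict: fetch the current server state and let the caller rebase or ask the user.
    const { data: current } = await supabase
      .from('inventory_items')
      .select('*')
      .eq('id', itemId)
      .single();
    return { conflict: true as const, serverRow: current };
  }
  return { conflict: false as const, serverRow: data[0] };
}
```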
---
### 6. **Notification System**
**Responsibility:** Trigger alerts for expiry, low stock, member activity; deliver via push/in-app/email.
**Architecture:**
```
Event (item expires, low stock detected)
├─> PostgreSQL Trigger (realtime trigger)
│ └─> INSERT notification row
├─> Supabase Realtime
│ └─> Broadcast to subscribed clients
└─> Edge Function (async job, on schedule)
├─> Query expiring items (SELECT * WHERE expiration_date <= today + 3 days)
├─> For each: generate notification if not already created
└─> Send push to users (FCM for Android, APNs for iOS)
```
**Trigger-Based Notifications (Real-time):**
```sql
-- Auto-generate notification when item expires
CREATE OR REPLACE FUNCTION notify_item_expired()
RETURNS TRIGGER AS $$
BEGIN
IF NEW.expiration_date = CURRENT_DATE
AND OLD.expiration_date IS DISTINCT FROM CURRENT_DATE THEN
-- Create notification for all household members
INSERT INTO notifications (
household_id, user_id, type, item_id,
title, message
)
SELECT
NEW.household_id,
hm.user_id,
'item_expired',
NEW.id,
'Item Expired',
NEW.name || ' has expired today'
FROM household_members hm
WHERE hm.household_id = NEW.household_id;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER item_expiry_trigger
AFTER UPDATE ON inventory_items
FOR EACH ROW EXECUTE FUNCTION notify_item_expired();
```
**Scheduled Notifications (Daily/Periodic):**
```typescript
// Supabase Edge Function (cron job), runs every day at 8am.
// `createClient` comes from @supabase/supabase-js; DB_URL / DB_KEY are supplied
// via environment config, and `sendPushNotifications` is the push helper.
import { createClient } from '@supabase/supabase-js';

export async function handler(_req: Request) {
  const supabase = createClient(DB_URL, DB_KEY);

  const today = new Date();
  const inThreeDays = new Date(today);
  inThreeDays.setDate(today.getDate() + 3);
  const asDate = (d: Date) => d.toISOString().slice(0, 10); // YYYY-MM-DD

  // Find items expiring in the next 0-3 days
  const { data: expiringItems } = await supabase
    .from('inventory_items')
    .select()
    .gte('expiration_date', asDate(today))
    .lte('expiration_date', asDate(inThreeDays));

  for (const item of expiringItems ?? []) {
    const household_id = item.household_id;
    const daysUntilExpiry = Math.max(
      0,
      Math.ceil((new Date(item.expiration_date).getTime() - today.getTime()) / 86_400_000),
    );

    // Skip if a notification for this item was already created today
    const { data: existing } = await supabase
      .from('notifications')
      .select()
      .eq('item_id', item.id)
      .gte('created_at', asDate(today));
    if (existing && existing.length > 0) continue;

    // Create the notification row
    await supabase.from('notifications').insert({
      household_id,
      item_id: item.id,
      type: 'item_expiring_soon',
      title: 'Item Expiring Soon',
      message: `${item.name} expires in ${daysUntilExpiry} days`,
    });

    // Send push to all household members
    await sendPushNotifications(household_id, {
      title: 'Item Expiring Soon',
      body: item.name,
    });
  }
}
```
**Push Notification Delivery:**
| Service | Responsibility | Configuration |
|---------|--------------|---|
| **FCM (Firebase Cloud Messaging)** | Deliver to Android devices | Requires Firebase project + server key |
| **APNs (Apple Push Notification)** | Deliver to iOS devices | Requires Apple Developer account + certificate |
| **In-App via Realtime** | Badge/banner inside app | Instant, no external service needed |
| **Email (optional)** | Daily/weekly digest | SendGrid or Supabase email (future) |
**Recommendation:** Start with **in-app only** (via Realtime broadcast), add push later when needed.
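For the in-app-only starting point, the client can simply subscribe to new rows in the `notifications` table addressed to the signed-in user. A supabase-js sketch (placeholder URL/key; the Flutter client would do the same with `supabase_flutter`):
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient('https://YOUR-PROJECT.supabase.co', 'YOUR_ANON_KEY');

// In-app notifications without any push service: listen for new rows in the
// `notifications` table addressed to the current user and render a banner.
export function listenForNotifications(userId: string, onNotify: (row: unknown) => void) {
  return supabase
    .channel(`notifications:${userId}`)
    .on(
      'postgres_changes',
      {
        event: 'INSERT',
        schema: 'public',
        table: 'notifications',
        filter: `user_id=eq.${userId}`,
      },
      (payload) => onNotify(payload.new),
    )
    .subscribe();
}

// Marking a notification as read is a plain update on read_at.
export async function markRead(notificationId: string) {
  await supabase
    .from('notifications')
    .update({ read_at: new Date().toISOString() })
    .eq('id', notificationId);
}
```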
---
### 7. **Barcode Community Data Collection (Future)**
**Responsibility:** Collect user-submitted barcode data safely, prevent spam.
**Architecture:**
```
User scans item not in database
└─> "Item not found. Help others by sharing details?"
├─> User enters: name, category, expiry estimate, store, price
└─> Submit to server:
POST /api/submit_product {
barcode: "123456",
product_name: "Organic Milk",
category: "Dairy",
expiration_estimate_days: 14,
store_name: "Whole Foods",
sale_price: 4.99
}
```
**Server-Side Processing:**
```typescript
export async function handleProductSubmission(req: Request) {
  // `supabase` (service-role client) and `user` (resolved from the request's
  // JWT) are assumed to be available in the surrounding Edge Function context.
  const body = await req.json();
  const { barcode, product_name, category, expiration_estimate_days } = body;

  // 1. Validate input (length, format, etc.)
  if (!barcode || barcode.length < 8) {
    return error('Invalid barcode');
  }

  // 2. Rate-limit per household (prevent spam)
  const oneHourAgo = new Date(Date.now() - 60 * 60 * 1000).toISOString();
  const { data: submissions } = await supabase
    .from('barcode_submissions')
    .select()
    .eq('household_id', user.household_id)
    .gte('created_at', oneHourAgo);
  if ((submissions?.length ?? 0) > 10) {
    return error('Too many submissions in the last hour');
  }

  // 3. Store with a low confidence score (unverified)
  await supabase.from('barcode_cache').upsert({
    barcode,
    product_name,
    source: 'user_submitted',
    confidence_score: 0.3, // low confidence until corroborated
    lookup_count: 0,
  });

  // 4. Queue for moderation review
  await supabase.from('barcode_submissions').insert({
    barcode,
    household_id: user.household_id,
    submitter_id: user.id,
    product_data: { product_name, category, expiration_estimate_days },
    status: 'pending_review',
  });

  return { success: true, message: 'Thank you for helping!' };
}
```
**Trust Model:**
- New submissions get `confidence_score = 0.3`.
- If 3+ different households agree on same data, raise to `0.7`.
- If data from external API, use `confidence_score = 0.95`.
- Client app uses score to decide: show as suggestions vs. auto-fill.
**Spam Prevention:**
- Rate-limit: 10 submissions/hour per household.
- Dedupe: Check if barcode already exists before storing.
- Moderation: Flag suspicious submissions (gibberish, profanity).
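A hedged sketch of the promotion rule ("3+ households agree"), run server-side with a service-role client. The agreement check on `product_data.product_name` is a simplification, and `maybePromoteBarcode` is a hypothetical helper invoked after each submission:
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient('https://YOUR-PROJECT.supabase.co', 'SERVICE_ROLE_KEY');

// Promote a crowd-sourced barcode entry once 3+ distinct households have
// submitted the same product name for it (simplified agreement check).
export async function maybePromoteBarcode(barcode: string, productName: string) {
  const { data: submissions, error } = await supabase
    .from('barcode_submissions')
    .select('household_id, product_data')
    .eq('barcode', barcode);
  if (error || !submissions) return;

  const agreeingHouseholds = new Set(
    submissions
      .filter((s) => s.product_data?.product_name?.toLowerCase() === productName.toLowerCase())
      .map((s) => s.household_id),
  );

  if (agreeingHouseholds.size >= 3) {
    await supabase
      .from('barcode_cache')
      .update({ confidence_score: 0.7 })
      .eq('barcode', barcode)
      .eq('source', 'user_submitted'); // never touch API-sourced entries
  }
}
```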
---
### 8. **File Storage (Receipts, Product Images)**
**Responsibility:** Store images with access control, optimize for mobile.
**Architecture:**
```
User captures receipt photo
├─> Compress on device (resize to 1200x900 max)
├─> Upload to Supabase Storage
│ POST /storage/v1/object/receipts/{household_id}/{uuid}.jpg
│ Headers: Authorization: Bearer {token}
└─> Storage RLS enforces household access:
"Only household members can access household_id/* objects"
```
**Supabase Storage RLS Policy:**
```sql
CREATE POLICY "household_can_access_own_receipts" ON storage.objects
FOR SELECT USING (
bucket_id = 'receipts'
AND (auth.uid() IN (
SELECT user_id FROM household_members
WHERE household_id = split_part(name, '/', 1)::uuid
))
);
```
**File Organization:**
```
receipts/
├── {household_id}/
│ ├── {uuid}.jpg -- receipt
│ ├── {uuid}_thumb.jpg -- thumbnail for list view
│ └── ...
```
**CDN & Image Optimization:**
- Supabase Storage includes global CDN (285+ cities).
- Built-in image transformation via `?width=400&quality=80` URL params.
- Use transformed images in list views (faster load).
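A supabase-js sketch of the upload-plus-thumbnail flow, assuming a private `receipts` bucket and the signed-URL `transform` options (available on plans with image transformation enabled):
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient('https://YOUR-PROJECT.supabase.co', 'YOUR_ANON_KEY');

// Upload an already-compressed receipt image under the household prefix, then
// request a downscaled variant for list views via the transform options.
export async function uploadReceipt(householdId: string, imageBytes: Blob) {
  const path = `${householdId}/${crypto.randomUUID()}.jpg`;

  const { error: uploadError } = await supabase.storage
    .from('receipts')
    .upload(path, imageBytes, { contentType: 'image/jpeg' });
  if (uploadError) throw uploadError;

  // Private bucket: signed URL with on-the-fly resize for the thumbnail.
  const { data, error } = await supabase.storage
    .from('receipts')
    .createSignedUrl(path, 60 * 60, { transform: { width: 400, quality: 80 } });
  if (error) throw error;

  return { path, thumbnailUrl: data.signedUrl };
}
```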
**Costs:**
- Storage: $5-25/month per 1TB.
- Bandwidth: Included with Supabase plan (egress charges extra if >10GB/month).
---
### 9. **AI/ML Prediction Service (Future)**
**Responsibility:** Estimate item expiry based on purchase date, category, storage conditions.
**Recommended Architecture:**
```
User adds item "Fresh Strawberries"
├─> Client makes prediction request:
│ POST /api/predict_expiry {
│ product_name: "Fresh Strawberries",
│ category: "Produce",
│ purchase_date: "2026-01-27",
│ storage_conditions: "refrigerator"
│ }
└─> Server-side ML:
├─> Load pre-trained model (small, ~5MB)
│ (e.g., XGBoost or PyTorch mobile)
├─> Feature engineering:
│ - Days since purchase
│ - Category (produce: 3-7 days, dairy: 10-21 days)
│ - Storage type (fridge extends shelf life 2-3x)
│ - Historical data from transactions table
├─> Predict: "14 days expected shelf life"
│ └─> Return confidence interval: [10, 18] days
└─> Update item.expiration_date + confidence_score
```
**Model Location Options:**
| Location | Pros | Cons | Recommendation |
|----------|------|------|---|
| **Server (Edge Function)** | Easy to update, centralized | Adds latency (100-300ms), requires network | **Default for MVP** |
| **On-Device (TensorFlow Lite)** | Instant, works offline, privacy | Model size (5-20MB), requires device resources | Later versions |
| **Hybrid** | Best of both | Complex to manage | Phase 2 optimization |
**Data Requirements:**
- Historical transactions with `type='consume'` to learn when users actually eat items.
- At scale (10K+ transactions), model trains weekly to adapt to household preferences.
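As a concrete placeholder for the server-side option, the Edge Function sketch below uses a simple category/storage heuristic in place of the trained model; the shelf-life table values are illustrative only.
```typescript
// Supabase Edge Function sketch: a category-based shelf-life baseline that a
// trained model would later replace. Lookup values are illustrative, not data.
const SHELF_LIFE_DAYS: Record<string, [number, number]> = {
  produce: [3, 7],
  dairy: [10, 21],
  meat: [2, 5],
  pantry: [90, 365],
};

Deno.serve(async (req: Request) => {
  const { category = 'pantry', storage_conditions = 'pantry', purchase_date } = await req.json();

  const [low, high] = SHELF_LIFE_DAYS[category.toLowerCase()] ?? [7, 30];
  // Refrigeration roughly extends shelf life (per the pipeline description above).
  const factor = storage_conditions === 'refrigerator' ? 2 : 1;

  const days = Math.round(((low + high) / 2) * factor);
  const expiration = new Date(purchase_date ?? Date.now());
  expiration.setDate(expiration.getDate() + days);

  return new Response(
    JSON.stringify({
      expected_shelf_life_days: days,
      confidence_interval_days: [low * factor, high * factor],
      expiration_date: expiration.toISOString().slice(0, 10),
    }),
    { headers: { 'Content-Type': 'application/json' } },
  );
});
```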
---
## Data Flow: Three Key Scenarios
### Scenario 1: User Adds Item Offline, Syncs Online
```
Timeline:
[OFFLINE - no network]
10:00 User adds "Milk" via barcode scan
├─> Local lookup finds product in cache
├─> Write to local SQLite: inventory_items (synced=false)
├─> Write to pending_changes queue
└─> Show "Syncing..." indicator
[NETWORK RESTORED]
10:05 Sync engine detects network
├─> Dequeue pending_changes
├─> POST /inventory_items {item: {name, barcode, quantity, ...}}
├─> Server processes: INSERT into inventory_items
├─> Server broadcasts via Realtime: CHANNEL household:123:inventory
└─> All clients receive update, merge into local DB
10:06 User's app receives Realtime broadcast
├─> Merge remote item into SQLite
├─> Update pending_changes: mark synced=true
├─> Mark local record with server_id
└─> UI updates (no duplicate entry)
Other household members:
├─> Receive Realtime broadcast
├─> "Dad added Milk"
├─> Notification generated: "John added Milk"
└─> In-app notification + optional push
```
### Scenario 2: Concurrent Edits (User A & B both remove from quantity)
```
State: item.quantity = 5
User A (offline):
10:00 ├─> Remove 2 units: quantity = 3
├─> Write to local: {quantity: 3, version: A}
├─> Queue for sync
User B (online):
10:01 ├─> Remove 1 unit: quantity = 4 (from current server state)
├─> POST /inventory_items/update (quantity=4, version=5)
├─> Server applies: quantity = 4, version = 6
User A (comes online):
10:05 ├─> Sync: POST /inventory_items/update {quantity=3, parent_version=5}
├─> Server: parent_version=5 is stale (current version is 6)
├─> Result under plain LWW: update applied anyway, quantity = 3, version = 7
└─> Last-Write-Wins: A's edit (later timestamp) wins
Server state: 3 units — A's absolute write overwrote the row, so B's removal is effectively lost
```
**NOTE:** This is lossy (B's edit partially lost). Better approach:
```
Use delta/operational transform:
- User B: "subtract 1 from quantity"
- User A: "subtract 2 from quantity"
- Server merges both deltas: subtract 3 total
- Result: quantity = 2 (both edits preserved)
```
This requires more sophisticated conflict resolution (CRDT-style). **Save for Phase 2.**
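When that phase arrives, the client-side change is small: send a delta instead of an absolute quantity and apply it atomically in the database. The sketch below calls a hypothetical Postgres function `apply_quantity_delta` (roughly `SET quantity = GREATEST(quantity + delta, 0)`) via RPC:
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient('https://YOUR-PROJECT.supabase.co', 'YOUR_ANON_KEY');

// Delta-style update: instead of writing an absolute quantity, the client sends
// "subtract N". The (hypothetical) apply_quantity_delta function applies the
// delta atomically on the server, so User A's -2 and User B's -1 both survive
// regardless of the order in which they sync.
export async function changeQuantity(itemId: string, delta: number) {
  const { data, error } = await supabase.rpc('apply_quantity_delta', {
    p_item_id: itemId,
    p_delta: delta,
  });
  if (error) throw error;
  return data; // e.g. the new quantity returned by the function
}
```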
### Scenario 3: Barcode Not Found, User Contributes Data
```
User scans "012345678901"
├─> Local cache: miss
├─> Server cache: miss
├─> External API (go-upc): miss (rare barcode)
├─> UI shows: "Not found. Would you like to add it?"
User confirms + enters:
├─> Name: "Artisanal Kombucha"
├─> Category: "Beverages"
├─> Expiry: "Keep refrigerated, 30 days"
├─> Store: "Whole Foods"
├─> Price: "$5.99"
└─> POST /api/submit_product {barcode, ...}
Server processes:
├─> Validate input + rate-limit check
├─> INSERT into barcode_cache (confidence_score=0.3, source='user_submitted')
├─> Queue for moderation review
└─> Return: "Thank you! Data recorded."
Next time ANY household member scans same barcode:
├─> Server cache hit (confidence=0.3)
├─> Show as suggestion: "Artisanal Kombucha?" with low-confidence indicator
├─> User can accept, edit, or ignore
```
---
## Build Order Implications
### Phase 1: MVP Foundations (Weeks 1-4)
**Builds these first:**
1. **Flutter App Scaffolding** + local SQLite
- Barcode scanning (ML Kit integration)
- Local cache layer
- Offline-capable UI
2. **Supabase Backend**
- Households + members table
- Inventory_items table with RLS
- Authentication (email/password)
3. **Basic Sync**
- REST API (add/remove/edit items)
- Realtime listeners (websocket to SQLite)
- Simple Last-Write-Wins conflict resolution
**Test:** Single household, 2 users, add/remove items, offline → online sync.
---
### Phase 2: Polish + Scale (Weeks 5-8)
**Builds on Phase 1:**
4. **Barcode Lookup Service**
- Server-side cache integration
- External API fallback (Go-UPC)
- Local caching strategy
5. **Notification System**
- Expiry alerts (trigger-based)
- Low-stock alerts (periodic)
- In-app notifications (Realtime broadcast)
6. **File Storage**
- Receipt image uploads
- Supabase Storage RLS
**Test:** Full inventory workflow (scan → add → receive alerts → sync across 3+ devices).
---
### Phase 3: Community + Intelligence (Weeks 9-12)
**Builds on Phases 1-2:**
7. **Community Barcode Data**
- User submissions
- Spam prevention
- Confidence scoring
8. **Prediction Service**
- Server-side ML (Edge Function)
- Expiry estimation
- Historical learning
9. **Sales Data Collection**
- Aggregate community prices
- Query for price alerts
**Test:** Households contribute data, predictions improve over time.
---
## Scalability Thresholds
### At 100 Households (~500 users, 5K items)
| Component | Threshold | Mitigation |
|-----------|-----------|-----------|
| **SQLite local DB** | ~500 items | Partition by household, use indexes |
| **Supabase Realtime** | <100 concurrent connections | Realtime works fine at this scale |
| **Postgres** | <1M rows total | Standard PostgreSQL handles easily |
| **Barcode cache hits** | 70-80% | Most households share products |
| **External API calls** | ~500/day | Budget: <$10/month |
**Status:** ✅ No changes needed, standard patterns work.
---
### At 10K Households (~50K users, 500K items)
| Component | Threshold | Mitigation |
|-----------|-----------|-----------|
| **Postgres queries** | Queries slow (>500ms) | Add indexes on (household_id, barcode) |
| **Realtime connections** | May hit limits (check Supabase tier) | Consider connection pooling or batching |
| **Barcode cache** | 1M+ cache entries | Archive old entries (unused >90 days) |
| **File storage** | ~10TB images | Implement auto-cleanup (delete old receipts) |
| **Notification generation** | 500K expiry checks/day | Move from triggers to batch job (Edge Function) |
| **External API costs** | ~5K calls/day | At 90% cache hit: ~500 calls/day (~$5-25/month) |
**Status:** ⚠️ Some optimization needed:
1. **Add database indexes:**
```sql
CREATE INDEX idx_items_household_barcode ON inventory_items(household_id, barcode);
CREATE INDEX idx_items_expiration ON inventory_items(household_id, expiration_date);
CREATE INDEX idx_transactions_household ON transactions(household_id, created_at);
```
2. **Batch notification generation** (not trigger-based):
```typescript
// Run once daily via cron (Edge Function): one scan over expiring items per day
// instead of a trigger firing on every inventory_items UPDATE.
```
3. **Archive old barcode cache:**
```sql
-- Monthly job: delete unused barcodes
DELETE FROM barcode_cache WHERE lookup_count = 0 AND last_updated < now() - INTERVAL '90 days';
```
---
### At 100K+ Households (Beyond MVP)
| Component | Threshold | Mitigation |
|-----------|-----------|-----------|
| **Postgres single instance** | ~10M rows, reaching limits | Read replicas + connection pooling |
| **Realtime server capacity** | 10K+ concurrent connections | May require dedicated Realtime cluster |
| **Storage costs** | ~100TB | Move to S3 with Supabase S3 integration |
| **Data replication** | Async sync issues | Consider CRDT or event-sourced architecture |
**Status:** ❌ Significant re-architecture needed. Out of scope for MVP/Phase 1-2.
---
## Self-Hosted Deployment Strategy
### Docker Compose Stack (for self-hosting)
If deploying on-premises instead of Supabase cloud:
```yaml
# Simplified illustration — use the official Supabase self-hosting compose file
# (linked in Sources) as the authoritative reference for images and environment variables.
version: '3.8'
services:
# PostgreSQL database
postgres:
image: postgres:15
environment:
POSTGRES_PASSWORD: ${DB_PASSWORD}
volumes:
- postgres_data:/var/lib/postgresql/data
ports:
- "5432:5432"
# Supabase API (PostgREST)
postgrest:
image: postgrest/postgrest:v11
environment:
PGRST_DB_URI: postgres://user:password@postgres:5432/sage
PGRST_JWT_SECRET: ${JWT_SECRET}
ports:
- "3000:3000"
depends_on:
- postgres
# Supabase Realtime (WebSocket server)
realtime:
image: supabase/realtime:latest
environment:
DATABASE_URL: postgres://user:password@postgres:5432/sage
ports:
- "4000:4000"
depends_on:
- postgres
# Auth service (Supabase Auth)
auth:
image: supabase/auth:v2
environment:
POSTGRES_PASSWORD: ${DB_PASSWORD}
JWT_SECRET: ${JWT_SECRET}
SITE_URL: https://yourdomain.com
ADDITIONAL_REDIRECT_URLS: https://yourdomain.com/callback
ports:
- "9999:9999"
depends_on:
- postgres
# Storage (S3-compatible, e.g., MinIO)
storage:
image: minio/minio:latest
environment:
MINIO_ROOT_USER: minioadmin
MINIO_ROOT_PASSWORD: ${MINIO_PASSWORD}
volumes:
- minio_data:/minio_data
ports:
- "9000:9000"
command: minio server /minio_data
# Nginx reverse proxy
nginx:
image: nginx:latest
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
ports:
- "80:80"
- "443:443"
depends_on:
- postgrest
- realtime
- auth
volumes:
postgres_data:
minio_data:
```
**Deployment Notes:**
- **Infrastructure:** VPS (4GB RAM, 20GB disk minimum) costs $10-30/month.
- **Complexity:** Self-hosting requires DevOps expertise (backups, monitoring, updates).
- **Recommendation:** Start with Supabase cloud (free tier), migrate to self-hosted if cost becomes issue.
---
## Anti-Patterns to Avoid
### 1. **Syncing Every Field Change Individually**
❌ **Bad:**
```typescript
// Don't do this: 5 separate HTTP requests
await api.updateQuantity(item_id, 5);
await api.updateExpiry(item_id, '2026-02-01');
await api.updateNotes(item_id, 'From Costco');
await api.updatePrice(item_id, 12.99);
await api.updateCategory(item_id, 'Dairy');
```
✅ **Good:**
```typescript
// Single request with all changes
await api.updateItem(item_id, {
quantity: 5,
expiration_date: '2026-02-01',
notes: 'From Costco',
purchase_price: 12.99,
category: 'Dairy'
});
```
**Why:** A single request means less network overhead, fewer merge conflicts, and faster sync.
---
### 2. **Ignoring Offline Scenarios**
❌ **Bad:**
```typescript
// Assumes network always available
async function addItem(item) {
await api.post('/items', item);
notifyUser('Item added');
}
```
✅ **Good:**
```typescript
// Works offline or online
async function addItem(item) {
// Write locally first
await localDb.insert(item);
notifyUser('Item added');
// Sync in background
if (hasNetwork) {
try {
const response = await api.post('/items', item);
await localDb.update(item.id, { server_id: response.id, synced: true });
} catch (err) {
// Will retry next time online
}
}
}
```
**Why:** Mobile networks are unreliable, users expect offline functionality.
---
### 3. **Relying on RLS Alone (No Explicit Household Filter)**
❌ **Critical Security Bug:**
```typescript
// What if RLS is disabled by accident?
const items = await supabase
.from('inventory_items')
.select(); // Returns ALL items, all households!
```
✅ **Good:**
```typescript
// RLS enforces this, but be explicit anyway
const items = await supabase
.from('inventory_items')
.select()
.eq('household_id', user.current_household_id);
```
**Why:** Defense in depth. Assume RLS has bugs, code defensively.
---
### 4. **Barcode API Without Caching**
❌ **Bad (expensive):**
```typescript
// Every lookup hits external API
async function scanBarcode(upc) {
return await goUpcApi.lookup(upc); // $0.01 per call
}
```
✅ **Good:**
```typescript
// 3-tier caching
async function scanBarcode(upc) {
// Local first
let product = await localCache.get(upc);
if (product) return product;
// Server second (shared across household)
product = await serverCache.get(upc);
if (product) {
await localCache.set(upc, product);
return product;
}
// External only if miss
product = await externalApi.lookup(upc);
await serverCache.set(upc, product);
await localCache.set(upc, product);
return product;
}
```
**Why:** External API calls are expensive and slow. 90% of scans are repeats.
---
### 5. **No Expiration Policy on Notifications**
❌ **Bad (data bloat):**
```sql
-- Notification table grows forever
SELECT COUNT(*) FROM notifications;
-- Returns: 50 million rows after 1 year
```
✅ **Good:**
```sql
-- Auto-delete old notifications. Note: the trigger function must be created
-- first and must return `trigger` (not `void`).
CREATE FUNCTION delete_old_notifications()
RETURNS trigger AS $$
BEGIN
  DELETE FROM notifications
  WHERE created_at < NOW() - INTERVAL '30 days';
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cleanup_old_notifications
  AFTER INSERT ON notifications
  FOR EACH STATEMENT EXECUTE FUNCTION delete_old_notifications();

-- (Alternatively, run the DELETE on a schedule with pg_cron instead of a trigger.)
```
**Why:** Notifications are transient, archiving them indefinitely wastes storage.
---
### 6. **Conflict Resolution Without Version Tracking**
❌ **Bad:**
```typescript
// Two clients update same field, no way to detect conflict
user_a.item.quantity = 5;
user_b.item.quantity = 5;
// Both send updates, last write wins, no awareness of conflict
```
✅ **Good:**
```typescript
// Track version numbers or timestamps
{
item_id: "abc123",
quantity: 5,
version: 10, // Server increments this
updated_at: "2026-01-27T10:00:00Z"
}
// Client includes version in update request
PUT /items/abc123 { quantity: 4, version: 10 }
// Server checks: "Is version still 10?"
// If yes: apply update, increment version to 11
// If no: conflict! Return conflict response
```
**Why:** Without versions, conflicts are silent. Users lose data unknowingly.
---
## Architecture Checklist for Sage
- [ ] **Authentication:** Implement household-aware auth (user can be in multiple households, select one per session)
- [ ] **RLS Policies:** Test that users cannot access other households' data
- [ ] **Barcode Scanning:** Integrate ML Kit on device, implement 3-tier cache
- [ ] **Realtime Sync:** Test websocket reconnection, missed message catch-up
- [ ] **Offline Queue:** Persist pending changes to SQLite, retry on network
- [ ] **Conflict Resolution:** Implement Last-Write-Wins as default, log conflicts for analysis
- [ ] **Notifications:** Trigger-based for real-time events, scheduled for periodic checks
- [ ] **File Storage:** Set up Supabase Storage with RLS, implement image compression
- [ ] **Error Handling:** Graceful degradation when API fails (show cached data, queue for later)
- [ ] **Monitoring:** Log sync errors, API failures, conflict rates to detect issues
---
## Sources
- [Supabase Realtime Documentation](https://supabase.com/docs/guides/realtime)
- [Listening to Postgres Changes with Flutter](https://supabase.com/docs/guides/realtime/realtime-listening-flutter)
- [Supabase Storage Documentation](https://supabase.com/docs/guides/storage)
- [Multi-Tenant Database Architecture Patterns](https://www.bytebase.com/blog/multi-tenant-database-architecture-patterns-explained/)
- [Designing Postgres Database for Multi-tenancy](https://www.crunchydata.com/blog/designing-your-postgres-database-for-multi-tenancy)
- [Offline-First App Architecture](https://docs.flutter.dev/app-architecture/design-patterns/offline-first)
- [Three Approaches to Offline-First Development](https://academy.realm.io/posts/three-approaches-offline-first-development/)
- [Barcode Scanning Best Practices](https://www.scandit.com/blog/make-barcode-scanner-app-performant/)
- [Go-UPC Barcode Database](https://go-upc.com/)
- [Barcode Lookup API](https://www.barcodelookup.com/api)
- [Conflict-Free Replicated Data Types (CRDTs)](https://crdt.tech/)
- [Last-Write-Wins vs CRDTs](https://dzone.com/articles/conflict-resolution-using-last-write-wins-vs-crdts)
- [Self-Hosting Supabase with Docker](https://supabase.com/docs/guides/self-hosting/docker)
- [Notification System Design Best Practices](https://www.systemdesignhandbook.com/guides/design-a-notification-system/)
- [Push vs In-App Notifications 2026](https://www.pushengage.com/push-vs-in-app-notifications/)
- [Food Expiration Prediction ML](https://www.sciencedirect.com/science/article/abs/pii/S0924224425001256)
- [Artificial Intelligence for Food Safety](https://www.sciencedirect.com/science/article/pii/S0924224425002894)