# Architecture Research: Vivi Speech Translator

## Overview

Vivi Speech Translator is a Discord bot that detects emoji-based messages proxied by PluralKit, parses emoji sequences, looks up their meanings in a persistent global dictionary, and replies with natural language translations. The bot must operate across multiple servers, handle both channel and DM messages, and learn new emoji meanings over time.

This document outlines the recommended high-level architecture, component responsibilities, data flows, and scaling strategies.

---

## Core Components

### 1. Discord Client

**Responsibility:** Establish and maintain the connection to Discord's API and WebSocket.

**Key Details:**
- Uses `discord.Client` or `discord.ext.commands.Bot` from the discord.py library
- Requires `Intents` configuration to specify which events the bot listens for:
  - `message_content` intent: Required to read message text (privileged; must be enabled in the Developer Portal, and verified bots need Discord's approval)
  - `guilds` intent: Track guild membership and changes
  - `guild_messages` intent: Receive message events in guild channels
  - `dm_messages` intent: Receive message events in DMs
- Initializes on startup and runs the main event loop via `client.run(token)`
- Handles connection failures and automatic reconnection

**Why This Matters:** Discord's event-driven architecture means the Client is the foundation—without it, the bot cannot receive any messages or respond to events.

---

### 2. Message Event Handler

**Responsibility:** Receive all messages, filter for relevance, and route to downstream processors.

**Key Details:**
- Implements `on_message` event in discord.py (async callback)
- Filters for:
  1. **Webhook Detection:** Check if `message.webhook_id` is not None (indicates a webhook message, which is how PluralKit delivers proxied messages)
  2. **PluralKit Verification:** Query the PluralKit API to confirm the message was proxied by PluralKit (not another webhook system)
  3. **Vivi Detection:** Check if `member.id` in the PluralKit response matches Vivi's registered member ID
  4. **Bot Self-Filter:** Ignore messages from the Vivi Speech Translator bot itself (a cheap local check that can run before the API query)
- Routes confirmed Vivi messages to the Emoji Parser
- Handles both guild channels and DMs

**PluralKit Detection Approach:**
When a message is received, the bot can query the PluralKit API using the message ID:
```
GET https://api.pluralkit.me/v2/messages/{message_id}
```
This returns a Message object containing:
- `member`: The member object that proxied the message (contains member_id, name, avatar, etc.)
- `sender`: The Discord user ID of the account that sent the original message (the account owner)
- `system`: The system that manages the members
- `timestamp`: When the message was sent
- `guild`: The guild ID where the message was sent
- `channel`: The channel ID where the message was sent

By checking if `response.member.id == vivi_member_id`, the bot can verify Vivi specifically sent the message.
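
The filter order described above can be sketched as a small function that runs the cheap local checks before the network call. This is a minimal sketch under stated assumptions: `ProxiedMessage` is a hypothetical stand-in for the few `discord.Message` fields used here, and `lookup` is a hypothetical coroutine that returns the parsed PluralKit response, or `None` when PluralKit does not know the message.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class ProxiedMessage:
    """Hypothetical stand-in for the discord.Message fields used here."""
    id: int
    author_id: int
    webhook_id: Optional[int]


# Hypothetical lookup signature: message ID -> PluralKit JSON dict, or None.
Lookup = Callable[[int], Awaitable[Optional[dict]]]


async def is_vivi_message(
    msg: ProxiedMessage, bot_user_id: int, vivi_member_id: str, lookup: Lookup
) -> bool:
    # Cheap local checks first, so most messages never trigger an API call.
    if msg.author_id == bot_user_id:
        return False  # the bot's own reply
    if msg.webhook_id is None:
        return False  # ordinary user message, cannot be a proxy
    data = await lookup(msg.id)
    if data is None:
        return False  # a webhook, but not one PluralKit knows about
    member = data.get("member") or {}  # tolerate a null "member" field
    return member.get("id") == vivi_member_id
```

Passing the lookup in as a parameter keeps the filter testable without a network connection; in the bot it would wrap the `GET /v2/messages/{message_id}` call.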

**Rate Limiting:** PluralKit API has a 10/second rate limit for message lookups. The bot should handle rate limit responses gracefully with exponential backoff.
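
One way to handle rate-limit responses is a retry wrapper with exponential backoff. The sketch below is illustrative only: `RateLimited` is a hypothetical exception that the bot's own HTTP wrapper would raise on an HTTP 429 response, and the delay values are placeholders to tune.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by the caller's HTTP wrapper on HTTP 429."""


async def with_backoff(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide what to do
            # The delay doubles on each attempt and is capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent message handlers.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

If the API response carries a retry hint, sleeping for that duration instead of a computed delay is usually the better choice.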

---

### 3. Emoji Parser

**Responsibility:** Extract and categorize emojis from a message into a structured sequence.

**Key Details:**
- Receives the confirmed Vivi message text from the Message Event Handler
- Uses regex patterns to extract:
  1. **Unicode Emojis:** Standard emoji characters (😷, ❌, etc.)
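     - Pattern: `\p{Extended_Pictographic}` (a Unicode property class; Python's built-in `re` does not support `\p{...}`, so this needs the third-party `regex` module)
     - Alternative for the built-in `re` module: match code-point ranges directly, e.g. `[\U0001F300-\U0001FAFF\u2600-\u27BF]\ufe0f?` (an approximation of the emoji ranges; surrogate-pair patterns such as `\ud83d[\udc00-\udfff]` are a UTF-16 idiom and do not match Python 3 strings)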
  2. **Custom Server Emojis:** Discord custom emoji format `<:emoji_name:emoji_id>` or `<a:emoji_name:emoji_id>` (for animated)
     - Pattern: `<a?:[^:\s]+:\d+>`
- Preserves order of emojis as they appear left-to-right
- Returns a structured list like: `[{type: "emoji", value: "😷", id: None}, {type: "custom", value: "me1", id: "123456789"}]`
- Handles edge cases:
  - Emoji skin tone modifiers
  - Zero-width joiners (ZWJ sequences like family emojis)
  - Emoji variations
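
The parsing steps above can be sketched with the standard-library `re` module. This is a rough sketch, not a complete emoji grammar: the code-point ranges are an approximation chosen for illustration, and the function name `parse_emojis` is an assumption. It treats keycaps (such as 2️⃣), flag pairs, skin-tone modifiers, and ZWJ sequences as single tokens.

```python
import re

# Approximate building blocks (assumed ranges, not the full Unicode emoji set).
_KEYCAP = r"[0-9#*]\ufe0f?\u20e3"               # e.g. "2" + VS16 + keycap
_FLAG = r"[\U0001F1E6-\U0001F1FF]{2}"           # pair of regional indicators
_UNIT = r"[\U0001F300-\U0001FAFF\u2600-\u27BF][\U0001F3FB-\U0001F3FF]?\ufe0f?"
_ZWJ_SEQ = _UNIT + r"(?:\u200d" + _UNIT + r")*"  # units joined by ZWJ
_UNICODE = "(?:" + "|".join([_KEYCAP, _FLAG, _ZWJ_SEQ]) + ")"

# Groups: 1 = animated flag, 2 = custom name, 3 = custom ID, 4 = Unicode emoji.
EMOJI_RE = re.compile(r"<(a?):(\w+):(\d+)>|(" + _UNICODE + ")")


def parse_emojis(text: str) -> list:
    """Return emoji tokens in left-to-right order."""
    tokens = []
    for match in EMOJI_RE.finditer(text):
        if match.group(2):  # custom emoji: <:name:id> or <a:name:id>
            tokens.append(
                {"type": "custom", "value": match.group(2), "id": match.group(3)}
            )
        else:
            tokens.append({"type": "emoji", "value": match.group(4), "id": None})
    return tokens
```

Non-emoji text between tokens is skipped, so a message that mixes words and emojis still yields an ordered token list.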

**Why This Order Matters:** The project spec notes that emoji sequences are compositional and context-dependent. Preserving order and distinguishing types allows the Translation Engine to understand the full intended meaning.

---

### 4. Translation Engine

**Responsibility:** Convert emoji sequences into natural language using the emoji dictionary.

**Key Details:**
- Receives structured emoji list from Emoji Parser
- For each emoji:
  1. Look up its meaning in the Emoji Dictionary (database)
  2. Handle three cases:
     - **Known emoji:** Include its meaning in output
     - **Unknown emoji:** Display the emoji itself with a placeholder or skip
     - **Custom emoji:** Look up by custom emoji ID in database
- Generates natural language output:
  - If all emojis are known: Compose as a sentence ("Vivi is sick, but not in the sinuses")
  - If some are unknown: Format as: "Known meanings: ... [Unknown emoji] ..."
  - If none are known: Reply: "I don't know what these emojis mean yet. You can teach me with the `/teach` command."
- Considers emoji context (e.g., combination of emojis might have a specific meaning)
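
The three output cases above can be sketched as follows. This is a deliberately simple sketch: it joins meanings with spaces rather than composing a real sentence, and it assumes (hypothetically) that Unicode emojis are keyed by their characters and custom emojis by their ID in a plain `meanings` mapping that stands in for the database.

```python
from typing import Dict, List, Optional

UNKNOWN_ALL = (
    "I don't know what these emojis mean yet. "
    "You can teach me with the `/teach` command."
)


def _display(token: dict) -> str:
    # Show custom emojis by name, Unicode emojis as themselves.
    return f":{token['value']}:" if token["type"] == "custom" else token["value"]


def translate(tokens: List[dict], meanings: Dict[str, str]) -> Optional[str]:
    """Return reply text, or None when the message contains no emojis."""
    if not tokens:
        return None
    parts = []
    known = 0
    for token in tokens:
        key = token["id"] if token["type"] == "custom" else token["value"]
        meaning = meanings.get(key)
        if meaning is None:
            parts.append(f"[{_display(token)}]")  # placeholder for unknown emoji
        else:
            parts.append(meaning)
            known += 1
    if known == 0:
        return UNKNOWN_ALL
    text = " ".join(parts)
    return text if known == len(tokens) else f"Known meanings: {text}"
```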

**Output Format:** The bot should reply in a Discord message, either in the same channel (if public) or as a DM (if DM context).

---

### 5. Database Layer

**Responsibility:** Store and retrieve persistent data (emoji dictionary and server configurations).

**Key Details:**
- **Tech Stack:** SQLAlchemy ORM with PostgreSQL for production reliability
- **Async Support:** Use `sqlalchemy.ext.asyncio` or `asyncpg` to avoid blocking the Discord event loop
- **Initialization:** Override `Bot.start()` or use a `setup_hook` to connect to database on startup
- **Connection Pooling:** Configure connection pool to handle concurrent requests from message handlers

**Two Core Tables:**

1. **emoji_dictionary**
   - `emoji_string` (TEXT, PRIMARY KEY): The emoji character(s) or custom emoji format
   - `custom_emoji_id` (BIGINT, NULLABLE): Discord custom emoji ID (if custom emoji)
   - `meaning` (TEXT): The learned meaning
   - `created_at` (TIMESTAMP): When first learned
   - `updated_at` (TIMESTAMP): Last update time
   - `updated_by_user_id` (BIGINT, NULLABLE): User ID of who taught/corrected this
   - `updated_by_member_id` (TEXT, NULLABLE): PluralKit member ID (e.g., Vivi's ID)
   - `created_in_guild` (BIGINT, NULLABLE): Guild ID where first learned (for tracking origin, optional)
   - Indexes: `emoji_string` is indexed automatically as the primary key; add a secondary index on `custom_emoji_id` for custom emoji queries

2. **server_configuration**
   - `guild_id` (BIGINT, PRIMARY KEY): Discord server ID
   - `auto_translate` (BOOLEAN, DEFAULT TRUE): Auto-translate all Vivi messages or require `/translate` command
   - `created_at` (TIMESTAMP): When the server config was created
   - `updated_at` (TIMESTAMP): Last update time
   - Updated by the Command Handler's `/config` command

**Important Design Decisions:**
- **Global dictionary:** Emoji meanings are shared across all servers. Different systems can update meanings, but there's a single source of truth per emoji.
- **Per-server config:** Each server has its own settings (auto vs. on-demand mode).
- **User attribution:** Track who taught each emoji for transparency and conflict resolution.
- **No per-server emoji variants:** The spec intends a global dictionary, so "😷" means the same thing everywhere. Per-server overrides could be added later if needed.

---

### 6. Command Handler (Teaching & Configuration)

**Responsibility:** Process bot commands for teaching emojis and configuring server behavior.

**Key Details:**
- **Tech:** discord.py `commands.Cog` classes for modular command organization (slash commands are defined through `discord.app_commands`)
- **Commands to Implement:**

  1. `/teach <emoji_sequence> <meaning>`
     - Extract emojis from the sequence using the Emoji Parser
     - Insert or update each emoji in the database
     - Confirm: "Learned: 😷 = sick, 2️⃣ = two, etc."
     - Only Vivi or approved users should be able to teach (can be restricted by user role or system authentication)

  2. `/forget <emoji>`
     - Delete emoji from dictionary
     - Confirm deletion

  3. `/meaning <emoji>`
     - Look up and reply with the meaning of a specific emoji
     - If unknown, reply: "I don't know that one yet."

  4. `/config auto-translate <on|off>`
     - Update `server_configuration.auto_translate` in database
     - Only server admins can change this
     - Requires guild context (won't work in DMs)

  5. `/translate <emoji_sequence>` (On-demand mode)
     - Manually trigger translation of an emoji sequence
     - Works in both channels and DMs

- **Error Handling:**
  - Graceful failures if database is unavailable
  - Clear user feedback for invalid emoji sequences
  - Require proper permissions for sensitive commands
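
The permission requirement for teaching can be reduced to a small pure function that a command checks before touching the database. This is a sketch under assumptions: the allowlists and the function name `can_teach` are hypothetical, and the role IDs would come from the invoking member in guild context.

```python
from typing import Iterable, Set


def can_teach(
    user_id: int,
    role_ids: Iterable[int],
    allowed_users: Set[int],
    allowed_roles: Set[int],
) -> bool:
    """Allow teaching for allowlisted users or holders of an allowlisted role."""
    if user_id in allowed_users:
        return True
    # In DMs there are no roles, so role_ids is empty and only the
    # user allowlist applies.
    return any(role_id in allowed_roles for role_id in role_ids)
```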

---

### 7. Configuration Layer

**Responsibility:** Load bot configuration (token, database connection string, etc.) at startup.

**Key Details:**
- Use environment variables for secrets: `DISCORD_TOKEN`, `DATABASE_URL`, `PLURALKIT_TOKEN` (optional, for user-specific API calls)
- Configuration file (e.g., `config.json` or `.env`) for non-secret settings:
  - Vivi's member ID (to filter for her messages specifically)
  - Default auto-translate mode
  - Logging level
- Initialize Config before starting the bot
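
A minimal sketch of loading this configuration at startup follows. For brevity it reads every setting from environment variables; `VIVI_MEMBER_ID`, `AUTO_TRANSLATE_DEFAULT`, and `LOG_LEVEL` are hypothetical variable names chosen for this example, as is the `BotConfig` class.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_REQUIRED = ("DISCORD_TOKEN", "DATABASE_URL", "VIVI_MEMBER_ID")
_TRUTHY = ("1", "true", "yes", "on")


@dataclass(frozen=True)
class BotConfig:
    discord_token: str
    database_url: str
    vivi_member_id: str
    pluralkit_token: Optional[str] = None
    auto_translate_default: bool = True
    log_level: str = "INFO"


def load_config(env: Mapping[str, str] = os.environ) -> BotConfig:
    # Fail fast at startup with one message listing everything that is missing.
    missing = [name for name in _REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return BotConfig(
        discord_token=env["DISCORD_TOKEN"],
        database_url=env["DATABASE_URL"],
        vivi_member_id=env["VIVI_MEMBER_ID"],
        pluralkit_token=env.get("PLURALKIT_TOKEN") or None,  # optional
        auto_translate_default=env.get("AUTO_TRANSLATE_DEFAULT", "true").lower()
        in _TRUTHY,
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Accepting the environment as a parameter makes the loader easy to test with a plain dictionary.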

---

## Data Flow

### Message Reception to Response

```
┌──────────────────────────────────────────────────────────────┐
│  Discord Message Event                                       │
│  (user posts message proxied by PluralKit webhook)          │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Message Event Handler                                       │
│  - Check: webhook_id != None?                               │
│  - Query: PluralKit API for message info                    │
│  - Verify: member_id == Vivi's ID?                         │
│  - Filter: Ignore self-messages                            │
└──────────────────────────────────────────────────────────────┘
                            ↓
                     (YES, Vivi's message)
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Emoji Parser                                                │
│  - Extract emojis with regex                                │
│  - Categorize: Unicode vs. custom                           │
│  - Preserve order                                           │
│  - Output: [{type, value, id}, ...]                        │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Translation Engine                                          │
│  - For each emoji: lookup in database                       │
│  - Compose natural language                                 │
│  - Handle unknown emojis                                    │
│  - Format response                                          │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Database (emoji_dictionary)                                 │
│  - Indexed lookup by emoji_string (B-tree PK)              │
│  - Return: meaning, metadata                               │
└──────────────────────────────────────────────────────────────┘
                            ↓
                  (Lookup Results)
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Response Formatting                                         │
│  - Compose message                                           │
│  - Check context: channel vs. DM                            │
│  - Apply server config: auto-translate mode                 │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Discord API Response                                        │
│  - Send reply to channel or DM                              │
│  - Handle rate limits                                       │
│  - Log interaction                                          │
└──────────────────────────────────────────────────────────────┘
```

### Teaching Flow (Command-Driven)

```
┌──────────────────────────────────────────────────────────────┐
│  User runs: /teach 😷 2️⃣ "Vivi is sick, not sinuses"      │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Command Handler                                             │
│  - Parse command arguments                                  │
│  - Authenticate: Is user authorized to teach?              │
│  - Extract emojis from sequence                             │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Database Layer                                              │
│  - INSERT or UPDATE emoji_dictionary                        │
│  - Set: emoji_string, meaning, updated_by, timestamp       │
│  - Commit transaction                                       │
└──────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────┐
│  Confirmation Reply                                          │
│  - "Learned: 😷 = sick, 2️⃣ = two"                           │
│  - Post in same context (channel or DM)                     │
└──────────────────────────────────────────────────────────────┘
```
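
The "INSERT or UPDATE" step in this flow maps naturally onto an upsert. The sketch below uses the standard-library `sqlite3` module as a stand-in so it runs without a server; the table is trimmed to a few columns, and the helper names `teach` and `meaning_of` are assumptions. PostgreSQL accepts the same `ON CONFLICT ... DO UPDATE` form, with its own placeholder style.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS emoji_dictionary (
    emoji_string TEXT PRIMARY KEY,
    meaning TEXT NOT NULL,
    updated_by_user_id INTEGER,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""

# Insert a new emoji, or overwrite the meaning if the emoji already exists.
UPSERT = """
INSERT INTO emoji_dictionary (emoji_string, meaning, updated_by_user_id)
VALUES (?, ?, ?)
ON CONFLICT (emoji_string) DO UPDATE SET
    meaning = excluded.meaning,
    updated_by_user_id = excluded.updated_by_user_id,
    updated_at = CURRENT_TIMESTAMP
"""


def teach(conn: sqlite3.Connection, emoji: str, meaning: str, user_id: int) -> None:
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(UPSERT, (emoji, meaning, user_id))


def meaning_of(conn: sqlite3.Connection, emoji: str):
    row = conn.execute(
        "SELECT meaning FROM emoji_dictionary WHERE emoji_string = ?", (emoji,)
    ).fetchone()
    return row[0] if row else None
```

Because the upsert is a single statement, two users teaching the same emoji at the same time cannot produce a duplicate row; the later write simply wins.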

---

## PluralKit Integration Details

### Detection Approach

1. **Webhook Detection (First Filter):**
   - Check `message.webhook_id` property in discord.py
   - If not None, the message was sent via a webhook; this is necessary but not sufficient, since other integrations also use webhooks

2. **PluralKit API Query (Confirmation):**
   - Query endpoint: `GET https://api.pluralkit.me/v2/messages/{message_id}`
   - The `message_id` can be the webhook message ID or the original message ID (original works for 30 minutes)
   - Parse response to get `member` object

3. **Member Verification:**
   - Extract `member.id` from API response
   - Compare with Vivi's known member ID (from config)
   - If match: Process as Vivi's message
   - If not match: Ignore (message from another member)

4. **Alternative: Member Names (Backup):**
   - If using member ID fails, fall back to checking `member.name`
   - Look for "Vivi" or configured member name

### API Endpoints Used

| Endpoint | Purpose | Rate Limit | Response |
|----------|---------|-----------|----------|
| `GET /v2/messages/{message}` | Get proxied message info | 10/sec | Message object with member, sender, guild, channel, timestamp |
| `GET /v2/systems/@me` | Get authenticated system info | 10/sec | System object (requires token); members are listed separately via `/v2/systems/@me/members` |
| `GET /v2/members/{member}` | Get specific member info | 10/sec | Member object with proxy tags, avatar, etc. |

**Authentication (Optional):**
- Public queries (member lookup) don't require authentication
- System-specific queries (private member settings) require the system token in the `Authorization` header (PluralKit expects the raw token, without a `Bearer` prefix)
- For Vivi's system, store the system token in environment variable `PLURALKIT_TOKEN` for authenticated access

### Implementation in discord.py

```python
import asyncio
import logging

import aiohttp
import discord

log = logging.getLogger(__name__)
PK_MESSAGE_URL = "https://api.pluralkit.me/v2/messages/{message_id}"


async def check_vivi_message(
    message: discord.Message,
    vivi_member_id: str,
    session: aiohttp.ClientSession,
) -> bool:
    """Check if message was proxied by Vivi via PluralKit."""

    # Step 1: Only webhook messages can be PluralKit proxies
    if message.webhook_id is None:
        return False

    # Step 2: Query PluralKit API, reusing one shared session for connection pooling
    url = PK_MESSAGE_URL.format(message_id=message.id)
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
            if resp.status != 200:
                # 404: not a PluralKit message; 429: rate limited (caller may retry)
                return False
            data = await resp.json()
    except (aiohttp.ClientError, asyncio.TimeoutError) as e:
        # Log the error, but don't crash the event handler
        log.warning("PluralKit API error: %s", e)
        return False

    # Step 3: Check member ID matches Vivi (tolerate a null "member" field)
    member = data.get("member") or {}
    return member.get("id") == vivi_member_id
```

---

## Database Schema

### emoji_dictionary Table

```sql
CREATE TABLE emoji_dictionary (
    emoji_string TEXT PRIMARY KEY,  -- primary key is indexed automatically
    custom_emoji_id BIGINT,         -- columns are nullable unless marked NOT NULL
    meaning TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_by_user_id BIGINT,
    updated_by_member_id TEXT,
    created_in_guild BIGINT
);

CREATE INDEX idx_custom_emoji_id ON emoji_dictionary(custom_emoji_id);
```

### server_configuration Table

```sql
CREATE TABLE server_configuration (
    guild_id BIGINT PRIMARY KEY,  -- primary key is indexed automatically
    auto_translate BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

### Alternative: SQLAlchemy ORM Definitions

```python
from sqlalchemy import BigInteger, Boolean, Column, DateTime, String, Text, func
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+/2.0 location

Base = declarative_base()

class EmojiDictionary(Base):
    __tablename__ = "emoji_dictionary"

    emoji_string = Column(String, primary_key=True)
    custom_emoji_id = Column(BigInteger, nullable=True, index=True)
    meaning = Column(Text, nullable=False)
    # Timestamps are set by the database, matching DEFAULT CURRENT_TIMESTAMP above
    created_at = Column(DateTime, server_default=func.now())
    updated_at = Column(DateTime, server_default=func.now(), onupdate=func.now())
    updated_by_user_id = Column(BigInteger, nullable=True)
    updated_by_member_id = Column(String, nullable=True)
    created_in_guild = Column(BigInteger, nullable=True)

class ServerConfiguration(Base):
    __tablename__ = "server_configuration"

    guild_id = Column(BigInteger, primary_key=True)
    auto_translate = Column(Boolean, default=True)
    created_at = Column(DateTime, server_default=func.now())
    updated_at = Column(DateTime, server_default=func.now(), onupdate=func.now())
```

---

## Suggested Build Order

### Phase 1: Foundation (Week 1-2)
**Goal:** Get Vivi messages detected and logged.

1. **Set up Discord bot:**
   - Create Discord application and token
   - Initialize discord.py Client/Bot with required Intents
   - Implement basic `on_message` event handler
   - Test basic logging

2. **Implement PluralKit detection:**
   - Add webhook detection (check `message.webhook_id`)
   - Add PluralKit API query and member verification
   - Log when Vivi messages are detected
   - Handle API errors gracefully

3. **Database initialization:**
   - Set up PostgreSQL database
   - Create emoji_dictionary and server_configuration tables
   - Test connection from bot

**Deliverables:** Bot logs every Vivi message to console; doesn't respond yet.

---

### Phase 2: Emoji Parsing & Translation (Week 3-4)
**Goal:** Translate Vivi's emojis to text.

1. **Emoji parsing:**
   - Implement regex patterns for Unicode and custom emojis
   - Extract emoji sequences in order
   - Test with various emoji types

2. **Basic emoji lookup:**
   - Query emoji_dictionary table
   - Return meanings for known emojis
   - Handle unknown emojis

3. **Response formatting:**
   - Compose natural language from emoji meanings
   - Send reply to channel/DM
   - Handle edge cases (no emojis, all unknown)

4. **Manual testing:**
   - Create test emojis in database
   - Post Vivi messages and verify translations

**Deliverables:** Bot translates Vivi messages; appears in channels and DMs.

---

### Phase 3: Teaching Commands (Week 5-6)
**Goal:** Allow users to teach the bot emoji meanings.

1. **Implement `/teach` command:**
   - Parse emoji sequences
   - Insert into database with metadata
   - Confirm to user

2. **Implement `/meaning` command:**
   - Look up single emoji
   - Reply with meaning or "not learned yet"

3. **Implement `/forget` command:**
   - Delete emoji from database
   - Require admin or Vivi permission

4. **Permission system:**
   - Restrict teaching to authorized users (Vivi + alters)
   - Use Discord roles or user ID allowlist

**Deliverables:** Users can teach and query emoji meanings via commands.

---

### Phase 4: Per-Server Configuration (Week 7)
**Goal:** Allow servers to opt into/out of auto-translation.

1. **Implement `/config auto-translate` command:**
   - Toggle auto-translate on/off per server
   - Requires admin permission
   - Only works in guild context (not DMs)

2. **Update message handler:**
   - Check server config before auto-translating
   - Only reply if auto_translate == TRUE
   - In DMs, always translate when `/translate` is used

3. **On-demand translation:**
   - `/translate` command for manual translation
   - Works in any context

**Deliverables:** Servers can control translation behavior; bot respects preferences.

---

### Phase 5: Polish & Edge Cases (Week 8+)
**Goal:** Handle real-world complexity.

1. **Natural language formatting:**
   - Improve composition of translations
   - Handle emoji modifiers (skin tones, ZWJ sequences)
   - Custom emoji descriptions

2. **Error handling & resilience:**
   - Database unavailability
   - PluralKit API failures
   - Rate limiting with exponential backoff
   - Graceful degradation

3. **Logging & monitoring:**
   - Structured logging for debugging
   - Monitor database performance
   - Track API error rates

4. **Codebase refactoring:**
   - Move commands to separate Cogs
   - Organize into modules: `cogs/teaching.py`, `cogs/config.py`, etc.
   - Add docstrings and type hints

5. **Testing:**
   - Unit tests for emoji parsing
   - Integration tests for database queries
   - End-to-end tests with Discord

**Deliverables:** Robust, maintainable codebase ready for production.

---

## Scaling Considerations

### Multi-Server Architecture

**Challenge:** Bot will operate in many Discord servers simultaneously, each with potentially thousands of members.

**Solution:**

1. **Shared Emoji Dictionary:**
   - Single global PostgreSQL database with all emoji meanings
   - All servers query the same emoji_dictionary table
   - Updates are reflected across all servers immediately
   - Reduces redundancy and keeps meanings consistent

2. **Per-Server Configuration:**
   - Each guild has its own row in server_configuration
   - Fast lookup by guild_id (indexed)
   - Allows servers to choose auto-translate vs. on-demand

3. **Connection Pooling:**
   - SQLAlchemy async engine with `pool_size=20, max_overflow=10` (tunable)
   - Reuses database connections across handlers
   - Prevents connection exhaustion under load

### Performance Optimization

1. **Emoji Lookup Performance:**
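   - Primary key index on emoji_dictionary.emoji_string gives fast lookups (a B-tree, so O(log n) rather than O(1), which is negligible at this scale)
   - Secondary index on custom_emoji_id for custom emoji queries
   - Consider in-memory cache (Redis) if lookups become bottleneck:
     - Query Redis first (typically around 1ms or less on a local network)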
     - Fall back to PostgreSQL
     - Invalidate cache on updates

2. **Caching Strategy (Optional, Post-MVP):**
   - Use Redis for frequently accessed emojis
   - TTL: 1 hour (emoji meanings change rarely)
   - Invalidate cache when `/teach` or `/forget` commands update dictionary
   - Benefits: Reduced database load, lower latency

3. **Rate Limiting:**
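   - PluralKit API: 10 requests/second (enforced server-side; excess requests are rejected with HTTP 429)
   - Discord API: a global limit of 50 requests/second per bot plus per-route limits, all of which discord.py tracks and honors automatically
   - Cap concurrent PluralKit queries locally with `asyncio.Semaphore` (this bounds in-flight requests, not requests per second):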
     ```python
     semaphore = asyncio.Semaphore(5)  # Max 5 concurrent PluralKit queries
     ```

4. **Message Handler Optimization:**
   - Webhook detection (local check): ~0ms
   - PluralKit API query: ~100-200ms (async, non-blocking)
   - Emoji parsing (regex): ~1-5ms
   - Database lookup: ~1-50ms
   - **Total:** ~100-250ms per message (acceptable, happens in background)
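
Before introducing Redis, the caching strategy above can be prototyped in-process. The sketch below is an assumption-laden stand-in, not a Redis client: `MeaningCache` is a hypothetical class that mirrors the same get, put, and invalidate behavior with a TTL, and the injectable `clock` exists only to make expiry testable.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class MeaningCache:
    """Tiny in-process TTL cache for emoji meanings (illustrative only)."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,  # 1 hour, as suggested above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # miss: caller falls back to PostgreSQL
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        # Call this from /teach and /forget so stale meanings are not served.
        self._entries.pop(key, None)
```

An in-process cache is per-process, so with multiple bot processes a shared store such as Redis is still needed for consistent invalidation.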

### Scaling Beyond 2,500 Guilds

**Discord Requirement:** Bots in 2,500 or more guilds must use sharding.

**Sharding in discord.py:**
- discord.py handles sharding automatically when the bot is built on `AutoShardedClient` or `commands.AutoShardedBot`
- Each shard is a separate gateway (WebSocket) connection to Discord
- Each shard handles a subset of guilds
- Emoji dictionary remains shared across all shards (single database)

**Example Configuration:**
```python
import discord
from discord.ext import commands

intents = discord.Intents.default()

# AutoShardedBot lives in discord.ext.commands; discord.py picks the shard count
bot = commands.AutoShardedBot(command_prefix=commands.when_mentioned, intents=intents)
```
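
Discord assigns each guild to a shard with a fixed formula, `(guild_id >> 22) % num_shards`. discord.py applies this internally, so the bot never needs to compute it; the helper below (a hypothetical name, shown only for illustration) can still be handy when reading per-shard logs.

```python
def shard_for_guild(guild_id: int, shard_count: int) -> int:
    """Return the shard that receives events for a guild.

    The low 22 bits of a snowflake ID hold worker, process, and sequence
    data; shifting them away leaves the timestamp portion that Discord
    uses for shard assignment.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return (guild_id >> 22) % shard_count
```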

### Database Scaling

**For Millions of Emojis:**
- Table partitioning by emoji language/category (if dict grows huge)
- Read replicas for queries (if read-heavy)
- Consider keeping popular emoji meanings in an in-memory cache to offload repeated reads

**Current Recommendation:** Single PostgreSQL database is sufficient for MVP. Scale if needed post-launch.

---

## Component Interaction Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│  Discord Server (Guild)                                         │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │ #general                                                 │  │
│  │ Vivi (proxied): 😷 2️⃣ 🍑 ❌ 🤧                         │  │
│  │ Vivi Speech Translator: "Vivi is sick, but not in ..."  │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
                            ↑ (message event)
                            │
┌──────────────────────────┴──────────────────────────────────────┐
│  Discord.py Bot Framework                                       │
│                                                                 │
│  ┌────────────────────────────────────────────────────────┐    │
│  │ Discord Client                                         │    │
│  │ - Maintains WebSocket connection                       │    │
│  │ - Routes events to handlers                            │    │
│  └────────────────────────────────────────────────────────┘    │
│                            ↓                                     │
│  ┌────────────────────────────────────────────────────────┐    │
│  │ Message Event Handler (on_message)                     │    │
│  │ - Filter: webhook_id?                                 │    │
│  │ - Query: PluralKit API                                │    │
│  │ - Verify: member_id == Vivi?                          │    │
│  └────────────────────────────────────────────────────────┘    │
│                            ↓                                     │
│  ┌────────────────────────────────────────────────────────┐    │
│  │ Emoji Parser                                           │    │
│  │ - Extract emojis (regex)                              │    │
│  │ - Categorize (Unicode/custom)                         │    │
│  │ - Preserve order                                      │    │
│  └────────────────────────────────────────────────────────┘    │
│                            ↓                                     │
│  ┌────────────────────────────────────────────────────────┐    │
│  │ Translation Engine                                     │    │
│  │ - Lookup emojis in database                           │    │
│  │ - Compose natural language                            │    │
│  │ - Format response                                     │    │
│  └────────────────────────────────────────────────────────┘    │
│                            ↓                                     │
│  ┌────────────────────────────────────────────────────────┐    │
│  │ Command Handler (Cogs)                                 │    │
│  │ - /teach: Learn emoji meanings                        │    │
│  │ - /meaning: Look up emoji                             │    │
│  │ - /forget: Delete emoji                               │    │
│  │ - /config: Server preferences                         │    │
│  │ - /translate: Manual translation                      │    │
│  └────────────────────────────────────────────────────────┘    │
│                                                                 │
└──────────────────────────────┬──────────────────────────────────┘
                               ↓ (database queries/updates)
┌──────────────────────────────────────────────────────────────────┐
│  Database Layer                                                  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────┐     │
│  │ SQLAlchemy ORM + asyncio                               │     │
│  │ - Async connection pool                                │     │
│  │ - Connection reuse                                     │     │
│  │ - Transaction management                              │     │
│  └────────────────────────────────────────────────────────┘     │
│                            ↓                                      │
│  ┌────────────────────────────────────────────────────────┐     │
│  │ PostgreSQL Database                                    │     │
│  │                                                        │     │
│  │  emoji_dictionary:          server_configuration:     │     │
│  │  ├─ 😷 → sick               ├─ guild_123 → auto ON   │     │
│  │  ├─ 2️⃣ → two               ├─ guild_456 → auto OFF  │     │
│  │  ├─ 🍑 → peach             └─ guild_789 → auto ON    │     │
│  │  ├─ ❌ → no                                            │     │
│  │  └─ :me1: → Vivi           (+ metadata, timestamps)   │     │
│  │  (+ metadata, timestamps)                             │     │
│  └────────────────────────────────────────────────────────┘     │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

External APIs:
┌────────────────────────────────────────────────────────────┐
│ PluralKit API (api.pluralkit.me)                           │
│ - GET /v2/messages/{id} → member info                     │
└────────────────────────────────────────────────────────────┘
```

---

## Technology Stack Recommendations

| Layer | Component | Technology | Why |
|-------|-----------|-----------|-----|
| **Bot Framework** | Discord Integration | discord.py 2.x | Async-native, active community, rich feature set |
| **Database ORM** | Persistence | SQLAlchemy 2.0 + asyncio | Async support, type-safe, widely adopted |
| **Database** | Data Store | PostgreSQL | Reliable, open-source, JSONB for future extensibility |
| **Async Runtime** | Concurrency | asyncio (built-in) | Lightweight, integrated with discord.py |
| **Caching** | Performance (Phase 5+) | Redis | Fast in-memory lookups, TTL support, distributed |
| **Logging** | Debugging | Python logging module | Built-in, structured logging can extend |
| **API Requests** | HTTP Calls | aiohttp | Async-native, connection pooling |
| **Testing** | Quality Assurance | pytest + pytest-asyncio | Async test support, fixtures |
| **Deployment** | Hosting | Docker + systemd or cloud | Reproducible environment, easy updates |

---

## Summary

**Recommended Architecture:**

Vivi Speech Translator is a modular Discord bot with a clear separation of concerns:

1. **Discord Client** listens for messages and routes them through a detection pipeline
2. **Message Event Handler** identifies when Vivi speaks (via PluralKit webhook + API verification)
3. **Emoji Parser** extracts emoji sequences while preserving order and type information
4. **Translation Engine** looks up meanings and composes responses
5. **Database Layer** (PostgreSQL + SQLAlchemy) stores a shared global emoji dictionary and per-server configurations
6. **Command Handler** (discord.py Cogs) allows teaching, querying, and configuration

The bot prioritizes:
- **Reliability:** Graceful error handling, retry logic, database transactions
- **Performance:** Indexed emoji lookups, async operations to avoid blocking, caching for scale
- **Scalability:** Shared emoji dictionary, per-server configs, optional Redis caching, Discord sharding support
- **Maintainability:** Modular Cog architecture, clear component boundaries, comprehensive logging

Build in phases: detection → parsing → translation → teaching → configuration → polish. This delivers value early (Phase 2) while establishing the foundation for features.

The bot can grow from a single server to thousands, limited primarily by PluralKit API rate limits (mitigated by querying only for webhook messages and backing off on rate-limit responses) and database performance (PostgreSQL handles millions of rows efficiently).

---

## Related Documentation

- [PluralKit API Reference](https://pluralkit.me/api/)
- [discord.py Documentation](https://discordpy.readthedocs.io/)
- [SQLAlchemy Async Documentation](https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html)
- [PostgreSQL Documentation](https://www.postgresql.org/docs/)
- [Redis Documentation](https://redis.io/documentation)