SideButton Marketing Website Knowledge Pack Files
Browse the source files that power the SideButton Marketing Website MCP server knowledge pack.
sidebutton install sidebutton.com
Responsibility Map
Portal Page (Astro SSR)
- Page: website/src/pages/portal/agents.astro, the SSR-rendered fleet overview
- Components:
  - website/src/components/portal/AgentRow.astro: agent card (screenshot, metrics, status badges, actions)
  - website/src/components/portal/RunWorkflowModal.astro: two-panel modal (workflow picker + config)
  - website/src/components/portal/ShareModal.astro: share agents across accounts
  - website/src/components/portal/StatusBadge.astro: online/offline/error indicator
  - website/src/components/portal/PortalLayout.astro: portal wrapper with sidebar
  - website/src/components/portal/PortalSidebar.astro: sidebar nav (Chat, Agents, Jobs, Queue, Workflows, Settings)
- Owns: fleet view SSR, client-side polling, screenshot modal, metric display, version tracking
API Routes (Astro SSR endpoints)
| Method | Path | File | Purpose |
|---|---|---|---|
| GET | /api/agents/fleet-status | pages/api/agents/fleet-status.ts | Lightweight polling — all agents with metrics, snapshots, active jobs, queue counts, cooldown |
| GET | /api/agents/health | pages/api/agents/health.ts | Pings each agent at http://{ip}:9876/health, updates DB status |
| POST | /api/agents/heartbeat | pages/api/agents/heartbeat.ts | Boot-time registration — agent sends name+token, portal upserts row + DNS |
| POST | /api/agents/{id}/refresh | pages/api/agents/[agentId]/refresh.ts | On-demand ping + screenshot grab, returns fresh data |
| GET | /api/agents/screenshot/{name} | pages/api/agents/screenshot/[name].ts | Serves PNG from data/screenshots/{name}.png |
| POST | /api/agents/{id}/reboot | pages/api/agents/[agentId]/reboot.ts | Sends reboot command to agent VM |
| POST | /api/agents/{id}/apply-config | pages/api/agents/[agentId]/apply-config.ts | Pushes env vars + .mcp.json to agent |
| GET | /api/agents/{id}/events | pages/api/agents/[agentId]/events.ts | Agent event log (tool calls, status changes) |
| GET/POST | /api/agents/{id}/settings | pages/api/agents/[agentId]/settings.ts | Per-account overrides (enabled_roles, effort, entry_paths) |
| POST | /api/agents/{id}/cancel-cooldown | pages/api/agents/[agentId]/cancel-cooldown.ts | Cancels post-job cooldown |
| POST | /api/agents/{id}/shares | pages/api/agents/[agentId]/shares.ts | Cross-account agent sharing |
Database Layer
- Schema + types: website/src/lib/db/types.ts
- Queries: website/src/lib/db/queries.ts
- DB init: website/src/lib/db/index.ts (SQLite via better-sqlite3)
Dispatch System
- Entry point: website/src/lib/dispatch.ts, enqueueDispatch(); all dispatch paths funnel here
- Temporal orchestrator: website/src/lib/temporal/client.ts, where startPipelineJob() picks up queue items every 15-30s
- Task queue: sidebutton-orchestrator on Temporal (127.0.0.1:7233)
Agent-Side Server (runs on each VM, port 9876)
- Source: packages/server/src/server.ts (in the oss/sidebutton repo)
- Portal calls these endpoints on each agent:
  - GET /health: status, browser_connected, claude_running, cooldown, dependency_versions
  - GET /api/running-workflows: active workflows on the agent
  - GET /api/screenshot: browser screenshot as base64 PNG
  - POST /api/config/apply: receives env vars + .mcp.json
  - POST /api/system/reboot: triggers sudo reboot (requires SIDEBUTTON_AGENT_TOKEN)
Data Model
agents Table
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| user_id | INTEGER FK | Owner user |
| name | TEXT UNIQUE(user_id, name) | Hostname (e.g. "sidebutton-agent-1") |
| display_name | TEXT | Optional friendly name |
| ip | TEXT | Public IPv4 (updated on heartbeat) |
| status | TEXT | online / busy / error / offline |
| capabilities | TEXT (JSON) | Role strings: ["se","qa","sd"] |
| effort_level | TEXT | Default: max / high / medium |
| last_seen_at | DATETIME | Updated on heartbeat/health check |
| sb_token | TEXT | Bearer token for portal→agent auth |
| dns_name | TEXT | Cloudflare DNS: {name}.agents.sidebutton.com |
| agent_env | TEXT | Env var block pushed via apply-config |
| entry_paths | TEXT (JSON) | EntryPath[] — working dirs with .mcp.json |
| dependency_versions | TEXT (JSON) | { claude_code, node, npm, sidebutton } |
| created_at | DATETIME | Registration time |
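
The TEXT (JSON) columns above come out of SQLite as plain strings, so callers have to decode them before use. A minimal sketch of that decoding, assuming hypothetical names `AgentRow` and `parseJsonColumn`; the real definitions live in website/src/lib/db/types.ts and may differ:

```typescript
// Hypothetical row shape covering a subset of the agents columns listed above.
interface AgentRow {
  id: number;
  user_id: number;
  name: string;
  status: "online" | "busy" | "error" | "offline";
  capabilities: string | null;        // JSON text, e.g. '["se","qa","sd"]'
  dependency_versions: string | null; // JSON text
}

// Decode a TEXT (JSON) column; fall back when the value is NULL, empty, or malformed.
// The fallback behavior is an assumption, not something the source specifies.
function parseJsonColumn<T>(raw: string | null, fallback: T): T {
  if (raw === null || raw === "") return fallback;
  try {
    return JSON.parse(raw) as T;
  } catch {
    return fallback;
  }
}

const exampleRow: AgentRow = {
  id: 1,
  user_id: 7,
  name: "sidebutton-agent-1",
  status: "online",
  capabilities: '["se","qa","sd"]',
  dependency_versions: null,
};

const roles = parseJsonColumn<string[]>(exampleRow.capabilities, []);
const versions = parseJsonColumn<Record<string, string>>(exampleRow.dependency_versions, {});
```

Returning a fallback instead of throwing keeps one malformed row from breaking the whole fleet view, at the cost of hiding bad data.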
agent_snapshots Table
| Column | Type | Description |
|---|---|---|
| agent_id | INTEGER FK | References agents.id |
| metrics | TEXT (JSON) | { cpu_pct, mem_used_mb, mem_total_mb, swap_used_mb, swap_total_mb, load_1m, cpu_cores, disk_pct } |
| processes | TEXT (JSON) | { sidebutton, chrome_count, claude_code_count, claude_code_cpu } |
| activity | TEXT | active / idle / unknown |
| screenshot_path | TEXT | Filesystem path to PNG |
| session_id | TEXT | Claude Code session ID |
| created_at | DATETIME | Snapshot timestamp |
agent_events Table
Tracks tool calls and status transitions per agent session.
agent_shares Table
Cross-account sharing: UNIQUE(agent_id, account_id).
agent_settings Table
Per-account overrides: enabled_roles (JSON array), effort_level, enabled_entry_paths (JSON array).
Architecture & Data Flow
Agent Registration (Boot)
Agent VM boots → systemd starts SideButton on :9876
→ POST /api/agents/heartbeat (X-Agent-Name + Bearer token)
→ getOrCreateAgent() upserts agent row, sets IP + last_seen_at
→ Background: upsertAgentDns() updates Cloudflare A record
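
The boot-time call can be sketched as a small request builder. This is illustrative only: `buildHeartbeatRequest`, `HeartbeatRequest`, and the portal base URL are hypothetical, and the request body is omitted because the source only names the two headers.

```typescript
// Hypothetical description of the heartbeat request an agent sends on boot.
interface HeartbeatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
}

function buildHeartbeatRequest(
  portalBaseUrl: string, // assumption: the portal origin is configured on the agent
  agentName: string,     // hostname, e.g. "sidebutton-agent-1"
  agentToken: string,    // value of SIDEBUTTON_AGENT_TOKEN
): HeartbeatRequest {
  // Strip trailing slashes so the path is not doubled up.
  const base = portalBaseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/api/agents/heartbeat`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${agentToken}`,
      "X-Agent-Name": agentName,
    },
  };
}

const heartbeat = buildHeartbeatRequest("https://portal.example/", "sidebutton-agent-1", "test-token");
```

The result can be passed to `fetch(heartbeat.url, { method: heartbeat.method, headers: heartbeat.headers })`.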
Health Check (Portal → Agent)
GET /api/agents/health (on page load):
→ For each agent with IP (parallel, 8s timeout):
→ fetch http://{ip}:9876/health
→ fetch http://{ip}:9876/api/running-workflows
→ Status priority:
1. Workflows running → busy
2. Active DB job AND Claude running → busy
3. Browser disconnected → error
4. Healthy + idle → online
5. Unreachable → offline
→ setAgentStatus() updates DB
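
The status priority above reduces to a small pure function. A minimal sketch, assuming a hypothetical `HealthProbe` shape that collects the two agent responses and the DB lookup; the actual logic in health.ts may be organized differently. Reachability is tested first here only because the agent-reported fields are unavailable when the ping fails.

```typescript
type AgentStatus = "online" | "busy" | "error" | "offline";

// Hypothetical aggregate of what one health-check pass learns about an agent.
interface HealthProbe {
  reachable: boolean;        // did GET /health answer within the timeout?
  workflowsRunning: number;  // count from GET /api/running-workflows
  hasActiveDbJob: boolean;   // portal DB has a running/waiting job for this agent
  claudeRunning: boolean;    // claude_running from /health
  browserConnected: boolean; // browser_connected from /health
}

function resolveStatus(p: HealthProbe): AgentStatus {
  if (!p.reachable) return "offline";                        // 5. unreachable
  if (p.workflowsRunning > 0) return "busy";                 // 1. workflows running
  if (p.hasActiveDbJob && p.claudeRunning) return "busy";    // 2. DB job AND Claude running
  if (!p.browserConnected) return "error";                   // 3. browser disconnected
  return "online";                                           // 4. healthy + idle
}
```

Note that an active DB job alone does not make the agent busy. If `claude_running` is false the job is ignored for status purposes.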
Fleet Status Polling (Client-Side)
1. SSR renders agent list from DB
2. Client calls runHealthCheck() + pollFleetStatus() on load
3. Adaptive interval: 5s (busy/cooldown) or 30s (idle)
4. Pauses on hidden tab, resumes + immediate refresh on focus
5. GET /api/agents/fleet-status → DOM update in-place (no reload)
6. Updates: badges, metrics, screenshots, versions, queue counts, header stats
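
The adaptive interval in steps 3 and 4 can be sketched as a function that picks the next delay. The names `FleetAgent`, `inCooldown`, and `nextPollDelayMs` are hypothetical; the real client script may track this state differently.

```typescript
// Hypothetical minimal view of one agent as returned by fleet-status.
interface FleetAgent {
  status: "online" | "busy" | "error" | "offline";
  inCooldown: boolean;
}

const ACTIVE_POLL_MS = 5_000; // any agent busy or cooling down
const IDLE_POLL_MS = 30_000;  // everything idle

// Returns the delay before the next poll, or null when polling should pause.
function nextPollDelayMs(agents: FleetAgent[], tabHidden: boolean): number | null {
  if (tabHidden) return null; // paused until the tab regains focus
  const anyActive = agents.some((a) => a.status === "busy" || a.inCooldown);
  return anyActive ? ACTIVE_POLL_MS : IDLE_POLL_MS;
}
```

In a browser, `tabHidden` would typically come from `document.hidden`, with a `visibilitychange` listener triggering the immediate refresh on focus.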
Screenshot Flow
Refresh button → POST /api/agents/{id}/refresh
→ Pings agent health + GET http://{ip}:9876/api/screenshot
→ Agent Chrome extension captures tab → base64 PNG
→ Portal saves to data/screenshots/{name}.png
→ Inserts agent_snapshots row
→ Client updates thumbnail with cache-bust ?t= param
Modal: click thumbnail → full-size overlay, close via Esc/X/click-outside
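
The cache-bust step amounts to appending a changing `t` query parameter so the browser refetches the PNG. A minimal sketch, assuming a hypothetical helper `screenshotUrl` and a millisecond timestamp as the parameter value (the source only says `?t=`):

```typescript
// Build the thumbnail URL for an agent, with a cache-busting timestamp.
function screenshotUrl(agentName: string, nowMs: number = Date.now()): string {
  // encodeURIComponent guards against names containing URL-reserved characters.
  return `/api/agents/screenshot/${encodeURIComponent(agentName)}?t=${nowMs}`;
}

const thumb = screenshotUrl("sidebutton-agent-1", 1700000000000);
// thumb === "/api/agents/screenshot/sidebutton-agent-1?t=1700000000000"
```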
Job Dispatch (Run Workflow Modal)
"Run Workflow" / "Run Job" click → CustomEvent 'open-workflow-modal'
→ Left panel: pipelines grouped by role (OPS, SE, QA, SD)
→ Right panel: params, hint, effort toggle, agent pills, entry path
→ POST /api/queue → enqueueDispatch() creates queue_items row
→ eligible_at = now + 15s (minimum dispatch delay)
→ Temporal sweeps queue every 15-30s → creates job → dispatches to agent
→ 'close-workflow-modal' event → pollFleetStatus() after 2s
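
The minimum dispatch delay can be sketched as two small helpers. `MIN_DISPATCH_DELAY_MS`, `computeEligibleAt`, and `isReadyForSweep` are hypothetical names, and treating an item as ready exactly at `eligible_at` is an assumption:

```typescript
const MIN_DISPATCH_DELAY_MS = 15_000; // eligible_at = now + 15s

// Timestamp stored on the queue_items row at enqueue time.
function computeEligibleAt(enqueuedAt: Date): Date {
  return new Date(enqueuedAt.getTime() + MIN_DISPATCH_DELAY_MS);
}

// Whether a sweep running at sweepTime may pick the item up.
function isReadyForSweep(eligibleAt: Date, sweepTime: Date): boolean {
  return sweepTime.getTime() >= eligibleAt.getTime();
}
```

Because the sweep itself only runs every 15-30s, an item is not necessarily dispatched the instant it becomes eligible. It waits for the next sweep after `eligible_at`.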
Online Detection
Heartbeat-based, 5-minute threshold (HEARTBEAT_THRESHOLD_MS = 300,000ms)
isOnline = status !== 'offline' AND last_seen_at within 5 minutes
No WebSocket — purely heartbeat + polling
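
The online rule above, written out as a sketch. `HEARTBEAT_THRESHOLD_MS` is from the source; the function signature, the handling of a missing `last_seen_at`, and the inclusive comparison at exactly 5 minutes are assumptions:

```typescript
const HEARTBEAT_THRESHOLD_MS = 300_000; // 5 minutes

function isOnline(status: string, lastSeenAt: Date | null, now: Date = new Date()): boolean {
  // An agent that has never reported in cannot be online.
  if (status === "offline" || lastSeenAt === null) return false;
  return now.getTime() - lastSeenAt.getTime() <= HEARTBEAT_THRESHOLD_MS;
}
```

Under this rule an agent whose status is still `busy` in the DB is shown as not online once its last heartbeat is older than the threshold.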
Key Query Functions
| Function | Purpose |
|---|---|
| getOrCreateAgent(userId, name, ip) | Upsert on heartbeat (INSERT ... ON CONFLICT) |
| getAgentsByAccountId(accountId) | Owned + shared agents (UNION query) |
| getLatestAgentSnapshot(agentId) | Most recent snapshot (metrics + screenshot) |
| getActiveJobForAgent(agentId) | Running/waiting job assigned to agent |
| getQueuedItemsForAgent(agentId) | Queued items targeting agent (limit 5) |
| setAgentStatus(agentId, status) | Updates status + last_seen_at |
| insertAgentSnapshot(data) | Creates new snapshot row |
| deleteAgent(agentId) | Cascades: events → shares → job_steps nullify → delete |
Metric Thresholds (Color Coding)
| Metric | Warn | Danger | Computation |
|---|---|---|---|
| CPU | 50% | 80% | cpu_pct direct |
| MEM | 70% | 85% | mem_used_mb / mem_total_mb * 100 |
| LOAD | 50% | 80% | load_1m / cpu_cores * 100 (capped 100%) |
| DISK | 70% | 85% | disk_pct direct |
| SWAP | 30% | 60% | swap_used_mb / swap_total_mb * 100 |
Colors: >= danger → text-red-400, >= warn → text-amber-400, else text-slate-400
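
The table and color rule above can be sketched as follows. The field names match the `metrics` JSON in agent_snapshots; `THRESHOLDS`, `metricPercents`, and `colorClass` are hypothetical names, and treating a zero total (for example, no swap configured) as 0% is an assumption:

```typescript
interface SnapshotMetrics {
  cpu_pct: number;
  mem_used_mb: number;
  mem_total_mb: number;
  swap_used_mb: number;
  swap_total_mb: number;
  load_1m: number;
  cpu_cores: number;
  disk_pct: number;
}

type MetricName = "CPU" | "MEM" | "LOAD" | "DISK" | "SWAP";

const THRESHOLDS: Record<MetricName, { warn: number; danger: number }> = {
  CPU: { warn: 50, danger: 80 },
  MEM: { warn: 70, danger: 85 },
  LOAD: { warn: 50, danger: 80 },
  DISK: { warn: 70, danger: 85 },
  SWAP: { warn: 30, danger: 60 },
};

// used / total as a percentage; 0 when total is not positive (assumption).
function ratioPct(used: number, total: number): number {
  return total > 0 ? (used / total) * 100 : 0;
}

function metricPercents(m: SnapshotMetrics): Record<MetricName, number> {
  return {
    CPU: m.cpu_pct,
    MEM: ratioPct(m.mem_used_mb, m.mem_total_mb),
    LOAD: Math.min(100, ratioPct(m.load_1m, m.cpu_cores)), // capped at 100%
    DISK: m.disk_pct,
    SWAP: ratioPct(m.swap_used_mb, m.swap_total_mb),
  };
}

function colorClass(metric: MetricName, pct: number): string {
  const { warn, danger } = THRESHOLDS[metric];
  if (pct >= danger) return "text-red-400";
  if (pct >= warn) return "text-amber-400";
  return "text-slate-400";
}
```

For example, 3000 MB used of 4000 MB is 75%, which lands in the MEM warn band, while a 1-minute load of 8 on 4 cores is capped at 100% and shown as danger.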
Issue Triage
| Symptom | Layer | Where to Fix |
|---|---|---|
| Agent "Offline" but VM running | Portal API | health.ts — check timeout, IP routing, sb_token |
| Metrics not updating | Portal + Agent | Agent /health response; fleet-status.ts snapshot parsing |
| Screenshot stale/missing | Portal API | refresh.ts — browser_connected, screenshot timeout, fs write |
| "busy" stuck after job done | Portal API | health.ts status logic — claudeRunning should be false |
| Run Workflow modal empty | Portal SSR | agents.astro — check getPipelinesForAccount |
| Agent not appearing | Portal API | heartbeat.ts — agent_token, X-Agent-Name, IP detection |
| Queue not dispatching | Dispatch | dispatch.ts dedup / temporal/client.ts connection |
| Shared agent missing caps | Portal SSR | agents.astro L44-70 — enabled_roles is dispatch filter, not display |
Development Setup
cd website && npm install && npm run dev # Astro dev server
- SQLite: DATABASE_PATH env var; screenshots in sibling screenshots/ dir
- Temporal: must be running at 127.0.0.1:7233 for dispatch
- Agent registration: POST /api/agents/heartbeat with Authorization: Bearer {SIDEBUTTON_AGENT_TOKEN} + X-Agent-Name
- Auth: Auth0 for portal login; API token auth for agent↔portal
Gotchas
- No WebSocket for fleet: polling only (5s active, 30s idle)
- Status is multi-source: health response + running-workflows + DB job + claude_running flag, in priority order
- Shared agents strip mcp_json: entry_paths served without mcp_json cross-account (security)
- fleet-status vs health: fleet-status reads DB (fast, polling); health pings agents (slow, refresh). Both are called on page load
- 15s dispatch delay: minimum eligible_at offset before Temporal picks up queue items
- Screenshot storage: PNGs on filesystem (data/screenshots/{name}.png), not in DB
- 5-min heartbeat threshold: agent appears offline if no API call within 5 minutes
- Agent deletion cascades: events, shares, job_step nullify, then delete. No soft-delete