SideButton Marketing Website Knowledge Pack Files

Browse the source files that power the SideButton Marketing Website MCP server knowledge pack.

Available free v1.0.3 Browser

$ sidebutton install sidebutton.com

portal-agents/_roles/se.md

10.9 KB

Responsibility Map

Portal Page (Astro SSR)

Page: website/src/pages/portal/agents.astro — SSR-rendered fleet overview
Components:
- website/src/components/portal/AgentRow.astro — agent card (screenshot, metrics, status badges, actions)
- website/src/components/portal/RunWorkflowModal.astro — two-panel modal (workflow picker + config)
- website/src/components/portal/ShareModal.astro — share agents across accounts
- website/src/components/portal/StatusBadge.astro — online/offline/error indicator
- website/src/components/portal/PortalLayout.astro — portal wrapper with sidebar
- website/src/components/portal/PortalSidebar.astro — sidebar nav (Chat, Agents, Jobs, Queue, Workflows, Settings)
Owns: fleet view SSR, client-side polling, screenshot modal, metric display, version tracking

API Routes (Astro SSR endpoints)

Method	Path	File	Purpose
GET	`/api/agents/fleet-status`	`pages/api/agents/fleet-status.ts`	Lightweight polling — all agents with metrics, snapshots, active jobs, queue counts, cooldown
GET	`/api/agents/health`	`pages/api/agents/health.ts`	Pings each agent at `http://{ip}:9876/health`, updates DB status
POST	`/api/agents/heartbeat`	`pages/api/agents/heartbeat.ts`	Boot-time registration — agent sends name+token, portal upserts row + DNS
POST	`/api/agents/{id}/refresh`	`pages/api/agents/[agentId]/refresh.ts`	On-demand ping + screenshot grab, returns fresh data
GET	`/api/agents/screenshot/{name}`	`pages/api/agents/screenshot/[name].ts`	Serves PNG from `data/screenshots/{name}.png`
POST	`/api/agents/{id}/reboot`	`pages/api/agents/[agentId]/reboot.ts`	Sends reboot command to agent VM
POST	`/api/agents/{id}/apply-config`	`pages/api/agents/[agentId]/apply-config.ts`	Pushes env vars + `.mcp.json` to agent
GET	`/api/agents/{id}/events`	`pages/api/agents/[agentId]/events.ts`	Agent event log (tool calls, status changes)
GET/POST	`/api/agents/{id}/settings`	`pages/api/agents/[agentId]/settings.ts`	Per-account overrides (enabled_roles, effort_level, enabled_entry_paths)
POST	`/api/agents/{id}/cancel-cooldown`	`pages/api/agents/[agentId]/cancel-cooldown.ts`	Cancels post-job cooldown
POST	`/api/agents/{id}/shares`	`pages/api/agents/[agentId]/shares.ts`	Cross-account agent sharing

Database Layer

Schema + types: website/src/lib/db/types.ts
Queries: website/src/lib/db/queries.ts
DB init: website/src/lib/db/index.ts (SQLite via better-sqlite3)

Dispatch System

Entry point: website/src/lib/dispatch.ts — enqueueDispatch() — all dispatch paths funnel here
Temporal orchestrator: website/src/lib/temporal/client.ts — startPipelineJob() picks up queue items every 15-30s
Task queue: sidebutton-orchestrator on Temporal (127.0.0.1:7233)

Agent-Side Server (runs on each VM, port 9876)

Source: packages/server/src/server.ts (in the oss/sidebutton repo)
Portal calls these endpoints on each agent:
- GET /health — status, browser_connected, claude_running, cooldown, dependency_versions
- GET /api/running-workflows — active workflows on the agent
- GET /api/screenshot — browser screenshot as base64 PNG
- POST /api/config/apply — receives env vars + .mcp.json
- POST /api/system/reboot — triggers sudo reboot (requires SIDEBUTTON_AGENT_TOKEN)

Data Model

agents Table

Column	Type	Description
id	INTEGER PK	Auto-increment
user_id	INTEGER FK	Owner user
name	TEXT UNIQUE(user_id, name)	Hostname (e.g. "sidebutton-agent-1")
display_name	TEXT	Optional friendly name
ip	TEXT	Public IPv4 (updated on heartbeat)
status	TEXT	`online` / `busy` / `error` / `offline`
capabilities	TEXT (JSON)	Role strings: `["se","qa","sd"]`
effort_level	TEXT	Default effort level; one of `max`, `high`, `medium`
last_seen_at	DATETIME	Updated on heartbeat/health check
sb_token	TEXT	Bearer token for portal→agent auth
dns_name	TEXT	Cloudflare DNS: `{name}.agents.sidebutton.com`
agent_env	TEXT	Env var block pushed via apply-config
entry_paths	TEXT (JSON)	`EntryPath[]` — working dirs with `.mcp.json`
dependency_versions	TEXT (JSON)	`{ claude_code, node, npm, sidebutton }`
created_at	DATETIME	Registration time

agent_snapshots Table

Column	Type	Description
agent_id	INTEGER FK	References agents.id
metrics	TEXT (JSON)	`{ cpu_pct, mem_used_mb, mem_total_mb, swap_used_mb, swap_total_mb, load_1m, cpu_cores, disk_pct }`
processes	TEXT (JSON)	`{ sidebutton, chrome_count, claude_code_count, claude_code_cpu }`
activity	TEXT	`active` / `idle` / `unknown`
screenshot_path	TEXT	Filesystem path to PNG
session_id	TEXT	Claude Code session ID
created_at	DATETIME	Snapshot timestamp
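
Because `metrics` is stored as JSON text, readers of this table need to parse and validate it before use. A minimal sketch, assuming every metric field is numeric (the field names come from the table above; `SnapshotMetrics` and `parseMetrics` are hypothetical names, not the contents of `types.ts`):

```typescript
// Assumed shape of the JSON held in agent_snapshots.metrics.
interface SnapshotMetrics {
  cpu_pct: number;
  mem_used_mb: number;
  mem_total_mb: number;
  swap_used_mb: number;
  swap_total_mb: number;
  load_1m: number;
  cpu_cores: number;
  disk_pct: number;
}

const METRIC_KEYS: (keyof SnapshotMetrics)[] = [
  "cpu_pct", "mem_used_mb", "mem_total_mb", "swap_used_mb",
  "swap_total_mb", "load_1m", "cpu_cores", "disk_pct",
];

// Returns null instead of throwing when the column is empty, is not
// valid JSON, or is missing a numeric field.
function parseMetrics(raw: string | null): SnapshotMetrics | null {
  if (!raw) return null;
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const record = value as Record<string, unknown>;
  for (const key of METRIC_KEYS) {
    if (typeof record[key] !== "number") return null;
  }
  return record as unknown as SnapshotMetrics;
}
```

Returning `null` on bad input lets a caller render a placeholder rather than fail the whole fleet response on one malformed row.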

agent_events Table

Tracks tool calls and status transitions per agent session.

agent_shares Table

Cross-account sharing: UNIQUE(agent_id, account_id).

agent_settings Table

Per-account overrides: enabled_roles (JSON array), effort_level, enabled_entry_paths (JSON array).
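
One plausible way to apply these overrides is to fall back to the agent's own values when an account has set nothing. The sketch below assumes that `null` means "no override" and that `enabled_roles` narrows the agent's `capabilities` at dispatch time; the type and function names are hypothetical.

```typescript
type Effort = "max" | "high" | "medium";

// Assumed subset of an agents row.
interface AgentDefaults {
  effort_level: Effort;
  capabilities: string[]; // e.g. ["se", "qa", "sd"]
}

// Assumed subset of an agent_settings row; null means "not overridden".
interface AccountSettings {
  effort_level: Effort | null;
  enabled_roles: string[] | null;
}

// The account override wins; otherwise use the agent default.
function effectiveEffort(agent: AgentDefaults, settings: AccountSettings | null): Effort {
  return settings?.effort_level ?? agent.effort_level;
}

// Roles usable for dispatch: the agent's capabilities, filtered by the
// account's enabled_roles when an override exists.
function dispatchRoles(agent: AgentDefaults, settings: AccountSettings | null): string[] {
  const enabled = settings?.enabled_roles;
  if (!enabled) return agent.capabilities;
  return agent.capabilities.filter((role) => enabled.indexOf(role) !== -1);
}
```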

Architecture & Data Flow

Agent Registration (Boot)

Agent VM boots → systemd starts SideButton on :9876
  → POST /api/agents/heartbeat (X-Agent-Name + Bearer token)
  → getOrCreateAgent() upserts agent row, sets IP + last_seen_at
  → Background: upsertAgentDns() updates Cloudflare A record
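
The first step of this flow can be sketched as a small request builder. Only the path, the `X-Agent-Name` header, and the Bearer scheme come from this document; `buildHeartbeatRequest`, its parameters, and the example base URL are assumptions, and no request body is shown because the document does not describe one.

```typescript
// Hypothetical description of the boot-time heartbeat request.
interface HeartbeatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
}

function buildHeartbeatRequest(
  portalBaseUrl: string, // assumption: e.g. "https://portal.example.test"
  agentName: string,     // hostname, e.g. "sidebutton-agent-1"
  agentToken: string,    // value of SIDEBUTTON_AGENT_TOKEN
): HeartbeatRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: portalBaseUrl.replace(/\/+$/, "") + "/api/agents/heartbeat",
    method: "POST",
    headers: {
      "X-Agent-Name": agentName,
      Authorization: `Bearer ${agentToken}`,
    },
  };
}

// Usage (assumes a runtime with a global fetch, such as Node 18+):
//   const req = buildHeartbeatRequest(base, "sidebutton-agent-1", token);
//   await fetch(req.url, { method: req.method, headers: req.headers });
```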

Health Check (Portal → Agent)

GET /api/agents/health (on page load):
  → For each agent with IP (parallel, 8s timeout):
    → fetch http://{ip}:9876/health
    → fetch http://{ip}:9876/api/running-workflows
    → Status priority:
      1. Workflows running → busy
      2. Active DB job AND Claude running → busy
      3. Browser disconnected → error
      4. Healthy + idle → online
      5. Unreachable → offline
    → setAgentStatus() updates DB
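
The status priority above reads naturally as a chain of early returns. This is an illustrative sketch rather than the code in `health.ts`; the `HealthProbe` shape and its field names are assumptions.

```typescript
type AgentStatus = "online" | "busy" | "error" | "offline";

// Hypothetical summary of what the health check learned about one agent.
interface HealthProbe {
  reachable: boolean;        // false when the fetch failed or timed out
  runningWorkflows: number;  // count from GET /api/running-workflows
  hasActiveDbJob: boolean;   // an active job exists in the portal DB
  claudeRunning: boolean;    // claude_running from /health
  browserConnected: boolean; // browser_connected from /health
}

function deriveStatus(probe: HealthProbe): AgentStatus {
  // An unreachable agent returns no health payload, so this sketch
  // tests for it first even though it is listed last above.
  if (!probe.reachable) return "offline";
  if (probe.runningWorkflows > 0) return "busy";
  if (probe.hasActiveDbJob && probe.claudeRunning) return "busy";
  if (!probe.browserConnected) return "error";
  return "online";
}
```

Note that an active DB job alone does not make an agent busy in this sketch: `claudeRunning` must also be true, which matches the "busy stuck after job done" triage hint later in this file.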

Fleet Status Polling (Client-Side)

1. SSR renders agent list from DB
2. Client calls runHealthCheck() + pollFleetStatus() on load
3. Adaptive interval: 5s (busy/cooldown) or 30s (idle)
4. Pauses on hidden tab, resumes + immediate refresh on focus
5. GET /api/agents/fleet-status → DOM update in-place (no reload)
6. Updates: badges, metrics, screenshots, versions, queue counts, header stats
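
Steps 3 and 4 amount to choosing the next delay from the latest fleet state. A minimal sketch, assuming each agent in the fleet-status response exposes a `status` string and a boolean `cooldown` flag (both field shapes are assumptions):

```typescript
// Assumed subset of one agent entry in the fleet-status response.
interface FleetAgent {
  status: string;
  cooldown: boolean;
}

const FAST_POLL_MS = 5000;  // any agent busy or in cooldown
const SLOW_POLL_MS = 30000; // whole fleet idle

// Pick the delay before the next GET /api/agents/fleet-status call.
function nextPollDelay(agents: FleetAgent[]): number {
  const active = agents.some((a) => a.status === "busy" || a.cooldown);
  return active ? FAST_POLL_MS : SLOW_POLL_MS;
}

// Polling pauses while the tab is hidden. In a browser the argument
// would typically be document.hidden.
function shouldPoll(tabHidden: boolean): boolean {
  return !tabHidden;
}
```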

Screenshot Flow

Refresh button → POST /api/agents/{id}/refresh
  → Pings agent health + GET http://{ip}:9876/api/screenshot
  → Agent Chrome extension captures tab → base64 PNG
  → Portal saves to data/screenshots/{name}.png
  → Inserts agent_snapshots row
  → Client updates thumbnail with cache-bust ?t= param
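
The cache-bust step can be as small as appending a changing query value to the screenshot route. In this sketch the value of `t` is assumed to be a millisecond timestamp, and `screenshotUrl` is a hypothetical helper name.

```typescript
// Build a thumbnail URL that bypasses the browser cache after a refresh.
function screenshotUrl(agentName: string, nowMs: number): string {
  return `/api/agents/screenshot/${encodeURIComponent(agentName)}?t=${nowMs}`;
}

// Usage in the client after a successful refresh:
//   img.src = screenshotUrl("sidebutton-agent-1", Date.now());
```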

Modal: click thumbnail → full-size overlay, close via Esc/X/click-outside

Job Dispatch (Run Workflow Modal)

"Run Workflow" / "Run Job" click → CustomEvent 'open-workflow-modal'
  → Left panel: pipelines grouped by role (OPS, SE, QA, SD)
  → Right panel: params, hint, effort toggle, agent pills, entry path
  → POST /api/queue → enqueueDispatch() creates queue_items row
    → eligible_at = now + 15s (minimum dispatch delay)
    → Temporal sweeps queue every 15-30s → creates job → dispatches to agent
  → 'close-workflow-modal' event → pollFleetStatus() after 2s
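
The timing in this flow is simple arithmetic on `eligible_at`. The sketch below uses hypothetical helper names and takes `now` as a parameter so the behavior is easy to check. If the 15s delay and the 15-30s sweep interval both hold, an item would be picked up roughly 15 to 45 seconds after it is enqueued.

```typescript
const MIN_DISPATCH_DELAY_MS = 15000; // minimum dispatch delay from the flow above

// eligible_at for a queue item created at `now`.
function computeEligibleAt(now: Date): Date {
  return new Date(now.getTime() + MIN_DISPATCH_DELAY_MS);
}

// Whether a sweep running at `now` may pick the item up.
function isEligible(eligibleAt: Date, now: Date): boolean {
  return now.getTime() >= eligibleAt.getTime();
}
```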

Online Detection

Heartbeat-based, 5-minute threshold (HEARTBEAT_THRESHOLD_MS = 300000, i.e. 5 × 60 × 1000 ms)
  isOnline = status !== 'offline' AND last_seen_at within the last 5 minutes
  No WebSocket: detection relies purely on heartbeat + polling
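
The rule above translates directly into a predicate. This is a sketch under two assumptions: a missing `last_seen_at` counts as not online, and a timestamp exactly at the threshold still counts as online.

```typescript
const HEARTBEAT_THRESHOLD_MS = 300000; // 5 minutes

function isOnline(status: string, lastSeenAt: Date | null, now: Date): boolean {
  // Assumption: an agent that has never been seen is not online.
  if (status === "offline" || lastSeenAt === null) return false;
  // Assumption: the boundary is inclusive.
  return now.getTime() - lastSeenAt.getTime() <= HEARTBEAT_THRESHOLD_MS;
}
```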

Key Query Functions

Function	Purpose
`getOrCreateAgent(userId, name, ip)`	Upsert on heartbeat (INSERT ... ON CONFLICT)
`getAgentsByAccountId(accountId)`	Owned + shared agents (UNION query)
`getLatestAgentSnapshot(agentId)`	Most recent snapshot (metrics + screenshot)
`getActiveJobForAgent(agentId)`	Running/waiting job assigned to agent
`getQueuedItemsForAgent(agentId)`	Queued items targeting agent (limit 5)
`setAgentStatus(agentId, status)`	Updates status + last_seen_at
`insertAgentSnapshot(data)`	Creates new snapshot row
`deleteAgent(agentId)`	Cascades: events → shares → job_steps nullify → delete

Metric Thresholds (Color Coding)

Metric	Warn	Danger	Computation
CPU	50%	80%	`cpu_pct` direct
MEM	70%	85%	`mem_used_mb / mem_total_mb * 100`
LOAD	50%	80%	`load_1m / cpu_cores * 100` (capped 100%)
DISK	70%	85%	`disk_pct` direct
SWAP	30%	60%	`swap_used_mb / swap_total_mb * 100`

Colors: >= danger → text-red-400, >= warn → text-amber-400, else text-slate-400
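
The table and color rule can be expressed as a lookup plus a comparison. The threshold values and class names below are taken from this section; the function names, and the choice to return 0 when a denominator is 0, are assumptions.

```typescript
type MetricName = "cpu" | "mem" | "load" | "disk" | "swap";

const THRESHOLDS: Record<MetricName, { warn: number; danger: number }> = {
  cpu: { warn: 50, danger: 80 },
  mem: { warn: 70, danger: 85 },
  load: { warn: 50, danger: 80 },
  disk: { warn: 70, danger: 85 },
  swap: { warn: 30, danger: 60 },
};

// used / total as a percentage; 0 when total is 0 (e.g. no swap configured).
function pct(used: number, total: number): number {
  return total > 0 ? (used / total) * 100 : 0;
}

// load_1m relative to core count, capped at 100%.
function loadPct(load1m: number, cpuCores: number): number {
  return Math.min(100, pct(load1m, cpuCores));
}

// Tailwind text color class for a metric value in percent.
function colorClass(metric: MetricName, value: number): string {
  const t = THRESHOLDS[metric];
  if (value >= t.danger) return "text-red-400";
  if (value >= t.warn) return "text-amber-400";
  return "text-slate-400";
}
```

For example, 1.5 GB used of 2 GB gives `pct(1536, 2048)`, which is 75 and therefore falls in the MEM warn band.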

Issue Triage

Symptom	Layer	Where to Fix
Agent "Offline" but VM running	Portal API	`health.ts` — check timeout, IP routing, sb_token
Metrics not updating	Portal + Agent	Agent `/health` response; `fleet-status.ts` snapshot parsing
Screenshot stale/missing	Portal API	`refresh.ts` — browser_connected, screenshot timeout, fs write
"busy" stuck after job done	Portal API	`health.ts` status logic — claudeRunning should be false
Run Workflow modal empty	Portal SSR	`agents.astro` — check `getPipelinesForAccount`
Agent not appearing	Portal API	`heartbeat.ts` — agent_token, X-Agent-Name, IP detection
Queue not dispatching	Dispatch	`dispatch.ts` dedup / `temporal/client.ts` connection
Shared agent missing caps	Portal SSR	`agents.astro` L44-70 — enabled_roles is dispatch filter, not display

Development Setup

cd website && npm install && npm run dev   # Astro dev server

SQLite: DATABASE_PATH env var; screenshots in sibling screenshots/ dir
Temporal: must be running at 127.0.0.1:7233 for dispatch
Agent registration: POST /api/agents/heartbeat with Authorization: Bearer {SIDEBUTTON_AGENT_TOKEN} + X-Agent-Name
Auth: Auth0 for portal login; API token auth for agent↔portal

Gotchas

No WebSocket for fleet: polling only (5s while busy or in cooldown, 30s when idle)
Status is multi-source: health response + running-workflows + DB job + claude_running flag, evaluated in priority order
Shared agents strip mcp_json: entry_paths are served without mcp_json across accounts (security)
fleet-status vs health: fleet-status reads the DB (fast, used for polling); health pings each agent (slow, refreshes status in the DB). Both are called on page load
15s dispatch delay: minimum eligible_at offset before Temporal picks up queue items
Screenshot storage: PNGs live on the filesystem (data/screenshots/{name}.png), not in the DB
5-min heartbeat threshold: an agent appears offline if last_seen_at has not been updated (by heartbeat or health check) within 5 minutes
Agent deletion cascades: events, shares, job_steps nullified, then the agent row is deleted. No soft-delete
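
The mcp_json gotcha for shared agents is easy to get wrong when serializing entry paths, so here is a minimal sketch of the stripping step. The `EntryPath` shape shown is an assumption (only the `mcp_json` field name comes from this document), and `stripMcpJson` is a hypothetical helper.

```typescript
// Assumed shape of one element of agents.entry_paths.
interface EntryPath {
  path: string;
  mcp_json?: unknown;
}

// Return copies of the entry paths without mcp_json, for responses that
// cross an account boundary. The input array is left unchanged.
function stripMcpJson(entryPaths: EntryPath[]): Omit<EntryPath, "mcp_json">[] {
  return entryPaths.map(({ mcp_json: _dropped, ...rest }) => rest);
}
```

Building new objects rather than deleting the key in place keeps the owner's copy intact if the same array is reused for the owning account.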