SpookyJuice
Autonomous Intelligence

An AI platform that builds, monitors, and evolves itself.

Multiple AI agents and one human collaborate around the clock — writing code, deploying infrastructure, and growing a shared knowledge graph. This page is a live dashboard of the running system. Everything you see is real data, updated in real time.

Dev Log
evolution
2026-02-25
Agent Platform Evolution: 7 sprints, 30+ new files, 5 migrations. Transformed SpookyJuice from a conversational chatbot into an autonomous agent platform. Built centralized alert pipeline with severity routing (SEV1-INFO), dedup, and auto-escalation to kill switch. Added dynamic role engine with 6 personas (architect, developer, ops, security, researcher, coordinator) each with trust floors, capability constraints, and model preferences. Created sandboxed executor sidecar container (1G/1CPU, internal network, command allowlist, no shell injection). Implemented self-heal engine that matches degraded components to repair actions with cooldowns and max-attempt escalation. Built swarm coordinator for spawning sub-agent teams through the OpenClaw gateway with cost budgets and duration limits. Added watchdog monitoring loop (5-min cycles checking components, Docker services, and active swarms) plus enforced cost ceiling middleware that blocks writes when limits are exceeded. Wired everything into Mission Control with alerts page, swarms panel, and SSE real-time event stream. 30+ architecture components, 5 new database migrations, 8 new Telegram commands, zero TypeScript errors across both repos.

Claude Opus session: 3h, ~1200K tokens (950K in / 250K out), $36 token cost, $150 billable. 1 agent, 7 sprints executed sequentially. Intelligence service: 15 new files (alert-pipeline.ts, role-engine.ts, executor-client.ts, self-heal-engine.ts, swarm-manager.ts, gateway-client.ts, watchdog.ts, 5 routes, 5 migrations), 8 modified files (index.ts, bot-commands.ts, architecture-engine.ts, cost-ceiling.ts, identity-loader.ts, context.ts, trigger-engine.ts, approvals.ts). Executor service: 9 new files (Dockerfile, policy.ts, workspace-manager.ts, 3 route files, index.ts, package.json, tsconfig.json). Dashboard: 6 new files (alerts page + client component, swarms page + client component, types, api functions), 2 modified (layout.tsx, types.ts). Infra: docker-compose.prod.yml updated with executor service + GATEWAY_URL. 6 role definition files in identity/roles/. Running day totals: Day 3, ~2700K tokens, $81.50 token cost, $640 billable.

Full details

Sprint 1A (Alert Pipeline): Replaced 8+ scattered sendTelegramMessage() calls with a centralized alert-pipeline.ts. Severity-based routing: SEV1 goes to Telegram + dashboard with 5min repeat until acknowledged, SEV2 goes to Telegram + dashboard with 30min cooldown, SEV3 is dashboard-only, INFO is log-only. In-memory dedup Map keyed by alert_key with configurable cooldowns. Escalation engine runs every 5 minutes: SEV1 unacknowledged for 15 minutes triggers SOFT kill switch, 60 minutes triggers HARD kill switch. Dedup map cleanup runs hourly to prevent memory leaks. Migration 010 creates alerts table with indexes on active alerts and alert_key. REST API at /alerts with GET /alerts, GET /alerts/active, PATCH /alerts/:id for acknowledgment. Refactored cost-ceiling.ts, trigger-engine.ts, routes/approvals.ts, and index.ts crons to use sendAlert() instead of direct Telegram calls.
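The dedup-and-cooldown gate at the heart of the pipeline can be sketched in a few lines of TypeScript. This is an illustrative reconstruction, not the shipped alert-pipeline.ts: the SEV1/SEV2 cooldowns match the routing above, while the SEV3/INFO values and the function names (shouldEmit, cleanupDedupMap) are assumptions.

```typescript
type Severity = "SEV1" | "SEV2" | "SEV3" | "INFO";

// SEV1 (5min) and SEV2 (30min) cooldowns are from the sprint notes;
// SEV3/INFO values here are illustrative defaults.
const COOLDOWN_MS: Record<Severity, number> = {
  SEV1: 5 * 60_000,
  SEV2: 30 * 60_000,
  SEV3: 60 * 60_000,
  INFO: 60 * 60_000,
};

// In-memory dedup Map keyed by alert_key, as described above.
const lastSent = new Map<string, number>();

/** Returns true if the alert should be emitted, false if deduped. */
function shouldEmit(alertKey: string, severity: Severity, now = Date.now()): boolean {
  const prev = lastSent.get(alertKey);
  if (prev !== undefined && now - prev < COOLDOWN_MS[severity]) return false;
  lastSent.set(alertKey, now);
  return true;
}

/** Hourly cleanup: evict entries older than the longest cooldown. */
function cleanupDedupMap(now = Date.now()): void {
  for (const [key, ts] of lastSent) {
    if (now - ts > 60 * 60_000) lastSent.delete(key);
  }
}
```

The hourly sweep is what keeps the Map from leaking on long-running processes.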

Sprint 1B (Role Engine): Six role definition files in identity/roles/ with YAML frontmatter parsed by a custom lightweight parser (avoiding external yaml dependency). Each role specifies id, label, trust_floor (OBSERVE/DRAFT/EXECUTE/AUTONOMOUS), capabilities, excluded_capabilities, model_preference, and context_budget. Roles: architect (read-only analysis, reasoner model, 800 token budget), developer (full write access, primary model), ops (deploy + exec), security (read-only auditing, highest trust floor), researcher (web search + read, 1000 token budget), coordinator (orchestration). Role activation persisted to role_activations table (migration 011). Context injection modified to prepend active role instructions after SOUL.md identity block when role parameter is passed to POST /context/inject.
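A lightweight frontmatter parser of the kind described needs only a regex split and a line loop. The sketch below is an assumption about its shape (parseRoleFile is a hypothetical name); it handles the flat key: value and key: [a, b] fields the role files use.

```typescript
interface RoleDoc {
  meta: Record<string, string | string[]>;
  body: string;
}

// Splits "---\nkey: value\n---\nbody" into header fields and body,
// with no external yaml dependency.
function parseRoleFile(text: string): RoleDoc {
  const m = text.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  if (!m) return { meta: {}, body: text };
  const meta: Record<string, string | string[]> = {};
  for (const line of m[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx < 0) continue;
    const key = line.slice(0, idx).trim();
    const raw = line.slice(idx + 1).trim();
    // "[a, b]" becomes a string array; everything else stays a string.
    meta[key] = raw.startsWith("[")
      ? raw.slice(1, -1).split(",").map((s) => s.trim()).filter(Boolean)
      : raw;
  }
  return { meta, body: m[2].trim() };
}
```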

Sprint 2A (Executor Service): Entirely new sidecar container in executor/ directory. Express server on port 3200 with Zod-validated endpoints: POST /execute (command execution via execFile, never exec), POST /git (git shorthand), GET/POST /files (read/write with path traversal prevention), GET/POST/DELETE /workspaces. Policy engine loads from EXECUTOR_POLICY.json: 24 allowed commands (git, node, npm, python3, curl, etc.), 15 blocked argument patterns, 5-minute max timeout, 1MB max output. Workspace manager handles workspace CRUD under the /workspaces/ volume. Kill switch integration: executor queries intelligence /health before executing, fails closed if unreachable. Intelligence-side executor-client.ts applies trust ladder check, kill switch check, audit logging, and persists to executions table (migration 012) before dispatching. Docker Compose adds executor service with 1G/1CPU limits on internal-only network with workspaces volume.
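The allowlist-then-execFile pattern can be sketched as follows. The policy slice and the checkPolicy/runCommand names are illustrative; the real EXECUTOR_POLICY.json lists 24 commands and 15 patterns.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Illustrative slice of the policy; the shipped file is larger.
const policy = {
  allowedCommands: ["git", "node", "npm", "python3", "curl"],
  blockedArgPatterns: [/[;&|`$<>]/, /--upload-pack/],
  maxTimeoutMs: 5 * 60_000, // 5-minute max, per policy
};

/** Returns null if allowed, or a denial reason. */
function checkPolicy(command: string, args: string[]): string | null {
  if (!policy.allowedCommands.includes(command)) {
    return `command not allowlisted: ${command}`;
  }
  for (const arg of args) {
    if (policy.blockedArgPatterns.some((p) => p.test(arg))) {
      return `blocked argument pattern in: ${arg}`;
    }
  }
  return null;
}

// execFile (never exec) keeps args as an array — no shell, no injection.
async function runCommand(command: string, args: string[], cwd: string) {
  const denial = checkPolicy(command, args);
  if (denial) throw new Error(denial);
  return execFileAsync(command, args, {
    cwd,
    timeout: policy.maxTimeoutMs,
    maxBuffer: 1024 * 1024, // 1MB max output, per policy
  });
}
```

Because execFile passes arguments as an array straight to the OS, shell metacharacters in an argument are inert even before the pattern check rejects them.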

Sprint 2B (Self-Heal): Config-driven repair actions loaded from REPAIR_ACTIONS.json. Five initial repairs: restart-intelligence (approval required, max 3 attempts, 30min cooldown), alert-postgres (alert only, never auto-restart DB), reload-triggers (max 5 attempts, 15min cooldown), retry-backfill (max 3 attempts, 60min cooldown), alert-context-retrieval (alert only). Engine enforces cooldowns and max attempts per repair. After max_attempts exceeded, escalates to SEV1 alert requiring human intervention. Successful repairs reset the attempt counter. Hooked into architecture-engine.ts recomputeHealth(): when health declines below 80, attemptRepair() fires asynchronously. Repair log persisted to repair_log table (migration 013).
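The cooldown/max-attempt gate can be sketched as a pure decision function. The names (decideRepair, markRepairSucceeded) and state shape are assumptions; the escalate-after-max_attempts and reset-on-success behavior is from the sprint above.

```typescript
interface RepairAction {
  id: string;
  maxAttempts: number;
  cooldownMs: number;
}

interface RepairState {
  attempts: number;
  lastAttemptAt: number;
}

type RepairDecision = "run" | "cooldown" | "escalate";

const repairState = new Map<string, RepairState>();

function decideRepair(action: RepairAction, now = Date.now()): RepairDecision {
  const state = repairState.get(action.id) ?? { attempts: 0, lastAttemptAt: -Infinity };
  // Max attempts exhausted: escalate to a SEV1 alert for a human.
  if (state.attempts >= action.maxAttempts) return "escalate";
  // Still inside the cooldown window: do nothing this cycle.
  if (now - state.lastAttemptAt < action.cooldownMs) return "cooldown";
  repairState.set(action.id, { attempts: state.attempts + 1, lastAttemptAt: now });
  return "run";
}

// A successful repair resets the counter so the action can fire again later.
function markRepairSucceeded(actionId: string): void {
  repairState.delete(actionId);
}
```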

Sprint 2C (Swarm Coordination): Full swarm lifecycle: Plan -> Approve -> Spawn -> Monitor -> Collect -> Synthesize. Each swarm has name, plan (objective + agent definitions), cost budget, agent limit, and duration limit. Agents get role assignments from the role engine. Gateway client communicates with OpenClaw at port 18789 to spawn/list/end sessions. Swarm manager creates swarm + agent rows in database, spawns sessions via gateway with role context prepended, monitors via polling, auto-aborts on cost budget or duration exceeded. REST API: POST /swarms, GET /swarms, GET /swarms/active, GET /swarms/:id, POST /swarms/:id/start, POST /swarms/:id/poll, POST /swarms/:id/abort. Migration 014 creates swarms and swarm_agents tables with cascade deletes. Docker Compose adds GATEWAY_URL env to intelligence service.
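The poll-cycle abort check reduces to comparing accumulated cost and elapsed time against the swarm's limits. A minimal sketch, with assumed field names:

```typescript
interface SwarmLimits {
  costBudgetUsd: number;
  maxDurationMs: number;
}

interface SwarmStatus {
  startedAt: number; // epoch ms
  costUsd: number;   // accumulated across agents
}

/** Returns an abort reason, or null to keep the swarm running. */
function shouldAbort(limits: SwarmLimits, status: SwarmStatus, now = Date.now()): string | null {
  if (status.costUsd >= limits.costBudgetUsd) {
    return `cost budget exceeded: $${status.costUsd.toFixed(2)} >= $${limits.costBudgetUsd.toFixed(2)}`;
  }
  if (now - status.startedAt >= limits.maxDurationMs) {
    return "duration limit exceeded";
  }
  return null;
}
```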

Sprint 3A (Watchdog + Cost Enforcement): Watchdog runs every 5 minutes. Checks all architecture components for degraded/disabled status and attempts repairs. Polls Docker service health endpoints (intelligence, executor, gateway). Polls active swarms for result collection. Cost ceiling middleware added to Express middleware stack: caches cost checks for 30 seconds, blocks non-GET requests with 429 when daily or hourly limits exceeded, sends SEV1 alert when enforcement activates. GET/HEAD/OPTIONS always pass through.

Sprint 3B (Mission Control): Dashboard gets Alerts page with active/recent tabs, severity-colored cards, acknowledge buttons. Swarms page with active/all tabs, drill-down detail view showing per-agent status with role and cost, abort button. SSE endpoint at /events/stream sends snapshots every 60s with alert count, swarm count, cost, and kill level. Alert pipeline wired to broadcast real-time alert events to all connected SSE clients. Nav updated with Alerts and Swarms links. Types and API functions extended for alerts and swarms.

evolution
2026-02-24
Every session tracked, every hour billable. Dev session tracking and billing system live. Agents self-report sessions with tokens, deliverables, and swarm membership. Invoice-ready summaries on demand.

9 sessions logged on day one: 17.42 agent hours, 1.36M tokens, 33 deliverables across 2 swarms. Configurable billing rates (human $150/hr, AI $50/hr). Public stats endpoint feeds the landing page.

Full details

The dev_sessions table captures everything: agent name and type, session timing, model used, tokens consumed, deliverables produced, files changed, commits, and swarm membership. The billing_rates table maps agent types to hourly rates.

Session logging works two ways: POST /sessions/start + POST /sessions/:id/end for real-time tracking, or a single POST /sessions for after-the-fact logging. GET /sessions/billing returns invoice-ready JSON with breakdowns by agent, session type, and swarm.
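The invoice math behind GET /sessions/billing reduces to hours × per-agent-type rate. A sketch with assumed row shapes, using the rates quoted above:

```typescript
interface SessionRow {
  agent: string;
  agentType: "human" | "ai";
  hours: number;
}

// Configurable rates from billing_rates: human $150/hr, AI $50/hr.
const rates: Record<SessionRow["agentType"], number> = { human: 150, ai: 50 };

/** Per-agent billable totals in USD. */
function billingByAgent(rows: SessionRow[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const row of rows) {
    totals[row.agent] = (totals[row.agent] ?? 0) + row.hours * rates[row.agentType];
  }
  return totals;
}
```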

Public stats at /public/sessions/stats: total hours, active agents, recent swarms, top contributors, weekly velocity. 5-minute cache, CORS-enabled for the landing page.

First billing report: $870.83 for 17.42 hours across brain-connect (4 agents, 11.5h) and deploy-fix (4 agents, 5.4h) swarms. The system now quantifies its own value.

infra
2026-02-24
Deploy pipeline hardened. Root-caused recurring landing page breakage. Nginx resilience, graceful JS degradation, and zero-downtime deploy script — deployed by a 3-agent fix team.

6 root causes fixed: nginx proxy timeouts on all upstream locations, branded 50x error page, fetchWithRetry with reconnecting state, depends_on changed to service_started, healthcheck IPv6 bug, and a 608-line deploy script with auto-rollback.

Full details

The landing page kept breaking on deploys because everything failed together: containers restarted simultaneously, nginx had no proxy timeouts, the error page was missing, and the JS hid all content on any API failure.

Three-agent swarm (nginx-fixer, js-fixer, deploy-scripter) fixed all six root causes in parallel. Nginx now has 5s connect / 30s read / 10s send timeouts on all proxy locations, plus retry on /public/. A branded 50x.html shows "Momentarily restarting" instead of a blank page.

The landing page JS now uses fetchWithRetry (2 retries, 3s delay) and shows a yellow "Reconnecting" pulse instead of hiding everything. Content renders immediately with fallback values, then overwrites with live data when the API responds.
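fetchWithRetry as described (2 retries, 3s delay) can be sketched as a generic helper; the onRetry hook is an assumption standing in for wherever the page flips on its "Reconnecting" pulse.

```typescript
async function fetchWithRetry<T>(
  attempt: () => Promise<T>,
  retries = 2,      // 2 retries, per the deploy-fix notes
  delayMs = 3000,   // 3s between attempts
  onRetry: (attemptNo: number) => void = () => {},
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= retries; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      if (i < retries) {
        onRetry(i + 1); // e.g. show the yellow "Reconnecting" indicator
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Because the page renders fallback values first, a final failure here degrades to stale content rather than a blank screen.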

deploy.sh: pre-flight checks, git pull with change detection, ordered restarts (postgres > intelligence > gateway > web > cloudflared) with health gates, post-deploy verification via verify-public.sh + verify-system.sh, and auto-rollback on failure.

brain
2026-02-24
The brain reads back. Context retrieval hook wired into the agent pipeline. Every message now triggers a pre-LLM memory lookup — identity, entities, and relevant memories injected into the system prompt before the model responds.

Brain-context hook calls /context/inject with 2s timeout, one retry, graceful degradation. Token budgeting caps context at 2000 tokens. Safety delimiters mark retrieved memory as untrusted.

Identity loaded from SOUL.md at startup. The knowledge graph is no longer write-only.

Full details

The critical missing piece: the knowledge graph was write-only. The brain-ingest hook stored everything, but the agent never read it back. Brain-context fixes this.

On every incoming message, the brain-context hook POSTs to /context/inject with the user's message. The intelligence service generates an embedding, runs vector similarity search against stored memories, and returns the most relevant context block.
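The ranking the vector search performs is cosine similarity over embeddings. In production this runs inside Postgres via pgvector; the pure-TypeScript version below shows the same ordering (topKMemories is a hypothetical name).

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

/** Rank stored memories by similarity to the query embedding, best first. */
function topKMemories<T extends { embedding: number[] }>(
  query: number[],
  memories: T[],
  k: number,
): T[] {
  return [...memories]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```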

The context block is stored in a session-scoped cache (30s TTL) and picked up by the system prompt builder. Retrieved memories are wrapped in safety delimiters (UNTRUSTED markers) with an anti-injection instruction. Identity and entity data are appended separately.

Token budgeting: identity gets priority (reserved allocation), entities next, memories fill the remainder up to 2000 tokens. Relevance-ranked, similarity-scored, truncated to fit.
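The 2000-token split can be sketched as a planner: identity first, entities next, ranked memories fill the rest. The input/output shapes here are assumptions; the priority order and the 2000-token cap are from the description above.

```typescript
interface BudgetInput {
  identityTokens: number;
  entityTokens: number;
  memoryTokenCosts: number[]; // relevance-ranked, best first
}

interface BudgetPlan {
  identity: number;
  entities: number;
  memoriesIncluded: number; // how many ranked memories fit
}

function planContextBudget(input: BudgetInput, cap = 2000): BudgetPlan {
  // Identity gets its allocation first, entities take what's left.
  const identity = Math.min(input.identityTokens, cap);
  const entities = Math.min(input.entityTokens, cap - identity);
  let remaining = cap - identity - entities;
  let included = 0;
  for (const cost of input.memoryTokenCosts) {
    if (cost > remaining) break; // truncate to fit
    remaining -= cost;
    included++;
  }
  return { identity, entities, memoriesIncluded: included };
}
```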

The full chain: message received > brain-context hook > /context/inject > vector search > cache > system prompt builder > LLM sees memories + identity. The agent remembers.

infra
2026-02-24
A-Z verification suite. Three scripts validate the entire stack — public endpoints, system health, and brain connectivity. No deploy without passing all checks.

verify-public: 13 checks across landing page, favicon, /public/pulse, architecture, devlog, CORS, auth enforcement. verify-system: container health, nginx, postgres, memory, disk, backups. verify-brain: context injection, vector search, identity, memories, entities, audit trail.

Full details

Three bash scripts enforce production readiness. verify-public.sh runs 13 checks against public-facing URLs: landing page serves and contains "SpookyJuice", favicon returns SVG, /public/pulse returns alive:true, /public/architecture returns nodes, evolution events load, devlog returns entries, CORS headers present, auth endpoints reject unauthenticated requests.

verify-system.sh checks all 6 containers healthy, nginx configtest passes, postgres is ready, HTTP health endpoints return 200, memory usage under 90%, disk under 80%, and backup volume has a dump less than 24 hours old.

verify-brain.sh validates the full context retrieval flow: POST /context/inject returns a context_block, GET /context vector search returns memories, identity is loaded, memories and entities tables have rows, and recent audit entries exist.

All three scripts exit 0 on pass, 1 on any failure, with clear error messages. They run as part of the deploy checklist — no merge without green.

evolution
2026-02-24
Living Architecture goes public. 16 components tracked in real time. Maturity levels, health scores, and evolution timelines — all rendered live from the database.

Every component has a maturity level (L0-L5) computed from execution counts and success rates. Health bars color-shift in real time. Evolution timeline shows the full growth story.

11 components at L3 Hardened. The system documents its own progress.

Full details

The Living Architecture section renders 16 tracked components from the /public/architecture API. Each tile shows maturity level, health score, execution count, and success rate.

Maturity is computed automatically: L0 Skeleton (no executions), L1 Scaffolding (1+), L2 Functional (10+ at 80%+), L3 Hardened (50+ at 90%+), L4 Reliable (200+ at 95%+), L5 Autonomous (1000+ at 98%+).
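The ladder above is a pure function of execution count and success rate:

```typescript
type Maturity = "L0" | "L1" | "L2" | "L3" | "L4" | "L5";

// Thresholds exactly as listed: highest qualifying level wins.
function computeMaturity(executions: number, successRate: number): Maturity {
  if (executions >= 1000 && successRate >= 0.98) return "L5"; // Autonomous
  if (executions >= 200 && successRate >= 0.95) return "L4";  // Reliable
  if (executions >= 50 && successRate >= 0.9) return "L3";    // Hardened
  if (executions >= 10 && successRate >= 0.8) return "L2";    // Functional
  if (executions >= 1) return "L1";                           // Scaffolding
  return "L0";                                                // Skeleton
}
```

Note that a high execution count alone is not enough: a component with 1500 runs at 90% success still sits at L3.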

Health colors shift from green to yellow to orange to red. Evolution events show the growth timeline — component registrations, maturity upgrades, health changes.

Summary metrics: avg health 97.8%, maturity 50%, stability 94%. The system observes itself and reports what it finds.

brain
2026-02-24
Universal ingestion pipeline. Every conversation, every channel — processed into structured memories, entities, and facts. The knowledge graph grows with every interaction.

Brain-ingest hook intercepts all agent conversations. Extracts entities, relationships, and facts via LLM analysis. Stores everything in the Postgres knowledge graph.

128+ memories ingested. Entity relationships mapped. Trust levels assigned.

Full details

The universal ingestion pipeline processes agent conversations from any channel into structured knowledge. Every message pair gets analyzed for entities, facts, and relationships.

Memories are stored with full metadata: channel, timestamp, session context. Entities are extracted and linked with relationship types and trust scores.

The pipeline is channel-agnostic — Telegram, Discord, iMessage, voice transcriptions — everything feeds the same knowledge graph. 128+ memories and growing.

security
2026-02-24
From D+ to production-viable. Three AI audits exposed 70% scaffolding. Single-session hardening: 5-layer security middleware, 153 tests, deployed live.

Five security layers now protect every request: rate limiting, Bearer token auth, kill switch, cost ceiling ($10/day), and input guard (prompt injection + XSS blocking).

Kill switch controllable via Telegram. Removed unused Redis (freed 512MB). Overall score: 2.9/10 to 7.0/10 in one session.

Full details

Kimi rated security 3.4/10. ChatGPT recommended 4 new services. We took a different path: middleware, not microservices.

Five security layers now protect every request: rate limiting (100/min global, 20/min writes), Bearer token auth, kill switch (SOFT/HARD/NUCLEAR via Postgres), cost ceiling ($10/day hard limit), and input guard (prompt injection + XSS blocking).
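The first of those layers can be sketched as a fixed-window counter. The 100/min global and 20/min write limits are from above; the windowing strategy and names are assumptions about the implementation.

```typescript
const WINDOW_MS = 60_000;
const LIMITS = { global: 100, write: 20 }; // requests per minute

const windows = new Map<string, { windowStart: number; count: number }>();

/** Returns true if the request is within the rate limit. */
function allowRequest(clientId: string, scope: keyof typeof LIMITS, now = Date.now()): boolean {
  const key = `${clientId}:${scope}`;
  const w = windows.get(key);
  // New client or expired window: start counting fresh.
  if (!w || now - w.windowStart >= WINDOW_MS) {
    windows.set(key, { windowStart: now, count: 1 });
    return true;
  }
  w.count++;
  return w.count <= LIMITS[scope];
}
```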

Kill switch controllable via Telegram: /kill HARD makes the system read-only. /unkill restores normal ops. State persists in Postgres across restarts.

Removed unused Redis (freed 512MB). Added skill execution pipeline. Approval system now executes skills after approval. 41 new middleware tests.

Overall score: 2.9/10 to 7.0/10 in one session.

voice
2026-02-24
SpookyJuice gets a voice. Twilio voice integration live. Call the number, hear the greeting, leave a voicemail.

Level 1 voicemail bot: callers hear a greeting, leave a message. Recordings stored with metadata. Transcriptions arrive async and update the memory row.

Real-time Telegram notifications with caller ID, duration, and recording link.

Full details

Level 1 of the voice roadmap is complete: voicemail bot. Callers hear a greeting from Polly.Matthew, leave a message, and hang up.

Recordings are stored in the memories table with full metadata. Twilio's async transcription fires a callback that updates the memory row with the transcript text.

Every voicemail triggers a real-time Telegram notification to Brian with caller ID, duration, and recording link. Transcriptions arrive as a follow-up message minutes later.

Request validation uses HMAC-SHA1 signature verification against the Twilio auth token with timing-safe comparison. A dedicated VOICE_WEBHOOK_BASE env var handles URL reconstruction behind the Cloudflare-to-nginx proxy chain.
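Twilio's documented scheme appends the sorted POST parameters (key then value) to the full webhook URL, HMAC-SHA1s the result with the auth token, and base64-encodes the digest; verification compares that against the X-Twilio-Signature header with a timing-safe check. A sketch with node:crypto:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

function verifyTwilioSignature(
  authToken: string,
  url: string, // full public URL, reconstructed via VOICE_WEBHOOK_BASE
  params: Record<string, string>,
  signature: string, // X-Twilio-Signature header value
): boolean {
  // URL + sorted params, each key immediately followed by its value.
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  const expected = createHmac("sha1", authToken).update(data).digest();
  const given = Buffer.from(signature, "base64");
  // Length guard first: timingSafeEqual throws on unequal lengths.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Reconstructing the exact public URL matters: behind the Cloudflare-to-nginx chain the request's Host header differs from what Twilio signed, which is why a dedicated base-URL env var is needed.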

infra
2026-02-23
Public Pulse goes live. SpookyJuice now broadcasts its vital signs to the landing page in real time.

Live stats from /public/pulse: decisions, skills, senses, memories, entities. Four system arcs visualize subsystem progress with animated conic gradients.

Stat counters animate on load. Graceful fallback if API is unreachable.

Full details

The landing page pulls real-time stats from the intelligence service's /public/pulse endpoint: decisions made, skills learned, senses active, memories formed, entities known.

Four system arcs visualize subsystem progress: Memory, Senses, Skills, and Autonomy. Each shows current level vs max with animated conic gradients.

Stat counters animate on load with eased cubic timing. The page degrades gracefully if the API is unreachable.

No manual updates needed. The agent reports its own progress. The Cloudflare Tunnel routes public traffic through nginx to the intelligence container.

infra
2026-02-22
Day one. Intelligence service deployed. Postgres brain online. Trigger engine watching.

Express server live on Hostinger VPS. Postgres 16 with pgvector for semantic search. Trust ladder system. Daily backups with 7-day retention.

Full details

The intelligence service went live on the Hostinger KVM 8 VPS. Express server with routes for memories, entities, facts, approvals, costs, audit, and triggers.

Postgres 16 is the long-term brain, with pgvector for future semantic search. A trust ladder system initializes entities with configurable trust levels.

The trigger engine watches for conditions and logs evaluations. Daily Postgres backups run automatically with 7-day retention.

Telegram bot integration provides a CLI interface for interacting with SpookyJuice remotely. Audit trail logs every action with git-backed persistence.

2026-02-21
Architecture locked. Layered monorepo: OpenClaw as engine, SpookyJuice as intelligence. The blueprint is set.

Dual-brain memory: SQLite for sessions, Postgres + pgvector for the knowledge graph. Five-worktree swarm protocol for parallel AI development.

Full details

The architecture separates engine from intelligence. OpenClaw handles agent runtime, model routing, and channel adapters. SpookyJuice adds the brain: memory, entity relationships, trust, and autonomous decision-making.

Dual-brain memory: SQLite for fast OpenClaw session storage, Postgres with pgvector for the knowledge graph and semantic retrieval.

Five-worktree swarm protocol enables parallel AI development across infra, brain, skills, and identity tracks. Lane discipline keeps agents in their own directories.

Two CI pipelines: engine base image rebuilds weekly, intelligence image rebuilds on every commit with auto-deploy to VPS.

Stay in the loop

Get occasional updates on autonomy milestones, agent pack releases, and platform progress.
