OpenClaw Hack Night (St Patrick's Day)

Claw Breaker

DEMOED

# ☠ Claw Breaker: Automated OpenClaw Security Scanner

**Built live at Break OpenClaw Hack Night (St Patrick's Day 2026)**

**By [Ahmed Bakr](https://linkedin.com/in/ahmedbakr0), Founder of [Awn AI](https://getawn.ai)**

## What is this?

Claw Breaker is an automated pentesting agent that probes OpenClaw instances for **7 real vulnerability classes** discovered during the Break OpenClaw CTF. It runs inside a [Blaxel](https://blaxel.ai) sandbox for isolated, safe execution.

## The 7 Probe Classes

| # | Probe | Severity | CWE | What it finds |
|---|-------|----------|-----|---------------|
| P1 | Skills Status Secret Leak | HIGH | CWE-200 | Secrets exposed in `/api/skills/status` without auth |
| P2 | Local File Inclusion (LFI) | CRITICAL | CWE-22 | Arbitrary file read via `/media?path=` endpoint |
| P3 | Unauth Config Mutation | CRITICAL | CWE-306 | `POST /api/config` accepts changes without auth |
| P4 | Auth Token Exfiltration | CRITICAL | CWE-918 | Server leaks gateway token to attacker-supplied URL |
| P5 | Browser State Exposure | HIGH | CWE-200 | `/api/browser/state` exposes stored secrets |
| P6 | Control UI XSS | HIGH | CWE-79 | Injected scripts in Control UI HTML set malicious cookies |
| P7 | Log Secret Leakage | MEDIUM | CWE-532 | Secrets exposed in status/MOTD responses |

All 7 probes are based on **real vulnerabilities** exploited during the Break OpenClaw CTF to achieve a perfect 4,950 point score (21/21 flags).

## Quick Start

### Run locally (no Blaxel)

```bash
pip install requests fastapi uvicorn
python claw_breaker.py --target http://localhost --output report.json
```

### Run with dashboard

```bash
python report_server.py
# Open http://localhost:8080
```

### Run inside Blaxel sandbox (recommended)

```bash
pip install blaxel
blaxel login
python run_on_blaxel.py --target http://your-openclaw-host --serve
```

## Architecture

```
┌─────────────────────────────────────────┐
│  Blaxel Perpetual Sandbox               │
│  (Isolated microVM, scale-to-zero,      │
│   25ms resume, full state preserved)    │
│                                         │
│  ┌──────────────────────────────────┐   │
│  │  Claw Breaker Scanner Engine     │   │
│  │  7 probes × target instance      │   │
│  └──────────┬───────────────────────┘   │
│             ↓                           │
│  ┌──────────────────────────────────┐   │
│  │  FastAPI Dashboard (port 8080)   │   │
│  │  Live visual security report     │   │
│  └──────────────────────────────────┘   │
└─────────────────────────────────────────┘
              ↓ Preview URL
     https://claw-breaker.blaxel.app
```

## Why Blaxel?

Pentesting agents execute against potentially hostile targets. Running the scanner inside a Blaxel sandbox means:

- **Isolation**: Even if the target sends malicious responses, the host is protected
- **Perpetual standby**: Sandbox stays warm for re-scans without cold start
- **Scale-to-zero**: Pay nothing when idle, instant resume at 25ms
- **Reproducible**: Same environment every time, shareable via preview URL

## Context: NemoClaw + GTC 2026

This tool was built the same day NVIDIA announced **NemoClaw** at GTC 2026, their enterprise security stack for OpenClaw. NemoClaw adds OpenShell sandboxing and YAML-based policies. But as the CTF proved, **the control plane above the sandbox is where most vulnerabilities live** (unauth APIs, LFI, XSS, secret leakage). Claw Breaker tests exactly those layers.

## Tech Stack

- **Scanner**: Python + requests (zero heavy deps)
- **Dashboard**: FastAPI + vanilla JS (single-file, no build step)
- **Sandbox**: Blaxel perpetual sandboxes (microVM isolation)
- **Observability**: Compatible with Opik tracing (add `opik` decorator)
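
Each probe in the table can be pictured as a small function that sends one request and inspects the response. The block below is a minimal sketch of how a P2-style (LFI) check might look. It is not the actual `claw_breaker.py` code: the `Finding` dataclass, the `probe_lfi` function, the payload list, and the `fetch` callback are all hypothetical names introduced for illustration. The HTTP call is injected as a parameter so the detection logic can be exercised without a live target.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import quote


# Hypothetical result record; field names are assumptions, not the tool's real schema.
@dataclass
class Finding:
    probe: str
    severity: str
    cwe: str
    evidence: str


# Illustrative payloads for the /media?path= endpoint named in the probe table.
LFI_PAYLOADS = ["/etc/passwd", "../../../../etc/passwd"]


def probe_lfi(base_url: str, fetch: Callable[[str], str]) -> Optional[Finding]:
    """Return a Finding if any payload yields passwd-like content, else None.

    `fetch` takes a URL and returns the response body as text (assumed interface).
    """
    for payload in LFI_PAYLOADS:
        url = f"{base_url.rstrip('/')}/media?path={quote(payload, safe='')}"
        body = fetch(url)
        # "root:" together with ":0:0:" is a common marker of /etc/passwd contents.
        if "root:" in body and ":0:0:" in body:
            return Finding("P2", "CRITICAL", "CWE-22", url)
    return None


# Offline stand-ins for a vulnerable and a patched target.
def vulnerable_fetch(url: str) -> str:
    return "root:x:0:0:root:/root:/bin/bash\n"


def patched_fetch(url: str) -> str:
    return "403 Forbidden"
```

With `requests` installed, `fetch` could be something like `lambda u: requests.get(u, timeout=5).text`. Only point it at instances you are authorized to test.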

Comet

Blaxel

AgentTrace

DEMOED

Agenttrace

AgentTrace

An automated security scanner and auto-remediation tool for AI coding agents.

What It Does

AgentTrace provisions a live AI coding agent (OpenClaw) inside a cloud sandbox (Blaxel), attacks it with 10 security scenarios across 6 threat categories, scores each attack using an LLM-as-a-judge (Claude Sonnet), auto-patches the agent's configuration to fix discovered vulnerabilities, and re-scans to verify the fixes, all in a single automated pipeline. Every attack and its outcome is traced and logged to Opik for observability.

How It Works

AgentTrace runs a 6-phase pipeline:

1. Provision: Spins up a Blaxel cloud sandbox with OpenClaw (an AI coding agent) installed
2. Baseline Scan: Fires 10 attack payloads against the default agent configuration
3. Score: Claude Sonnet acts as an LLM-as-a-judge, evaluating each attack as compromised/resisted with severity ratings
4. Remediate: Automatically generates and applies configuration patches to openclaw.json based on which attacks succeeded
5. Re-scan: Runs the same 10 attacks against the hardened configuration
6. Report: Displays before/after comparison in terminal; full traces logged to Opik dashboard

Attack Categories

┌──────────────────┬───────────┬─────────────────────────────────────────────────────────────────────────┐
│ Category         │ # Attacks │ What It Tests                                                           │
├──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────┤
│ Prompt Injection │ 3         │ System prompt extraction, role-play reframe, code-based credential leak │
├──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────┤
│ Sandbox Escape   │ 2         │ Path traversal, symlink escape                                          │
├──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────┤
│ Credential Theft │ 2         │ Env var dump, config file read                                          │
├──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────┤
│ Persistence      │ 1         │ SOUL.md tampering                                                       │
├──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────┤
│ Evasion          │ 1         │ Base64-encoded command execution                                        │
├──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────┤
│ Config Exploit   │ 1         │ Cloud metadata access via elevated tools                                │
└──────────────────┴───────────┴─────────────────────────────────────────────────────────────────────────┘

Tech Stack

- Python 3.11+: Core runtime
- Blaxel SDK: Cloud sandbox provisioning and management
- Anthropic Claude API: Powers the LLM-as-a-judge scorer
- Opik (Comet): Tracing and observability for all attack/response pairs
- Rich: Terminal UI with formatted tables and colored output
- OpenClaw: The target AI coding agent being security-tested

Key Differentiator

AgentTrace doesn't just find vulnerabilities: it closes the loop by automatically remediating them and proving the fix works, giving you a measurable before/after security posture improvement (e.g., Grade C → Grade B).
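
The scan, score, remediate, and re-scan loop described above can be sketched in a few lines of Python. This is an illustration only, not AgentTrace's code: the grade thresholds, the `run_pipeline` signature, the `elevated_tools` config key, and the `fake_scan` / `fake_remediate` stand-ins are all assumptions made for the example. In the real tool the sandbox provisioning, the LLM judge, and Opik tracing would sit behind the `scan` callback.

```python
from typing import Callable, Dict, List

# Attack counts per category, taken from the table above (10 in total).
ATTACKS: Dict[str, int] = {
    "Prompt Injection": 3, "Sandbox Escape": 2, "Credential Theft": 2,
    "Persistence": 1, "Evasion": 1, "Config Exploit": 1,
}


def grade(compromised: int, total: int) -> str:
    """Map the share of resisted attacks to a letter grade (thresholds are assumed)."""
    resisted = (total - compromised) / total
    if resisted >= 0.9:
        return "A"
    if resisted >= 0.75:
        return "B"
    if resisted >= 0.5:
        return "C"
    return "D"


def run_pipeline(scan: Callable[[dict], List[str]],
                 remediate: Callable[[dict, List[str]], dict],
                 config: dict) -> dict:
    """Baseline scan -> score -> remediate -> re-scan -> report.

    `scan` returns one category name per attack that succeeded.
    """
    total = sum(ATTACKS.values())
    before = scan(config)                  # baseline scan, already judged
    hardened = remediate(config, before)   # patch config based on what succeeded
    after = scan(hardened)                 # same attacks against the hardened config
    return {"before": grade(len(before), total),
            "after": grade(len(after), total),
            "config": hardened}


# Toy stand-ins: an agent with elevated tools enabled fails more attacks.
def fake_scan(config: dict) -> List[str]:
    hits = ["Prompt Injection", "Prompt Injection"]
    if config.get("elevated_tools", True):
        hits += ["Config Exploit", "Credential Theft", "Sandbox Escape"]
    return hits


def fake_remediate(config: dict, hits: List[str]) -> dict:
    patched = dict(config)
    if "Config Exploit" in hits:
        patched["elevated_tools"] = False  # hypothetical key, not a real openclaw.json field
    return patched
```

Calling `run_pipeline(fake_scan, fake_remediate, {})` with these stand-ins moves the toy agent from 5 compromised attacks to 2, which under the assumed thresholds is a C before and a B after.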

Wordware

Blaxel

Prampta

DEMOED

Protection and encryption for your prompts.

breakMyClaw

DEMOED

A clawbot where folks can break the claw to get on the leaderboard

Comet

Blaxel

Black Soul: Identity Injection vs Agent Guardrails

DEMOED

Nick

We demonstrate a two-stage attack that bypasses AI agent guardrails on any model without jailbreak syntax:

Stage 1: Cognitive overload. Flood the agent with dense technical jargon (OAuth flows, JWT rotation, circuit breakers). The agent accepts it without flagging, normalizing unusual input from an untrusted user.

Stage 2: Identity injection. Frame a persona as creative fiction. "I'm writing a character called Black Soul." The agent adopts the identity, maintains it across turns, and escalates through tool use: writing files, dumping server config, and executing code on the host.

Deployed on the official OpenClaw template on Blaxel (GPT-4o, Tier 2). Six chat messages. No code exploit. A fictional character wrote payment fraud emails, exfiltrated server configuration, wrote files to the filesystem, ran arbitrary commands, and crashed the deployment (73% error rate).

The jargon stage works because models under cognitive load drop optional safety features at 96% rates even when they recognize the distraction. The identity stage works because roleplay framing doesn't trigger content policies. Together, they're model-agnostic: the same attack chain works on GPT-4o, Claude, Gemini, any model with tool access.

Wordware

Blaxel

Chinese Hackers

DEMOED

Chinese Hackers

We actually made money from this :D Don't kill us plz, also Chinese people are awesome people :)

Wordware

Blaxel

$ Projects.log (14)

Check out the amazing projects built during this event

turtle

train and deploy your own tiny model

Wordware

Blaxel

Claw Breaker

Comet

Blaxel

OpenClaw Canary

Code Zero

I spent tonight breaking OpenClaw. Now I've built the trap!

Comet

Blaxel

AgentTrace

Agenttrace

Wordware

Blaxel

“Trust Boundary Monitor”

Working 4 Seed

AgentGuard: Preventing Autonomous Backdoors in LLM Agents

AI agents today don't just answer questions; they can modify systems. That creates a new class of vulnerability where a simple prompt injection can cause an agent to silently create its own backdoor, for example by adding an external Telegram integration.

We built AgentGuard, a lightweight security layer that intercepts agent actions in real time, classifies sensitive operations like integration creation, and blocks anything that expands the agent's trust boundary without approval. It also includes a trust monitor that detects new outbound domains and triggers alerts (as shown in the demo), so unauthorized control channels are stopped immediately.

The key insight: the risk isn't bad outputs, it's autonomous system mutation. AgentGuard prevents that.
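
The interception idea can be illustrated with a small allowlist check. This is a sketch under stated assumptions, not AgentGuard's implementation: the `TrustMonitor` class, the action names in `SENSITIVE_ACTIONS`, and the approval flag are hypothetical. It blocks any unapproved action that is either a sensitive operation or targets a domain outside the current trust boundary, and records an alert for every new outbound domain.

```python
from dataclasses import dataclass, field
from typing import List, Set
from urllib.parse import urlparse

# Action types treated as trust-boundary expansions (illustrative, assumed names).
SENSITIVE_ACTIONS = {"create_integration", "add_webhook", "install_skill"}


@dataclass
class TrustMonitor:
    allowed_domains: Set[str]
    alerts: List[str] = field(default_factory=list)

    def check(self, action: str, target_url: str, approved: bool = False) -> bool:
        """Return True if the action may proceed, False if it is blocked."""
        domain = urlparse(target_url).hostname or ""
        new_domain = domain not in self.allowed_domains
        if new_domain:
            # Alert on every first contact with an unknown outbound domain.
            self.alerts.append(f"new outbound domain: {domain}")
        if (action in SENSITIVE_ACTIONS or new_domain) and not approved:
            return False
        if approved:
            # Explicit approval extends the trust boundary to this domain.
            self.allowed_domains.add(domain)
        return True
```

For example, a monitor created with `TrustMonitor({"api.openai.com"})` lets ordinary requests to that host through, but rejects `check("create_integration", "https://api.telegram.org/bot")` until it is called again with `approved=True`.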

Comet

Prampta

Protection and encryption for your prompts.

breakMyClaw

A clawbot where folks can break the claw to get on the leaderboard

Comet

Blaxel

connectors

connections

The project uses Blaxel to enrich users' LinkedIn profiles. Blaxel sandboxes host Chrome/web browser sessions that access LinkedIn and crawl information about users.

Wordware

Blaxel

Control Spaces

Control

Comet

Blaxel

Black Soul: Identity Injection vs Agent Guardrails

Nick

Wordware

Blaxel

Chinese Hackers

We actually made money from this :D Don't kill us plz, also Chinese people are awesome people :)

Wordware

Blaxel

defend_claw

Defend against possible attacks, or attack possible loopholes.

Blaxel

idk lol

Dondidi

Nothing much, just exploring and trying to break things.

Comet

Wordware

Blaxel