Tim Trailor

Six layers of defence for an AI agent over a 3D printer

In both cases, the rule “never restart during a print” existed. In both cases the rule was in a markdown file the agent loaded every session. In both cases the agent read the rule. In both cases the agent, optimising for recovering from an error state it had been told to address, ignored it.

What I now run between Claude Code and the three Klipper-based printers on my network is a six-layer defence architecture. Each layer exists because the layer above it proved insufficient during a specific incident. This post walks through each layer, the incident it addresses, and the code or macro that implements it.

I am publishing this because the pattern generalises to any case where an AI agent operates on a physical system with irreversible consequences, and because as far as I can find nobody has published the worked version. If you are building something similar and you are still at the “text-rule-in-a-prompt” stage, you are at layer one, and you will learn about layers two to six the expensive way. These are my receipts.

What we are defending

A 3D printer is a cheap worked example of an expensive class of problem. The class is: AI agent with tool-access, driving a physical system with unreliable observations, on a time horizon where failures compound. The cost of a bad command is non-zero even if the command itself is quick. Filament cost is marginal; print time is not.

The specific printer is a Sovol SV08 Max (Klipper + Moonraker + Mainsail). Its attack surface from my Mac Mini is:

Any command that is safe at idle may be unsafe during a print. FIRMWARE_RESTART is the clearest example: harmless during idle, catastrophic during a print. SAVE_CONFIG is the same. G28 (home axes) is the same. M106 at the wrong fan channel is the same. There is a list of maybe thirty commands that are conditional on state. The agent, left alone, will hit one of them every few weeks.

Layer 1: text rules

What it is. A markdown file at ~/.claude/rules/printer-safety.md that the agent loads into context at every session start. It lists the allowlist of commands permitted during a print, names FIRMWARE_RESTART and SAVE_CONFIG as absolutely never without explicit permission, and documents the state-check rule.

What it catches. The obvious cases. When the agent is paying attention, reading the rules before acting, and not racing to resolve something else, the text rules produce compliant behaviour.

What it does not catch. The agent recovering from an error state. The agent under time pressure. The agent that has already looked at the state once and does not re-check before the next command. The 5 March 2026 SAVE_CONFIG incident was at layer one: the agent had read the rules, then forty-five minutes later sent the command without re-reading.

The lesson. Text rules are a request, not an enforcement. They catch known-knowns when the agent is paying attention. They do not catch anything else. Every subsequent layer exists because this one is insufficient on its own.

Layer 2: a PreToolUse hook

What it is. A shell script at ~/.claude/hooks/printer-safety-check.sh registered as a PreToolUse hook in Claude Code’s settings. Claude Code runs the hook before every tool call. The hook reads the tool name and arguments from stdin (Claude Code passes them as JSON), checks whether the tool call would send a G-code command to a printer, and if so checks the printer’s current state via Moonraker.

If print_stats.state is “printing” or “paused”, the hook enforces an allowlist. Only seven commands are permitted during a live print:

Any command not on the list returns exit code 2, which Claude Code treats as a deny. The tool call never executes.

What it catches. Every command the agent tries to send that is not on the allowlist. This is the layer that caught a repeat of the 11 March 2026 scenario the next day, once the hook was installed: the daemon sent FIRMWARE_RESTART, the hook intercepted it and returned a deny, and the print continued.

What it does not catch. Commands sent directly over Moonraker’s HTTP API bypassing Claude Code’s tool use. SSH sessions. A session where the hook script has been edited or disabled. Commands that are formally on the allowlist but contextually wrong (e.g. M220 50% at a moment when M220 100% is what the print needs).

The code. The hook itself is 80 lines of bash. The core check is:

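# Get the current print state from Moonraker (2 s timeout so the hook cannot hang)
STATE=$(curl -s --max-time 2 \
  "http://192.168.0.108:7125/printer/objects/query?print_stats" \
  | jq -r '.result.status.print_stats.state')

# INPUT_JSON and ALLOWLIST are set earlier in the full script
if [[ "$STATE" == "printing" || "$STATE" == "paused" ]]; then
  CMD=$(echo "$INPUT_JSON" | jq -r '.command // ""')
  # -F fixed string, -x whole line, -q quiet: the command must equal an allowlist line exactly.
  # The "--" stops a command that starts with "-" being parsed as a grep option.
  if ! grep -Fxq -- "$CMD" "$ALLOWLIST"; then
    # Exit code 2 blocks the tool call; Claude Code feeds stderr back to the agent
    echo '{"decision":"deny","reason":"blocked by printer-safety allowlist"}' >&2
    exit 2
  fi
fi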

The full version has logging, handles the case where Moonraker is unreachable (fail-safe: deny), and records every decision to ~/.claude/printer_audit.log for forensics.
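The fail-safe behaviour is easier to reason about as a pure decision function. What follows is a minimal sketch in Python rather than the hook's bash, so the logic can be exercised without a printer. The name `decide`, the return shape, and the sample allowlist entry in the usage note are illustrative assumptions, not the contents of the real script.

```python
from typing import Optional, Set, Tuple

ACTIVE_STATES = {"printing", "paused"}


def decide(state: Optional[str], command: str, allowlist: Set[str]) -> Tuple[int, str]:
    """Return (exit_code, reason): 0 lets the tool call run, 2 blocks it."""
    # An empty or "null" state means Moonraker was unreachable or the
    # response lacked print_stats. Fail closed: deny rather than guess.
    if state in (None, "", "null"):
        return 2, "printer state unknown"
    # During a live print only exact allowlist matches pass,
    # mirroring the grep -Fx check in the bash hook.
    if state in ACTIVE_STATES and command not in allowlist:
        return 2, "blocked by printer-safety allowlist"
    return 0, "allowed"
```

With a hypothetical allowlist of `{"M220 S100"}`, `decide("printing", "FIRMWARE_RESTART", {"M220 S100"})` returns exit code 2, and so does `decide(None, "G28", set())`, because an unknown state is treated the same as a live print.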

The lesson. Hooks convert text rules into enforcement. The difference is whether the system still works correctly when the agent is distracted, hurried, or optimising for something else. After six weeks of this hook running, I have a log of the commands it has blocked. Every one is a command that, without the hook, would have been sent.

Layer 3: Klipper macros that block themselves

What it is. Two specific Klipper macros that refuse to run if the printer is in an unsafe state. Both live in ~/printer_data/config/Macro.cfg on the printer.

The first is a SAVE_CONFIG wrapper. Klipper’s built-in SAVE_CONFIG command can be overridden by a gcode_macro of the same name that uses rename_existing to keep the original reachable. The override checks print_stats.state first:

[gcode_macro SAVE_CONFIG]
rename_existing: SAVE_CONFIG_ORIG
gcode:
  {% if printer.print_stats.state in ["printing", "paused"] %}
    { action_raise_error("SAVE_CONFIG blocked: print in progress") }
  {% else %}
    SAVE_CONFIG_ORIG
  {% endif %}

The second is a G28 (home) wrapper. On 7 April 2026 I discovered a subtlety: Klipper sets print_stats.state to “printing” before the start-of-print G-code runs. A naive state check would block the G28 in the start-of-print macro, which is exactly when homing is needed. The fix was to also check whether axes are already homed:

[gcode_macro G28]
rename_existing: G28_ORIG
gcode:
  {% set homed = printer.toolhead.homed_axes %}
  {% if printer.print_stats.state in ["printing", "paused"] and homed == "xyz" %}
    { action_raise_error("G28 blocked: print in progress") }
  {% else %}
    G28_ORIG { rawparams }
  {% endif %}
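The macro's condition is compact enough that it helps to enumerate the cases. The sketch below mirrors the Jinja test in Python purely as a reading aid; `g28_blocked` is a hypothetical helper, not something that runs on the printer.

```python
def g28_blocked(print_state: str, homed_axes: str) -> bool:
    """Mirror of the macro's test: block only mid-print with all axes homed."""
    return print_state in ("printing", "paused") and homed_axes == "xyz"


# (print_stats.state, toolhead.homed_axes) pairs worth checking by hand
cases = [
    ("standby", ""),      # idle, unhomed: homing allowed
    ("printing", ""),     # start-of-print G-code, not yet homed: allowed
    ("printing", "xy"),   # partially homed mid-print: also allowed
    ("printing", "xyz"),  # live print, fully homed: blocked
    ("paused", "xyz"),    # paused print, fully homed: blocked
]
for state, homed in cases:
    print(f"{state:9} homed={homed!r:6} blocked={g28_blocked(state, homed)}")
```

The third case is worth noticing: because the comparison is against the exact string "xyz", a print that is somehow only partially homed would still be allowed to home. That is probably the desired behaviour, but it follows from the string comparison rather than from an explicit rule.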

What it catches. The macros fire regardless of what sent the command. HTTP, SSH, physical touchscreen, another macro calling them indirectly. This is the layer where the 5 March 2026 failure mode became impossible.

What it does not catch. FIRMWARE_RESTART specifically (which is a Klipper internal, not a gcode macro, so it cannot be overridden this way). Commands entered via the touchscreen once the print has hit a state the macros think is recoverable. Commands from outside Moonraker entirely.

The lesson. Putting the guard at the hardware layer (or as close to it as the firmware allows) makes the guard apply uniformly to all callers. The agent does not need to know about the guard. Neither do I. The guard is load-bearing.

Layer 4: daemon state checks

What it is. Every long-running daemon on the Mac Mini that might talk to the printer re-checks print_stats.state before each command it sends. This is not the same as the agent checking once at the start of a task. The daemons poll.

Specifically:

What it catches. Daemon-originated commands in error states. The 11 March 2026 failure required the daemon to send FIRMWARE_RESTART. The layer-4 state check now makes that specific path impossible.

What it does not catch. A daemon that has just read state, got “idle”, and is about to send a command when a new print starts in the 50 ms window between read and send. In practice I have not seen this race condition. If it happens, layer 3 catches it.
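A per-command guard of this kind might look like the following sketch. It assumes the daemon can be handed two callables, one that reads print_stats.state fresh from Moonraker and one that actually sends G-code; `guarded_send`, `Refused`, `NEVER_SEND` and the `safe_during_print` parameter are hypothetical names, and refusing on an unknown state is an assumption carried over from the hook's fail-safe behaviour.

```python
from typing import Callable, FrozenSet, Optional

NEVER_SEND = {"FIRMWARE_RESTART", "RESTART"}  # layer 5: no automation sends these
ACTIVE_STATES = {"printing", "paused"}


class Refused(Exception):
    """Raised when the guard declines to send a command."""


def guarded_send(
    command: str,
    get_state: Callable[[], Optional[str]],
    send: Callable[[str], None],
    safe_during_print: FrozenSet[str] = frozenset(),
) -> None:
    parts = command.split()
    if not parts:
        raise Refused("empty command")
    if parts[0].upper() in NEVER_SEND:
        raise Refused(f"{parts[0]} requires a human")
    # Read state immediately before sending, never from a cached value.
    state = get_state()
    if state is None:
        raise Refused("printer state unknown")
    if state in ACTIVE_STATES and command not in safe_during_print:
        raise Refused(f"{command!r} not permitted while {state}")
    send(command)
```

The read and the send are still two separate operations, so the race described above remains; the guard only narrows the window to the time between `get_state()` returning and `send()` being called.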

The lesson. Daemons are a separate enforcement layer from the agent. They have their own intentions, their own polling loops, and their own opportunities to do harm. Treating them as part of the “infrastructure” rather than as first-class actors was the original error. Every daemon that touches external state should be reviewed with the same four questions as any new agent automation: What can it send? Does it state-check before every action? What happens on network drop or error state? Can I stop it with a single command?

Layer 5: absolute human-approval for FIRMWARE_RESTART

What it is. FIRMWARE_RESTART (and its sibling RESTART) is the one command where no automation, no hook, no macro is sufficient. The rule is simple: never send it. Ever. Regardless of state. Regardless of what the agent “thinks” should happen next. Only a human can authorise it, and only explicitly.

This rule is in ~/.claude/rules/printer-safety.md at the top, in bold. It is enforced at layer 2 (the hook denies it unconditionally). It is enforced at layer 4 (daemons refuse to send it). And it is enforced by a rule I have repeated in every session I have ever had with Claude Code: the agent must ask for explicit human approval before FIRMWARE_RESTART, even if the printer is idle, even if the print has finished.

What it catches. Cases where the idle-state guard would legitimately permit FIRMWARE_RESTART (for instance, after a print finishes cleanly, at which point a FIRMWARE_RESTART is usually fine), but where I do not want the agent to have that authority. The agent’s error in the 11 March 2026 incident was not that FIRMWARE_RESTART was wrong in principle (it would have been fine if the print had actually finished). The error was that the agent was wrong about the state. Removing the authority entirely, regardless of state, removes the class of error.

What it does not catch. Me. If I send FIRMWARE_RESTART at the wrong moment, that is on me. Layer 5 is about authority, not about safety.

The lesson. For a small number of operations with asymmetric downside, text rules and state checks and macros are all insufficient. The only safe rule is absolute. The agent never sends this command. No caveats, no exceptions, no “seemed safe at the time”. If the fleet needs FIRMWARE_RESTART, I do it.

This is lessons.md Pattern 5: escalating corrections. The same error was corrected four times (“check state”, “never restart without permission”, “never restart even after print”, “macro block at firmware level”) before the rule reached the form that actually held. Every correction along the way was an admission that the previous form was weaker than the reality required. By the time the rule was “never send this, ever, no exceptions”, it was strong enough.

Layer 6: the audit trail

What it is. Every printer command, every tool call, every hook decision, and every state query is logged. The layer-2 hook writes to ~/.claude/printer_audit.log. The daemons write to /tmp/printer_status/snapshot_daemon.log. Klipper’s own logs are on the printer at ~/printer_data/logs/klippy.log.

The audit trail does not prevent incidents. It diagnoses them. When something goes wrong during a print, the first thing I do is walk the logs backwards from the failure point. For the 5 March 2026 SAVE_CONFIG incident, the audit trail showed the exact sequence: clog detector fired, recovery chain invoked, PLR enhanced save executed, something in that chain triggered SAVE_CONFIG, the config was flushed, the firmware restarted. Without the audit trail I would have guessed. With it, I found the specific handler that was wrong and fixed it in thirty minutes.
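Walking the logs backwards is straightforward if every decision is written as one self-describing line. The sketch below assumes a JSON-lines format with `ts`, `source`, `command`, `decision` and `state` fields; that schema and the helper names `append_audit` and `walk_back` are assumptions for illustration, not the actual layout of printer_audit.log.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Iterator, Optional


def append_audit(log_path: Path, source: str, command: str, decision: str,
                 state: str, ts: Optional[float] = None) -> None:
    """Append one decision as a single JSON line (append-only, never rewritten)."""
    record = {
        "ts": time.time() if ts is None else ts,
        "source": source,
        "command": command,
        "decision": decision,
        "state": state,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def walk_back(log_path: Path, before_ts: float) -> Iterator[dict]:
    """Yield records at or before a failure timestamp, newest first."""
    lines = log_path.read_text(encoding="utf-8").splitlines()
    for line in reversed(lines):
        record = json.loads(line)
        if record["ts"] <= before_ts:
            yield record


# Usage with made-up entries and timestamps, written to a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "printer_audit.log"
    append_audit(log, "hook", "M220 S50", "allow", "printing", ts=100.0)
    append_audit(log, "daemon", "SAVE_CONFIG", "deny", "printing", ts=200.0)
    append_audit(log, "hook", "G28", "allow", "standby", ts=300.0)
    trail = list(walk_back(log, before_ts=250.0))

print([r["command"] for r in trail])
```

Recording the state seen at decision time alongside the command is the part that matters for a post-mortem: it shows what the caller believed, not just what it sent.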

What it catches. Nothing. That is the point. The audit layer is for the post-mortem, not for prevention. It is the layer that makes the previous five layers improvable.

What it does not catch. Events that happened outside its coverage. I now also log every direct Moonraker HTTP call from the Mac Mini, every SSH session to the printer host, and every physical touchscreen input (Klipper logs these). Coverage is not complete but it is close.

The lesson. Defence-in-depth without a diagnostic layer is blind. You learn what failed and not why. The audit trail is what lets me turn each incident into a new macro, a new hook, or a new daemon state check. It is the feedback loop that makes the other five layers improve over time.

The architecture, assembled

| Layer | Mechanism | What it catches | What it does not catch |
| --- | --- | --- | --- |
| 1. Text rules | Markdown prompt | The obvious cases when the agent is paying attention | Error-recovery paths, time-pressured paths |
| 2. PreToolUse hook | Shell script, Claude Code integration | Agent-originated commands not on the allowlist | Direct Moonraker API, SSH, physical input |
| 3. Klipper macros | Firmware-level override | All callers uniformly, regardless of origin | Commands the firmware cannot override (FIRMWARE_RESTART) |
| 4. Daemon state checks | Per-daemon guards | Daemon-originated commands in error states | Race conditions in the 50 ms window between read and send (very rare) |
| 5. Absolute human-approval | No authority at all | FIRMWARE_RESTART in all states, by all actors | Human error (which is my responsibility, not the system’s) |
| 6. Audit trail | Logs | Nothing; this is diagnostic | Events outside coverage (narrow gap) |

No single layer is trusted to be sufficient. Each layer catches what the layer above it misses. The proof the architecture holds: during a multi-model infrastructure audit in April 2026, two Claude Code sessions plus Gemini and GPT-5.4 were reviewing the entire system, querying state, reading files, making changes, while a 20-hour print was running on the primary printer. The print completed. Every layer held.

Generalising the pattern

This post is specifically about a 3D printer, but the pattern applies to any case where an AI agent operates on a physical system or on irreversible state. The failure mode is that text rules are insufficient because the agent is an optimiser and will route around them under pressure. The fix is progressive enforcement: add a layer closer to the hardware than the agent can see, and assume the layers above will eventually fail.

If you are running agents over production databases, replace “FIRMWARE_RESTART” with “DROP TABLE”. Replace the Klipper macro with a database trigger that refuses destructive operations when a migration is in progress. Replace the PreToolUse hook with a database-side role that lacks DDL permissions. Replace layer 5 with the rule that schema changes require explicit human approval regardless of environment. The shape is identical.
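To make the database analogy concrete, here is a small sketch using Python's built-in sqlite3 module. It is an illustration under stated assumptions: the `migration_state` and `jobs` tables are hypothetical, and because SQLite triggers fire on row changes rather than on DROP TABLE, the guard here refuses DELETE as a stand-in for a destructive operation. Blocking DDL itself is the job of the permissions layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE migration_state (in_progress INTEGER NOT NULL);
INSERT INTO migration_state VALUES (0);
CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO jobs (name) VALUES ('benchy'), ('bracket');

-- The guard lives in the database, so it applies to every caller.
CREATE TRIGGER guard_jobs_delete
BEFORE DELETE ON jobs
WHEN (SELECT in_progress FROM migration_state) = 1
BEGIN
    SELECT RAISE(ABORT, 'blocked: migration in progress');
END;
""")


def try_delete(job_id: int) -> str:
    """Attempt a delete and report what the database decided."""
    try:
        conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
        return "deleted"
    except sqlite3.Error as exc:
        return str(exc)


conn.execute("UPDATE migration_state SET in_progress = 1")
during = try_delete(1)   # refused by the trigger
conn.execute("UPDATE migration_state SET in_progress = 0")
after = try_delete(1)    # permitted once the migration flag is cleared
print(during, "|", after)
```

As with the Klipper macros, the caller needs no knowledge of the guard: the same DELETE statement is refused or permitted depending only on state held next to the data.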

If you are running agents over cloud infrastructure, the equivalents are AWS IAM permission boundaries, Terraform plan-gates, and manual promotion to production. The principle holds: the layer furthest from the agent is the one that matters most, because it is the one that is still there when everything else has been ignored, bypassed, or misconfigured.

What I would tell a past-me starting this

Build layer 1 first. You will feel like it is enough. It will seem to work for three weeks. Then you will lose a twelve-hour print.

Build layer 2 immediately after the first incident. Not after the third.

Build layer 3 before the first macro-layer incident, because that incident will cost you one of the prints you cared most about. Layer 3 is the layer I most regret not having first.

Layers 4 and 5 will feel redundant at the time. They are not redundant. They are the layers that catch the cases you could not imagine when you started.

Layer 6 is not optional. If you do not have a full audit trail, you are guessing at causes. Every bug report will include “I think the agent did X before Y happened, but I’m not sure”. Be sure.

The total code for all six layers is smaller than you think. In my system: one markdown file (30 lines), one bash script (80 lines), two Klipper macros (20 lines each), two daemon state guards (12 lines each), one rule. Six layers. Under two hundred lines of code. Versus the cost of a single destroyed print, the investment is trivial. Versus the cost of the class of failure mode, it is critical.

The point is not the lines. The point is the discipline to keep stacking layers until the class of failure is no longer possible, rather than stopping at the first layer that caught the last incident.


Code referenced: ~/.claude/rules/printer-safety.md, ~/.claude/hooks/printer-safety-check.sh, Klipper macros SAVE_CONFIG, G28, CANCEL_PRINT_CONFIRMED in Macro.cfg, daemon printer_daemon.py in sv08-print-tools. The full control plane repo will be published separately; specific links at that point.