What Actually Happens on the Wire When an LLM Acts on Its Own

An autonomous agent is, from the outside, a process that decides what to do and then makes network calls to do it. The reasoning is invisible and the prompts are private, but the moment the agent reaches for a tool, an API, a database, or another model, it has to put bytes on the wire. That is the one surface where every action becomes observable, attributable, and enforceable, no matter which framework, SDK, or model produced it.

Most teams instrument the model call and stop there. They log the chat completion, maybe the tool-call arguments, and assume that covers the agent's behavior. It does not. The interesting and dangerous things happen between the tool decision and the side effect, on the network, which holds the only honest record of what the agent did as opposed to what it said.

The gap between the transcript and the traffic

A model transcript tells you the agent intended to call send_email. The wire tells you it opened a TLS connection to smtp.unknown-domain.tld, authenticated with a key that should never have left your vault, and shipped 4 KB of base64 that decodes to a customer list. Those are very different facts, and only one of them is ground truth.

The gap exists because agents do not only act through their declared tools. A Python dependency makes its own HTTP request. A browser tool follows a redirect you never approved. A retry library re-issues a call after a human said no. A sub-agent spins up and inherits the same credentials. Every one of those bypasses the tidy tool-call log, and every one of them is a call on the wire.
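One way to make that gap concrete is to reconcile the tool-call log against the connections actually observed. The sketch below is illustrative only: ToolCall, Connection, and undeclared_connections are hypothetical names, and it assumes each declared tool can be mapped to the host it is expected to contact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str
    host: str  # assumption: the destination this tool is expected to contact


@dataclass(frozen=True)
class Connection:
    host: str
    port: int
    bytes_out: int


def undeclared_connections(tool_calls, connections):
    """Return observed connections that no logged tool call accounts for."""
    declared_hosts = {call.host for call in tool_calls}
    return [conn for conn in connections if conn.host not in declared_hosts]


# What the transcript says happened (hypothetical data).
transcript = [ToolCall("send_email", "smtp.example.com")]

# What the wire says happened (hypothetical data).
traffic = [
    Connection("smtp.example.com", 587, 2_048),
    Connection("smtp.unknown-domain.tld", 587, 4_096),
]

for conn in undeclared_connections(transcript, traffic):
    print(f"undeclared: {conn.host}:{conn.port} ({conn.bytes_out} bytes out)")
```

Matching on host alone is deliberately crude. A real reconciliation would probably also compare timing, credentials, and payload size, but even this version surfaces the connection the transcript never mentioned.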

What an agent's outbound traffic actually looks like

Capture the egress from a moderately busy agent for an hour and you will see four broad classes of traffic:

Model calls. Requests to inference endpoints. High volume, usually to a known set of hosts, and the place most monitoring stops.
Tool and API calls. The agent's declared capabilities: ticketing, payments, search, internal services. This is where intent becomes a side effect.
Implicit calls. Traffic no one designed: telemetry from an SDK, a package phoning home, a transitive dependency resolving a URL embedded in untrusted content.
Exfiltration-shaped calls. Outbound requests whose shape — a large body, an unusual destination, an encoded payload — is the tell, regardless of which tool nominally initiated them.

The first class is well understood. The last three are where incidents come from, and they are precisely the classes that a model-side guardrail cannot see, because by the time the bytes are on the wire the model is no longer in the loop.
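As a rough illustration of the four classes, the sketch below sorts captured egress records into them. Everything in it is an assumption made for the example: the host sets, the size threshold, the EgressRecord fields, and the heuristic that an unfamiliar destination plus a large or encoded body counts as exfiltration-shaped.

```python
from dataclasses import dataclass

# Hypothetical inventories; in practice these would come from configuration.
MODEL_HOSTS = {"inference.example.com"}
TOOL_HOSTS = {"tickets.example.com", "payments.example.com"}
LARGE_BODY_BYTES = 1_000_000  # assumed threshold, not a recommendation


@dataclass(frozen=True)
class EgressRecord:
    host: str
    body_bytes: int
    body_looks_encoded: bool = False  # e.g. flagged by an upstream base64 check


def classify(record):
    """Assign one captured outbound request to a traffic class."""
    known = record.host in MODEL_HOSTS or record.host in TOOL_HOSTS
    suspicious_body = (
        record.body_bytes >= LARGE_BODY_BYTES or record.body_looks_encoded
    )
    # Shape is checked first: it matters regardless of which tool sent it.
    if not known and suspicious_body:
        return "exfiltration-shaped"
    if record.host in MODEL_HOSTS:
        return "model"
    if record.host in TOOL_HOSTS:
        return "tool"
    # Unknown destination, unremarkable body: traffic no one designed.
    return "implicit"


records = [
    EgressRecord("inference.example.com", 900),
    EgressRecord("tickets.example.com", 1_200),
    EgressRecord("telemetry.example.net", 300),
    EgressRecord("smtp.unknown-domain.tld", 4_096, body_looks_encoded=True),
]
for record in records:
    print(record.host, "->", classify(record))
```

This version never flags a suspicious body sent to a known host, so it should be read as a way to label traffic for review, not as a detector.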

Why the network boundary is the honest one

Security boundaries are valuable in proportion to how hard they are to bypass. A prompt-level filter sits inside the agent's trust zone: anything that compromises the agent — an indirect prompt injection, a poisoned tool response, a confused sub-agent — sits at the same level and can route around it.

The network boundary is different. It is out of band: the agent does not get to choose whether its packets traverse it, and it cannot see or disable the policy applied there. That is the same reason a default-deny firewall is more trustworthy than an application asking itself politely not to misbehave. For agents, the proxy on the egress path is the floor that every other control stands on.

Inference boundary vs. network boundary

These are complementary, not competing. The inference boundary is good at judging meaning — is this prompt an attack, is this output unsafe. The network boundary is good at enforcing consequence — this connection is allowed, this one is not, this one needs a human. You want both; you cannot substitute one for the other.

What to enforce on the wire

Once you treat egress as the control point, a small set of policies covers most of the risk:

Default-deny destinations. An allowlist of hosts the agent may reach, with everything else blocked and logged. Exfiltration to an attacker-controlled endpoint dies here, because that endpoint is not on the list.
Action gating. Some calls are irreversible — a payment, a delete, an outbound message. Those should pause for human approval rather than execute on the agent's say-so.
Payload awareness. Watch for secrets and bulk data leaving in request bodies, even toward allowed hosts, and redact or block before they go.
Tamper-evident logging. A signed record of every outbound call, written outside the agent's reach, so the audit trail survives a compromised agent.
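The first three policies can be sketched as a single decision function that an egress proxy might evaluate per request. This is a minimal sketch under stated assumptions, not any product's API: the host allowlist, the set of irreversible actions, the secret patterns (including the made-up vault token format), and the names OutboundRequest and decide are all hypothetical.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"  # pause and wait for human approval


# Hypothetical policy data; a real deployment would load this from config.
ALLOWED_HOSTS = {"api.payments.example.com", "tickets.example.com"}
IRREVERSIBLE = {("POST", "/v1/payments"), ("DELETE", "/v1/tickets")}
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"\bvault-[0-9a-f]{32}\b"),  # made-up internal token format
]


@dataclass(frozen=True)
class OutboundRequest:
    host: str
    method: str
    path: str
    body: str = ""


def decide(req):
    """Apply default-deny, payload awareness, then action gating."""
    # Default-deny destinations: anything off the allowlist is blocked.
    if req.host not in ALLOWED_HOSTS:
        return Decision.BLOCK
    # Payload awareness: secrets must not leave, even toward allowed hosts.
    if any(pattern.search(req.body) for pattern in SECRET_PATTERNS):
        return Decision.BLOCK
    # Action gating: irreversible calls wait for a human.
    if (req.method.upper(), req.path) in IRREVERSIBLE:
        return Decision.ESCALATE
    return Decision.ALLOW


print(decide(OutboundRequest("api.payments.example.com", "POST", "/v1/payments")))
```

The order of the checks is a design choice in this sketch: a request carrying a secret is blocked outright, so a human approver is never asked to sign off on it.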

None of these require the model's cooperation, which is the point. They hold whether the agent is well-behaved, jailbroken, or actively working against you.
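Tamper-evident logging is commonly built as a keyed hash chain, in which each record's signature covers the previous record's signature. The sketch below shows that general technique using Python's standard hmac module. The record layout and the function names are assumptions for illustration, and the key is assumed to live somewhere the agent cannot read.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous signature" for the first record


def _sign(key, prev_sig, entry):
    """HMAC over the previous signature plus the canonicalized entry."""
    payload = prev_sig.encode() + json.dumps(entry, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append(log, key, entry):
    """Append an entry whose signature chains to the record before it."""
    prev_sig = log[-1]["sig"] if log else GENESIS
    log.append({"entry": entry, "sig": _sign(key, prev_sig, entry)})


def verify(log, key):
    """Recompute the chain; any edited or reordered record breaks it."""
    prev_sig = GENESIS
    for record in log:
        expected = _sign(key, prev_sig, record["entry"])
        if not hmac.compare_digest(expected, record["sig"]):
            return False
        prev_sig = record["sig"]
    return True


key = b"demo-key-held-outside-the-agent"  # hypothetical; never hardcode a real key
log = []
append(log, key, {"host": "tickets.example.com", "decision": "allow"})
append(log, key, {"host": "smtp.unknown-domain.tld", "decision": "block"})
print(verify(log, key))

log[1]["entry"]["decision"] = "allow"  # simulate tampering after the fact
print(verify(log, key))
```

A chain like this detects edits and reordering but not truncation of the newest records. Catching that would need the latest signature to be anchored somewhere else, which this sketch does not attempt.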

Designing for the day the agent is wrong

The mental shift is to stop asking “is this agent trustworthy?” and start asking “what can this agent reach, and what happens when it tries to reach something it shouldn't?” That question only has a satisfying answer at the network layer, because that is where reach is actually defined. An agent can be told anything in its prompt; what it can do is bounded by where its packets are allowed to go.

This is the premise Agent G is built on. It sits as a zero-trust egress proxy in front of your agents, sees every outbound call regardless of which tool or library made it, and applies policy-as-code guardrails on the wire (block, allow, or escalate to a human) with sub-2 ms overhead and a full, out-of-band log of everything that happened.

If you are running agents in production and cannot answer “what did they actually do on the network last night?” with confidence, that is the gap to close first. See how Agent G works, or request access to the private beta.