The interesting thing about agentic systems isn't the loop. It's the tools. An LLM that can call APIs, query databases, and write to records is doing actual work — and creating actual liability. In a regulated environment, the architecture has to make some kinds of work easy and other kinds impossible.
A simple taxonomy
Categorize every tool a regulated agent can call into three buckets:
Read-only, low-stakes. Knowledge retrieval, document lookup, summarization. Allow freely, but validate the outputs for accuracy.
Read-only, high-stakes. PII access, customer data lookup, internal records. Allow under audit. Log every call. Restrict by role.
Write or external. Sending email, updating records, triggering downstream systems. Either disallow entirely, or require human-in-the-loop approval for every call. There is no defensible third option in most regulated workflows.
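One way to make the three buckets concrete is to encode them as data the runtime can check, rather than as guidance in a document. The sketch below is illustrative only: the tool names (`search_knowledge_base`, `lookup_customer_record`, `send_email`), the `TierPolicy` fields, and the `policy_for` helper are all hypothetical. It also assumes that write tools inherit the audit and role restrictions of the high-stakes read tier.

```python
from dataclasses import dataclass
from enum import Enum


class ToolTier(Enum):
    READ_LOW_STAKES = "read_only_low_stakes"
    READ_HIGH_STAKES = "read_only_high_stakes"
    WRITE_OR_EXTERNAL = "write_or_external"


@dataclass(frozen=True)
class TierPolicy:
    audit_every_call: bool
    restrict_by_role: bool
    human_approval: bool


# One policy per bucket. The flags on the write tier beyond human approval
# are an assumption, not something the taxonomy itself states.
POLICY_BY_TIER = {
    ToolTier.READ_LOW_STAKES: TierPolicy(
        audit_every_call=False, restrict_by_role=False, human_approval=False
    ),
    ToolTier.READ_HIGH_STAKES: TierPolicy(
        audit_every_call=True, restrict_by_role=True, human_approval=False
    ),
    ToolTier.WRITE_OR_EXTERNAL: TierPolicy(
        audit_every_call=True, restrict_by_role=True, human_approval=True
    ),
}

# Hypothetical tool names, each assigned to exactly one bucket.
TOOL_TIERS = {
    "search_knowledge_base": ToolTier.READ_LOW_STAKES,
    "lookup_customer_record": ToolTier.READ_HIGH_STAKES,
    "send_email": ToolTier.WRITE_OR_EXTERNAL,
}


def policy_for(tool_name: str) -> TierPolicy:
    """Return the policy for a tool; unclassified tools are rejected."""
    if tool_name not in TOOL_TIERS:
        raise KeyError(f"unclassified tool: {tool_name}")
    return POLICY_BY_TIER[TOOL_TIERS[tool_name]]
```

The design choice worth noting is the default: a tool that nobody has classified raises an error instead of silently landing in the permissive bucket.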
What the architecture has to enforce
Not the prompt. Architecture. A tool the model "shouldn't" call must be a tool it cannot call — restricted at the tool registry, not asked nicely in a system prompt. Prompt-level guardrails are evaluation criteria, not security boundaries.
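As a minimal sketch of what "restricted at the tool registry" could look like, the code below keeps tools in a registry keyed by name and role. A tool that was never registered for a role is neither advertised to the model nor executable for it. The class names, role names, and example tools are hypothetical, and a production system would sit behind real authentication and authorization rather than a role string.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


class ToolNotPermitted(Exception):
    """Raised when a run asks for a tool its role was never granted."""


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[Callable[..., object], FrozenSet[str]]] = {}

    def register(
        self, name: str, fn: Callable[..., object], allowed_roles: List[str]
    ) -> None:
        self._tools[name] = (fn, frozenset(allowed_roles))

    def visible_tools(self, role: str) -> List[str]:
        """Names to advertise to the model for this role; nothing else is offered."""
        return sorted(
            name for name, (_, roles) in self._tools.items() if role in roles
        )

    def call(self, name: str, role: str, **kwargs: object) -> object:
        """Execute a tool only if it is registered and granted to this role."""
        entry = self._tools.get(name)
        if entry is None or role not in entry[1]:
            raise ToolNotPermitted(f"{name!r} is not available to role {role!r}")
        fn, _ = entry
        return fn(**kwargs)


# Hypothetical tools and roles. Note that no write tool is registered at all.
registry = ToolRegistry()
registry.register(
    "search_knowledge_base",
    lambda query: f"results for {query}",
    ["analyst", "support"],
)
registry.register(
    "lookup_customer_record",
    lambda customer_id: {"id": customer_id},
    ["analyst"],
)
```

Whatever the model emits, a `support` run that asks for `lookup_customer_record` or for an unregistered `send_email` gets `ToolNotPermitted` from `call`. The system prompt plays no part in that outcome.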
The audit log must include: which tool was invoked, with what arguments, by which agent run, on behalf of which user, with what outcome. If you can't produce that record, you can't pass review.
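The five fields above map naturally onto a single record written for every call, whether it succeeds or fails. The following is a sketch under stated assumptions: `AuditRecord`, `audited_call`, and the list used as a sink are hypothetical names, the `timestamp` field is an addition not named in the list above, and a real deployment would write to append-only storage, not an in-memory list.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class AuditRecord:
    tool: str                  # which tool was invoked
    arguments: Dict[str, Any]  # with what arguments
    run_id: str                # by which agent run
    user_id: str               # on behalf of which user
    outcome: str               # with what outcome
    timestamp: str             # assumption: added here, not in the list above


def audited_call(
    sink: List[str],
    tool: str,
    fn: Callable[..., Any],
    arguments: Dict[str, Any],
    run_id: str,
    user_id: str,
) -> Any:
    """Run a tool and append exactly one JSON audit line, even on failure."""
    outcome = "incomplete"
    try:
        result = fn(**arguments)
        outcome = "ok"
        return result
    except Exception as exc:
        outcome = f"error: {type(exc).__name__}"
        raise
    finally:
        record = AuditRecord(
            tool=tool,
            arguments=arguments,
            run_id=run_id,
            user_id=user_id,
            outcome=outcome,
            timestamp=datetime.now(timezone.utc).isoformat(),
        )
        # default=str keeps logging from failing on non-JSON argument values.
        sink.append(json.dumps(asdict(record), default=str))
```

Writing the record in a `finally` block means a failed call still leaves a trace, which is usually the call a reviewer asks about first.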
The thing nobody wants to hear
Agentic systems shine on workflows where the model can take small, recoverable actions. They struggle exactly where regulation is strictest, because that is where the consequences of error are highest. The reasonable answer is to scope agentic systems narrowly: let them do the read-heavy work, and route the consequential decisions to human review with an LLM-prepared brief.
Working with us: Rizmi Labs designs agentic AI architectures for banks and insurers that hold up under examination. Get in touch to discuss.