Human-Led AI Is an Engineering Problem

By Colin Thrasher

Last week I argued that agentic AI is failing on execution, not capability — and that the advantage has moved to workflow design, governance architecture, and decision-rights clarity.

That last phrase deserves more.

Most of the conversation around human-led AI is still happening at the level of principle. Humans should stay in the loop. Agents should augment, not replace. Judgment should remain with people. All true. None of it is the engineering.

The engineering is decision rights.

In any operationally complex business — and especially in regulated ones — every workflow is a sequence of decisions made by people with defined authority. Who can approve what. Who must escalate. Where the gates sit. When a discrepancy is a typo, when it's a risk flag, when it's grounds for a hold. Most organizations have lived with these decision rights for decades, even if they've never formally written them down.

When you put agents inside that workflow, you don't get to skip the question. You need to answer it explicitly.

What can the agent prepare? What can it commit? Where does the gate sit? Get any of those wrong and the system either grinds to a halt under its own caution, or quietly does things it had no authority to do. Both failure modes are operationally fatal in the workflows that matter.

OSFI named this directly in its FIFAI II report last month. The AGILE framework's Awareness pillar calls on financial institutions to "establish clear governance guidelines for agentic AI, defining where human approval is required and where autonomous agents can operate safely." That's decision-rights engineering, and it's now in a regulator's published framework.

What the framework names at the principle level, it leaves as an exercise for the reader at the engineering level. That gap is where most current deployments sit.

Take third-party risk management. Every regulated Canadian financial institution is required to run this workflow, whether under OSFI B-10 or its provincial equivalent, and mid-sized institutions may be under-resourced to run it well. An agent ingests a new vendor package: SOC 2 report, financials, insurance certificate, data processing addendum (DPA). It classifies the vendor by risk tier based on data access and operational criticality. It flags a qualified SOC 2 opinion and a missing breach notification clause in the DPA.

It does not approve the vendor. It does not waive the gap. Those decisions sit with the risk committee, scoped by tier. What the agent prepares versus what the committee decides — and what triggers escalation rather than auto-routing to the next stage — is the engineering. Skip it, and a vendor gets onboarded with a control gap that shows up in the next regulatory exam.
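To make the split concrete, here is a minimal Python sketch of the intake step described above. It is an illustration under stated assumptions, not a reference implementation: the tiering rule, the field names, and the routing targets (`risk_committee`, `vendor_manager`) are all hypothetical. What it is meant to show is that the agent's output type has no way to express an approval or a waiver. It can only classify, flag, and route.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass
class VendorPackage:
    name: str
    handles_customer_data: bool
    operationally_critical: bool
    soc2_opinion: str              # "unqualified" or "qualified"
    dpa_has_breach_clause: bool


@dataclass
class PreparedReview:
    vendor: str
    tier: Tier
    flags: list[str] = field(default_factory=list)
    route_to: str = ""
    # The agent never changes this field; only a human decision does.
    decision: str = "PENDING_HUMAN"


def classify(pkg: VendorPackage) -> Tier:
    # Hypothetical rule: both risk factors -> HIGH, one -> MODERATE, none -> LOW.
    score = int(pkg.handles_customer_data) + int(pkg.operationally_critical)
    return {0: Tier.LOW, 1: Tier.MODERATE, 2: Tier.HIGH}[score]


def prepare_review(pkg: VendorPackage) -> PreparedReview:
    review = PreparedReview(vendor=pkg.name, tier=classify(pkg))
    if pkg.soc2_opinion != "unqualified":
        review.flags.append("SOC 2 opinion is qualified")
    if not pkg.dpa_has_breach_clause:
        review.flags.append("DPA missing breach notification clause")
    # Escalate instead of auto-routing when a flag is open or the tier is HIGH.
    if review.flags or review.tier is Tier.HIGH:
        review.route_to = "risk_committee"
    else:
        review.route_to = "vendor_manager"
    return review
```

In this sketch, a package with a qualified SOC 2 opinion and no breach clause comes back with two flags, a route to the committee, and a decision still marked `PENDING_HUMAN`. The flags are surfaced, not resolved.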

FIFAI II also flags the consumer-impacting AI use cases: product recommendations, credit adjudication, underwriting, investment advice. Every one of those is the same shape as the TPRM example — a sequence of decisions where some can be agent-prepared and some must remain human-decided, and where the boundary between them is where the regulatory and operational risk actually sits.

In the workflows we're building inside right now, every agent boundary is a decision-right boundary. The protocol that says what the agent can extract but not interpret. The maker-checker pattern that separates preparation from approval. The escalation path that surfaces a flag without resolving it. None of that is the principle of human-led AI. All of it is the engineering required to make the principle real.
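One way to picture the maker-checker pattern is as three separate functions with three separate rights. The sketch below is an assumption-laden illustration rather than a description of any particular system: the `APPROVERS` registry, the actor names, and the `DecisionRightError` exception are invented for the example. Preparation is open to the agent, approval is restricted to named humans who did not prepare the item, and commit is blocked until an approval is on record.

```python
from dataclasses import dataclass
from typing import Optional


class DecisionRightError(Exception):
    """Raised when an actor attempts a step outside its authority."""


@dataclass
class Proposal:
    action: str
    prepared_by: str
    approved_by: Optional[str] = None
    committed: bool = False


# Hypothetical registry of actors who hold approval rights.
APPROVERS = {"risk_committee_chair"}


def prepare(action: str, maker: str) -> Proposal:
    # Any maker, including an agent, may prepare. Nothing is committed here.
    return Proposal(action=action, prepared_by=maker)


def approve(proposal: Proposal, checker: str) -> Proposal:
    if checker not in APPROVERS:
        raise DecisionRightError(f"{checker} holds no approval right")
    if checker == proposal.prepared_by:
        raise DecisionRightError("maker cannot check its own work")
    proposal.approved_by = checker
    return proposal


def commit(proposal: Proposal) -> Proposal:
    # The gate: no recorded approval, no commit.
    if proposal.approved_by is None:
        raise DecisionRightError("commit blocked: no approval recorded")
    proposal.committed = True
    return proposal
```

The design choice worth noticing is that the gate lives in `commit`, not in the agent's prompt or instructions. An agent that tries to commit its own proposal fails loudly instead of succeeding quietly.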

It's also why failures inside agentic workflows can be misdiagnosed as 'AI hallucination.' Bad model. Bad data. Edge case. In many of those cases the model didn't fail. The decision-rights architecture did. An agent given authority it shouldn't have been given will eventually use it. An agent denied the right to escalate will eventually paper over a gap. The remediation isn't a better model. It's a better boundary.

Identity for agents is the next layer of the same problem. AGILE's Innovation pillar names it: each agentic model needs a distinct digital identity. The framework lists what that requires: cryptographic proof of legitimacy and explicit delegated authority for action. Traditional IAM was built for humans and services. It does not handle "this agent, in this workflow, on behalf of this person, for this task, with this scope, expiring in this many minutes." The Cloud Security Alliance's 2026 survey on autonomous AI agents found that only 33% of organizations modify access policies in real time, which the report calls embedded, identity-bound enforcement. Most organizations have not solved this yet. Many haven't started.
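The quoted requirement maps fairly directly onto a data structure. The following Python sketch shows one possible shape for a scoped, expiring delegation. Everything in it is an assumption made for illustration: the `Grant` fields, the scope strings, and the hard-coded `SIGNING_KEY`. The HMAC signature is a simple stand-in for "cryptographic proof of legitimacy"; a real deployment would more likely use managed keys or asymmetric credentials.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone

SIGNING_KEY = b"demo-only-key"  # hypothetical; never hard-code a real key


@dataclass(frozen=True)
class Grant:
    agent_id: str        # this agent
    workflow: str        # in this workflow
    on_behalf_of: str    # on behalf of this person
    task: str            # for this task
    scope: frozenset     # with this scope
    expires_at: datetime # expiring at this time
    signature: str = ""

    def payload(self) -> bytes:
        parts = [self.agent_id, self.workflow, self.on_behalf_of, self.task,
                 ",".join(sorted(self.scope)), self.expires_at.isoformat()]
        return "|".join(parts).encode()


def sign(payload: bytes) -> str:
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def issue(agent_id, workflow, on_behalf_of, task, scope, minutes, now) -> Grant:
    unsigned = Grant(agent_id, workflow, on_behalf_of, task,
                     frozenset(scope), now + timedelta(minutes=minutes))
    return replace(unsigned, signature=sign(unsigned.payload()))


def is_allowed(grant: Grant, action: str, now: datetime) -> bool:
    # All three must hold: untampered grant, not expired, action in scope.
    untampered = hmac.compare_digest(grant.signature, sign(grant.payload()))
    return untampered and now < grant.expires_at and action in grant.scope


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
grant = issue("intake-agent-7", "tprm", "analyst@example.com",
              "classify vendor", {"read:vendor_package", "write:draft_review"},
              minutes=15, now=now)
print(is_allowed(grant, "approve:vendor", now))  # False: outside delegated scope
```

The point of the sketch is that authority is checked per action at the moment of use, and that it lapses on its own. An agent holding this grant can read the package and write a draft for fifteen minutes, and nothing else.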

The advantage goes to operators who treat decision rights as the design problem they are.

The next 18 to 24 months will sort the field on exactly this. Demos will get sharper. Models will get more capable. None of that will close the gap between principle and engineering. The operators who pull ahead will be the ones who can show — concretely, by workflow, by gate, by escalation path — how their operating models preserve human authority while removing operational drag.

That's the work behind the slogan.

If your organization is working through what human-led AI requires inside your workflows, I'm open to the conversation.

Colin Thrasher, Founder & CEO, Orchestrive.ai