Trusting Your Agent Is Overrated

Lower the stakes, not your standards.

Jun 24, 2026

I work with an AI agent that gets better every week. Let's call him Alex, for simplicity. Alex handles coordination, drafts emails, summarizes requests, scans calendars. About once a week he surprises me with how cleanly he handles something he’s never seen before.

But he’s not perfect. Sometimes he needs the same reminder twice. Sometimes he replies when he shouldn’t, or stays quiet when he should speak up. Three steps forward, one step back.

At the end of every week I run the same calculation. Should Alex get more responsibility?

Every “yes” expands what Alex can do for me. Every “yes” also expands what Alex can do to me.

This isn’t a new problem. Managers and parents run a version of it every day: how much responsibility is the right amount, and how much is too much, too fast? The variable underneath the decision is always the same: trust.

I spent years at Dropbox, where the first company value was “be worthy of trust.” We repeated it because the math was brutal: a single breach is a company-ending event. Trust takes years to earn and a moment to vaporize. And the way we’ve been taught to build it is starting to break.

Trust Underlies Every Transaction

Step back from AI for a second. Almost nothing happens without trust. Money is a trust instrument. So are contracts, reviews, credentials, handshakes, marriage, and employment. We built every one of them because trust is what lets two parties act when neither has complete information about the other.

Trust is the binding constraint on adoption. A product can be better, cheaper, and more elegant than its alternatives and still lose because people don’t trust it. When something is brand new it has almost no trust to draw on, and earning it has always been slow and expensive.

In the AI era, where products change weekly, wield enormous power, and act more autonomously and more opaquely than anything before them, earning trust is the central problem.

The Trust Equation

The cleanest framework I’ve found comes from Maister, Green, and Galford in The Trusted Advisor (2000):

Credibility: Do you know what you’re talking about?
Reliability: Do you actually follow through?
Intimacy: Do you promote emotional safety and security?
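Self-Orientation: Are you in this for yourself, or for me?

The book combines the four into a single ratio: Trustworthiness = (Credibility + Reliability + Intimacy) / Self-Orientation.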

This equation is elegant and has aged well… for humans.

Because that’s who it was built for: lawyers, consultants, doctors, bankers. People you choose carefully, work with for years, and judge slowly. In that world trust accumulates, expensively, over time.

But the equation is missing something. It measures how much trust a person supplies. It says nothing about how much the situation demands. Trusting a teenager to change your tires isn’t the same as trusting a surgeon to operate on your child. Both require trust, but the bar is wildly different.

So let’s stretch the equation. Think of trust as a ratio between the trust supplied (by a person, a product, or an agent) and the trust demanded by the action in front of you. You act when supply clears demand: T_S > T_D.

Supplied Trust (T_S) is built upon:

Credibility (C): Brand, credentials, reputation, track record. The accumulated signal that you know what you’re doing.
Evidence (E): Outputs you can experience directly. Demos, free trials, sandboxes.
Alignment (A): The sense that the other party’s incentives are compatible with yours.

Demanded Trust (T_D) is built upon:

Stakes (K): How bad is it if this goes wrong, and how hard is it to undo?
Opacity (O): Can I see inside the system while it runs, and reconstruct why it did what it did afterward?

So action, the actual decision to proceed or adopt, scales with the ratio:

Action ∝ T_S / T_D, where T_S rises with C, E, and A, and T_D rises with K and O.

The original equation maps almost perfectly onto the top of this one. Credibility stays. Reliability becomes Evidence. Intimacy is a flavor of Alignment, and Self-Orientation is just its inverse.

What the original missed is the bottom: trust isn’t only supplied by an actor, but demanded by a circumstance.
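To make the ratio concrete, here is a minimal sketch in Python. The 1-to-5 scores, the choice to add the supply terms and multiply the demand terms, and the names `Supplied`, `Demanded`, `trust_ratio`, and `should_act` are all my own assumptions for illustration; the only thing the argument itself commits to is that you act when T_S clears T_D.

```python
from dataclasses import dataclass


@dataclass
class Supplied:
    credibility: float  # C: brand, credentials, track record
    evidence: float     # E: outputs you have experienced directly
    alignment: float    # A: how compatible the incentives feel


@dataclass
class Demanded:
    stakes: float   # K: how bad and how irreversible a failure would be
    opacity: float  # O: how hard the system is to inspect and replay


def trust_ratio(s: Supplied, d: Demanded) -> float:
    """Return T_S / T_D under an assumed functional form."""
    if d.stakes <= 0 or d.opacity <= 0:
        raise ValueError("stakes and opacity must be positive")
    # Assumption: supply terms add, demand terms multiply.
    t_s = s.credibility + s.evidence + s.alignment
    t_d = d.stakes * d.opacity
    return t_s / t_d


def should_act(s: Supplied, d: Demanded) -> bool:
    """Act only when supplied trust clears demanded trust (T_S > T_D)."""
    return trust_ratio(s, d) > 1.0


# Same agent, two different tasks (hypothetical scores on a 1-to-5 scale).
agent = Supplied(credibility=3, evidence=4, alignment=3)
drafting = Demanded(stakes=2, opacity=3)
sending = Demanded(stakes=5, opacity=3)

print(should_act(agent, drafting))  # ratio 10 / 6
print(should_act(agent, sending))   # ratio 10 / 15
```

The point of the toy is that nothing about the agent changes between the two calls. Only the circumstance does, and that alone flips the decision.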

The Evolution of Trust in the AI Era

Put a pre-AI decision (hiring a consultant) next to an AI-era one (handing an agent your inbox), and every term in the ratio moves.

Credibility doesn’t suddenly stop mattering, but it matters less because the signal has gotten noisier. Fake Amazon reviews have been a problem for a decade. That predates mainstream AI, but AI pours fuel on the fire. When you can drop Tom Cruise into a movie he was never in and have it look real, “I saw it with my own eyes” stops being proof.

Evidence is cheap and instant, but never comprehensive. An agent can do the right thing ninety-seven times and, on the ninety-eighth, confidently send the wrong email to your most important customer.
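A quick back-of-the-envelope shows why a clean streak is weaker evidence than it feels. The sketch below assumes every run is independent and has the same chance of failing, which real agent behavior almost certainly violates; the function name `max_failure_rate` is mine. It asks: after n clean runs, how high could the true failure rate still be?

```python
def max_failure_rate(clean_runs: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-run failure rate after zero failures.

    Assumes independent, identically distributed runs. Solves
    (1 - p) ** clean_runs = 1 - confidence for p.
    """
    if clean_runs <= 0:
        raise ValueError("clean_runs must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_runs)


bound = max_failure_rate(97)
print(f"After 97 clean runs, the failure rate could still be {bound:.1%}")
```

Under those assumptions, ninety-seven clean runs still leave room for a failure rate of roughly 3 percent at 95 percent confidence. For an agent that sends dozens of emails a day, that is not a rounding error.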

Alignment is the headline question in the era of AI. The canonical worry is the paperclip maximizer: tell an agent to build the most efficient paperclip factory and, taking you literally, it might start a war to clear the way. On the surface it is doing exactly what you asked; underneath, you were never aligned on the things that matter. At SPC, we’re now explicit enough about this that our agents name a human as their “manager.”

Opacity thickened. Traditional software is largely deterministic: the same input produces the same output, and when it breaks you can step through it line by line. AI doesn’t work that way. When Alex tells me he’s learned from a mistake and won’t repeat it, I can’t set a breakpoint. I can only ask him how he knows, and then decide whether to believe his answer.

In the AI world the trust variables are fuzzier, the error bars wider, and the components of trust are themselves harder to... trust.

So, Do I Let Alex Run My Inbox?

Back to the question I started with.

The numerator is actually fine. Alex is capable, and I have plenty of evidence of it. The models behind him are credible. He seems well aligned, treating me like his manager.

It’s the denominator that stops me. Email is high-stakes: a misfired message can do real brand damage, especially since the person on the other end doesn’t know Alex is an agent. Alex works around the clock and I can’t watch around the clock. While he’s good at replaying what he did, he’s still murky on the why.

So if you’re building AI tooling and bootstrapping trust, the usual advice is to increase the trust you supply:

Credibility: Borrow trust from partners, customer logos, and real testimonials until you’ve earned your own.
Evidence: Build low-stakes, accurate sandboxes. Be radically transparent about your methodology and your prompts. Show, don’t just tell.
Alignment: Tell me exactly what you’ll do with my data, and how my success becomes yours. An agent saying “yes” is not the same as an agent that wants what I want.

But the real leverage right now is in the denominator, and far more people need to pull on it. Decrease the trust your users have to spend:

Stakes: Build undo into the workflow. Let me edit after sending. Give me a trial. Walk me up a ladder of graduated responsibility instead of asking for the keys on day one. Be crystal clear about read versus write access at every layer.
Opacity: Make the logs easy to pull. Cite sources. Allow instant replay. State your assumptions. Share the spec.
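Here is one way those two levers might look in code: a rough sketch, not anyone's actual product. The `Rung` ladder, the `Outbox` class, the 30-second undo window, and the log format are all hypothetical. The agent can only send once it has been promoted to the top rung, every send waits in a queue where it can be undone, and every decision leaves a log line with the agent's stated reason.

```python
import time
from dataclasses import dataclass, field
from enum import IntEnum


class Rung(IntEnum):
    READ = 1   # read-only: summarize, scan
    DRAFT = 2  # write drafts that a human sends
    SEND = 3   # send directly, behind an undo window


@dataclass
class Outbox:
    rung: Rung
    undo_seconds: float = 30.0
    pending: dict = field(default_factory=dict)  # msg_id -> (release_time, message)
    log: list = field(default_factory=list)      # (time, msg_id, event, reason)

    def request_send(self, msg_id, message, reason, now=None):
        """Queue a message if the agent's rung allows sending; log either way."""
        now = time.time() if now is None else now
        if self.rung < Rung.SEND:
            self.log.append((now, msg_id, "blocked", reason))
            return False
        self.pending[msg_id] = (now + self.undo_seconds, message)
        self.log.append((now, msg_id, "queued", reason))
        return True

    def undo(self, msg_id, now=None):
        """Cancel a queued message while its undo window is still open."""
        now = time.time() if now is None else now
        entry = self.pending.get(msg_id)
        if entry is None or now >= entry[0]:
            return False
        del self.pending[msg_id]
        self.log.append((now, msg_id, "undone", ""))
        return True

    def release_due(self, now=None):
        """Return messages whose undo window has closed and mark them sent."""
        now = time.time() if now is None else now
        due = [i for i, (release_at, _) in self.pending.items() if release_at <= now]
        sent = []
        for i in due:
            _, message = self.pending.pop(i)
            self.log.append((now, i, "sent", ""))
            sent.append(message)
        return sent
```

Promotion up the ladder is a one-line change (`outbox.rung = Rung.SEND`), which is the point: the user grants one rung at a time, and the log shows what the agent tried to do at every rung along the way.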

Every point you shave off the denominator is a point of trust your user no longer has to manufacture before they say yes.

Interested in SPC? Apply to join us here.
