When to Use AI Agents
Half the AI agent pitches I see are just expensive chatbots with delusions of grandeur. The other half are genuinely useful—but probably don’t need an agent at all.
Here’s the thing: agents aren’t just “better automation.” They’re fundamentally different beasts that excel at messy, unpredictable problems where the right response depends on context that’s constantly shifting. But if your workflow is mostly predictable? You’re probably better off with good old-fashioned automation.
So how do you tell the difference? I’ve been thinking through this problem and landed on a simple framework—not a scoring system or some pseudo-scientific methodology, but a way to think through whether your problem actually needs an agent.
What Actually Makes Something “Agentic”
Real agents do four things well: they observe their environment, reason through problems step-by-step, adapt when things change, and act with minimal babysitting. The key word is “minimal”—if you’re constantly course-correcting an “agent,” it’s probably just automation with extra steps.
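Those four capabilities are often described as an observe–reason–adapt–act loop. Here is a deliberately toy sketch of that shape — the environment, names, and rules are all invented for illustration; a real agent would wrap an LLM or planner in the same loop:

```python
# Illustrative sketch of the observe -> reason -> act loop.
# Everything here is a toy; no real framework is implied.

class TicketQueue:
    """Toy 'environment': a queue of support tickets."""
    def __init__(self, tickets):
        self.tickets = tickets

    def observe(self):
        # The agent sees only the current state of its environment.
        return self.tickets[0] if self.tickets else None

    def act(self, decision):
        self.tickets.pop(0)
        return decision

def decide(ticket):
    """'Reasoning' step: choose an action based on current context."""
    return "escalate" if ticket["urgent"] else "auto-reply"

def run_agent(env):
    actions = []
    while (ticket := env.observe()) is not None:  # observe
        actions.append(env.act(decide(ticket)))   # reason, then act
    return actions

queue = TicketQueue([{"urgent": False}, {"urgent": True}])
print(run_agent(queue))  # ['auto-reply', 'escalate']
```

The point of the sketch is the shape, not the logic: if `decide` could be a three-line rule like this one, you don't need an agent at all.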
The litmus test: could a competent human handle this task if they were dropped into the situation cold, with only the information your system has access to? If the answer is “probably not without extensive training,” you might not need an agent—you need better processes or clearer requirements.
Five Ways to Think About Agent-Worthy Problems
Instead of scoring pillars, I find it helpful to ask five questions about any potential agent deployment:
1. Does the context actually change?
The question: How often do the relevant facts of your situation shift in ways that matter for decision-making?
Why it matters: This is the biggest separator between agent and automation territory. If your process works the same way regardless of external conditions, automation will be faster, cheaper, and more reliable.
Examples that need agents:
- Emergency room triage (patient conditions evolve constantly)
- Supply chain management (disruptions, demand spikes, supplier issues)
- Investment portfolio rebalancing (market conditions, regulatory changes)
Examples that don’t:
- Payroll processing (rules are stable, data format is predictable)
- Invoice generation (straightforward template + data merge)
- Password reset flows (linear process with clear decision points)
Red flag: If you find yourself saying “it’s always the same except for…” you probably want automation with some conditional logic, not an agent.
2. Are there actually multiple steps that depend on each other?
The question: Does success require chaining together several decisions where the outcome of step 1 affects what you should do in step 2?
Why it matters: True multi-step reasoning is computationally expensive and error-prone. If your process is mostly parallel or independent steps, you’re adding complexity for no gain.
Examples that need agents:
- Loan underwriting (credit check results affect which documents to request, which affects risk assessment)
- Tech support diagnosis (symptoms determine tests, test results determine solutions)
- Content strategy (audience research shapes topic selection, which shapes the production timeline)
Examples that don’t:
- Data backup (same steps every time, no interdependencies)
- Email campaigns (segments defined upfront, minimal dynamic adjustment)
- Report generation (predictable data sources and formatting)
Red flag: If you can draw your process as a straight line with maybe a few branches, stick with automation.
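The straight-line-versus-dependent distinction is easiest to see in code. Here is a toy underwriting sketch where each step's output changes what the next step does — all rules and thresholds are made up for illustration:

```python
# Toy illustration of interdependent steps: the credit check result
# changes which documents we request, which changes the risk assessment.
# Every rule and threshold here is invented.

def credit_check(applicant):
    return "thin-file" if applicant["history_years"] < 2 else "full-file"

def documents_needed(credit_result):
    # Step 2 depends on step 1's outcome, not just the original input.
    if credit_result == "thin-file":
        return ["bank statements", "employer letter"]
    return ["pay stub"]

def assess_risk(credit_result, docs):
    # Step 3 depends on both earlier outcomes.
    if credit_result == "thin-file" and len(docs) >= 2:
        return "manual review"
    return "auto-approve path"

applicant = {"history_years": 1}
result = credit_check(applicant)
docs = documents_needed(result)
print(assess_risk(result, docs))  # manual review
```

When the chain is this short and this deterministic, conditional automation handles it fine; the agent case appears when the branching is too contextual to enumerate.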
3. Would a human need to make judgment calls?
The question: Are there places where someone with domain expertise would say “it depends” and need to consider factors that aren’t easily quantified?
Why it matters: Judgment calls are where agents can add real value, but they’re also where they can fail spectacularly. If your domain expertise can be encoded in rules, do that instead.
Examples that need agents:
- Content moderation (context matters enormously for edge cases)
- Customer service escalation (reading between the lines of complaints)
- Resource allocation during a crisis (competing priorities, incomplete information)
Examples that don’t:
- Tax calculation (complex but rule-based)
- Inventory reordering (data-driven with clear thresholds)
- Meeting scheduling (constraint satisfaction, not judgment)
Red flag: If your “judgment calls” are really just complex rules you haven’t written down yet, write them down.
4. Do you need it to notice things and act on them?
The question: Should the system be watching for opportunities or problems and taking initiative, or is it fine to wait for human direction?
Why it matters: Proactivity is expensive to build and can go wrong in spectacular ways. Make sure the value of catching things early outweighs the risk of false alarms and misguided actions.
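Whether proactivity pays off is roughly an expected-value question: the chance of catching something times the value of catching it early, against the false-alarm rate times the cost of each false alarm. A toy calculation, with all numbers invented:

```python
# Toy expected-value check for proactive monitoring.
# All probabilities and dollar figures are made up for illustration.

def proactivity_worth_it(p_catch, value_per_catch, p_false_alarm, cost_per_alarm):
    benefit = p_catch * value_per_catch     # expected value of early catches
    cost = p_false_alarm * cost_per_alarm   # expected cost of false alarms
    return benefit > cost

# Fraud-like domain: rare but expensive events favor proactivity.
print(proactivity_worth_it(0.02, 10_000, 0.10, 50))  # True

# Low-stakes reporting: false alarms cost more than early catches save.
print(proactivity_worth_it(0.02, 10, 0.10, 50))      # False
```

The hard part in practice is that none of these four numbers is known up front; the sketch just makes the trade-off explicit.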
Examples that need agents:
- Fraud detection (time-sensitive, patterns change frequently)
- Price optimization (competitive landscape shifts constantly)
- Infrastructure monitoring (need to catch and respond to issues before they cascade)
Examples that don’t:
- Monthly reporting (scheduled work is fine)
- Backup verification (reactive checking is sufficient)
- User onboarding (triggered by user actions, not system initiative)
Red flag: If the cost of missing something is low, or if false positives would be more annoying than helpful, skip the proactivity.
5. How much does domain expertise actually matter?
The question: Would a smart generalist struggle with this task, or could they figure it out with access to the right information?
Why it matters: Deep domain knowledge is hard to encode and maintain. If your problem doesn’t really require specialized expertise, you might be overcomplicating things.
Examples where expertise matters:
- Legal contract review (nuanced understanding of implications and precedent)
- Medical diagnosis support (years of training reflected in pattern recognition)
- Financial regulatory compliance (complex, evolving rules with high stakes)
Examples where it doesn’t:
- Data entry and validation (tedious but not complex)
- Appointment scheduling (logistics, not expertise)
- Basic customer inquiries (FAQ with search capabilities)
Red flag: If you can train someone to do the task in a few weeks, the domain expertise requirement might not justify agent complexity.
Putting It Together: A Reality Check
Here’s how I actually use this framework: I go through the five questions and count how many get a strong “yes.”
4-5 strong yeses: You’ve probably got a genuine agent use case. The complexity is justified, and simpler approaches will likely fall short.
2-3 yeses: This is the gray area. You might want an agent, or you might want really good automation with some smart components. Consider starting simple and evolving.
0-1 yeses: You probably want automation, not an agent. The agent complexity will cost you more than it delivers.
But here’s the crucial part: even if you score high, that doesn’t mean you should build an agent. You also need to consider whether you have the team, the timeline, the budget, and the risk tolerance for what is inherently a more complex and unpredictable system.
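The tally itself is trivial, and that's the point: the value is in answering the five questions honestly, not in the arithmetic. A sketch of the heuristic, with all names hypothetical:

```python
# Back-of-the-envelope tally for the five questions above.
# Purely illustrative; the judgment lives in the answers, not the count.

QUESTIONS = [
    "context actually changes",
    "steps depend on each other",
    "judgment calls required",
    "needs to notice and act proactively",
    "deep domain expertise matters",
]

def recommend(answers):
    """answers: dict mapping each question to True for a strong yes."""
    yeses = sum(bool(answers.get(q)) for q in QUESTIONS)
    if yeses >= 4:
        return yeses, "likely a genuine agent use case"
    if yeses >= 2:
        return yeses, "gray area: start simple, evolve toward an agent"
    return yeses, "automation, not an agent"

# A case with strong yeses on all five questions:
print(recommend({q: True for q in QUESTIONS}))
```

Treat the output as a conversation starter with your team, not a verdict.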
Some Real Examples
Customer service for a SaaS company: Context changes (customer history, product issues), multiple reasoning steps (diagnosis, solution selection, escalation), requires judgment about tone and urgency, needs proactive follow-up, benefits from product expertise. Five yeses—probably worth exploring an agent approach, but start with human-in-the-loop.
E-commerce inventory management: Inventory levels and prices change constantly, requires sequential decisions about ordering and pricing, some judgment about seasonal patterns, benefits from proactive reordering, needs retail expertise. Four yeses—good agent candidate, but be careful about the proactive piece.
Employee onboarding: Mostly standard process, minimal context changes, straightforward step sequence, limited judgment required, reactive rather than proactive, minimal specialized knowledge needed. One yes—stick with a well-designed workflow tool.
Financial portfolio rebalancing: Market conditions change constantly, complex multi-step optimization, requires significant judgment about risk and timing, should be proactive about opportunities, deep financial expertise essential. Five yeses—but also high stakes, so probably want human oversight even with an agent.
When This Framework Fails
This thinking process works for operational workflows where you can define success reasonably clearly. It’s not great for:
- Creative tasks where “good” is subjective
- One-off projects where you can’t amortize the development cost
- Highly regulated environments where explainability requirements are strict
- Problems where the failure cost is catastrophic
In those cases, you’re probably better off with human-AI collaboration tools, specialized software, or just keeping humans in the loop.
The Real Question
The framework gives you a way to think through whether you need an agent, but the real question is often simpler: What’s the smallest thing you could build that would actually help?
Sometimes that’s a full autonomous agent. More often, it’s automation with a few smart components, or a human-in-the-loop system, or just better tooling for the people already doing the work.
Don’t let the excitement around agents push you toward complexity you don’t need. The best AI deployment is often the boring one that just works.