Salesforce Agentforce · 2026

What 20,000 Enterprise Agent Deployments Taught Us

Lessons from Salesforce's large-scale analysis: why most agents fail post-launch, and what the successful ones do differently.

20,000 deployments analyzed
90% of work happens post-launch
6 core lessons
Section 1 of 6
The Four-Layer Enterprise Agent Architecture

Salesforce's framework for enterprise agents isn't a single model; it's a layered stack where each layer has a distinct job. Getting this separation right is the foundation everything else rests on.

Engagement: Slack, chat, messaging (where users talk to the agent)
Agent: AI reasoning and decision-making core
System of Work: CRM, ERP (where actual work gets done)
Context: Data and metadata that ground agent actions
A Trust Layer spans the entire stack, supporting multiple LLM providers with integrated guardrails. It is not a bolt-on; it is woven into every layer.


Architecture Stack
Engagement Layer
Slack · Web Chat · SMS
The surface the user actually sees. Handles conversation history, UI rendering, and routing requests to the agent layer. Does NOT perform reasoning.
Agent Layer
LLM reasoning · Tool calls · Sub-agents
Where the LLM lives. Decides what action to take, which tool to call, and when to escalate to a human. Has no direct database access; it calls tools.
System of Work
Salesforce · SAP · Internal APIs
Where the actual mutation happens: updating a CRM record, placing an order, filing a ticket. The agent requests; this layer executes.
Context Layer
Vector DB · Knowledge base · Metadata
Retrieval-Augmented Generation (RAG) source. Provides grounding so the agent answers from company data, not hallucinated general knowledge.
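
The separation of duties described above can be sketched in a few lines of Python. This is a minimal illustration, not Salesforce's implementation: every class name, method, and the sample knowledge-base entry is hypothetical. The point it shows is that the agent receives a tool (a callable) rather than a handle to the ticket store, so only the System of Work mutates data.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ContextLayer:
    """Grounding data the agent may read (stand-in for a knowledge base)."""
    articles: dict[str, str] = field(default_factory=dict)

    def lookup(self, topic: str) -> Optional[str]:
        return self.articles.get(topic)


@dataclass
class SystemOfWork:
    """The only layer allowed to mutate records."""
    tickets: list[dict] = field(default_factory=list)

    def file_ticket(self, summary: str) -> int:
        self.tickets.append({"id": len(self.tickets) + 1, "summary": summary})
        return self.tickets[-1]["id"]


class AgentLayer:
    """Decides what to do; holds a tool, never the ticket store itself."""

    def __init__(self, context: ContextLayer, file_ticket: Callable[[str], int]):
        self._context = context
        self._file_ticket = file_ticket  # a tool, not a database handle

    def handle(self, topic: str) -> str:
        answer = self._context.lookup(topic)
        if answer is not None:
            return answer
        ticket_id = self._file_ticket(f"No grounded answer for: {topic}")
        return f"Escalated to a human (ticket {ticket_id})."


class EngagementLayer:
    """Keeps conversation history and forwards requests; no reasoning here."""

    def __init__(self, agent: AgentLayer):
        self._agent = agent
        self.history: list[tuple[str, str]] = []

    def send(self, user_message: str) -> str:
        reply = self._agent.handle(user_message.strip().lower())
        self.history.append((user_message, reply))
        return reply


work = SystemOfWork()
# Sample grounding entry, invented for this sketch.
context = ContextLayer({"refund policy": "Refunds are issued within 14 days."})
chat = EngagementLayer(AgentLayer(context, work.file_ticket))
print(chat.send("Refund policy"))
print(chat.send("Warranty terms"))
```

The second message has no grounded answer, so the agent requests a ticket and the System of Work records it.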
Quick Check
Which layer is responsible for actually executing a database write in the Salesforce architecture?
Section 2 of 6
The 90/10 Rule: Launch Is the Beginning, Not the End

Traditional software development spends 90% of effort before launch. Enterprise agents invert this completely. John Kucera from Salesforce: "90% of the work is after you go live to manage and improve the agent."

Teams that treat launch as the finish line consistently fail to scale past pilot. Those that build for post-launch operations from day one are the ones that scale into production.

Real users immediately expose edge cases that never appear in demos: ambiguous phrasing, unexpected intents, requests the agent was never designed to handle. This is where trust is earned or lost.

Demo: Traditional vs. Agent Effort Distribution

Effort Split: Pre-Launch vs. Post-Launch
Traditional software: 90% before launch, 10% after
Agent software: 10% before launch, 90% managing and improving after
Quick Check
According to Salesforce's 20,000-deployment analysis, when do most agent failures actually occur?
Section 3 of 6
Pre-Launch: Guardrails, KPIs, and Starting Small

Three disciplines separate successful pre-launch preparation from the teams that scramble after going live.

Resist the temptation to build an agent that handles everything. A focused, narrow agent lets your team learn the feedback cycle before stakes are high. Overcommitting to broad scope early leads to agents that are mediocre everywhere.

Salesforce's pattern: start with one high-value, well-defined use case. Measure it. Expand only after mastering that loop.

Salesforce introduced Agentic Work Units (AWUs): discrete units that measure meaningful work completion, not just activity (messages sent, API calls made).

For a support agent, the KPI is containment rate: percentage of cases fully resolved without human follow-up. This is a business outcome, not a technical metric.

Bad KPI: Messages handled per hour
Good KPI: Cases resolved without escalation
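
As a rough sketch of how containment rate might be computed from case records, assuming each case carries a `resolved` flag and an `escalated_to_human` flag (both field names and the sample cases are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class Case:
    case_id: str
    resolved: bool
    escalated_to_human: bool


def containment_rate(cases: list[Case]) -> float:
    """Share of cases resolved with no human follow-up (0.0 if no cases)."""
    if not cases:
        return 0.0
    contained = sum(1 for c in cases if c.resolved and not c.escalated_to_human)
    return contained / len(cases)


# Invented sample cases for illustration only.
sample = [
    Case("A-1", resolved=True, escalated_to_human=False),
    Case("A-2", resolved=True, escalated_to_human=True),
    Case("A-3", resolved=False, escalated_to_human=True),
    Case("A-4", resolved=True, escalated_to_human=False),
]
print(f"Containment rate: {containment_rate(sample):.0%}")  # Containment rate: 50%
```

Note that a case resolved only after a human stepped in (A-2) does not count as contained, which is what makes this an outcome metric rather than an activity count.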

Input guardrails protect data entering the LLM:

  • Secure data retrieval through controlled access layers
  • Zero data retention agreements preventing model training use
  • Trust-boundary hosting keeping sensitive data internal

Output guardrails validate responses before delivery:

  • Tool and sub-agent validation: prevent hallucinated actions
  • Grounding checks: answers must derive from specified sources
  • Content filtering for harmful material
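
The three output guardrails listed above could be chained as a simple validation step before delivery. The sketch below is deliberately crude and makes several assumptions: the tool names are invented, the grounding check is a plain word-overlap heuristic (real systems typically use stronger methods), and the content filter is a placeholder term list.

```python
import re

ALLOWED_TOOLS = {"lookup_order", "create_ticket"}  # hypothetical tool names
BLOCKED_TERMS = {"password"}  # placeholder for a real content filter


def validate_tool_call(tool_name: str) -> bool:
    """Reject tools the agent invented or is not permitted to call."""
    return tool_name in ALLOWED_TOOLS


def is_grounded(answer: str, sources: list[str], min_overlap: float = 0.6) -> bool:
    """Crude grounding check: enough answer words must appear in the sources."""
    words = set(re.findall(r"[a-z0-9]+", answer.lower()))
    source_words = set(re.findall(r"[a-z0-9]+", " ".join(sources).lower()))
    if not words:
        return False
    return len(words & source_words) / len(words) >= min_overlap


def passes_content_filter(answer: str) -> bool:
    return not any(term in answer.lower() for term in BLOCKED_TERMS)


def check_output(answer, tool_name, sources) -> list[str]:
    """Return the list of failed checks; an empty list means deliver."""
    failures = []
    if tool_name is not None and not validate_tool_call(tool_name):
        failures.append("unknown_tool")
    if not is_grounded(answer, sources):
        failures.append("ungrounded")
    if not passes_content_filter(answer):
        failures.append("content_filter")
    return failures


sources = ["Orders ship within 3 business days of payment."]
print(check_output("Orders ship within 3 business days.", "lookup_order", sources))
print(check_output("We offer free upgrades for life.", "delete_account", sources))
```

The first response passes every check; the second names a tool that does not exist and makes a claim with no support in the sources, so both failures are reported.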

Quick Check
What is an "Agentic Work Unit" (AWU) as introduced by Salesforce?
Section 4 of 6
Post-Launch: Feedback Loops and Triage Categories

The teams that scaled successfully built tight feedback loops. They did not just collect failures; they categorized them so each fix went to the right owner immediately.

Salesforce identified four triage categories for agent failures. Each maps to a different owner and a different fix type:

Demo: Failure Funnel, Where Agents Break Down

Failure Categories by Frequency (typical distribution)
The speed of the feedback loop determined which teams scaled and which stayed in pilot. The fix type matters, but routing speed is what separates the 10x teams.
Quick Check
An agent correctly understands the user's intent, but its answer cites outdated policy information. Which triage category does this fall under?
Section 5 of 6
Three Anti-Patterns That Kill Enterprise Agents

These are the recurring mistakes Salesforce observed across thousands of deployments. Each one has a clear, better alternative.

The mistake: Routing every decision through the LLM, including simple deterministic operations like order status lookups.

The fix: Salesforce built Agent Script, a TypeScript framework that enables deterministic control flow alongside probabilistic LLM reasoning. Not every step needs AI; use code where the logic is known.

Rule of thumb: if you could write an if/else to handle it, you probably should.
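
To make the rule of thumb concrete, here is a small Python sketch of deterministic routing in front of a model call. It does not use or represent the Agent Script API; the order table, the regular expression, and `fake_llm` are all stand-ins invented for illustration.

```python
import re
from typing import Callable

ORDER_STATUS = {"1001": "shipped", "1002": "processing"}  # sample data

ORDER_QUERY = re.compile(r"\border\s+#?(\d+)\b", re.IGNORECASE)


def route(message: str, llm: Callable[[str], str]) -> str:
    """Answer order-status questions in code; send everything else to the LLM."""
    match = ORDER_QUERY.search(message)
    if match and "status" in message.lower():
        order_id = match.group(1)
        status = ORDER_STATUS.get(order_id)
        if status is None:
            return f"No order found with ID {order_id}."
        return f"Order {order_id} is {status}."
    return llm(message)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call."""
    return f"[LLM] {prompt}"


print(route("What is the status of order #1001?", fake_llm))
print(route("Can you recommend a laptop for travel?", fake_llm))
```

The status lookup never reaches the model, so it is fast, cheap, and always returns the same answer for the same order.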

The mistake: Using capitalization and emphasis ("NEVER do X", "ALWAYS check Y") to enforce business rules in system prompts. This does not reliably modify LLM behavior.

The fix: Encode business rules as conditional logic in code. Geographic restrictions, compliance rules, and escalation thresholds belong in the System of Work layer, not as text constraints in a prompt.
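
A minimal sketch of what encoding such rules in code might look like, assuming a request object with a country code and an order value. The country list, the threshold, and all names here are illustrative assumptions, not values taken from a real deployment.

```python
from dataclasses import dataclass

ALLOWED_COUNTRIES = {"US", "CA", "MX"}  # illustrative region list
ESCALATION_THRESHOLD_USD = 10_000  # illustrative threshold


@dataclass
class Request:
    user_country: str
    order_value_usd: float


def apply_policy(request: Request) -> str:
    """Return 'deny', 'escalate', or 'allow' before the LLM is ever involved."""
    if request.user_country not in ALLOWED_COUNTRIES:
        return "deny"
    if request.order_value_usd > ESCALATION_THRESHOLD_USD:
        return "escalate"
    return "allow"


print(apply_policy(Request("CA", 250.0)))     # allow
print(apply_policy(Request("US", 12_500.0)))  # escalate
print(apply_policy(Request("DE", 250.0)))     # deny
```

Because the check runs as ordinary code, it can be unit-tested and audited, which a capitalized instruction in a prompt cannot.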

The mistake: Passing full API responses and entire documents as context. An insurance company was feeding full policy PDFs into every query.

The fix: Right-size context to only what the LLM needs for this specific query. The insurance company cut its context down to the relevant sections only, and both latency and accuracy improved.
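
One simple way to picture right-sizing is to score each document section against the query and pass along only the best match. The sketch below uses naive word overlap as a stand-in for real retrieval (such as embedding search); the policy sections and the scoring method are assumptions for illustration, not the insurance company's approach.

```python
import re


def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_sections(query: str, sections: dict[str, str], top_k: int = 1) -> list[str]:
    """Return titles of the top_k sections sharing the most words with the query."""
    query_words = tokenize(query)
    scored = [
        (len(query_words & tokenize(title + " " + body)), title)
        for title, body in sections.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for score, title in scored[:top_k] if score > 0]


# Invented policy sections for illustration only.
policy = {
    "Water damage": "Covers burst pipes and water damage from appliance leaks.",
    "Theft": "Covers stolen property with a police report.",
    "Exclusions": "Does not cover flood or earthquake events.",
}
chosen = select_sections("Is water damage from a burst pipe covered?", policy)
context = "\n".join(policy[title] for title in chosen)
print(chosen)  # ['Water damage']
```

Only the selected section text goes into the prompt; the rest of the document stays out of the context window.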

Demo: Policy Encoding, Prompt vs. Code

Rules that should not be enforced through prompt text:

  • Restrict agent to users in North America only
  • Escalate if order value exceeds $10,000
  • Never reveal SSN in agent responses
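
Taking the SSN rule as an example, a code-level filter on the agent's response might look like the sketch below. It assumes SSNs appear in the dashed `NNN-NN-NNNN` form; other formats would need additional patterns, and the function name is hypothetical.

```python
import re

# Matches only the dashed NNN-NN-NNNN shape (an assumption of this sketch).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssn(response: str) -> str:
    """Mask anything shaped like a US SSN before the response is delivered."""
    return SSN_PATTERN.sub("[REDACTED]", response)


print(redact_ssn("The SSN on file is 123-45-6789."))
```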
Quick Check
Salesforce's Agent Script is designed to solve which specific anti-pattern?
Section 6 of 6
Multi-Agent Orchestration and What's Next

As individual agents mature, the next frontier is multi-agent systems, where a parent agent coordinates multiple specialized sub-agents, each owning a narrower problem.

Hierarchy: 3-level (orchestrator → specialist → sub-specialist)
Benefit: Simpler instructions per agent, smaller context windows
Pattern: Parent routes, children execute, parent synthesizes
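
The route, execute, synthesize pattern can be sketched with plain functions. This example shows only two levels (a parent and its specialists) to stay short, and it routes by keyword where a real parent would more likely use a classifier or an LLM; the agent names and routing table are hypothetical.

```python
from typing import Callable

Specialist = Callable[[str], str]


def billing_agent(task: str) -> str:
    return f"billing: handled '{task}'"


def shipping_agent(task: str) -> str:
    return f"shipping: handled '{task}'"


# Hypothetical keyword routing; a real parent might use an LLM classifier.
ROUTES: dict[str, Specialist] = {
    "invoice": billing_agent,
    "refund": billing_agent,
    "delivery": shipping_agent,
}


def orchestrate(request: str) -> str:
    """Parent routes each sub-task, children execute, parent synthesizes."""
    results = []
    for task in (part.strip() for part in request.split(" and ")):
        child = next((agent for key, agent in ROUTES.items() if key in task.lower()), None)
        results.append(child(task) if child else f"unrouted: '{task}'")
    return "; ".join(results)


print(orchestrate("Resend my invoice and check the delivery date"))
```

Each child sees only its own sub-task, which is where the simpler instructions and smaller context windows come from.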

Beyond chat interfaces, Salesforce sees agents being deployed for multi-session workflows (tasks that span days), background automation triggered by events, and multi-channel deployments across web, phone, email, and Slack.

The disciplines that don't change: start small, measure outcomes, build tight feedback loops, encode policies in code, keep context lean. Models and tooling evolve; these principles remain constant.

Demo: Context Window Efficiency, Fat vs. Lean

Insurance company case study: policy document retrieval
Before (full doc): ~50,000 tokens
After (relevant sections): ~3,500 tokens
Metrics compared: latency change, accuracy change, token cost
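
The size of the reduction in the case study follows directly from the two token counts. The short calculation below uses only those approximate figures; how the reduction translates into cost depends on the provider's pricing, which is not given here.

```python
before_tokens = 50_000  # approximate, from the case study
after_tokens = 3_500    # approximate, from the case study

reduction = 1 - after_tokens / before_tokens
print(f"Context shrinks by {reduction:.0%}")                          # 93%
print(f"Lean context is {before_tokens / after_tokens:.1f}x smaller")  # 14.3x
```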
Quick Check
In a three-level multi-agent hierarchy, what is the role of the parent (orchestrator) agent?

All 6 Lessons Unlocked

You've covered the full playbook: architecture, the 90/10 rule, pre-launch disciplines, feedback loops, anti-patterns, and multi-agent futures. Ship agents that earn trust post-launch.