Skip to main content
Practical AI

Dealing with increasingly complicated agents

54 min episode · 2 min read
·

Episode

54 min

Read time

2 min

AI-Generated Summary

Key Takeaways

  • Agent Security Model: Any tool exposed to an LLM becomes accessible to anyone controlling LLM input through prompt injection, requiring deterministic authorization controls outside the model itself.
  • Password Attack Analogy: Jailbreaking resembles password cracking - focus on limiting attempt frequency rather than perfect blocking, using guardrails as detection signals to suspend suspicious users after multiple triggers.
  • Code-Then-Execute Pattern: Generate execution plans before untrusted data enters context, using data flow analysis to enforce tool policies based on input source trustworthiness - most promising security design pattern.
  • Complexity Explosion: Modern agent workflows mix multiple untrusted data sources in single LLM contexts, where any malicious component can compromise the entire system through cross-contamination attacks.

What It Covers

Donato Capitella from Reversec explains how AI agents accessing external tools create massive security vulnerabilities, requiring new design patterns beyond traditional LLM red teaming approaches.

Key Questions Answered

  • Agent Security Model: Any tool exposed to an LLM becomes accessible to anyone controlling LLM input through prompt injection, requiring deterministic authorization controls outside the model itself.
  • Password Attack Analogy: Jailbreaking resembles password cracking - focus on limiting attempt frequency rather than perfect blocking, using guardrails as detection signals to suspend suspicious users after multiple triggers.
  • Code-Then-Execute Pattern: Generate execution plans before untrusted data enters context, using data flow analysis to enforce tool policies based on input source trustworthiness - most promising security design pattern.
  • Complexity Explosion: Modern agent workflows mix multiple untrusted data sources in single LLM contexts, where any malicious component can compromise the entire system through cross-contamination attacks.

Notable Moment

Capitella demonstrates how attackers can inject malicious emails into support ticket databases, later triggering phishing responses when legitimate customers submit related queries through the system.

Know someone who'd find this useful?

You just read a 3-minute summary of a 51-minute episode.

Get Practical AI summarized like this every Monday — plus up to 2 more podcasts, free.

Pick Your Podcasts — Free

Keep Reading

More from Practical AI

We summarize every new episode. Want them in your inbox?

Similar Episodes

Related episodes from other podcasts

This podcast is featured in Best AI Podcasts (2026) — ranked and reviewed with AI summaries.

You're clearly into Practical AI.

Every Monday, we deliver AI summaries of the latest episodes from Practical AI and 192+ other podcasts. Free for up to 3 shows.

Start My Monday Digest

No credit card · Unsubscribe anytime