AI Summary
→ WHAT IT COVERS

Donato Capitella from Reversec explains how AI agents that access external tools open up serious security vulnerabilities, requiring new design patterns that go beyond traditional LLM red-teaming approaches.

→ KEY INSIGHTS

- **Agent Security Model:** Any tool exposed to an LLM becomes accessible to anyone who can control the LLM's input via prompt injection, so authorization must be enforced by deterministic controls outside the model itself (sketched below).
- **Password Attack Analogy:** Jailbreaking resembles password cracking: rather than trying to block every attempt perfectly, limit attempt frequency and treat guardrail hits as detection signals, suspending users who trigger them repeatedly (sketched below).
- **Code-Then-Execute Pattern:** Generate the execution plan before untrusted data enters the context, then use data-flow analysis to enforce tool policies based on the trustworthiness of each input's source. Capitella considers this the most promising security design pattern (sketched below).
- **Complexity Explosion:** Modern agent workflows mix multiple untrusted data sources in a single LLM context, where any one malicious component can compromise the entire system through cross-contamination attacks.

→ NOTABLE MOMENT

Capitella demonstrates how attackers can inject malicious emails into a support ticket database, later triggering phishing responses when legitimate customers submit related queries through the system.

💼 SPONSORS

- Shopify (shopify.com/practicalai)
- Fabi (fabi.ai)
- Agency (agency.org)

🏷️ AI Security, Prompt Injection, Agent Architecture, Penetration Testing
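A minimal sketch of the deterministic-authorization idea from the first key insight, assuming a dispatcher where the LLM can only *request* a tool call and plain code decides whether it runs; the tool names, roles, and registry below are hypothetical, not from the episode:

```python
# Hypothetical sketch: authorization is keyed to the authenticated user,
# never to the model's output, so prompt injection cannot widen access.

def search_tickets(query: str) -> list[str]:
    return [f"ticket matching {query!r}"]  # stand-in implementation

def delete_ticket(ticket_id: str) -> str:
    return f"deleted {ticket_id}"  # stand-in implementation

TOOLS = {"search_tickets": search_tickets, "delete_ticket": delete_ticket}

# Per-role allowlist, enforced regardless of what the model emits.
ALLOWED = {
    "support_agent": {"search_tickets"},
    "admin": {"search_tickets", "delete_ticket"},
}

def dispatch(user_role: str, tool: str, *args: str):
    """Run a model-requested tool call only if deterministic policy allows."""
    if tool not in ALLOWED.get(user_role, set()):
        raise PermissionError(f"{user_role!r} may not call {tool!r}")
    return TOOLS[tool](*args)
```

Even if an injected prompt convinces the model to emit a `delete_ticket` call, the dispatcher rejects it for a `support_agent` session.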
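The password-cracking analogy suggests treating guardrails like failed-login counters. A sketch under that assumption, where the guardrail is any boolean classifier passed in by the caller and the strike limit is arbitrary:

```python
from collections import defaultdict
from typing import Callable

STRIKE_LIMIT = 3                      # arbitrary, like a failed-login limit
strikes: dict[str, int] = defaultdict(int)
suspended: set[str] = set()

def screen(user_id: str, prompt: str, guardrail: Callable[[str], bool]) -> str:
    """Pass the prompt through if clean; count and act on guardrail hits."""
    if user_id in suspended:
        raise PermissionError("account suspended pending review")
    if guardrail(prompt):             # True means the guardrail flagged it
        strikes[user_id] += 1
        if strikes[user_id] >= STRIKE_LIMIT:
            suspended.add(user_id)    # stop the cracking loop, don't just block once
        raise ValueError("request blocked by guardrail")
    return prompt
```

The point is not that the guardrail catches everything; it is that repeated triggers become a signal that ends the attacker's attempt budget.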
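Finally, a sketch of the code-then-execute pattern: the plan is fixed before any untrusted content is read, and the executor propagates a taint flag so policy can keep untrusted data out of sensitive tools. All tool and step names here are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Value:
    data: str
    trusted: bool  # False once derived from untrusted input

SENSITIVE = {"send_email"}  # tools that must never receive untrusted data

def read_ticket(v: Value) -> Value:
    # Simulates fetching attacker-writable content; the result is untrusted.
    return Value(f"<body of ticket {v.data}>", trusted=False)

def summarize(v: Value) -> Value:
    # Taint propagates: a summary of untrusted text is still untrusted.
    return Value(v.data[:40], trusted=v.trusted)

def send_email(v: Value) -> Value:
    print(f"sending: {v.data}")
    return Value("sent", trusted=True)

TOOLS = {"read_ticket": read_ticket, "summarize": summarize, "send_email": send_email}

def execute(plan: list[tuple[str, str]]) -> None:
    """Run (tool, input) steps; step i's result is referenced as "$i"."""
    env: dict[str, Value] = {}
    for i, (tool, arg) in enumerate(plan):
        val = env.get(arg, Value(arg, trusted=True))  # plain literals are trusted
        if tool in SENSITIVE and not val.trusted:
            raise PermissionError(f"policy: {tool} cannot take untrusted data")
        env[f"${i}"] = TOOLS[tool](val)

# The plan exists before the untrusted ticket body enters the system, so a
# malicious ticket can only flow through it as *data*, where policy applies.
try:
    execute([("read_ticket", "TICKET-42"), ("summarize", "$0"), ("send_email", "$1")])
except PermissionError as exc:
    print(exc)  # policy: send_email cannot take untrusted data
```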
