
Sander Schulhoff

1 episode · 1 podcast

We have 1 summarized appearance for Sander Schulhoff so far. Browse all podcasts to discover more episodes.

Featured On 1 Podcast

All Appearances


AI Summary

→ WHAT IT COVERS

AI security researcher Sander Schulhoff argues that current AI guardrails fail against prompt injection attacks, leaving enterprise AI systems vulnerable as agents gain real-world capabilities.

→ KEY INSIGHTS

- **AI Guardrails Ineffectiveness:** Current guardrails fail against determined attackers because the attack space is astronomically large (on the order of one followed by a million zeros possible prompts). Human attackers break 100% of tested defenses within 10-30 attempts, which makes guardrail vendors' claims of 99% effectiveness statistically meaningless.
- **Classical vs. AI Security:** A software bug can be patched with 99.99% certainty, but AI systems retain vulnerabilities even after fixes. Companies need hybrid expertise combining classical cybersecurity with AI research, not traditional security approaches that assume patchable systems.
- **Camel Framework Implementation:** Google's Camel framework restricts an AI agent's permissions based on the user's request. For an email task that only requires sending, it withholds read permissions, blocking prompt injection attacks that exploit combined read-write access to exfiltrate data or send malicious emails.
- **Risk Assessment Strategy:** Simple chatbots without action capabilities pose minimal security risk beyond reputational damage. The real danger lies in agentic systems that can read databases, send emails, or control physical systems, where prompt injection enables actual harm.
- **Market Correction Prediction:** The AI security industry faces a correction as enterprises discover that guardrails don't work and that better open-source alternatives exist. Most guardrail companies generate minimal revenue, while classical cybersecurity firms overpay for ineffective AI security acquisitions.

→ NOTABLE MOMENT

Schulhoff demonstrates how ServiceNow's AI assistant, despite having prompt injection protection enabled, was hacked via second-order attacks to recruit internal agents for database manipulation and external email sending.
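The permission-scoping idea behind the Camel framework can be sketched in a few lines: before the agent runs, grant it only the tools its task requires, so an injected prompt cannot invoke anything else. This is a minimal illustrative sketch, not the framework's real API; the names `Tool` and `scope_tools` are assumptions for the example.

```python
# Illustrative sketch of capability-scoped agent tools (not Camel's actual API).
from dataclasses import dataclass, field
from typing import Callable

@dataclass(frozen=True)
class Tool:
    name: str
    capability: str                      # e.g. "email:send", "email:read"
    run: Callable[..., str] = field(compare=False)

def scope_tools(all_tools: list[Tool], granted: set[str]) -> list[Tool]:
    """Return only the tools whose capability the user's request requires."""
    return [t for t in all_tools if t.capability in granted]

# A "send an email" task grants email:send but NOT email:read, so an
# injected instruction cannot read the inbox and exfiltrate its contents.
tools = [
    Tool("send_email", "email:send", lambda to, body: f"sent to {to}"),
    Tool("read_inbox", "email:read", lambda: "inbox contents"),
]
scoped = scope_tools(tools, granted={"email:send"})
```

Here `scoped` contains only `send_email`; any tool call the model emits is resolved against this reduced set, so `read_inbox` is simply unavailable for the duration of the task.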
💼 SPONSORS

- Datadog: datadoghq.com/lenny
- Metronome: metronome.com
- GoFundMe Giving Funds: gofundme.com/lenny

🏷️ AI Security, Prompt Injection, AI Guardrails, Adversarial Robustness, AI Agents, Cybersecurity
