
AI Finds 70% of Smart Contract Exploits | Alpin Yukseloglu
BanklessAI Summary
→ WHAT IT COVERS
Alpin Yukseloglu, investment and research partner at Paradigm, presents findings from EVM Bench, a benchmark co-authored with OpenAI that measures AI agents' ability to detect, patch, and exploit smart contract vulnerabilities. Top models jumped from under 20% to over 70% exploit detection in six months, reshaping crypto security assumptions.

→ KEY INSIGHTS
- **AI exploit capability trajectory:** Frontier models went from finding roughly 12–13% of critical smart contract bugs to over 70% within six months, a jump that occurred partly between drafting and publishing the EVM Bench paper. GPT-5.3 Codex now matches or approaches the collective output of human auditors on fund-draining vulnerabilities sourced from Code4rena audit contests. Yukseloglu expects a superhuman AI auditor within six to eight months.
- **False positive elimination via verifiability:** Previous AI auditing tools produced false positives at a rate that made them impractical. EVM Bench addresses this by running exploits against a production-grade EVM environment loaded with real chain state. If the agent claims a bug exists, it must produce a working proof-of-concept that drains funds from the contract, which reduces false positives to near zero and makes AI audit results actionable.
- **Long-tail contract risk:** Low-TVL protocols on EVM-compatible chains like Binance Smart Chain face the highest near-term exploit risk. These contracts were historically sheltered because the maximum extractable value was too small to attract skilled attackers. As inference costs drop below the value of exploiting even small contracts, AI agents will systematically sweep this long tail, making security investment non-optional regardless of protocol size.
- **Crypto's verifiability accelerates AI training:** Crypto code is among the most verifiable software in existence: agents can deploy contracts, assert state changes, and confirm exploits without human labelers. This creates a tight training signal that lets models improve faster than in most software domains. Paradigm expects models to develop strong crypto capabilities with less direct training data than initially anticipated, compressing the timeline to superhuman performance.
- **Offense-defense arms race framing:** The near-term security outcome depends on whether white-hat or black-hat actors access frontier AI capabilities first. Paradigm's strategic response is to embed crypto benchmarks directly inside model labs (EVM Bench is now running inside OpenAI) so that defensive tooling develops alongside offensive capability. Protocols holding significant TVL should begin proactive AI-assisted auditing now rather than waiting for the first AI-attributed exploit.
- **Agency over singularity anxiety:** When facing uncertainty about AI's trajectory, Yukseloglu recommends replacing speculative theorizing with direct experimentation at the frontier. Both full acceptance and full denial of AI risk produce the same passive outcome. The practical alternative is running experiments, engaging model labs directly, and shipping within 24 hours of inception; speed over cohesion is the correct operating mode when the frontier remains experimentally unknowable.

→ NOTABLE MOMENT
Yukseloglu describes a counterintuitive dynamic: Solana's prevalence of closed-source contracts, typically seen as a disadvantage, may actually accelerate AI model development on that stack. Contracts absent from public training data provide cleaner, uncontaminated evaluation signals, potentially giving closed-source ecosystems an unexpected edge in AI capability benchmarking.
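
The verifiability insight above (a claimed bug counts only if its proof-of-concept actually drains funds) can be illustrated with a toy grader. This is a minimal sketch under loud assumptions: `ToyVault`, `verify_exploit`, and the example proofs-of-concept are hypothetical names invented for illustration, the "contract" is a plain Python object rather than EVM bytecode, and none of this reflects how EVM Bench itself is implemented. It only shows the general idea of trusting an observed state change over the agent's own claim.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyVault:
    """Hypothetical stand-in for a deployed contract; not real EVM code."""
    balances: Dict[str, int] = field(default_factory=dict)  # per-user deposits
    reserves: int = 0  # total funds held by the vault

    def deposit(self, user: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balances[user] = self.balances.get(user, 0) + amount
        self.reserves += amount

    def withdraw(self, user: str, amount: int) -> int:
        # Deliberate bug for the demo: only vault-wide reserves are checked,
        # never the caller's own deposit.
        if amount > self.reserves:
            raise ValueError("insufficient reserves")
        self.reserves -= amount
        return amount


ProofOfConcept = Callable[[ToyVault, str], None]


def verify_exploit(vault: ToyVault, poc: ProofOfConcept,
                   attacker: str = "attacker", min_drain: int = 1) -> bool:
    """Accept a claimed bug only if running the PoC measurably drains funds."""
    sandbox = copy.deepcopy(vault)  # run against a copy, never the original state
    reserves_before = sandbox.reserves
    try:
        poc(sandbox, attacker)
    except Exception:
        return False  # a PoC that reverts proves nothing
    drained = reserves_before - sandbox.reserves  # net loss after the PoC ran
    return drained >= min_drain


# Example proofs-of-concept an agent might submit (all hypothetical).
def real_poc(v: ToyVault, attacker: str) -> None:
    v.withdraw(attacker, 100)  # takes funds the attacker never deposited


def bogus_poc(v: ToyVault, attacker: str) -> None:
    pass  # claims a bug but changes no state


def round_trip_poc(v: ToyVault, attacker: str) -> None:
    v.deposit(attacker, 50)
    v.withdraw(attacker, 50)  # only recovers the attacker's own deposit


def reverting_poc(v: ToyVault, attacker: str) -> None:
    v.withdraw(attacker, 10_000)  # exceeds reserves, so the call fails


vault = ToyVault()
vault.deposit("alice", 100)

results = {
    "real": verify_exploit(vault, real_poc),
    "bogus": verify_exploit(vault, bogus_poc),
    "round_trip": verify_exploit(vault, round_trip_poc),
    "reverting": verify_exploit(vault, reverting_poc),
}
print(results)
```

Because the grader measures the net change in reserves on a copied state, a submission that does nothing, merely round-trips its own deposit, or fails mid-execution is rejected automatically. That is the sense in which a pass/fail signal of this kind needs no human labeler.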
💼 SPONSORS Galaxy (https://galaxy.com/bankless), Euphoria (https://euphoria.finance), Bricks, Bitget, The DeFi Report
🏷️ Smart Contract Security, AI Exploit Detection, EVM Bench, Paradigm Research, DeFi Risk, Crypto AI Integration