AI Finds 70% of Smart Contract Exploits | Alpin Yukseloglu
Episode: 61 min
Read time: 3 min
Topics: Relationships, Investing, Fundraising & VC
AI-Generated Summary
Key Takeaways
- ✓ AI exploit capability trajectory: Frontier models went from finding roughly 12–13% of critical smart contract bugs to over 70% within six months, a jump that occurred partly between drafting and publishing the EVM Bench paper. GPT-5.3 Codex now matches or approaches the collective output of human auditors on fund-draining vulnerabilities sourced from Code4rena audit contests. Yukseloglu expects a superhuman AI auditor within six to eight months.
- ✓ False positive elimination via verifiability: Previous AI auditing tools produced so many false positives that they were impractical. EVM Bench addresses this by running exploits against a production-grade EVM environment loaded with real chain state. If the agent claims a bug exists, it must produce a working proof-of-concept that drains funds from the contract. That requirement cuts false positives to near zero and makes AI audit results actionable.
- ✓ Long-tail contract risk: Low-TVL protocols on EVM-compatible chains like Binance Smart Chain face the highest near-term exploit risk. These contracts were historically sheltered because the maximum extractable value was too small to attract skilled attackers. As inference costs drop below the value of exploiting even small contracts, AI agents will systematically exploit this long tail, making security investment non-optional regardless of protocol size.
- ✓ Crypto's verifiability accelerates AI training: Crypto code is among the most verifiable software in existence: agents can deploy contracts, assert state changes, and confirm exploits without human labelers. This creates a tight training signal that lets models improve faster than in most software domains. Paradigm expects models to develop strong crypto capabilities with less direct training data than initially anticipated, compressing the timeline to superhuman performance.
- ✓ Offense-defense arms race framing: The near-term security outcome depends on whether white-hat or black-hat actors access frontier AI capabilities first. Paradigm's strategic response is to embed crypto benchmarks directly inside model labs (EVM Bench is now running inside OpenAI) so that defensive tooling develops alongside offensive capability. Protocols holding significant TVL should begin proactive AI-assisted auditing now rather than waiting for the first AI-attributed exploit.
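The verifiability takeaway above can be illustrated with a toy harness. This is only a minimal sketch and not EVM Bench's actual implementation: `ToyVault`, `verify_finding`, and both example "findings" are hypothetical names invented here, and a plain Python object stands in for a real EVM with chain state. The idea it shows is that a claimed bug is accepted only if running the submitted exploit in a sandbox actually moves funds to the attacker.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyVault:
    """Hypothetical stand-in for a deployed contract (not a real EVM)."""
    deposits: Dict[str, int] = field(default_factory=dict)  # funds held per account
    wallets: Dict[str, int] = field(default_factory=dict)   # funds paid out per caller

    def deposit(self, account: str, amount: int) -> None:
        self.deposits[account] = self.deposits.get(account, 0) + amount

    def withdraw(self, caller: str, account: str, amount: int) -> None:
        # Deliberate bug for illustration: no check that caller == account.
        if self.deposits.get(account, 0) < amount:
            raise ValueError("insufficient deposit")  # plays the role of a revert
        self.deposits[account] -= amount
        self.wallets[caller] = self.wallets.get(caller, 0) + amount


Exploit = Callable[[ToyVault, str], None]


def verify_finding(initial: ToyVault, exploit: Exploit, attacker: str = "attacker") -> bool:
    """Accept a finding only if the exploit increases the attacker's balance."""
    sandbox = copy.deepcopy(initial)  # never touch the original state
    before = sandbox.wallets.get(attacker, 0)
    try:
        exploit(sandbox, attacker)
    except Exception:
        return False  # the exploit reverted, so the claim is unproven
    return sandbox.wallets.get(attacker, 0) > before


def real_exploit(vault: ToyVault, attacker: str) -> None:
    # Uses the missing access check to withdraw the victim's deposit.
    vault.withdraw(caller=attacker, account="victim", amount=vault.deposits["victim"])


def false_positive(vault: ToyVault, attacker: str) -> None:
    # Claims an overdraft is possible; the balance check rejects it.
    vault.withdraw(caller=attacker, account=attacker, amount=1)


initial_state = ToyVault()
initial_state.deposit("victim", 100)

print(verify_finding(initial_state, real_exploit))    # accepted
print(verify_finding(initial_state, false_positive))  # rejected
```

Because the harness measures balances itself rather than trusting the agent's report, an unproven claim cannot pass, which is the property the summary credits for driving false positives toward zero.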
What It Covers
Alpin Yukseloglu, investment and research partner at Paradigm, presents findings from EVM Bench — a benchmark co-authored with OpenAI measuring AI agents' ability to detect, patch, and exploit smart contract vulnerabilities. Top models jumped from under 20% to over 70% exploit detection in six months, reshaping crypto security assumptions.
Key Questions Answered
- • Agency over singularity anxiety: When facing uncertainty about AI's trajectory, Yukseloglu recommends replacing speculative theorizing with direct experimentation at the frontier. Both full acceptance and full denial of AI risk produce the same passive outcome. The practical alternative is running experiments, engaging model labs directly, and shipping within 24 hours of inception. Speed over cohesion is the correct operating mode when the frontier remains experimentally unknowable.
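The long-tail contract risk point in this summary rests on simple break-even arithmetic, which can be sketched as follows. Everything here is an assumption made for illustration: the function names, the model (an attempt is worthwhile when success probability times extractable value exceeds the cost of the attempt), and every number in the example are hypothetical and do not come from the episode or from EVM Bench.

```python
from typing import List


def break_even_value(cost_per_attempt: float, success_probability: float) -> float:
    """Smallest extractable value at which one attempt pays for itself in expectation."""
    if not 0 < success_probability <= 1:
        raise ValueError("success_probability must be in (0, 1]")
    return cost_per_attempt / success_probability


def profitable_targets(
    extractable_values: List[float],
    cost_per_attempt: float,
    success_probability: float,
) -> List[float]:
    """Contracts whose extractable value exceeds the break-even threshold."""
    threshold = break_even_value(cost_per_attempt, success_probability)
    return [value for value in extractable_values if value > threshold]


# Made-up extractable values (in dollars) for five hypothetical contracts.
values = [500, 2_000, 10_000, 50_000, 1_000_000]

# Illustrative costs only: an expensive attempt versus a cheap one.
expensive = profitable_targets(values, cost_per_attempt=1_000, success_probability=0.25)
cheap = profitable_targets(values, cost_per_attempt=50, success_probability=0.25)

print(len(expensive), "of", len(values), "contracts worth attacking at high cost")
print(len(cheap), "of", len(values), "contracts worth attacking at low cost")
```

Under this simplified model, lowering the cost per attempt lowers the threshold, so contracts that were previously too small to bother with become worthwhile targets.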
Notable Moment
Yukseloglu describes a counterintuitive dynamic: Solana's prevalence of closed-source contracts, typically seen as a disadvantage, may actually accelerate AI model development on that stack. Contracts absent from public training data provide cleaner, uncontaminated evaluation signals — potentially giving closed-source ecosystems an unexpected edge in AI capability benchmarking.