Devi Parikh

1 episode
1 podcast

We have 1 summarized appearance for Devi Parikh so far. Browse all podcasts to discover more episodes.


All Appearances

1 episode

AI Summary

→ WHAT IT COVERS
Devi Parikh, co-founder of Yutori, explains how AI browser agents will replace manual web interactions through proactive monitoring and automation, starting with Scouts, the company's product that monitors websites for changes in user-specified information.

→ KEY INSIGHTS
- **Visual-based browser navigation:** Training models on website screenshots rather than DOM information proves more reliable and generalizes better across sites. It handles elements such as date pickers, which under DOM-based approaches produced constant edge cases requiring site-specific solutions.
- **Scouts architecture combines APIs and browser automation:** The system uses 80-90 MCP servers for structured data access and spins up remote browsers with custom-trained navigator models for information behind forms. It optimizes for coverage first, then for precision in user-facing reports.
- **Post-training progression maximizes model capability:** Yutori trains QwQ models through supervised fine-tuning, then rejection sampling, then reinforcement learning, to achieve reliable browser automation while keeping costs lower than third-party API providers would allow for its production workloads.
- **Background agents require hierarchical tool management:** When orchestrating 80-90 tools, reliability breaks down if all tools are available to one agent simultaneously. Sub-agents with access to specific tool subsets enable scalable multi-agent workflows that adapt based on real-time web information.

→ NOTABLE MOMENT
Parikh reveals that, contrary to initial assumptions, consuming web pages visually, as humans do, rather than parsing the underlying code proved essential for building reliable browser agents, because identical-looking pages often have completely different underlying structures.

💼 SPONSORS
Capital One; Agency (Linux Foundation), https://agency.org

🏷️ Browser Automation, AI Agents, Web Scraping, Multimodal AI
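
The hierarchical tool management described in the key insights can be illustrated with a minimal Python sketch. This is not the architecture discussed in the episode: the class names (`SubAgent`, `Orchestrator`), the topic labels, and the three stand-in tool functions are all hypothetical. The sketch only shows the general idea that a top-level agent routes work to sub-agents, each of which can see a small subset of tools rather than the full set.

```python
from dataclasses import dataclass, field


@dataclass
class SubAgent:
    """Hypothetical worker that can only see its own small subset of tools."""
    name: str
    tools: dict  # tool name -> callable

    def run(self, tool_name, query):
        # Refuse any tool outside this sub-agent's subset.
        if tool_name not in self.tools:
            raise KeyError(f"{self.name} has no access to tool {tool_name!r}")
        return self.tools[tool_name](query)


@dataclass
class Orchestrator:
    """Hypothetical top-level agent: routes by topic instead of holding every tool."""
    sub_agents: dict = field(default_factory=dict)  # topic -> SubAgent

    def register(self, topic, agent):
        self.sub_agents[topic] = agent

    def dispatch(self, topic, tool_name, query):
        return self.sub_agents[topic].run(tool_name, query)


# Stand-ins for real tool servers (assumptions for illustration only).
def flight_prices(query):
    return f"flight prices for {query}"


def hotel_rates(query):
    return f"hotel rates for {query}"


def news_search(query):
    return f"news about {query}"


orchestrator = Orchestrator()
orchestrator.register(
    "travel",
    SubAgent("travel", {"flight_prices": flight_prices, "hotel_rates": hotel_rates}),
)
orchestrator.register("news", SubAgent("news", {"news_search": news_search}))

print(orchestrator.dispatch("travel", "flight_prices", "SFO to NYC"))
```

In this sketch the orchestrator never holds the tool callables itself, so adding more tools means adding or extending sub-agents rather than widening one agent's tool list.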
