Software Engineering Daily

Small AI Models with Yoeven Khemlani

40 min episode · 2 min read

Topics

Artificial Intelligence

AI-Generated Summary

Key Takeaways

  • Small model strategy: Train 70B-parameter models instead of 400B-parameter ones by specializing each model for a single use case. This allows deployment on A100 GPUs rather than H100s, cutting infrastructure costs while maintaining 97-98% accuracy on specific tasks such as structured web scraping.
  • Prompt engine architecture: The engine routes each prompt across five models simultaneously and uses a mixture-of-agents technique, in which smaller models judge the outputs, to converge on the best answer. By the tenth execution, the system locks onto a single optimal model, so early runs get quality through consensus and later runs get consistency.
  • GPU-poor methodology: Build every model to run on accessible hardware (A100, A10G) rather than premium GPUs, prioritizing deployability for enterprise self-hosting over raw performance. This distribution strategy lets customers deploy on AWS, Azure, GCP, or on-premises infrastructure without restrictions.
  • Developer experience principle: Design the SDK with complete TypeScript typing so developers do not need documentation for basic usage. After an npm install, the API structure, modeled after Stripe's, is meant to be intuitive: method names and parameters are self-explanatory, and docs are reserved for advanced configurations.
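
The prompt engine takeaway above can be sketched roughly as follows. This is a minimal TypeScript illustration under stated assumptions, not JigsawStack's implementation: `PromptRouter`, `Model`, `Judge`, and `lockAfter` are hypothetical names, the models and judges are plain synchronous stubs (a real system would presumably call them concurrently), and summing judge scores and locking to the model with the most wins are assumed aggregation rules that the summary does not spell out.

```typescript
// Hypothetical sketch: none of these names come from JigsawStack's SDK.
type Model = { name: string; generate: (prompt: string) => string };
// A judge scores one candidate output; higher is better (assumed convention).
type Judge = (prompt: string, output: string) => number;

class PromptRouter {
  private readonly models: Model[];
  private readonly judges: Judge[];
  private readonly lockAfter: number;
  private readonly wins = new Map<string, number>();
  private runs = 0;
  private locked: Model | null = null;

  // lockAfter = 10 is an assumption drawn from "by the tenth execution".
  constructor(models: Model[], judges: Judge[], lockAfter = 10) {
    if (models.length === 0 || judges.length === 0) {
      throw new Error("need at least one model and one judge");
    }
    this.models = models;
    this.judges = judges;
    this.lockAfter = lockAfter;
  }

  run(prompt: string): { model: string; output: string } {
    // Once locked, only the chosen model is called, which keeps output consistent.
    if (this.locked !== null) {
      return { model: this.locked.name, output: this.locked.generate(prompt) };
    }

    // Fan the prompt out to every model (sequentially here for brevity).
    const candidates = this.models.map((model) => {
      const output = model.generate(prompt);
      // Sum the judges' scores for this output (assumed aggregation rule).
      const score = this.judges.reduce((sum, judge) => sum + judge(prompt, output), 0);
      return { model, output, score };
    });

    // Pick the highest-scoring candidate; ties go to the earlier model.
    const best = candidates.reduce((a, b) => (b.score > a.score ? b : a));
    this.wins.set(best.model.name, (this.wins.get(best.model.name) ?? 0) + 1);
    this.runs += 1;

    // After `lockAfter` consensus runs, lock to the model with the most wins.
    if (this.runs >= this.lockAfter) {
      this.locked = this.models.reduce((a, b) =>
        (this.wins.get(b.name) ?? 0) > (this.wins.get(a.name) ?? 0) ? b : a,
      );
    }
    return { model: best.model.name, output: best.output };
  }

  // Name of the locked model, or null while the router is still in consensus mode.
  lockedModel(): string | null {
    return this.locked === null ? null : this.locked.name;
  }
}
```

In this sketch every call before the lock pays for all models plus the judges, and every call after it pays for one model only, which is one plausible reading of how consensus early on trades cost for quality and the lock later trades it back for consistency.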

What It Covers

Yoeven Khemlani explains how JigsawStack builds specialized small AI models (70B parameters) for backend automation tasks such as web scraping, OCR, and translation, achieving up to 98% accuracy while remaining deployable and cost-efficient at $1.40 per million tokens.
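
To put the quoted price in context, the arithmetic can be sketched as below. The function name `estimateCostUsd` is hypothetical, and the sketch assumes a single flat rate of $1.40 per million tokens applied to all tokens; the summary does not say whether input and output tokens are priced differently.

```typescript
// Assumption: one flat rate for every token, taken from the summary's $1.40 figure.
const USD_PER_MILLION_TOKENS = 1.4;

// Hypothetical helper: estimated spend in US dollars for a given token count.
function estimateCostUsd(
  tokens: number,
  usdPerMillion: number = USD_PER_MILLION_TOKENS,
): number {
  if (!Number.isFinite(tokens) || tokens < 0) {
    throw new RangeError("tokens must be a non-negative finite number");
  }
  return (tokens / 1_000_000) * usdPerMillion;
}

// Example: a workload of 250 million tokens at the flat rate.
const monthlyEstimate = estimateCostUsd(250_000_000);
console.log(monthlyEstimate.toFixed(2));
```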

Notable Moment

JigsawStack benchmarked its OCR model against Mistral's OCR model, which Mistral billed as the world's best, and found significant performance gaps. The result suggests that specialized small models from focused startups can outperform rushed releases from well-resourced companies entering adjacent markets.
