Shinji Kim

SelectStar Founder Shinji Kim Explains Automated Metadata Collection, Three-Layer Metadata Architecture, Cost Optimization Through Usage Tracking, and an MCP Server for AI Workflows
1 episode
1 podcast

We have 1 summarized appearance for Shinji Kim so far. Browse all podcasts to discover more episodes.

Featured On 1 Podcast

All Appearances

1 episode

AI Summary

→ WHAT IT COVERS
SelectStar founder Shinji Kim explains how automated metadata platforms solve data discovery challenges by analyzing query logs to build knowledge graphs, enabling AI agents to generate accurate SQL through popularity scores, lineage tracking, and semantic models.

→ KEY INSIGHTS
- **Automated metadata collection:** SelectStar parses SQL query logs to track which tables are joined together, on which join conditions, and how often across users. The result is a knowledge graph, built without manual documentation, that reveals actual data relationships and trust signals through behavioral patterns.
- **Three-layer metadata architecture:** Physical assets form layer one, usage signals such as popularity and lineage make up layer two, and business context, including semantic models and metric definitions, makes up layer three. This structure enables AI to find the correct datasets and generate accurate queries.
- **Cost optimization through usage tracking:** Organizations reduce cloud warehouse bills by using popularity metrics to identify unused tables and unviewed BI dashboards. Combining lineage with usage data reveals which data models consume resources without delivering value to end users or downstream systems.
- **MCP server for AI workflows:** SelectStar's Model Context Protocol server provides four tools (metadata search, asset details, lineage traversal, and impact analysis) that give AI agents in Claude and Cursor access to popularity scores and example queries, so they generate queries with higher accuracy.

→ NOTABLE MOMENT
Kim reveals that foundation models trained on world data fail against real enterprise databases: messy data with similar table names, denormalized structures, and multi-level calculations causes hallucinations, which example queries and popularity context prevent.

💼 SPONSORS
None detected

🏷️ Data Discovery, Metadata Management, Text-to-SQL, Semantic Layer
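
The query-log idea in the summary (deriving join relationships, popularity, and unused tables from the SQL that people actually run) can be illustrated with a small Python sketch. This is not SelectStar's implementation: the log entries, the catalog, the table names, and the helper functions are all hypothetical, and a naive regular expression stands in for a real SQL parser. It also treats two tables appearing in the same query as a join, which is only an approximation.

```python
import re
from collections import Counter

# Hypothetical query log of (user, sql) pairs; illustrative data only.
QUERY_LOG = [
    ("ana", "SELECT o.id FROM orders o JOIN customers c ON o.customer_id = c.id"),
    ("ben", "SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id"),
    ("ana", "SELECT count(*) FROM orders"),
    ("cy", "SELECT * FROM customers"),
]

# Hypothetical list of tables that physically exist in the warehouse.
CATALOG = {"orders", "customers", "orders_backup_2021"}

# Naive assumption: a table name is whatever follows FROM or JOIN.
# A production system would use a full SQL parser instead.
TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def tables_in(sql):
    """Return the lowercased table names referenced after FROM/JOIN."""
    return [name.lower() for name in TABLE_RE.findall(sql)]


def build_usage_signals(log):
    """Aggregate per-table query counts, distinct users, and table co-occurrence."""
    query_counts = Counter()   # table -> number of queries touching it
    users = {}                 # table -> set of users who queried it
    join_counts = Counter()    # (table_a, table_b) -> queries using both
    for user, sql in log:
        tables = sorted(set(tables_in(sql)))
        for table in tables:
            query_counts[table] += 1
            users.setdefault(table, set()).add(user)
        # Count every unordered pair of tables that share a query.
        for i, left in enumerate(tables):
            for right in tables[i + 1:]:
                join_counts[(left, right)] += 1
    return query_counts, users, join_counts


def find_unused(catalog, query_counts):
    """Tables in the catalog that never appear in the log."""
    return {table for table in catalog if query_counts[table] == 0}


query_counts, users, join_counts = build_usage_signals(QUERY_LOG)
unused = find_unused(CATALOG, query_counts)

for table in sorted(CATALOG):
    print(table, query_counts[table], len(users.get(table, set())))
print("joined pairs:", dict(join_counts))
print("unused:", sorted(unused))
```

In this toy setup, the query count and distinct-user count per table play the role of a popularity signal, the pair counts approximate edges in a knowledge graph, and the unused set is the kind of list a cost review might start from.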
