AI Summary
→ WHAT IT COVERS
SelectStar founder Shinji Kim explains how automated metadata platforms solve data discovery challenges by analyzing query logs to build knowledge graphs, and how popularity scores, lineage tracking, and semantic models enable AI agents to generate accurate SQL.

→ KEY INSIGHTS
- **Automated metadata collection:** SelectStar parses SQL query logs to track which tables are joined together, on what join conditions, and how often across users. The result is a knowledge graph built without manual documentation that reveals actual data relationships and surfaces trust signals from behavioral patterns.
- **Three-layer metadata architecture:** Physical assets form layer one, usage signals such as popularity and lineage comprise layer two, and business context, including semantic models and metric definitions, makes up layer three. This structure enables AI to find the correct datasets and generate accurate queries.
- **Cost optimization through usage tracking:** Organizations reduce cloud warehouse bills by using popularity metrics to identify unused tables and unviewed BI dashboards. Combining lineage with usage data reveals which data models consume resources without delivering value to end users or downstream systems.
- **MCP server for AI workflows:** SelectStar's Model Context Protocol server provides four tools (metadata search, asset details, lineage traversal, and impact analysis) that enable AI agents in Claude and Cursor to generate queries with higher accuracy by accessing popularity scores and example queries.

→ NOTABLE MOMENT
Kim reveals that foundation models trained on world data fail against real enterprise databases: messy data with similar table names, denormalized structures, and multi-level calculations causes hallucinations, which example queries and popularity context help prevent.

💼 SPONSORS
None detected

🏷️ Data Discovery, Metadata Management, Text-to-SQL, Semantic Layer
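
The query-log analysis described under "Automated metadata collection" can be illustrated with a small sketch. This is not SelectStar's implementation: the log entries, the `tables_in` and `build_usage_graph` helpers, and the regex-based table extraction are all hypothetical, and a real system would use a proper SQL parser. It treats any two tables appearing in the same query as a join edge and uses query count plus distinct users as a naive stand-in for a popularity signal.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical query log: (user, sql) pairs. Real logs would come from
# the warehouse's query history, which is not modeled here.
QUERY_LOG = [
    ("ana", "SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id"),
    ("ben", "SELECT c.id FROM customers c JOIN orders o ON o.customer_id = c.id"),
    ("ana", "SELECT * FROM orders_backup"),
    ("cy", "SELECT o.id FROM orders o JOIN payments p ON p.order_id = o.id"),
]

# Naive extraction: the identifier that follows FROM or JOIN.
# Assumption: no subqueries, CTEs, or quoted identifiers.
TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def tables_in(sql):
    """Return the sorted, de-duplicated table names referenced in one query."""
    return sorted({name.lower() for name in TABLE_RE.findall(sql)})


def build_usage_graph(log):
    """Aggregate per-table usage and pairwise co-occurrence from a query log."""
    query_counts = Counter()   # table -> number of queries touching it
    users = defaultdict(set)   # table -> distinct users who queried it
    join_counts = Counter()    # (table_a, table_b) -> queries using both
    for user, sql in log:
        tables = tables_in(sql)
        for table in tables:
            query_counts[table] += 1
            users[table].add(user)
        # Simplification: co-occurrence in one query counts as a join edge.
        for pair in combinations(tables, 2):
            join_counts[pair] += 1
    return query_counts, users, join_counts


query_counts, users, join_counts = build_usage_graph(QUERY_LOG)

# Tables ordered from least to most queried: candidates for cleanup first.
for table, count in sorted(query_counts.items(), key=lambda kv: kv[1]):
    print(table, count, len(users[table]))
```

In this toy log, `orders` is touched by three queries from three different users, while `orders_backup` appears once, which is the kind of contrast that helps an agent choose between similarly named tables or flags a table as a cleanup candidate.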
