
Volume 325, Issue 4 · IT News · AI

Tracking And Ranking AI Agents

techstrong.ai, Friday, April 25th, 2025

In this Techstrong AI video, Galileo CTO Atin Sanyal explains why the capabilities of artificial intelligence (AI) agents will need to be continuously tracked and ranked.

In this Techstrong AI interview, Mike Vizard talks with Atin Sanyal, CTO and co-founder of Galileo, about the company's new AI agent leaderboard, which is designed to benchmark agent performance for real-world, industrial use cases.

Sanyal explains that Galileo's mission grew out of the challenges he observed at Uber and Apple, where a lack of robust AI evaluation often led to production failures. Instead of relying solely on academic benchmarks, Galileo's leaderboard evaluates agents across 25 industry-specific tasks, using proprietary metrics to reveal surprising differences between models' practical performance and their academic reputations.
