Why 80% Accuracy Fails in AI Agents: Inside Amazon AGI Labs

Posted by admin
At NeurIPS 2025, we sat down with Deniz Birlikci, Member of Technical Staff at Amazon AGI Labs, to discuss the cutting edge of web agents. Deniz dives deep into the challenges of Reinforcement Learning (RL) in sparse environments, explaining why "mocking" a website to 80% fidelity isn't enough to train reliable models. He also introduces Amazon’s newly released Nova Act model, explains why agents should be viewed as "stacks" rather than just models, and paints a picture of a future where agents act as the API for the entire web.

A huge thank you to Lambda for sponsoring the SAIL booth at NeurIPS 2025 and making these interviews possible.

In this video:
00:00 - Introduction & Deniz’s role at Amazon AGI Labs
00:32 - The state of RL research at NeurIPS 2025
01:20 - Unreliable environments & the challenge of sparse rewards
02:49 - Why 80% website fidelity creates "noise" and failure
04:51 - Launching Amazon Nova Act: Building for enterprise & developers
06:00 - The "Agent Stack": Why the model is only one piece of the puzzle
07:02 - The Future: Agents as Connective Tissue (APIs) vs. Assistants
08:52 - The startup culture inside Amazon AGI SF Labs

#NeurIPS2025 #ArtificialIntelligence #WebAgents #ReinforcementLearning #AmazonAGI #NovaAct #MachineLearning #Lambda #SAIL

Posted December 30, 2025