Google researchers improve reinforcement learning by having their AI play Pong

Deep reinforcement learning — an AI training technique that employs rewards to drive software policies toward goals — has been tapped to model the impact of social norms, create AI that’s exceptionally good at playing games, and program robots that can recover from nasty spills. But despite its versatility, reinforcement learning (or “RL,” as it’s typically abbreviated) has a showstopping shortcoming: It’s inefficient. Training a policy requires lots of interactions within a simulated or real-world environment — far more than the average person needs to learn a task.

To address that inefficiency, at least in the video gaming domain, researchers at Google recently proposed a new algorithm, Simulated Policy Learning (SimPLe for short), which uses models of the game to learn quality policies for selecting actions. They describe it in a newly published preprint paper (“Model-Based Reinforcement Learning for Atari”) and in documentation accompanying the open-sourced code.
