When you play chess or Go, you have perfect information. You know everything about the current state and past history of every piece on the board. When you play poker, or, let’s say, StarCraft, you have imperfect knowledge of both the current state and the past history of the game. Which is closer to life? Imperfect knowledge is, sadly (or blessedly), the truth of human existence for now. Google’s DeepMind is now leading its most powerful artificial intelligence engine deep into our world of ignorance. The result might be a new army of bots that are far more prepared for the “messiness of the real world.”
Reinforcement learning is often formalized using Markov decision processes, or MDPs. Given the current state (s) and a chosen action (a), an MDP specifies the probability of moving to each possible next state, along with the reward earned for getting there.
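To make the idea of state-to-state probabilities concrete, here is a minimal sketch of a toy MDP in Python. The state names, action names, probabilities, and rewards are all invented for illustration; nothing here comes from DeepMind or from StarCraft II itself. The reward attached to each outcome is part of the standard MDP definition.

```python
import random

# Hypothetical toy MDP: every state, action, probability, and reward below
# is an illustrative assumption, not data from a real game.
# TRANSITIONS[(state, action)] -> list of (next_state, probability, reward)
TRANSITIONS = {
    ("low_health", "retreat"): [("safe", 0.8, 1.0), ("defeated", 0.2, -10.0)],
    ("low_health", "attack"): [("safe", 0.3, 5.0), ("defeated", 0.7, -10.0)],
    ("safe", "retreat"): [("safe", 1.0, 0.0)],
    ("safe", "attack"): [("safe", 0.6, 2.0), ("low_health", 0.4, -1.0)],
}


def step(state, action, rng=random):
    """Sample one transition and return (next_state, reward)."""
    outcomes = TRANSITIONS[(state, action)]
    weights = [probability for _, probability, _ in outcomes]
    next_state, _, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def expected_reward(state, action):
    """Average immediate reward for taking `action` in `state`."""
    return sum(p * r for _, p, r in TRANSITIONS[(state, action)])


if __name__ == "__main__":
    print(step("low_health", "retreat"))
    print(expected_reward("low_health", "retreat"))
    print(expected_reward("low_health", "attack"))
```

In this toy table, retreating at low health has a better expected immediate reward than attacking, which is the kind of comparison a learning agent has to discover through experience rather than read off a table.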
Reinforcement learning is a type of machine learning in which bots (or artificial intelligences, or behavioral models) are trained to act on the current state of affairs in a specific environment, with only limited knowledge of the world they live in. These bots develop a (really, really big) handbook for life that gives them policies to follow when, in the StarCraft II example, you’ve got almost no health, you’re rich in minerals, and there’s a high probability you’re about to come face to face with Arcturus Mengsk.
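One way to picture that handbook is as a lookup table from an observed situation to an action. The sketch below assumes a tiny table of hypothetical action-value estimates (how good each action is expected to be in each situation) and derives a policy by picking the highest-valued action. The observations, actions, and numbers are made up for illustration, and a table like this is only a teaching device, not a description of how any real StarCraft II agent stores what it has learned.

```python
# Hypothetical value estimates: Q[observation][action] -> estimated long-run reward.
# Observations are (health, resources, enemy distance); all values are invented.
Q = {
    ("low_health", "rich", "enemy_near"): {"retreat": 4.0, "attack": -6.0, "build": 1.5},
    ("full_health", "rich", "enemy_near"): {"retreat": 0.5, "attack": 3.0, "build": 2.0},
    ("full_health", "poor", "enemy_far"): {"retreat": 0.0, "attack": -1.0, "build": 2.5},
}


def greedy_policy(q_table):
    """Build the 'handbook': for each observation, choose the best-valued action."""
    return {
        observation: max(action_values, key=action_values.get)
        for observation, action_values in q_table.items()
    }


policy = greedy_policy(Q)

if __name__ == "__main__":
    for observation, action in policy.items():
        print(observation, "->", action)
```

With these made-up numbers, the entry for almost no health, plenty of resources, and an enemy nearby comes out as "retreat". Training, in this picture, is the process of filling in and correcting the value estimates from experience.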