Read the book The Innovation Ultimatum - Steve Brown - Page 37

Solving Complex Problems by Learning from Experience

Some challenges, such as optimizing a system with many variables or programming a robot to walk on two legs, are too difficult, too complicated, or too laborious to tackle with traditional, hand-written programs. AI solves some of these tricky problems using a technique called reinforcement learning.

Reinforcement learning is a branch of machine learning that uses a system of digital rewards and punishments as part of its training process. Reinforcement learning systems solve previously intractable problems through an iterative process of experimentation. The AI tries a range of strategies and learns the best way to approach a problem through an intelligent form of trial and error. It's like harnessing digital evolution.
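
The trial-and-error loop described above can be illustrated with a toy example. This is only a minimal sketch of one simple technique (an epsilon-greedy "bandit"), not the method used by any system mentioned in this chapter. The three strategies, their success rates, and the names `train_bandit` and `success_rates` are assumptions made up for illustration.

```python
import random


def train_bandit(success_rates, steps=5000, epsilon=0.1, seed=0):
    """Learn which strategy pays off best through trial and error.

    success_rates is a hypothetical list: the chance that each strategy
    earns a reward of 1. The learner never sees these numbers directly.
    """
    rng = random.Random(seed)
    n = len(success_rates)
    estimates = [0.0] * n  # learned estimate of each strategy's average reward
    counts = [0] * n       # how many times each strategy has been tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Experiment: occasionally try a strategy at random.
            choice = rng.randrange(n)
        else:
            # Exploit: otherwise use the strategy that looks best so far.
            choice = max(range(n), key=lambda i: estimates[i])

        # Digital reward (1) or no reward (0) for this attempt.
        reward = 1.0 if rng.random() < success_rates[choice] else 0.0

        # Update the running average for the chosen strategy.
        counts[choice] += 1
        estimates[choice] += (reward - estimates[choice]) / counts[choice]

    return estimates, counts


estimates, counts = train_bandit([0.2, 0.5, 0.8])
best = max(range(len(estimates)), key=lambda i: estimates[i])
print("Learned estimates:", [round(e, 2) for e in estimates])
print("Preferred strategy:", best)
```

Over many attempts the learner's estimates approach the hidden success rates, so it ends up favoring the strategy that is rewarded most often.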

Reinforcement learning teaches computers to perform complex optimizations, control complex equipment, and play games really, really well. In 2018, researchers trained an AI to play the classic Sega console game Sonic the Hedgehog. Sonic has two simple controls: run and jump. The AI was trained with the video game display as input and the game controls as output. In reinforcement learning, the AI also receives an additional signal known as the reward, which is calculated by a reward function. As the AI trains, it tries to maximize that reward. Game points increase the reward, and the reward decreases substantially if Sonic loses a life. At first, the AI plays terribly. Over time, the AI adjusts its model to run and jump at just the right moment, score maximum points, and keep the adorable blue hedgehog alive. The AI does not learn based on simple timing; it learns from what is happening on the screen, so it can succeed on game levels it has not seen before.
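
A reward function of the kind described above might look something like the sketch below. The function name `step_reward`, its parameters, and the penalty size are all assumptions chosen for illustration; the text does not say how the researchers actually defined their reward.

```python
# Assumed value: the text says only that the reward drops "substantially".
LIFE_LOST_PENALTY = 100.0


def step_reward(prev_score, score, prev_lives, lives):
    """Reward for one step of play: points gained, minus a penalty per life lost."""
    reward = float(score - prev_score)  # game points increase the reward
    if lives < prev_lives:
        # Losing a life reduces the reward substantially.
        reward -= LIFE_LOST_PENALTY * (prev_lives - lives)
    return reward


print(step_reward(prev_score=100, score=150, prev_lives=3, lives=3))  # points gained, no life lost
print(step_reward(prev_score=150, score=150, prev_lives=3, lives=2))  # no points, one life lost
```

Because the penalty is much larger than the points from a typical step, a learner that maximizes this reward is pushed to keep the character alive as well as to score.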

The most frequently cited example of reinforcement learning is DeepMind's AlphaGo system. DeepMind, a subsidiary of Alphabet, built AlphaGo to play the ancient Chinese game of Go. Winning strategies for Go are opaque; even grand masters can't always describe why they choose some of the moves they make, and they say the move just “feels right.” There are more possible configurations for pieces on a Go game board than there are atoms in the observable universe. To build a machine that understands the nuances and subtle strategies of this complex game is a monumental challenge.

AlphaGo was not given hand-coded game strategies. It developed its own, first by studying many human-versus-human games and then by playing against itself. In March 2016, AlphaGo played 18-time world champion Lee Sedol, widely regarded as one of the best (human) Go players in the world. AlphaGo beat the legendary player, four games to one. To win, AlphaGo deployed several new strategies that went against hundreds of years of received wisdom among expert players. By observing AlphaGo's approach, human players have improved their play. This story offers an important lesson. Rather than consider AI a threat to our unique humanity and our value within the workplace, we might instead think of AI as a sophisticated partner, one that boosts our skills and that ultimately elevates our humanity.

In 2017, DeepMind's next machine, named AlphaGo Zero, became a master Go player by playing millions of games against itself inside a simulation. It developed game strategies purely through practice, without observing any human play. AlphaGo Zero beat the version of AlphaGo that defeated Lee Sedol by 100 games to 0, and it plays at a level beyond the reach of human grand masters.

Reinforcement learning isn't just used to play games. Researchers at Warsaw University used reinforcement learning to train bipedal robots to walk more efficiently. The AI that controls the robots varies the combinations of movements made by the robot's motors and experiments with different walking strategies. The robot's AI gains a small electronic reward for strategies that improve the overall efficiency and pace of the walk. With this approach, roboticists achieved more efficient and natural-looking walking motions for their robots. One robot learned to walk almost twice as fast as it could using the best initial walking strategy programmed by its human creator.
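
The idea of varying movements and keeping what earns more reward can be sketched with a very simple search. This is not the researchers' method: it is a random-search hill climb, one of the simplest ways to improve a strategy by trial and error. The two gait parameters, the "ideal" values, and the `walking_reward` function are hypothetical stand-ins for measuring a real robot's walk.

```python
import random

# Hypothetical ideal gait, unknown to the learner. On a real robot the
# reward would come from measuring an actual walking trial instead.
IDEAL_STRIDE = 0.6
IDEAL_FREQUENCY = 1.5


def walking_reward(stride, frequency):
    """Toy reward: highest (zero) at the assumed ideal gait, lower elsewhere."""
    return -((stride - IDEAL_STRIDE) ** 2) - (frequency - IDEAL_FREQUENCY) ** 2


def improve_gait(stride, frequency, trials=2000, step=0.05, seed=0):
    """Repeatedly try small random variations and keep the ones that score better."""
    rng = random.Random(seed)
    best = (stride, frequency)
    best_reward = walking_reward(*best)

    for _ in range(trials):
        # Experiment with a slightly different walking strategy.
        candidate = (best[0] + rng.gauss(0, step), best[1] + rng.gauss(0, step))
        reward = walking_reward(*candidate)
        if reward > best_reward:
            # The variation earned more reward, so adopt it.
            best, best_reward = candidate, reward

    return best, best_reward


start = (0.3, 1.0)  # initial strategy "programmed by hand" (assumed values)
gait, reward = improve_gait(*start)
print("Starting reward:", round(walking_reward(*start), 3))
print("Learned gait:", tuple(round(g, 2) for g in gait), "reward:", round(reward, 4))
```

Only improvements are ever kept, so the learned gait can never score worse than the hand-programmed starting point.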

AI's ability to learn from experience is used to solve many business problems, including complex optimizations. AIs optimize traffic control systems, industrial chemical reactions, advertising bids, industrial automation, supply chain flow, product design, warehouse operations, inventory levels, yields, trading strategies, wind turbine controls, medication doses, smart grids, and commercial HVAC systems. Reinforcement learning also teaches AIs to drive. Like humans, AIs learn to drive by practicing. They drive real cars in real-world conditions but also drive millions of miles inside realistic software simulations. In part, Tesla AIs learn to drive from sensor data gathered while owners are driving.
