The Creativity Code - Marcus du Sautoy
‘Beautiful. Beautiful. Beautiful’
It was with a sense of existential anxiety that I fired up the YouTube channel broadcasting the matches that Sedol would play against AlphaGo and joined 280 million other viewers to see humanity take on the machines. Having for years compared creating mathematics to playing the game of Go, I had a lot on the line.
Lee Sedol picked up a black stone and placed it on the board and then waited for the response. Aja Huang, a member of the DeepMind team, would play the physical moves for AlphaGo. This, after all, was not a test of robotics but of artificial intelligence. Huang stared at AlphaGo’s screen, waiting for its response to Sedol’s first stone. But nothing came.
We all stared at our screens wondering if the program had crashed! The DeepMind team was also beginning to wonder what was up. The opening moves are generally something of a formality. No human would think so long over move 2. After all, there was nothing really to go on yet. What was happening? And then a white stone appeared on the computer screen. It had made its move. The DeepMind team breathed a huge sigh of relief. We were off! Over the next couple of hours the stones began to build up across the board.
One of the problems I had as I watched the game was assessing who was winning at any given point. It turns out that this isn’t just because I’m not a very experienced Go player. It is a characteristic of the game. Indeed, this is one of the main reasons why programming a computer to play Go is so hard. There isn’t an easy way to turn the current state of the game into a robust scoring system of who leads by how much.
Chess, by contrast, is much easier to score as you play. Each piece has a different numerical value which gives you a simple first approximation of who is winning. Chess is destructive. One by one pieces are removed so the state of the board simplifies as the game proceeds. But Go increases in complexity as you play. It is constructive. The commentators kept up a steady stream of observations but struggled to say if anyone was in the lead right up until the final moments of the game.
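To make the contrast concrete, the piece-counting idea for chess can be sketched in a few lines of Python. This is only an illustration, not anything taken from a real chess engine: the point values (pawn 1, knight 3, bishop 3, rook 5, queen 9) are the conventional ones and are an assumption here, since the text gives no specific numbers, and the function name `material_balance` is made up for this example. The board is written as the piece-placement field of a FEN string, with uppercase letters for White and lowercase for Black.

```python
# Conventional piece values; an assumption, the text does not specify numbers.
# The king is given 0 because it is never captured and so never counted.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_balance(placement: str) -> int:
    """Return White's material minus Black's for a FEN piece-placement field.

    A positive result means White is ahead on material, a negative one Black.
    """
    balance = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            # Digits (runs of empty squares) and '/' (rank separators).
            continue
        balance += value if ch.isupper() else -value
    return balance


# Starting position: both sides have identical material.
START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
# Same position with one of Black's knights removed.
BLACK_DOWN_A_KNIGHT = "r1bqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"

if __name__ == "__main__":
    print(material_balance(START))
    print(material_balance(BLACK_DOWN_A_KNIGHT))
```

A count like this is crude, but it already gives the first approximation of who is winning. Nothing comparable is available in Go, where every stone is identical and its worth depends entirely on where it sits and what surrounds it.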
What they were able to pick up quite quickly was Sedol’s opening strategy. If AlphaGo had learned to play on games that had been played in the past, then Sedol was working on the principle that it would put him at an advantage if he disrupted the expectations it had built up by playing moves that were not in the conventional repertoire. The trouble was that this required Sedol to play an unconventional game – one that was not his own.
It was a good idea but it didn’t work. Any conventional machine programmed on a database of accepted openings wouldn’t have known how to respond and would most likely have made a move that would have serious consequences in the grand arc of the game. But AlphaGo was not a conventional machine. It could assess the new moves and determine a good response based on what it had learned over the course of its many games. As David Silver, the lead programmer on AlphaGo, explained in the lead-up to the match: ‘AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving.’ If anything, Sedol had put himself at a disadvantage by playing a game that was not his own.
As I watched I couldn’t help feeling for Sedol. You could see his confidence draining out of him as it gradually dawned on him that he was losing. He kept looking over at Huang, the DeepMind representative who was playing AlphaGo’s moves, but there was nothing he could glean from Huang’s face. By move 186 Sedol had to recognise that there was no way to overturn the advantage AlphaGo had built up on the board. He placed a stone on the side of the board to indicate his resignation.
By the end of day one it was: AlphaGo 1 Humans 0. Sedol admitted at the press conference that day: ‘I was very surprised because I didn’t think I would lose.’
But it was game 2 that was going to truly shock not just Sedol but every human player of the game of Go. The first game was one that experts could follow and appreciate why AlphaGo was playing the moves it was. They were moves a human champion would play. But as I watched game 2 on my laptop at home, something rather strange happened. Sedol played move 36 and then retired to the roof of the hotel for a cigarette break. While he was away, AlphaGo on move 37 instructed Huang, its human representative, to place a black stone on the line five steps in from the edge of the board. Everyone was shocked.
The conventional wisdom is that during the early part of the game you play stones on the outer four lines. The third line builds up short-term territory strength on the edge of the board while playing on the fourth line contributes to your strength later in the game as you move into the centre of the board. Players have always found that there is a fine balance between playing on the third and fourth lines. Playing on the fifth line has always been regarded as suboptimal, giving your opponent the chance to build up territory that has both short- and long-term influence.
AlphaGo had broken this orthodoxy built up over centuries of competing. Some commentators declared it a clear mistake. Others were more cautious. Everyone was intrigued to see what Sedol would make of the move when he returned from his cigarette break. As he sat down, you could see him physically flinch as he took in the new stone on the board. He was certainly as shocked as all of the rest of us by the move. He sat there thinking for over twelve minutes. Like chess, the game was being played under time constraints. Using twelve minutes of your time was very costly. It is a mark of how surprising this move was that it took Sedol so long to respond. He could not understand what AlphaGo was doing. Why had the program abandoned the region of stones they were competing over?
Was this a mistake by AlphaGo? Or did it see something deep inside the game that humans were missing? Fan Hui, who had been given the role of one of the referees, looked down on the board. His initial reaction matched everyone else’s: shock. And then he began to realise: ‘It’s not a human move. I’ve never seen a human play this move,’ he said. ‘So beautiful. Beautiful. Beautiful. Beautiful.’
Beautiful and deadly it turned out to be. Not a mistake but an extraordinarily insightful move. Some fifty moves later, as the black and white stones fought over territory from the lower left-hand corner of the board, they found themselves creeping towards the black stone of move 37. It was joining up with this stone that gave AlphaGo the edge, allowing it to clock up its second win. AlphaGo 2 Humans 0.
Sedol’s mood in the press conference that followed was notably different. ‘Yesterday I was surprised. But today I am speechless … I am in shock. I can admit that … the third game is not going to be easy for me.’ The match was being played over five games. This was the game that Sedol needed to win to be able to stop AlphaGo claiming the match.