Category: Programming

SmartOthello

My Othello app is now available in the App Store — check it out at smartothello.com. Even if you’re not interested in Othello/Reversi, it will give you an idea of the future direction of my Go apps. And next time you play Go and somebody asks whether that’s Othello, at least now you have an app you can recommend.

SmartOthello is 100% Swift: it was a perfect way to learn Swift while building up code I can reuse for my Go apps. It’s also my first app to support Game Center, including achievements and leaderboards. My experience with Swift has been really good; my experience with Game Center less so.

SmartOthello is also a reboot in terms of user interface. The clean design that Scott Jensen came up with for Othello will definitely influence the Swift version of SmartGo. For example, the games list sliding in from the left leaves more room for the board on the iPad, and the ability to turn off the status bar provides still more room and less distraction.

The tutorial in SmartGo Player uses Go Books under the hood, so the Swift version of Go Books is up next. Yes, this conversion is taking a while, but I’m planning to live with these apps for many more years. After launching my first Swift app, I’m more convinced than ever that the investment is worth it.

Othello

That separate Swift project I hinted at in December? Time to announce what it is: an app for Othello (also known as Reversi).

Why Othello?

As a two-player board game, Othello is similar enough to Go that much of the Swift code for an Othello app can be reused for Go. But Othello apps are a dime a dozen in the App Store: who needs another one? Well, you do — you deserve better than the current crop of Othello apps.

Relevant experience

Most people associate me with only one game: Go. However, I do have a bit of history with Othello.

  • Computer Othello: My first Othello program played in a tournament in Santa Cruz in 1981, long before I first made it to the USA. My work on Othello got Prof. J. Nievergelt to introduce me to Go, and my Ph.D. thesis included a chapter on Othello (“Smart Game Board: a Workbench for Game-Playing Programs, with Go and Othello as Case Studies”).
  • Human Othello: I was Swiss Othello Champion in 1983, 84, 85, and 89, and United States Othello Champion in 1992. My tournament experience includes six Othello World Championships: Paris (1983 & 1988), Warsaw (1989), Stockholm (1990), New York (1991), and Barcelona (1992).

Unique combination

So yes, combining years of iPhone development, user interface experience from SmartGo, and expert knowledge of Othello, I do think I have something unique to bring to a crowded field of Othello apps.

I’ve been working on SmartOthello with designer Scott Jensen (@_scottjensen); it’s making good progress, and I have just started limited beta testing. I’m very excited about how it’s turning out, and what it means for the future of my Go apps.

More later. Meanwhile, you can sign up for news about SmartOthello at smartothello.com, and follow @smartOthello on Twitter or Facebook.

PS: I played in an Othello tournament in Los Angeles in March: 4 wins and 6 losses, definitely a bit rusty. At least I scored a 33-31 win against former World Champion Ben Seeley.

Wishful Thinking

Lee Sedol’s strategy in game 4 worked brilliantly (well explained in the excellent Go Game Guru commentary). It took AlphaGo from godlike play to kyu-level petulance. When it no longer saw a clear path to victory, it started playing moves that made no sense.

AlphaGo optimizes its chance of winning, not its margin of victory. As long as that chance of winning was good, this worked well. When the chance of winning dropped, AlphaGo’s quality of play fell precipitously. Why?

Ineffective threats

The bad moves that AlphaGo played include moves 87 and 161: threats that just don’t work, as they can easily be refuted, and that either lose points or at least reduce future opportunities. When AlphaGo plays such a move, it’s smart enough to find the correct local answer and figure out that the move doesn’t actually work. However, the Monte Carlo Tree Search component (MCTS) will also look at other moves that don’t answer that threat, as there is always a chance that the opponent plays elsewhere. Thus AlphaGo sees a non-zero chance that this threat actually works, and because of the way MCTS calculates its statistics, it thinks that this increases its chance of winning.

Of course, the opposite is true. Playing a threat that can easily be refuted is just wishful thinking. The value network would figure out that such an exchange actually makes the position worse, but it doesn’t know that it should override the Monte Carlo simulations in this case.

Adjusting komi

One way to avoid this effect is to internally adjust the komi until the program has a good chance of winning. This causes the program to play what it thinks are winning moves, while in fact it will lose by the few points by which you artificially adjusted the score. If the opponent makes a mistake, the program might regain a real winning position later. (SmartGo uses this technique; it also helps play more reasonable moves in handicap games.)
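A minimal sketch of that komi adjustment, under stated assumptions: `adjust_bonus`, `toy_win_rate`, the logistic win-rate model, and all numbers are hypothetical and only illustrate the idea. This is not how SmartGo actually implements it.

```python
import math

def adjust_bonus(estimate_win_rate, target=0.5, step=1.0, max_bonus=30.0):
    """Grant the program imaginary bonus points (an internal komi shift)
    until its estimated win rate reaches the target."""
    bonus = 0.0
    while bonus < max_bonus and estimate_win_rate(bonus) < target:
        bonus += step
    return bonus

def toy_win_rate(bonus, expected_margin=-5.5, scale=4.0):
    """Hypothetical stand-in for a real search: the program is about
    5.5 points behind, and win rate is a logistic function of the margin."""
    return 1.0 / (1.0 + math.exp(-(expected_margin + bonus) / scale))

bonus = adjust_bonus(toy_win_rate)
print(f"search as if ahead by {bonus} extra points; "
      f"adjusted win rate {toy_win_rate(bonus):.2f}")
```

With the bonus in place, the search is again choosing between moves that look winnable, so it keeps playing sound moves instead of desperate ones. It will still lose by roughly the bonus unless the opponent errs.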

For AlphaGo, that technique won’t work well: as I understand it, the value network is trained to recognize whether positions are good for Black or for White, not by how many points a player is ahead.

Known unknowns

Another idea is to look at the source of uncertainty in MCTS. The Monte Carlo winning percentages are based on statistics from the playouts, and there are many uncertainties in that process due to the random nature of the playouts and the limited nature of the search. The more moves you look at, the smaller the unknowns become, and the statistical methods used to figure out which moves to explore more deeply and how to back up results in the search tree try to minimize these uncertainties.
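One widely used statistical method for deciding which moves to explore more deeply is the UCB1 formula. The sketch below assumes plain UCB1 with made-up statistics; AlphaGo and SmartGo may well use different selection rules, and the function names are hypothetical.

```python
import math

def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """Win rate plus an uncertainty bonus that shrinks as a move is visited more."""
    if visits == 0:
        return math.inf  # unexplored moves are tried first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_move(stats):
    """stats maps move -> (wins, visits); return the move to explore next."""
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda move: ucb1(*stats[move], parent_visits))

# Move A has the better win rate (60%) but has been searched a lot;
# move B (50%) has been searched little, so its uncertainty bonus is larger.
stats = {"A": (60, 100), "B": (5, 10)}
print(select_move(stats))
```

Here the less-explored move B is selected even though A has the higher win rate, which is how the search reduces its unknowns over time.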

However, whether the opponent will answer a threat is a yes-or-no decision; it should not be treated like a statistical unknown. In that case, you want to back up the results in the tree using minimax, not percentages. Something for the DeepMind team to work on before they challenge Ke Jie, so AlphaGo won’t throw another tantrum.
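A toy comparison of the two ways to back up results, with made-up numbers: the win rates and visit counts are hypothetical, and real MCTS shifts visits toward the best reply as the search deepens, so this only shows the direction of the effect.

```python
def average_backup(replies):
    """MCTS-style backup: visit-weighted mean of the win rates after each reply."""
    total_visits = sum(visits for _, visits in replies)
    return sum(win_rate * visits for win_rate, visits in replies) / total_visits

def minimax_backup(replies):
    """Minimax backup: assume the opponent picks the reply that is worst for us."""
    return min(win_rate for win_rate, _ in replies)

baseline = 0.30  # hypothetical win rate if the threat is not played

# Opponent replies to the threat, as (our win rate, visits in the search):
replies = [
    (0.28, 800),  # correct local answer: threat refuted, we lose a little
    (0.60, 200),  # opponent ignores the threat and plays elsewhere
]

avg = average_backup(replies)
mm = minimax_backup(replies)
print(f"baseline {baseline:.3f}, averaged {avg:.3f}, minimax {mm:.3f}")
```

Averaging makes the threat look better than the baseline because the unanswered line still gets some weight; minimax rates it below the baseline, which matches what a human would conclude.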

AlphaGo Don’t Care

AlphaGo is badass. Like the honey badger, AlphaGo just don’t care.

Lee Sedol may have underestimated AlphaGo in game 1, but he knew what he was up against in game 2. I watched Michael Redmond’s commentary during the game, then Myungwan Kim’s commentary this morning. The Go Game Guru commentary is also very helpful.

The tenuki at move 13: Professionals always extend at the bottom first? AlphaGo don’t care. It builds a nice position at the top instead.

The peep at move 15: This is usually played much later in the game, and never without first extending on the bottom. AlphaGo don’t care. It adds 29 later, and makes the whole thing work with the creative shoulder hit of 37. It even ends up with 10 points of territory there.

With 64 and 70, Lee Sedol made his group invulnerable to prepare for a fight at the top. AlphaGo don’t care, it just builds up its framework, and then shows a lot of flexibility in where it ends up with territory.

Lee Sedol threatens the territory at the top with 166? AlphaGo don’t care, it just secures points in the center instead. Points are points, it doesn’t matter where on the board they are.

What can Lee Sedol do in the next games? I think he needs to get a complicated fight going early in the game, start ko fights, in general increase the complexity. But I fear AlphaGo just won’t care.

Four More Games

AlphaGo’s victory over Lee Sedol last night was stunning. I’m still gathering my thoughts and trying to figure out what happened.

The game analysis at Go Game Guru has been very helpful. But I have to wonder whether I can trust the commentary — maybe AlphaGo knew what it was doing?

Move 80 was described by Younggil An 8p as ‘slack’. I wonder whether AlphaGo at that point already calculated that it was winning, and that eliminating the aji (latent possibilities) in that area would be the best way to reduce the risk of losing. I would love to know more about AlphaGo’s evaluation of that move.

AlphaGo demonstrated that it’s good at fighting, and would not back down from a fight. It also showed excellent positional judgement and timing, managing to invade on the right side with 102, get just enough out of that fight, and end with sente to play the huge move of 116 to take the upper left corner. And it’s not letting up in the endgame once it sees a path to victory. We have not seen any ko fights yet, but there’s no reason to believe AlphaGo couldn’t handle those well.

For the remaining games, I think Lee Sedol must establish a lead by mid game at the latest to have a chance of winning. As the game gets closer to the end, there are fewer moves to consider, and there are fewer moves for the Monte Carlo playouts to reach the end of the game, so AlphaGo will just get stronger.

Move 7 was a new move. At least, it’s not in the GoGoD database of 85,000 professional game records. With SmartGo’s side-matching, only two games (both played in 2013) match that right-side position. Lee Sedol probably tried to make sure AlphaGo couldn’t just rely on known patterns, but that gambit didn’t pay off. I don’t think he will try a similar move tonight.

There are four more games; I would not count Lee Sedol out yet. He now knows what AlphaGo can do, and won’t underestimate it again. We have some very exciting games to look forward to.

Late Nights with AlphaGo

Google has announced times and time limits for Lee Sedol’s match against AlphaGo. Games start at 13:00 (UTC+9), which means they’ll start in the evening the day before in the US:

  • Tuesday March 8: 8 p.m. PST, 11 p.m. EST
  • Wednesday March 9: 8 p.m. PST, 11 p.m. EST
  • Friday March 11: 8 p.m. PST, 11 p.m. EST
  • Saturday March 12: 8 p.m. PST, 11 p.m. EST
  • Monday March 14: 9 p.m. PDT, midnight EDT (DST!)
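The conversion can be double-checked with a short script. This is a sketch that assumes the games are played in Seoul on March 9, 10, 12, 13, and 15 (one day after each US date above) and that Python 3.9+ with time zone data is available; `us_start_times` is a hypothetical helper name.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

SEOUL = ZoneInfo("Asia/Seoul")             # UTC+9, no daylight saving time
PACIFIC = ZoneInfo("America/Los_Angeles")
EASTERN = ZoneInfo("America/New_York")

GAME_DAYS_IN_KOREA = [9, 10, 12, 13, 15]   # March 2016, assumed from the list above

def us_start_times(day):
    """Convert a 13:00 start in Seoul to Pacific and Eastern local time."""
    start = datetime(2016, 3, day, 13, 0, tzinfo=SEOUL)
    return start.astimezone(PACIFIC), start.astimezone(EASTERN)

for day in GAME_DAYS_IN_KOREA:
    pacific, eastern = us_start_times(day)
    print(pacific.strftime("%a %b %d %H:%M %Z"), "|",
          eastern.strftime("%a %b %d %H:%M %Z"))
```

The time zone database handles the switch to daylight time on March 13, which is why only the last game moves an hour later in the US.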

And Michael Redmond 9p will be commenting in English. Mark your calendars and stock up on popcorn.

Time limits are two hours per player, plus 3×1-minute byo-yomi. Thus after basic time is up, players can use up to a minute for every move; three times they can spend an extra minute. The article says each game is thus expected to last 4-5 hours, but if AlphaGo uses its full two hours (instead of playing very fast as in the Fan Hui match), it could easily go longer. Be prepared for some late nights.

I’m glad they increased the time limits from the Fan Hui match; this should be a very exciting match. Just one new tidbit since my Lee Sedol vs AlphaGo blog post — on the Computer Go mailing list, Aja Huang commented today: “We are still preparing hard for the match. … AlphaGo is getting stronger and stronger.”

Lee Sedol vs AlphaGo

Google’s AlphaGo beat the European Go champion Fan Hui in October, and Google has challenged Lee Sedol to a five-game match in March. How can we assess his chances?

Analysis of October games

As an amateur 3 dan, I can understand much of what’s going on in the five games AlphaGo played against Fan Hui, but professional analysis reveals subtleties and deeper issues.

Also, this PDF by the British Go Association includes some history of computer Go, more background on the match, and an analysis with comments by Hajin Lee 3p.

The conclusions I draw from all of this:

  • Fan Hui made a number of mistakes that Lee Sedol is unlikely to make.
  • While AlphaGo played very well, it did make some mistakes in those five games. Also, Fan Hui did win two unofficial games against AlphaGo (sadly unpublished).
  • AlphaGo’s reading (looking ahead many moves to determine whether a plan will work or not) is very strong.
  • AlphaGo sometimes mimics the play of professional players and follows standard patterns that may not be optimal in that specific situation. Professional players are more creative and will vary their play more based on subtle differences in other parts of the board.
  • AlphaGo may not have a nuanced enough understanding of the value of sente (having the initiative).
  • AlphaGo doesn’t show deep understanding of why a move is played, or the far-reaching effects of a move.

So I’m confident that the October version of AlphaGo was weaker than Lee Sedol. And those five games only give us a limited view of AlphaGo; there are probably more weaknesses to be discovered. Ko was only played once; AlphaGo did well, but we don’t know how it will do in a complex, protracted ko fight. We don’t know how it will do when the fighting gets more complex. We don’t know how it will do when the board is more fluid and multiple local positions are left unresolved.

March version of AlphaGo

Google has five months to improve AlphaGo. So what can they do?

During the Deep Blue match, engineers could make adjustments to Deep Blue’s algorithm between games, such as fixing a bug after game 1. For AlphaGo, it may not be as easy. For example, move 31 in game 2 (shown below) was a mistake by AlphaGo (see Myungwan Kim’s analysis). It’s the right local move under the right circumstances, but AlphaGo doesn’t have a sufficient understanding of that board position. How can they fix that issue? Feeding that position to the neural network won’t help balance out what it has learned in 30 million other positions. There’s no quick fix for that kind of mistake.

[Figure: AlphaGo vs Fan Hui, game 2, move 31]

Avoiding specific mistakes is easier. AlphaGo was not using an opening library in October, but Google could easily add that by March, making it at least possible to adjust play between games so Lee Sedol won’t be able to exploit a particular joseki mistake in multiple games.

There are many other improvements Google can make before March:

  • Google can refine AlphaGo’s neural networks. They used 30 million positions to train the value network for the Fan Hui match — they can use 100 million for the Lee Sedol match.
  • They can add extra training for the rollout policy. And then feed the improved rollout policy into the training of the value network.
  • They can fine-tune the balance between rollouts and the value network.
  • They can throw more computing power at the match itself. And the match will likely have longer time limits, so AlphaGo can calculate more during the opponent’s time.
  • And more. Google has a strong team that wants to win, and they’ll have other ideas up their sleeves.

Google has also expressed confidence, and they chose the opponent and timing for this next match. So I’d expect the March version of AlphaGo to be significantly stronger than the one that played last October. Its reading is going to be even better. Its assessment of the global position will be improved.

The main question is whether those improvements will be enough to remedy or at least balance out the weaknesses seen in the October version. Google made a huge leap forward with AlphaGo, creating a qualitative difference in computer play, not just a quantitative one. It’s hard to tell what another five months of work and neural network training can do.

Conclusion

The AlphaGo from last October was very strong, but probably not strong enough to beat Lee Sedol. With five months of work, I think AlphaGo is going to be a different beast in March, and it will be a very exciting match. If I had to bet? Lee Sedol will lose a game or two but win the match.