KataGo and ChatGPT Failures

AlphaGo was an amazing breakthrough, and its ability to win against professional go players was very impressive. It was really surprising that evaluating a go position with neural nets and machine learning would work as well as it did, and yet it kept beating professionals.

But now, some cracks are showing. In their excellent paper “Adversarial Policies Beat Superhuman Go AIs” (https://goattack.far.ai), Tony Wang, Adam Gleave, et al. use an adversarial approach to find techniques that work against KataGo. And indeed: after a few tries, I managed to kill a huge group and win, and KataGo did not see it coming until it was too late. KataGo doesn’t realize that when its circular group surrounds a dead group with a large eye, the circular group needs enough outside liberties to actually capture that dead group.

[Figure: KataGo losing. KataGo doesn’t realize the dead white group has more liberties than the surrounding black group.]

This technique takes advantage of KataGo (1) not knowing enough about liberties and (2) not knowing enough about the topology of blocks on the go board. The inputs to its model include special features only for 1, 2, or 3 liberties. There’s also no explicit concept of adjacent blocks; the neural net has to learn that concept from the board position. And the way the neural net is trained, it doesn’t create good enough abstractions for those concepts. In normal games, what it learns is sufficient to play better than any human, but in corner cases, it falls apart.
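To make the liberty point concrete, here is a minimal Python sketch. It is not KataGo's actual code: the board format, the function names, and the three-plane encoding are assumptions for illustration, based only on the description above (special inputs for 1, 2, or 3 liberties). It finds each block by flood fill, counts its liberties, and then shows that blocks with 4 and with 6 liberties look identical in such an encoding.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int]

# Toy 5x5 position: 'X' = black, 'O' = white, '.' = empty (hypothetical example).
ROWS = [
    "X....",
    ".XX..",
    ".....",
    "...O.",
    ".....",
]


def parse(rows: List[str]) -> Dict[Point, str]:
    """Map (row, col) to the character at that point."""
    return {(r, c): ch for r, row in enumerate(rows) for c, ch in enumerate(row)}


def neighbors(p: Point, board: Dict[Point, str]) -> List[Point]:
    """Orthogonal neighbors of p that are on the board."""
    r, c = p
    return [q for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)) if q in board]


def block_and_liberties(board: Dict[Point, str], start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood fill from a stone: return its block and that block's liberties."""
    color = board[start]
    block, libs, stack = {start}, set(), [start]
    while stack:
        for q in neighbors(stack.pop(), board):
            if board[q] == ".":
                libs.add(q)
            elif board[q] == color and q not in block:
                block.add(q)
                stack.append(q)
    return block, libs


def liberty_planes(board: Dict[Point, str], size: int) -> List[List[List[int]]]:
    """Assumed encoding: plane k marks stones whose block has exactly k+1 liberties."""
    planes = [[[0] * size for _ in range(size)] for _ in range(3)]
    for (r, c), ch in board.items():
        if ch == ".":
            continue
        n = len(block_and_liberties(board, (r, c))[1])
        if 1 <= n <= 3:
            planes[n - 1][r][c] = 1
    return planes


board = parse(ROWS)
planes = liberty_planes(board, 5)
for point in [(0, 0), (1, 1), (3, 3)]:
    n_libs = len(block_and_liberties(board, point)[1])
    encoded = tuple(planes[k][point[0]][point[1]] for k in range(3))
    print(point, "liberties:", n_libs, "encoded:", encoded)
# (0, 0) liberties: 2 encoded: (0, 1, 0)
# (1, 1) liberties: 6 encoded: (0, 0, 0)
# (3, 3) liberties: 4 encoded: (0, 0, 0)
```

In this sketch the blocks with 4 and 6 liberties produce the same all-zero encoding, so any distinction between them has to be learned from the raw stone positions.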

I would argue that blocks of stones, liberty counts, and capturing races are an essential part of the underlying model you need when playing go (see e.g. Richard Hunter’s book “Counting Liberties and Winning Capturing Races”). And machine learning (at least the way we’re doing it now) is not a great way to build that model. You’ll end up with gaps in knowledge and approximations that fail at critical points.
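As a rough illustration of what such an explicit model could look like, the Python sketch below builds the blocks on a toy position, records which opposing blocks touch each other, and applies the simplest capturing-race rule: with no eyes and no shared liberties, the block with more liberties wins, and with equal liberties the side that moves first wins. The position, names, and data layout are hypothetical. Races involving eyes (such as the large eye in the game above) or shared liberties need more rules than this.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Point = Tuple[int, int]
Block = Tuple[str, FrozenSet[Point], FrozenSet[Point]]  # (color, stones, liberties)

# Toy race: black (1,2)-(1,3) against white (1,4)-(1,5), each enclosed by the opponent.
ROWS = [
    "........",
    ".OXXOOX.",
    ".OOOXXX.",
    "........",
]


def parse(rows: List[str]) -> Dict[Point, str]:
    """Map (row, col) to 'X' (black), 'O' (white), or '.' (empty)."""
    return {(r, c): ch for r, row in enumerate(rows) for c, ch in enumerate(row)}


def neighbors(p: Point, board: Dict[Point, str]) -> List[Point]:
    """Orthogonal neighbors of p that are on the board."""
    r, c = p
    return [q for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)) if q in board]


def find_blocks(board: Dict[Point, str]) -> List[Block]:
    """Flood fill every stone once; return all blocks with their liberties."""
    blocks: List[Block] = []
    seen: Set[Point] = set()
    for start, color in board.items():
        if color == "." or start in seen:
            continue
        stones, libs, stack = {start}, set(), [start]
        while stack:
            for q in neighbors(stack.pop(), board):
                if board[q] == ".":
                    libs.add(q)
                elif board[q] == color and q not in stones:
                    stones.add(q)
                    stack.append(q)
        seen |= stones
        blocks.append((color, frozenset(stones), frozenset(libs)))
    return blocks


def enemy_adjacency(board: Dict[Point, str], blocks: List[Block]) -> Set[Tuple[int, int]]:
    """Index pairs (i, j), i < j, of opposing blocks that touch each other."""
    owner = {p: i for i, (_, stones, _) in enumerate(blocks) for p in stones}
    pairs: Set[Tuple[int, int]] = set()
    for i, (color, stones, _) in enumerate(blocks):
        for p in stones:
            for q in neighbors(p, board):
                j = owner.get(q)
                if j is not None and blocks[j][0] != color:
                    pairs.add((min(i, j), max(i, j)))
    return pairs


def simple_race(a: Block, b: Block, a_moves_first: bool) -> str:
    """Winner's color in the simplest race: no eyes, no shared liberties."""
    if a[2] & b[2]:
        raise ValueError("shared liberties: the simple rule does not apply")
    if len(a[2]) != len(b[2]):
        return a[0] if len(a[2]) > len(b[2]) else b[0]
    return a[0] if a_moves_first else b[0]


board = parse(ROWS)
blocks = find_blocks(board)
black = next(b for b in blocks if (1, 2) in b[1])
white = next(b for b in blocks if (1, 4) in b[1])
print(len(black[2]), len(white[2]))                    # 2 2
print(simple_race(black, white, a_moves_first=True))   # X
print(simple_race(black, white, a_moves_first=False))  # O
```

The point of the sketch is that blocks, adjacency, and liberty counts are cheap to compute exactly, whereas a network trained only on board positions has to approximate them.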

Against that background, the failures of ChatGPT make more sense. Machine learning didn’t build a model of the world; it just learned to put words together in a way that seems to make sense. The results are often impressive, but a lot of recent examples demonstrate that it doesn’t actually understand what’s going on.

And machine learning for self-driving cars is also based on lots of inputs, but only a very limited model of the world. Like KataGo, it will fail in corner cases. And that’s scary.