The only limitation of both older models like HMMs and the current chatbots is that they have no concept of the state of the chess board.
I'm not so sure. I was recently working on some font rendering software with ChatGPT, parsing the glyphs out of TrueType files. Each glyph is described by contours: closed loops of (x, y) points that define quadratic Bézier curves. I had parsed out the (x, y) points for the contours of the character 'a' in some rare font, and I gave ChatGPT those points (without telling it which character they came from). It suddenly occurred to me to ask, as a side question, whether it could recognize the character from the contour points alone, and it could. It said it "looked" like the letter 'a'.
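To give a sense of the kind of data I mean, here is a rough sketch of that extraction step using fontTools in Python. This is not my exact code: the font path is a placeholder, and the unusual normalization I applied isn't shown.

```python
# Sketch: pull the contour points for the character 'a' out of a
# TrueType-flavoured font with fontTools. The filename is a placeholder.
from fontTools.ttLib import TTFont

font = TTFont("SomeRareFont.ttf")          # placeholder path
glyph_name = font.getBestCmap()[ord("a")]  # map the character 'a' to its glyph name
glyf = font["glyf"]
glyph = glyf[glyph_name]

# getCoordinates() returns a flat list of points, the index of the last point
# in each contour, and per-point flags (bit 0 set = on-curve point).
coords, end_pts, flags = glyph.getCoordinates(glyf)
points = list(coords)                      # [(x, y), ...]

# Split the flat point list into one closed loop of (x, y) points per contour.
contours, start = [], 0
for end in end_pts:
    contours.append(points[start:end + 1])
    start = end + 1

for i, contour in enumerate(contours):
    print(f"contour {i}: {contour}")
```

The loops of (x, y) points that come out of something like this, suitably rescaled, are essentially what I pasted into the chat.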
How in the world can a language model do that? The data had been normalized in an unusual way and the font was very rare; there is zero chance it had memorized these points from its training set. It absolutely must have visualized the points somehow in its "latent space" and observed that the shape looked like an 'a'. When I asked it how it knew it was the letter 'a', it correctly described the shape of the two curves (the hole in the middle, the upward and downward parts on the right side) and walked through its reasoning.
There is absolutely more going on inside these Transformer-based neural networks than people understand. It appears they have learned how to think in a human-like way, at human levels of abstraction, from reading millions of books. In particular, they have learned how to visualize like we do, purely from reading natural-language text. It wouldn't surprise me at all if they can visualize a chess board position from a sequence of moves.