Analysing 1,583,444 Chess Games Using Small Data • Peter Ellis Jones

Recently I’ve been working on a chess engine in Rust. After completing the move generation part of the engine (source code now available here), I’m now working on evaluation. Evaluation answers the question: given two chess positions, which is better, and by how much? It’s possibly the most important part of a chess engine: it gives the engine its character and is the determining factor in its performance.

In chess, you win by checkmating the opponent’s king. Therefore, given two chess positions, one where your opponent is checkmated and one where she isn’t, the first is objectively better because you’ve won the game. For all other positions, however, we can only evaluate them using heuristics, i.e. by making educated guesses about how likely they are to help you checkmate your opponent.

Some examples of heuristics used by modern chess engines are:

A queen is worth nine pawns
A pawn near the other side of the board is good because it’s easier to promote it to a queen
Two bishops on different coloured squares are better than two bishops on the same coloured squares as they won’t block each other
Knights are good when the board is crowded since they can jump over pieces

This is just a small list, but modern chess engines typically evaluate hundreds of features. For example, contributors to the open source chess engine Stockfish are constantly tweaking these heuristics to try and give the engine an edge in thousands of simulated games.
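As a rough illustration of the simplest of these heuristics, here is a minimal material-count sketch. It is not the evaluation from my engine or from Stockfish. Only the "queen is worth nine pawns" ratio comes from the list above; the other piece values are the conventional textbook ones and are assumptions, as are the function names `piece_value` and `material_score`. The input is the piece-placement field of a FEN string, where uppercase letters are White's pieces and lowercase letters are Black's.

```rust
/// Value of one piece in centipawns, given its FEN letter.
/// Positive for White (uppercase), negative for Black (lowercase).
/// Only queen = 9 pawns is taken from the article; the rest are assumed.
fn piece_value(c: char) -> i32 {
    let v = match c.to_ascii_lowercase() {
        'p' => 100,
        'n' => 300,
        'b' => 300,
        'r' => 500,
        'q' => 900,
        _ => 0, // kings, digits and '/' carry no material value
    };
    if c.is_ascii_uppercase() { v } else { -v }
}

/// Sum material over the piece-placement field of a FEN string.
/// A positive score means White is ahead on material.
fn material_score(fen_placement: &str) -> i32 {
    fen_placement.chars().map(piece_value).sum()
}

fn main() {
    let start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR";
    println!("{}", material_score(start)); // prints 0

    // Same position with Black's queen removed.
    let no_black_queen = "rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR";
    println!("{}", material_score(no_black_queen)); // prints 900
}
```

A real evaluation function would add positional terms (advanced pawns, the bishop pair, knight mobility and so on) on top of a score like this.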

To get an idea of which heuristics might matter for my chess engine, I decided to look at some historical data. King Base is a free chess database of over a million real games played in competitions between top players (ELO rating greater than 2000) from 1990 until now. The games are all in Portable Game Notation (PGN), so I wrote a quick parser in Rust and extracted some stats to CSV. I then loaded the 150 MB file into Google’s BigQuery to ask some questions of the resulting 1,582,444 games.
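To give a flavour of the extraction step, here is a minimal sketch of pulling a few header tags out of one game and emitting a CSV row. It is not the parser I actually used. The function names and the choice of columns are assumptions made for illustration; `Result` is a standard PGN tag, and `WhiteElo` and `BlackElo` are common optional ones. The sketch ignores the move text and does not handle escaped quotes inside tag values.

```rust
use std::collections::HashMap;

/// Parse one PGN tag-pair line such as `[WhiteElo "2650"]`.
/// Returns None for anything that is not a tag pair (blank lines, moves).
fn parse_tag(line: &str) -> Option<(&str, &str)> {
    let inner = line.trim().strip_prefix('[')?.strip_suffix(']')?;
    let (name, rest) = inner.split_once(' ')?;
    let value = rest.trim().strip_prefix('"')?.strip_suffix('"')?;
    Some((name, value))
}

/// Look up a tag, falling back to an empty CSV field when it is missing.
fn field<'a>(tags: &HashMap<&'a str, &'a str>, key: &str) -> &'a str {
    tags.get(key).copied().unwrap_or("")
}

/// Turn the header of a single game into one CSV row:
/// white_elo,black_elo,result
fn game_to_csv_row(pgn: &str) -> String {
    let tags: HashMap<&str, &str> = pgn.lines().filter_map(parse_tag).collect();
    format!(
        "{},{},{}",
        field(&tags, "WhiteElo"),
        field(&tags, "BlackElo"),
        field(&tags, "Result")
    )
}

fn main() {
    // A made-up game header, not taken from the database.
    let pgn = r#"[White "Player A"]
[Black "Player B"]
[Result "1-0"]
[WhiteElo "2650"]
[BlackElo "2510"]

1. e4 e5 1-0
"#;
    println!("white_elo,black_elo,result");
    println!("{}", game_to_csv_row(pgn));
}
```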

In this article I’m just looking at basic data like player ELO and game length. In the next article I’ll look at positional features and how they relate to game outcomes.

Is it better to play as white?

Yes! White wins 36.42% of games, whereas Black wins only 28.08%.
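Percentages like these come from a simple tally over the result column. The sketch below shows the arithmetic in Rust rather than as a BigQuery query; the function name `outcome_shares` is an assumption, and the four results in `main` are toy data, not a sample from the database. The result strings `1-0`, `0-1` and `1/2-1/2` are the standard PGN markers for a White win, a Black win and a draw.

```rust
/// Share of White wins, Black wins and draws, as percentages of all games.
/// Returns NaN values for an empty slice, since there is nothing to divide by.
fn outcome_shares(results: &[&str]) -> (f64, f64, f64) {
    let total = results.len() as f64;
    let pct = |tag: &str| {
        let n = results.iter().filter(|r| **r == tag).count();
        100.0 * n as f64 / total
    };
    (pct("1-0"), pct("0-1"), pct("1/2-1/2"))
}

fn main() {
    // Toy data for illustration only.
    let results = ["1-0", "1-0", "0-1", "1/2-1/2"];
    let (white, black, draw) = outcome_shares(&results);
    println!("white {white:.2}%  black {black:.2}%  draw {draw:.2}%");
}
```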

How much does ELO difference matter?

ELO is the standard rating system for chess. As you can see from the chart below, when players are evenly matched, you can expect to win about 25% of the time. To raise this to 50%, you need a rating advantage of 140 ELO points.
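For comparison, the rating system itself predicts an expected score from the rating difference using the standard logistic formula, E = 1 / (1 + 10^(-d/400)). Note that expected score counts a draw as half a point, so it is not the same quantity as the win percentage in the chart, and the two should not be expected to match. The function name `expected_score` below is an assumption.

```rust
/// Expected score (win = 1, draw = 0.5, loss = 0) for a player whose
/// rating is `rating_diff` points above the opponent's, using the
/// standard Elo formula. This is not the empirical win rate from the data.
fn expected_score(rating_diff: f64) -> f64 {
    1.0 / (1.0 + 10f64.powf(-rating_diff / 400.0))
}

fn main() {
    println!("{:.3}", expected_score(0.0)); // prints 0.500
    println!("{:.3}", expected_score(140.0)); // prints 0.691
}
```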

How long do games last?

This chart looks at the number of moves (technically, half-moves) in a game, broken down by game outcome. The spike at 80 half-moves (that is, 40 full moves) is to do with time controls: the answer given to me on chess.stackexchange.com is that players are often short on time leading up to 40 moves, 60 moves, and so on, depending on the particular rules of the tournament. Once time pressure is relaxed at, e.g., the 40-move mark, players are more likely to consider resigning.
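A chart like this is a histogram over half-move counts. Here is a minimal sketch of the bucketing, assuming the half-move count of each game has already been extracted; the function name `length_histogram` and the bucket width are assumptions, and the counts in `main` are toy data. With a width of 10, a game that ends right after move 40 (80 or 81 half-moves) lands in the bucket keyed 80.

```rust
use std::collections::BTreeMap;

/// Count games per bucket of `width` half-moves.
/// Each bucket is keyed by its lower bound, e.g. 80 covers 80..=89 for width 10.
fn length_histogram(ply_counts: &[u32], width: u32) -> BTreeMap<u32, usize> {
    let mut hist = BTreeMap::new();
    for &plies in ply_counts {
        // Integer division rounds down to the start of the bucket.
        *hist.entry(plies / width * width).or_insert(0) += 1;
    }
    hist
}

fn main() {
    // Toy data for illustration only.
    let ply_counts = [79, 80, 81, 85, 120];
    for (bucket, games) in length_histogram(&ply_counts, 10) {
        println!("{bucket:>4}+ half-moves: {games} game(s)");
    }
}
```

Running the same function once per outcome (White win, Black win, draw) would give the separate series shown in the chart.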

Do good players draw more often?

Yes! From the chart below you can see that draws are much more frequent between high-ranking players.
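The underlying calculation is a group-by: put each game into a band by the average rating of the two players, then divide draws by games within each band. The sketch below shows one way to do that in Rust; the function name `draw_rate_by_band`, the use of the average rating as the grouping key and the 100-point band width are all assumptions, and the games in `main` are toy data rather than real results.

```rust
use std::collections::BTreeMap;

/// Draw rate (0.0 to 1.0) per band of average rating.
/// Each game is (white_elo, black_elo, result); bands are keyed by lower bound.
fn draw_rate_by_band(games: &[(u32, u32, &str)], band: u32) -> BTreeMap<u32, f64> {
    // band -> (draws, total games)
    let mut tally: BTreeMap<u32, (u32, u32)> = BTreeMap::new();
    for &(white, black, result) in games {
        let key = (white + black) / 2 / band * band;
        let entry = tally.entry(key).or_insert((0, 0));
        if result == "1/2-1/2" {
            entry.0 += 1;
        }
        entry.1 += 1;
    }
    tally
        .into_iter()
        .map(|(k, (draws, total))| (k, draws as f64 / total as f64))
        .collect()
}

fn main() {
    // Toy data for illustration only.
    let games = [
        (2000, 2100, "1-0"),
        (2050, 2050, "1/2-1/2"),
        (2600, 2700, "1/2-1/2"),
        (2650, 2650, "1/2-1/2"),
    ];
    for (band, rate) in draw_rate_by_band(&games, 100) {
        println!("{band}+: {:.0}% draws", rate * 100.0);
    }
}
```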

That’s all for now. Thanks to Plotly for their amazing (free!) online charting software.