How to optimize a build for a chess engine?

An update: I built a new box. 13900k, RTX 3080, 64GB of DDR5-4800. I can get right around 27 million nodes/second in Stockfish 15.1, which feels pretty damn insane. It's still bizarrely not that much better than the 20 million nodes/second that a 9900k should be getting.
Hold your horses, 27 is 35% faster than 20. That is significant.
 

BillFoster

Ars Tribunus Militum
2,221
Obvious question, are all the cores busy, and what clock speeds/temps are you seeing?
I'm running 24 Stockfish threads, one on each of the 24 cores. The 13900k is thermally throttling, and is drawing around 300W. Stockfish is using 44GB of RAM for the hash table. P-cores are running at 5.5GHz.

It's a pretty insane level of computation.
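For anyone wanting to reproduce that setup outside a GUI, here is a minimal sketch of the UCI commands involved. `Threads` and `Hash` are standard Stockfish UCI options (Hash is given in MB); the helper name `uci_setup_commands` is made up for illustration, and the values are just the ones mentioned in this post.

```python
def uci_setup_commands(threads: int, hash_mb: int) -> list[str]:
    """Build the UCI commands that set thread count and hash table size.

    Hypothetical helper for illustration; it only builds the command text
    and does not launch or talk to an engine.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if hash_mb < 1:
        raise ValueError("hash_mb must be at least 1")
    return [
        "uci",
        f"setoption name Threads value {threads}",
        f"setoption name Hash value {hash_mb}",  # Hash is in MB
        "isready",
    ]


# 24 threads and a 44GB hash table, as described in the post above.
commands = uci_setup_commands(threads=24, hash_mb=44 * 1024)
print("\n".join(commands))
```

The printed lines can be pasted into a Stockfish console session, or a GUI like Chessbase sets the same options from its engine dialog.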
 

BillFoster

Ars Tribunus Militum
2,221
Is it any better with 32 threads, to use 16 threads on the hyperthreaded P cores, and run another 16 threads on the E cores? I'd expect it to be more power efficient for the integer parts of the computation, which might help if you are thermally limited.
From what I've read, Stockfish doesn't do well with hyperthreading. I'm not sure about the technical details, but nobody does it so I assume it's not very performant.
 

BillFoster

Ars Tribunus Militum
2,221
I reckon the bottleneck is the memory.
It's possible, I suppose. But I'm using DDR5-4800. That's not super fast, but it's not like crazy slow, either.

Yet some benchmark sites (https://openbenchmarking.org/test/pts/stockfish) report a 13900k getting almost 68 million nodes/second. That makes no sense. Okay, maybe I'm not using super fast benchmark memory. But is going to DDR5-6000 going to give me 2.5x performance? I severely doubt it.

It just doesn't make sense to me.

EDIT: And I am running the memory in dual channel mode. It's 32GB x 2.
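A quick back-of-the-envelope check of the numbers in this post. The big assumption, labeled in the code too, is that nodes/second would scale linearly with memory transfer rate. That is deliberately generous and not something anyone measured here.

```python
measured_nps = 27e6   # what this box gets, per the post
reported_nps = 68e6   # what the benchmark site reports for a 13900k

# How much faster the reported number is than the measured one.
gap = reported_nps / measured_nps

# Transfer-rate ratio of DDR5-6000 over DDR5-4800.
memory_ratio = 6000 / 4800

# ASSUMPTION: nps scales linearly with memory transfer rate (an upper bound).
best_case_nps = measured_nps * memory_ratio

print(f"gap to close: {gap:.2f}x")
print(f"memory speed ratio: {memory_ratio:.2f}x")
print(f"best case with DDR5-6000: {best_case_nps / 1e6:.2f} million nodes/second")
```

Even under that generous assumption, faster memory alone covers a 1.25x step, nowhere near the roughly 2.5x gap.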
 

BillFoster

Ars Tribunus Militum
2,221
Coming in realllllly late here.... Assuming you are a member of chess.com, have you tried the Stockfish analysis of a game/position you upload?

Edit: In playing with this, it looks like it is running on my laptop in Javascript within the browser. What you have is most likely vastly better. :)
I am a chess.com member, and I've used their Javascript Stockfish analysis. I mean it's better than nothing, but yes I can get something like 10x as many nodes/second running Stockfish through Chessbase.

EDIT: If it matters, I also use Chessbase because I often do parallel analysis through Stockfish and also LCZero. They occasionally differ, so I like to have both. And LCZero is really light on the CPU, heavy on GPU, so running them in parallel is not really a problem.
 

isaacweber613

Smack-Fu Master, in training
1
I am a chess.com member, and I've used their Javascript Stockfish analysis. I mean it's better than nothing, but yes I can get something like 10x as many nodes/second running Stockfish through Chessbase.

EDIT: If it matters, I also use Chessbase because I often do parallel analysis through Stockfish and also LCZero. They occasionally differ, so I like to have both. And LCZero is really light on the CPU, heavy on GPU, so running them in parallel is not really a problem.
Could you please share which chess program you are using to run Stockfish on your computer? I'm asking cuz I also like chess.
 

Cornfed

Smack-Fu Master, in training
1
I use Chessbase 17...every day. Largely for analysis of games I've played on Playchess, chess.com or lichess. Love it!

I am also about to get a new computer. I am seriously thinking of the AMD Ryzen 7945HX, the laptop 'equivalent' of the desktop 7950X. I want the lower power draw, and many reviews peg things like Cinebench 23 close to the desktop CPU version even when using a lot less power/heat. As with the Intel, diminishing returns set in real quick in these high end processors and you are mostly burning energy for very little benefit.

I considered the non-X version of the 7900 desktop, as with it you get the efficiency goodness ...and could always overclock (with proper cooling; you don't need a watercooler with it out of the box) if one wished, but I kind of miss having a laptop.
 

malor

Ars Legatus Legionis
16,093
Yet some benchmark sites (https://openbenchmarking.org/test/pts/stockfish) report a 13900k getting almost 68 million nodes/second. That makes no sense.
Since this thread was resurrected anyway, I'll go ahead and reply to this 18-month-old comment: my guess would be that your cooler isn't keeping up. You mention that you're thermally throttled, so a better cooler is certain to improve performance. How much performance, I can't tell, but you'll gain at least something if you can keep the chip under its throttle temp.

That plus faster DDR5 might improve your benchmark results, maybe even dramatically, but I strongly doubt you'd triple performance. That person might be using liquid nitrogen cooling or something similarly weird. Or they could just be cheating. People are weird like that.
 

streaming_pro

Smack-Fu Master, in training
2
Also have you checked which CPU compiled version of stockfish you are using?

I assume a 13900k should be the AVX2 version rather than the standard 64bit one. An AMD 7900 can get 20M n/s with the standard 64bit version, or 36M with AVX2, and peaks as high as 40M with AVX512.
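If you're on Linux, a rough way to check which build the CPU can run is to look at the `flags` line in `/proc/cpuinfo`. The sketch below is only that: the function names are made up, the labels just mirror the terms used above rather than exact Stockfish binary names, and treating `avx512f` plus `avx512bw` as the AVX512 requirement is my assumption.

```python
def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def suggest_build(flags: set[str]) -> str:
    """Pick the fastest build label the flags allow (hypothetical labels)."""
    # ASSUMPTION: an AVX512 build needs both avx512f and avx512bw.
    if {"avx512f", "avx512bw"} <= flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "standard 64-bit"


# Made-up sample line; on a real machine, read the text from /proc/cpuinfo.
sample = "flags\t\t: fpu sse4_1 popcnt avx avx2 bmi2"
print(suggest_build(parse_cpu_flags(sample)))  # prints "avx2"
```

On a real box you would pass in `open("/proc/cpuinfo").read()` instead of the sample string.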
How many percent faster is the Ryzen 7950X than the i9 13900K?