The Zen Thread

hobold

Ars Tribunus Militum
2,657
Zen 5 announcement on AM5 a bit boring thus far. The same model line-up with 16, 12, 8, and 6 cores, now with Ryzen 9000 branding. An unexciting but solid 16% IPC improvement as per AMD's numbers, at the same clock speeds. But everything below 16 cores now comes at a lower, saner TDP. No X3D models announced yet; presumably an ace up the sleeve waiting for whatever Intel will announce later this year.

Things are a bit more exciting with server and mobile Zen 5, where efficiency improvements directly lead to more performance and higher core counts.

Of the benchmarks AMD mentioned, gaming looked on par with Ryzen 7000X3D, so upcoming 9000X3D models are bound to be very nice, whenever they will eventually show up. The high outlier at >30% speedup was indeed an AVX-512 workload, as I had guessed earlier.

A new mainboard tier X870(E) with mandatory USB4 will come; mid-range seems to retain the B650 branding, but new models with USB4 seem to be waiting in the wings.

I am not disappointed like some other folks who were more hyped up. IMHO it is remarkable that double-digit IPC improvements are still possible even when transistors barely shrink anymore these days. Hopefully we'll learn more details about Zen 5's AVX-512 implementation soon. I have had some success toying with std::experimental::simd (a proposed C++ programming interface for using the various SIMD extensions of various processor families). It's looking good, so maybe towards the end of the year I'll be doing more SIMD programming on a Ryzen 9000 machine of my own.
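For anyone who hasn't seen std::experimental::simd, here is a minimal sketch of what code written against it can look like. The helper name saxpy and the SKETCH_HAVE_STDX_SIMD macro are made up for illustration, and the guard assumes the libstdc++ implementation (GCC 11 or newer, C++17); on other toolchains the sketch simply falls back to the scalar loop.

```cpp
#include <cstddef>
#include <vector>

// Assumption: only libstdc++ (GCC 11+, C++17) is treated as providing a
// usable <experimental/simd>. Anywhere else the scalar loop below is used.
#if defined(__GLIBCXX__) && __cplusplus >= 201703L && defined(__has_include)
#  if __has_include(<experimental/simd>)
#    include <experimental/simd>
#    define SKETCH_HAVE_STDX_SIMD 1
#  endif
#endif

// Hypothetical helper: y[i] = a * x[i] + y[i] over the common length of x and y.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    std::size_t i = 0;
#ifdef SKETCH_HAVE_STDX_SIMD
    namespace stdx = std::experimental;
    // native_simd picks the vector width of the compile target
    // (e.g. 4 floats for SSE, 8 for AVX2, 16 for AVX-512).
    using vfloat = stdx::native_simd<float>;
    const vfloat va(a);  // broadcast the scalar into every lane
    for (; i + vfloat::size() <= n; i += vfloat::size()) {
        vfloat vx, vy;
        vx.copy_from(&x[i], stdx::element_aligned);
        vy.copy_from(&y[i], stdx::element_aligned);
        vy = va * vx + vy;  // same source line regardless of the ISA underneath
        vy.copy_to(&y[i], stdx::element_aligned);
    }
#endif
    // Scalar tail, and the whole loop when no SIMD header is available.
    for (; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The same source should pick up wider vectors just by changing the compiler target (for example -march=znver4 instead of the default), which is the appeal compared to hand-written intrinsics per ISA.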
 

Paladin

Ars Legatus Legionis
32,552
Subscriptor
Yeah, honestly, if they could confidently say that they have eliminated a lot of potential security and stability issues, worked out the memory training and stability issues and worked to make motherboards both cheaper and more reliable with their chips and had just a decent increase in performance and efficiency, I would be raving about them.
 

mpat

Ars Praefectus
5,951
Subscriptor
A brilliant move of course. There isn't power for 12 mega-turbo cores anyway. AMD has totally crushed Intel on this, if you ask me.
Depends on the inter-core latency. Presumably this has been done by making 3 clusters of 4 cores. Desktop Zen 5 has 8 or 16 cores, most likely still in clusters of 8 cores like Zen 3 and 4. If you remember the Zen 3 launch, use cases where you needed more than 4 cores on one task were improved by moving to one cluster of 8.

Other than that, I agree that it is likely more power-efficient.
 

Aeonsim

Ars Scholae Palatinae
1,057
Subscriptor++
The doubled AVX-512 performance will likely be useful for some of my workloads. I assume they've upgraded the AVX-512 engines from the previous half-width (using 256b width I believe, so 2 cycles per operation) to full width? It will be interesting to see if they can run these at full clock speed. A very different direction of travel from Intel, which appears to be pulling AVX-512 back from everything that isn't server focused (Lunar Lake apparently removes the hardware rather than disabling it).
 
The doubled AVX-512 performance will likely be useful for some of my workloads. I assume they've upgraded the AVX-512 engines from the previous half-width (using 256b width I believe, so 2 cycles per operation) to full width? It will be interesting to see if they can run these at full clock speed. A very different direction of travel from Intel which appears to be pulling AVX-512 back from everything that isn't server focused (lunar-lake apparently removes the hardware rather than disabling it).
Pretty sure it was 1 cycle per operation throughput but it used both EUs so you couldn't dual issue.

Intel's new E cores now have 512 bit FMA, similar to Zen 4, which is hopefully a step towards finally unifying everything with AVX10. Maybe Zen 6 and Panther Lake finally fix this situation sometime next year and we can get past AVX classic, which will be at least 17 years old by the time both Intel and AMD support its replacement concurrently on consumer desktops (AVX2 launched 11 years ago today, although it's just AVX1 with integer instructions).
 

hobold

Ars Tribunus Militum
2,657
Intel's new E cores now have 512 bit FMA, similar to Zen 4, which is hopefully a step towards finally unifying everything with AVX10.
At the current 256 bit wide spec, AVX10 is more of a lowest common denominator than a unification. I hope std::experimental::simd will graduate out of experimental state sooner rather than later. That would partly unify SSE, AVX, AVX-512, NEON, SVE, AltiVec/VMX and whatever else is out there.
 
At the current 256 bit wide spec, AVX10 is more of a lowest common denominator than a unification. I hope std::experimental::simd will graduate out of experimental state sooner rather than later. That would partly unify SSE, AVX, AVX-512, NEON, SVE, AltiVec/VMX and whatever else is out there.
AVX10 in the 256 bit flavor is still a massive improvement over AVX1/2, which is really antiquated and essentially the lowest common denominator in that most operations are structured as two SSE operations run in parallel. Since all of these devices are going to have 512 bit wide vector units, whether you access them using 2x256 AVX10 operations issued per cycle or 1x512 AVX10 operation per cycle is not so interesting. The key thing is just to get rid of the baggage and expose all the capabilities of those vector engines that aren't in AVX1/2 because mid-2000s CPUs didn't have the transistor budgets.
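To put rough numbers on the 2x256 versus 1x512 point, here is a back-of-the-envelope sketch. The function names are made up, and the pipe counts plugged in are illustrative assumptions based on how the hardware is described in this thread, not vendor-confirmed figures.

```cpp
// Peak FP32 operations per cycle for a core with `fma_pipes` FMA pipes,
// each `pipe_bits` wide. One FMA counts as 2 operations (multiply + add).
constexpr int fp32_flops_per_cycle(int fma_pipes, int pipe_bits) {
    return fma_pipes * (pipe_bits / 32) * 2;
}

// Number of pipe slots a single instruction of `op_bits` occupies when the
// hardware datapath is only `pipe_bits` wide (rounded up).
constexpr int pipe_slots_per_op(int op_bits, int pipe_bits) {
    return (op_bits + pipe_bits - 1) / pipe_bits;
}

// Two 256 bit FMAs per cycle and one 512 bit FMA per cycle reach the same
// peak: 2 * 8 lanes * 2 == 1 * 16 lanes * 2 == 32 FP32 operations.
static_assert(fp32_flops_per_cycle(2, 256) == fp32_flops_per_cycle(1, 512),
              "2x256 and 1x512 give the same peak throughput");

// A 512 bit instruction on 256 bit pipes takes two slots, either both pipes
// in one cycle or one pipe for two cycles.
static_assert(pipe_slots_per_op(512, 256) == 2,
              "a 512 bit op fills two 256 bit slots");
```

Under these assumptions the instruction width only changes how many instructions have to be issued, not the peak arithmetic rate. Doubling the rate would take something like two full 512 bit pipes, i.e. fp32_flops_per_cycle(2, 512).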
 

hobold

Ars Tribunus Militum
2,657
AVX10 in the 256 bit flavor is still a massive improvement over AVX1/2
Of course the capabilities of AVX-512, and thus AVX10, are on another planet. After all, AVX-512 was the first time that Intel aimed at a fully fledged, compiler targeted SIMD ISA in the tradition of DEC Alpha's "Tarantula" project, which in turn was based on all the experience gained from Cray's original vector supercomputers.

My "lowest common denominator" snide remark was aimed at Intel's goal of down-sizing AVX-512 enough to make it workable on their ex-Atom, now E-core, microarchitectures. I don't think Intel is doing the right thing, because AMD has already demonstrated by example that full AVX-512 support can be had with 256 bit wide hardware, and at very respectable performance, in Zen 4. (And power consumption when running AVX-512 number crunching is not a problem on the Ryzen 7840/8840 laptops that I am toying with.)

Intel could have done the same and avoided the confusion of introducing yet another incompatible ISA / feature level. Especially considering that Intel still has to support AVX-512 on (some of) their datacenter silicon.

There are too many target platforms already, and programmer effort for tuning is already spread too thin. AVX10 does not help the situation. Pat Gelsinger wants "five nodes in four years" and "unquestioned technology leadership". Yet the actions of his company are telegraphing to the world that Intel will not have competitive transistor density, and cannot afford to implement AVX-512 in their client processors? I don't like it.

But yes, in the grand scheme of things, getting AVX10 is much, much better than being stuck on AVX2 forever.
 
My "lowest common denominator" snide remark was aimed at Intel's goal of down-sizing AVX-512 enough to make it workable on their ex-Atom, now E-core, microarchitectures. I don't think Intel is doing the right thing, because AMD has already demonstrated by example that full AVX-512 support can be had with 256 bit wide hardware, and at very respectable performance, in Zen 4. (And power consumption when running AVX-512 number crunching is not a problem on the Ryzen 7840/8840 laptops that I am toying with.)
Zen 4's AVX512 is somewhat limited but still does one 512 bit FMA per cycle. With Crestmont also going to 512 bit wide FMA, it isn't clear what Intel is planning to do with AVX10. Or at least not to me.
 

Made in Hurry

Ars Praefectus
4,553
Subscriptor
I do not really need a top of the line machine, but since AM4 computers can still be sold for quite a decent amount of money online these days, I listed my main rig online yesterday. I am going to keep that money in my account and probably either build a new one before Christmas or buy a Mac.
I am curious whether they have upgraded the integrated GPU, though, and what it can do, since a mini-computer might be what I am tempted to get in that case.
 

Demento

Ars Legatus Legionis
13,751
Subscriptor
The only reason to upgrade the desktop Ryzen iGPU is to be able to phase out support for that architecture in the drivers. It's there to provide basic - more basic than Intel - video and nothing more. I expect it will update the next time there's a major change in the Radeon architecture just to keep the drivers in sync.
 

NervousEnergy

Ars Legatus Legionis
10,547
Subscriptor
The X3D line segregation seems... suboptimal if they want a big early sales splash. A lot of folks on X3D procs aren't going to upgrade until those chips are out, so I'm a bit surprised they don't lead with them. I'm still on a 5800X3D, and am definitely upgrading to X870E and whatever they're calling the 9K X3D versions (9800X3D?), so I'll be waiting.
 

continuum

Ars Legatus Legionis
94,897
Moderator
The X3D line segregation seems... suboptimal if they want a big early sales splash. A lot of folks on X3D procs aren't going to upgrade until those chips are out,
Given how they've handled 3D V-Cache parts in the past I don't think they care; they seem to be set on doing an initial release and then a subsequent 3D V-Cache part release.