The Zen Thread

w00key

Ars Praefectus
5,907
Subscriptor
Whoo! Finally, an actual demo, not just fancy slides and claims.


Arstechnica:

At an event in San Francisco AMD also revealed a few more low-level details of Zen's architecture—and in a multithreaded Blender rendering demo showed that an 8-core/16-thread "Summit Ridge" Zen CPU outperformed an 8C/16T Broadwell-E CPU (presumably the Core i7-6900K) at the same clockspeed.

What clockspeed? Pcworld.com:

AMD’s Summit Ridge SoC (left) running at 3GHz can run a Blender render just as fast as a Core i7-6900K (right) running at 3GHz.


AMD shows it can match or exceed Intel's IPC on floating point / rendering workload. Now they just need to hit a decent clock target. I still don't get why they refuse to introduce a smaller 4C8T or 2C4T part though, I mean, most people are fine with just an i5 (4C4T) for gaming.
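For anyone who wants to sanity-check the "match or exceed IPC" reading of the demo, here is a minimal sketch of the arithmetic. It assumes the render is the same workload on both chips and treats work per clock as a rough stand-in for IPC. The function name and the render times are made up for illustration; only the 3GHz clocks come from the reports above.

```python
def relative_ipc(time_a_s, clock_a_ghz, time_b_s, clock_b_ghz):
    """Work-per-clock of chip A relative to chip B on an identical workload."""
    # Same job on both chips, so throughput is proportional to 1 / time.
    # Dividing by clock gives work per cycle (a rough proxy for IPC).
    per_clock_a = 1.0 / (time_a_s * clock_a_ghz)
    per_clock_b = 1.0 / (time_b_s * clock_b_ghz)
    return per_clock_a / per_clock_b


# Hypothetical render times in seconds; the 3.0 GHz clocks are from the demo.
ratio_equal = relative_ipc(50.0, 3.0, 50.0, 3.0)   # same time, same clock
ratio_faster = relative_ipc(48.0, 3.0, 50.0, 3.0)  # chip A finishes sooner
print(ratio_equal, ratio_faster)
```

Equal times at equal clocks give a ratio of 1.0, and any time advantage at the same clock shows up directly as a per-clock advantage. This is one multithreaded benchmark, so it says nothing about other workloads.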


The other slides are pretty interesting as well, ditching the semi-shared core concept for fatter cores with hyper-threading, better caching and bandwidth to keep those cores fed, process node upgrade.

If it's not just marketing fluff, they might actually pull off an Nvidia-style upgrade. Nvidia also made the 28nm -> FinFET jump with the 10x0s and delivered a larger than usual performance increase. Pascal isn't that dramatically different from Maxwell, but stack two process nodes of gains on top of it and it's a winner.

The only sad thing is the timeline, 2017, meh. Intel needs a bit of competition for Broadwell-E asap, their pricing is a bit silly.
 

Sunner

Ars Praefectus
4,330
Subscriptor++

mpat

Ars Praefectus
5,951
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31737601#p31737601:d4o25cpr said:
w00key[/url]":d4o25cpr]AMD shows it can match or exceed Intel's IPC on floating point / rendering workload. Now they just need to hit a decent clock target. I still don't get why they refuse to introduce a smaller 4C8T or 2C4T part though, I mean, most people are fine with just an i5 (4C4T) for gaming.

There is a 4C/8T variant. The leaked test I saw had it beating a Haswell i5, but losing to the Devil's Canyon i7. Not too shabby - especially if they support things like ECC, which Intel locks out of its base chips - but then it could all be faked.

I also think that AMD's positioning here is first that it can sell an 8-core model comparable with Broadwell-E for much less. Intel's pricing on Broadwell-E is insane.

2C/4T is probably only coming in an APU.

[url=http://arstechnica.com/civis/viewtopic.php?p=31737601#p31737601:d4o25cpr said:
w00key[/url]":d4o25cpr]
The other slides are pretty interesting as well, ditching the semi-shared core concept for fatter cores with hyper-threading, better caching and bandwidth to keep those cores fed, process node upgrade.

We know very little about the design, but what we know makes it look a bit like Apple's A7 core (and its successors). 4 ALU/2 AGU (Athlon had 3 of each, Bulldozer cut down to 2 of each) and a focus on cache latencies sounds like a good start for integer performance.
 

mpat

Ars Praefectus
5,951
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31738247#p31738247:3ialk381 said:
U-99[/url]":3ialk381]Friendly reminder for everyone to read 2011's Bulldozer to Dominate Sandybridge thread from 2011 before being overly exuberant on this topic:
https://arstechnica.com/civis/viewtopic ... &t=1142220

That thread is a genuine classic.

[url=http://arstechnica.com/civis/viewtopic.php?p=31738247#p31738247:3ialk381 said:
U-99[/url]":3ialk381]
AMD benchmarks typically show performance that is 1)not replicable, 2)cherry picked, or 3)both not replicable and cherry picked.

I know, but I remain an optimist, because we have never needed some real Intel competition as much as we do now.
 

w00key

Ars Praefectus
5,907
Subscriptor
I know I know, they're always super optimistic, but as far as I know, Bulldozer and friends never actually matched Intel's IPC in any benchmark, cherry picked or not, so there's a reason to be slightly optimistic.


These slides are funny though:

slide.jpg


AMDZenRoadmap_678x452.jpg
 

mpat

Ars Praefectus
5,951
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31740045#p31740045:1jqlerri said:
w00key[/url]":1jqlerri]I know I know, they're always super optimistic, but as far as I know, Bulldozer and friends never actually matched Intel's IPC in any benchmark, cherry picked or not, so there's a reason to be slightly optimistic.

Bulldozer was probably meant to be a speed demon design, sacrificing IPC for clockspeed, even if AMD didn't actually say so after the fact. That would explain the performance miss - they never hit the clockspeed they thought they would.

But that is just speculation, and doesn't matter now anyway. The good news is that AMD seems to have learned from their mistakes, and if the new design looks a lot like Skylake... well, there are worse designs to ape.
 

Mister E. Meat

Ars Tribunus Angusticlavius
7,241
Subscriptor
I'd be a lot more optimistic about Zen if it were really coming out in Q4 as initially promised. I know AMD is sick of being the value player, but if they can even get 90% of a Kaby Lake i5 at launch they'll be in good shape. Today, a top of the line AMD FX processor doesn't even match an i3-6300 in a lot of tasks, which makes it tough for them to compete. Even their APU chips, which were arguably ahead in terms of total system performance when they came out, haven't been competitive for over a generation now.
 
I'm pretty optimistic. Even assuming that's a particularly favorable benchmark and the products wind up performing slightly downrange of top of the line (core and thread equivalent) Intel parts, the pricing will probably be significantly better, along with more secondary features being unlocked in the regular processors (like ECC or overclocking). These will be a very good consumer option. The IPC gains, as long as they're even remotely close to what we're seeing here, combined with the process node will make these easier to recommend over more expensive Intel parts than the last half-decade of non-APU parts from AMD.

I'm giddy about what a high-end APU with Zen cores, 14nm lithography, and maybe a bit of HBM on-die could do.
 

DaveB

Ars Tribunus Angusticlavius
7,274
These slides are funny though:

AMDZenRoadmap_678x452.jpg
The second slide is really hysterical. Rather than the 40% claimed IPC improvement, AMD made the graph look more like a 400% improvement. While I personally really want Zen to be competitive enough for me to actually build a Zen system, AMD has a history of publishing false benchmarks in the past. So like everyone else, I'll wait for actual benchmarks by independent testers and actual pricing before I get excited about Zen.
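To put a number on how much that chart exaggerates, here is a rough sketch using Tufte's "lie factor" (size of the effect shown in the graphic divided by the size of the effect in the data). The 40% is AMD's claim and the 400% is only an eyeball estimate of the bar heights, so treat the result as illustrative. The function name is made up.

```python
def lie_factor(shown_change_pct, actual_change_pct):
    """Tufte's lie factor: visual effect size divided by the real effect size."""
    return shown_change_pct / actual_change_pct


# 400 is an eyeballed estimate of the chart; 40 is the claimed IPC gain.
factor = lie_factor(400.0, 40.0)
print(factor)
```

A factor near 1 means the graphic is honest. Anything far above 1 means the picture overstates the data, which is the complaint here.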
 
[url=http://arstechnica.com/civis/viewtopic.php?p=31740497#p31740497:31kb9jb6 said:
mpat[/url]":31kb9jb6]15% share of what market? From what I can google up, AMD was at just under 30% of the desktop market on the eve of the Bulldozer launch (according to Gartner), and even higher during the Prescott disaster (before the Conroe launch).
Their heyday was definitely right before the Core2s hit. Mindshare might have propelled them for a while after that, but as far as comparing their chips' performance and price versus Intel's, they haven't been anywhere near as strong since.
Edit: just adding info, wasn't trying to dispute.
 

Hat Monster

Ars Legatus Legionis
47,680
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31741089#p31741089:1do9unag said:
AndrewZ[/url]":1do9unag]
Intel competes with Intel, and has been doing so for the better part of a decade
I keep hearing this, but honestly, that's not how monopolies work.
AMD has never, ever, had the capacity to fuel the PC market. If you take AMD's shipping figures at their peak, even when outsourcing production to Chartered's SOI process, it was barely 20% of the global PC market.

80% was Intel. This was Intel at its Pentium4 weakest, a CPU only IceStorm loved. The entire market had turned against Intel and AMD Opterons filled server rooms. Yet PCs had to be sold, and AMD could only power 20% of them at best.

How effectively does Intel compete with Intel? This controls Intel's bottom line, and the answer is "As little as possible".

The effect is that Intel is incentivised to offer the "bare minimum" required to get the market to upgrade. There's the incentive to offer "value-add". Overclocking is now a bought and paid for feature, for crying out loud! Chipsets disable parts of themselves based on what CPU's installed. You can quite reasonably upgrade your CPU and have your motherboard unlock features it had all along, but just disabled because you didn't buy the right CPU. That's monopolistic behaviour.

Intel has merged the chipset market, once vibrant and full of choices from VIA, SiS, OPTi, Acer (as ALI), Nvidia, ATI, more, with the CPU market. At times, you'd actively avoid an Intel chipset - of the 430s, only the TX variant was viable, 820 was a legend of shittiness unto itself, the GMA-era had Intel deliberately not pressing IGPs much (any competitor wishing to do so was forbidden by a raft of patents and potential lawsuits) to the point where Intel thought scalar pipelines were good enough (they process R, G and B in sequence rather than parallel). Even as far as the 6-series, Intel managed to ship a fundamentally flawed SATA controller. Buy something else? Hah!

Want an Intel CPU? Awesome. Here's your Intel chipset. This inflates prices of both chipsets and the motherboards containing them and is manifestly anti-consumer. Intel fucks up a chipset, you're fucked too. Want features or use-cases Intel hasn't foreseen? Fuck off. Fuck right off (which is why Intel has practically no presence at all in the Android space).
 

w00key

Ars Praefectus
5,907
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31744267#p31744267:2i6epung said:
Hat Monster[/url]":2i6epung]Want an Intel CPU? Awesome. Here's your Intel chipset. This inflates prices of both chipsets and the motherboards containing them and is manifestly anti-consumer. Intel fucks up a chipset, you're fucked too. Want features or use-cases Intel hasn't foreseen? Fuck off. Fuck right off (which is why Intel has practically no presence at all in the Android space).

Eh, no. Atom was just stuck for way, waaay too long in development. When Bay Trail / Silvermont was released (ie Z3745 in my Yoga Tab 2), it's not a bad chip, even now it's fast enough for everything.

The problem is, that was October 2013, and since then, there have been zero new mobile SoCs released from Intel. That was when the Snapdragon 800 was king, and since then, we've had the 805, 808/810 (ugh), and now the 820. Samsung's lines are messier, but they released Cortex-A7, A9, A7+A15, A53+A57 based chips in the meantime.


It would be silly to ship a new Silvermont based device now, although Asus did ship the ZenFone Zoom based on Silvermont in January 2016. Same cores, but running at 4x 2.5GHz instead of my Yoga Tab's 4x 1.33GHz.
 

Hat Monster

Ars Legatus Legionis
47,680
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31745823#p31745823:31cgtbri said:
w00key[/url]":31cgtbri]
[url=http://arstechnica.com/civis/viewtopic.php?p=31744267#p31744267:31cgtbri said:
Hat Monster[/url]":31cgtbri]Want an Intel CPU? Awesome. Here's your Intel chipset. This inflates prices of both chipsets and the motherboards containing them and is manifestly anti-consumer. Intel fucks up a chipset, you're fucked too. Want features or use-cases Intel hasn't foreseen? Fuck off. Fuck right off (which is why Intel has practically no presence at all in the Android space).

Eh, no. Atom was just stuck for way, waaay too long in development. When Bay Trail / Silvermont was released (ie Z3745 in my Yoga Tab 2), it's not a bad chip, even now it's fast enough for everything.
My daughter's YouTube device is an Intel/Gigabyte AZ210. It had lots of names in many markets. It's an Android x86 phone, has all the Google stuff, runs off an Atom Z2460 and posted average battery life, average performance, average everything. It was up against 4th generation dual core A9s and kept up with them.

Intel can do it. Intel doesn't want to.

What Intel cannot do is relinquish control. The AZ210's bootloader was cryptographically locked, such that third party Android updates could not be loaded. Most Androids have the ability to unlock their bootloader to allow the user to install third party updates. This builds a community of developers and it's all done open source. Intel refused to allow the bootloader to be unlocked. The same Medfield platform was on a Motorola Razr device, and a ZTE Grande device, both unlockable. Intel's own flagship for the platform was off-limits to developers and enthusiasts. Contrast this with the Android flagships, the Nexus devices.

The AZ210 may or may not have been sold at a loss, accounts differ: Intel seems to be unable to do it at a sensible price, and did the overpriced "Compute Sticks" instead. Intel will not, or cannot, make SoCs at competitive rates, which is why ARM, particularly via Qualcomm, ate its lunch.
 

DaveB

Ars Tribunus Angusticlavius
7,274
Just remind all the professional debaters here the title of this thread is .................. (drum roll)

The Zen Thread

So how about forgetting about your personal hatred for Intel chipsets, Atom / Bay Trail / Silvermont, problems with jail breaking iPhones, what happened 10 years ago and keep this thread on topic. Some of us actually come to this thread for some useful information (or rumors) about Zen.

Thanks in advance for staying on topic as we go forward.
 

mpat

Ars Praefectus
5,951
Subscriptor
Thanks in advance for staying on topic as we go forward.

Fine by me. If we do a bit of a comparison with Skylake, we have:

* ALUs. 4 in Zen, 4 in Skylake.
* AGUs. 2 in Zen. Skylake has 4 units that do something related to that, but they're not as generic: 2 of those ports are dedicated to Store Data and Store Address. Zen also has 2+1 load/store units, with only 2 AGUs to drive them. How does this work? Either the AGUs bottleneck it, or the ALUs can do some simple address calculations.
* FPU. 2 128-bit units, can be combined to work on a single 256-bit instruction. This is less than Skylake, but more or less comparable to Haswell? Note, however, that there is no port collision with the integer units like in Intel's designs.
* Caches. No latencies known, but they look appropriate otherwise, size, bandwidth and wayness etc.
* The basic unit is a 4C/8T segment with 8MB L3. If there is more than one segment on the chip, the L3 is apparently not joined - there has to be some sort of ring for coherency traffic between the segments, but we don't know anything about that.
* Decoding. 4 uops from the x86 decoders vs. 5 in Skylake, but they can accept the same number of x86 instructions (4), and AMD doesn't use as many uops per x86 instruction in some cases. New uop cache, 6 instructions per cycle, looks a lot like Skylake.
* Scheduler. Bigger and better, but no details. It seems that the scheduler can finally do move elimination for GPRs, like Intel has done since forever, which is nice.

All in all, it looks very similar to Skylake. Any other details that I missed?
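As a rough illustration of the FPU point in the list above, this sketch counts how many vector ops of a given width can issue per cycle when narrower units have to gang together for wider instructions. The Zen figures (2 units, 128 bits each) are the ones listed above. The Skylake figures (2 units, 256 bits each) are an assumption added for comparison, and the function name is hypothetical. It ignores everything else that limits throughput (scheduling, cache bandwidth, instruction mix).

```python
def vector_ops_per_cycle(units, unit_width_bits, op_width_bits):
    """Peak ops per cycle when wide ops must combine several narrower units."""
    if op_width_bits <= unit_width_bits:
        # Each unit can handle one op on its own.
        return units
    # A wide op occupies several units at once.
    units_per_op = op_width_bits // unit_width_bits
    return units // units_per_op


zen_128 = vector_ops_per_cycle(2, 128, 128)      # each unit works alone
zen_256 = vector_ops_per_cycle(2, 128, 256)      # both units fused for one op
skylake_256 = vector_ops_per_cycle(2, 256, 256)  # assumed 2 x 256-bit units
print(zen_128, zen_256, skylake_256)
```

Under those assumptions Zen issues half as many 256-bit ops per cycle as Skylake, but is level on 128-bit ops.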
 
[url=http://arstechnica.com/civis/viewtopic.php?p=31749891#p31749891:x5gzsvej said:
mpat[/url]":x5gzsvej]
Thanks in advance for staying on topic as we go forward.

Fine by me. If we do a bit of a comparison with Skylake, we have:

* ALUs. 4 in Zen, 4 in Skylake.
* AGUs. 2 in Zen. Skylake has 4 units that do something related to that, but they're not as generic: 2 of those ports are dedicated to Store Data and Store Address. Zen also has 2+1 load/store units, with only 2 AGUs to drive them. How does this work? Either the AGUs bottleneck it, or the ALUs can do some simple address calculations.
* FPU. 2 128-bit units, can be combined to work on a single 256-bit instruction. This is less than Skylake, but more or less comparable to Haswell? Note, however, that there is no port collision with the integer units like in Intel's designs.
* Caches. No latencies known, but they look appropriate otherwise, size, bandwidth and wayness etc.
* The basic unit is a 4C/8T segment with 8MB L3. If there is more than one segment on the chip, the L3 is apparently not joined - there has to be some sort of ring for coherency traffic between the segments, but we don't know anything about that.
* Decoding. 4 uops from the x86 decoders vs. 5 in Skylake, but they can accept the same number of x86 instructions (4), and AMD doesn't use as many uops per x86 instruction in some cases. New uop cache, 6 instructions per cycle, looks a lot like Skylake.
* Scheduler. Bigger and better, but no details. It seems that the scheduler can finally do move elimination for GPRs, like Intel has done since forever, which is nice.

All in all, it looks very similar to Skylake. Any other details that I missed?

That's the general take at Anand as well. Looks like we're getting something of an off-brand Skylake part, which sounds fantastic to me, given the pricing will likely be better and AMD is less likely to ask a premium for unlocking features already on-die.

I think we'll see some really unique and non-equivalent parts start appearing when APUs are released (maybe with small amounts of HBM as well), and it'll be great to see the laptop market start moving forward again with the competition, particularly in the GPU and compute side of things.
 
4C/8T (with 8MB of L3 cache) seems to be the basic unit for the CPU core so I wouldn't expect anything smaller.

From both the investor presentation and the leaked Hot Chips slides, Zen looks very similar to Intel's Haswell in terms of execution resources and instruction window (without the bottleneck on instruction retirement that was in Bulldozer).

Cache bandwidth (a traditional weakness in AMD CPUs) is between Sandy Bridge and Haswell (a bit better than SB but Haswell/Skylake doubled the BW to 64 bytes per cycle for both L1 & L2 caches while Zen is at 32 bytes per cycle). No info on cache latency yet though.
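To turn those bytes-per-cycle figures into something more tangible, here is a quick sketch of peak per-core cache bandwidth. The 32 and 64 bytes per cycle are the figures quoted above. The 3GHz clock is borrowed from the Blender demo and is only an assumption about shipping clocks. The function name is made up.

```python
def peak_cache_bandwidth_gb_s(bytes_per_cycle, clock_ghz):
    """Peak per-core bandwidth in GB/s (decimal): bytes/cycle * 1e9 cycles/s."""
    return bytes_per_cycle * clock_ghz


zen_gb_s = peak_cache_bandwidth_gb_s(32, 3.0)      # Zen figure from the slides
haswell_gb_s = peak_cache_bandwidth_gb_s(64, 3.0)  # Haswell/Skylake figure
print(zen_gb_s, haswell_gb_s)
```

These are theoretical peaks per core. Sustained numbers depend on latency and access patterns, which is exactly the part we don't know yet.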

My guess for performance would be around 90-95% of the IPC of Skylake in integer so it looks like it could be a winner if AMD can clock it high enough (that's a very big if). FPU is much harder to guess, on non AVX code Zen should be better, at least on paper (dedicated scheduler + 4 FPU units).

Hopefully AMD won't disable ECC support in their consumer chips.
 

mpat

Ars Praefectus
5,951
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31750791#p31750791:2gq1kf6o said:
Blue Apple[/url]":2gq1kf6o]4C/8T (with 8MB of L3 cache) seems to be the basic unit for the CPU core so I wouldn't expect anything smaller.

I don't think they can use the same unit anyway - they have to connect the GPU to the memory controllers somehow. Unless they make it a NUMA design, like a discrete GPU. That would be interesting to be sure, but there is no indication that they're going that way.

[url=http://arstechnica.com/civis/viewtopic.php?p=31750791#p31750791:2gq1kf6o said:
Blue Apple[/url]":2gq1kf6o]
From both the investor presentation and the leaked Hot Chips slides, Zen looks very similar to Intel's Haswell in terms of execution resources and instruction window (without the bottleneck on instruction retirement that was in Bulldozer).

Bulldozer was limited by decoding bandwidth, cache size and wayness, having only 2 ALUs, and terrible cache latencies - to start with. But let's not go there.

[url=http://arstechnica.com/civis/viewtopic.php?p=31750791#p31750791:2gq1kf6o said:
Blue Apple[/url]":2gq1kf6o]
Cache bandwidth (a traditional weakness in AMD CPUs) is between Sandy Bridge and Haswell (a bit better than SB but Haswell/Skylake doubled the BW to 64 bytes per cycle for both L1 & L2 caches while Zen is at 32 bytes per cycle). No info on cache latency yet though.

True, but both L1I and L2 are twice the size of Skylake's. I agree that the cache latencies are what will make or break this design.

[url=http://arstechnica.com/civis/viewtopic.php?p=31750791#p31750791:2gq1kf6o said:
Blue Apple[/url]":2gq1kf6o]
My guess for performance would be around 90-95% of the IPC of Skylake in integer so it looks like it could be a winner if AMD can clock it high enough (that's a very big if). FPU is much harder to guess, on non AVX code Zen should be better, at least on paper (dedicated scheduler + 4 FPU units).

I agree, the FPU bit is hard to guess. Clearly it will not keep up with Skylake on 256bit vectors, but other than that, it might actually be a little faster - if the cache bandwidth doesn't bottleneck it.

[url=http://arstechnica.com/civis/viewtopic.php?p=31750791#p31750791:2gq1kf6o said:
Blue Apple[/url]":2gq1kf6o]
Hopefully AMD won't disable ECC support in their consumer chips.

++ on this. ECC is the one server feature that I really want.
 

Sebastian

Ars Tribunus Militum
2,544
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31750605#p31750605:2ekf4bxg said:
mpat[/url]":2ekf4bxg]Yes, that is the really interesting part - if AMD can make a 2C/4T with nice integrated graphics and HBM, that would be a fantastic laptop chip.
If they have scaled up the server part
"Naples SoC will spearhead that effort. Naples is a 32-core, 64-thread server SoC.."
Hopefully they'll have a trimmed down version. Honestly though, I'm guessing a 4 core is as small as it's going to get in this day and age.
 

w00key

Ars Praefectus
5,907
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31754043#p31754043:p5cwh1uk said:
Sebastian[/url]":p5cwh1uk]
[url=http://arstechnica.com/civis/viewtopic.php?p=31750605#p31750605:p5cwh1uk said:
mpat[/url]":p5cwh1uk]Yes, that is the really interesting part - if AMD can make a 2C/4T with nice integrated graphics and HBM, that would be a fantastic laptop chip.
If they have scaled up the server part
"Naples SoC will spearhead that effort. Naples is a 32-core, 64-thread server SoC.."
Hopefully they'll have a trimmed down version. Honestly though, I'm guessing a 4 core is as small as it's going to get in this day and age.

Meh, why have 4 when you can sell 2C/4T or even 2C/2T just fine? Especially on mobile it's a must, every single -U is a dual core model, for power reasons. Also, tiny chip, huge price = huge margin.

For desktop, sure, 4C it is, but even there some cheap out and just buy a pentium, you won't notice the difference unless you game, it's already tons faster than the C2D you just replaced :D
 

redleader

Ars Legatus Legionis
35,019
[url=http://arstechnica.com/civis/viewtopic.php?p=31754225#p31754225:2nnp1okz said:
DrPizza[/url]":2nnp1okz]If Zen is as good as AMD claims, even with a bit of a delay, it should apply some pressure on Broadwell-E.

It would be nice for Intel to actually improve performance on the E series once in a while. We are basically at the point where the only reason for the E series to exist is for people who need more than 4 DIMMs or more than 4 physical cores. Otherwise they're just pointlessly more expensive previous gen CPUs.
 

Ostracus

Ars Legatus Legionis
27,980
[url=http://arstechnica.com/civis/viewtopic.php?p=31748321#p31748321:3s97bh3p said:
DaveB[/url]":3s97bh3p]Just remind all the professional debaters here the title of this thread is .................. (drum roll)

The Zen Thread

So how about forgetting about your personal hatred for Intel chipsets, Atom / Bay Trail / Silvermont, problems with jail breaking iPhones, what happened 10 years ago and keep this thread on topic. Some of us actually come to this thread for some useful information (or rumors) about Zen.

Thanks in advance for staying on topic as we go forward.

How about features as they relate to security? Intel has SGX for example.
 

mpat

Ars Praefectus
5,951
Subscriptor
It would be nice for Intel to actually improve performance on the E series once in a while. We are basically at the point where the only reason for the E series to exist is for people who need more than 4 DIMMs or more than 4 physical cores. Otherwise they're just pointlessly more expensive previous gen CPUs.

The third reason would be PCIe lanes for lots of GPUs, but with nVidia killing SLI for more than two cards, that need is disappearing as well. The E series is mostly for e-peen now.

How about features as they relate to security? Intel has SGX for example.

They haven't announced anything on that level, I think. I suppose they still have that TrustZone/special ARM core for security?
 

mpat

Ars Praefectus
5,951
Subscriptor
Other details from some leaked slides:

* Retire bandwidth is 8 per clock for integer and another 8 for float, with a deeper queue.
* L1D bandwidth is 2 128-bit accesses per clock
* L2 is inclusive of L1, but L3 is exclusive of L2. AMD usually makes all caches exclusive, but Intel always makes the L3 inclusive of everything. L3 is 16-way.
* L2 and L3 Caches are described as "faster" than Bulldozer, so latencies are less than that at least. Not that that is hard to do. I would guess that AMD can tolerate an L2 latency in the 14 cycle range or so given that hitrate in L1I at least should be high.
* "Branch fusion" sounds like Intel's macro-op fusion.
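The inclusive/exclusive point in the list above changes how much unique data a 4C segment can hold, which can be sketched as follows. The 8MB L3 per 4C segment comes from the slides discussed earlier. The 512KB L2 per core is inferred from the "twice the size of Skylake" remark and assumes Skylake's L2 is 256KB per core. L1 is left out because an inclusive L2 already holds a copy of it. The function name is hypothetical.

```python
def unique_capacity_kb(l2_kb_per_core, cores, l3_kb, l3_inclusive):
    """Unique data held across L2 + L3 for one segment, in KB."""
    l2_total_kb = l2_kb_per_core * cores
    if l3_inclusive:
        # An inclusive L3 duplicates everything in the L2s, so L2 adds nothing.
        return l3_kb
    # An exclusive L3 holds only lines not in the L2s, so the capacities add.
    return l3_kb + l2_total_kb


# Assumed sizes: 512 KB L2 per core (inferred), 8 MB L3 per 4-core segment.
exclusive_kb = unique_capacity_kb(512, 4, 8 * 1024, l3_inclusive=False)
inclusive_kb = unique_capacity_kb(512, 4, 8 * 1024, l3_inclusive=True)
print(exclusive_kb, inclusive_kb)
```

With those assumed sizes the exclusive arrangement holds 10MB of unique data per segment versus 8MB if the same L3 were inclusive. The trade-off is that an inclusive L3 makes coherency snooping simpler.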
 

redleader

Ars Legatus Legionis
35,019
[url=http://arstechnica.com/civis/viewtopic.php?p=31756009#p31756009:2fgxys1p said:
mpat[/url]":2fgxys1p]
It would be nice for Intel to actually improve performance on the E series once in a while. We are basically at the point where the only reason for the E series to exist is for people who need more than 4 DIMMs or more than 4 physical cores. Otherwise they're just pointlessly more expensive previous gen CPUs.

The third reason would be PCIe lanes for lots of GPUs, but with nVidia killing SLI for more than two cards, that need is disappearing as well. The E series is mostly for e-peen now.

Extra PCIe lanes are actually the reason I've bought the E series in the past (PCIe 3.0 FPGA boards), but SLI isn't a great use for them. The Z170 doesn't seem to do any worse than X99, and 3-way SLI never really delivered anyway.
 
How about features as they relate to security? Intel has SGX for example.

They haven't announced anything on that level, I think. I suppose they still have that TrustZone/special ARM core for security?

They're doing their own memory encryption thing, though it's got limited utility: http://amd-dev.wpengine.netdna-cdn.com/ ... Public.pdf

And whoever thought adding a coprocessor with more access to system memory than the Intel ME was a good security addition should be dragged out back and shot.
 
And whoever thought adding a coprocessor with more access to system memory than the Intel ME was a good security addition should be dragged out back and shot.
How can the PSP have "more" access than Intel's ME when the latter has full access to the memory? And what makes Intel's ME dangerous is the access to all the computer hardware, including the network interfaces. AMD's PSP has no communication with the external world; all it does is continuously encrypt/decrypt pages without any awareness of the outside world.

AMD's PSP is similar to Intel's SGX except that it has worse granularity, as it only works for full VM instances, while Intel's SGX allows any application to encrypt its memory content and hide it from other applications (including the OS). IMO that's incredibly stupid, as it will allow any virus/malware to hide itself from anti-virus scanners.

More on topic, it looks like Zen will support all the latest ISA instructions, including TSX (transactional memory extensions).
 

mpat

Ars Praefectus
5,951
Subscriptor
[url=http://arstechnica.com/civis/viewtopic.php?p=31770423#p31770423:2ginvqsh said:
Blue Apple[/url]":2ginvqsh]
More on topic, it looks like Zen will support all the latest ISA instructions, including TSX (transactional memory extensions).

That is great in that case. TSX was supported in Haswell but then disabled by microcode (presumably it was buggy). You need Broadwell after a certain revision for it to work, so it is still fairly new. For the server chips, you need Broadwell-E, which was only released in Q2 of this year. If AMD can get that feature working less than a year later, that is much better than their usual lag.
 
[url=http://arstechnica.com/civis/viewtopic.php?p=31770423#p31770423:qrmv1x7o said:
Blue Apple[/url]":qrmv1x7o]
And whoever thought adding a coprocessor with more access to system memory than the Intel ME was a good security addition should be dragged out back and shot.
How can the PSP have "more" access than Intel's ME when the latter has full access to the memory?
The ME doesn't have full access to system memory, it's part of the chipset. Admittedly, this is a somewhat meaningless distinction in 32-bit systems without an IOMMU.

And what makes Intel's ME dangerous is the access to all the computer hardware, including the network interfaces. AMD's PSP has no communication with the external world,
All modern IO interfaces are memory mapped, and the PSP has apparently unlimited access to the memory space.

AMD's PSP is similar to Intel's SGX
The PSP is enabling Secure Encrypted Virtualization (SEV) which AMD will try and position as an SGX alternative. Except SEV doesn't even remotely protect a VM Guest from a malicious hypervisor. What I've been able to find so far about the feature reads like it was engineered to meet a badly written .gov purchasing requirement. SGX has some flaws but at least Intel is trying. AMD is using page tables to try and block memory access by the VMM in charge of managing page tables. o_O


Edit: Regarding SGX, code is cleartext and signed prior to being loaded into encrypted memory. Perhaps AV should stop blanket whitelisting of all signed apps before complaining that SGX breaks signature detection? (Also, AV is fundamentally broken at this point anyway)