PC shuts off during Baldur's Gate?

w00key

Ars Praefectus
5,907
Subscriptor
Per the AMD overlay, I'm not pulling more than 277W during play time. I haven't limited it yet, but will do later today.
Yeah, but that is an average value. Cards since the RTX 3000 series can pull a ton of power in peaks lasting only milliseconds. The RTX 3080 was the first card infamous for this, pulling 500W in peaks of up to 5ms, but it applies to all >200W cards. Your 6900 XT pulls up to 475W in 1ms spikes.

These transients never show up in monitoring tools and overlays; you need an oscilloscope hooked up with current transformers to measure them. Practically, you should add about 40% to the GPU's rated power just to be safe, and a low power limit is a good way to check whether shutdowns are caused by insufficient PSU capacity or an inability to handle power spikes.
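If you want to put rough numbers on that, here's a back-of-the-envelope sketch in Python; the CPU and rest-of-system figures are just placeholders, so swap in your own:

```python
# Back-of-the-envelope PSU headroom check. All numbers are illustrative
# placeholders, not measurements.
GPU_AVG_W = 277          # what the AMD overlay reports during play
TRANSIENT_FACTOR = 1.4   # the ~40% rule of thumb for millisecond spikes
CPU_W = 150              # rough gaming load for a modern CPU (placeholder)
REST_W = 75              # motherboard, fans, drives, RAM (placeholder)

peak_estimate = GPU_AVG_W * TRANSIENT_FACTOR + CPU_W + REST_W
print(f"Estimated worst-case draw: {peak_estimate:.0f} W")

for psu_rating in (650, 750, 850, 1000):
    verdict = "ok" if psu_rating >= peak_estimate else "marginal"
    print(f"{psu_rating} W PSU: {verdict}")
```

With the 277W overlay reading and these placeholder numbers, the estimate lands a bit over 600W, which is roughly why 850W units get recommended for this class of card.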
 
  • Like
Reactions: continuum

IceStorm

Ars Legatus Legionis
24,871
Moderator
The power monitor in the overlay won't have the resolution to see power spikes. You'd need an external piece of hardware to see what the transient spikes are. TechPowerUp's review of a Gigabyte 6900XT showed spikes up to 600W when monitoring at 40 samples/sec.
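To illustrate the resolution point, here's a toy simulation in Python: a single 1ms spike on a steady load, read back at 40 samples/sec (assuming each sample effectively averages its ~25ms window, which is roughly what a slow poll sees). The numbers are made up for illustration:

```python
# Toy illustration of why a slow power monitor misses millisecond spikes.
import statistics

trace = [300.0] * 1000    # 1 second of a steady 300 W load at 1 ms resolution...
trace[500] = 600.0        # ...with a single 1 ms spike to 600 W

WINDOW_MS = 25            # 40 samples/s -> each sample averages ~25 ms
samples = [statistics.mean(trace[i:i + WINDOW_MS])
           for i in range(0, len(trace), WINDOW_MS)]

print(f"True peak:          {max(trace):.0f} W")    # 600 W
print(f"Peak the tool sees: {max(samples):.0f} W")  # ~312 W
```

The spike all but disappears in the averaged reading, which is why you need scope-grade sampling to see it.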

Having said that, the minimum recommended PSU is 850W, but there are people who report having to move up to 1kW PSUs to satisfy their 6900 XTs. The other thing to check is that you're using one PSU connector per 8-pin on the GPU. Don't use the pigtail to chain two inputs to a single PSU output.
 

malor

Ars Legatus Legionis
16,093
The other thing to check is that you're using one PSU connector per 8-pin on the GPU. Don't use the pigtail to chain two inputs to a single PSU output.
Ooh, yeah, good point there. Even with a single-rail supply, it's a good idea to use two separate runs directly from the supply with these high-amperage cards. The wires may not be thick enough to carry enough amperage for more than one 8-pin connector.

That's even more important with a multi-rail supply. Each rail is usually limited to 20 amps total.

edit: well, they used to be, anyway. I haven't bought a power supply in five or six years, and I'm not sure what current multi-rail supply outputs actually look like. They could have more rails, or be going above 20A each.
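For reference, the arithmetic behind that 20A figure, using the nominal PCIe connector ratings (spec numbers only; real cards spike well past these):

```python
# Why a 20 A, 12 V rail gets tight with a big two-connector GPU.
RAIL_AMPS = 20
RAIL_VOLTS = 12
rail_watts = RAIL_AMPS * RAIL_VOLTS            # 240 W budget per 12 V rail

PCIE_8PIN_W = 150                              # rated draw per 8-pin PCIe connector
PCIE_SLOT_W = 75                               # what a card may pull through the slot

card_rated_w = 2 * PCIE_8PIN_W + PCIE_SLOT_W   # a two-connector card, "by the book"
print(f"Per-rail budget:                        {rail_watts} W")
print(f"Two 8-pin card, rated draw:             {card_rated_w} W")
print(f"Headroom if both 8-pins share one rail: {rail_watts - 2 * PCIE_8PIN_W} W")
```

Two 8-pin connectors alone already exceed a 240W rail on paper, before you count the slot or any transients.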
 

mpat

Ars Praefectus
5,951
Subscriptor
Ooh, yeah, good point there. Even with a single-rail supply, it's a good idea to use two separate runs directly from the supply with these high-amperage cards. The wires may not be thick enough to carry enough amperage for more than one 8-pin connector.
At that point it is a design flaw in the PSU. If the manufacturer ships a PSU with two 8-pin connectors on a single cable, that cable MUST be thick enough to carry the full 300W that those two connectors can draw. The reason to use separate cables (and, in effect, separate outlets on a modular PSU) is that it can help mitigate the effect of spikes, but it should never be a safety problem.

That's even more important with a multi-rail supply. Each rail is usually limited to 20 amps total.

edit: well, they used to be, anyway. I haven't bought a power supply in five or six years, and I'm not sure what current multi-rail supply outputs actually look like. They could have more rails, or be going above 20A each.
Designs with multiple 12V rails are pretty much extinct. It was a requirement in ATX 2.0 that no rail was allowed to go over 20A, but this requirement was removed with 2.3 in 2007. Companies kept reusing old multi-rail designs for a while, but those old designs are gone now.
 

malor

Ars Legatus Legionis
16,093
At that point it is a design flaw in the PSU. If the manufacturer ships a PSU with two 8-pin connectors on a single cable, that cable MUST be thick enough to carry the full 300W that those two connectors can draw. The reason to use separate cables (and, in effect, separate outlets on a modular PSU) is that it can help mitigate the effect of spikes, but it should never be a safety problem.


Designs with multiple 12V rails are pretty much extinct. It was a requirement in ATX 2.0 that no rail was allowed to go over 20A, but this requirement was removed with 2.3 in 2007. Companies kept reusing old multi-rail designs for a while, but those old designs are gone now.
There was actually a good reason for doing multiple rails; it helped isolate problems. If one rail failed, any resulting damage would be contained to that rail and anything attached to it. Your other components would be safe, as long as the power supply was well designed.

But dual-rail supplies sucked, because they simply wouldn't drive big cards properly. The cards would have either one or two connectors, and since one rail had to be shared with the motherboard, you'd often end up with an overloaded rail in either scenario (either one rail over 20A, or two rails with the secondary rail forced over 20A because it also had to hold up the motherboard).

So folks with a clue switched to single-rail, which IIRC wasn't an official part of the ATX spec. It was specifically dual-rail supplies that didn't work well, but that ended up being translated as 'buy single-rail'. If tri- and quad-rail supplies really are gone, that's probably why. It's kind of a shame, because they were slightly less likely to wreck your stuff.

But power supplies in general have improved, too, so planning around their failure may not be as important as it was.
 

steelghost

Ars Praefectus
4,975
Subscriptor++
But power supplies in general have improved, too, so planning around their failure may not be as important as it was.
At this point the strategy looks to be for PSUs to sense a variety of failure modes and shut down, rather than just failing catastrophically and accidentally blasting your motherboard with mains AC, or whatever.

Unfortunately, if the "sensing" capability is liable to pick up false positives, you can end up with the kind of problem the OP has. Or it could be the "very high, very short" transient load issue that modern GPUs seem very good at producing. Either way, the OP's PSU may not be "faulty" in the classic sense; it may just not be able to handle their GPU's behaviour.
 
  • Like
Reactions: malor

w00key

Ars Praefectus
5,907
Subscriptor
At this point the strategy looks to be for PSUs to sense a variety of failure modes and shut down, rather than just failing catastrophically and accidentally blasting your motherboard with mains AC, or whatever.

Unfortunately, if the "sensing" capability is liable to pick up false positives, you can end up with the kind of problem the OP has. Or it could be the "very high, very short" transient load issue that modern GPUs seem very good at producing. Either way, the OP's PSU may not be "faulty" in the classic sense; it may just not be able to handle their GPU's behaviour.
When this first became an issue, there was a huge discussion about whether it was the PSU's fault or the GPU's. The "defective GPU" side points out that a PSU shouldn't be expected to handle more watts than its sticker says, that overcurrent protection is right to shut down before something burns, and that the GPUs' stated total board power is bullshit when spikes run up to 50% higher; these cards should have been stickered with a bigger PSU requirement.

The "defective PSU" side points out that there should be enough capacitance to smooth this out, and that OCP should only trigger on sustained overcurrent.


But I tend to agree with the "GPU's fault" side. The RTX 2080 Ti should have been called a 375W card and not a 275W one, for PSU sizing purposes. That way you probably won't randomly run into overcurrent protection shutting down the system. The spikes have only gotten bigger, so PSU manufacturers like Corsair now recommend an 850W PSU with an RTX 4080 (320W). Nvidia itself recommends 750W minimum, so with spikes up to 500W that leaves 250W for the CPU, which should be plenty; you're usually not Cinebenching while gaming.


Both GPU and PSU makers now correctly suggest you install an 850-1000W unit when you get a high-end GPU. Don't be stubborn and count the watts yourself; you probably forgot to look up the peak 1/5/10ms power draw.
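To make that arithmetic explicit, a tiny sketch; the peak figures are the ballpark millisecond-spike numbers from this discussion, not datasheet values:

```python
# Rated board power vs. the millisecond peaks you should actually size for.
cards = {
    "RTX 2080 Ti": {"tbp": 275, "peak": 375},
    "RTX 4080":    {"tbp": 320, "peak": 500},
}

PSU_W = 750  # Nvidia's stated minimum for the RTX 4080

for name, c in cards.items():
    leftover = PSU_W - c["peak"]
    print(f"{name}: {c['tbp']} W on the box, plan around {c['peak']} W peaks "
          f"-> {leftover} W left for CPU and the rest on a {PSU_W} W unit")
```

The 4080 row reproduces the 250W-for-the-CPU margin mentioned above.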
 
Modern video cards can have very peaky power draw, but it's usually in very short bursts. However, if you have a PSU that is pretty much at its limit, fighting voltage ripple while pulling close to max amperage, and also having to cope with a power-hungry GPU peaking above where it should be, that could well be enough to trip the overcurrent protection circuitry in the PSU, and hey presto, you have a PC that shuts down while gaming.

The only real solution for this is to get a better PSU.
 
Yep, and if it's the PSU as expected, it's a 5-minute swap as long as you make sure the cables are compatible. Corsair is generally good stuff, though they have a huge range like any other large OEM.
Since you're ruling out possibilities it's far better to change everything out including the cables.

Personally I typically buy professional workstations for pretty much everything (including gaming these days), tossing out the workstations before the 5-year warranty runs out... and only build for specialised tasks. And I leave the ruling out to the onsite support, or an overnighted component that typically takes seconds to exchange (including the PSU since it's on a backplane with a single-latch release). Better uptime, better stability, no clown lights to boot.
 
Last edited:
  • Wow
Reactions: JohnCarter17

malor

Ars Legatus Legionis
16,093
Since you're ruling out possibilities it's far better to change everything out including the cables.

Personally I typically buy professional workstations for pretty much everything (including gaming these days), tossing out the workstations before the 5-year warranty runs out... and only build for specialised tasks. And I leave the ruling out to the onsite support, or an overnighted component that typically takes seconds to exchange (including the PSU since it's on a backplane with a single-latch release). Better uptime, better stability, no clown lights to boot.
And probably a lot more money than consumer PCs. You can easily spend $20K on a machine like the one you describe.
 
  • Like
Reactions: continuum

NervousEnergy

Ars Legatus Legionis
10,547
Subscriptor
Since you're ruling out possibilities it's far better to change everything out including the cables.

Personally I typically buy professional workstations for pretty much everything (including gaming these days), tossing out the workstations before the 5-year warranty runs out... and only build for specialised tasks. And I leave the ruling out to the onsite support, or an overnighted component that typically takes seconds to exchange (including the PSU since it's on a backplane with a single-latch release). Better uptime, better stability, no clown lights to boot.
I've bought many, many Dell Precision workstations that cost in the $15K to $20K range for our plant engineering groups, and before that DEC Alpha workstations for the first versions of ProEngineer and ProMechanica. The Alphas were... lord... 26 years ago? There's a lot to be said for that approach, and if you've got the money and are willing to leave a fair bit of consumer performance on the table in exchange for rock-solid stability, then it's a good deal.

For gaming, though, the exchange is horrendous value. You can't spec them with consumer GPUs; you have to go with A4000, A6000, etc. These are 10X as expensive and slower than RTX 40-series cards, all for the pleasure of running certified drivers. If you spec a base model 2GB card with the intention of throwing it out for a consumer GPU, then you've lost that sweet warranty coverage for the card and for any instability the GeForce drivers may cause. Any decent amount of RDIMM memory costs more from Dell than a nice used car. Etc...

I do love the Precision integrated case, with its dual PSU backplane, integrated cabling and power management, hot-swap drive cages, fantastic cooling, etc. They're beautiful machines with great business-class warranty support. I've also gutted a few of their memory and GPUs and installed consumer-grade stuff, and they've continued to run fine. But then... what's the point? Better to call Alienware and just buy 2 top-of-the-line gaming systems and if one has issues switch to the other one. You'll be out less and have considerably better performance.
 
I've bought many, many Dell Precision workstations that cost in the $15K to $20K range for our plant engineering groups, and before that DEC Alpha workstations for the first versions of ProEngineer and ProMechanica. The Alphas were... lord... 26 years ago? There's a lot to be said for that approach, and if you've got the money and are willing to leave a fair bit of consumer performance on the table in exchange for rock-solid stability, then it's a good deal.

But then... what's the point? Better to call Alienware and just buy 2 top-of-the-line gaming systems and if one has issues switch to the other one. You'll be out less and have considerably better performance.
I have bought some Precisions in the past, but they always struck me as very agricultural: built for utility, and not always the best in terms of look, finish, and feel, even for a pro's tool where you don't necessarily care as much about that stuff. I started buying Precisions with the T7400, and I still remember being bitterly disappointed that, for what I spent (especially compared to the XPS 7x0 H2Cs I had as gaming PCs back then), the machine stayed upright thanks to basically a single giant bent-wire coat hanger... and that "design" philosophy seems to have stuck around with them.

Went HP/Lenovo after the introduction of the Z series / P series, pretty much exclusively, for home use where I wasn't building something more specialized. As for "no consumer GPUs": it's not some magic trick to install a regular GeForce when you want to, and in this situation, if there's instability, you only have one component to troubleshoot yourself before calling warranty support. Obvious, no?

As for Alienware / Omen / etc., I've owned a few for gaming-on-the-side kinds of uses, like VR, and it's easy to see that they have different priorities for their consumer machines. It'll be a no from me for something I spend most of my time on.

And probably a lot more money than consumer PCs. You can easily spend $20K on a machine like the one you describe.
I mean, you could, and I happen to, since my main/gaming home machines are current flagship-spec workstations for both Xeon and Threadripper. But you don't have to, especially if you're just going toe to toe capability-wise with a consumer machine.
 

mpat

Ars Praefectus
5,951
Subscriptor
So here are some settings I messed with today:

I re-played a scene (Act 2, where you meet the Spider guy) where it hard shut off before, three times in a row. And this time, it did not. So, a very unscientific test, but so far so good.

If it happens again I'll let you guys know.

That is very good to hear. What does the AMD overlay say about your power draw now? Because if it said 277W before, I think your card may have had the power limit increased by the manufacturer. The GPU itself is supposed to be power limited to 255W by default.
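If you want to sanity-check that, it's a trivial calculation (255W being the reference power limit mentioned above, 277W the overlay reading):

```python
# Does the reported board power imply a raised factory power limit?
REFERENCE_LIMIT_W = 255   # reference 6900 XT power limit mentioned above
reported_w = 277          # what the overlay showed

delta_pct = (reported_w - REFERENCE_LIMIT_W) / REFERENCE_LIMIT_W * 100
if reported_w > REFERENCE_LIMIT_W:
    print(f"Reading is {delta_pct:.1f}% above the reference limit; "
          "the card likely ships with a raised (factory OC) power limit.")
else:
    print("Reading is within the reference limit.")
```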
 

malor

Ars Legatus Legionis
16,093
I mean, you could, and I happen to, since my main/gaming home machines are current flagship-spec workstations for both Xeon and Threadripper. But you don't have to, especially if you're just going toe to toe capability-wise with a consumer machine.
I think the minimum to get into one of those is about $5K, and a machine like that is probably going to kind of suck for gaming, since it will have a weak video card. Compare that with a $3K self-built machine (I'm assuming a $1600 4090), and the homebrew version is going to blow the doors off the expensive machine for gaming, and possibly for a fair number of other tasks as well.

The enterprise-class machines are nice, but don't make a lot of financial sense as a personal workstation unless you're directly generating revenue with them, and downtime is very expensive. If you're gonna lose, say, $5K/day if your computer breaks, then spending $20K on a machine might make perfect sense, if the service contract is good enough.

Probably not too many of the readers here are in that position, however. Even when we're making money with our machines, we can typically just fall back to a laptop while taking a couple days to order in parts to fix the desktop. I would personally much rather use workstation money on, say, a new car, or a boat, or a bigger house, or whatever.
 
Last edited:
So here are some settings I messed with today:

I re-played a scene (Act 2, where you meet the Spider guy) where it hard shut off before, three times in a row. And this time, it did not. So, a very unscientific test, but so far so good.

If it happens again I'll let you guys know.
AAANNNDDDD shut off again later into Act 2. Lost a ton of progress. I hate everything.
 
  • Sad
Reactions: abj
Seasonic has made more effort than most with their 12VHPWR cable and connector, and they have a new cable that meets the new Intel standard, which they will give you if you ask for it.

 
  • Like
Reactions: continuum

mpat

Ars Praefectus
5,951
Subscriptor
850W isn't always 850W as far as PSUs go, if you see what I mean. I always slightly overprovision with a top-end Seasonic PSU if building (I mostly don't build, as per my post above) to avoid issues.
I’m also in favor of Seasonic, but I find shopping for PSUs to be a bit too much of “buy what I always have”. Does anyone test how PSUs handle transient spikes?
 

malor

Ars Legatus Legionis
16,093
Yes, there are tests from time to time that you can dig up. I've tended to go with Seasonic's traditional rep (and not go with their lower-end models), and that hasn't failed me yet.
Yeah, if you peruse the PSU Tier List, you'll see a lot of low-ranked Seasonic products. Only their premium models seem to be truly good, these days.
 
So.....

Shut off last night while trying to rescue the Tieflings in the Moonrise Towers jail. However, when I started up this morning, AMD Adrenalin said it had reverted back to default settings due to a system failure. That never happened with the older PSU.

If this happens again, would I then assume it's the video card? It ONLY started doing this during BG3, and it still only does it during BG3.
 

io-waiter

Ars Tribunus Militum
1,513
The driver reverts the "OC" tuning back to default if the machine reboots or powers off uncleanly. If you had been running at default driver settings for all tuning, it would not have shown a reset message; it's a safety measure so you don't end up in a crash loop.

Did you still have the reduced power limit settings?

DX11 is more stable for me on AMD; Vulkan had regular crashes, but those were just the game engine, not the whole system.

When swapping the PSU, were all the cables swapped as well?