PC shuts off during Baldur's Gate?

malor

Ars Legatus Legionis
16,093
Boy, that's frustrating. I would really have expected the power supply to fix that. You could probably still return it and go back to the old one.

From there, it's either motherboard, CPU, or RAM. I would guess that RAM is probably not the cause, since BG3 is the only game doing that, and it's not just a crash, it's an instant hard poweroff. I could see good arguments for either of the other two; maybe the motherboard isn't able to supply enough power, maybe something's busted on the GPU.

Since replacing the motherboard is the single most painful thing to do on a PC, I'd probably try testing with another graphics card first, if you have one available. If you need to buy one, those are also easy to box up and return if they don't fix the problem.
 

NervousEnergy

Ars Legatus Legionis
10,547
Subscriptor
I'd be completely flabbergasted if it wasn't the power supply too. Failing VRMs could be easily tested by stressing the CPU with any freely available stress test routine. It's been a while, but I think I last used Prime95. That would expose power or cooling faults, as not even the most intense scenes in BG3 would stress a CPU like P95. I'd seriously doubt that's it, but it's easily tested.

My money would now be on the GPU, since IIRC you'd found a couple of high-intensity panning scenes that could reproduce the issue. Have you tried something like FurMark to see if that crashes it?

Like the Ship of Theseus, we'll soon have you building a whole new PC, piece by piece.
 

theevilsharpie

Ars Scholae Palatinae
1,199
Subscriptor++
Dumb, obvious question that we should probably have asked: are you sure your PC isn't actually losing power?

Even a brief brownout on the circuit it's plugged into could be enough to either trigger a PSU safety mechanism or starve it of enough voltage that it can't support the load anymore.

My PCs (including my gaming PC) are on a UPS, so this isn't something I've had to worry about. If yours isn't, you may want to consider moving other devices off of the circuit the PC is running on, even if only to test.
 

cerberusTI

Ars Tribunus Angusticlavius
6,449
Subscriptor++
That is a good note, and it is not always obvious either.

I had two UPS units click on at the same time today to handle what I assume was a voltage sag, judging by the low numbers that appeared on their screens. It was not noticeable in other ways, but it did cause my file server, which is not on a UPS, to crash (I should really refresh that battery and put it back on one).

I checked my video card (and system) with https://www.ocbase.com/ before turning on ECC and running some long compute tasks on a 4090. I never pushed anything to the point it failed, and it came back clean in all ways so there was nothing to show, but it was easy to use and has video card stability tests.

Instant power off really does suggest a motherboard to me if it is not a power supply issue though (or a power to the power supply issue).
 
Dumb, obvious question that we should probably have asked: are you sure your PC isn't actually losing power?

Even a brief brownout on the circuit it's plugged into could be enough to either trigger a PSU safety mechanism or starve it of enough voltage that it can't support the load anymore.

My PCs (including my gaming PC) are on a UPS, so this isn't something I've had to worry about. If yours isn't, you may want to consider moving other devices off of the circuit the PC is running on, even if only to test.

yah, going into a UPS, no clicking and everything else on the desk is powered just fine.
 

cerberusTI

Ars Tribunus Angusticlavius
6,449
Subscriptor++
How much can you swap out?

The last two intermittent power off issues I had resulted from:

1) A video card, long ago (a 9800 Pro I believe), which did not handle transitioning from low power to max power repeatedly very well. Some games crashed it, and setting it to not go into its lowest power state fixed it.

2) A keyboard short, which eventually killed the USB port on the motherboard, and the system was never entirely stable again (it was old at the time and I replaced it as a main computer, but the intermittent crashes continued until it was taken out of service and not everything on the board worked).

Peak power use for a modern video card can be high (my 4090 reports the max power over time as 309 watts since power on, with the peak power use listed as 675 watts), but trying two supplies and reconnecting everything makes that less likely.

If your MB had three power plugs, did you connect them all?

If you run hwinfo, which MB voltages are reported, and what kind of range do they have?
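
For anyone who would rather log the sensors during a BG3 session and summarize afterwards, here is a minimal sketch that reads a CSV sensor log and prints the min/max of each voltage column. It assumes the log has one header row and that voltage columns end in "[V]"; check that against your own log file. The function name and the sample readings are made up for illustration.

```python
import csv
import io


def voltage_ranges(csv_text):
    """Return {column: (min, max)} for every column whose header ends in '[V]'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    ranges = {}
    for row in reader:
        for name, raw in row.items():
            # Assumption: voltage sensors are labelled with a trailing "[V]".
            if name is None or not name.strip().endswith("[V]"):
                continue
            try:
                value = float(raw)
            except (TypeError, ValueError):
                continue  # skip blank cells or non-numeric rows
            key = name.strip()
            lo, hi = ranges.get(key, (value, value))
            ranges[key] = (min(lo, value), max(hi, value))
    return ranges


# Made-up sample readings, not real measurements.
SAMPLE = """Time,+12V [V],+5V [V],GPU Temperature [°C]
10:00:01,12.096,5.040,61
10:00:02,11.808,5.010,74
10:00:03,12.000,5.020,78
"""

for name, (lo, hi) in voltage_ranges(SAMPLE).items():
    print(f"{name}: min {lo:.3f} V, max {hi:.3f} V, swing {hi - lo:.3f} V")
```

A rail that dips well below its nominal value right before the shutdown would be worth mentioning here. Software sensor readings are only a rough indicator, though.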
 
Well that was completely unhelpful. Technical help threads are NOT the place for tribalism.
This is why you put a real video card in your system, not some overpriced AMD half-baked toy!

Very amusing considering all the bullshit AMD and their fanboi's have come out with regarding rebar and how they invented it and it works better for them!

Searched online and couldn't find a way to disable rebar for a specific application in AMD drivers, but with nvidia it's pretty easy to do with nvidia profile inspector.

Two months of hair pulling and trying different solutions for Pocky all because AMD's driver team couldn't code their way out of a wet paper bag!
 

Attachment: NVinspector REBAR.jpg

IceStorm

Ars Legatus Legionis
24,871
Moderator
Anyone else find something on Resizable BAR causing issues for them?
The only time I had ReBAR cause an issue for me, it was due to a marginal riser cable. It didn't cause the system to shut down, just caused games to crash. This was using an nVidia card.

If you have a B550 or X570 board, try lowering the PCIe generation on the PEG slot to 3.0, then re-enabling ReBAR. If it's a 400 series or 300 series board, then the slot's already at Gen 3.
 

NervousEnergy

Ars Legatus Legionis
10,547
Subscriptor
Wild that it was a BIOS setting. I would have thought that would cause some sort of driver panic that would show up in the logs with a blue screen, not just... 'power off'.

I experienced one physically cracked mobo in the past. I imagine that could do it under thermal cycling or vibration. That was a frustrating intermittent problem until it wasn't (intermittent).
Long ago back in the Pentium II days I had a new MB where one of the screw mounting holes was slightly off center, and the little flat-head machine screw just barely touched a MB trace that ran near it. Took forever to figure that out, but was obvious once we did. I hate those kinds of problems.
 

malor

Ars Legatus Legionis
16,093
Hooray for useless power supply purchases.
Sorry for the bad advice. It really sounded like a bum supply, especially since you'd just installed a new one and then started having issues.

Doing BIOS updates might fix the actual problem. Resizeable BAR isn't a huge performance gain, but can be noticeable, so getting that working properly would improve framerates somewhat. It's like a 5 or maybe 10% speed boost for free. It doesn't work in every game, but helps with many.

edit: what motherboard is it, btw?
 

continuum

Ars Legatus Legionis
94,897
Moderator
elsewhere that forcing it down to PCIe 4x or something might help
Probably try PCI-e 3.0 instead of PCI-e 4.0 is what you're thinking.

How do I check what the riser cable from my case tolerates
Check the cable for a part number and look it up; check the invoice/spec sheet for the cable/case (if it came with the case) as well.
 
Probably try PCI-e 3.0 instead of PCI-e 4.0 is what you're thinking.

Check the cable for a part number and look it up; check the invoice/spec sheet for the cable/case (if it came with the case) as well.

Looking at the case specs, it is already a "high quality" PCIe 4.0 cable. So I doubt that is it. I might try and reseat everything and give it a whirl. What girds my loins about this is that it is ONLY BG3. Running Furmark for hours is fine. Stupid windows!
 

hobold

Ars Tribunus Militum
2,657
What girds my loins about this is that it is ONLY BG3. Running Furmark for hours is fine.
Furmark is a static scene. Presumably it can live entirely on the GPU, without a whole lot of data moving across the PCIe connection. The game, in contrast, probably transfers a lot of assets into VRAM as you travel the game world.
 

malor

Ars Legatus Legionis
16,093
The problem is that you don't actually know what the cable quality is. We don't have any way to test them, unfortunately; machines that could do that would cost $KIDNEY.

I like to avoid cases that use risers or riser cables, because they're adding another potential point of failure. It's usually fine, but it adds something you may potentially have to troubleshoot.

I think the fact that turning off Resizeable BAR improved your symptoms is probably a big honking clue about what's actually wrong. Unfortunately, I can't think of how to interpret that clue. Other than suggesting you do all your BIOS updates (remember that the card can have firmware updates too; this 3070, for instance, didn't support Resizeable BAR when I first got it), I'm not sure where to go from there.

It's a big flashing red light, but it's not attached to anything, at least in my mental model of PC troubleshooting. Well, other than motherboard BIOS and card firmware, anyway.
 

io-waiter

Ars Tribunus Militum
1,513
I’d run prime95 and set it so that it uses almost all RAM as well as all CPU. Then, if stable, step it up to running furmark and prime at the same time; make sure to tune back the resource usage of prime a bit to let furmark run full tilt while still maintaining high CPU and RAM load.

Thing is, since it "resets" with a cold reboot, it's not that the OS sees a fault; it's hardware. This strongly points to electrical issues.
 

mpat

Ars Praefectus
5,951
Subscriptor
Even if you consider it unlikely, switch down to PCIe 3.0 anyway. It isn't going to cost you anything (the game isn't going to bottleneck on 3.0 x16), and if it works, it works.
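
For a rough sense of what dropping a generation gives up, here is a back-of-the-envelope sketch of raw x16 link bandwidth. It uses the published per-lane rates (8 GT/s for 3.0, 16 GT/s for 4.0) and the 128b/130b line encoding, and it ignores packet and protocol overhead, so real throughput is lower. The function name is just for illustration.

```python
def pcie_gbytes_per_s(gt_per_s, lanes):
    """Approximate one-direction bandwidth for PCIe 3.0 and later (128b/130b encoding)."""
    usable_bits_per_s = gt_per_s * 1e9 * lanes * 128 / 130
    return usable_bits_per_s / 8 / 1e9  # bits -> bytes, then to GB/s


gen3_x16 = pcie_gbytes_per_s(8, 16)
gen4_x16 = pcie_gbytes_per_s(16, 16)
print(f"PCIe 3.0 x16: {gen3_x16:.2f} GB/s")
print(f"PCIe 4.0 x16: {gen4_x16:.2f} GB/s")
```

So Gen 3 halves the raw link rate. Whether a given game ever gets near that ceiling is a separate question that this arithmetic does not answer.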
 

hobold

Ars Tribunus Militum
2,657
I think the fact that turning off Resizeable BAR improved your symptoms is probably a big honking clue about what's actually wrong. Unfortunately, I can't think of how to interpret that clue.
Yes, that ReBAR thing is what steered me towards suspecting PCIe problems, too.

I am not qualified to interpret the ReBAR pseudo-symptoms, but I have a vivid imagination about one single wire or contact of the PCIe connection being unreliable.

When ReBAR is off, at most 256MB of VRAM can be visible to the CPU at any single point in time. Any transfers into that window thus have relatively short addresses (= offset from window start address), and any transferred data blocks have relatively short length (they have to fit within the window).

When ReBAR is on, all of the VRAM can be visible and addressable at the same time, and larger block transfers can be sent across PCIe.

So my unsubstantiated fantasy is that the single unreliable wire or contact affects one of the high address bits. And when ReBAR is on, wide addresses are being used, in particular the flaky wire is responsible for transferring an address bit. When it drops out, hilarity ensues.

(The hypothetical defective wire is being used even when ReBAR is off, but only for data. So the failure mode then is less spectacular; a miscolored pixel or texel every now and then, or occasionally a displaced vertex.)
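
To put numbers on the window-size argument above, here is a small sketch of how many offset bits each case needs. The 16 GB card is a hypothetical size chosen for illustration, since the thread does not state the card's VRAM. This is only the arithmetic; it says nothing about whether a marginal connection would actually fail this way.

```python
def offset_bits(window_bytes):
    """Bits needed to address every byte offset inside a window of this size."""
    return (window_bytes - 1).bit_length()


MIB = 1024 ** 2
GIB = 1024 ** 3

small_window = offset_bits(256 * MIB)  # ReBAR off: 256 MB window
full_vram = offset_bits(16 * GIB)      # ReBAR on, hypothetical 16 GB card
print(f"256 MB window: {small_window} offset bits")
print(f"16 GB of VRAM: {full_vram} offset bits")
print(f"extra high-order bits in play with ReBAR on: {full_vram - small_window}")
```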
 

malor

Ars Legatus Legionis
16,093
So my unsubstantiated fantasy is that the single unreliable wire or contact affects one of the high address bits. And when ReBAR is on, wide addresses are being used, in particular the flaky wire is responsible for transferring an address bit. When it drops out, hilarity ensues.
That sounds like a possible explanation, certainly worth testing. It may not be correct, but it's a lot better than being totally stuck.