Disable NPU in BIOS?

Lord Evermore

Ars Scholae Palatinae
1,490
Subscriptor++
Can the Neural Processing Unit (dumbass name) on Intel and AMD CPUs generally be disabled in the UEFI on most motherboards? Desktops and laptops? I don't expect to ever need those functions (though I probably won't be upgrading to a CPU that has one for quite some time anyway; maybe 5 years down the line it will be unavoidable, since they'll just be integrated into common tasks), and it seems that having it enabled, even when not running a workload, would add a small amount of heat unless the CPU automatically and completely disables it when not active.

I found one Intel PDF that mentioned disabling it in the BIOS because it won't work in Windows 10, but it isn't clear if not disabling it actually would cause a problem, like AI workloads failing because they want to use the NPU but can't or something like that, or the OS freaking out or just showing some unknown device in Device Manager. (It doesn't look like it's necessarily fully supported in Linux, either.)

Is it likely that either of them will release models without an NPU (either disabled because it turned out defective or missing the chiplet/tile completely)? They've both done this with models that have the IGP disabled. I think it's less likely with the NPU because they're invested in the AI push, where you WILL USE AI whether you want to or not, whether it benefits you or not, and allowing you to have a PC that doesn't include an NPU gives the lie to the assertion that people won't be able to live without it. And of course Microsoft is trying to make it seem like you have to have it for their AI software that nobody asked for. I imagine even allowing it to be disabled in the UEFI is only done due to the possibility that people will still want to install an OS that doesn't support it and it might cause problems.
 
  • Like
Reactions: Made in Hurry

Made in Hurry

Ars Praefectus
4,553
Subscriptor
I think the NPU is going to be a required part to even boot Windows in the future, as it seems Microsoft is going gung-ho on everything AI related.
It will probably not happen with the first release of Windows 12, but with some service pack or major update a few years in. RAM requirements are also going up quite a lot, at least for Windows; I have seen that 23H2 requires more memory than previous updates.

I think this is going to create some headaches for the Linux community, to be honest, as there seems to be at least a small, but real, movement towards migrating to Linux these days.
I am not an advocate in that sense, I am just looking at the crystal ball, and I doubt it will be possible to disable it. Personally I am slowly migrating to Linux, but more to become platform independent, as I have no interest in anything AI at all, or a Copilot button, etc.

I am not investing in any hardware these days until the dust settles and I can see how computing will be going forward. Apple is implementing it, but hopefully in a non-intrusive way, so personally I am going to have to make a decision on what platform I am going to invest in going forward.

With Qualcomm and ARM, Windows 10 EOL and the AI craze, these are interesting times going forward.
 

hobold

Ars Tribunus Militum
2,657
I told a joke. I laughed. My wife laughed. The toaster laughed. I shot the toaster.
This could be a contender for the shortest dystopian SF story that is relevant to today's society. :)

The current crop of AI accelerators are not intelligent, though; not even smart. Most of them are just hardwired low precision multipliers and adders for matrix calculations.
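
To make that concrete: a minimal NumPy sketch of the kind of int8 multiply-accumulate those hardwired units perform (purely illustrative, not any vendor's actual datapath):

```python
# Illustrative only: the core operation an NPU hardwires is a low-precision
# multiply (int8 here) accumulated into a wider register (int32) so the
# products don't overflow.
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(64, 64), dtype=np.int8)   # e.g. int8 weights
b = rng.integers(-128, 128, size=(64, 64), dtype=np.int8)   # e.g. int8 activations

acc = a.astype(np.int32) @ b.astype(np.int32)                # accumulate in int32
print(acc.dtype, acc.shape)                                  # int32 (64, 64)
```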
 

whoisit

Ars Tribunus Angusticlavius
6,565
Subscriptor
This could be a contender for the shortest dystopian SF story that is relevant to today's society. :)

The current crop of AI accelerators are not intelligent, though; not even smart. Most of them are just hardwired low precision multipliers and adders for matrix calculations.

If they do that, I hope they dig up who originally said it. I heard it somewhere long ago.
 

steelghost

Ars Praefectus
4,975
Subscriptor++
I wonder if we should think twice about putting AI hardware in everything. I can see a market for those of us who don't want it.
It's nothing more than an accelerator for certain types of calculations, just like the "FPU" was back in the day, or the GPU is today.

Of course there are architectures like RISC-V that will allow people to have any set of functional blocks they like in their CPUs, so you'll be able to get something without "an NPU" if you want it, but the classes of problem that can be addressed with a neural network approach are too useful. I think CPUs that don't have one will be at a real disadvantage, because things that people expect to work will be slower, more error prone or just plain won't work.
 

Lord Evermore

Ars Scholae Palatinae
1,490
Subscriptor++
I have seen that 23H2 requires more memory
Requires, or just appears to use? Windows started loading as much as it could into memory at all times years ago, making it appear that the RAM was "in use" when it was really just pre-loading data so that applications would have it instantly available if needed; that memory is released whenever it isn't actively being used and an application needs other data loaded. That said, if it's going to require that AI bullshit always be active, RAM requirements will definitely go up.

If they do make it a requirement to even install/boot, they may do the same that happened with Windows 11's TPM and other requirements, where it's possible to install without it, but it won't be "supported". Then the next version, or a major update a couple of years in, will disable that workaround.

I wonder if we should think twice about putting AI hardware in everything. I can see a market for those of us who don't want it.
That sounds like paranoia that the AI hardware is going to do something bad. The hardware won't inherently do anything on its own, other than wasting space on the CPU package and generating heat, and the software can be made to run AI workloads whether there is dedicated hardware or not; it will just run on the GPU (possibly faster) or the CPU (much slower). Of course, when AI hardware is integrated into random other devices with custom chips, the hardware could have "software" embedded into it that would run without "installing" anything, but then you could technically build a CPU that has the entirety of Windows "in hardware", if you wanted a processor the size of a small house. That's how the first computers were built, and that's what an ASIC is.
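
As a hedged sketch of that point: with something like ONNX Runtime, the same model just runs on whichever execution provider the installed build exposes, and falls back to the plain CPU provider if there's no NPU or GPU support. The model path and the NPU/GPU provider names below are examples, not a recommendation:

```python
# Sketch only: run the same ONNX model on whatever accelerator the installed
# onnxruntime build exposes, falling back to the plain CPU provider.
# "model.onnx" is a placeholder; provider names depend on the build you install.
import onnxruntime as ort

preferred = [
    "QNNExecutionProvider",       # Qualcomm NPU builds (assumption)
    "OpenVINOExecutionProvider",  # Intel CPU/GPU/NPU builds (assumption)
    "DmlExecutionProvider",       # DirectML (GPU) on Windows
    "CPUExecutionProvider",       # always available
]
available = set(ort.get_available_providers())
providers = [p for p in preferred if p in available] or ["CPUExecutionProvider"]

session = ort.InferenceSession("model.onnx", providers=providers)
print("Running on:", session.get_providers()[0])
# session.run(None, {input_name: input_array}) works the same either way.
```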
 

Made in Hurry

Ars Praefectus
4,553
Subscriptor
Requires, or just appears to use?
Maybe it is just on my end, but I have noticed on some installs, and not only mine, that after 23H2, memory usage is generally higher from boot. I have been curious what performance would be like on an 8GB machine, as usage often surpasses 8GB by a good margin while the system is doing nothing. I would be curious what others see on their machines. The systems I have noticed it on have 16GB+.
It's nothing more than an accelerator for certain types of calculations, just like the "FPU" was back in the day, or the GPU is today.

Of course there are architectures like RISC-V that will allow people to have any set of functional blocks they like in their CPUs, so you'll be able to get something without "an NPU" if you want it, but the classes of problem that can be addressed with a neural network approach are too useful. I think CPUs that don't have one will be at a real disadvantage, because things that people expect to work will be slower, more error prone or just plain won't work.
Wouldn't that junk a fair chunk of hardware yet again, or do you think the benefits of the NPU far outweigh that?
I really wonder how the Linux distributions will handle this.
 

Lord Evermore

Ars Scholae Palatinae
1,490
Subscriptor++
The clock and power gating on these things is really advanced. You will never notice its power draw. A potential outcome of monkeying with it in the BIOS will be that because it's disabled it can't be powered down, which is the opposite of what you intend.
I would assume that disabling it signals the controller circuits in the CPU itself to simply always power it down, and not even reveal its existence to software, rather than just letting it run but somehow blocking software from accessing it. You can disable the IGP in the BIOS as well (or at least you could in the last systems where I had an IGP), and I can't imagine it was just sitting there cycling power through it, keeping it "alive" but not performing any work. I'm sure it is just a tiny bit of power, but we're always being told that every little bit helps.
 

Aeonsim

Ars Scholae Palatinae
1,057
Subscriptor++
So both AMD and Intel are putting AI in their CPUs. Why not put the NPU on a PCIe card instead?
That's what a GPU is, or, if you're talking more specifically about lower-precision accelerators that can't also do graphics, then something like the Google Coral. They want the function more tightly integrated into the CPU for lower power use and for marketing.
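
For reference, this is roughly what driving a discrete accelerator like the Coral looks like from Python. A hedged sketch assuming the pycoral helper package and an Edge TPU compiled model; the model filename is a placeholder:

```python
# Sketch, assuming the pycoral package and an Edge TPU compiled model:
# the accelerator sits on USB/PCIe and TensorFlow Lite dispatches work to it
# through a delegate rather than the unit being integrated into the CPU.
from pycoral.utils.edgetpu import make_interpreter

interpreter = make_interpreter("model_edgetpu.tflite")  # placeholder model file
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]["shape"])
```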
 

Lord Evermore

Ars Scholae Palatinae
1,490
Subscriptor++
That's what a GPU is, or, if you're talking more specifically about lower-precision accelerators that can't also do graphics, then something like the Google Coral. They want the function more tightly integrated into the CPU for lower power use and for marketing.
They want to be able to force AI into everything now (because they've run out of ideas for other new useful features, this is the hot new trend, and they need something to make sales), regardless of whether users need or want it, and that means the computer has to support it at a minimum capability level. Integrated graphics can't accelerate it well enough, or at low enough power, and until the current generation of AMD processors they couldn't depend on even a minimal IGP existing. (Not that Microsoft would necessarily hold off on anything just because of AMD, given their market share.)

If it depends on the customer choosing to purchase an extra card later, or having to pay extra to an OEM for an add-in card, then it won't happen unless they can come up with a "killer app". If it's already baked into the price of a required component, then customers will just have to pay it, and they won't know what fraction of that price was due to the NPU they didn't need.

They haven't come up with anything that people really need or want. Copilot and Gemini and all the other AI horseshit that companies are coming up with for consumers would be of interest to only the tiniest fraction of users if it wasn't being shoved down their throats, and the only way it will get any use is if they make it impossible not to use it, baking it into standard functions we can't avoid or making it the default that users won't think to change. Microsoft can put the product into the OS, and then has the power to ensure that OEMs and CPU vendors follow along with the hardware to make it actually work at a usable level on every machine, along with all the marketing to make people think it's something valuable. The same happens on the Apple front, and the Android front, and the ARM front.
 
  • Like
Reactions: whm2074

Aeonsim

Ars Scholae Palatinae
1,057
Subscriptor++
I am trying to understand the benefits, and a part of me just thinks this is a new AltiVec/SIMD/3DNow!/MMX, etc., but how it is going to be implemented is of more interest than the actual NPU itself. Is it going to be used to scrape more data from you, or does it actually have any decent benefits?
At the moment there are minor benefits, mainly around battery life. Having it built into the SoC/CPU means it can be on a very advanced process node and little energy is needed to send data to and from it. The Apple Silicon Macs have had them from the get-go, but they seem to have very limited use cases; at the moment they mainly seem to be used for processing video effects when using the Mac's camera. It tends to use only a single watt for this, while the CPU cores would likely need more to deliver the same effect, especially if they have to fire up a performance core.

Apparently a number of machine learning algorithms in Apple's CoreML library also use the NPU/ANE; the general advantage is okay performance at a lower power cost than having to fire up one or more performance cores or the main GPU to deliver the same effect. It's mostly useless if you have a desktop with a decent GPU or even a reasonable integrated GPU, but the energy savings can be noticeable on a battery-powered device. And for some matrix-heavy math algorithms it probably gives you more performance than a CPU but less than a GPU.
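
As a rough illustration of how that looks from the developer side, a sketch assuming the coremltools Python package on macOS (the model path is a placeholder, and Core ML still decides at runtime what actually lands on the ANE):

```python
# Sketch: ask Core ML to prefer the Neural Engine (ANE) when loading a model.
# Core ML still chooses at runtime whether each layer actually runs there.
import coremltools as ct

model = ct.models.MLModel(
    "SomeModel.mlpackage",                    # placeholder model path
    compute_units=ct.ComputeUnit.CPU_AND_NE,  # CPU + Neural Engine only
)
# ct.ComputeUnit.ALL would let Core ML pick freely between CPU, GPU and ANE.
```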

It's a lot more than SIMD/MMX/AVX; those were/are extra specialised instructions that still run on the primary CPU, whereas an NPU is a completely different compute pathway, optimised for (and limited to) processing matrices and other specific mathematical operations on specialised data types, and it can run independently. More of a DSP or co-processor.
 
  • Like
Reactions: Made in Hurry

Aeonsim

Ars Scholae Palatinae
1,057
Subscriptor++
I am trying to understand the benefits, and a part of me just thinks this is a new AltiVec/SIMD/3DNow!/MMX, etc., but how it is going to be implemented is of more interest than the actual NPU itself. Is it going to be used to scrape more data from you, or does it actually have any decent benefits?
You can read https://github.com/hollance/neural-engine for more info about the Apple ANE/NPU.
 

Lord Evermore

Ars Scholae Palatinae
1,490
Subscriptor++
I am trying to understand the benefits, and a part of me just thinks this is a new AltiVec/SIMD/3DNow!/MMX, etc., but how it is going to be implemented is of more interest than the actual NPU itself. Is it going to be used to scrape more data from you, or does it actually have any decent benefits?
In this case, the software that's going to use the features is already in place, to some degree. With those you mentioned, there were things that just couldn't be done well because the CPUs and GPUs couldn't do it, so to a large degree the hardware had to be made available before any workloads could be created. The hardware was made with the anticipation that software would later come out to take advantage of it. Then for many reasons, the software never appeared, at least not to a huge degree. There would be some flagship product that used MMX or 3DNow! or SSE, stuff they could show in marketing that ran so much better with that enabled, but the majority of it never really appeared or only became useful with certain applications. Sometimes other technology was developed suddenly that made things like you mentioned nearly obsolete before they could take hold.

But there are already AI workloads, even for consumer devices, that were ready to go, just waiting for hardware implementation of an NPU to allow them to take off (at least the creators hope they'll take off with consumers). They were already running, unoptimized, using the hardware that was in place like the CPU or GPU or in the cloud. There aren't a LOT of them, but they exist, and some people are using them and want them to run better, faster, with lower power, on local hardware. So at the least this is a little different from those previous technologies. (And some of those do still have niche uses. SSE of whatever version is used for video stuff I believe.)

The NPU is a bit like the IGP. We were already running graphics, and there was an opportunity to make low-cost, low-power basic graphics built into the CPU rather than having an add-in card. (And even the first 3D accelerators had an existing application to make use of them, where they were taking the load from the CPU. Same with 2D video graphics even.) Companies like Microsoft and Intel are hoping that the NPU will see the same type of universal usage as the IGP, but I really doubt it will happen without forcing it upon people.

Whether it gets used to scrape data or has any real benefits just depends on what the companies decide to actually make it do. I don't see any REAL use for AI on my desktop. I can locate files just fine without generating a load on the processor trying to understand my spoken request and scanning all my patterns of file accesses and application usage. I don't need AI to write letters for me, and spelling and grammar checking works pretty well without AI. (Given the examples we've seen over the years, AI doesn't seem to be any better at those than we are.)
 

hobold

Ars Tribunus Militum
2,657
Apple's inclusion of a matrix math accelerator was probably driven by phones. Heavy brute force computations are primarily limited by power / heat, where phones are most constrained. For laptops or desktops, there just wasn't the same extreme pressure to integrate a new set of circuits; they could perform the operations fast on the existing processors (CPU / GPU), and power consumption wasn't such a big deal really.

The recent rush to copy Apple is, as usual when it comes to copying Apple, mostly a marketing thing. Because in the few cases where Apple actually was first to introduce some new functionality, it didn't necessarily work right away, or take off at all. But competitors are always eager to copy, even when they have no clue whatsoever what Apple's long term plans are for the innovative gimmick (and Apple usually does have a plan, but it doesn't necessarily always pan out).

Apple's NPU initially was nothing but an overpowered digital signal processor for the camera. That alone justified its inclusion, because cameras were a major driver of phone sales. With laptops and desktops, cameras aren't much of a distinctive feature.
 
  • Like
Reactions: Lord Evermore

redleader

Ars Legatus Legionis
35,019
I am trying to understand the benefits, and a part of me just thinks this is a new AltiVec/SIMD/3DNow!/MMX, etc., but how it is going to be implemented is of more interest than the actual NPU itself. Is it going to be used to scrape more data from you, or does it actually have any decent benefits?
It's not like AltiVec/SIMD/3DNow!/MMX in that this is not a new x86 extension. It's a separate processor that does matrix math, basically another iGPU except without the video output.

There's an overview here of the new Intel one: https://chipsandcheese.com/2024/04/22/intel-meteor-lakes-npu/

It looks super limited, slower than the iGPU even, so not much to get excited about.
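
If anyone wants to poke at it themselves, here's a quick hedged sketch with the OpenVINO Python API; it assumes a recent OpenVINO release with the NPU plugin and the NPU driver installed, and the model path is a placeholder:

```python
# Sketch: list the devices OpenVINO can see; "NPU" shows up on Meteor Lake
# only when the NPU driver and a recent-enough OpenVINO build are installed.
import openvino as ov

core = ov.Core()
print(core.available_devices)        # e.g. ['CPU', 'GPU', 'NPU']
# compiled = core.compile_model("model.onnx", "NPU")  # placeholder model path
```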
 

Lord Evermore

Ars Scholae Palatinae
1,490
Subscriptor++
It looks super limited, slower than the iGPU even, so not much to get excited about.
Perhaps slower in absolute terms, but I think the goal was MUCH better performance per watt, due to being task-specialized, compared to using the GPU. A GPU is just an accelerator for the tasks involved in 3D graphics, and gives more performance per watt than using a CPU for that, but since 3D graphics was an existing task in wide use, the first 3D accelerators had to be faster in absolute terms as well right from the start. AI workloads aren't a widely used thing yet. An NPU has great potential to become much better at them than a GPU while also saving power, IF a need for it can be shown. They need to prove the software is going to be there before the hardware is going to get built, but nobody is going to even want to begin using the software if it's so power-hungry that it can't be made mobile (mobility basically not having been a requirement at all when 3D accelerators started to be created, and power usage was hardly even a concern; it was so low that they didn't even need heatsinks).
 

w00key

Ars Praefectus
5,907
Subscriptor
So back to the title: even if you disable it, the "AI" code can run just fine on the CPU and GPU; it just takes more power. The GPU is faster anyway, and the CPU isn't too far behind.

"AI" as it is nothing but fancy autocomplete so far. There is plenty of real useful applications for a matrix multiplication unit, image and video processing can use it to make resampling / scaling / magic tools in Photoshop much faster. If you don't want your computer to go sentient just don't run skynet.exe.
 

redleader

Ars Legatus Legionis
35,019
Perhaps slower in absolute terms, but I think the goal was MUCH better performance per watt, due to being task-specialized, compared to using the GPU. A GPU is just an accelerator for the tasks involved in 3D graphics, and gives more performance per watt than using a CPU for that, but since 3D graphics was an existing task in wide use, the first 3D accelerators had to be faster in absolute terms as well right from the start. AI workloads aren't a widely used thing yet. An NPU has great potential to become much better at them than a GPU while also saving power, IF a need for it can be shown. They need to prove the software is going to be there before the hardware is going to get built, but nobody is going to even want to begin using the software if it's so power-hungry that it can't be made mobile (mobility basically not having been a requirement at all when 3D accelerators started to be created, and power usage was hardly even a concern; it was so low that they didn't even need heatsinks).
I think you're right that this is a preliminary part designed to test a concept that may or may not be expanded in future generations into something more capable. But this is also why it's hard to be excited at the moment. There probably won't be a lot of applications for the initial first gen hardware.

Side note: the NPU and GPU actually have very similar needs in terms of memory, and both do matrix math quickly. I suspect that eventually the NPU and GPU will end up sharing more hardware (at least memory/cache), but it makes sense that they would get to that much later.