Distributed Computing takes on Coronavirus COVID-19

JimboPalmer

Ars Tribunus Angusticlavius
9,402
Subscriptor
https://foldingathome.org/2020/03/10/covid19-update/

Folding@Home is now in Production with 6 COVID-19 simulations.

11741: Coronavirus SARS-CoV-2 (COVID-19 causing virus) receptor binding domain in complex with human receptor ACE2. atoms: 165550, credit: 15396

11746: Coronavirus SARS-CoV-2 (COVID-19 causing virus) receptor binding domain in complex with human receptor ACE2 (alternative structure to 11741). atoms: 182699, credit: 16615

11742: Coronavirus SARS-CoV-2 (COVID-19 causing virus) protease in complex with an inhibitor. atoms: 62227, credit: 9405

11743: Coronavirus SARS-CoV-2 (COVID-19 causing virus) protease – potential drug target. atoms: 62180, credit: 9405

11744: Coronavirus SARS-CoV (SARS causing virus) receptor binding domain trapped by a SARS-CoV S230 antibody. atoms: 109578, credit: 7608

11745: Coronavirus SARS-CoV (SARS causing virus) receptor binding domain mutated to the SARS-CoV-2 (COVID-19 causing virus) trapped by a SARS-CoV S230 antibody. atoms: 110370, credit: 7685
 

JimboPalmer

Ars Tribunus Angusticlavius
9,402
Subscriptor
Apparently, the number of Folders on Sunday is 4 times what it was on Thursday. The Work Units they planned for the weekend were all gone by Saturday morning. They have contacted some researchers by phone, and more WUs have been added, but the backlog means they are gone in seconds.

Even worse, the servers that could keep up with the Thursday user base are overwhelmed by the Sunday user base. My experience is that faster servers are more expensive servers. That will take a while to finance.

On the other hand, boy are we doing science!
 

MichaelB

Ars Scholae Palatinae
889
Subscriptor
Apparently, the number of Folders on Sunday is 4 times what it was on Thursday. The Work Units they planned for the weekend were all gone by Saturday morning. They have contacted some researchers by phone, and more WUs have been added, but the backlog means they are gone in seconds.

Even worse, the servers that could keep up with the Thursday user base are overwhelmed by the Sunday user base. My experience is that faster servers are more expensive servers. That will take a while to finance.

On the other hand, boy are we doing science!

That would explain how long I had to wait between WUs!
 

JimboPalmer

Ars Tribunus Angusticlavius
9,402
Subscriptor
All set up, but are there still issues with WUs? I see that my GPU has not crunched anything for quite some time, and I cannot get it to pull any data for some reason. Funny enough, I was having that issue with my CPU the other day, and my GPU was running like bandits!
This is my new excuse, I wonder how long it lasts.

We now have 8 times the volunteer folders we did on Thursday.
The researchers got behind but now have generated 'enough' WUs.
However, a server and network sized for 1/8 as many people may not cope well with that growth.
My understanding is that they are scrambling to add capacity. Sadly, faster servers tend to be expensive, and government procurement processes are legendarily slow.

Sadly there are relatively few Billionaire Biochemists to donate cash.
We will see how long it takes to get more server/network capacity.

I have 7 CPU slots idle and 2 GPU slots idle. Sometimes I see an HTTP error that I think is a network capacity issue, but more frequently I see a "No WUs available for this configuration" message that I think is a server error.

If they can solve one, we will still have the other.

Even though each volunteer is doing less, the total work done is wildly increased: about 5.5 times as much work is getting done, but with 8 times as many folks trying to do it, so on average each volunteer is getting only about 70% of the WUs they used to. Some folks will be butt hurt.
 

MadMac_5

Ars Praefectus
3,700
Subscriptor
I've been getting intermittent work units too. Right now I'm crunching away on the first CPU unit I've had since Friday, and in that time I've finished maybe 3 or 4 GPU units. I'm leaving the client running and ready to scoop up more work as it arrives; at this point I care less about points per day and more about helping push the science forward. Once the immediate crisis is over and things have stabilized, I'll try to keep things as efficient as possible through a Manitoba summer (hot days, cooler nights).
 
Posting to orange this for later. F@H is something I've been meaning to try contributing to for a while now, and this seems as good a time as any to investigate.

Assuming I get it set up and running, is there anything needed to join the Ars Technica team besides using the team number in the config?
To answer my own question, nothing needed besides putting in the team number. Now if the stats site wasn't down I could see my small contribution showing up!

This was surprisingly easy to get running. I've got it ticking away on my desktop (with a lowly GTX 760 and an i7-3770) and one of my servers (dual E5-2660v2s).

The installer works as-is to install on Windows Server Core, which slightly surprised me. From there I just had to figure out fahclient.exe and config.xml syntax to get it running and remotely managed by my desktop.
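
For anyone else going the headless route, the bare-bones config.xml I ended up with looks roughly like the sketch below. I'm writing this from memory, so double-check the option names against the FAHClient documentation or a config saved by FAHControl; the name, team number, passkey, and LAN addresses here are all placeholders.

<config>
  <!-- Donor identity (placeholders): name, team number, and passkey -->
  <user v='YourNameHere'/>
  <team v='0'/>
  <passkey v='passkey-from-the-fah-website'/>

  <!-- Allow remote management from my desktop on the LAN (example addresses) -->
  <allow v='127.0.0.1 192.168.1.0/24'/>
  <password v='pick-a-remote-password'/>

  <!-- One CPU slot and one GPU slot; the client sizes them automatically -->
  <slot id='0' type='CPU'/>
  <slot id='1' type='GPU'/>
</config>

With that in place, FAHControl on the desktop just needs the server's address and that password added as a remote client.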
 

Drizzt321

Ars Legatus Legionis
28,408
Subscriptor++
Have rarely done F@H, putting my new 3800X and 1080 GTX to good use at night when I'm not using it!

EDIT: And added Team Ars (with passkey!)
EDIT2: Apparently I should get better cooling if I really want to crank the CPU like this. In Ryzen Master I'm just at 4GHz on 7 of my 8 cores (F@H seems to want to leave 1 core lightly loaded), and it's running ~84C. I only have the Noctua NH-U12S tower with one 120mm fan, which I think is running ~1270 RPM based on HWMonitor.

It's still just chewing through the coronavirus CPU stuff though. The WU says ~3.4 days to complete, but it looks like I'll be done in 35-40 minutes or so. Nice.
 

JimboPalmer

Ars Tribunus Angusticlavius
9,402
Subscriptor
F@H devotes one CPU thread to keeping your GPU fed over the PCIE bus.

But it turns out that the CPU Core_a7 hates large prime numbers and their multiples. 7 is always large, 5 is rarely large, and 3 never is.
So rather than run 7 threads, I bet it downshifted to 6 threads.

If you spend WAY too much time reading your Log files you may see parameters like -np 6 or -nt 6

In any case it is trying to not throw away work due to a known quirk in GROMACS. https://en.wikipedia.org/wiki/GROMACS
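
If you want to see the idea in code, here is a rough sketch of that downshift. To be clear, this is just my own illustration, not the actual FAHClient or GROMACS logic, and treating only 2 and 3 as "safe" prime factors is a simplification:

# Rough illustration only, NOT the real FAHClient/GROMACS code:
# step the thread count down until its largest prime factor is small.

def largest_prime_factor(n):
    # Largest prime factor of n (returns 1 for n <= 1).
    factor, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            factor, n = p, n // p
        p += 1
    return max(factor, n) if n > 1 else factor

def usable_threads(available, largest_ok=3):
    # Step down from the available count until every prime factor is <= largest_ok.
    n = available
    while n > 1 and largest_prime_factor(n) > largest_ok:
        n -= 1
    return n

# 8 threads minus 1 reserved to feed the GPU leaves 7, which downshifts to 6.
print(usable_threads(7))   # 6
print(usable_threads(14))  # 12

That is why an 8-thread CPU with a GPU slot ends up folding on 6 CPU threads rather than 7.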
 

Drizzt321

Ars Legatus Legionis
28,408
Subscriptor++
F@H devotes one core to keeping your GPU fed over the PCIE bus.

But it turns out that the CPU core hates large prime numbers and their multiples. 7 is always large, 5 is rarely large, and 3 never is.
So rather than run 7 cores, I bet it downshifted to 6 cores.

If you spend WAY too much time reading your Log files you may see parameters like -np 6 or -nt 6

In any case it is trying to not throw away work due to a known quirk in GROMACS. https://en.wikipedia.org/wiki/GROMACS

Well, actually 14, since it's SMT, so 14 of 16 available threads.

But yea, also makes sense to keep 1 core free to feed the GPU. It's certainly not idle, but not as heavily used as the rest.

So it's all done, but now unfortunately according to the log file:

07:18:29:WARNING:WU02:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
07:18:29:ERROR:WU02:FS00:Exception: Could not get an assignment

So looks like my CPU is going to be idle for a bit until the system has some more WUs that can operate for my system. Ah well. Bed time!
 
From my systems, it looks like last night they got the servers under control and work is being issued and returned.

Now we find out how much new work there really is!
I'll check my setup remotely today... If everything is maxed out on work and staying that way, I'll probably add my other 20-core/40-thread server to the pile. I ran squarely into the 32-core limit with the first one, and solved that by changing from one CPU autoslot to two 20-core slots.
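
In case anyone else bumps into that limit, the slot section of my config.xml ended up looking roughly like this. I'm quoting the syntax from memory, so verify the 'cpus' option name against a config written out by FAHControl before copying it; the counts are just this box's 40 threads split in half.

<!-- Two CPU slots of 20 threads each, instead of one oversized autoslot -->
<slot id='0' type='CPU'>
  <cpus v='20'/>
</slot>
<slot id='1' type='CPU'>
  <cpus v='20'/>
</slot>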
 

Drizzt321

Ars Legatus Legionis
28,408
Subscriptor++
Apparently there was some work over night, although might just have been the GPU stuff for cancer, but no new work for either CPU or GPU at this time unfortunately for me.

EDIT: Looks like it's back up and running, but with GPU this time, and for Corona. Playing a bit of Factorio and don't notice any slowdown at all :) Probably will need to pause it for ARK later tonight tho.
 

riskin

Ars Praefectus
3,121
Subscriptor++
I have rejoined with an i7-7700K + GTX 1080 ti and an i7-2600K @ 4.5 GHz + GTX 1080, so we'll see how much that adds to the cause. I've left the client set to "any disease" since COVID-19 isn't in the menu options. If there's any manual configuration beyond the "express" setup in the client to make sure I'm crunching the right things, optimized for my hardware (i.e. selected the right configuration and client options and flags), or anything else please let me know. It's been a very long time since I've done F@H let alone any distributed computing projects.
 

Drizzt321

Ars Legatus Legionis
28,408
Subscriptor++
I have rejoined with an i7-7700K + GTX 1080 ti and an i7-2600K @ 4.5 GHz + GTX 1080, so we'll see how much that adds to the cause. I've left the client set to "any disease" since COVID-19 isn't in the menu options. If there's any manual configuration beyond the "express" setup in the client to make sure I'm crunching the right things, optimized for my hardware (i.e. selected the right configuration and client options and flags), or anything else please let me know. It's been a very long time since I've done F@H let alone any distributed computing projects.

Nope, the software package they use under the hood figures out what CPU/GPU accelerations your hardware supports and uses them where possible.

I've noticed that I don't always have CPU and GPU at the same time. And usually if one of them is on COVID-19, the other, if running, doesn't appear to be. Lots of busy bio/sim folks out there I'm sure trying to keep the pipeline flowing to keep our systems all busy.
 

JimboPalmer

Ars Tribunus Angusticlavius
9,402
Subscriptor
F@H has some issues with more than 32 threads, so if you have 256 threads you need to organize them into 'slots' of 32 or fewer each, but for those of us with sane PCs, the software does a good job of optimizing.

The CPU core runs at Low priority and does a good job of staying out of the way of your day job. GPUs are all or nothing*, so folding 24/7 can impact gaming or AutoCAD. There is an 'on idle' setting if that is biting you, or you can pause by hand if you prefer.
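
If I remember the option name right (it shows up as an "On Idle" checkbox per slot in FAHControl, so check there rather than trusting my memory), the on-idle behavior is just a client option in config.xml, something like:

<!-- Fold only while the machine is otherwise idle (option name from memory) -->
<idle v='true'/>

You can set it globally or only on the GPU slot, and the client will then only fold on that resource while you are away from the machine.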

I understand the concern with COVID-19, but if you inadvertently help cure cystic fibrosis or Huntington's Disease, that is not a BAD thing.

*Let's be honest, if you wrote a preemptive multitasking driver for your graphics cards, gamers would just whine that they lost 1 FPS. And that is 99.4% of your market mad at you.
 

JimboPalmer

Ars Tribunus Angusticlavius
9,402
Subscriptor
F@H has experienced over 10x growth in the last 10 days.
As soon as they fix one bottleneck, they find another. Over and over.

It is about 9:15 AM on a Monday at Stanford. They may have their coffee by now, and their morning meeting will be starting. Once it is over, they will begin finding out what bottlenecked them over the weekend. Sometime this afternoon they may get that fixed, and find a new bottleneck.

Meanwhile we get fewer work units than we could process even if 90% of their volunteers disappeared. More work is being done, and they are learning what they need to optimize. The Work Units must flow! <G>
 

Drizzt321

Ars Legatus Legionis
28,408
Subscriptor++
Any recommendations so we can track individual and team progress?
https://folding.extremeoverclocking.com ... 597#429597

Is where I check.
Woo, #653! I'm not last!

Jeeze, I'm #270 already. Going from 0 to that in just a few days, jeeze. My system is more powerful than I think, I think? Although also looks like lots of those are people who haven't been doing any for quite some time, or may just have machines not quite as powerful and started recently as well.
 
Jeeze, I'm #270 already. Going from 0 to that in just a few days, jeeze. My system is more powerful than I think, I think? Although also looks like lots of those are people who haven't been doing any for quite some time, or may just have machines not quite as powerful and started recently as well.
I think my biggest bottlenecks right now are getting WUs to run (none for >24h), and having a crusty GPU. I've got ~86 cores of CPU online now, and whenever I get a CPU WU it completes quickly. My sole GPU is a GTX 760, and most of this time it appears to have been idle.
 
D

Deleted member 32907

Guest
Jeeze, I'm #270 already. Going from 0 to that in just a few days, jeeze. My system is more powerful than I think, I think? Although also looks like lots of those are people who haven't been doing any for quite some time, or may just have machines not quite as powerful and started recently as well.

Having managed an awful lot of Folding@Home on a ton of Pentium 4 class machines before, I can tell you that new systems are insanely more powerful than the older stuff - as long as you have a GPU. They can do much more work, far faster.

ISEAGE, the office building of Pentium 4s I had running Folding when they were not being used for other stuff, crunched 19,046 workunits for a total value of 9,354,766 points - or about 500 points per WU, and I think those machines took an awful long time to crunch them. You can probably dig back in the F@H thread to find some comments I made on it over a decade ago when I was running them. We came on big and fast with an awful lot of machines, when they were bored. Go look around page 27-30 in the F@H thread for some details, but we had a lot of computing power.

A modern GPU or two will blow past that production in a matter of months. My office heater, with a pair of GTX 980s, is a substantial fraction of the work of that entire facility, and mine is only running 8-10h/day, on days with good sun. It's not even close. I just can't run when I don't have good sun. The protein folding stuff just really runs well on GPUs.

You can get an idea of when people were folding based on the points per workunit. A lot of the higher ranked people did a ton of CPU folding, but haven't done much in years (or decades - ISEAGE probably went offline sometime in 2008 or 2009 for the most part).

Someone throwing a couple grand into a modern rig can out-produce decades of older farms easily.