Optimal Load Balancer Configuration Woes + Gaining Hash Rate Through Difficulty Curving: (aka: setting up speed racer mining)
Well, with this morning's stunning market rates I'm sure everybody is interested in picking up some new ideas on improving their hash rate. Thankfully I've been spending an embarrassing amount of time playing around with XMR-Node-Proxy and finally have a lot to say about it. Now once again, I want to state: I'm not a programmer, I don't understand mining at some intense level, and what you're getting is a guy utilizing his available time to fumble fuck results into place. Anyone more technically capable of laying down a proper, and I do mean in-depth, understanding of everything going on is MORE THAN WELCOME (and I do encourage it) to come in and lay down some major information on not only what I'm saying but their own optimizations. What I really care about is bringing you all empirical data showing success, proving there is optimization to be had. Honestly, for most of the last five days I've been doubting that was the case, given the small amount of power I have vs. what XMR-Node-Proxy was designed for. Though I can finally say: yes, it is worth your time using, even on the home mining front!
"Step 1", so to speak, would be figuring out your load balancer's optimal "shareTargetTime" setting for the port difficulty you're connecting to. This is done by dividing the port's difficulty by your peak hash rate (in H/s) across your equipment. That gives you a broad but semi-accurate starting value for the load balancing effect to start working with.
EXAMPLE: You connect to a port of 25000 difficulty and you hash at 1.3 KH/s across your home, so 25000/1300 = 19.2308 for your shareTargetTime. Through doing this and just connecting my entire home's mining operation to the balancer I saw a peak hash rate (reported on pool and verified) of 3.8 KH/s, though it would still bottom out at 2 KH/s at times. So the highs were better, but the bouncing between the peaks was still rough & very much present all the time. The stabilization I was hoping for was nonexistent. The problem you'll run into as soon as you look at your default config file is that it's designed to be set up so you can open multiple difficulty ports on your own proxy & it'll balance the work it's grabbing from the pool based on what you connect. Well the question is then, like, there's only one "shareTargetTime", so which port and which difficulty do I optimize my timing for? :facepalm:
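If you'd rather not do that division by hand every time you change ports, here's a rough Python sketch of the same math. The function name is just something I made up for illustration, it's not part of XMR-Node-Proxy, and it assumes you feed it your hash rate in plain H/s.

```python
def share_target_time(port_difficulty, peak_hash_rate_hs):
    """Starting shareTargetTime: port difficulty divided by peak hash rate (H/s)."""
    if peak_hash_rate_hs <= 0:
        raise ValueError("peak hash rate must be greater than zero")
    return port_difficulty / peak_hash_rate_hs


# The example from the post: 25000 difficulty port, 1.3 KH/s = 1300 H/s
print(round(share_target_time(25000, 1300), 4))  # 19.2308
```

Treat the result as a starting point only, same as the hand calculation.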
Short answer: you can't. So some of you might do what I did for a couple days, which was really dumb, and that was try every possible combination of difficulty changes on my side, then stack different share times on top of whatever seemed to make a difference after 15 minutes of being left alone. (This is where the hours and hours and hours have been wasted.) Also, let me step back a second in case somebody missed the boat on finding what hash rate to even type into your load balancer math in the first place. Fire up WHATEVER mining program you like; as you can read, I'm big on using STAK's stuff right now. Make all your overclocks, system tweaks, etc. there, before you ever even have the idea of XMR-Node-Proxy in your mind. Once you've got a setup that is stable 24/7, no crashes, no issues, look at your peak & 15 minute average hash rates. Write them down and add them up across your gear. Now you have your normal rate to expect (the 15 min average) and the peak your gear is going to pull when everything's coming up Milhouse.
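Adding those numbers up is trivial, but here's a tiny Python sketch so the bookkeeping is clear. The rig names and hash rates below are made-up placeholders, not my gear, so swap in what your own miners report.

```python
# Hypothetical readings in H/s; replace with what your miners actually report.
rigs = {
    "rig_a": {"peak": 700, "avg_15m": 650},
    "rig_b": {"peak": 600, "avg_15m": 550},
}


def combined_rates(readings):
    """Return (total peak, total 15 minute average) across all rigs, in H/s."""
    total_peak = sum(r["peak"] for r in readings.values())
    total_avg = sum(r["avg_15m"] for r in readings.values())
    return total_peak, total_avg


peak, avg = combined_rates(rigs)
print(f"peak: {peak} H/s, 15 min average: {avg} H/s")  # peak: 1300 H/s, 15 min average: 1200 H/s
```

The 15 minute total is your baseline to beat, and the peak total is the number that goes into the shareTargetTime division.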
Okay, so now back on track: you've got a load balancer that is giving you new high peaks but brutal lows, and you can't figure out how to optimize your shareTargetTime for the different difficulty ports you're setting up. Well, from what I can find online, most people do not care about it. They leave it at the default setting of 15 or even increase it to 30, depending on the latency of all the remote miners connecting to the proxy. Though, as I'm complaining about, the peak to valley ratio sucks doing this on my personal gear. I think it's because it was designed to run 10-2500 low end CPU connections and get a better rate by distributing the work load better. Not four mid range to high end CPUs & two graphics cards, one being an unreasonable beast like the Titan. It throws the whole balancing element out of whack, or at least it appears to me it does. I'm guessing there is some trouble always lining up the process for this unique split. Solution?
"Step 2" is creating multiple load balancers depending on how far off your gear is from each other. I simply run two balancers, one for my CPUs, another for my GPUs. Then the difficulty & port settings are stripped down to a single port, a single difficulty, and a single shareTargetTime for each load balancer. No flipping around between them, no splitting loads between pools or ports, just straight up one road back and forth.
To do this, go into your Hyper-V Ubuntu VM, open up the terminal where you run pm2 monit to watch it work, and hit ctrl-c to close it if you still have it up and running. Then type "pm2 stop proxy", followed by "pm2 delete proxy" to get rid of your current running setup. Then reconfigure your config.json file to reflect the equipment you're trying to separate out (CPUs or GPUs in this case). Once you've got a single port config ready to go, head back to your terminal and repeat the process you did for firing it up the first time. This is "pm2 start proxy.js --name=NAME OF YOUR FARM --log-date-format="DD-MM-YYYY HH:mm Z"" for anyone who might currently have some wings like I do. Though you can see the new "NAME OF YOUR FARM" part. Here is where you're going to want to say "PimpDaddy_GPUs_Only" or whatever you like (no spaces). It's gotta be different than the next one we're setting up though. Follow this up with the same old "pm2 save" afterwards to make sure we're good.
Next, copy and paste the xmr-node-proxy folder in its same location; it'll give you "xmr-node-proxy (copy)" as a folder. You can rename this "xmr-node-proxy2" or whatever you like. Now go into this folder and change the config.json to reflect the other shareTargetTime you'd like to achieve with the other side of the now separated hardware. Once you've got it ready to go, head back to your terminal and "cd ~/xmr-node-proxy2/" into your new folder (obviously, change the name to reflect whatever you named it). Here we repeat the same process from above but give it a different name, so another "pm2 start proxy.js --name=NAME OF YOUR SECOND FARM --log-date-format="DD-MM-YYYY HH:mm Z"" followed by another "pm2 save".
Boom, type "pm2 monit" to get you back into your basic monitoring screen & now you'll have two load balancers running but reporting to the same global logs screen. You can see this in my screenshots below. Which leads us to
"Step 3", which is another very long round of optimizing your two load balancers to achieve a sustained mining rate higher than what your gear would do on its own without all this fuss. So I'm sure you're asking, how is this done? Well, now that you've got two balancers running on your half a hyper thread (lulz), you can watch the stats popping up inside pm2 monit's dashboard. Once a balancer is dialed in (let it run for 15-30-60 min tests) on an average difficulty that achieves your equipment's pre-load-balancer hash rate, look at what it's working with for difficulty across your machines. Go back to your actual mining software's prompts and watch for the difficulty to change. When you look at all your machines you'll see the lowest and the highest difficulty it's using to achieve that average in the load balancer's dashboard. Then you can go into your config files and set the min and max difficulty a couple hundred beyond those in either direction, with the port difficulty advertised to your machines starting where the load balancer finally took you after sitting for a while figuring out what's best. Now you can fire your load balancers up again & see that they take your gear almost right to the hash rates you were getting an hour into your testing, instead of the way-off hash rates you got the first time you fired them up (the pre-dialing-in phase).
The next step is really, and I'm sorry I don't have a better explanation or technical breakdown for you, simply playing with your shareTargetTime, VERY SLIGHTLY, and letting it sit for 15-30-60 min stretches while you record the results. Eventually you'll see that going one way will help, one way will hurt, and going too far either way will wreck your rates. Finally you'll get to a place where your gear is actually running above the hash rate it would achieve on its own, through the load balancer's act of splitting blocks down into individual segments for your gear to rip through. That's called success baby and we all fucking love it. Though you're going to notice the same peak to valley system still happening, even post all this nonsense. That's the nature of the beast with mining, but what you're trying to do here is pull the valleys up to form a new higher minimum hash rate over time & to increase the peaks, climbing higher & faster through this splitting of the work.
Let's talk some real numbers. I'm still testing many theories I have on how this all works, and I could've held back this post for another week easily (but I wanted to get something out there for people to absorb and possibly bounce ideas back at me). I'm currently seeing the minimum hash rate improve by about 500 H/s on average during daily use, with people flipping the machines on and off. The dips used to shoot down to 2 KH/s at the worst times & now they only go down to around 2.5 KH/s during the falls. (Once again: normally, on average. We're not talking always, it's not an absolute.) While my peaks, they've gone mad. I hit 3.3 KH/s a lot more often than I used to, and I even saw a peak of 3.8 KH/s yesterday while I was playing around (though I've not been able to reproduce this, ever). My equipment does a 15 min combined average of 2.65-2.7 KH/s when it's not on the load balancer, so the objective is to tweak it to do better than that across the same 15-30-60m measurements.
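Since the whole game is comparing a test run against that no-balancer baseline, here's a small Python sketch of how you could score a run. The function name and the sample readings are made up for illustration; the only real number in it is my 2650 H/s baseline from above. The readings are meant to be whatever your pool shows you over the test window.

```python
from statistics import mean


def summarize_trial(pool_samples_hs, baseline_hs):
    """Summarize one test run of pool-reported hash rates (H/s) against a baseline."""
    average = mean(pool_samples_hs)
    return {
        "floor": min(pool_samples_hs),       # the valley
        "peak": max(pool_samples_hs),        # the peak
        "average": average,
        "gain_vs_baseline": average - baseline_hs,
    }


# Hypothetical readings taken during one run at a single shareTargetTime
trial = summarize_trial([2500, 3300, 2900], baseline_hs=2650)
print(trial["floor"], trial["peak"], trial["average"], trial["gain_vs_baseline"])
```

Run one of these per shareTargetTime you try and keep whichever setting gives the best average and floor. A single lucky peak shouldn't decide it.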
Running through the balancer also has a couple nice benefits. An example would be when your miners go to their donation cycle. Instead of just dropping the work altogether, then reconnecting to it later, or just getting new work later, the mining you're doing doesn't stop, as the other connected miners are going to immediately shift over to pick up the leftover work. So your average goes down a little bit on the charts (dips), but you quickly recover when whatever miner left reconnects to your node again. It's fairly notable on the charts when the Titan X (P) drops off the GPU load balancer and just leaves the 1080 hanging, yet the average hash rate over the day is still higher than just letting the two GPUs do their own thing connecting to the pool directly. Which, after you've finally done all this, if you're like me, just won't be enough and you won't stop tinkering around. Which takes us to
"Step 4": noting that when you play around with the settings of the load balancer & get some craaazy results, they do not report to your mining pool accurately at all! For example, I can get this whole place up to a reported 8 KH/s in the load balancer. All the machines are working hard as ever, heat pouring out of them, energy getting used, nothing changes on my mining clients, but the load balancer is going bonkers. (Try dialing in high difficulties.)
THIS DOES NOT MEAN your pool is accepting all those results, or any of them. Often when I get crazy high numbers in the load balancer I'm getting terrible numbers reported as accepted by the actual pool. Given those are the people who pay you, if they're not counting it, it doesn't matter.
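A quick sanity check you can run on any pair of readings is sketched below in Python. Both function names and the 0.8 cutoff are my own assumptions, picked only for illustration. You type in the two numbers yourself, the sketch doesn't talk to the proxy or the pool.

```python
def pool_to_proxy_ratio(proxy_reported_hs, pool_reported_hs):
    """Fraction of the balancer's claimed hash rate that the pool actually credits."""
    if proxy_reported_hs <= 0:
        raise ValueError("proxy reported rate must be greater than zero")
    return pool_reported_hs / proxy_reported_hs


def looks_inflated(proxy_reported_hs, pool_reported_hs, cutoff=0.8):
    """True when the pool credits far less than the balancer claims (cutoff is arbitrary)."""
    return pool_to_proxy_ratio(proxy_reported_hs, pool_reported_hs) < cutoff


# Balancer claims 8 KH/s while the pool shows something near the normal 2.7 KH/s
print(looks_inflated(8000, 2700))  # True
```

If that comes back True, believe the pool's number and ignore the dashboard.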
So I just open up my account inside my pool & watch the double load balancers report in there. I only use their graphs, their charts, their reports, to verify if what I'm doing in the load balancers makes any dick of a difference at all. Because the numbers in the load balancer are either 1.) lying or 2.) hashing an incredible fuck rate of bad or unusable data. You absolutely have to use your pool as your measurement for success. I basically just carry my phone around with me everywhere, and every time I report back a block it tells me my current hash rate and my averages. I can tell pretty quickly if I've missed the mark with some radical setting I'm trying out or not, but I like to let them run for a while to make sure it's not just some fluke. (Proper testing time guys, let them run.) So then you gotta be asking me, what has made a difference? Well, it's speed racer low difficulty all day. I don't know if that's because my whole place is wired for gig and I have actual gig internet with an average latency of 5-10ms, but for me, SPEED RACER ALL DAY. I've got my shareTargetTimes cranked down to 1.5 & 3 seconds.
They crank ultra low difficulty work all day, but they fly through it, just constantly spamming results back and forth between the pool & my load balancers. This seems to do about the best of everything I want: it minimizes dips when they occur, it brings the lower floor (valleys) up a considerable amount, & it tends to consistently provide hash rates that are higher than my gear does on its own without the balancers. PERSPECTIVE: my Titan was pulling like 50,000 difficulty work off the pool all the time on its own and crunching through it no problem (though it took some time). Though now, at this moment of typing, it's speed racer burning through 1450 difficulty results like a crack addict with a brand new eight ball. Shit is on fire, the console log is just a spam of results getting reported.
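To put a number on how much more often results get sent at the lower difficulty, here's a back-of-the-envelope Python sketch. It assumes the time per result scales with difficulty at a fixed hash rate, which is the same relation the shareTargetTime division from Step 1 relies on. The function name is made up.

```python
def share_rate_multiplier(old_difficulty, new_difficulty):
    """How many times more often results come back after lowering the difficulty."""
    if old_difficulty <= 0 or new_difficulty <= 0:
        raise ValueError("difficulties must be greater than zero")
    return old_difficulty / new_difficulty


# Titan going from roughly 50,000 difficulty work down to 1450 difficulty work
print(round(share_rate_multiplier(50000, 1450), 1))  # 34.5
```

So the same card is reporting back roughly 34 times as often, which is why the console log turns into a wall of results.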
This speed racer setup is what is getting me better averages than anything else I've tried. The global logs in the load balancer dashboard are actually accurately reporting hash rates that are right next to what I'd get without the balancers (so I know that's working), but the rate I'm getting from the pool on my results is better overall. So once again, don't rely on the numbers inside your load balancer for anything but a basic check to see if they're in the correct ballpark at all. Fine tuning needs to be done from your pool back to your balancers, not your balancers to the pool. Here's another BIG TIP: do not make a Hyper-V checkpoint in the middle of mining & then revert to it later on with your internet still enabled. It'll pick back up where it was when you saved (obviously), error the fuck out for a bit, throw some flags, and you'll get an invalid share from the pool over your previous work. I'd never had an invalid share in my short (but now gaining some time) mining career / life, so I was really upset when I saw one pop up. So yeah, do not do that, it's going to cost you a share! (Currently on this new pool I have 83,982 valid shares to my 1 invalid share, but still, it pisses me off to see the big red X on the stats screen with a 1 next to it.)
I'm not giving up yet, I'm convinced I can fine tune this thing myself into better hash rates. I think I've just not done a proper job testing this in a scientific manner, nor applied enough time across each measurement to really separate optimal results from luck & the natural mining cycle. I'll be sure to let you guys know what I find in the future, as I'm clearly not going to stop playing around with it until I'm pretty dead nuts sure that I'm getting the best out of my gear that's possible. Though let's keep in mind, this software isn't even developed for this use. From the author himself, it's not worth running if you don't have at least 10 miners connected to it. Though I'd like to say I've provided some interesting results showing it is actually still handy to home miners like myself. 99% of people probably don't give two shits, given how long this explanation has been & the required dialing in time. Though I know there's going to be somebody out there who is going to read this and be like FUCK YEAH IMPROVED RESULTS! and will join me on the load balancer quest for optimization and efficiency.
Load Balancer Failure! AKA: when things go wrong & you're not around:
Double Balancers sharing info in the Global Logs:
Success! Double Virtual Balancers running inside half a Hyper thread providing better hash rates than the gear does alone: