Message boards : Number crunching : resume from checkpoint: same gpu?
Author | Message |
---|---|
When rebooting, does the checkpointed task resume with the same GPU (e.g., GTX-1070), or could it be assigned to a different GPU like the GTX-1060, or vice versa? | |
ID: 52228 | Rating: 0 | |
"When rebooting, does the checkpointed task resume with the same GPU (e.g., GTX-1070), or could it be assigned to a different GPU like the GTX-1060, or vice versa?" It could be assigned to a different GPU. | |
ID: 52229 | Rating: 0 | |
"It could be assigned to a different GPU." OK, that can explain what I see. I am doing a study of risers on various slots. The motherboard has four slots (x8, x4, x8, x4), and the risers on them are, respectively: x1 with the 1070; x1 with a 1060; none; and a 4-in-1 with three 1060s. I lost track of which board was d0, d1, etc., so I rebooted without making a note of which board had the task with 9 elapsed hours and another day and a half to complete. After rebooting I checked the BOINC messages, and the 1070 had the task that supposedly had another day and a half to complete. It was running at 75% GPU load and was very warm, unlike the three on the 4-in-1 riser. I thought something was wrong with the 1070, but it seems one of the tasks from the 4-in-1 riser was reassigned to the 1070. I was not aware that tasks can be reassigned, but that must have happened, as that task on the 1070 gained a day in just a couple of hours and is back to the normal completion time (almost) for a GTX 1070. I am seeing GPU load of 42-55% on the 4-in-1 riser (with three boards); the load on the single 1060 is around 65% and on the 1070 around 75%. Very strange: I cannot get the format correct after posting, but the preview is OK. Some of the text got cut off, and I don't see any control characters in my text. | |
ID: 52233 | Rating: 0 | |
The GPUGrid app needs more PCIe bandwidth than (Bit)Coin mining, or other BOINC projects (like SETI@home) to achieve optimal GPU usage.
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
# GPU [GeForce GTX 1060 3GB] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 3 :
# Name : GeForce GTX 1060 3GB
# ECC : Disabled
# Global mem : 3072MB
# Capability : 6.1
# PCI ID : 0000:08:00.0
# Device clock : 1708MHz
# Memory clock : 4004MHz
# Memory width : 192bit
# Driver version : r430_00 : 43086
# GPU 0 : 75C
# GPU 1 : 49C
# GPU 2 : 54C
# GPU 3 : 56C
# GPU 4 : 39C
...
# GPU [GeForce GTX 1070] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0 :
# Name : GeForce GTX 1070
# ECC : Disabled
# Global mem : 8192MB
# Capability : 6.1
# PCI ID : 0000:03:00.0
# Device clock : 1683MHz
# Memory clock : 4004MHz
# Memory width : 256bit
# Driver version : r430_00 : 43086
# GPU 0 : 51C
# GPU 1 : 40C
# GPU 2 : 42C
# GPU 3 : 41C
# GPU 4 : 43C
...
# Time per step (avg over 5905000 steps): 3.989 ms
# Approximate elapsed time for entire WU: 39887.681 s
# PERFORMANCE: 64800 Natoms 3.989 ns/day 0.000 ms/step 0.000 us/step/atom
called boinc_finish
</stderr_txt>
]]>
It is quite obvious that it switched over to a different GPU. | |
ID: 52234 | Rating: 0 | |
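The switch can also be read out of the stderr text programmatically. The following is a minimal sketch, assuming the log keeps the `# SWAN Device N :` and `# Name : ...` line layout shown in the post above; the function names `devices_used` and `switched_gpu` and the shortened `SAMPLE` excerpt are hypothetical and not part of BOINC or GPUGrid.

```python
import re
from typing import List, Tuple

# Patterns for the per-run header lines as they appear in the stderr excerpt
# quoted in this thread (the layout is an assumption taken from that excerpt).
DEVICE_RE = re.compile(r"^#\s*SWAN Device\s+(\d+)\s*:")
NAME_RE = re.compile(r"^#\s*Name\s*:\s*(.+?)\s*$")


def devices_used(stderr_text: str) -> List[Tuple[int, str]]:
    """Return (device_index, gpu_name) for each run segment, in log order."""
    runs = []
    pending = None  # device index seen, waiting for its Name line
    for line in stderr_text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            pending = int(match.group(1))
            continue
        match = NAME_RE.match(line)
        if match and pending is not None:
            runs.append((pending, match.group(1)))
            pending = None
    return runs


def switched_gpu(stderr_text: str) -> bool:
    """True if the task ran on more than one distinct (index, name) pair."""
    return len(set(devices_used(stderr_text))) > 1


# Shortened excerpt of the log quoted above, for illustration only.
SAMPLE = """\
# SWAN Device 3 :
# Name : GeForce GTX 1060 3GB
# PCI ID : 0000:08:00.0
...
# SWAN Device 0 :
# Name : GeForce GTX 1070
# PCI ID : 0000:03:00.0
"""

if __name__ == "__main__":
    for index, name in devices_used(SAMPLE):
        print(f"device {index}: {name}")
    print("switched:", switched_gpu(SAMPLE))
```

Each restart of the task writes a new device header, so two different headers in one stderr file indicate that the task resumed on another card.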
"The GPUGrid app needs more PCIe bandwidth than (Bit)Coin mining, or other BOINC projects (like SETI@home) to achieve optimal GPU usage." I was running GPUGrid on Ubuntu 16.04 with a pair of 1060s, and it was doing very well for a while. I switched to Windows over a month ago to test risers for a study I am interested in. I wanted results from TechPowerUp GPU logs and temperatures from TThrottle, which are available under Windows. I wanted to compare performance for various projects and had created a Windows app for performance calculations: https://forum.efmer.com/index.php?board=47.0 I will be moving the 5 NVIDIA boards to a TB85 (6-slot miner motherboard) for testing. The 4-in-1 riser gives a real hit on performance on GPUGrid. The TB85 runs Ubuntu 18.04, so that will be an interesting comparison, plus there are enough x1 slots that I don't need a splitter. The 4-in-1 seemed OK on SETI and Einstein, and I am putting a table of results together. While I cannot get GPU load under Ubuntu (??), the comparison of elapsed time is probably just as good a marker. Not sure what is going on web-wise here, but I noticed that my previous post now has all the correct text, whereas before only the preview was correct. Sometimes when text in a post is missing, it is because of a control character accidentally left in the body. Hopefully this post is intact. | |
ID: 52235 | Rating: 0 | |
"while I cannot get gpuload under ubuntu (??)" Huh??? You most certainly can. From a terminal session, just start nvidia-smi, which shows GPU wattage, GPU utilization, memory used, and temperature. Running nvidia-smi -l 1 polls the installed cards and refreshes the display every second. | |
ID: 52236 | Rating: 0 | |
If it is just temperature you are chasing to import into a custom app (instead of using tthrottle), you can use: | |
ID: 52237 | Rating: 0 | |
"I will be moving the 5 nvidia boards to a TB85 (6 slot miner motherboard) for testing. The 4-in-1 riser gives a real hit on performance on gpugrid. The TB85 runs 18.04 Ubuntu so that will be an interesting comparison plus there are enough x1 slots to where I don't need a splitter." The reduction of the GPUGrid app's performance comes from the reduced PCIe bandwidth, regardless of the limiting factor (an x1 riser or a motherboard with x1 slots will give the same result on the same OS). | |
ID: 52238 | Rating: 0 | |
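To put rough numbers on the bandwidth argument, here is a small sketch of the theoretical one-direction bandwidth of a PCIe link. The thread does not say which PCIe generation the slots or risers run at, so the generation is a parameter, and the even split among GPUs behind a splitter is a simplifying assumption rather than a measured behaviour. The function names are hypothetical.

```python
# Per-lane signalling rate (transfers/s) and line-encoding efficiency for
# PCIe generations 1 to 3. Which generation applies to the hardware in this
# thread is not stated in the posts.
PCIE_GENERATIONS = {
    1: (2.5e9, 8 / 10),     # 2.5 GT/s, 8b/10b encoding
    2: (5.0e9, 8 / 10),     # 5.0 GT/s, 8b/10b encoding
    3: (8.0e9, 128 / 130),  # 8.0 GT/s, 128b/130b encoding
}


def link_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a link in GB/s (10^9 bytes/s)."""
    rate, efficiency = PCIE_GENERATIONS[generation]
    return rate * efficiency / 8 * lanes / 1e9


def shared_bandwidth_gb_s(generation: int, lanes: int, gpus: int) -> float:
    """Even split of one uplink among several busy GPUs (an assumption)."""
    return link_bandwidth_gb_s(generation, lanes) / gpus


if __name__ == "__main__":
    for gen in sorted(PCIE_GENERATIONS):
        print(f"gen {gen} x1: {link_bandwidth_gb_s(gen, 1):.3f} GB/s, "
              f"shared by 3 GPUs: {shared_bandwidth_gb_s(gen, 1, 3):.3f} GB/s each")
```

These are upper bounds before protocol overhead; real throughput is lower, and a splitter that shares a single x1 uplink leaves each card with only a fraction of it when all cards transfer at once.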
"The reduction of the GPUGrid app's performance comes from the reduced PCIe bandwidth, regardless of the limiting factor (an x1 riser or a motherboard with x1 slots will give the same result on the same OS)." Yes, agreed, but I was testing a 4-in-1 adapter for comparison across various projects that are good for Gridcoin mining. That adapter causes a performance hit. Here are results from the same motherboard, but with each board on x1 (no splitter) in slots x8, x4, x8, x4. I failed to get a screenshot when the splitter was used, but the GPU load was in the mid 40s to 50s for the boards on the splitter. Considering that the splitter had three GTX 1060s, one might have expected 60-80% divided by 3, for a 20-30% load, so the splitter caused degradation but it was not that bad. Unaccountably, I had to use http instead of https for the image below. Some sites require secure; others don't seem to care. My images are on GoDaddy at my (now folded up) motorcycle club web site that I put together before I retired. Keith listed tools in Ubuntu for accessing GPU info which I was unaware of. Same for rod4x4. I will look into that Windows app 4x4 mentioned, because I was a C, C#, and VB Windows developer for years. I retired when my company switched platform to Linux and CORBA. I have been looking into accessing GPU-Z shared memory to get values, but it seems easier to log results to the disk drive and read them using my C# program. I will look at nvidia-smi, however. [edit] Strange: parts of my post are missing but are present in the preview. I just pasted this URL into Chrome and there is no missing text in my post. Not sure what is happening in Microsoft Edge that is causing text to be missing. | |
ID: 52239 | Rating: 0 | |
For the actual runs, I'd avoid GPU temps higher than 70°C. Lower than that is even better. | |
ID: 52240 | Rating: 0 | |
"for the actual runs, I'd avoid GPU temps higher than 70°C. Lower than that is even better." Temps are all under 65°C, as I had not started up Afterburner when I took the screenshot. | |
ID: 52241 | Rating: 0 | |
nvidia-smi is available on both Linux and Windows platforms and shows the same information. Running nvidia-smi --help prints out all the possible parameters. | |
ID: 52242 | Rating: 0 | |
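For anyone who wants the nvidia-smi readings mentioned in the thread inside their own program rather than on screen, the following is a minimal sketch. It relies on nvidia-smi's `--query-gpu` and `--format=csv,noheader,nounits` options; the helper names `parse_gpu_csv` and `query_gpus` are hypothetical, and the sample line in the usage note is made up for illustration, not a measurement from this thread.

```python
import csv
import io
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi; each is a documented --query-gpu property.
FIELDS = ["index", "name", "utilization.gpu", "temperature.gpu",
          "power.draw", "memory.used"]


def parse_gpu_csv(text: str) -> List[Dict[str, str]]:
    """Parse csv,noheader,nounits output into one dict of strings per GPU."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if row:  # skip blank lines
            rows.append(dict(zip(FIELDS, (cell.strip() for cell in row))))
    return rows


def query_gpus() -> List[Dict[str, str]]:
    """Run nvidia-smi once and return its per-GPU readings."""
    cmd = ["nvidia-smi",
           "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    # Requires an NVIDIA driver with nvidia-smi on the PATH.
    for gpu in query_gpus():
        print(f"GPU {gpu['index']} ({gpu['name']}): "
              f"{gpu['utilization.gpu']}% load, {gpu['temperature.gpu']} C")
```

The parser can be tried without a GPU by passing it a line such as `0, GeForce GTX 1070, 75, 51, 120.50, 1024`. Values are kept as strings because nvidia-smi may report `[N/A]` or similar for fields a card does not support.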