Message boards : Graphics cards (GPUs) : GPUGRID and app_info
Author | Message |
---|---|
GPU load varies from 65% to 80% on my OC'd GTX570. I think different WUs use the GPU more or less heavily. | |
ID: 21618 | Rating: 0 | rate: / Reply Quote | |
I get about 85% to 95% usage on my GTX 570, no OC, running Linux with SWAN_SYNC=0. Yes, it seems the usage depends on the app or task. | |
ID: 21619 | Rating: 0 | rate: / Reply Quote | |
I've tested the app_info in the past on tasks with various GPU utilization. | |
ID: 21621 | Rating: 0 | rate: / Reply Quote | |
"I get about 85% to 95% usage on my GTX 570, no OC, running Linux with SWAN_SYNC=0. Yes, it seems the usage depends on the app or task." I've got a couple of questions: 1. SWAN_SYNC=0: where should I set that? AFAIK it's on in Linux by default... 2. How can I check GPU load on Linux? AFAIK, the nvclock project has been dead since the G200 series and 2008 (beta support only; my GTX5275 never worked properly with it) :-( It's good news that Linux is faster. I need to spend some time on W7 first to find stable clocks/voltage and flash them, and then it's "home, sweet home" :-) Theoretically speaking, 2 tasks running concurrently will increase GPU load to 100% (I see no reason why not). That will slow down each individual task for sure, but running 2 tasks should increase overall output. | |
ID: 21623 | Rating: 0 | rate: / Reply Quote | |
"I've tested the app_info in the past on tasks with various GPU utilization." Please, please, where can I get that app_info? I wanna try it, both on W7 x64 and Linux (Ubuntu 10.04 x64). That's weird. In my understanding, when you increase GPU load you get more output. That approach works on every other GPU project and I see no reason why it should not work on GPUGRID. But that's theory; I'd like to try it in real life. So, if u've got that app_info, could u pls PM it to me and give the link where it's been discussed on the forum? | |
ID: 21624 | Rating: 0 | rate: / Reply Quote | |
"Theoretically speaking, 2 tasks running concurrently will increase GPU load to 100% (I see no reason why not). That will slow down each individual task for sure, but running 2 tasks should increase overall output." That's correct, but the WU turnaround (return) time is more important in GPUGrid than the overall output. "That's weird. In my understanding, when you increase GPU load you get more output. That approach works on every other GPU project and I see no reason why it should not work on GPUGRID. But that's theory; I'd like to try it in real life." GPUGrid is different from other GPU projects. Here in GPUGrid a new workunit continues the calculation from where the previous WU finished. The new workunit therefore depends on the result of the previous one. That's why turnaround time is very important, and why it is rewarded with +50% bonus credit if you return the result within 24 hours (and +25% bonus credit if you return it within 48 hours). If you run two workunits simultaneously on the same GPU, your turnaround time will miss the 24-hour deadline (with the long workunits), so you will lose the +50% bonus and receive only the +25% bonus. All in all, there is no point in increasing the overall output by 5-15% only to give up 25 percentage points of bonus credit. BTW your WUs are failing because you're overclocking your GTX570 too much. The higher the GPU utilization, the lower the stable overclock. GPUs cannot be overclocked as far for 5-8 hour GPUGrid tasks as for gaming (or for shorter workunits). You should also increase your GPU's fan speed. The lower the GPU temperature, the more stable the GPU will be. | |
ID: 21625 | Rating: 0 | rate: / Reply Quote | |
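The bonus argument in the post above can be checked with a little arithmetic. The sketch below is only an illustration: the tiers (+50% within 24 hours, +25% within 48 hours) and the 5-15% output gain are taken from the post, while the function names and the example turnaround times of 12 and 30 hours are made up for the comparison.

```python
def bonus_multiplier(turnaround_hours: float) -> float:
    """Credit multiplier for one result, using the tiers quoted in the thread."""
    if turnaround_hours <= 24:
        return 1.5   # +50% bonus for returning within 24 hours
    if turnaround_hours <= 48:
        return 1.25  # +25% bonus for returning within 48 hours
    return 1.0       # no bonus after 48 hours


def relative_credit(output_gain: float, turnaround_hours: float) -> float:
    """Credit rate relative to running one task at a time with no bonus."""
    return (1.0 + output_gain) * bonus_multiplier(turnaround_hours)


# Hypothetical turnaround times, chosen only to land in each bonus tier.
single = relative_credit(0.0, 12)    # one task at a time, returned inside 24 h
double = relative_credit(0.15, 30)   # two at a time, +15% output, returned inside 48 h
print(f"single={single:.4f} double={double:.4f}")
```

Even with the most optimistic 15% gain, dropping from the 24-hour tier to the 48-hour tier leaves less credit than running one task at a time. The comparison only matters if running two tasks really does push the turnaround past 24 hours.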
"I've got a couple of questions:" It might be on by default in Linux, I don't know. The stderr on the website for each task shows the value for SWAN_SYNC. If it says there that the value is 0 and you have no "export SWAN_SYNC=0" statement anywhere, then assume it's 0 by default. I put "export SWAN_SYNC=0" in ~/.bashrc. Note the period: it's a hidden file. I'm the only user on the system so that's adequate for me. If you have multiple users you'll want to put the statement in a script that runs for every user. "2. How can I check GPU load on Linux? AFAIK, the nvclock project has been dead since the G200 series and 2008 (beta support only; my GTX5275 never worked properly with it) :-(" Yah, nvclock doesn't work here either. The nvidia-settings utility doesn't give GPU load either. I've been able to read GPU usage only with the nvidia-smi utility and only with the 270.xx drivers. It doesn't report usage here with the 260.xx driver. "It's good news that Linux is faster. I need to spend some time on W7 first to find stable clocks/voltage and flash them, and then it's 'home, sweet home' :-)" There's only one way to be sure and that's to try it and see. I'm sure skgiven and others have been down that road before, so keep in mind what they say when you're testing. And as Retvari said, make sure your tasks return within 24 hrs if you want the credit bonus. I think a GTX 570 will have no problem returning 2 concurrently running tasks in 24 hrs, but ya never know. Also, I had trouble with the temperature on my GTX570 when I started running it on Linux. The fan would not speed up as the temp rose, so I had to manually force it to run at high speed permanently. That required some research and playing around. I expect you and others will run into the same problem, so I wrote up what I did and posted it here on the BOINC dev forum. I recommend getting a handle on that BEFORE you run your first CUDA task on Linux. | |
ID: 21626 | Rating: 0 | rate: / Reply Quote | |
"That's correct, but the WU turnaround (return) time is more important in GPUGrid than the overall output." No doubt about return time :-) "GPUGrid is different from other GPU projects. Here in GPUGrid a new workunit continues the calculation from where the previous WU finished. The new workunit therefore depends on the result of the previous one." Not really: MilkyWay is pretty much the same. They use the WU results to refine their model, and based on the new model they issue a new batch of WUs. "That's why turnaround time is very important, and why it is rewarded with +50% bonus credit if you return the result within 24 hours (and +25% bonus credit if you return it within 48 hours). If you run two workunits simultaneously on the same GPU, your turnaround time will miss the 24-hour deadline (with the long workunits), so you will lose the +50% bonus and receive only the +25% bonus. All in all, there is no point in increasing the overall output by 5-15% only to give up 25 percentage points of bonus credit." Let's take the worst-case (theoretical) scenario, where GPU load is 100%. If I run a 2nd WU concurrently, both of them take twice as long as a single WU. Am I right? Let's come back to the real world. I finished one WU in 6 hrs, so if I run 2 WUs it should take 12 hrs to complete both. But remember, GPU load is less than 100%, so in reality it will be not 12 hrs but, let's say, 9-11 hrs. All in all, I'm comfortably within the 24-hr limit for the +50% bonus. That's why IMHO it makes sense to go with the app_info trick. In that case the project is OK in terms of return time as well as productivity from this video card. "BTW your WUs are failing because you're overclocking your GTX570 too much. The higher the GPU utilization, the lower the stable overclock. GPUs cannot be overclocked as far for 5-8 hour GPUGrid tasks as for gaming (or for shorter workunits). You should also increase your GPU's fan speed. The lower the GPU temperature, the more stable the GPU will be." I was failing due to too high an OC, that's correct. It's no big secret that different projects tolerate different levels of OCing. On PrimeGrid this card has been rock solid for 7 months now @880/0.988V and another two @900/1.0000V. The task I completed here ran @860/1.0000V, way lower than on PrimeGrid, but that's OK. I'll play around to find the balance between clocks, voltage and heat on these long-running WUs. While GPU load is well under 100% I can slightly increase voltage and clock and still get adequate temps. When I'm done I'll flash the cards with those clocks/voltages and move to Linux. So, my question is: where can I get app_info? Don't get me wrong: I'm NOT trying to BS the project. In fact, I wanna do more. | |
ID: 21627 | Rating: 0 | rate: / Reply Quote | |
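The 9-11 hour estimate in the post above can be reproduced with a deliberately idealized model. It assumes that a task running alone keeps the GPU busy for a fraction `gpu_load` of the time, that a second task fills the idle gaps perfectly, and that sharing adds no overhead. None of that is confirmed for GPUGrid (a later reply argues the PCIe bus may be the real limit), so treat the result as a best case; the function names are illustrative.

```python
def concurrent_wall_time(single_hours: float, gpu_load: float, n_tasks: int = 2) -> float:
    """Best-case wall time for n_tasks sharing one GPU.

    Assumption: the tasks interleave perfectly, so the GPU is the only limit
    and total GPU work is n_tasks * gpu_load * single_hours.
    """
    if not 0.0 < gpu_load <= 1.0:
        raise ValueError("gpu_load must be in (0, 1]")
    return single_hours * max(1.0, n_tasks * gpu_load)


def throughput_gain(single_hours: float, gpu_load: float, n_tasks: int = 2) -> float:
    """Fractional gain in tasks per hour compared with running one at a time."""
    wall = concurrent_wall_time(single_hours, gpu_load, n_tasks)
    return n_tasks * single_hours / wall - 1.0


for load in (1.0, 0.95, 0.8, 0.65):
    hours = concurrent_wall_time(6.0, load)
    gain = throughput_gain(6.0, load)
    print(f"load {load:.0%}: two 6 h tasks finish in {hours:.1f} h, gain {gain:.0%}")
```

With the 6-hour task from the post and the 65-80% load reported earlier in the thread, this best case lands at or below 9.6 hours for the pair. Any real contention would push it toward the 12-hour worst case.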
"It might be on by default in Linux, I don't know. The stderr on the website for each task shows the value for SWAN_SYNC. If it says there that the value is 0 and you have no 'export SWAN_SYNC=0' statement anywhere, then assume it's 0 by default. I put 'export SWAN_SYNC=0' in ~/.bashrc. Note the period: it's a hidden file. I'm the only user on the system so that's adequate for me. If you have multiple users you'll want to put the statement in a script that runs for every user." Here's the link to my completed task. But I cannot see any mention of SWAN_SYNC… I'll put "export SWAN_SYNC=0" in ~/.bashrc when I'm on Linux. Thanks a lot, man :-) "Yah, nvclock doesn't work here either. The nvidia-settings utility doesn't give GPU load either. I've been able to read GPU usage only with the nvidia-smi utility and only with the 270.xx drivers. It doesn't report usage here with the 260.xx driver." I'd never heard of this utility. I'll try it for sure. Thanks again :-) I installed the latest 275.xx.xx driver, so hopefully it will work with this version. BTW, which driver version is the fastest? 260, 270 or 275? "There's only one way to be sure and that's to try it and see. I'm sure skgiven and others have been down that road before, so keep in mind what they say when you're testing. And as Retvari said, make sure your tasks return within 24 hrs if you want the credit bonus. I think a GTX 570 will have no problem returning 2 concurrently running tasks in 24 hrs, but ya never know." That's exactly what I want: to try it myself. No doubt skgiven and the other guys did that before, but look, we're all helping science, so we're all scientists of a sort. So let's use the scientific approach and try to repeat the results :-) I also think I should be pretty much OK to meet the 24-hr limit. "Also, I had trouble with the temperature on my GTX570 when I started running it on Linux. The fan would not speed up as the temp rose, so I had to manually force it to run at high speed permanently. That required some research and playing around. I expect you and others will run into the same problem, so I wrote up what I did and posted it here on the BOINC dev forum. I recommend getting a handle on that BEFORE you run your first CUDA task on Linux." That's a nice manual. I tried coolbits 1, but never 4. Maybe that is why neither fan speed control nor GPU OCing ever worked for me. When I select "Thermal Settings", nothing really happens. But I found my own way: find proper clocks/voltage in Windows using MSI Afterburner, modify the BIOS using NiBiToR and then flash it. BTW, you cannot set fan speed below 40% or above 85% on GTX5x0 cards. But starting from NiBiToR version 6.0 you can adjust fan speed in the BIOS as well. If you need a modded BIOS for your card, just let me know, I'll do it for u. | |
ID: 21628 | Rating: 0 | rate: / Reply Quote | |
"BTW, which driver version is the fastest? 260, 270 or 275?" I'm not sure, I've never tested to see which is fastest. "If you need a modded BIOS for your card, just let me know, I'll do it for u." Thank you for the offer :) | |
ID: 21629 | Rating: 0 | rate: / Reply Quote | |
OK, I'll use 275.xx drivers, NP at all :-) | |
ID: 21630 | Rating: 0 | rate: / Reply Quote | |
"But I cannot see any mention of SWAN_SYNC…" It only says so if you are running SWAN_SYNC; it doesn't mention anything if not (at least when looking at my results). Be careful when putting SWAN_SYNC in .bashrc: it will only work if BOINC is running as that user. To set it system-wide, look in /etc/environment or /etc/env.d/ | |
ID: 21631 | Rating: 0 | rate: / Reply Quote | |
"It only says so if you are running SWAN_SYNC; it doesn't mention anything if not (at least when looking at my results)." Stupid me... That rig is on Win7 now and that's why SWAN_SYNC is not there :-) "Be careful when putting SWAN_SYNC in .bashrc: it will only work if BOINC is running as that user." That's not a problem coz there's only one user account on that PC. BUT: I think it's worth the effort to put that in the FAQ. Maybe it's necessary to talk to some1. All in all, where can I get app_info to try it? | |
ID: 21632 | Rating: 0 | rate: / Reply Quote | |
"Be careful when putting SWAN_SYNC in .bashrc: it will only work if BOINC is running as that user." If you installed BOINC from repositories then it is NOT set up to run under your account. It will run under a special BOINC user account, which is usually named boinc or something similar. In that case the boinc user will not see the SWAN_SYNC environment variable if you put it in your .bashrc. If you installed BOINC from the Berkeley installer (the .sh script) then putting it in your .bashrc is adequate. | |
ID: 21633 | Rating: 0 | rate: / Reply Quote | |
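The reason the account matters is that environment variables are inherited per process: a variable exported in one user's ~/.bashrc is not visible to a client started under a different account. A minimal way to see what a given account's processes inherit is to run something like the sketch below under that account. The helper name `swan_sync_status` is made up for illustration, and the script reports only the environment of the process it runs in, not that of an already running BOINC client.

```python
import os


def swan_sync_status(env=None) -> str:
    """Describe whether SWAN_SYNC is present in the given environment mapping.

    With no argument, the environment of the current process is inspected.
    """
    env = os.environ if env is None else env
    value = env.get("SWAN_SYNC")
    if value is None:
        return "SWAN_SYNC is not set for this process"
    return f"SWAN_SYNC={value}"


# Run this from the same account (and the same kind of shell or service)
# that starts BOINC to see what that account's processes would inherit.
print(swan_sync_status())
```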
"If you installed BOINC from repositories then it is NOT set up to run under your account. It will run under a special BOINC user account, which is usually named boinc or something similar. In that case the boinc user will not see the SWAN_SYNC environment variable if you put it in your .bashrc. If you installed BOINC from the Berkeley installer (the .sh script) then putting it in your .bashrc is adequate." Normally I use repos, but BOINC is the rare exception. I download it from Berkeley and then run the .sh script. So it should work for me :-) | |
ID: 21635 | Rating: 0 | rate: / Reply Quote | |
Try this message from the Einstein forums. It has an app_info.xml used for running 4 Einstein tasks on a GTX480. It should give you the general idea. Maybe you can modify it to work with GPUgrid. | |
ID: 21639 | Rating: 0 | rate: / Reply Quote | |
"Try this message from the Einstein forums. It has an app_info.xml used for running 4 Einstein tasks on a GTX480. It should give you the general idea. Maybe you can modify it to work with GPUgrid." I'll try it, but... different apps take different arguments, and they are not necessarily the same across projects... I've got app_info files for MilkyWay, PrimeGrid and Collatz, but I'm not sure they will work. | |
ID: 21640 | Rating: 0 | rate: / Reply Quote | |
You could try this, | |
ID: 21641 | Rating: 0 | rate: / Reply Quote | |
<name>acemdlong_6.15_windows_intel86__cuda31.exe</name> There is no ".exe" extension at the end of the filename. The correct line is: <name>acemdlong_6.15_windows_intel86__cuda31</name> Back in February I tried something similar to what you've just posted, and all I got was errors. PS: there are also no ".dll" files on Linux, so this app_info.xml is for Windows only. | |
ID: 21642 | Rating: 0 | rate: / Reply Quote | |
I did run using an app_info file for a week or more on a dual-GPU setup, but that was over 6 months ago. Others also did this. For a short time it was worth it, but that was before 6.14 and only when there were lots of tasks with low GPU utilization around (~50%). I did post a bit about this and sent several PMs about my findings. Others concurred. I did not start a thread about this because using app_info at GPUGrid is definitely not the recommended way to go; by and large the tasks are fast enough. You have to be savvy and hands-on when setting it up, and then the opposite: hands off, just let it run. | |
ID: 21643 | Rating: 0 | rate: / Reply Quote | |
"Let's take the worst-case (theoretical) scenario, where GPU load is 100%. If I run a 2nd WU concurrently, both of them take twice as long as a single WU. Am I right?" You're right. Theoretically. :) "Let's come back to the real world. I finished one WU in 6 hrs, so if I run 2 WUs it should take 12 hrs to complete both. But remember, GPU load is less than 100%, so in reality it will be not 12 hrs but, let's say, 9-11 hrs. All in all, I'm comfortably within the 24-hr limit for the +50% bonus." There is a prerequisite for this scenario: your PC should not keep a spare pair of WUs in its queue for too long, because if these two spare WUs sit in the queue for 12 hours and are then processed for another 12 hours, they will just miss the 24-hour deadline. But if your queue is short and a WU fails, your GPU will run only one WU (or in the worst case nothing) until a new one (or two) is downloaded. "That's why IMHO it makes sense to go with the app_info trick." You forget about the reason behind low GPU utilization: GPUGrid uses both the GPU and the CPU for processing a WU. The higher the CPU usage of a WU, the lower its GPU usage, because of the heavy data traffic on the PCIe bus. A GIANNI_KKFREE WU (82% GPU usage) uses 5.5 times more CPU than a TONI_AGGsoup WU (98% GPU usage). I learned from experience that I have to overclock a Core2Duo by 33% to get the performance (say 5-10% more GPU usage) of a Core i3 (its integrated PCIe controller is 33% faster than the X48 chipset's). I think the bus between the GPU and the CPU is the bottleneck, and this bus (i.e. PCIe) can be overloaded even by processing a single GIANNI_KKFREE-like WU. If this is right, there will be no significant rise in GPU usage from running two of them simultaneously (although there could be a significant rise in GPU usage from running a low and a high GPU-utilizing task at the same time, but then the lower-utilizing task might be slower than expected). Don't get me wrong, I'm curious about this, but at the same time I'm very skeptical. Let's find out... | |
ID: 21644 | Rating: 0 | rate: / Reply Quote | |
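The queue prerequisite in the post above is simply that time waiting in the queue plus time being processed must fit inside the bonus window. A minimal sketch, assuming the turnaround clock runs for the whole time a task sits on the host (as the post implies); the function names are illustrative.

```python
def turnaround_hours(queue_hours: float, run_hours: float) -> float:
    """Total time a task spends on the host: waiting in the queue plus running."""
    return queue_hours + run_hours


def slack_hours(queue_hours: float, run_hours: float, deadline_hours: float = 24.0) -> float:
    """Hours to spare before the bonus deadline (zero or negative means none)."""
    return deadline_hours - turnaround_hours(queue_hours, run_hours)


# The case from the post: a spare pair queued for 12 h, then processed for 12 h.
print(slack_hours(12.0, 12.0))   # no margin left
# Same pair started immediately after download.
print(slack_hours(0.0, 12.0))
```

With two tasks running at once, each queued pair waits roughly one full run before it starts. Any cached work therefore eats most of the 24-hour window, which is why the post recommends a short queue.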
"I did run using an app_info file for a week or more on a dual-GPU setup, but that was over 6 months ago. Others also did this. For a short time it was worth it, but that was before 6.14 and only when there were lots of tasks with low GPU utilization around (~50%)." Then maybe I'm wrong that tasks with low GPU utilization overload the PCIe bus, and in that case it's worth making an app_info.xml. I'm getting totally confused. | |
ID: 21645 | Rating: 0 | rate: / Reply Quote | |
You have to remember this was six months ago, and back then SWAN_SYNC mattered much more (a full CPU core/thread vs 2.5% of the total CPU; 16% of an i7 thread). A lot has changed, and running 2 tasks without SWAN_SYNC might be different now, but only so long as these are tasks with relatively low GPU utilization. There will be no benefit when even one of the tasks is a 95% utilizing task. At 85%, who knows (things have changed, so you would need to check). At 50% (and there is nothing near that now) we would benefit, and I would be running an app_info setup already. | |
ID: 21646 | Rating: 0 | rate: / Reply Quote | |
Double checked and I did have the .exe in my old app_info file (might not make any difference in Windows). The .exe has been omitted since version 6.14. I know this because I'm using eFMer's Priority. | |
ID: 21648 | Rating: 0 | rate: / Reply Quote | |
Nice discussion, guys :-) Rite now I'm waiting for a WU to complete and then I'll try app_info, both with and w/o .exe. "You're right. Theoretically. :)" That's good news :-) "There is a prerequisite for this scenario: your PC should not keep a spare pair of WUs in its queue for too long, because if these two spare WUs sit in the queue for 12 hours and are then processed for another 12 hours, they will just miss the 24-hour deadline. But if your queue is short and a WU fails, your GPU will run only one WU (or in the worst case nothing) until a new one (or two) is downloaded." Anyway, rite now I've got 0 tasks in the queue, so I'm really worried about that. "You forget about the reason behind low GPU utilization:" That's a good point. I was wondering what the hell is going on: CPU and GPU usage are both way under 100%, so what's the bottleneck? Now it's clear. If the reason for that is PCIe, then u r rite, there's nothing I can gain from that trick. On the other hand, PCIe 2.0 x16 is a hell of a fast bus and I'm not really sure it is the problem. But anyway, let's wait 20 minutes and we'll get the answer. | |
ID: 21649 | Rating: 0 | rate: / Reply Quote | |
OK, I tried both versions (with .exe and w/o .exe). They deleted everything (including the .exe and all .dll files) from the BOINC folder, so smth is wrong with the app_info. | |
ID: 21650 | Rating: 0 | rate: / Reply Quote | |
I don't know what the problem is but I guess the issue may be related to the fact that priority is now being controlled at the thread level rather than Process level. | |
ID: 21653 | Rating: 0 | rate: / Reply Quote | |
"I don't know what the problem is but I guess the issue may be related to the fact that priority is now being controlled at the thread level rather than Process level." Eh? App_info will work if both the structure and contents of the file are accurate. Just "guessing" the name of a file, library or DLL is never going to work; you need to apply some serious comprehension to the issue too. Then it works, and whatever an application does to control its own thread priority will work just the same, whether launched under app_info or otherwise. skgiven posted an app_info for Windows yesterday. I haven't tested it, but it looks OK to me. Linux users can study it, see what the basic shape is, and adapt it to their own needs. You need: <app>: the name GPUGrid uses internally to identify the application ('long' and 'normal' length tasks may use different app names). <file_info>: one section for each executable, library, or other supporting file you're going to use (you also need to have the actual files themselves, of course!). <app_version>: a control structure which links the components together and tells BOINC how to use them, e.g. the <coproc><count> values which started this conversation. Have a look at the app_info documentation. Note in particular the line at the bottom of that page: "Generally this should match the corresponding elements in a scheduler reply message (sched_reply_URL.xml)". If you are already running the project successfully in its normal, automatic-download mode, you can read all the information you need to construct an app_info.xml file that will work on your own machine (including the URLs for downloading the necessary files) from either sched_reply...xml or client_state.xml. | |
ID: 21656 | Rating: 0 | rate: / Reply Quote | |
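The three building blocks listed in the post above can be pictured with a small generator. This is only a sketch of the general shape, not a tested GPUGrid configuration. The top-level elements (`<app>`, `<file_info>`, `<app_version>`, `<coproc><count>`) come from the post; the child element names follow the BOINC anonymous-platform documentation as best recalled and should be checked against your own client_state.xml. The app name `acemdlong`, the version number 615, the plan class `cuda31` and the file `example_support.dll` are placeholders guessed from the file name quoted earlier in the thread, and the thread itself reports that the app name in the posted file was wrong.

```python
import xml.etree.ElementTree as ET


def build_app_info(app_name, exe_name, version_num, plan_class, gpu_share, support_files=()):
    """Build a skeleton app_info tree: one <app>, one <file_info> per file,
    and one <app_version> that ties them together."""
    root = ET.Element("app_info")

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    all_files = (exe_name, *support_files)
    for fname in all_files:
        file_info = ET.SubElement(root, "file_info")
        ET.SubElement(file_info, "name").text = fname
        if fname == exe_name:
            ET.SubElement(file_info, "executable")

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name  # must match <app><name>
    ET.SubElement(version, "version_num").text = str(version_num)
    ET.SubElement(version, "plan_class").text = plan_class
    coproc = ET.SubElement(version, "coproc")
    ET.SubElement(coproc, "type").text = "CUDA"
    ET.SubElement(coproc, "count").text = str(gpu_share)  # 0.5 = two tasks per GPU
    for fname in all_files:
        file_ref = ET.SubElement(version, "file_ref")
        ET.SubElement(file_ref, "file_name").text = fname
        if fname == exe_name:
            ET.SubElement(file_ref, "main_program")
    return root


# All values below are placeholders; read the real ones from client_state.xml.
root = build_app_info(
    app_name="acemdlong",
    exe_name="acemdlong_6.15_windows_intel86__cuda31",
    version_num=615,
    plan_class="cuda31",
    gpu_share=0.5,
    support_files=("example_support.dll",),
)
if hasattr(ET, "indent"):  # pretty-printing helper, available from Python 3.9
    ET.indent(root)
print(ET.tostring(root, encoding="unicode"))
```

The point of generating the file is that the app name and every file name are written from a single source, so the `<app>`, `<file_info>` and `<app_version>` sections cannot drift apart. That kind of mismatch is the error discussed in the replies that follow.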
The app name is wrong: | |
ID: 21663 | Rating: 0 | rate: / Reply Quote | |
If you're going to change it in one place, you have to change it in the other, too. The app name is wrong: | |
ID: 21664 | Rating: 0 | rate: / Reply Quote | |
Thanks Richard. | |
ID: 21665 | Rating: 0 | rate: / Reply Quote | |
"Thanks Richard." I'll test it later 2day. | |
ID: 21670 | Rating: 0 | rate: / Reply Quote | |
Same story: almost everything is gone from the www.gpugrid.net folder... | |
ID: 21673 | Rating: 0 | rate: / Reply Quote | |
Probably too late, but maybe helpful for future tests: | |
ID: 21833 | Rating: 0 | rate: / Reply Quote | |
Obviously it does not, I tried myself. But I also tried to translate a working app_info from 6.14 to 6.15 and it did not work either. :( | |
ID: 21834 | Rating: 0 | rate: / Reply Quote | |