Message boards : Graphics cards (GPUs) : New Gianni tasks take loooong time... a warning (8-12-16)

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44148 - Posted: 12 Aug 2016 | 12:57:13 UTC
Last modified: 12 Aug 2016 | 12:59:18 UTC

Just a warning for those with GPUs weaker or slower than maybe a 770: the new batch of Gianni tasks is going to take around 30 hours on my 980 Ti Classified. That means one of my standard 980s would crunch it in about 47 hours or less. You can go way down from there. My 730 would not finish one in time. My Quadro K2100M would not finish one in time. Somewhere between the K2100M and the 980 is the cut-off point for these cards. Depending on settings and configuration, some cards may just make it, or fail and waste the time.

To compare, the 980 Ti Classy will normally do a single task in 8 hours or so, and 2 tasks in 11-13 hours. This Gianni has taken 20 hours for the first 68.5%, and I switched the card from 2 tasks to 1 task 12 hours ago when it looked like it would hit 36 hours or more. The other tasks, even the one Gerard_FXCXCL12RX that ran alongside it for those first 8 hours, took the normal time; that one was scheduled to finish in about 12.5 hours. It has since cycled through the other cards on the system and finished.
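
For anyone wanting to redo that estimate on their own card, here is a minimal Python sketch of the arithmetic, assuming progress is roughly linear in time. The function names and the threshold argument are only illustrative, and because this task ran partly at 2 per card and partly alone, a straight-line projection is rough at best.

```python
def project_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear projection of the total run time from progress so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def exceeds(elapsed_hours: float, fraction_done: float, limit_hours: float) -> bool:
    """True if the projected total run time is longer than limit_hours."""
    return project_total_hours(elapsed_hours, fraction_done) > limit_hours


# Figures from the post: 20 hours elapsed at 68.5% done.
total = project_total_hours(20.0, 0.685)
print(f"projected total: {total:.1f} h, remaining: {total - 20.0:.1f} h")
print("over 24 h:", exceeds(20.0, 0.685, 24.0))
```

With the figures above it projects a total of a little over 29 hours.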

If you are running a card that struggles to meet the deadline on normal Gerard_FXCXCL12RX tasks and only sometimes makes it on the MO_MOR or MO_TRV tasks, do not attempt these current GIANNI_D3C36bCHL tasks. Again, based on configuration and settings and assuming 24-hour-a-day crunching, I think a card somewhere between a 770 and a 970 might not make it, and anything lower than a 770 would not.
____________
1 Corinthians 9:16 "For though I preach the gospel, I have nothing to glory of: for necessity is laid upon me; yea, woe is unto me, if I preach not the gospel!"
Ephesians 6:18-20, please ;-)
http://tbc-pa.org

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44150 - Posted: 12 Aug 2016 | 13:34:15 UTC
Last modified: 12 Aug 2016 | 13:37:12 UTC

I also just had one lock my system up and error out on a dual-980 system after 13 hours of running. It took 2 reboots, the second being a hard boot, to get it back up and running, which aborted the task by itself (I didn't hit abort task).

https://www.gpugrid.net/result.php?resultid=15233074

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44151 - Posted: 12 Aug 2016 | 14:42:44 UTC - in response to Message 44148.

If you are running a card that struggles for the wall of time given it on normal Gerard_FXCXCL12RX tasks and occasionally misses or makes it on the MO_MOR or MO_TRV tasks, do not attempt these current GIANNI_D3C36bCHL tasks. Again, I am thinking based on configuration and settings and assuming 24 hour a day crunching, somewhere between a 970 and a 770 might not make it and lower than a 770 it would not.

Sounds like they should be put in a separate queue.

Profile Logan Carr
Joined: 12 Aug 15
Posts: 240
Credit: 64,069,811
RAC: 0
Level
Thr
Message 44152 - Posted: 12 Aug 2016 | 17:46:09 UTC - in response to Message 44151.

Earlier this morning I saw the error rate for Gianni at around 90%, and now it's in the 80s.

I have a 960 that completes tasks just fine though in under 24 hours, but I got an error last night due to a power outage.

If I get a gianni task, I'll report how it goes here if you want.


____________
Cruncher/Learner in progress.

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44154 - Posted: 12 Aug 2016 | 20:41:32 UTC - in response to Message 44152.

Sounds good. I was just warning about the length, not really the error rate, until I noticed that. Oftentimes new tasks get errors when first run and a bug is fixed for the rest of them, so that isn't a concern of mine. The length, with some cards not being able to do them in time, was what the warning was really about. But thanks.

Profile Logan Carr
Joined: 12 Aug 15
Posts: 240
Credit: 64,069,811
RAC: 0
Level
Thr
Message 44156 - Posted: 12 Aug 2016 | 20:59:48 UTC - in response to Message 44154.

Oh, I didn't know the errors got fixed like that, thanks for letting me know!

And no prob, will gladly keep my eye out.
____________
Cruncher/Learner in progress.

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44157 - Posted: 12 Aug 2016 | 22:35:40 UTC - in response to Message 44156.
Last modified: 12 Aug 2016 | 22:40:37 UTC

Looks like the one I was running, which was going to take longer than 30 hours at 2 tasks per card, came in at 28.2 hours after it ran the last 10 or so hours as a single task on that card. I still stand by my estimate that anything 770 or below would not finish and that 770 to 970 might come close, depending on config and settings. I think this one at 2 per card would have been 32-34 hours on the 980 Ti Classified clocked at 3005 mem and 1430 MHz core clock speed.

https://www.gpugrid.net/result.php?resultid=15233076

439,250.00 credit by the way, so a fair reward compared to tasks done in half the time for just over half the credit.

12 have finished. Does anyone have feedback on which card they used and how long it took? As long as I am giving a warning about these, I'd like to make it accurate with feedback, especially if they are going to keep coming out and be this large computationally. Thanks.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44163 - Posted: 14 Aug 2016 | 13:50:26 UTC

I just finished one of these units on my windows 10 computer:

Task: e2s7_e1s51p0f618-GIANNI_D3C36bCHL1-0-1-RND2166_0
Work unit: 11693869
Sent: 13 Aug 2016 | 19:38:10 UTC
Time reported: 14 Aug 2016 | 13:35:27 UTC
Status: Completed and validated
Run time (sec): 63,577.01
CPU time (sec): 63,348.98
Credit: 527,100.00
Application: Long runs (8-12 hours on fastest card) v8.48 (cuda65)


http://www.gpugrid.net/result.php?resultid=15235031


These units seem to be very CPU dependent. The GPU and power usage are slightly lower than the GERARD_FXCXCL12RX units.


Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Level
Trp
Message 44164 - Posted: 14 Aug 2016 | 14:26:23 UTC - in response to Message 44163.
Last modified: 14 Aug 2016 | 14:50:02 UTC

I have three running on my hosts.
On a GTX980Ti the task properties show 6.48% per hour, which gives an estimate of 15h 25m 56s; I've just reduced the number of CPU tasks to 1, so it should be a bit faster from now on.
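
The estimate follows directly from the progress rate. Here is a small Python sketch of the conversion; the function names are hypothetical, and it assumes the rate stays constant for the whole run, which it will not if the CPU load changes partway through.

```python
def format_hms(total_seconds: int) -> str:
    """Format a number of seconds as 'Hh Mm Ss'."""
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}h {minutes}m {seconds}s"


def eta_from_rate(percent_per_hour: float) -> str:
    """Total run time implied by a constant progress rate in % per hour."""
    total_seconds = round(100.0 / percent_per_hour * 3600)
    return format_hms(total_seconds)


print(eta_from_rate(6.48))  # 15h 25m 56s
```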

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Level
Trp
Message 44165 - Posted: 14 Aug 2016 | 14:42:33 UTC - in response to Message 44156.
Last modified: 14 Aug 2016 | 14:48:55 UTC

Earlier this morning I've seen the error rates for gianni around 90% and now it's in the 80's.
Often times new tasks get errors when first run and a bug is fixed for the rest of them...
Oh, I didn't know the errors got fixed liked that, thanks for letting me know!

Guys, this could be the case, but usually this is a "natural phenomenon" coming from the way the performance / reliability stats work:
As a valid result takes 8-15~24-48 hours to process, a failed one takes only seconds (or maybe just a few hours), so right after the release of a new batch there are only failed tasks in the stats, which can be ignored. Then the stats "normalize" themselves when valid results have returned, but it takes at least as much time as it takes to finish a WU (plus the overhead of the data transmission).
The only way to know that a batch has a bug is if it is failing even on the most reliable hosts. This is very rare at GPUGrid.
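
To make that effect concrete, here is a toy Python model. Every number in it is an assumption chosen for illustration (1000 tasks all sent at hour 0, a true error rate of 10%, failures returned within minutes, valid results returned evenly between 15 and 48 hours); none of it is taken from the project's real statistics.

```python
def observed_error_rate(hours_since_release,
                        n_tasks=1000,
                        true_error_rate=0.10,
                        fail_after_hours=0.05,
                        fastest_ok_hours=15.0,
                        slowest_ok_hours=48.0):
    """Share of *returned* results that are errors at a given time.

    Hypothetical model: failures come back almost at once, while valid
    results come back evenly spread between the fastest and slowest hosts.
    Returns None while nothing has been returned yet.
    """
    n_failed = round(n_tasks * true_error_rate)
    n_ok = n_tasks - n_failed
    failed_back = n_failed if hours_since_release >= fail_after_hours else 0
    span = slowest_ok_hours - fastest_ok_hours
    ok_back = sum(
        1
        for i in range(n_ok)
        if fastest_ok_hours + span * i / max(n_ok - 1, 1) <= hours_since_release
    )
    returned = failed_back + ok_back
    return failed_back / returned if returned else None


for t in (1, 20, 30, 48):
    rate = observed_error_rate(t)
    print(f"{t:>2} h after release: {rate:.0%} of returned results are errors")
```

In this model the displayed error rate starts at 100% and only settles at the true 10% once the slowest valid results are back, even though nothing about the batch itself changed.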

Profile Logan Carr
Joined: 12 Aug 15
Posts: 240
Credit: 64,069,811
RAC: 0
Level
Thr
Message 44166 - Posted: 14 Aug 2016 | 15:54:13 UTC - in response to Message 44165.

I just downloaded a new task and it's a gianni task. (first one)

I have still a bit to go on my current Gerard task, though.

Would you like me to link you all to my result once it's complete? Maybe that can help others?

Your choice.






____________
Cruncher/Learner in progress.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Level
Trp
Message 44167 - Posted: 14 Aug 2016 | 16:48:31 UTC - in response to Message 44166.

Would you like me to link you all to my result once it's complete?
There's no need for that as your computers are not hidden, so anyone can see and browse your hosts and results.
But you can do it by courtesy if you want to make our job easier :)

Profile Logan Carr
Joined: 12 Aug 15
Posts: 240
Credit: 64,069,811
RAC: 0
Level
Thr
Message 44168 - Posted: 14 Aug 2016 | 17:49:57 UTC - in response to Message 44167.

Would you like me to link you all to my result once it's complete?
There's no need for that as your computers are not hidden, so anyone can see and browse your hosts and results.
But you can do it by courtesy if you want to make our job easier :)


Sure! I'll post back when it's complete. (1 or 2 days)
____________
Cruncher/Learner in progress.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 44170 - Posted: 14 Aug 2016 | 23:06:26 UTC - in response to Message 44168.

Expecting these Gianni tasks to take ~32h on a GTX970 (W10/WDDM).
Noticed the ~65% GPU usage, higher than usual clocks and lowish power usage.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44171 - Posted: 15 Aug 2016 | 1:49:29 UTC - in response to Message 44163.

I just finished one of these units on my windows 10 computer:

e2s7_e1s51p0f618-GIANNI_D3C36bCHL1-0-1-RND2166_0 11693869 13 Aug 2016 | 19:38:10 UTC 14 Aug 2016 | 13:35:27 UTC Completed and validated 63,577.01 63,348.98 527,100.00 Long runs (8-12 hours on fastest card) v8.48 (cuda65)


http://www.gpugrid.net/result.php?resultid=15235031


These units seem to be very CPU dependent. The GPU and power usage are slightly lower than the GERARD_FXCXCL12RX units.




Here is an example of this unit type running on a computer with an older and slower CPU and motherboard:

Task: e2s4_e1s51p0f710-GIANNI_D3C36bCHL1-0-1-RND6774_0
Work unit: 11693866
Sent: 13 Aug 2016 | 19:40:23 UTC
Time reported: 15 Aug 2016 | 1:36:12 UTC
Status: Completed and validated
Run time (sec): 104,399.86
CPU time (sec): 100,946.40
Credit: 439,250.00
Application: Long runs (8-12 hours on fastest card) v8.48 (cuda65)

http://www.gpugrid.net/result.php?resultid=15235028


Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44172 - Posted: 15 Aug 2016 | 3:16:31 UTC - in response to Message 44171.

Here is an example of this unit type running on a computer with an older and slower CPU and motherboard:

e2s4_e1s51p0f710-GIANNI_D3C36bCHL1-0-1-RND6774_0 11693866 13 Aug 2016 | 19:40:23 UTC 15 Aug 2016 | 1:36:12 UTC Completed and validated 104,399.86 100,946.40 439,250.00 Long runs (8-12 hours on fastest card) v8.48 (cuda65)

http://www.gpugrid.net/result.php?resultid=15235028

Yikes, and that's on a 980Ti. Found 2 running on my boxes. Aborted the one on the 650Ti, left the one running on the 750Ti. Will report when done, if I don't die of old age first.

BTW, CONGRATS on kicking my tukus getting to 3,000,000,000! :-)

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44173 - Posted: 15 Aug 2016 | 10:40:35 UTC - in response to Message 44172.



BTW, CONGRATS on kicking my tukus getting to 3,000,000,000! :-)



Thanks, that's what happens if you hang around here long enough! You do lots of crunching.

By the way, I am (have been) (and will not be for long) keeping the number 6 position in total credit warm for you!


Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Level
Trp
Message 44174 - Posted: 15 Aug 2016 | 14:52:38 UTC

e4s30_e1s26p0f463-GIANNI_D3C36bCHL1-0-1-RND6365_0 15h 6m 47s (54,407 s) 980Ti/XP
e3s119_e1s26p0f620-GIANNI_D3C36bCHL1-0-1-RND0388_0 14h 58m 29s (53,909 s) 980Ti/XP
e3s65_e1s13p0f646-GIANNI_D3C36bCHL1-0-1-RND3014_0 22h 18m 2s (80,282 s) 980/XP

I have the feeling that the length of these workunits is set to the performance level of the GTX 1080.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44175 - Posted: 15 Aug 2016 | 16:22:06 UTC
Last modified: 15 Aug 2016 | 16:24:14 UTC

These long WUs have an extra caution: if there's any kind of power glitch, the app has a very good chance of causing the WU to error out. The app really needs to be fixed, but wonder if it's ever going to happen. Looks like the Gianni will finish on my super-clocked (factory) 750Ti in about 60 hours. Obviously too late for any bonuses and also at risk of power glitches due to the faulty app. I've sadly started aborting the rest of the Giannis. :-(

As previously requested, a separate queue would be nice. Also probably not going to happen.

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44177 - Posted: 15 Aug 2016 | 19:10:43 UTC - in response to Message 44163.
Last modified: 15 Aug 2016 | 19:17:31 UTC

These units seem to be very CPU dependent. The GPU and power usage are slightly lower than the GERARD_FXCXCL12RX units.

Yes, and I was off on which cards could finish them on time. I was not estimating the time-out time correctly when I said it (TWICE!). And it seems the CPU has more to do with the length than the GPU on these. Across my systems, the same NVIDIA cards on different mobo/CPU combos make a big difference in how fast they complete. All my main cards are 980 or 980 Ti, but the CPUs vary from i7-4960X to i7-4790K to AMD A10-7700K Radeon R7, and the run times get longer as the processor gets weaker:

The i7-4960X was 101,521 s (28 hours).
The i7-4790K is looking to be around 111,600 s (31 hours).
The AMD A10-7700K Radeon R7 is looking at around 152,500 s (42 hours).

The i7-4790K and the AMD A10-7700K Radeon R7 are both running standard 980 cards with the same settings (card model, memory and core clock speeds) and the same external factors (the 3D settings in the NVIDIA control panel, little usage by other processes, etc.), but the finish times are very different. The fact that the i7-4960X and the i7-4790K are similar in time even though the cards are different (the i7-4960X has 980 Ti Classys, the i7-4790K has 980s) tells me the CPU is making up for the GPU, in that the i7-4790K is running at 4 GHz and the i7-4960X at 3.6 GHz. Also, the i7-4960X is running CPU tasks and the i7-4790K is not (because of heat, which cannot be changed because of its low-air-movement location).

BTW, they are all set to run 2 tasks per card, and that 28 hour reading included some time run as a single task on the card (as noted in my previous comments), so I suspect they would be even closer in time had I not run it single (all things being equal and all). Also noteworthy is that the i7-4960X is on Windows 10 but the i7-4790K and the AMD are on Windows 7.

So still a warning, but it is not really card dependent; GPU, CPU, OS, and mobo all play a part in the total time.
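
To put numbers on that comparison, here is a short Python sketch using the three run times quoted in this post (two of them were still estimates at the time of writing). The dictionary labels are just shorthand for the hosts described here, and the calculation ignores the OS and CPU-task differences mentioned in the post.

```python
# Run times in seconds as quoted in this post (two are estimates).
run_seconds = {
    "i7-4960X + 980 Ti": 101_521,
    "i7-4790K + 980": 111_600,
    "A10-7700K + 980": 152_500,
}

for host, seconds in run_seconds.items():
    print(f"{host}: {seconds / 3600:.1f} h")

# Same GPU model (980), different CPU: how much longer does the slower host take?
same_gpu_ratio = run_seconds["A10-7700K + 980"] / run_seconds["i7-4790K + 980"]
print(f"A10-7700K vs i7-4790K on a 980: {same_gpu_ratio:.2f}x the run time")
```

On these figures the two 980 hosts differ by roughly a third in run time despite having the same card.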

...the new batch of Gianni is going to take around 30 hours on my 980 TI Classified. That means one of my 980 standards would crunch it in about or less than 47 hours.

I was basing that statement on the times of other work units which do stretch out at that proportion. The end result seems drastically different because of the factors I just described.

...15h 6m 47s (54.407s) 980Ti/XP
...14h 58m 29s (53.909s) 980Ti/XP
...22h 18m 2s (80.282s) 980/XP

Maybe all things being equal the card does have more to do with it as well. Though I am noticing that the longest time was on an i7 CPU 870 @ 2.93GHz, the shortest is on an i7-4930K CPU @ 3.40GHz, and the one similar in length to the short one is on an i3-4160 CPU @ 3.60GHz. Is there a difference in settings, usage of other processes, or whatever else that is different between the i7-4930K and the i3-4160 that would make the 3.6Ghz slightly slower than the 3.4Ghz one both on 980TIs (like pcie speed on the mobo, etc)?

I have the feeling of that the length of these workunits is set to the performance level of the GTX 1080.

I am not sure if the GTX10 has much to do with planning the length of time to complete. It may be just the case, but I would think if they were planning length to completion they would keep them at or slower than the current GERARD_FXCXCL12RX series. Many of the cards still in common use (the GTX 7 series and above) should be able to do a task in 24 hours or less in my opinion. Having the 9 or 10 series should make the long units as they are stated, "under 8 hours" and the slower cards up to 24 hours with the laptop GPUs and slower cards being able to do them in the time allotted to time out. I have seen many comments here and on my team forum stating they stopped crunching GPUGRID altogether because they just could not finish tasks in time, either they have the older cards or they can't keep the PC on 24 hours a day.

I do understand that the longer they can be "out on the field" the less bandwidth is needed constantly on the servers and college, so I am not ruling out the need for it if that is the case. I just think there has to be a better balance of practicalities between the needs of the project infrastructure and the needs of the project volunteers to complete the work. This would/might be a bigger concern if the tasks were needed quickly, as sometimes they are, for a deadline or if there were so many tasks that the users could not grab and crunch them fast enough for the amount of work to be done. In the current state (and I am talking about at least since this time last year or earlier), the WUs available are zero most of the day most of the time and when they add 200 or 500 they are gone in about an hour or so if that. And I know that is dependent on the amount of students/staff that need work done and the need of the papers and science those students are doing related to what can/has to be done via distribution.

That is the downside of working out of a school for student needs, though, and not out of a science lab for scientific research like other projects. The upsides far outweigh the downsides, as the work helps students get their degrees, papers published, and theses and the actual work concluded by the findings. Back in the United Devices days we did HMMR, Markov modelling, and then moved to straight cancer/protein binding and even finished the Anthrax cure in 30 days, and people were mad that UD was a for-profit company. Even though the work done through the volunteers was donated to those who could actually use the research for the science, the company itself was using the distributed projects to complete and test their own distributed platform, which was then sold to corporate customers looking to complete large tasks across their in-house networks. It was unfortunate that when they sold the company the projects ended without finishing; the work done for over a year was very valuable to Oxford labs and the National Foundation for Cancer Research here in the States.

I am sure that the work completed there (as well as the work since then at F@H and other BOINC projects) led to finding the markers through which my eventual cancers would be acted upon by my current chemotherapy drug. When my doctor told me that it might have an effect on my cancer he told me, without me asking, that computer modeling was what found the reaction and not trials on actual people, and that this particular drug was not used for my cancer until that was found in the distributed projects. And lo and behold, it did reduce the tumor and its activity. So as I said, the science far outweighs the methods and needs for those methods.

Obviously too late for any bonuses

It looks like these are getting bonuses anyway: their credit is high enough to not need the usual bonus, as an extra bonus seems to be built into the task based on its run time. Even the 35 hour ones award 439,250 credit, which is a bonus relative to the credit that would be awarded to any other WU in current production for that time. It does seem, though, that the ones done in less than 24 hours are getting 527,100, so... normal bonus on top of extra bonus? According to the Performance page, only the top 10 (of those that allow their tasks to be seen publicly; I should be listed in the 18 spot for that 28 hour one and am not for some reason) have been under 24 hours.

Michael
Joined: 29 Apr 16
Posts: 5
Credit: 79,699,134
RAC: 0
Level
Thr
Message 44178 - Posted: 15 Aug 2016 | 19:25:38 UTC

Just finished a Gianni that took almost two full days on a 960.

http://www.gpugrid.net/result.php?resultid=15235099

Took a long time, but it worked and didn't fail.

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44180 - Posted: 15 Aug 2016 | 22:35:03 UTC - in response to Message 44178.

Just finished a Gianni that took almost two full days on a 960.

http://www.gpugrid.net/result.php?resultid=15235099

Took a long time, but it worked and didn't fail.

It looks like 38 hours. Good job!

When I added that one had errored and there was a high error rate I was not taking into account the error rate being higher when they first release because that is all they have is the fast errors and not the ones that actually can complete yet. Zoltan pointed that out to me above. But as also mentioned, they are fragile, so any power glitch or anything has the potential to cause an error. I have errored out 2 so far, but that isn't even the majority of my errors recently. But when they do error, they cause the system to fail and need a reboot and also affect others running if they are on the same card or system. So being more fragile, I have clocked all the cards on my most problematic system down to zero overclocking above the factory boost and am hoping that helps. I had it that way for 2 days and turned it back up today and went 2 days without error. lol That will slow them down a bit, but they are already going to be over 24 hours, so what is an extra hour to 28-30 anyway?

Either way, I don't think the "error rate" on these is an issue UNLESS you have one. At that point, one is too much. The time is the issue and why I put out the warning.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Level
Trp
Message 44181 - Posted: 15 Aug 2016 | 23:51:19 UTC - in response to Message 44177.

...15h 6m 47s (54.407s) 980Ti/XP
...14h 58m 29s (53.909s) 980Ti/XP
...22h 18m 2s (80.282s) 980/XP

Maybe all things being equal the card does have more to do with it as well. Though I am noticing that the longest time was on an i7 CPU 870 @ 2.93GHz,
Yes, but this host has a GTX 980, while the others have GTX 980 Ti's.
the shortest is on an i7-4930K CPU @ 3.40GHz, and the one similar in length to the short one is on an i3-4160 CPU @ 3.60GHz. Is there a difference in settings, usage of other processes, or whatever else that is different between the i7-4930K and the i3-4160 that would make the 3.6Ghz slightly slower than the 3.4Ghz one both on 980TIs (like pcie speed on the mobo, etc)?
The i7-4930K is running at 4.4GHz, and 5 CPU tasks are running simultaneously, while on the i3-4160 no CPU tasks are running. But this is not a clean comparison, as I've booted the i3-4160 into Windows 10 to update it to version 1607, and this task was running under Windows 10 for a short period. You can see it in the task's stderr output, as there are different driver versions present.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44182 - Posted: 15 Aug 2016 | 23:54:19 UTC - in response to Message 44180.
Last modified: 15 Aug 2016 | 23:57:28 UTC

Just finished a Gianni that took almost two full days on a 960.

http://www.gpugrid.net/result.php?resultid=15235099

Took a long time, but it worked and didn't fail.

It looks like 38 hours. Good job!

When I added that one had errored and there was a high error rate I was not taking into account the error rate being higher when they first release because that is all they have is the fast errors and not the ones that actually can complete yet. Zoltan pointed that out to me above. But as also mentioned, they are fragile, so any power glitch or anything has the potential to cause an error. I have errored out 2 so far, but that isn't even the majority of my errors recently. But when they do error, they cause the system to fail and need a reboot and also affect others running if they are on the same card or system. So being more fragile, I have clocked all the cards on my most problematic system down to zero overclocking above the factory boost and am hoping that helps. I had it that way for 2 days and turned it back up today and went 2 days without error. lol That will slow them down a bit, but they are already going to be over 24 hours, so what is an extra hour to 28-30 anyway?

Either way, I don't think the "error rate" on these is an issue UNLESS you have one. At that point, one is too much. The time is the issue and why I put out the warning.



The one thing I noticed is that your CPU time is a lot lower than the run time:

Run time 136,638.91
CPU time 20,414.34

Which indicates to me that you are not using SWAN_SYNC 1, which can reduce your run time.

Click on the link below; the instructions to set this up are at the bottom of the post:

http://www.gpugrid.net/forum_thread.php?id=4346&nowrap=true#44111
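
A quick way to spot this from a result page is the ratio of CPU time to run time. The Python sketch below is only illustrative: the function names are made up, the 0.9 threshold is an arbitrary assumption rather than a project rule, and a low ratio only hints that the app is not keeping a CPU core busy the whole time.

```python
def cpu_share(cpu_seconds: float, run_seconds: float) -> float:
    """Fraction of the run time during which the task also used CPU time."""
    return cpu_seconds / run_seconds


def looks_like_full_core(cpu_seconds: float, run_seconds: float,
                         threshold: float = 0.9) -> bool:
    """Heuristic only: the threshold is an assumed cut-off, not a documented value."""
    return cpu_share(cpu_seconds, run_seconds) >= threshold


# (CPU time, run time) in seconds, both taken from results quoted in this thread.
results = {
    "run/CPU times quoted above": (20_414.34, 136_638.91),
    "result 15235031": (63_348.98, 63_577.01),
}

for label, (cpu, run) in results.items():
    share = cpu_share(cpu, run)
    print(f"{label}: CPU/run = {share:.2f}, full core: {looks_like_full_core(cpu, run)}")
```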

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44183 - Posted: 16 Aug 2016 | 0:46:10 UTC - in response to Message 44177.

Obviously too late for any bonuses

Even the 35 hour ones award 439,250 credit which is a bonus in relation to the credit that would be awarded to any other WU in current production for that time.

But that is including the 25% bonus. The credit may be OK for fast cards but it's poor for everyone else. On top of that there's more than double the chance of a failure due to a power failure/BSOD and no completion at all.
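
The two credit figures seen in this thread fit a simple tiered bonus. The Python sketch below assumes a 50% bonus for results returned within 24 hours and 25% within 48 hours; only the 25% figure is stated here, and the base credit of 351,400 is inferred from 439,250 / 1.25 rather than quoted anywhere in the thread. The function name and the use of turnaround time are likewise assumptions.

```python
BASE_CREDIT = 351_400  # inferred as 439,250 / 1.25; not stated in the thread


def credit_with_bonus(base: float, turnaround_hours: float) -> float:
    """Assumed scheme: +50% if returned within 24 h, +25% within 48 h, else none."""
    if turnaround_hours < 24:
        return base * 1.50
    if turnaround_hours < 48:
        return base * 1.25
    return base


# Run times (seconds) and credits of the two 980 Ti results posted earlier.
for run_seconds, credit in ((63_577.01, 527_100), (104_399.86, 439_250)):
    run_hours = run_seconds / 3600
    print(f"{run_hours:.1f} h run: {credit:,} credit, "
          f"{credit / run_hours:,.0f} credit per hour of run time")

print(credit_with_bonus(BASE_CREDIT, 18.0))  # 527100.0
print(credit_with_bonus(BASE_CREDIT, 30.0))  # 439250.0
```

Under these assumptions the fast host earns roughly 30,000 credit per hour of run time and the slow one roughly 15,000, which is the gap being described here.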

Profile Logan Carr
Joined: 12 Aug 15
Posts: 240
Credit: 64,069,811
RAC: 0
Level
Thr
Message 44184 - Posted: 16 Aug 2016 | 0:54:56 UTC - in response to Message 44183.

Does anyone know how long it takes for projects with these kinds of problems to be fixed? (the Gianni project)

I've been casually lurking and see a lot of people having problems with the Gianni project.

I have to say I'm at 20 hours with Windows XP, 90% GPU usage, and I still have quite a bit to go on the Gianni... (only at 53% complete)

Thanks.

p.s. I'll still post the results once it's done.


____________
Cruncher/Learner in progress.

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 44185 - Posted: 16 Aug 2016 | 8:09:44 UTC - in response to Message 44184.
Last modified: 16 Aug 2016 | 8:12:29 UTC

Does anyone know how long it takes for projects with these kind of problems to be fixed? (the gianni project)

I've been casually lurking and see a lot of people having problems with the gianni project.

I have to say I'm at 20 hours with windows xp, 90% gpu usage, and I still have quite a bit to go on the gianni... (only at 53% complete)

Thanks.

p.s. I'll still post the results once it's done.



There is no problem, Logan, just some complaining about the length of time to complete, and failures probably due to excessive overclocking.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44186 - Posted: 16 Aug 2016 | 8:24:56 UTC - in response to Message 44185.

failures due to excessive over clocking probably.

The failures have nothing at all to do with overclocking. They're due to an app that can't recover from outages such as power failures.

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 44187 - Posted: 16 Aug 2016 | 8:32:57 UTC - in response to Message 44186.
Last modified: 16 Aug 2016 | 8:34:39 UTC

failures due to excessive over clocking probably.

The failures have nothing at all to do with overclocking. They're due to an app that can't recover from outages such as power failures.


There is no proof of that. However, even if it was the case, how many power outages do you have?

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44189 - Posted: 16 Aug 2016 | 8:54:15 UTC

Zoltan, that's why I asked. A slower CPU and GPU would make it significantly slower on tasks. Also OS changes may affect things too.

Bedrich, I have had issues with swan_sync on every system I have tried it on. It slows all the processes to a point that it makes the system unusable. Most of the systems I access remotely with TeamViewer, and I am not sure if remoting in is affected by the setting or if it is a program or setting I have on all the systems, but I have chosen not to use it. While I was using it for that short time a few tasks completed and did not show improvement; in fact they were slower across the board. I was not willing to experiment or investigate at that time and just gave up. My memory being what it is, I can only conclude that the problems were worse than the potential benefit, or I would have taken on the challenge. I like challenges when it comes to PCs usually.

Beyond, I am not sure what you are saying, but a 20% bonus would be on the less than 24 hour ones that award 527,100, not the ones over 24 that award 439,250. And if a Gerard or Adria took the same amount of time you would get around 200,000, so there is more awarded for these longer units.

Logan, I am not sure if this length issue is considered a problem. The error rate may actually be one though. I think it usually takes one of the forum volunteer moderators to contact someone on the inside to get an issue resolved, which is one reason why we have them to help us and the project and let the scientists and students keep their time on the work.
I ask the mods now, if you haven't already, please contact someone about the error issue with these and inquire about shortening the units as well for the sake of our cards and times, or take Beyond's idea of adding a new level of maybe "Very Long Tasks" for new tasks created for the series 10 NVIDIA cards. After I posted the comment about the error rate possibly not being an accurate length of time to tell if they are erroring out more or not I had 4 error out on me across 2 different systems all GIANNI totaling almost 45.75 hours of work before they errored out. Maybe I am noticing it more and brought undue attention to it too early or maybe there is something to it, but would like some feedback as well on if there is a potential issue, if so can it be corrected, and possibly has it already been corrected and we are just erroring out the old broken ones. Thanks.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44192 - Posted: 16 Aug 2016 | 15:30:27 UTC - in response to Message 44187.

failures due to excessive over clocking probably.

The failures have nothing at all to do with overclocking. They're due to an app that can't recover from outages such as power failures.

There is no proof of that. However, even if it was the case, how many power outages do you have?

Frequent but usually only for a few seconds. Long enough to wreak havoc with computers. You should be thankful that you live in an area that's more reliable. The proof is that there's about a 50% failure rate when this happens. Zoltan has posted about the problem too. If you won't believe anyone else, maybe you'll believe him. BTW, other than some factory OCs, none of my cards are OCed. In fact some are down-clocked.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44193 - Posted: 16 Aug 2016 | 15:44:28 UTC - in response to Message 44189.
Last modified: 16 Aug 2016 | 15:45:59 UTC

Beyond, I am not sure what you are saying, but a 20% bonus would be on the less than 24 hour ones that award 527,100, not the ones over 24 that award 439,250. And if a Gerard or Adria took the same amount of time you would get around 200,000, so there is more awarded for these longer units.

As I understand it there's a 50% bonus for completing a WU in under 24 hours (including UL/DL time) and a 25% bonus for under 48 hours. So for instance a 200 credit base rate unit would get 250 credits if completed in 47 hours and 300 credits in 23 hours. Someone please clue me in if I'm mistaken.
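To make the bonus arithmetic concrete, here is a minimal Python sketch of the rule as described in this post (+50% under 24 hours, +25% under 48 hours, turnaround including UL/DL time). The function name and the strict "under" cutoffs are assumptions for illustration only, not the project's actual server code.

```python
def credit_with_bonus(base_credit: float, turnaround_hours: float) -> float:
    """Apply the turnaround bonus as described in the thread (assumed rule)."""
    if turnaround_hours < 24:
        return base_credit * 1.5   # +50% for returning inside 24 hours
    if turnaround_hours < 48:
        return base_credit * 1.25  # +25% for returning inside 48 hours
    return base_credit             # no bonus after 48 hours


# The 200-credit example from the post, plus a late return for contrast.
for hours in (23, 47, 50):
    print(hours, credit_with_bonus(200, hours))
```

If the base credit for a GIANNI unit is 351,400 (an inference, not something stated in this post), the same rule gives 439,250 and 527,100, which are the two figures quoted above.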

I ask the mods now, if you haven't already, please contact someone about the error issue with these and inquire about shortening the units as well for the sake of our cards and times, or take Beyond's idea of adding a new level of maybe "Very Long Tasks" for new tasks created for the series 10 NVIDIA cards. After I posted the comment about the error rate possibly not being an accurate length of time to tell if they are erroring out more or not I had 4 error out on me across 2 different systems all GIANNI totaling almost 45.75 hours of work before they errored out.

Sorry to hear. It's no fun having large amounts of GPU time wasted. Hopefully the admins will improve the next app's fault tolerance, add a separate queue for super long WUs and also find a way to lower the WU error rate. The larger the WUs become, the more important it is to address these issues. Good for the project and good for their volunteers.

Michael
Joined: 29 Apr 16
Posts: 5
Credit: 79,699,134
RAC: 0
Level
Thr
Message 44195 - Posted: 16 Aug 2016 | 19:07:42 UTC - in response to Message 44182.

Will that influence other projects I'm running?
I got a 4 core i5, with 3 cores (that is 75%) running CPU WCG tasks and the last one remaining for GPUGRID and POEM@Home (when no GPUGRID are available).

Profile Logan Carr
Joined: 12 Aug 15
Posts: 240
Credit: 64,069,811
RAC: 0
Level
Thr
Message 44196 - Posted: 16 Aug 2016 | 19:16:59 UTC - in response to Message 44195.
Last modified: 16 Aug 2016 | 19:17:19 UTC

Alright all, thanks for clearing some things up for me.

Here's my results:

http://www.gpugrid.net/result.php?resultid=15236421


Took about 1 day and 14 hours, but hey, I got a decent amount of credit for how long it took.

Hope the result helps someone

Cheers,

LC
____________
Cruncher/Learner in progress.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44197 - Posted: 16 Aug 2016 | 19:50:14 UTC - in response to Message 44196.

Took about 1 day and 14 hours, but hey, I got a decent amount of credit for how long it took

Not so much. Here's your last GERARD_FXCXCL12RX:
Time: 54,279.73 - 53,837.44 - Credits: 267,900.00

Here's the GIANNI_D3C36bCHL:
Time: 137,043.38 - 136,562.60 - Credits: 351,400.00

2.5x the time, 1.3x the credits. Add to that: 2.5x the chance for failure due to many unforeseen factors.
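For anyone who wants to redo this comparison with their own results, a small Python sketch follows. It uses only the run times and credits quoted above; the dictionary layout and the helper name are hypothetical, chosen just for this example.

```python
# Run time (seconds) and credits copied from the two results quoted above.
gerard = {"run_time_s": 54279.73, "credits": 267900.00}
gianni = {"run_time_s": 137043.38, "credits": 351400.00}


def credits_per_hour(task: dict) -> float:
    """Credits earned per hour of run time for one task."""
    return task["credits"] / (task["run_time_s"] / 3600)


time_ratio = gianni["run_time_s"] / gerard["run_time_s"]
credit_ratio = gianni["credits"] / gerard["credits"]

print(f"time ratio:   {time_ratio:.1f}x")
print(f"credit ratio: {credit_ratio:.1f}x")
print(f"GERARD credits/hour: {credits_per_hour(gerard):,.0f}")
print(f"GIANNI credits/hour: {credits_per_hour(gianni):,.0f}")
```

The credits-per-hour figures show the same gap from a different angle: the GIANNI earns noticeably less per hour of GPU time in this particular pair of results.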

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44202 - Posted: 16 Aug 2016 | 23:17:01 UTC - in response to Message 44195.

Will that influence other projects I'm running?
I got a 4 core i5, with 3 cores (that is 75%) running CPU WCG tasks and the last one remaining for GPUGRID and POEM@Home (when no GPUGRID are available).



You would be better off having 2 cores crunching your CPU project, one core supporting your GPU and one core free to run the operating system.



Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 44204 - Posted: 17 Aug 2016 | 1:25:58 UTC - in response to Message 44192.

failures due to excessive over clocking probably.

The failures have nothing at all to do with overclocking. They're due to an app that can't recover from outages such as power failures.

There is no proof of that. However, even if it was the case, how many power outages do you have?

Frequent but usually only for a few seconds. Long enough to wreak havoc with computers. You should be thankful that you live in an area that's more reliable. The proof is that there's about a 50% failure rate when this happens. Zoltan has posted about the problem too. If you won't believe anyone else, maybe you'll believe him. BTW, other than some factory OCs, none of my cards are OCed. In fact some are down-clocked.


I am sorry for your power outages, thought that the USA was beyond such things. In this part of the UK we count power outages in YEARS although there was a 2 day one last December due to flooding of a substation which is the longest power outage in my 64 year history, guess we're just lucky.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44206 - Posted: 17 Aug 2016 | 2:37:17 UTC - in response to Message 44204.
Last modified: 17 Aug 2016 | 2:40:13 UTC

I am sorry for your power outages, thought that the USA was beyond such things. In this part of the UK we count power outages in YEARS although there was a 2 day one last December due to flooding of a substation which is the longest power outage in my 64 year history, guess we're just lucky.

Thanks. Even though some (most likely mentally challenged) claim climate change to be a myth, we've been having crazy storms and frequent torrential downpours (another one just today). Goes great with the neighborhood underground power lines. Animal species previously unknown here have been steadily moving in from the south. Actually the most frequent reason for outages seems to be lightning strikes on the further out above ground lines. It's improved from a couple years ago when there used to be a few seconds outage almost every day at 7am. If you think the USA power grid is suspect, you should get a load of our abysmal internet service (except in big cities and where Google has graced the population). The horrible broadband speeds make doing GPUGrid even more challenging. Yeah, greedy monopolies are great... :-(

I'm crossing my fingers as my next door neighbor is having a new sewer system installed. Last time that happened a ways down the block the idiot contractors cut through the power and phone lines even though they were marked on the ground with bright neon orange paint. Took 3 days to get it fixed.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 44211 - Posted: 17 Aug 2016 | 11:49:25 UTC - in response to Message 44206.

Actually the most frequent reason for outages seems to be lightning strikes on the further out above ground lines. It's improved from a couple years ago when there used to be a few seconds outage almost every day at 7am.

I was forced to start using uninterruptible power supplies when I went with ramdisks and large write caches a few years ago. But the UPS also take care of the brief (less than a second) power glitches we get here in the spring and summer due to switching loads around and lightning strikes. Otherwise, the power is very reliable where I am, but that varies a lot in the U.S. And our power company is now implementing a smart grid for automatically routing around downed power lines, to help isolate the problem.

I once had an expert on buried telephone lines tell me that they are just as susceptible to lightning strikes as the overhead lines, since the lightning has no problem finding the best conductor anyplace. However, optical fiber cables have largely solved that problem for the Internet, and it is good where I am, but that varies a lot too. The U.S. is a big country; Europeans don't always realize how different it is from one section to another. (Americans don't always realize it either.)

Global Warming will force a lot of investment in infrastructure upgrades though, assuming the affected areas still want access and power, etc.

nanoprobe
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Message 44212 - Posted: 17 Aug 2016 | 12:53:13 UTC - in response to Message 44192.


Frequent but usually only for a few seconds. Long enough to wreak havoc with computers. You should be thankful that you live in an area that's more reliable. The proof is that there's about a 50% failure rate when this happens. Zoltan has posted about the problem too. If you won't believe anyone else, maybe you'll believe him. BTW, other than some factory OCs, none of my cards are OCed. In fact some are down-clocked.

A quality UPS would solve that issue if you could do it. We have momentary glitches and surges where I live also. I bit the bullet and put UPSs on all 8 of my DC machines 1 at a time. Even put 1 on my fridge after a surge took out a $600 control board but that's another story.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 44214 - Posted: 17 Aug 2016 | 13:23:37 UTC - in response to Message 44212.
Last modified: 17 Aug 2016 | 13:25:41 UTC

Even put 1 on my fridge after a surge took out a $600 control board but that's another story.

Surges are a problem too. After a lightning strike a few years ago, I put Zero Surge filters on all my equipment, even the ones with a UPS. The surge filter plugs into the wall first, then the UPS into that. I am loaded for bear.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44217 - Posted: 17 Aug 2016 | 18:01:32 UTC - in response to Message 44212.

A quality UPS would solve that issue if you could do it. We have momentary glitches and surges where I live also. I bit the bullet and put UPSs on all 8 of my DC machines 1 at a time. Even put 1 on my fridge after a surge took out a $600 control board but that's another story.

Thanks guys for the suggestions. I used to have a UPS on every machine but it was expensive to buy them and after a year or two got to be ridiculous trying to keep the batteries replaced (currently 12 machines). Now they have quality surge protectors, much less headache but also no protection against outages.

Think that I mentioned this before and it's just my personal experience, but I used to have an even mix of AMD and Intel boxes. All had APC sine wave UPS at the time. After a lightning strike on the house (lightning rod BTW), every Intel system either failed immediately or within the next month. All the AMD systems were still running years later. Go figure. Why, I don't know. Maybe better MB components, maybe something in the basic design. Maybe just dumb luck. Since then I've used mainly AMD and have never had a CPU or MB failure. Maybe other people's experiences are different...

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44218 - Posted: 17 Aug 2016 | 18:24:54 UTC

The Gianni finally finished on my 750Ti:

https://www.gpugrid.net/result.php?resultid=15236288

It took just under 62 hours, not a good fit. Interesting, it previously failed on someone's 980Ti that's usually ok for the Gerards:

https://www.gpugrid.net/workunit.php?wuid=11694702

For some reason GPUGrid thinks that my two 650Ti cards are the best candidates for the Gianni WUs. Just had to abort another one. They run the Gerard_FX WUs in an average of about 34 hours so always made the 2 day deadline but they'd be ridiculous on the Giannis.

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44229 - Posted: 18 Aug 2016 | 23:26:14 UTC - in response to Message 44182.

The one thing I noticed is your CPU time is a lot lower than the run time:
Run time 136,638.91
CPU time 20,414.34
Which indicates to me that you are not using the SWAN_SYNC 1, which can reduce your run time.

I figured out the problem when I tried this in the past. I was not limiting my tasks for GPU and CPU. I tried this again and it froze as soon as BOINC started each time I rebooted. So I opened in safe mode and changed the cc_config and app_config files to limit WCG on 2 systems and also GPUGRID on one system, based on the number of cores and task total. So the one system with 4 AMD cores is running 3 GPUGRID tasks and has bettered previous similar tasks by hours with a whole core (25%) CPU usage for each task. The other system with 12 Intel cores and 6 GPUGRID tasks possible I reduced the WCG tasks to 5 and GPUGRID tasks are at 6 still. That leaves one core free for tasks and OS and fills the rest with BOINC tasks. The 2 tasks that have completed are almost equal on GPU and CPU time, but only saved about 50 minutes for similar tasks with significantly more CPU time.
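The core budgeting described here can be written down as a tiny Python sketch. It assumes, as observed above, that each GPUGRID task with SWAN_SYNC on occupies one whole CPU core, and that one core is kept free for the OS. The function and its parameter names are hypothetical and are not BOINC settings.

```python
def cpu_tasks_allowed(total_cores: int, gpu_tasks: int,
                      cores_per_gpu_task: int = 1,
                      reserved_for_os: int = 1) -> int:
    """How many CPU-only tasks fit once GPU feeder cores and the OS core are set aside."""
    remaining = total_cores - gpu_tasks * cores_per_gpu_task - reserved_for_os
    return max(remaining, 0)  # never report a negative task count


# The 12-core Intel box with 6 GPUGRID tasks, and the 4-core AMD box with 3.
print(cpu_tasks_allowed(12, 6))
print(cpu_tasks_allowed(4, 3))
```

For the 12-core system this comes out at 5 CPU tasks, the same number of WCG tasks settled on above. The result would then be entered by hand as the task limit in the relevant config files.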

I am happy to let this run as such, though it still is slower on other tasks running even though the CPU usage is not 100% now. Would it help a bit to change the swan_sync setting to an incremental like .8 or .7 instead of 1? Or would just reducing the WCG tasks to 4 be my option?

It is odd that the swan_sync setting has not had anyone run into this same thing, but there should be another tutorial added somewhere for this setting. I searched online and found bits and pieces, but nothing complete or that answered this for me. Experimenting, time, and some logic were what got me here.

Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,960,232,676
RAC: 31,919,313
Level
Tyr
Message 44232 - Posted: 19 Aug 2016 | 8:57:51 UTC

There have been several WUs by Gianni in the recent past. They are really huge and result in a nice credit, but no GPU below a 980Ti can crunch them within 24 hours and get the 20% extra credit.
One of my hosts has a GTX970, which took some 36 hours at ~1360MHz.
Hence, a change in this 24hrs rule would be desirable :-)

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 44235 - Posted: 19 Aug 2016 | 12:09:24 UTC - in response to Message 44232.
Last modified: 19 Aug 2016 | 12:12:01 UTC

There have been several WUs by Gianni in the recent past. They are really huge and result in a nice credit, but no GPU below a 980Ti can crunch them within 24 hours and get the 20% extra credit.
One of my hosts has a GTX970, which took some 36 hours at ~1360MHz.
Hence, a change in this 24hrs rule would be desirable :-)


My Linux system is crunching a Gianni now and it's at 50% after 15h 45min.
My GPU clock (according to NV X Server) is 1278MHz and I'm using the 361.42 NV driver. X Server also says I'm using 4% PCIE bandwidth on a PCIE2.0 x16 slot. GPU utilization is around 67% but varies from 62% to 70%. My CPU is an AMD A6-3500 APU (2.1/2.4GHz).

So it looks like a GTX970 (at stock on a weak system) will take 31 to 32h to crunch these on Linux-x64. That suggests the WDDM overhead for these is at least 12.5% but probably closer to 16%.
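The estimate above can be reproduced with a short Python sketch. It assumes progress is linear in time, and it takes the roughly 36 hours reported for the Windows GTX970 earlier in the thread as the comparison point. That host has a different CPU and GPU clock, so the overhead figure is only a rough indication. Function names are made up for this example.

```python
def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation of total run time from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def relative_overhead(slower_hours: float, faster_hours: float) -> float:
    """Extra time of the slower run as a fraction of the faster one."""
    return slower_hours / faster_hours - 1


# 50% done after 15h 45min on Linux.
linux_estimate = projected_total_hours(15.75, 0.50)
print(f"projected Linux run time: {linux_estimate:.1f} h")

# Windows GTX970 at ~36 h versus the 31 to 32 h Linux range.
for linux_hours in (32, 31):
    print(f"vs {linux_hours} h: {relative_overhead(36, linux_hours):.1%} overhead")
```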

A GTX980 is ~17% faster (stock) than a GTX970 so would still take over 24h to complete on Linux (over 26h). If it was overclocked by ~10% then it might be able to just about complete inside 24h if the system was tuned to do so (SWAN_SYNC used, high CPU clock and fast RAM...).

Note that the bonus is +25% for finishing (and reporting) inside 48h or +50% for finishing and reporting inside 24h.

____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,960,232,676
RAC: 31,919,313
Level
Tyr
Message 44238 - Posted: 19 Aug 2016 | 16:08:28 UTC - in response to Message 44235.

So it looks like a GTX970 (at stock on a weak system) will take 31 to 32h to crunch these on Linux-x64. That suggests the WDDM overhead for these is at least 12.5% but probably closer to 16%.

The CPU on that host is an "old" Intel Core 2 Duo E8400 - which may account for at least part of the longer crunching time. And, of course, WDDM OH as well (Win10 64-bit).

Note that the bonus is +25% for finishing (and reporting) inside 48h or +50% for finishing and reporting inside 24h.

Oh sorry, I missed that :-(

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Level
Trp
Message 44245 - Posted: 20 Aug 2016 | 13:42:31 UTC - in response to Message 44235.

So it looks like a GTX970 (at stock on a weak system) will take 31 to 32h to crunch these on Linux-x64. That suggests the WDDM overhead for these is at least 12.5% but probably closer to 16%.

A GTX980 is ~17% faster (stock) than a GTX970 so would still take over 24h to complete on Linux (over 26h). If it was overclocked by ~10% then it might be able to just about complete inside 24h if the system was tuned to do so (SWAN_SYNC used, high CPU clock and fast RAM...).

GTX 980 @ 1388MHz, GDDR5 @ 3505 MHz, i3-4160, WinXP, SWAN_SYNC on, no other tasks: 19h 24m 26s
It almost missed the 24h bonus (by ~8m), as it spent 5h 28m in the queue.

Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44247 - Posted: 21 Aug 2016 | 8:53:36 UTC

I got one on my laptop. Windows 8.1 64bit, i7-4900MQ, 32GB RAM, NVIDIA Quadro K2100M @802Mhz mem @ 2504, swan_sync off. At 13.5% now and started as soon as it downloaded, it looks like 15:45 has passed and it might make the 5 day deadline by just squeezing through! We shall see, but it looks good at this point. I've never had a WU fail on this laptop except for downloading errors or crashes related to other programs or my own dumb experimentation with things like swan_sync (lol)

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 44249 - Posted: 21 Aug 2016 | 10:56:19 UTC - in response to Message 44245.

So it looks like a GTX970 (at stock on a weak system) will take 31 to 32h to crunch these on Linux-x64. That suggests the WDDM overhead for these is at least 12.5% but probably closer to 16%.

A GTX980 is ~17% faster (stock) than a GTX970 so would still take over 24h to complete on Linux (over 26h). If it was overclocked by ~10% then it might be able to just about complete inside 24h if the system was tuned to do so (SWAN_SYNC used, high CPU clock and fast RAM...).

GTX 980 @ 1388MHz, GDDR5 @ 3505 MHz, i3-4160, WinXP, SWAN_SYNC on, no other tasks: 19h 24m 26s
It almost missed the 24h bonus (by ~8m), as it spent 5h 28m in the queue.


GIANNI_D3C36bCHL from Performance

1 Retvari Zoltan 15236101 14.49 NVIDIA GeForce GTX 980 Ti (4095MB) driver: 368.22

14h 30min isn't much over the app description: Long runs (8-12 hours on fastest card) and there is a good chance the GTX1080 (when the CUDA 8 dev kit goes on public release) will manage it within that 12h time frame (on Linux).

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 44250 - Posted: 21 Aug 2016 | 11:57:08 UTC - in response to Message 44249.

There is also a good chance that after this summer shake-down cruise, the real work units in the fall won't be so long. I am hoping, and expecting, that a GTX 970 under Linux can handle them, though maybe not a 960. Otherwise, there will be some discontented people around here.

Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,960,232,676
RAC: 31,919,313
Level
Tyr
Message 44251 - Posted: 21 Aug 2016 | 15:21:17 UTC

Well, what we also hope - I guess - is that there will be enough WUs available anytime around the clock. For the past several months, the situation was far away from that.

Rion Family
Joined: 13 Jan 14
Posts: 21
Credit: 15,415,926,517
RAC: 781
Level
Trp
Message 44252 - Posted: 21 Aug 2016 | 19:04:14 UTC

These are some observations with systems I have working the GIANNI work units - from GTX 770, 780 and 970

Task name | Work unit | Computer | ComputerName | Specs | RunTime=h:m:ss | CPUTime=h:m:ss | ElapsedTime | Credit/Sec | BatchName
e4s27_e1s26p0f453-GIANNI_D3C36bCHL1-0-1-RND7578_0 | 11694844 | 319927 | sr71-w10 | W10, SWAN=1, GTX 970 | 30:07:55 | 28:38:40 | 30:37:40 | 4.049311534 | GIANNI_D3C36bCHL1
e4s2_e1s26p0f434-GIANNI_D3C36bCHL1-0-1-RND0457_1 | 11694819 | 319927 | sr71-w10 | W10, SWAN=1, GTX 970 | 30:50:54 | 29:15:05 | 33:33:38 | 3.955295125 | GIANNI_D3C36bCHL1
e5s26_e2s33p0f456-GIANNI_D3C36bCHL1-0-1-RND5827_0 | 11695686 | 319927 | sr71-w10 | W10, SWAN=1, GTX 970 | 30:14:00 | 28:41:17 | 33:43:58 | 4.035734975 | GIANNI_D3C36bCHL1
e8s171_e2s15p0f614-GIANNI_D3C36bCHL1-0-1-RND4754_1 | 11697165 | 289414 | GridBench-w10 | W10, SWAN=1, GTX 980 | 25:25:38 | 25:11:28 | 25:47:06 | 4.798538404 | GIANNI_D3C36bCHL1
e8s37_e3s57p0f691-GIANNI_D3C36bCHL1-0-1-RND4952_0 | 11697031 | 289414 | GridBench-w10 | W10, SWAN=1, GTX 980 | 25:19:42 | 25:05:20 | 34:52:23 | 4.817271594 | GIANNI_D3C36bCHL1
e9s11_e3s104p0f660-GIANNI_D3C36bCHL1-0-1-RND4267_0 | 11697896 | 176528 | stealth-mint | Linux, SWAN=0, GTX 770 | 30:39:17 | 3:52:32 | 30:51:13 | 3.980252871 | GIANNI_D3C36bCHL1
e8s142_e3s69p0f433-GIANNI_D3C36bCHL1-0-1-RND5710_0 | 11697136 | 187252 | rahl588-v81 | W10, SWAN=0, GTX 770 | 35:09:10 | 5:54:14 | 52:49:35 | 2.776770928 | GIANNI_D3C36bCHL1
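The Credit/Sec column appears to be the awarded credit divided by the run time in seconds. The credit itself is not listed in the table, so the sketch below assumes the 439,250 and 351,400 figures quoted elsewhere in the thread. Helper names are hypothetical, and the small differences in the last digits presumably come from fractional seconds that the h:m:ss column drops.

```python
def hms_to_seconds(text: str) -> int:
    """Convert an 'h:m:ss' string such as '30:07:55' to whole seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def credit_per_second(credit: float, run_time: str) -> float:
    """Credit earned per second of run time."""
    return credit / hms_to_seconds(run_time)


# First row (GTX 970, assumed 439,250 credits) and last row
# (GTX 770 with elapsed time over 48 h, assumed 351,400 credits).
print(round(credit_per_second(439250, "30:07:55"), 4))
print(round(credit_per_second(351400, "35:09:10"), 4))
```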

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44254 - Posted: 21 Aug 2016 | 20:39:51 UTC - in response to Message 44251.

Well, what we also hope - I guess - is that there will be enough WUs available anytime around the clock.



That would be very, very nice!

May it happen soon.



Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,960,232,676
RAC: 31,919,313
Level
Tyr
Message 44257 - Posted: 22 Aug 2016 | 17:58:06 UTC

very annoying:
after some 20 hrs on my GTX750Ti client, a Gianni WU broke off, indicating a "computation error".
I was kind of suspicious anyway when I saw that this host had caught a Gianni WU. Since at that time it had run for several hours already, I decided not to stop it.
However, next time I will definitely do so. I guess that the Gianni tasks are no good for GPUs below a GTX970 (or maybe 960).
Somehow, these WUs should be programmed for NOT being downloaded on a GTX750Ti or below.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44272 - Posted: 24 Aug 2016 | 0:28:12 UTC - in response to Message 44171.

I just finished one of these units on my windows 10 computer:

e2s7_e1s51p0f618-GIANNI_D3C36bCHL1-0-1-RND2166_0 11693869 13 Aug 2016 | 19:38:10 UTC 14 Aug 2016 | 13:35:27 UTC Completed and validated 63,577.01 63,348.98 527,100.00 Long runs (8-12 hours on fastest card) v8.48 (cuda65)


http://www.gpugrid.net/result.php?resultid=15235031


These units seem to be very CPU dependent. The GPU and power usage are slightly lower than the GERARD_FXCXCL12RX units.




Here is an example of this unit type running on a computer with an older and slower CPU and motherboard:

e2s4_e1s51p0f710-GIANNI_D3C36bCHL1-0-1-RND6774_0 11693866 13 Aug 2016 | 19:40:23 UTC 15 Aug 2016 | 1:36:12 UTC Completed and validated 104,399.86 100,946.40 439,250.00 Long runs (8-12 hours on fastest card) v8.48 (cuda65)

http://www.gpugrid.net/result.php?resultid=15235028





Sunday I had several of these WUs download on my computers.

On my xp computer, I ran these two WUs simultaneously (1 CPU + .5 GPU):

e17s52_e1s50p0f278-GIANNI_D3C36bCHL1-0-1-RND0542_0 11700247 22 Aug 2016 | 2:50:27 UTC 23 Aug 2016 | 17:31:11 UTC Completed and validated 137,031.21 129,248.40 439,250.00 Long runs (8-12 hours on fastest card) v8.48 (cuda65)

http://www.gpugrid.net/result.php?resultid=15245061

e17s51_e4s53p0f693-GIANNI_D3C36bCHL1-0-1-RND8402_0 11700246 22 Aug 2016 | 2:50:27 UTC 23 Aug 2016 | 16:51:07 UTC Completed and validated 134,710.64 128,933.70 439,250.00 Long runs (8-12 hours on fastest card) v8.48 (cuda65)

http://www.gpugrid.net/result.php?resultid=15245060


The average run time per WU is (137,031.21 + 134,710.64)/2 = 135,870.92 seconds, and 135,870.92/3600 = 37.74 hours. With 2 WUs running at once, that is 37.74/2 = 18.87 hours per WU.

Compare that with (1 CPU + 1 GPU): 104,399.86/3600 = 29.00 hours. (See above)

Which translates into 1-(18.87/29.00) =.35 or approximately a 35% improvement in productivity.

The GPU usage is 98% max and power usage is 81% for running 1 CPU + .5 GPU mode, while 1 CPU+ 1 GPU mode yields GPU usage of 70% max and power usage of 67%.

For my windows 10 computer, when I ran (1 CPU + .5 GPU for a few hours) the progress rate (from the boinc manager, task tab, properties button) was 3.6% per hour, which is 100/3.6 = 27.78 hours / 2 WU = 13.89 hours per WU computing time.

When running (1 CPU + 1 GPU) the computing time per WU is 63,577.01/3600= 17.66 hours. (See above)

Which translates into 1-(13.89/17.66) = .21 or approximately a 21% improvement in productivity.

I guess that's one way to beat WDDM lag!

The GPU usage is 92% max and power usage is 80% for running 1 CPU + .5 GPU mode, while 1 CPU+ 1 GPU mode yields GPU usage of 80% max and power usage of 72%.

Those are my results. I hope you understand my logic.
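The arithmetic above can be checked with a short Python sketch using the run times quoted in this post. The function names are hypothetical. One thing to keep in mind: the 35% and 21% figures are reductions in hours per WU, and the same data expressed as extra WUs per day (throughput gain) gives a larger number.

```python
def hours_per_wu_concurrent(run_times_s: list[float], concurrent: int = 2) -> float:
    """Effective hours per WU when `concurrent` WUs share one GPU."""
    mean_seconds = sum(run_times_s) / len(run_times_s)
    return mean_seconds / 3600 / concurrent


def time_reduction(new_hours: float, old_hours: float) -> float:
    """Fraction of time saved per WU (the figure used in the post)."""
    return 1 - new_hours / old_hours


def throughput_gain(new_hours: float, old_hours: float) -> float:
    """Fractional increase in WUs completed per unit of time."""
    return old_hours / new_hours - 1


# XP computer: two WUs run together versus one WU run alone.
xp_shared = hours_per_wu_concurrent([137031.21, 134710.64])
xp_single = 104399.86 / 3600

# Windows 10 computer: 3.6% per hour per task with two tasks running.
w10_shared = 100 / 3.6 / 2
w10_single = 63577.01 / 3600

for name, shared, single in (("XP", xp_shared, xp_single),
                             ("W10", w10_shared, w10_single)):
    print(f"{name}: {shared:.2f} h vs {single:.2f} h per WU, "
          f"time saved {time_reduction(shared, single):.0%}, "
          f"throughput gain {throughput_gain(shared, single):.0%}")
```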


Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 44288 - Posted: 24 Aug 2016 | 15:04:32 UTC - in response to Message 44272.
Last modified: 24 Aug 2016 | 15:05:07 UTC

It's ironic that running the longest tasks simultaneously would be the most beneficial in terms of throughput (for some).

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44295 - Posted: 25 Aug 2016 | 3:42:49 UTC - in response to Message 44288.

It's ironic that running the longest tasks simultaneously would be the most beneficial in terms of throughput (for some).



Being long was a coincidence. These tasks have a relatively high CPU dependence, which yields a relatively low GPU usage. With WDDM lag on the Windows 10 computer and a relatively old and slow CPU on the XP computer, the CPU side is the bottleneck. By running 1 CPU feeding .5 GPU, you are doubling up the CPU capacity, and so productivity increases. It’s all simple mathematics.


I remember a few years ago, we were doing beta testing on multi core CPU tasks. So, if the trend continues, with high CPU dependent tasks, then having 2 or more CPUs feed 1 GPU, would be the logical step to mitigate this bottleneck.


I think this was mentioned in 1 of the threads before, and someone said it might be impossible. I don’t think it’s impossible, maybe difficult, but not impossible.


Profile caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Message 44296 - Posted: 25 Aug 2016 | 9:27:29 UTC - in response to Message 44247.

I got one on my laptop. Windows 8.1 64bit, i7-4900MQ, 32GB RAM, NVIDIA Quadro K2100M @802Mhz mem @ 2504, swan_sync off. At 13.5% now and started as soon as it downloaded, it looks like 15:45 has passed and it might make the 5 day deadline by just squeezing through! We shall see, but it looks good at this point. I've never had a WU fail on this laptop except for downloading errors or crashes related to other programs or my own dumb experimentation with things like swan_sync (lol)

OK, so it finished with a few hours to spare on the 5 day deadline! By the time it said it had 1 day left, it had already run 3 days and 8 hours, but the remaining-time estimate was counting down faster than real time. Here is the result:
http://www.gpugrid.net/result.php?resultid=15244431
Total time was 4 days 14.5 hours. I cut off all WCG, my antivirus, and most regular activity, and turned on swan_sync to push the finish, because I thought it would come a lot closer to missing the deadline.
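
For anyone wanting to do the same sanity check on their own card, here is a rough Python sketch of the extrapolation. It assumes progress is linear in time, that "15:45" above means 15 hours 45 minutes elapsed, and that the deadline is exactly 5 days from the start of the run; the function name is just for illustration.

```python
def projected_total_hours(elapsed_hours, fraction_done):
    """Linear extrapolation of total run time from progress so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


DEADLINE_HOURS = 5 * 24          # assumed: deadline counted from task start

elapsed = 15 + 45 / 60           # 15:45 elapsed, as reported at 13.5%
projected = projected_total_hours(elapsed, 0.135)
margin = DEADLINE_HOURS - projected

print(f"projected total: {projected:.1f} h, margin: {margin:.1f} h")
```

The projection comes out a few hours under the deadline, and the actual run (4 days 14.5 hours, about 110.5 hours) beat it, presumably because the other load on the machine was shut off partway through. A straight-line estimate like this is only a guide.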

Now to reboot to turn everything back on, but I am glad I could prove myself wrong on this fear of GIANNI. I do, however, see a trend that will overwhelm the weaker, older GPUs that are still very abundant throughout the community of crunchers. I don't like the trend. If we could get people to set their systems to not accept short tasks on powerhouse GPUs, and then get more short-run units, we could have a balance of long runs on strong GPUs and short runs on the others like this laptop and weaker.
____________
1 Corinthians 9:16 "For though I preach the gospel, I have nothing to glory of: for necessity is laid upon me; yea, woe is unto me, if I preach not the gospel!"
Ephesians 6:18-20, please ;-)
http://tbc-pa.org

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,944,948,466
RAC: 15,239,917
Level
Trp
Message 44314 - Posted: 27 Aug 2016 | 13:05:46 UTC

I would agree that these tasks are more fragile than most of the other tasks.

So far I had 2 fail on my computers:

Task: e24s139_e8s176p0f481-GIANNI_D3C36bCHL1-0-1-RND3252_0
Work unit: 11708469
Sent: 26 Aug 2016 | 21:50:04 UTC
Reported: 27 Aug 2016 | 10:56:02 UTC
Status: Error while computing
Run time (sec): 45,236.39
CPU time (sec): 45,090.61
Credit: ---
Application: Long runs (8-12 hours on fastest card) v8.48 (cuda65)

http://www.gpugrid.net/result.php?resultid=15256435

Task: e8s162_e3s57p0f378-GIANNI_D3C36bCHL1-0-1-RND5354_0
Work unit: 11697156
Sent: 17 Aug 2016 | 9:52:12 UTC
Reported: 17 Aug 2016 | 11:40:21 UTC
Status: Error while computing
Run time (sec): 2,516.73
CPU time (sec): 2,505.53
Credit: ---
Application: Long runs (8-12 hours on fastest card) v8.48 (cuda65)

http://www.gpugrid.net/result.php?resultid=15239786

In both cases it was this error:

ERROR: file force.cpp line 513: TCL evaluation of [calcforces]
07:42:49 (5856): called boinc_finish

I was running both failed tasks in 1 CPU and 1 GPU mode, at the same speeds as the other tasks. Nothing was different.


Though, I do have, so far, 15 completed and valid, and 2 more still crunching.


Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 44315 - Posted: 27 Aug 2016 | 13:45:35 UTC

I have gotten only one GIANNI_D3C on a GTX 960 (Ubuntu 16.04), and it ran for 39 hours. Considering that is within the 48 hour bonus limit, that is OK with me once in a while. It used 16% of an i7-4790 core, with the other seven cores on other BOINC projects.

The GERARD_FXCXCL12RX tasks average about 16 hours on this card, and use about 9% CPU. But this card (an MSI) is only minimally factory overclocked, which helps stability.

[AF>P4G] anthony
Joined: 14 Mar 10
Posts: 14
Credit: 501,938,373
RAC: 0
Level
Lys
Message 44405 - Posted: 2 Sep 2016 | 19:35:39 UTC

Hello,

I have one of these GIANNI WUs on my GTX 850M, and it will take 72 hours (currently I am at 70% after 50 hours). Is there a setting to avoid these WUs?

Thanks.



Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 44408 - Posted: 2 Sep 2016 | 20:33:44 UTC - in response to Message 44407.

Hello,

I have one of these GIANNI WUs on my GTX 850M, and it will take 72 hours (currently I am at 70% after 50 hours). Is there a setting to avoid these WUs?

Thanks.

Unfortunately, not so far.

[AF>P4G] anthony
Joined: 14 Mar 10
Posts: 14
Credit: 501,938,373
RAC: 0
Level
Lys
Message 44411 - Posted: 3 Sep 2016 | 8:09:21 UTC
Last modified: 3 Sep 2016 | 8:16:47 UTC

Thanks.

Sorry, I will go to PrimeGrid; these are too long for my graphics cards (I also have a desktop with a GTX 580).

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 44412 - Posted: 3 Sep 2016 | 9:41:32 UTC

I just had a GIANNI_D3C36bCHL1 fail after 34 1/2 hours on a GTX 960 (Ubuntu 16.04). I have never seen this error before.
The card is only minimally factory overclocked, and runs at 70C, so I don't think it is a hardware problem.

<core_client_version>7.6.31</core_client_version>
<![CDATA[
<message>
process exited with code 158 (0x9e, -98)
</message>
<stderr_txt>
# SWAN Device 0 :
# Name : GeForce GTX 960
# ECC : Disabled
# Global mem : 2047MB
# Capability : 5.2
# PCI ID : 0000:02:00.0
# Device clock : 1240MHz
# Memory clock : 3505MHz
# Memory width : 128bit
ERROR: file tclutil.cpp line 32: get_Dvec() element 0 (b)
05:03:30 (1637): called boinc_finish
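
As a side note on reading that log: the three numbers in the "process exited with code 158 (0x9e, -98)" line are the same byte written three ways (decimal, hex, and as a signed 8-bit value). A quick Python check of the arithmetic follows; the helper name is hypothetical, and this says nothing about what the code means to the application itself.

```python
def exit_code_views(code):
    """Return an exit status as (unsigned decimal, hex string, signed 8-bit value)."""
    unsigned = code & 0xFF                      # exit statuses fit in one byte
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return unsigned, hex(unsigned), signed


print(exit_code_views(158))   # (158, '0x9e', -98)
```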
