Message boards : Graphics cards (GPUs) : GTX 750ti switching to default clock value (1058MHz) after a while

Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,884,857,676
RAC: 32,900,563
Message 46119 - Posted: 10 Jan 2017 | 16:26:21 UTC

Among others, I am using two GTX 750ti (in two different PCs) for GPUGRID crunching.

Since this morning, with one of them I notice that after a few hours of crunching, the GPU clock which was OC'ed to 1300 MHz (shown under "current clock" in the NVIDIA Inspector) falls back to the value 1058MHz which is shown as "default clock" (lower left hand corner of the Inspector window).
From that point on, there is no way to change this value in the Inspector, neither up nor down - it's frozen.

Since I remember having had such a "lock" situation on another GPU some time ago, which I could fix only by aborting the current task and starting a new one, I did the same thing today, but the GPU clock still stayed at 1058MHz (and could not be changed).

So, I tried another method: rebooting the PC, and this helped. The Inspector let me raise the GPU clock, this time I set it to 1250MHz (instead of 1300 as before).
However, after a few hours, same thing again: the GPU clock set itself down to the "default" value of 1058MHz and could no longer be changed, until I restarted the PC once more.

I now set the clock to 1200MHz and will see what happens.

I never had this problem before, even at room temperatures up to 8°C higher than right now.

FYI, the other GTX 750ti (in another PC) runs at 1360 MHz without any problem, it's crunching the same WU type (BNBS).

Does anyone have any idea what is wrong? The graphics card is about 1 year old, but it has been crunching 24/7 almost all of the time. Is the chip approaching the end of its lifetime?

Erich56
Message 46124 - Posted: 10 Jan 2017 | 18:30:39 UTC - in response to Message 46119.
Last modified: 10 Jan 2017 | 18:51:34 UTC

I now set the clock to 1200MHz and will see what happens.

After a BNBS task was withdrawn by the server and a new one was downloaded some time thereafter, the GPU clock again jumped down to 1058MHz :-(

So, I restarted the PC, and the GPU clock is now back to 1200MHz. We'll see for how long.

PappaLitto
Joined: 21 Mar 16
Posts: 511
Credit: 4,672,242,755
RAC: 0
Message 46129 - Posted: 10 Jan 2017 | 19:20:19 UTC

BNB WUs have been aborted, it is explained here.
http://gpugrid.net/forum_thread.php?id=4488#46125

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 46130 - Posted: 10 Jan 2017 | 20:18:04 UTC - in response to Message 46119.

Does anyone have any idea what is wrong? The graphics card is about 1 year old, but it has been crunching 24/7 almost all of the time. Is the chip approaching the end of its lifetime?

Of the six GTX 750 Ti's that I have bought (all ASUS, minimal factory overclock), two have now failed. I didn't see a problem with the clocks, but the first one bought (shortly after the card was introduced) produced some errors and then caused freezes and BSODs.

The next one that failed was the second one purchased a couple of months later, and started by producing errors; I pulled it out before it caused BSODs. That was at about the 1 1/2 year service point. So I am keeping a close watch on the others as they are now at, or beyond that time too.

Erich56
Message 46131 - Posted: 10 Jan 2017 | 20:38:27 UTC - in response to Message 46129.

BNB WUs have been aborted, it is explained here.
http://gpugrid.net/forum_thread.php?id=4488#46125

Yes, I know.
But my problem started earlier. Furthermore, the WUs did NOT break off; the problem was a different one, as described in my initial posting above.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Message 46138 - Posted: 10 Jan 2017 | 22:19:32 UTC - in response to Message 46119.

Since this morning, with one of them I notice that after a few hours of crunching, the GPU clock which was OC'ed to 1300 MHz (shown under "current clock" in the NVIDIA Inspector) falls back to the value 1058MHz which is shown as "default clock".
From that point on, there is no way to change this value in the Inspector, neither up nor down - it's frozen.
Does anyone have any idea what is wrong? The graphics card is about 1 year old, but it has been crunching 24/7 almost all of the time. Is the chip approaching the end of its lifetime?
I think you should remove the card, unplug the PCIe and ATX power connectors and check them for any burn marks.
If the card does not have a PCIe power connector then check the plugs of the yellow (12V) cables on the 24-pin ATX power connector.
Perhaps it's a good opportunity to clean the card's (and the CPU's) heatsink with compressed air.
If there are no burn marks, you can put the card back and have another try.

Erich56
Message 46148 - Posted: 11 Jan 2017 | 20:23:18 UTC

This morning, I followed Zoltan's advice and removed the card for cleaning with compressed air; however (to my surprise) there was almost no dust. As for checking the cables for burn marks: the card has no PCIe power cable (no external power supply), so I looked at the plugs of the ATX cable - no burn marks visible.

So I put the card back in and set the GPU clock to 1200MHz. At any value above that, the GPU would throttle - and GPU-Z in the section "PerfCap Reason" shows "Pwr" = "limited by total power limit". Also, the power consumption was shown at slightly above 90%, which previously was not the case until the GPU clock was set beyond 1320MHz.
After I had the GPU run at 1200MHz all day long, half an hour ago it again switched back to the default clock of 1058MHz.

I guess there are two possibilities: either the power management of the card is defective, or there is something wrong with the power system of the mainboard.
Since I renewed the PSU only some 8 months ago, this should normally not be the source of the problem (but: who knows).

I could now try to replace the GPU with a new one; space-wise, I could even do this with the "compact" versions of either a GTX960 or a GTX1060 (although I know that the drawback of these "compacts" is that they have only one fan). However, in both cases, electricity would come not only from the mainboard, but also from the 6-pin external power cable.
Should problems similar to the present ones come up, it would then at least be clear that it is not the current GTX750ti that is defective, but rather the mainboard.

Retvari Zoltan
Message 46150 - Posted: 11 Jan 2017 | 22:29:54 UTC - in response to Message 46148.

Is this card in your WinXPx64 host (Core2 Duo E7400)?
Perhaps you should try to find the highest frequency it can run continuously (by lowering the OC by 25MHz every time it falls back to its default clock).
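
The step-down procedure suggested here can be sketched as a simple loop. This is only an illustration, not a tool that controls the card: `runs_stable` is a hypothetical callback standing in for "set this clock in the Inspector, crunch for a day or two, and note whether the card fell back to its default clock", and the 1180 MHz limit in the demo is made up.

```python
from typing import Callable, Optional


def highest_stable_clock(
    start_mhz: int,
    default_mhz: int,
    runs_stable: Callable[[int], bool],
    step_mhz: int = 25,
) -> Optional[int]:
    """Lower the overclock by step_mhz until a setting survives a test run.

    runs_stable is a hypothetical callback: it should return True if the
    card held the given clock for the whole observation period.
    Returns None if no clock above default_mhz proved stable.
    """
    clock = start_mhz
    while clock > default_mhz:
        if runs_stable(clock):
            return clock
        clock -= step_mhz  # card fell back to default: try one step lower
    return None


# Demo with an invented stability limit of 1180 MHz (an assumption).
result = highest_stable_clock(1300, 1058, lambda mhz: mhz <= 1180)
print(result)
```

In practice each call to `runs_stable` would be a manual observation lasting hours or days, so the loop is more a way of keeping track of the procedure than something to automate.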

I guess there are two possibilities: either the power management of the card is defective,
This seems to me like the card's self-defense activating.

or there is something wrong with the power system of the mainboard.
The mainboard does nothing with the power for the GPU: it just passes the 12V to the PCIe connectors. That's why I asked you to check the 24-pin ATX power connector of the mainboard, but since there are no burn marks on that, it should be fine. What mainboard is this?

Since I renewed the PSU only some 8 months ago, this should normally not be the source of the problem (but: who knows).
What brand/model do you have now?

Erich56
Message 46152 - Posted: 12 Jan 2017 | 6:43:55 UTC - in response to Message 46150.
Last modified: 12 Jan 2017 | 7:04:54 UTC

Is this card in your WinXPx64 host (Core2 Duo E7400)?
no, it's in the Windows 10 PC with the Core2Quad Q9550 @ 2.83GHz.

This seems to me like the card's self-defense activating.
But why would the card do this all of a sudden? For many months it has run well on 1300MHz with power consumption ~92% TDP.
After the GPU has run all last night at 1200MHz, this morning once again it reverted back to base clock 1058MHz, power consumption ~56% TDP.

What mainboard is this?
Fujitsu D3041 in an Esprimo P2560

What brand/model do you have now?
BeQuiet PurePower L8 400W

Perhaps you should try to find the highest frequency it can run continuously (by lowering the OC by 25MHz every time it falls back to its default clock)
Of course I could do that; in fact, I will. Still, it seems that something has been wrong for the past 2 days.
Crunching a BNBS with - say - 1100MHz or so will take some 50+ hours :-(

FYI, the other GTX750ti in an even older host (Fujitsu Esprimo P2540, old PSU 300W) with a Core2Duo E7400 @ 2.8GHz runs at 1360MHz (memory clock 2860MHz), power consumption ~96% TDP, without any problems (Windows XP).

PappaLitto
Message 46153 - Posted: 12 Jan 2017 | 11:40:58 UTC

When hitting the cards this hard, I recommend not overclocking at all; a 1% gain in frequency is not worth 12+ hours of wasted time and electricity.

Retvari Zoltan
Message 46154 - Posted: 12 Jan 2017 | 11:55:15 UTC - in response to Message 46152.

it's in the Windows 10 PC with the Core2Quad Q9550 @ 2.83GHz.
Oh, now I understand this :)

This seems to me like the card's self-defense activating.
But why would the card do this all of a sudden?
Because these workunits depend much less on the CPU than the previous batches, hence the WDDM does not hinder their performance as much, so they tolerate less overclocking. This card probably tolerates less overclocking than the other one in the WinXP host, but it wasn't as evident because the performance of the previous workunits was hindered more by the WDDM. Have you ever swapped these cards (between WDDM and non-WDDM OS) to cross-check their overclocking abilities?

For many months it has run well on 1300MHz with power consumption ~92% TDP.
After the GPU has run all last night at 1200MHz, this morning once again it reverted back to base clock 1058MHz, power consumption ~56% TDP.
Only ~56% TDP? Then I think the real clock of this GPU is much less than 1058MHz! It should be around 700MHz, which is really the self-defense mode of these cards. Next time it happens you should check the "real" clock frequency with GPU-Z's sensors, or MSI Afterburner's monitoring window.
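
Besides GPU-Z and Afterburner, the clock reported by the driver can also be read from the command line with `nvidia-smi`, which ships with the NVIDIA driver. The following is a minimal sketch under some assumptions: `nvidia-smi` is on the PATH (on Windows it may sit in the driver's install folder instead), the card and driver actually report these fields (some GeForce cards return "[N/A]" for power draw), and the sample line and the 30 MHz tolerance are invented for illustration.

```python
import subprocess

# Fields requested from nvidia-smi; which ones return real values depends
# on the card and driver.
QUERY = "clocks.sm,clocks.mem,temperature.gpu,power.draw"
KEYS = ["sm_mhz", "mem_mhz", "temp_c", "power_w"]


def parse_sample(line: str) -> dict:
    """Parse one CSV line produced with --format=csv,noheader,nounits."""
    sample = {}
    for key, value in zip(KEYS, line.split(",")):
        try:
            sample[key] = float(value.strip())
        except ValueError:
            sample[key] = None  # e.g. "[N/A]" when the card does not report it
    return sample


def read_sample() -> dict:
    """Run nvidia-smi once and return the reading for the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])


def fell_back(sample: dict, set_mhz: float, tolerance_mhz: float = 30.0) -> bool:
    """True if the measured core clock is clearly below the clock that was set."""
    clock = sample["sm_mhz"]
    return clock is not None and clock < set_mhz - tolerance_mhz


# Invented sample line, shaped like nvidia-smi output for the query above.
demo = parse_sample("1058, 2700, 62, [N/A]")
print(demo, fell_back(demo, set_mhz=1200))
```

Calling `read_sample()` every few minutes and logging the result would show when the fallback happens, which is hard to catch by looking at the Inspector window now and then.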

Crunching a BNBS with - say - 1100MHz or so will take some 50+ hours :-(
50+ hours is too long for 1100MHz, so it confirms that the real frequency is around 700~800MHz when your card is downclocked.

Nick Name
Joined: 3 Sep 13
Posts: 53
Credit: 1,533,531,731
RAC: 0
Message 46158 - Posted: 12 Jan 2017 | 16:46:25 UTC

I've seen this. In my case it happened during the SETI Wow event. I was running the optimized apps and figured it was that or an overloaded system. The cause was never clear, but it was almost certainly some sort of software or driver issue. It hasn't happened for a long time and my GPUs have worked perfectly here and on other projects. I know one time GPU-Z indicated the PCI speed had dropped to 1.1, but another time that was not the case. My GPUs were not overclocked; unlike most, I don't generally crunch all the time, and I also temp-limit my cards for both comfort and longevity.

I would do the following, in whatever order is easiest or makes sense for you.

Set the driver power mode to Prefer Maximum Performance: in the Nvidia Control Panel, go to Manage 3D Settings and scroll down. It is likely set to Adaptive or Optimum.
Reinstall the driver.
Swap the 750 Ti cards, if possible. That should be enough to see if it's a hardware problem.
Try a different slot, if possible.
If you're crunching other projects reduce the overall load.

If none of these work or help show what the problem is then I would start reducing the clock as others said.

You shouldn't have to abort the tasks; a reboot should be enough, if the system responds enough to let you do it. I had to do a hard reset at least once because it was running so slowly.


Erich56
Message 46161 - Posted: 12 Jan 2017 | 18:00:24 UTC - in response to Message 46154.

Because these workunits depend much less on the CPU than the previous batches, hence the WDDM does not hinder their performance as much, so they tolerate less overclocking. This card probably tolerates less overclocking than the other one in the WinXP host, but it wasn't as evident because the performance of the previous workunits was hindered more by the WDDM. Have you ever swapped these cards (between WDDM and non-WDDM OS) to cross-check their overclocking abilities?
Thanks for the logical explanation.
Swapping the two cards as you suggested is unfortunately not possible, since for mechanical reasons the other card does not fit into this PC (the socket of the PCI slot right under the PCIe slot is not located far enough away, downwards).

Only ~56% TDP? Then I think the real clock of this GPU is much less than 1058MHz! It should be around 700MHz, which is really the self-defense mode of these cards. Next time it happens you should check the "real" clock frequency with GPU-Z's sensors, or MSI Afterburner's monitoring window.
I did this with GPU-Z; it showed the same values as the NVIDIA Inspector.

50+ hours is too long for 1100MHz, so it confirms that the real frequency is around 700~800MHz when your card is downclocked.
Here are the values from the task currently being crunched (the previous one I, idiot that I am, inadvertently aborted at 96% progress):
Runtime so far 5 hours, progress 11.190%; so total time will be roughly 44:40 hours. GPU clock (as shown in GPU-Z and Inspector): 1190 MHz, TDP between 83 and 94%, GPU load 94-95%. Within these 5 hours, there has so far been no drop back to the "default clock" of 1058MHz.
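
The runtime estimate above is a linear extrapolation from elapsed time and progress. As a rough sketch, the same arithmetic can be written out and then rescaled to other clock speeds. The rescaling assumes that runtime is inversely proportional to the GPU clock, which is a simplification (memory clock, CPU and driver overhead are ignored); the function names are made up for this example.

```python
def estimate_total_hours(elapsed_hours: float, progress_percent: float) -> float:
    """Linear extrapolation: assumes progress advances at a constant rate."""
    return elapsed_hours * 100.0 / progress_percent


def scale_to_clock(hours: float, from_mhz: float, to_mhz: float) -> float:
    """Assumes runtime is inversely proportional to GPU clock (a simplification)."""
    return hours * from_mhz / to_mhz


# Figures from the post: 5 hours elapsed, 11.190% done, GPU at 1190 MHz.
total = estimate_total_hours(5.0, 11.190)
h, m = divmod(round(total * 60), 60)
print(f"estimated total at 1190 MHz: {h}:{m:02d} h")

# Hypothetical comparison clocks under the inverse-proportionality assumption.
for mhz in (1100, 750):
    print(f"same task at {mhz} MHz: {scale_to_clock(total, 1190, mhz):.1f} h")
```

The first figure agrees with the "roughly 44:40 hours" quoted above. The scaled figures are only as good as the proportionality assumption, so they are a plausibility check rather than a measurement of the real clock.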

Retvari Zoltan
Message 46164 - Posted: 12 Jan 2017 | 20:20:58 UTC - in response to Message 46161.

I did this with GPU-Z; it showed the same values as the NVIDIA Inspector.
That's strange.

Erich56
Message 46166 - Posted: 12 Jan 2017 | 21:04:05 UTC

Perhaps this GPU is somewhat overtaxed by these challenging BNBS WUs. However, I am no longer sure whether the problem started with a different type of WU.
I'll see how the card does once other WUs are available.

Erich56
Message 46197 - Posted: 16 Jan 2017 | 15:30:48 UTC

After the card has run stable on 1200 MHz for two days (crunching BNBS), a few hours ago it again switched back to default clock 1058 MHz.

So I tested the card with SETI@home.
Step by step I raised the clock to 1300 MHz, and for several hours it's been running at this clock without any problem.
GPU load ~97% is even a little higher than it is with BNBS (~ 94%), TDP is 66 - 69%.

This nourishes my suspicion that the GPU may have a problem with the BNBS WUs (in contrast to the GTX 750ti in the other PC).

Retvari Zoltan
Message 46202 - Posted: 16 Jan 2017 | 19:19:07 UTC - in response to Message 46197.

After the card has run stable on 1200 MHz for two days (crunching BNBS), a few hours ago it again switched back to default clock 1058 MHz.

So I tested the card with SETI@home.
Step by step I raised the clock to 1300 MHz, and for several hours it's been running at this clock without any problem.
GPU load ~97% is even a little higher than it is with BNBS (~ 94%), TDP is 66 - 69%.
GPU load is misleading, it's actually the load of the units which distribute the work to the CUDA cores.
The TDP gives a more accurate reading of how much the CUDA cores are utilized. (The TDP was between 83 and 94% by the BNBS workunits)

This nourishes my suspicion that the GPU may have a problem with the BNBS WUs (in contrast to the GTX 750ti in the other PC).
These workunits present the highest stress to your card (compared to the SETI workunits, or the other GPUGrid workunits), so it could be a slightly 'defective' card. You could test it with stress-test tools like Furmark or MSI Kombustor. Let it run for an extended period of time. While it's running, you should look for artifacts (miscolored dots, missing triangles, vertical lines that disappear in the next frame, any strange pattern) to know if your card is defective. It could take ~30 minutes (or more) for these artifacts to show up. You could use the Unigine Heaven test too, but it does not stress the card as much as the other two tools.

Erich56
Message 46211 - Posted: 17 Jan 2017 | 19:32:22 UTC

After the GPU clock, having run at 1200MHz for about a day, once more switched back to 1058 MHz, I restarted the PC and ran the Furmark test for about 1 1/2 hours (at 1920x1080) and did NOT spot any artifacts or other irregularities.
Furmark showed the following values in the upper lefthand corner: GPU clock between 1006 and 1032MHz, TDP between 97% and 103%, fan 36%.

The interesting thing though was that GPU-Z which was running at the same time showed different values:
GPU clock 1200MHz, TDP ~ 68%.
Why this discrepancy?

So, at least from this test, the card would NOT be defective?
Which would lead me even more to the assumption that the card has some kind of problem with the BNBS tasks (or the other way round :-)

Retvari Zoltan
Message 46214 - Posted: 17 Jan 2017 | 20:35:13 UTC - in response to Message 46211.
Last modified: 17 Jan 2017 | 20:41:06 UTC

Well, I'm getting totally confused here (because of the discrepancy).
But the clock reading in Furmark could indicate that this card won't run at 1200MHz while its TDP is high (above 90%).
Perhaps you should try your other card with Furmark, but maybe I'll get even more confused by more data. :)

EDIT:

So, at least from this test, the card would NOT be defective?
No, according to the specifications, the base clock of the GTX 750Ti is 1020MHz, and the boost clock is 1085MHz.
Which vendor / model is this card? (I'm asking it purely out of curiosity, as the silicon lottery matters more than this.)

Erich56
Message 46219 - Posted: 18 Jan 2017 | 6:04:17 UTC - in response to Message 46214.
Last modified: 18 Jan 2017 | 6:46:35 UTC

EDIT:
... according to the specifications, the base clock of the GTX 750Ti is 1020MHz, and the boost clock is 1085MHz.
Which vendor / model is this card? (I'm asking it purely out of curiosity, as the silicon lottery matters more than this.)

It's an MSI.

NVIDIA Inspector shows the following values:
Default clock: 1058MHz - Boost: 1137MHz
GPU clock (second line from bottom): 1044MHz - Boost: 1122MHz

For some 12 hours now, it has run at 1200MHz (without falling back to 1025MHz), TDP between 77% and 94% (values changing every second).

Just for information, my other GTX750Ti is made by ZOTAC and shows the following values in the Inspector:
Default clock: 1032MHz - Boost: 1110MHz
GPU clock (second line from bottom): 1216MHz - Boost: 1294MHz

It's currently running at 1360MHz at TDP between 80 and 102%.

As Zoltan said: "Silicon Lottery" :-)

Erich56
Message 46426 - Posted: 2 Feb 2017 | 8:36:41 UTC

In the past few days, "adaptive" WUs by Pablo were downloaded by this host; so I tried to push up the GPU clock, which was no problem. It ran at 1300MHz, all the time.
Yesterday evening, when a BNBS was downloaded, I turned the clock back to 1200MHz (as I did before with BNBS WUs, in order to have the GPU run stable at 1200MHz) - and this morning I saw that again it had automatically switched to "base clock" 1058MHz. Now it obviously doesn't even run stable at 1200MHz any more, with BNBS.
So this was the final proof that this GPU has problems with the full load imposed by BNBS tasks.

Erich56
Message 46427 - Posted: 2 Feb 2017 | 9:42:10 UTC
Last modified: 2 Feb 2017 | 9:49:42 UTC

Strange thing here: although the "Edit" button was still shown with my previous posting, after adding some text I got the message that I cannot edit the posting any longer.

Okay, so the text comes here:

After running for about 1 hour, the GPU has just switched back once more to 1058MHz (TDP around 60%, according to NVIDIA Inspector plus GPU-Z).
Any idea how I could increase the clock without always having to reboot the whole system for this purpose?
What I tried already was to terminate and restart BOINC - however, this did not help.

Erich56
Message 46480 - Posted: 9 Feb 2017 | 6:24:32 UTC

When this morning I took the first look at my PCs, I noticed the following situation on both the one with the GTX970 and also the one with the GTX750ti (both crunching a PABLO_adaptive_goal_KIX):

In the NVIDIA Inspector, the GPU clock was down at 540 MHz(!), Memory clock 2700MHz (default), GPU Load 0, Power between 84 and 89%. Changing the clock values with the sliders was not possible.

In GPU-Z, no values were shown at all for GPU clock, memory clock, GPU load, or Video Engine load (which normally is 0 anyway); "no values" means a "-" in the fields where values (or "0") are normally shown. Power consumption shows the same values as in the Inspector.

However, the "progress" column of the BOINC manager shows the percentage advancing, and it seems to me (but I might be mistaken) at about the same speed as usual.

What's going on with these two cards?

Erich56
Message 46481 - Posted: 9 Feb 2017 | 13:39:46 UTC

Meanwhile, the WU got finished, and I received credit.
Looking up the Stderr shows the following:

...
<stderr_txt>
# GPU [GeForce GTX 750 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 750 Ti
# ECC : Disabled
# Global mem : 2048MB
# Capability : 5.0
# PCI ID : 0000:01:00.0
# Device clock : 1137MHz
# Memory clock : 2700MHz
# Memory width : 128bit
# Driver version : r370_00 : 37290
# GPU 0 : 55C
# GPU 0 : 56C
# GPU 0 : 57C
# GPU 0 : 58C
# GPU 0 : 59C
# GPU 0 : 60C
# GPU 0 : 61C
# GPU 0 : 62C
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1965.
# SWAN swan_assert 0

...

any idea what this exactly means?
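
For reference, in the CUDA driver API the code 999 is `CUDA_ERROR_UNKNOWN`, so the log line by itself does not name a cause. When comparing several such stderr outputs, it can help to pull out just the temperature readings and the error codes. A minimal sketch, assuming the line layout shown in the excerpt above (the shortened sample text and the function name are made up for this example):

```python
import re

# Patterns assume the layout seen in the stderr excerpt above.
TEMP_RE = re.compile(r"^#\s*GPU\s+(\d+)\s*:\s*(\d+)C\s*$")
ERROR_RE = re.compile(r"Cuda driver error (\d+)")


def summarize_stderr(text: str) -> dict:
    """Collect the highest logged GPU temperature and any CUDA error codes."""
    temps = []
    errors = []
    for raw in text.splitlines():
        line = raw.strip()
        temp_match = TEMP_RE.match(line)
        if temp_match:
            temps.append(int(temp_match.group(2)))
            continue
        error_match = ERROR_RE.search(line)
        if error_match:
            errors.append(int(error_match.group(1)))
    return {"max_temp_c": max(temps) if temps else None, "cuda_errors": errors}


# Shortened copy of the excerpt above, used only as demo input.
sample = """\
# GPU [GeForce GTX 750 Ti] Platform [Windows] Rev [3212] VERSION [65]
# Device clock : 1137MHz
# GPU 0 : 55C
# GPU 0 : 62C
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1965.
"""
summary = summarize_stderr(sample)
print(summary)
```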

Erich56
Message 46813 - Posted: 2 Apr 2017 | 11:22:27 UTC

This phenomenon, where the GPU automatically switches the clock rate back to the default 1058MHz when crunching certain WUs, is really annoying by now.
Currently, the GPU crunches

e1s5_ubiquitin_100ns_4-ADRIA_FOLDGREED50_crystal_ss_contacts_100_ubiquitin_4-0-2-RND5069_0

which seems to put quite an outrageous strain on the card; it's been running for 32 hours now and has reached only 60% progress.
Whenever I try to set the clock to even a little above 1058MHz, it switches back to 1058MHz within a minute.
Power usage oscillates between 58 and 75%.

Erich56
Message 47144 - Posted: 30 Apr 2017 | 8:55:45 UTC

Since installation of crunching software acemd_918-80 (plus the newer NVIDIA driver 381.65) the situation has become even worse.

Not only does the clock, beyond a certain (mild) GPU overclock, fall back to the "default clock" of 1058MHz after some unknown time - now crunching comes to a complete halt. So if the PC is unattended for a while and I don't notice this right away, there is an inadvertent pause in crunching for several hours, or even up to a day.

As I said before, this does NOT happen with the GTX750Ti which runs in a Windows XP system (with crunching software 849-65 and driver 368.81).
When, two weeks ago, it looked as if XP would no longer be supported by GPUGRID, I installed Windows 10 on a separate partition and ran GPUGRID there, with acemd_918-80 and driver 381.65; and - no surprise - I got the same problem as on the other PC.
Meanwhile, since software for XP exists again, I changed back to XP, and this runs without problems.

So now, at least, it's clear to me that nothing is wrong with the video card; it's the crunching software and/or the driver which makes these funny things happen (this morning, I had this on the PC with the GTX970, too).

About a week ago, someone else wrote in another thread here in the forum that same thing happens with his card.

Really too bad :-(
