Message boards : Graphics cards (GPUs) : BOINC always feeds "Computation error" when running GPUGrid at GTX1080.

guichenge
Joined: 12 Aug 12
Posts: 6
Credit: 229,781,715
RAC: 229
Message 46370 - Posted: 28 Jan 2017 | 11:16:15 UTC
Last modified: 28 Jan 2017 | 11:19:01 UTC

GPUGrid is said to support the GeForce 10 series well. But when I run GPUGrid on my GTX 1080, every task ends with "Computation error", and I don't know how to deal with this.
System environment: Win7 64-bit, driver 372.90+, [email protected](x64) + VirtualBox 5.0.18, security software disabled, and GPUGrid has downloaded "_cudart64_80.dll" and "_cufft64_80.dll".

BOINC log:
2017/1/28 4:52:08 | | Reading preferences override file
2017/1/28 4:52:08 | | Preferences:
2017/1/28 4:52:08 | | max memory usage when active: 4037.07MB
2017/1/28 4:52:08 | | max memory usage when idle: 6055.61MB
2017/1/28 4:52:08 | | max disk usage: 10.00GB
2017/1/28 4:52:08 | | max CPUs used: 6
2017/1/28 4:52:08 | | (to change preferences, visit a project web site or select Preferences in the Manager)
2017/1/28 4:52:33 | | Project communication failed: attempting access to reference site
2017/1/27 21:58:20 | GPUGRID | Starting task e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0
2017/1/27 21:58:21 | GPUGRID | Computation for task e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0 finished
2017/1/27 21:58:21 | GPUGRID | Output file e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0_0 for task e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0 absent
2017/1/27 21:58:21 | GPUGRID | Output file e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0_1 for task e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0 absent
2017/1/27 21:58:21 | GPUGRID | Output file e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0_2 for task e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0 absent
2017/1/27 21:58:21 | GPUGRID | Output file e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0_3 for task e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0 absent
2017/1/27 21:58:23 | GPUGRID | Started upload of e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0_7
2017/1/27 21:58:27 | GPUGRID | Finished upload of e12s8_e8s3p0f165-ADRIA_FOLD_crystal_ss_contacts_20_ntl9_1-0-1-RND9401_0_7
2017/1/27 22:00:22 | GPUGRID | Sending scheduler request: To report completed tasks.
2017/1/27 22:00:22 | GPUGRID | Reporting 1 completed tasks
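
For anyone reading a longer log, here is a minimal sketch of how the "Output file ... absent" entries could be collected per task. It assumes each log entry sits on a single line in the `date time | project | message` layout shown above; the function name `absent_outputs` is made up for illustration and is not part of BOINC.

```python
import re

# "date time | project | message"; the project column may be empty.
LOG_LINE = re.compile(
    r"^(?P<when>\S+ \S+)\s*\|\s*(?P<project>[^|]*?)\s*\|\s*(?P<msg>.*)$"
)
# Message text BOINC prints when an expected output file was not produced.
ABSENT = re.compile(r"^Output file (?P<file>\S+) for task (?P<task>\S+) absent$")


def absent_outputs(lines):
    """Return {task_name: [missing output files]} from event-log lines."""
    missing = {}
    for line in lines:
        entry = LOG_LINE.match(line.strip())
        if not entry:
            continue  # not a log entry (blank line, wrapped text, ...)
        hit = ABSENT.match(entry.group("msg").strip())
        if hit:
            missing.setdefault(hit.group("task"), []).append(hit.group("file"))
    return missing
```

A task that finishes about one second after starting and lists all of its output files as absent, as here, points to the application failing at launch rather than partway through the computation.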

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 46371 - Posted: 28 Jan 2017 | 13:43:02 UTC - in response to Message 46370.
Last modified: 28 Jan 2017 | 13:43:13 UTC

- What is the exact make and model of your GPU?
- What does GPU-Z show for your "GPU Clock" value?
- Do you have any TDR logs (.log files in your C:\Windows\LiveKernelReports\WATCHDOG folder)?
- If it is factory overclocked, have you tried running it at reference clocks, by using MSI Afterburner to set a negative Core Clock offset value?

I'd start by ruling out the possibility that overclocking is causing the issue.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Message 46372 - Posted: 28 Jan 2017 | 14:17:58 UTC - in response to Message 46370.
Last modified: 28 Jan 2017 | 14:31:37 UTC

GPU applications can't run in VirtualBox.
EDIT: At least it's not easy to make them run in VirtualBox.
EDIT2: This could be related to driver corruption. Try uninstalling and reinstalling your GPU drivers. See this post.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 46373 - Posted: 28 Jan 2017 | 14:28:42 UTC
Last modified: 28 Jan 2017 | 14:29:08 UTC

I think they meant that their host has VirtualBox 5.0.18 installed (which comes with the latest BOINC installer).

guichenge
Joined: 12 Aug 12
Posts: 6
Credit: 229,781,715
RAC: 229
Message 46397 - Posted: 30 Jan 2017 | 11:58:55 UTC - in response to Message 46371.
Last modified: 30 Jan 2017 | 11:59:59 UTC

I haven't overclocked this card.
ASUS ROG STRIX-GTX1080-A8G-GAMING
GPU Clock: 1671 MHz
And I only found .dmp files when I looked in that folder.

guichenge
Joined: 12 Aug 12
Posts: 6
Credit: 229,781,715
RAC: 229
Message 46398 - Posted: 30 Jan 2017 | 12:03:25 UTC - in response to Message 46372.

I followed the steps in your post, but the fault persists.

guichenge
Joined: 12 Aug 12
Posts: 6
Credit: 229,781,715
RAC: 229
Message 46399 - Posted: 30 Jan 2017 | 12:05:26 UTC - in response to Message 46373.

What does 5.0.18 mean? The recommended version of BOINC is 7.6.33.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 46400 - Posted: 30 Jan 2017 | 13:47:29 UTC
Last modified: 30 Jan 2017 | 13:48:39 UTC

That GPU is factory overclocked.

https://www.asus.com/us/Graphics-Cards/ROG-STRIX-GTX1080-A8G-GAMING/specifications/

OC Mode - GPU Boost Clock : 1835 MHz , GPU Base Clock : 1695 MHz
Gaming Mode (Default) - GPU Boost Clock : 1809 MHz , GPU Base Clock : 1670 MHz
*Retail goods are with default Gaming Mode, OC Mode can be adjusted with one click on GPU Tweak II

===============================

The behavior sounds like the overclock might be too high. You should investigate, by removing the factory overclock entirely:

- Install MSI Afterburner
- Adjust "Core Clock (MHz)" to be negative, like "-63"
- Set the "Apply at Startup" option, if you want that clock to be applied at startup
- Install GPU-Z
- Verify that GPU-Z shows "GPU Clock" with a value of 1607 MHz, which is the stock reference clock for a GTX 1080, per:
https://en.wikipedia.org/wiki/GeForce_10_series
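
The "-63" above is just the difference between the two base clocks. A tiny sketch of that arithmetic, assuming the 1670 MHz Gaming Mode base clock from the ASUS page and the 1607 MHz reference value; the function name is only illustrative:

```python
def core_clock_offset(current_base_mhz: int, reference_base_mhz: int) -> int:
    """Offset (MHz) to enter in the tuning tool to land on the reference clock."""
    return reference_base_mhz - current_base_mhz


# Gaming Mode base clock (ASUS spec) vs. the reference GTX 1080 base clock.
print(core_clock_offset(1670, 1607))  # -63
```

If GPU-Z reports a slightly different value on your card (for example 1671 MHz), use that as the first argument instead.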

===============================

7.6.33 is the BOINC version. 5.0.18 is the VirtualBox version. VirtualBox is software made by Oracle that can be installed alongside BOINC and is used to run virtual machine tasks.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Message 46401 - Posted: 30 Jan 2017 | 13:57:35 UTC - in response to Message 46399.

The stderr output of your tasks shows:

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741511 (0xc0000139)
</message>
]]>
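This error code (0xc0000139) is the Windows status STATUS_ENTRYPOINT_NOT_FOUND: an entry point could not be located in a DLL, which typically means a required DLL is missing, damaged, or the wrong version.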
It's probably one of the dll files needed for the GPUGrid app.
Make sure you have the following dll files in the c:\ProgramData\BOINC\projects\www.gpugrid.net\ folder:
_cudart64_80.dll 366016 bytes, CRC32: 6C8AA6DE
_cufft64_80.dll 145769016 bytes, CRC32: B3BEC988
_tcl86.dll 1262080 bytes, CRC32: 46AB87F0
_zlib1.dll 112640 bytes, CRC32: 13C8641F
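
A minimal sketch of the two checks above: turning the signed exit code into the hex status, and comparing each DLL's size and CRC32 with the quoted values. The folder path and expected values are taken from this post; the function names are made up for illustration, and the script assumes Python 3.8 or later.

```python
import zlib
from pathlib import Path

# Sizes (bytes) and CRC32 values quoted in the post above.
EXPECTED = {
    "_cudart64_80.dll": (366016, 0x6C8AA6DE),
    "_cufft64_80.dll": (145769016, 0xB3BEC988),
    "_tcl86.dll": (1262080, 0x46AB87F0),
    "_zlib1.dll": (112640, 0x13C8641F),
}


def as_ntstatus(exit_code: int) -> str:
    """Render a signed 32-bit exit code as an unsigned hex status value."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


def file_crc32(path: Path, chunk_size: int = 1 << 20) -> int:
    """CRC32 of a file, read in chunks so large DLLs are not loaded at once."""
    crc = 0
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def check_dlls(folder: Path, expected=EXPECTED) -> dict:
    """Map each expected DLL to 'ok', 'missing', 'wrong size' or 'wrong crc'."""
    report = {}
    for name, (size, crc) in expected.items():
        path = folder / name
        if not path.is_file():
            report[name] = "missing"
        elif path.stat().st_size != size:
            report[name] = "wrong size"
        elif file_crc32(path) != crc:
            report[name] = "wrong crc"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    print(as_ntstatus(-1073741511))  # 0xc0000139
    folder = Path(r"C:\ProgramData\BOINC\projects\www.gpugrid.net")
    for name, status in check_dlls(folder).items():
        print(f"{name}: {status}")
```

Any result other than "ok" suggests deleting that file so BOINC downloads it again. The expected values apply to the application version current at the time of this post and may differ for later releases.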

guichenge
Joined: 12 Aug 12
Posts: 6
Credit: 229,781,715
RAC: 229
Message 46436 - Posted: 4 Feb 2017 | 15:43:07 UTC - in response to Message 46400.

I'll take a look. But Core21 at FAH and the OpenCL projects at Einstein@Home run normally.

guichenge
Joined: 12 Aug 12
Posts: 6
Credit: 229,781,715
RAC: 229
Message 46437 - Posted: 4 Feb 2017 | 15:50:25 UTC - in response to Message 46401.

All four of these files are present and complete.

d337z
Joined: 24 Mar 15
Posts: 1
Credit: 62,541,925
RAC: 0
Message 46609 - Posted: 9 Mar 2017 | 7:56:28 UTC

If I may, please check your NVidia Control Panel settings in the "Manage 3D Settings" tab and ensure that "Optimize for Compute Performance" is set to off. GPUGrid, for some strange reason, will not work properly with the large CUDA address spaces and has not been optimized to work with Maxwell GPUs under such conditions.

Xeon Seo
Joined: 4 Apr 16
Posts: 2
Credit: 90,585,791
RAC: 0
Message 47853 - Posted: 6 Sep 2017 | 17:00:38 UTC
Last modified: 6 Sep 2017 | 17:01:48 UTC

I ran into this problem just yesterday too, and I was stuck on it for the whole day today.

After some research I figured out that, in my case, the cause was using different overclock offsets on my two cards.
I was crunching CUDA tasks on both a GeForce GTX 1050 and a GTX 1060, overclocking the GTX 1050 by a 160 MHz offset and the GTX 1060 by a 200 MHz offset.
The issue always popped up when the cards were set as above. Strangely, they worked smoothly both at the factory frequencies and when both cards used the same offset.
Now I apply a 160 MHz overclock to both, though the GTX 1060 has enough headroom for more.

Yes, I tried reinstalling the graphics driver as Retvari Zoltan suggested above, but I guess the problem was more related to the overclock settings.

I hope this is helpful to others.

Xeon Seo
Joined: 4 Apr 16
Posts: 2
Credit: 90,585,791
RAC: 0
Message 47856 - Posted: 9 Sep 2017 | 3:51:23 UTC
Last modified: 9 Sep 2017 | 3:51:37 UTC

And one more thing:
I recommend not pushing your graphics cards to their overclock limits.
Doing so makes computation failures more likely, and it certainly did for me.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47857 - Posted: 9 Sep 2017 | 4:08:21 UTC - in response to Message 47856.

There are tools to use to test whether your overclock can be considered stable. For GPUs... Heaven, Valley, Furmark, etc.
