Message boards : Graphics cards (GPUs) : Linux-x86_64, 4 Tesla c1060s, fresh install of 6.4.5 cannot get any work

Western Scientific Tech S...
Joined: 22 Dec 08
Posts: 2
Credit: 193,501
RAC: 0
Message 4714 - Posted: 22 Dec 2008 | 6:53:33 UTC

Greetings,

I realize there are many threads already on the current WU scheduler problem. I had a machine working perfectly, racking up credits and now, with no changes to the system, it cannot get any work.

I just brought up another system with 4 Tesla C1060s and I cannot get any work either.

Using boinc 6.4.5 x86_64

I have two systems with a total of six C1060s waiting for WUs, and another eight Tesla C1060s on two systems waiting to be brought online. I am holding off until the WU scheduling issue is resolved.

Has a specific fix been mentioned yet for Linux / BOINC 6.4.5? It appears Windows users are happily back to getting WUs.

My latest message file:
21-Dec-2008 14:46:08 [---] Starting BOINC client version 6.4.5 for x86_64-pc-linux-gnu
21-Dec-2008 14:46:08 [---] log flags: task, file_xfer, sched_ops
21-Dec-2008 14:46:08 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3 c-ares/1.5.1
21-Dec-2008 14:46:08 [---] Data directory: /home/boinc_user/BOINC
21-Dec-2008 14:46:08 [---] Processor: 8 GenuineIntel Genuine Intel(R) CPU @ 2.33GHz [Family 6 Model 23 Stepping 1]
21-Dec-2008 14:46:08 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca sse4_1 lahf_
21-Dec-2008 14:46:08 [---] OS: Linux: 2.6.27.9-noxen
21-Dec-2008 14:46:08 [---] Memory: 3.87 GB physical, 8.00 GB virtual
21-Dec-2008 14:46:08 [---] Disk: 21.36 GB total, 9.60 GB free
21-Dec-2008 14:46:08 [---] Local time is UTC -8 hours
21-Dec-2008 14:46:08 [---] Not using a proxy
21-Dec-2008 14:46:13 [---] CUDA devices found
21-Dec-2008 14:46:13 [---] Coprocessor: Tesla C1060 (4)
21-Dec-2008 14:46:13 [---] No general preferences found - using BOINC defaults
21-Dec-2008 14:46:13 [---] Preferences limit memory usage when active to 1983.78MB
21-Dec-2008 14:46:13 [---] Preferences limit memory usage when idle to 3570.81MB
21-Dec-2008 14:46:13 [---] Preferences limit disk usage to 9.51GB
21-Dec-2008 14:46:13 [---] This computer is not attached to any projects
21-Dec-2008 14:46:13 [---] Visit http://boinc.berkeley.edu for instructions
21-Dec-2008 14:46:41 [---] Fetching configuration file from http://www.gpugrid.net/get_project_config.php
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Suspending computation - running CPU benchmarks
21-Dec-2008 14:47:04 [GPUGRID] Master file download succeeded
21-Dec-2008 14:47:09 [GPUGRID] Sending scheduler request: Project initialization. Requesting 1 seconds of work, reporting 0 completed tasks
21-Dec-2008 14:47:14 [GPUGRID] Scheduler request completed: got 0 new tasks
21-Dec-2008 14:47:14 [GPUGRID] Message from server: No work sent
21-Dec-2008 14:47:14 [GPUGRID] Message from server: Full-atom molecular dynamics for Cell processor is not available for your type of computer.
21-Dec-2008 14:47:14 [GPUGRID] Message from server: Full-atom molecular dynamics on Cell processor is not available for your type of computer.
21-Dec-2008 14:47:16 [GPUGRID] Started download of logops3grid.png
21-Dec-2008 14:47:16 [GPUGRID] Started download of project_1.png
21-Dec-2008 14:47:18 [GPUGRID] Finished download of logops3grid.png
21-Dec-2008 14:47:18 [GPUGRID] Finished download of project_1.png
21-Dec-2008 14:47:18 [GPUGRID] Started download of project_2.png
21-Dec-2008 14:47:18 [GPUGRID] Started download of project_3.png
21-Dec-2008 14:47:20 [GPUGRID] Finished download of project_2.png
21-Dec-2008 14:47:20 [GPUGRID] Finished download of project_3.png
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:46:57 [---] Running CPU benchmarks
21-Dec-2008 14:47:28 [---] Benchmark results:
21-Dec-2008 14:47:28 [---] Number of CPUs: 8
21-Dec-2008 14:47:28 [---] 2332 floating point MIPS (Whetstone) per CPU
21-Dec-2008 14:47:28 [---] 6505 integer MIPS (Dhrystone) per CPU
21-Dec-2008 14:47:29 [---] Resuming computation
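
For anyone who wants to pull just the server replies out of a message file like the one above, here is a minimal sketch. It assumes every line has the shape `date time [project] text`, as in this log; the names `parse_line` and `server_messages` are made up for illustration and are not part of BOINC.

```python
import re

# Assumed line shape, inferred from the log above:
#   21-Dec-2008 14:47:14 [GPUGRID] Message from server: No work sent
LINE_RE = re.compile(r"^(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$")
SERVER_PREFIX = "Message from server: "


def parse_line(line):
    """Split one log line into (timestamp, project, text), or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.groups()


def server_messages(lines, project="GPUGRID"):
    """Return the text of every 'Message from server' line for the given project."""
    messages = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, line_project, text = parsed
        if line_project == project and text.startswith(SERVER_PREFIX):
            messages.append(text[len(SERVER_PREFIX):])
    return messages


# A few lines copied from the log above.
SAMPLE = [
    "21-Dec-2008 14:46:13 [---] CUDA devices found",
    "21-Dec-2008 14:47:14 [GPUGRID] Scheduler request completed: got 0 new tasks",
    "21-Dec-2008 14:47:14 [GPUGRID] Message from server: No work sent",
    "21-Dec-2008 14:47:14 [GPUGRID] Message from server: Full-atom molecular dynamics "
    "for Cell processor is not available for your type of computer.",
]

if __name__ == "__main__":
    for message in server_messages(SAMPLE):
        print(message)
```

This only filters text. It says nothing about why the scheduler answered the way it did.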

Western Scientific Tech S...
Joined: 22 Dec 08
Posts: 2
Credit: 193,501
RAC: 0
Message 4716 - Posted: 22 Dec 2008 | 7:07:39 UTC

Another thing that may be a coincidence...

When I set the boinc manager gui to 'simple view', the gui states that I am a member of the PS3 Grid. That oddity, combined with the fact that when I do a manual update the server responds that it has no work for a Cell based client, makes me think that the server somehow thinks I am a PS3 instead of a dual Xeon with four Tesla C1060s.

Ideas?

Alain Maes
Joined: 8 Sep 08
Posts: 63
Credit: 1,616,269,959
RAC: 82,103
Message 4717 - Posted: 22 Dec 2008 | 8:31:59 UTC - in response to Message 4714.
Last modified: 22 Dec 2008 | 8:37:23 UTC

The standard BOINC preferences limit your RAM when active/in use to 1983.78 MB (see your message file), which is less than the 512 MB per GPU (4 x 512 MB = 2048 MB for your four cards) that GPUGRID requires, I think.

So try raising this limit in your account preferences so that more than 512 MB per GPU is allowed, i.e. to at least 60% of your 4 GB.
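
To make the arithmetic explicit, here is a rough sketch using the numbers from the message file (3.87 GB physical memory, 1983.78 MB limit when active, four GPUs). The 512 MB per GPU figure is only my guess, not a confirmed GPUGRID requirement, and the function names are made up for this example.

```python
# Numbers taken from the message file quoted in this thread.
PHYSICAL_GB = 3.87          # "Memory: 3.87 GB physical"
ACTIVE_LIMIT_MB = 1983.78   # "limit memory usage when active to 1983.78MB"
GPUS = 4                    # "Coprocessor: Tesla C1060 (4)"


def needed_mb(gpus, mb_per_gpu=512):
    """System RAM needed if each GPU task wants mb_per_gpu (512 MB is an assumption)."""
    return gpus * mb_per_gpu


def percent_of_ram(mb, physical_gb):
    """Express an amount in MB as a percentage of physical RAM given in GB."""
    return 100.0 * mb / (physical_gb * 1024)


if __name__ == "__main__":
    need = needed_mb(GPUS)
    print("needed MB:", need)
    print("current limit MB:", ACTIVE_LIMIT_MB)
    print("limit too low:", ACTIVE_LIMIT_MB < need)
    print("current limit as % of RAM:", round(percent_of_ram(ACTIVE_LIMIT_MB, PHYSICAL_GB), 1))
    print("minimum % needed:", round(percent_of_ram(need, PHYSICAL_GB), 1))
```

Under that assumption the current limit works out to about half of physical RAM and falls just short of what four GPUs would need, so 60% leaves some headroom.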

On my 64-bit Ubuntu machine with one GTX 260, I changed this setting to 90% after I noticed that, apparently only occasionally, even a single WU drives the acemd.. app up to 2.3 GB of memory.

Worth trying I think.

edit - also check which CUDA driver you have. 178..., like me? Then try the new 180.48.

Kind regards and happy crunching. Machines like yours do make a difference (wish I had such a powerhouse for myself!).

Alain

Desti
Joined: 10 Jul 07
Posts: 19
Credit: 1,272,950
RAC: 0
Message 4720 - Posted: 22 Dec 2008 | 9:07:34 UTC - in response to Message 4716.

Another thing that may be a coincidence...

When I set the boinc manager gui to 'simple view', the gui states that I am a member of the PS3 Grid. That oddity, combined with the fact that when I do a manual update the server responds that it has no work for a Cell based client, makes me think that the server somehow thinks I am a PS3 instead of a dual Xeon with four Tesla C1060s.

Ideas?



That is all normal.
____________
Linux Users Everywhere @ BOINC

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 4722 - Posted: 22 Dec 2008 | 9:42:47 UTC - in response to Message 4714.

Hi,
it seems that we still have a problem in the server.
I have sent your log to boinc dev.

Thanks for reporting.

gdf.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 4738 - Posted: 22 Dec 2008 | 17:36:32 UTC

@Alain: the requirement for GPU-Grid was to have at least 256 MB of local video memory, which the Teslas easily fulfil. It's even been lowered a bit by now. The 2.3 GB of system memory which you've been seeing is a new bug (discussed elsewhere) and shouldn't happen.

MrS
____________
Scanning for our furry friends since Jan 2002

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 4771 - Posted: 22 Dec 2008 | 23:32:43 UTC - in response to Message 4714.

We are looking carefully into why you cannot download WUs.
We need you to attach to the project and request work to test it.

thanks, gdf.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 4774 - Posted: 23 Dec 2008 | 3:19:31 UTC

I was hoping to have more for you today ... sigh ...

I was getting the "no work" message while I was finishing up the last task. Then I could not fetch work because of low resource share. So, I fixed THAT and reset my LTD so that I could fetch work, and now I have to run down the queue again to see if it will fetch normally.
