Author |
Message |
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
|
CUDA 3.1 has been released to the General Public by NVidia.
This does not mean GPUGrid is releasing CUDA 3.1 compiled apps/tasks just yet, but the techs said they have compiled and tested 3.1 apps and get some improvement in performance. I expect they will move when the existing 3.0 compiled applications are completed, and following a few runs with the new apps (Betas/tests). They have to give Fermi users time to install the new driver as well as finish the existing CUDA 3.0 compiled tasks.
So, if you have a Fermi, install this latest driver.
Note: CUDA 3.1 will only improve Fermi cards!
Note: the Beta driver was 257.15!
New in Version 257.21
* Adds support for Blu-ray 3D with NVIDIA 3D Vision technology. Learn more about the hardware and software requirements here.
* Increases performance for GeForce GTX 400 Series GPUs in several PC games. The following are examples of some of the most significant improvements measured with GeForce GTX 480. Results will vary depending on your GPU and system configuration:
o Up to 14% in Aliens vs. Predator (1920x1200 noAA/AF – Tessellation on)
o Up to 4% in Batman: Arkham Asylum (1920x1200 4xAA/16xAF PhysX=High)
o Up to 5% in BattleForge (1920x1200 4xAA/16xAF – Very High settings)
o Up to 5% in Call of Duty: Modern Warfare 2 (1920x1200 4xAA/16xAF)
o Up to 4% in Crysis: Warhead (1920x1200 4xAA/16xAF – Enthusiast setting)
o Up to 24% in Enemy Territory: Quake Wars (1920x1200 no AA/AF)
o Up to 9% in Far Cry 2 (2560x1600 8xAA/16xAF)
o Up to 25% in Just Cause 2 (2560x1600 no AA/AF - Concrete Jungle)
o Up to 7% in Metro 2033 (1920x1200 no AA/16xAF – Tessellation on)
o Up to 40% in Metro 2033 with SLI (1920x1200 4xAA/16xAF – Tessellation on)
o Up to 8% in S.T.A.L.K.E.R.: Call of Pripyat (1920x1200 no AA/AF – Day)
o Up to 110% in Stone Giant with SLI (2560x1600 – Tessellation on, DoF on)
o Up to 6% in The Chronicles of Riddick: Dark Athena (2560x1600 no AA/AF)
o Up to 9% in Unigine: Tropics (2560x1600 no AA/AF – OpenGL)
o Up to 5% in 3DMark Vantage (Performance and Extreme Presets)
o Up to 19% with Transparency AA (1920x1200 4xTrSS – measured in Crysis)
* Upgrades PhysX System Software to version 9.10.0223.
* Adds support for OpenGL 4.0 for GeForce GTX 400 Series GPUs.
* Adds support for CUDA Toolkit 3.1 which includes significant performance increases for double precision math operations. See CUDA Zone for more details.
* Adds support for new extreme Antialiasing modes for 3-way SLI PCs, including up to SLI48x AA for GeForce 200 series GPUs and up to SLI96x AA for GeForce GTX 400 series GPUs.
* Adds support for a new ‘Quality’ mode for NVIDIA’s Ambient Occlusion control panel feature.
* Adds a new NVIDIA Control Panel setup page for SLI and PhysX for ultimate control over multi-gpu configurations.
* Adds a new NVIDIA Control Panel feature for ultimate control over CUDA GPUs, allowing the user to effectively choose which GPU will power each CUDA application.
* 3D Vision customers can download the v257.21 3D Vision drivers here.
* Includes numerous bug fixes. Refer to the release notes on the documentation tab for information about the key bug fixes in this release.
Additional Information:
* Installs HD Audio driver version 1.0.9.1 (for supported GPUs).
* Supports the new GPU-accelerated features in Adobe CS5.
* Supports GPU-acceleration for smoother online HD videos with Adobe Flash 10.1. Learn more here.
* Supports the new version of MotionDSP's video enhancement software, vReveal, which adds support for HD output. NVIDIA customers can download a free version of vReveal that supports up to SD output here.
* Supports DirectCompute with Windows 7 and GeForce 8-series and later GPUs.
* Supports OpenCL 1.0 (Open Computing Language) for all GeForce 8-series and later GPUs.
* Supports OpenGL 3.3 for GeForce 8-series and later GPUs.
* Supports single GPU and NVIDIA SLI technology on DirectX 9, DirectX 10, DirectX 11, and OpenGL, including 3-way SLI, Quad SLI, and SLI support on SLI-certified Intel X58-based motherboards.
* Supports GPU overclocking and temperature monitoring by installing NVIDIA System Tools software.
NVidia |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
I can't actually find a CUDA 3.1 release anywhere.
http://developer.nvidia.com/object/cuda_archive.html
GDF |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
|
Sorry, that was a bad title choice.
By public I meant end user (cruncher) support, rather than the toolkit (which is not required to crunch with, just to compile with).
CUDA 3.1 support was released with the latest 257.21 driver.
However the CUDA 3.1 Developer Tools, as you point out, are still in Beta form!
It appears that NVidia have one leg that wants to go for a run while the other is still sleeping. Or is the saying that the left hand does not know what the right hand is doing?
Still, you can at least test with the Beta, if you need to.
|
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
What makes it faster are the DLLs distributed with the 3.1 toolkit.
Only when the toolkit is publicly available can we distribute them.
gdf |
|
|
|
CUDA 3.1 has now actually been released by NVidia.
GDF started a new thread about it in the news section.
|
|
|
|
Driver updated on my machine ... I am ready to rock whenever GPUGrid is.
GDF - will the new version be tested in beta first?
Should I check "Run test applications" in my preferences?
Thank you,
____________
Thanks - Steve |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
Yes,
first in beta.
gdf |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
There are 20 beta units out with CUDA 3.1.
gdf |
|
|
ftpd
Joined: 6 Jun 08 Posts: 152 Credit: 328,250,382 RAC: 0
|
I received 3 of them (6.28 application) on a GTX295.
I thought CUDA 3.1 was specifically for Fermi cards?
I will give you the times for these WUs.
Time is 10 min 37 secs for two WUs (Windows XP, driver 257.21).
Good result?
____________
Ton (ftpd) Netherlands |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
|
Not good, I would say; I think they were meant for Fermi cards to test on.
One of ftpd's runs on his GTX295
Tried to configure my settings to only pick up test apps, but we can no longer do that! |
|
|
|
I just missed those 3.1 WUs. But I think they should take much longer than that, especially on a GTX295. Anyway, how could a GTX295 get a Fermi WU? |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
CUDA 3.1 also works well on all other cards, so any card can take it.
Some Fermi cards have picked it up and the performance is good.
gdf |
|
|
|
I just had one finish on a GTX285 clocked up to 1656 on Win7 x64.
Time to complete = 10 minutes 1 sec.
Slightly heavier CPU usage but not too bad (0.23 normal, 0.29 beta).
Utilization is still low, but we know that is an NVidia / Microsoft issue on Vista / Win7 and not GPUGrid's app.
____________
Thanks - Steve |
|
|
|
Another one ... this time 480GTX clocked at 1688 on WinXP 32
runtime = 3 minutes 39 seconds.
and another showed up as I was posting, same system
runtime = 3 minutes 29 seconds
both with SWAN_SYNC = 0
I have not tried without SWAN_SYNC to see if it is still necessary. My guess is that these run so fast I'll not be able to catch one before it runs so we might need to wait until they are full size and widely distributed before we can answer.
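For anyone who wants to confirm whether SWAN_SYNC is actually set before a task starts, here is a minimal Python sketch. It only reports what the current process sees in its own environment; the helper name `swan_sync_status` is made up for illustration, and it says nothing about how the GPUGrid app itself reads the variable.

```python
import os


def swan_sync_status(env=None):
    """Describe how SWAN_SYNC is set in the given environment mapping.

    `env` defaults to the environment of the current process. Passing a
    plain dict makes the helper easy to try out without touching the
    real environment.
    """
    env = os.environ if env is None else env
    value = env.get("SWAN_SYNC")
    if value is None:
        return "SWAN_SYNC is not set"
    return f"SWAN_SYNC = {value}"


if __name__ == "__main__":
    # Report the setting visible to this process.
    print(swan_sync_status())
```

Run it from the same account and shell that launches the BOINC client, since a variable set elsewhere may not be visible to the science app.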
____________
Thanks - Steve |
|
|
nenym
Joined: 31 Mar 09 Posts: 137 Credit: 1,308,230,581 RAC: 0
|
65nm GTX 260 (unusable with the CUDA 2.2/2.3 apps on GPUGRID), driver 257.21 on XP x64:
factory OC 1.4GHz: 8 minutes 1 second
a little more OC via Riva, 1.48GHz: 7 minutes 46 seconds
both with SWAN_SYNC = 0
P.S. Sorry for deleting standard tasks; there is no way to run only test tasks. |
|
|
nenym
Joined: 31 Mar 09 Posts: 137 Credit: 1,308,230,581 RAC: 0
|
More test tasks received.
Results can be seen at host ID 31329; I am trying some different overclocks. |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
> Another one ... this time 480GTX clocked at 1688 on WinXP 32
> runtime = 3 minutes 39 seconds.
> and another showed up as I was posting, same system
> runtime = 3 minutes 29 seconds
> both with SWAN_SYNC = 0
> I have not tried without SWAN_SYNC to see if it is still necessary. My guess is that these run so fast I'll not be able to catch one before it runs so we might need to wait until they are full size and widely distributed before we can answer.
It takes 4.351 ms/step. This is the fastest result I have ever seen (due to the overclock). Windows XP is as fast as Linux it seems.
gdf |
|
|
ftpd
Joined: 6 Jun 08 Posts: 152 Credit: 328,250,382 RAC: 0
|
Another one on a GTX480 took 4 min 06 secs.
OK?
____________
Ton (ftpd) Netherlands |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
Yes. 4.870 ms/step is about right.
I think that the beta works well.
gdf |
|
|
|
GTX470 OC 700/1400/1850
no SWAN_SYNC environment variable
# Time per step (avg over 50000 steps): 6.745 ms
# Approximate elapsed time for entire WU: 337.234 s
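The two log lines above are tied together by simple arithmetic: time per step multiplied by the number of steps. A rough Python sketch, assuming the 50000 steps stated in the log (the function names are made up for illustration):

```python
def elapsed_seconds(ms_per_step, steps=50000):
    """Estimate whole-WU run time in seconds from the per-step time.

    The default of 50000 steps is taken from the log line above;
    other work units may use a different step count.
    """
    return ms_per_step * steps / 1000.0


def format_min_sec(seconds):
    """Render a duration in seconds as 'M min SS s', rounded to a whole second."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes} min {secs:02d} s"


if __name__ == "__main__":
    # Per-step time from the log above (GTX470, no SWAN_SYNC).
    estimate = elapsed_seconds(6.745)
    print(f"{estimate:.2f} s, about {format_min_sec(estimate)}")
```

For 6.745 ms/step this gives about 337.25 s, close to the 337.234 s the app reports; the small gap is presumably because the per-step figure in the log is rounded.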
|
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
|
This CUDA 3.1 WU ran OK on Win XP using a GTX470 OC'd to 707MHz. Time per step: 5.371 ms
Approximate elapsed time for entire WU: 268.547 s
Got a cuda30 client failure just before picking this WU up!
Full dump here, http://www.gpugrid.net/result.php?resultid=2597382
Looks like something else tried to use the RAM.
It would be nice to see normal-sized CUDA 3.1 Fermi work units. These are too small to judge the performance gain from.
|
|
|