
Message boards : Graphics cards (GPUs) : gtx680

Profile GDF
Message 22760 - Posted: 20 Dec 2011 | 16:48:44 UTC
Last modified: 20 Dec 2011 | 16:52:47 UTC

http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units#GeForce_600_Series

I don't think that the table is correct. Flops are too high.

gdf

Profile skgiven
Message 22774 - Posted: 20 Dec 2011 | 19:36:29 UTC - in response to Message 22760.

Far too high; unless the 256 cuda cores of a GTX650 for $179 really will outperform the 1024 cuda cores of a one year old GTX590 ($669). No chance; that would kill their existing market, and you know how NVidia likes to use the letter S.

My calculated guess is that a GTX680 will have a GFlops peak of around 3000 to 3200 - just over twice that of a GTX580, assuming most of the rest of the info is reasonably accurate.

When it comes to crunching, a doubling of the 500 generation performance would be a reasonable expectation, but 4.8 times seems too high.

I don't see how XDR2 would in itself double performance, and I doubt that architectural enhancements will squeeze out massive performance gains given that it's dropped in size from 520 to 334 mm²; transistor count will apparently remain the same.
Perhaps for some enhanced application that fully uses the performance of XDR2 you might see such silly numbers, but for crunching I wouldn't expect anything more than a 2.0 to 2.5 times increase in performance (generation on generation).
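For anyone checking the arithmetic, here's a minimal sketch of the peak-GFLOPS estimate above, assuming the usual 2 FLOPs per shader per clock (one fused multiply-add). The GTX680 shader count and clock are placeholders picked purely to land in the 3000-3200 range, not leaked or confirmed specs.

/* Peak single-precision GFLOPS = 2 FLOPs (one fused multiply-add)
 * per shader per clock x shader count x shader clock in GHz.
 * The GTX580 line uses its official specs; the "GTX680" line uses
 * placeholder values only, since nothing is confirmed yet. */
#include <stdio.h>

static double peak_gflops(int shaders, double shader_clock_ghz)
{
    return 2.0 * shaders * shader_clock_ghz;
}

int main(void)
{
    printf("GTX580          : %4.0f GFLOPS\n", peak_gflops(512, 1.544));  /* ~1581 */
    printf("GTX680 (guessed): %4.0f GFLOPS\n", peak_gflops(1024, 1.55));  /* ~3174 */
    return 0;
}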
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Message 22775 - Posted: 20 Dec 2011 | 19:39:20 UTC - in response to Message 22760.

Wow, that looks totally stupid! Looks like a boy's Christmas wish list. Well, before every new GPU generation you'll find a rumor for practically every possible (and impossible) configuration floating around..

- XDR2 memory: seems like the rumor guys have fallen in love with it. It's not on GCN, though. And if I were nVidia I'd consider it too risky to transition the entire new product lineup at once. There'd need to be serious production capacity ready by now.

- Traditionally nVidia goes with wider memory buses rather than higher clocks, which matches well with their huge chips. I don't see any reason for this to change.

- The core clocks are much higher than even on AMD's HD7970 (925 MHz on pre-release slides). Traditionally nVidia's core clocks have been lower and I see no reason why this should change now.

- The shader clocks are totally through the roof. They hit 2.1 GHz on heavily OC'ed G92s, but the stock clocks have been hovering around 1.5 GHz for a long time. Going any higher hurts power efficiency.

- They introduced a fixed factor of 2 between base and shader clock with Fermi. Why do that if they were going to change it again with Kepler? I'd expect this to stay, for some time at least.

- 3.0 billion transistors for the flagship would actually be lower than GF100 and GF110 at ~3.2 billion. At the same time the shader count is said to increase to 640, and the shaders support more advanced features (i.e. must become bigger). Unless Fermi was a totally inefficient chip (I'm talking about the design, not the GF100 chip!), I don't expect this to be possible.

- Just 190 W TDP for their flagship? They've been designing power-constrained monster chips for some time now. If these specs were true, rest assured that Kepler would have gotten a lot more shaders.

- The proposed die size of 334 mm² actually looks reasonable for a 3.0 billion transistor chip at 28 nm.

- The astronomic FLOPS are a direct result of the insane clock speeds. Not going to happen.

Overall the proposed data looks more like a traditional ATI "mean & lean" design than an nVidia design.

They may be able to push clock speeds way higher if they used even more hand-crafted logic rather than synthesized logic (as in a CPU). Count me in for a pleasant surprise if they actually pull that off (it requires tons and megatons of work).

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Message 23109 - Posted: 23 Jan 2012 | 20:26:01 UTC

Rumors say there's too large a bug in the PCIe 3 part of the Kepler A1 stepping, which means the introduction will have to wait for another stepping -> maybe April.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Damaraland
Message 23247 - Posted: 4 Feb 2012 | 15:04:16 UTC

Now there are rumours that they will be released this month.

tomshardware "Rumor: Nvidia Prepping to Launch Kepler in February"

Evil Penguin
Message 23249 - Posted: 4 Feb 2012 | 16:48:29 UTC
Last modified: 4 Feb 2012 | 16:48:45 UTC

I don't expect anything from nVidia until April.
In any case, I hope AMD and nVidia continue to compete vigorously.

MarkJ
Message 23280 - Posted: 7 Feb 2012 | 11:40:01 UTC

Another article with a table of cards/chip types:

here

It's a bit blurry. I can't claim credit for finding this one; it was posted by one of the guys over at Seti. Interesting spec sheets though.
____________
BOINC blog

ExtraTerrestrial Apes
Message 23283 - Posted: 7 Feb 2012 | 12:42:24 UTC - in response to Message 23280.

That's the same as what's being posted here. Looks credible to me for sure. A soft evolution of the current design, no more CC 2.1-style superscalar shaders (all 32 shaders per SM). Even the expected performance compared to AMD fits.

However, in the comments people seem very sure that there's no "hot shader clock" in Kepler. That's strange and would represent a decisive redesign. I'd go as far as to say: nVidia needs the "2 x performance per shader" from the hot clock. If they removed it they'd either have to increase the whole chip clock (unlikely) or perform a serious redesign of the shaders, make them more power efficient (easy at lower clocks) and either greatly improve their performance (not easy) or make them much smaller (which doesn't seem to have been done here, according to these specs).

So overall.. let's wait for April then :D

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Zydor
Message 23291 - Posted: 7 Feb 2012 | 15:50:25 UTC

Charlie's always good for a read on this stuff - he seems to have mellowed in his old age just lately :)

GK104:
http://semiaccurate.com/2012/02/01/physics-hardware-makes-keplergk104-fast/

GK110:
http://semiaccurate.com/2012/02/07/gk110-tapes-out-at-last/

I hope the increasing rumours on performance are true - whether it's real raw power or sleight of hand aimed at gamers, either way it's a win for consumers, as prices will trend down with competition, an aspect that's been sorely lacking in the last 3 years.

2012 shaping up to be a fun year :)

Regards
Zy

ExtraTerrestrial Apes
Message 23293 - Posted: 7 Feb 2012 | 17:20:56 UTC - in response to Message 23291.

GK110 release in Q3 2012.. painful for nVidia, but quite possible given they don't want to repeat Fermi and it's a huge chip, which needs another stepping before final tests can be made (per some other news 1 or 2 weeks ago).

And the other article: very interesting read. If Charlie is right (and he has been right in the past) Kepler is indeed a dramatic departure from the current designs.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Message 23294 - Posted: 7 Feb 2012 | 17:29:02 UTC - in response to Message 23291.
Last modified: 7 Feb 2012 | 22:25:20 UTC

No more CC 2.1-like issues would mean that choosing a GF600 NVidia GPU to contribute towards GPUGrid will be easier; it basically comes down to what you can afford.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Message 23295 - Posted: 7 Feb 2012 | 19:39:42 UTC - in response to Message 23294.

They are probably going to be CC 3.0. Whatever that will mean ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Message 23298 - Posted: 8 Feb 2012 | 1:16:17 UTC - in response to Message 23295.

I'm concerned about the 256-bit memory path and comments such as "The net result is that shader utilization is likely to fall dramatically". Suggestions are that unless your app uses physics 'for Kepler' performance will be poor, but if it does use physics 'for Kepler' performance will be good. Of course only games sponsored by Nvidia will be physically enhanced 'for Kepler', not research apps.

With NVidia (and AMD) going out of their way to have patches coded for games that tend to be used for benchmarking, the Internet's Dubious Information On Technology will be even more salty and pinched. So wait for a Kepler app and then see the performance before buying.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile GDF
Message 23304 - Posted: 8 Feb 2012 | 9:17:32 UTC - in response to Message 23298.

If the added speed is due to some new instructions, then we might be able to take advantage of them or not. We have no idea. Memory bandwidth should not be a big problem.

gdf

ExtraTerrestrial Apes
Message 23319 - Posted: 8 Feb 2012 | 19:07:00 UTC

GK104 is supposed to be a small chip. With a 256-bit memory bus you can easily get HD6970 performance in games without running into limitations. Try to push performance considerably higher and your shaders will run dry. That's what Charlie suggests.

This is totally unrelated to GP-GPU performance: just take a look at how little bandwidth MW requires. It "depends on the code".. as always ;)

And, as GDF said, if nVidia made the shaders more flexible (they probably did) and more efficient for game physics, this could easily benefit real physics (the equations will be different, the general scheme rather similar).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Message 23386 - Posted: 10 Feb 2012 | 23:48:45 UTC - in response to Message 23319.

Some interesting info:
http://wccftech.com/alleged-nvidia-kepler-gk104-specs-exposed-gpu-feature-1536-cuda-cores-hotclocks-variants/

Profile GDF
Message 23410 - Posted: 12 Feb 2012 | 8:44:04 UTC - in response to Message 23386.

http://www.brightsideofnews.com/news/2012/2/10/real-nvidia-kepler2c-gk1042c-geforce-gtx-670680-specs-leak-out.aspx

If this is real, it seems that Kepler multiprocessors are doubled GF104 MPs. I hope they work better than GF104 for compute.

gdf

Profile skgiven
Message 23416 - Posted: 12 Feb 2012 | 13:47:56 UTC - in response to Message 23410.

If you can only use 32 of the 48 cuda cores on GF104 then you could be looking at 32 from 96 with Kepler, which would make them no better than existing hardware. Obviously they might have made changes that allow for easier access, so we don't know that will be the case, but the ~2.4 times performance over GF104 should be read as 'maximum' performance, as in 'up to'. My impression is that Kepler will generally be OK cards with some exceptional performances here and there, where physics can be used to enhance performance. I think you will have some development to do before you get much out of the card, but hey, that's what you do!
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile GDF
Message 23419 - Posted: 12 Feb 2012 | 14:35:38 UTC - in response to Message 23416.
Last modified: 13 Feb 2012 | 8:55:34 UTC

No, it should be at least 64/96, but I still hope they have improved the scheduling.
Anyway, with such changes there will be time for optimizations.
gdf

ExtraTerrestrial Apes
Message 23427 - Posted: 12 Feb 2012 | 21:46:50 UTC

32 from 96 would mean going 3-way superscalar. They may be green, but they're not mad ;)
As GDF said, 64 of 96 would retain the current 1.5-way superscalar ratio. And seeing how this did OK, but not terribly well, I'd also say they'd rather increase the number of wave fronts in flight than this ratio. I wouldn't be surprised if they process each of the 32 threads/warps/pixels/whatever in a wave front in one clock, rather than 2 times 16 in 2 clocks.

And don't forget that shader clock speeds are down, so don't expect a linear speed increase with shader count. Anyway, it's getting interesting!

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Message 23441 - Posted: 13 Feb 2012 | 8:58:25 UTC - in response to Message 23427.

I wouldn't be surprised if they process each of the 32 threads/warps/pixels/whatever in a wave front in one clock, rather than 2 times 16 in 2 clocks.
MrS


That's what it looks like from the diagram; they have 32 load/store units now.

ExtraTerrestrial Apes
Message 23572 - Posted: 20 Feb 2012 | 19:54:04 UTC - in response to Message 23509.

They've got the basic parameters of the HD7970 totally wrong, even though it was officially introduced 2 months ago. Performance is also wrong: it should be ~30% faster than the HD6970 in games, but they're saying 10%. They could argue that their benchmark is not what you'd typically get in games.. but then what else is it?

I'm not going to trust their data on unreleased hardware ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Message 23781 - Posted: 5 Mar 2012 | 19:50:54 UTC - in response to Message 23572.

It seems that we are close
http://semiaccurate.com/2012/03/05/nvidia-will-launch-gk104keplergtx680-in-a-week/

gdf

Profile Damaraland
Message 23835 - Posted: 8 Mar 2012 | 22:06:49 UTC - in response to Message 23781.

More news.
March 8: the press gets briefed by Nvidia.
March 12: Nvidia will paper-launch the cards.
March 23-26: cards go on sale.

http://semiaccurate.com/2012/03/08/the-semiaccurate-guide-to-nvidia-keplergk104gtx680-launch-activities/
____________
HOW TO - Full installation Ubuntu 11.10

Profile Zydor
Message 23914 - Posted: 12 Mar 2012 | 10:28:17 UTC

More rumours ... Guru3D article:

http://www.guru3d.com/news/nvidia-geforce-gtx-680-up-to-4gb-gddr5/

Regards
Zy

Profile Zydor
Message 23961 - Posted: 14 Mar 2012 | 18:37:04 UTC

Alleged pics of a 680...

http://www.guru3d.com/news/new-nvidia-geforce-gtx-680-pictures-surface/

Regards
Zy

Profile GDF
Message 23973 - Posted: 15 Mar 2012 | 21:04:22 UTC - in response to Message 23961.

Better pictures, benchmarks and specifications.
http://www.tomshardware.com/news/Nvidia-Kepler-GeForce-GTX680-gpu,15012.html

It should be out on 23rd March, but by the time it gets to Barcelona it's going to be May or June.
If somebody can give one to the project we can start porting the code earlier. This seems to be an even bigger change than the Fermi cards were.

Profile Retvari Zoltan
Message 23974 - Posted: 16 Mar 2012 | 0:46:47 UTC

I still have a bad feeling about the 1536 CUDA cores....

JLConawayII
Message 23976 - Posted: 16 Mar 2012 | 1:24:44 UTC

What sort of "bad feeling"?

Profile skgiven
Message 23977 - Posted: 16 Mar 2012 | 1:59:29 UTC - in response to Message 23976.

I had the same sort of "bad feeling" - these cuda cores are not what they used to be, and the route to using them is different. Some things could be much faster if PhysX can be used, but if not, who knows.
http://www.tomshardware.com/news/hpc-tesla-nvidia-GPU-compute,15001.html Might be worth a look.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

JLConawayII
Message 23979 - Posted: 16 Mar 2012 | 2:59:16 UTC

I wouldn't worry about it. I'm pretty sure the 6xx cards will be great. If they're not, you can always buy more 5xx cards at plummeting prices. There's really no losing here I think.

Profile GDF
Message 23980 - Posted: 16 Mar 2012 | 7:03:44 UTC - in response to Message 23979.

Well, at the very least they seem to be like the Fermi GPUs with 48 cores per multiprocessor, which we know have comparatively poor performance.

I hope that they figured it out, otherwise without code changes it might well be on par with a GTX580.

ExtraTerrestrial Apes
Message 23982 - Posted: 16 Mar 2012 | 9:29:54 UTC
Last modified: 16 Mar 2012 | 9:30:25 UTC

There's a price to be paid for increasing the shader count by a factor of 3 while even lowering TDP. 28 nm alone is by far not enough for this.

Seems like Kepler is more in line with AMD's vision: provide plenty of raw horsepower and make it "OK to use" - not as bad as with VLIW, but not as easy as previously. Could be the two teams are converging on rather similar architectures with Kepler and GCN. The devil's just in the details and software.
(I haven't seen anything but rumors on Kepler, though)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Message 23988 - Posted: 16 Mar 2012 | 13:44:06 UTC - in response to Message 23982.

Suggested price is $549, and suggested 'paper' launch date is 22nd March.

With the 1536 shaders being thinner than before, similar to AMD's approach, getting more work from the GPU and reaching the shaders might be the challenge this time.

The proposed ~195W TDP sits nicely between an HD 7950 and 7970, and noticeably lower than the 244W of the GTX580 (which is 25% higher), so even if it can just match a GTX580 the energy savings are not insignificant. The price however is a bit daunting, and until a working app is developed (which might take some time) we will have no idea of performance compared to the GTX500's.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile Retvari Zoltan
Message 23994 - Posted: 16 Mar 2012 | 17:03:14 UTC - in response to Message 23976.

What sort of "bad feeling"?

I have two things on my mind:
1. The GTX 680 looks to me more like an improved GTX 560 than an improved GTX 580. If the GTX 560's bottleneck is present in the GTX 680, then GPUGrid could utilize only 2/3 of its shaders (i.e. 1024 of 1536).
2. It could mean that the Tesla and Quadro series will be improved GTX 580s, and we won't have an improved GTX 580 in the GeForce product line.

Profile Carlesa25
Message 23996 - Posted: 16 Mar 2012 | 19:23:15 UTC
Last modified: 16 Mar 2012 | 19:44:44 UTC

Hi. What I find most strange is the following relationship:

GTX580 = 3,000 million transistors, 512 cores (GF110)
GTX680 = 3,540 million transistors, 1536 cores (GK104)

I do not understand how they can triple the cores with so few extra transistors.

GTX285 = 1,400 million transistors, 240 cores, 470 mm² die
GTX580 = 3,000 million transistors, 512 cores, 520 mm² die
GTX680 = 3,540 million transistors, 1536 cores, 294 mm² die

These numbers do not add up for me; the relationship between these values for the GTX200 and GTX500 does not fit the GTX600 evolution.

ExtraTerrestrial Apes
Message 23999 - Posted: 16 Mar 2012 | 21:07:01 UTC - in response to Message 23996.
Last modified: 16 Mar 2012 | 21:08:52 UTC

That's because Kepler fundamentally changes the shader design. How exactly is not clear yet.
@Retvari: and that's why comparisons to the GTX560 are not relevant here. I'm saying it's going to be great, just that it'll be very different.

BTW: in the past nVidia chips got rather close to their TDP in "typical" loads, i.e. games. There an HD7970 hovers around the 200 W mark. 250 W is just the PowerTune limit.

Edit: further information for the brave.. the original is Chinese.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Message 24002 - Posted: 16 Mar 2012 | 22:48:47 UTC - in response to Message 23999.

That's because Kepler fundamentally changes the shader design. How exactly is not clear yet.
@Retvari: and that's why comparisons to the GTX560 are not relevant here. I'm saying it's going to be great, just that it'll be very different.

I know, but none of the rumors comfort me. I remember how much was expected of the GTX 460-560 line; they are actually great for games, but not so good at GPUGrid. I'm afraid that nVidia wants to separate their gaming product line from the professional product line even more than before.
I'd like to upgrade my GTX 590 because it's too noisy, but I'm not sure it will be worth it.
We'll see in a few months.

Profile GDF
Message 24003 - Posted: 17 Mar 2012 | 7:12:53 UTC - in response to Message 24002.

They cannot afford to separate gaming and computing. The chips will still need to be the same for economies of scale, and there is a higher and higher interest in computing within games.

Changes are good; after all there are more shaders, we just have to learn how to use them. As it is the flagship product, we are prepared to invest a lot in it.

gdf

ExtraTerrestrial Apes
Message 24007 - Posted: 17 Mar 2012 | 12:43:38 UTC - in response to Message 24002.

Well.. even if they perform "only" like a Fermi CC 2.0 with 1024 or even 768 Shaders: that would still be great, considering they accomplish it with just 3.5 billion transistors instead of 3.2 billion for 512 CC 2.0 shaders. That's significant progress anyway.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Damaraland
Message 24008 - Posted: 17 Mar 2012 | 12:58:21 UTC - in response to Message 24007.
Last modified: 17 Mar 2012 | 12:59:19 UTC

Well.. even if they perform "only" like a Fermi CC 2.0 with 1024 or even 768 Shaders: that would still be great, considering they accomplish it with just 3.5 billion transistors instead of 3.2 billion for 512 CC 2.0 shaders. That's significant progress anyway.

Agreed! Don't forget power consumption too. I want a chip, not a stove!
Industry will never make a huge jump. They have to get value out of their research investment. It's always more profitable to take two small steps than one big one.
____________
HOW TO - Full installation Ubuntu 11.10

ExtraTerrestrial Apes
Message 24009 - Posted: 17 Mar 2012 | 15:02:13 UTC - in response to Message 24008.

Otherwise people will be disappointed the next time you "only" make a medium step.. ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Zydor
Message 24015 - Posted: 18 Mar 2012 | 1:10:01 UTC
Last modified: 18 Mar 2012 | 1:10:36 UTC

Pre-Order Site in Holland - 500 Euros

http://www.guru3d.com/news.html#15424

3DMark 11 benchmark, which if verified is interesting. I am being cautious about gaming claims until I know about any embedded PhysX code. The 3DMark 11 bench is however more interesting. If that translates to the compute side as well as it indicates.... it could be interesting. Still, let's await reality, but I hope it is as good as the 3DMark 11 result; competition is sorely needed out there.

http://www.guru3d.com/news/new-gtx-680-benchmarks-surface/

Regards
Zy

Profile Retvari Zoltan
Message 24018 - Posted: 18 Mar 2012 | 12:14:58 UTC - in response to Message 24015.
Last modified: 18 Mar 2012 | 12:17:04 UTC

I was thinking about what could be the architectural bottleneck which results in the underutilization of the CUDA cores in the CC2.1 product line.
The ratio of the other parts versus the CUDA cores in a shader multiprocessor is increased compared to the CC2.0 architecture, except for the load/store units.
While CC2.0 has 16 LD/ST units for 32 CUDA cores, CC2.1 has 16 LD/ST units for 48 CUDA cores.
And what do I see in the latest picture of the GK104 architecture?

There are 32 LD/ST units for 192 CUDA cores. (There were 64 LD/ST units in the previous 'leaked' picture.)
If these can feed only 64 CUDA cores here at GPUGrid, then only 512 of the 1536 shaders could be utilized here.
Now that's what I call a bad feeling.
But I'm not a GPGPU expert, and these pictures could be misleading.
Please prove me wrong.
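To spell out the ratio argument, here's a rough back-of-the-envelope sketch. It assumes usable cores scale with LD/ST units at the CC 2.0 ratio (2 cores per LD/ST unit), which is only a guess, and the GK104 figures come from the leaked diagram, so they are unconfirmed.

/* Back-of-the-envelope estimate: assume a workload can only keep
 * 2 CUDA cores per LD/ST unit busy (the CC 2.0 ratio of 32:16) and
 * see how many shaders that leaves usable per chip.
 * GK104 numbers come from the leaked diagram, i.e. unconfirmed. */
#include <stdio.h>

static void estimate(const char *name, int ldst_per_sm, int cores_per_sm, int sms)
{
    int usable = 2 * ldst_per_sm;        /* CC 2.0 ratio: 2 cores per LD/ST unit */
    if (usable > cores_per_sm)
        usable = cores_per_sm;
    printf("%-16s %3d of %3d cores per SM -> %4d of %4d total\n",
           name, usable, cores_per_sm, usable * sms, cores_per_sm * sms);
}

int main(void)
{
    estimate("CC 2.0 (GF110)", 16, 32, 16);   /*  512 of  512 */
    estimate("CC 2.1 (GF114)", 16, 48, 8);    /*  256 of  384 */
    estimate("GK104 (leaked)", 32, 192, 8);   /*  512 of 1536 */
    return 0;
}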

Profile skgiven
Message 24024 - Posted: 18 Mar 2012 | 14:39:22 UTC - in response to Message 24018.

I wouldn't expect things to work straight out of the box this time. I concur with Zoltan on the potential accessibility issue, or worsening of it. I'm also concerned about a potential loss of cuda core function; what did NVidia strip out of the shaders? Then there is the much-speculated reliance on PhysX and the potential movement of some functionality onto the GPU. So, it looks like app development might keep Gianni away from mischief for some time :)
The memory bandwidth has not increased from the GTX580, leaving space for a GTX700 perhaps, and there is no mention of OpenCL 1.2 or DirectX 11.1 that I can see. In many respects NVidia and AMD have either swapped positions or reached parity this time (TDP, die size, transistor count). Perhaps NVidia will revert to type in a future incarnation.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Message 24028 - Posted: 18 Mar 2012 | 17:32:23 UTC - in response to Message 24018.

The problem with CC 2.1 cards should have been the superscalar arrangement. It was nicely written down by Anandtech here. In short: one SM in CC 2.0 cards works on 2 warps in parallel. Each of these can issue one instruction per cycle for 16 "threads"/pixels/values. With CC 2.1 the design changed: there are still 2 warps with 16 threads each, but both can issue 2 instructions per clock if the next instruction is not dependent on the result of the current one.

Load/Store units could also be an issue, but I think this one is much more severe.
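A minimal sketch of that issue-rate argument, as a deliberately simplified single-clock model of the scheduling described in the Anandtech article; real hardware dependency tracking is more involved.

/* Simplified model: each warp scheduler issues to 16 lanes per clock,
 * and on CC 2.1 it may dual-issue a second, independent instruction.
 * Without instruction-level parallelism (ILP) the extra 16 cores of a
 * CC 2.1 SM simply sit idle. */
#include <stdio.h>

static int cores_fed(int schedulers, int issue_width, int ilp, int cores)
{
    int issues_per_sched = ilp < issue_width ? ilp : issue_width;
    int fed = schedulers * issues_per_sched * 16;   /* 16 lanes per issue */
    return fed < cores ? fed : cores;
}

int main(void)
{
    printf("CC 2.0, no ILP : %d of 32 cores fed\n", cores_fed(2, 1, 1, 32));  /* 32 */
    printf("CC 2.1, no ILP : %d of 48 cores fed\n", cores_fed(2, 2, 1, 48));  /* 32 */
    printf("CC 2.1, ILP>=2 : %d of 48 cores fed\n", cores_fed(2, 2, 2, 48));  /* 48 */
    return 0;
}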

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Zydor
Message 24034 - Posted: 19 Mar 2012 | 11:22:48 UTC

680 SLI 3DMark 11 benchmarks (Guru3D via a VrZone benching session)

http://www.guru3d.com/news/geforce-gtx-680-sli-performance-uncovered/

Regards
Zy

Profile Retvari Zoltan
Message 24045 - Posted: 19 Mar 2012 | 19:57:37 UTC - in response to Message 24028.

The problem with CC 2.1 cards should have been the superscalar arrangement. It was nicely written down by Anandtech here. In short: one SM in CC 2.0 cards works on 2 warps in parallel. Each of these can issue one instruction per cycle for 16 "threads"/pixels/values. With CC 2.1 the design changed: there are still 2 warps with 16 threads each, but both can issue 2 instructions per clock if the next instruction is not dependent on the result of the current one.

Load/Store units could also be an issue, but I think this one is much more severe.

MrS

The Anandtech article you've linked was quite enlightening.
I neglected to compare the number of warp schedulers in my previous post.
Since then I've found a much better figure of the two architectures.
Comparison of the CC2.1 and CC2.0 architecture:

Based on that Anandtech article and the picture of the GTX 680's SMX, I've concluded that it will be superscalar as well. There are twice as many dispatch units as warp schedulers, while in the CC2.0 architecture their number is equal.
There are 4 warp schedulers for 192 CUDA cores in the GTX 680's SMX, so at the moment I think GPUGrid could utilize only 2/3 of its shaders (1024 of 1536), just like on the CC2.1 cards (2 warp schedulers for 48 CUDA cores), unless nVidia built some miraculous component into the warp schedulers.
In addition, based on the transistor count I think the GTX 680's FP64 capability (which is irrelevant at GPUGrid) will be reduced or perhaps omitted.

Profile Retvari Zoltan
Message 24046 - Posted: 19 Mar 2012 | 21:09:55 UTC - in response to Message 24003.

They cannot afford to separate gaming and computing. The chips will still need to be the same for economy of scale and there is a higher and higher interest in computing within games.

Changes are good, after all there are more shaders, we just have to learn how to use them. As it is the flagship product we are prepared to invest a lot on it.

gdf

I remember the events before the release of the Fermi architecture: nVidia showed various double precision simulations running much faster in real time on Fermi than on GT200b. I haven't seen anything like that this time. Furthermore there is no mention of ECC at all in the GTX 680 rumors.
It looks to me like this time nVidia is going to release their flagship gaming product before the professional one. I don't think they simplified the professional line that much.
What if they release a slightly modified GF110 made on 28nm lithography as their professional product line? (Efficiency is much more important in the professional product line than peak chip performance - and of course it would be faster than the GF110-based Teslas.)

ExtraTerrestrial Apes
Message 24047 - Posted: 19 Mar 2012 | 21:20:29 UTC - in response to Message 24045.
Last modified: 19 Mar 2012 | 21:21:09 UTC

Glad to hear it was the right information for you :)

I think there's more going on. Note that in CC 2.1 they had 3 blocks of 16 shaders, which are arranged in 6 columns with 8 shaders each in the diagram. In the GK104 diagram, however, there are columns of 16 shaders. If these were still blocks of 16 shaders, there would be 12 such blocks, which in turn would require 12 dispatch units - many more than are available.

This wouldn't make sense. What I suppose they did instead is arrange the shaders in blocks of 32, so that all threads within a warp can be scheduled at once (instead of taking 2 consecutive clocks). In this case there'd be "only" 6 of these blocks to distribute among 4 warps with 8 dispatch units.

Worst case we should still see 2/3 of the shaders not utilized. However, there are 4 warps instead of 2 now. Still (as in CC 2.1) every 2nd warp needs to provide some instruction suitable for parallel execution, but load balancing should be improved.

And there's still the chance they increased the "out of order window", which is the number of instructions that the hardware can look ahead to find instructions suitable for superscalar execution. As far as I understand this was only the next instruction in CC 2.1.

I too suppose it's not going to be a DP monster - and it doesn't have to be, as a mainly consumer / graphics oriented card. Leave that for GK100/GK110 (whatever the flagship will be).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile SMTB1963
Message 24058 - Posted: 20 Mar 2012 | 17:35:54 UTC
Last modified: 20 Mar 2012 | 17:46:33 UTC

Looks like some guys over at XS managed to catch tom's hardware with their pants down. Apparently, tom's briefly exposed some 680 performance graphs on their site and XS member Olivon was able to scrape them before access was removed. Quote from Olivon:

An old habit from Tom's Hardware. Important is to be quick

LOL!

Anyways, the graphs that stand out, and other relevant (for our purposes) graphs:

Profile skgiven
Message 24075 - Posted: 22 Mar 2012 | 12:29:39 UTC - in response to Message 24061.
Last modified: 22 Mar 2012 | 12:30:59 UTC

Release date is supposed to be today!
I expect Europe has to wait for the US to wake up, before the Official reviews start. Until then tweaktown's unofficial review might be worth a look, but no CUDA testing, just games.

There is an NVidia Video here.
The card introduces GPU Boost (Dynamic Clock Speed), and 'fur' fans will be pleased!

LegitReviews posted suggested GK110 details, including release date.

____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile Retvari Zoltan
Message 24076 - Posted: 22 Mar 2012 | 13:03:44 UTC - in response to Message 24075.
Last modified: 22 Mar 2012 | 13:07:57 UTC

2304 is another fancy number, as far as powers of 2 go.
Probably the next generation will contain 7919 CUDA cores. :)

5pot
Message 24077 - Posted: 22 Mar 2012 | 13:19:32 UTC

Interesting and tantalizing numbers. Can't wait to see how they perform.

Profile GDF
Message 24078 - Posted: 22 Mar 2012 | 18:21:50 UTC - in response to Message 24077.

It appears that they are actually available for real, at least in the UK.
So it is not a paper launch.
gdf

Profile skgiven
Message 24079 - Posted: 22 Mar 2012 | 19:27:18 UTC - in response to Message 24078.
Last modified: 22 Mar 2012 | 20:00:00 UTC

nvidia-geforce-gtx-680-review by Ryan Smith of AnandTech.

Compute performance certainly isn't great and FP64 is terrible (1/24)!

They can be purchased online from around £400 to £440 in the UK, though the only ones I can see in stock are £439.99! Some are 'on order'. So yeah, real launch, but somewhat limited and expensive stock. Also, they are the same price as an HD 7970. While AMD launched both the HD 7970 and HD 7950, NVidia had but one, as yet... This is different from the GTX480/GTX470 and the GTX580/GTX570 launches.
We will have to wait and see how they perform when GPUGrid gets hold of one, but my expectations are not high.

Other Reviews:

Tom’s Hardware
Guru 3D
TechSpot
HardOCP
Hardware Heaven
Hardware Canucks
TechPowerUp
Legit Reviews
LAN OC
Xbit Labs
TweakTown
Phoronix
Tbreak
Hot Hardware

Link Ref: http://news.techeye.net/hardware/nvidia-gtx-680-retakes-performance-crown-barely#ixzz1psRj9zuD

____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile Retvari Zoltan
Message 24081 - Posted: 22 Mar 2012 | 20:03:18 UTC - in response to Message 24079.
Last modified: 22 Mar 2012 | 20:08:40 UTC

Here in Hungary I can see in stock only the Asus GTX680-2GD5 for 165100HUF, that's 562.5€, or £468.3 (including 27% VAT in Hungary)
I can see a PNY version for 485.5€ (£404), and a Gigabyte for 498€ (£414.5) but these are not in stock, so these prices might be inaccurate.

JLConawayII
Message 24082 - Posted: 22 Mar 2012 | 20:15:24 UTC

So its compute power has actually decreased significantly from the GTX 580?! The Bulldozer fiasco continues. What a disappointing year for computer hardware.

ExtraTerrestrial Apes
Message 24083 - Posted: 22 Mar 2012 | 20:27:26 UTC - in response to Message 24082.

It's built for gaming, and that's what it does best. We'll have to wait a few more months for their new compute monster (GK110).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Message 24084 - Posted: 22 Mar 2012 | 20:57:47 UTC - in response to Message 24083.

So far we have no idea how the performance will be here.
I don't expect anything super at the start (GTX580-like performance), but we are willing to spend time optimizing for it.

gdf

Profile Damaraland
Message 24085 - Posted: 22 Mar 2012 | 21:06:15 UTC
Last modified: 22 Mar 2012 | 21:08:09 UTC

$499.99 in USA :(
Amazon

Profile Retvari Zoltan
Message 24086 - Posted: 22 Mar 2012 | 21:25:31 UTC - in response to Message 24079.

Summarizing the reviews: gaming performance is as we expected; computing performance is still not known, since Folding isn't working on the GTX680 yet, and probably the GPUGrid client won't work either without some optimization.

zombie67 [MM]
Message 24087 - Posted: 22 Mar 2012 | 21:51:49 UTC - in response to Message 24079.

http://www.tomshardware.com/reviews/geforce-gtx-680-review-benchmark,3161-14.html

Moreover, Nvidia limits 64-bit double-precision math to 1/24 of single-precision, protecting its more compute-oriented cards from being displaced by purpose-built gamer boards. The result is that GeForce GTX 680 underperforms GeForce GTX 590, 580 and to a much direr degree, the three competing boards from AMD.

Does GPUGRID use 64-bit double-precision math?
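For a rough sense of what that 1/24 rate means in peak numbers, here's a small sketch using reference clocks and the GeForce GTX580's 1/8 DP rate; boost and real-world efficiency are ignored.

/* Peak FP64 throughput derived from peak FP32 and the DP:SP ratio.
 * GTX580: 512 shaders at 1.544 GHz, FP64 capped at 1/8 of FP32 on GeForce.
 * GTX680: 1536 shaders at 1.006 GHz base, FP64 capped at 1/24 of FP32.
 * Headline figures only. */
#include <stdio.h>

int main(void)
{
    double sp_580 = 2.0 * 512 * 1.544;    /* ~1581 GFLOPS FP32 */
    double sp_680 = 2.0 * 1536 * 1.006;   /* ~3090 GFLOPS FP32 */
    printf("GTX580 peak FP64: ~%.0f GFLOPS\n", sp_580 / 8.0);    /* ~198 */
    printf("GTX680 peak FP64: ~%.0f GFLOPS\n", sp_680 / 24.0);   /* ~129 */
    return 0;
}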
____________
Reno, NV
Team: SETI.USA

Profile GDF
Message 24088 - Posted: 22 Mar 2012 | 21:56:14 UTC - in response to Message 24087.

If somebody can run here on a gtx680, let us know.

thanks,
gdf

Profile GDF
Message 24089 - Posted: 22 Mar 2012 | 21:56:52 UTC - in response to Message 24087.

Almost nothing, this should not matter.
gdf

http://www.tomshardware.com/reviews/geforce-gtx-680-review-benchmark,3161-14.html

Moreover, Nvidia limits 64-bit double-precision math to 1/24 of single-precision, protecting its more compute-oriented cards from being displaced by purpose-built gamer boards. The result is that GeForce GTX 680 underperforms GeForce GTX 590, 580 and to a much direr degree, the three competing boards from AMD.

Does GPUGRID use 64-bit double-precision math?

JohnSheridan
Message 24091 - Posted: 22 Mar 2012 | 23:01:30 UTC - in response to Message 24088.

If somebody can run here on a gtx680, let us know.

thanks,
gdf


Should be getting an EVGA version tomorrow morning here in UK - cost me £405.

Already been asked to do some other Boinc tests first though.

ExtraTerrestrial Apes
Message 24092 - Posted: 23 Mar 2012 | 8:48:12 UTC - in response to Message 24091.

It would be nice if you also reported here what you find for other projects - thanks!

MrS
____________
Scanning for our furry friends since Jan 2002

Evil Penguin
Message 24093 - Posted: 23 Mar 2012 | 9:32:21 UTC

I wonder how this tweaked architecture will perform with these BOINC projects.
So far compute doesn't seem like Kepler's strong point.

Also, a little off topic...
But is there any progress being made on the AMD side of things?
I haven't heard a single peep about it for over a month.
If the developers still don't have a 7970, fine.
Please at least confirm as much...

Thanks.

Profile GDF
Message 24094 - Posted: 23 Mar 2012 | 9:39:13 UTC - in response to Message 24093.

We have a small one, good enough for testing. The code works on Windows with some bugs. We are assessing the performance.

gdf

Profile skgiven
Message 24096 - Posted: 23 Mar 2012 | 11:47:49 UTC - in response to Message 24094.

It would seem NVidia have stopped support for XP; there are no XP drivers for the GTX 680!
http://www.geforce.com/drivers/results/42929 I think I posted about this a few months ago.

Suggestions are that the 301.1 driver is needed (probably Win).
http://www.geforce.com/drivers/beta-legacy
A Linux 295.33 driver was also released on the 22nd, and NVidia's driver support for Linux is >>better than AMD's.

The card's fan profile is such that the fans don't make much noise, so it might get hot. This isn't a good situation. If we can't use it with WinXP, then we are looking at W7 (and presumably an 11% or more hit in performance)? If we use Linux we could be faced with cooling issues.

The 301.1 driver might work on a 2008 R2 server, but probably not on earlier servers.

Good luck,
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

JohnSheridan
Message 24099 - Posted: 23 Mar 2012 | 14:41:40 UTC

OK have now got my 680 and started by running some standard gpu tasks for Seti.

On the 580 it would take (on average) 3m 40s to do one task. On the 680 (at normal settings) it takes around 3m 10s.

The card I have is an EVGA so can be overclocked using their Precision X tool.

The first overclock was too aggressive and it was clearly causing the gpu tasks to error out; however, lowering the overclock resulted in gpu tasks now taking around 2m 50s each.

Going to try and get a GPUGRID task shortly to see how that goes.

JohnSheridan
Message 24101 - Posted: 23 Mar 2012 | 15:19:17 UTC

Tried to download and run 2 x GPUGRID tasks but both crashed out before completing the download saying acemd.win2382 had stopped responding.

So not sure what the problem is?

JohnSheridan
Message 24102 - Posted: 23 Mar 2012 | 15:23:05 UTC

Just reset the graphics card back to "normal", i.e. no overclock, and it still errors out - this time it did finish downloading but crashed out as soon as it started on the work unit, so it looks like this project does not yet work on the 680?

Richard Haselgrove
Message 24106 - Posted: 23 Mar 2012 | 16:14:11 UTC

stderr_txt:

# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 680"
# Clock rate: 0.71 GHz
# Total amount of global memory: -2147483648 bytes
# Number of multiprocessors: 8
# Number of cores: 64
SWAN : Module load result [.fastfill.cu.] [200]
SWAN: FATAL : Module load failed

Assertion failed: 0, file swanlib_nv.c, line 390

We couldn't be having a 31-bit overflow on that memory size, could we?
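Quite possibly. Here's a minimal sketch of the symptom, assuming the size simply gets squeezed through a signed 32-bit int somewhere in the device-query printout (the actual variable in acemd is unknown):

/* 2 GiB = 2147483648 bytes doesn't fit in a signed 32-bit int
 * (max 2147483647), so on a typical two's-complement machine it wraps
 * to -2147483648 - exactly the value printed in the stderr above. */
#include <stdio.h>
#include <stddef.h>

int main(void)
{
    size_t total_global_mem = 2147483648u;   /* what a 2 GB card reports */
    int truncated = (int)total_global_mem;   /* 31-bit overflow on conversion */
    printf("as size_t: %zu bytes\n", total_global_mem);
    printf("as int   : %d bytes\n", truncated);   /* -2147483648 */
    return 0;
}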

JohnSheridan
Message 24107 - Posted: 23 Mar 2012 | 16:22:55 UTC - in response to Message 24106.

stderr_txt:

# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 680"
# Clock rate: 0.71 GHz
# Total amount of global memory: -2147483648 bytes
# Number of multiprocessors: 8
# Number of cores: 64
SWAN : Module load result [.fastfill.cu.] [200]
SWAN: FATAL : Module load failed

Assertion failed: 0, file swanlib_nv.c, line 390

We couldn't be having a 31-bit overflow on that memory size, could we?


In English please?

Profile skgiven
Message 24109 - Posted: 23 Mar 2012 | 16:35:11 UTC - in response to Message 24107.
Last modified: 23 Mar 2012 | 17:48:19 UTC

The GPUGRID application doesn't support the GTX680 yet. We'll have test units soon and - if there are no problems - we'll update over the weekend or early next week.

MJH


English - GPUGrid's applications don't yet support the GTX680. MJH is working on an app and might get one ready soon; over the weekend or early next week.

PS. Your SETI runs show the card has some promise: ~16% faster at stock than a GTX580 (244W TDP), or 30% faster overclocked. Not sure that will be possible here, but you'll know fairly soon. Even if the GTX680 (195W TDP) just matches the GTX580, the performance/power gain might be noteworthy: ~125% of the GTX580's performance/Watt at equal speed, or ~145% at 116% of its performance.
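Roughly where those percentages come from, using the SETI run times reported above and board TDPs as a stand-in for real power draw, so treat the results as ballpark only:

/* Speed-up from the reported SETI run times, and performance per watt
 * relative to a GTX580 using board TDPs as a rough proxy for draw. */
#include <stdio.h>

int main(void)
{
    double t_580 = 220.0, t_680 = 190.0, t_680_oc = 170.0;   /* seconds per task */
    double tdp_580 = 244.0, tdp_680 = 195.0;                 /* watts */

    double speedup_stock = t_580 / t_680;      /* ~1.16 */
    double speedup_oc    = t_580 / t_680_oc;   /* ~1.29 */

    printf("speed-up: %.2fx stock, %.2fx overclocked\n", speedup_stock, speedup_oc);
    printf("perf/W at equal speed: %.0f%% of a GTX580\n", 100.0 * tdp_580 / tdp_680);
    printf("perf/W at 116%% speed : %.0f%% of a GTX580\n",
           100.0 * speedup_stock * tdp_580 / tdp_680);
    return 0;
}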
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

JohnSheridan
Message 24110 - Posted: 23 Mar 2012 | 16:36:15 UTC

Thanks for that simple to understand reply :)

I will suspend GPUGRID on that machine until the project does support the 680.

Profile skgiven
Message 24111 - Posted: 23 Mar 2012 | 16:39:03 UTC - in response to Message 24110.
Last modified: 23 Mar 2012 | 16:41:45 UTC

Good idea; no point returning lots of failed tasks!

I expect you will see an announcement when there is a working/beta app.

Thanks,
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

JAMES DORISIO
Message 24114 - Posted: 23 Mar 2012 | 18:12:00 UTC

HPCwire - NVIDIA Launches first Kepler GPUs at gamers; HPC version waiting in the wings.
http://www.hpcwire.com/hpcwire/2012-03-22/nvidia_launches_first_kepler_gpus_at_gamers_hpc_version_waiting_in_the_wings.html?featured=top

matlock
Send message
Joined: 12 Dec 11
Posts: 34
Credit: 86,423,547
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 24115 - Posted: 23 Mar 2012 | 18:59:04 UTC - in response to Message 24096.

Why would there be cooling issues in Linux? I keep my 560Ti448Core very cool by manually setting the fan speed in the nvidia settings application, after setting "Coolbits" to "5" in the xorg.conf.


It would seem NVidia have stopped support for XP; there are no XP drivers for the GTX 680!
http://www.geforce.com/drivers/results/42929 I think I posted about this a few months ago.

Suggestions are that the 301.1 driver is needed (probably Win).
http://www.geforce.com/drivers/beta-legacy
A Linux 295.33 driver was also released on the 22nd, and NVidia's driver support for Linux is much better than AMD's.

The card's fan profile is such that the fans don't make much noise, so it might get hot. That isn't a good situation. If we can't use WinXP, then we are looking at W7 (and presumably an 11% or greater hit in performance)? If we use Linux we could be faced with cooling issues.

The 301.1 driver might work on a 2008R2 server, but probably not on earlier servers.

Good luck,

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24116 - Posted: 23 Mar 2012 | 19:40:10 UTC - in response to Message 24115.

Well, if we make the rather speculative presumption that a GF680 would work with Coolbits straight out of the box, then yes we can cool a card on Linux, but AFAIK it only works for one GPU and not for overclocking/downclocking. I think Coolbits was more useful in the distant past, but perhaps it will still work for GF600's.
Anyway, when the manufacturer variants appear, with better default cooling profiles, GPU temps won't be something to worry about on any OS.
Cheers for the tip/recap; it's been ~1 year since I put it in an FAQ.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

matlock
Send message
Joined: 12 Dec 11
Posts: 34
Credit: 86,423,547
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 24117 - Posted: 23 Mar 2012 | 21:14:15 UTC - in response to Message 24116.

It appears there may be another usage of the term "Coolbits" (unfortunately) for some old software. The one I was referring to is part of the nvidia Linux driver, and is set within the Device section of the xorg.conf.

http://en.gentoo-wiki.com/wiki/Nvidia#Manual_Fan_Control_for_nVIDIA_Settings

It has worked for all of my nvidia GPUs so far.
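For reference, the relevant part of the Device section looks roughly like this (the Identifier is just an example; yours will differ). After restarting X, the fan controls appear in nvidia-settings:

Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    Option     "Coolbits" "5"
EndSection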

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24118 - Posted: 23 Mar 2012 | 23:06:20 UTC - in response to Message 24117.

Thanks Mowskwoz, we have taken this thread a bit off target, so I might move our fan-control-on-Linux posts to a Linux thread later. I will look into NVidia CoolBits again.

I see Zotac intend to release a GTX 680 clocked at 2GHz!
An EVGA card has already been OC'ed to 1.8GHz, so the market should see some sweet bespoke GTX680s in the future.

So much for PCIE 3.0

I see NVidia are listing a GT 620 in their drivers section...
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile oldDirty
Avatar
Send message
Joined: 17 Jan 09
Posts: 22
Credit: 3,805,080
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwatwatwatwat
Message 24120 - Posted: 24 Mar 2012 | 0:21:29 UTC

Wow, this 680 monster seems to run with the handbrake on: poor performance in OpenCL, worse than the 580 and of course the HD79x0.
NVidia want to protect their Quadro/Tesla cards.
Or have I got it wrong?
http://www.tomshardware.com/reviews/geforce-gtx-680-review-benchmark,3161-15.html
and
http://www.tomshardware.com/reviews/geforce-gtx-680-review-benchmark,3161-14.html
____________

JLConawayII
Send message
Joined: 31 May 10
Posts: 48
Credit: 28,893,779
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 24121 - Posted: 24 Mar 2012 | 0:49:14 UTC

No, that seems to be the case. OpenCL performance is poor at best, although in the single non-OpenCL bench I saw it performed decently. Not great, but at least better than the 580. Double precision performance is abysmal; it looks like ATI will be holding onto that crown for the foreseeable future. I'll be curious to see exactly what these projects can get out of the card, but so far it's not all that inspiring on the compute end of things.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24129 - Posted: 24 Mar 2012 | 11:18:59 UTC

For the 1.8 GHz, LN2 was necessary. That's extreme and usually yields clock speeds ~25% higher than achievable with water cooling. Reportedly the voltage was only 1.2 V, which sounds unbelievable.

2 GHz is a far stretch beyond even that. I doubt it's possible even with triple-stage phase-change cooling (nowhere near as cold as LN2, but sustainable). And the article says "probably only for the Chinese market". Hello? If you went to all the trouble of producing such a monster you'd want to sell it on eBay, worldwide. You'd earn thousands of bucks a piece.

And a blanket claim like "poor OpenCL performance" can't really be made; it all depends on the software you're running. And mind you, Kepler offloads some scheduling work to the compiler rather than doing it in hardware. This will take some time to mature.

Anyway, as others have said, double precision performance is downright ugly. Don't buy these for Milkyway.

MrS
____________
Scanning for our furry friends since Jan 2002

Evil Penguin
Avatar
Send message
Joined: 15 Jan 10
Posts: 42
Credit: 18,255,462
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 24153 - Posted: 26 Mar 2012 | 2:27:57 UTC - in response to Message 24094.

We have a small one, good enough for testing. The code works on Windows with some bugs. We are assessing the performance.

gdf

That's pretty good news.
I'm glad that AMD managed to put out three different GCN-based cores.
The cheaper cards still have most if not all of the compute capabilities of the HD 7970.

Hopefully there will be a testing app soon and I'll be one of the first in line. ;)

Palamedes
Send message
Joined: 19 Mar 11
Posts: 30
Credit: 109,550,770
RAC: 0
Level
Cys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24155 - Posted: 26 Mar 2012 | 17:19:15 UTC

Okay, so this thread has been all over the place... can someone sum up?

Is the 680 good or bad?

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24156 - Posted: 26 Mar 2012 | 17:45:31 UTC

They're testing today.

Profile Carlesa25
Avatar
Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24172 - Posted: 28 Mar 2012 | 19:54:16 UTC - in response to Message 24156.
Last modified: 28 Mar 2012 | 20:22:39 UTC

Hello: Summarising what I've read in several analyses of the GTX680's compute performance:

Single Precision............. +50% to +80%
Double Precision............. -30% to -73%



" Because it’s based around double precision math the GTX 680 does rather poorly here, but the surprising bit is that it did so to a larger degree than we’d expect. The GTX 680’s FP64 performance is 1/24th its FP32 performance, compared to 1/8th on GTX 580 and 1/12th on GTX 560 Ti. Still, our expectation would be that performance would at least hold constant relative to the GTX 560 Ti, given that the GTX 680 has more than double the compute performance to offset the larger FP64 gap "

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24180 - Posted: 29 Mar 2012 | 21:10:42 UTC - in response to Message 24172.

Hey where's that from? Is there more of the good stuff? Did Anandtech update their launch article?

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24185 - Posted: 30 Mar 2012 | 12:18:57 UTC - in response to Message 24180.

Yes, looks like Ryan added some more info to the article. He tends to do this - it's good reporting, makes their reviews worth revisiting.
http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/17

Any app requiring doubles is likely to struggle, as seen with PG's.

Gianni said that the GTX 680 is as fast as a GTX580 on a CUDA 4.2 app here.
When released the new CUDA4.2 app is also supposed to be 15% faster for Fermi cards, which is more important at this stage.
The app is still designed for Fermi, but can't be redesigned for the GTX680 until the dev tools are less buggy.
In the long run it's likely that there will be several app improvement steps for the GF600.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile dskagcommunity
Avatar
Send message
Joined: 28 Apr 11
Posts: 456
Credit: 817,865,789
RAC: 0
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24186 - Posted: 30 Mar 2012 | 13:07:20 UTC
Last modified: 30 Mar 2012 | 13:09:54 UTC

Why does NVidia cap its 6xx series in this way? If they think it would kill their own Tesla series cards... why do they still sell them when they perform that badly in comparison to the modern desktop cards??? It would be much cheaper for us, and NVidia would sell many more of their desktop cards for grid computing... or they could put 8 uncut GTX680 chips on one Tesla card for the price a Tesla costs.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



frankhagen
Send message
Joined: 18 Sep 08
Posts: 65
Credit: 3,037,414
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 24187 - Posted: 30 Mar 2012 | 14:10:21 UTC - in response to Message 24186.

Why does NVidia cap its 6xx series in this way? If they think it would kill their own Tesla series cards...


Plain and simple?

They wanted gaming performance and sacrificed compute capabilities that aren't needed there.

Why do they still sell them when they perform that badly in comparison to the modern desktop cards??? It would be much cheaper for us, and NVidia would sell many more of their desktop cards for grid computing... or they could put 8 uncut GTX680 chips on one Tesla card for the price a Tesla costs.


GK-104 is not cut down!

It's quite simply a mostly pure 32-bit design.

I bet they will come up with something completely different for the Tesla-class Kepler cards.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24188 - Posted: 30 Mar 2012 | 14:14:38 UTC - in response to Message 24186.
Last modified: 30 Mar 2012 | 14:22:08 UTC

Some of us expected this divergence in the GeForce.
GK104 is a Gaming Card, and we will see a Compute card (GK110 or whatever) probably towards the end of the year (maybe Aug but more likely Dec).

Although it's not what some wanted, it's still a good card; matches a GTX580 but uses less power (making it about 25% more efficient). GPUGrid does not rely on OpenCL or FP64, so these weaknesses are not an issue here. Stripping down FP64 and OpenCL functionality helps efficiency on games and probably CUDA to some extent.

With app development, performance will likely increase. Even a readily achievable 10% improvement would mean a theoretical 37% performance per Watt improvement over the GTX580. If the performance can be improved by 20% over the GTX580 then the GTX680 would be 50% more efficient here. There is a good chance this will be attained, but when is down to dev tools.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile dskagcommunity
Avatar
Send message
Joined: 28 Apr 11
Posts: 456
Credit: 817,865,789
RAC: 0
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24189 - Posted: 30 Mar 2012 | 15:32:26 UTC
Last modified: 30 Mar 2012 | 15:32:59 UTC

OK, I read both your answers and understood; I had only read somewhere that it was cut down in performance so it wouldn't match their Tesla. Seems that was a wrong article then ^^ (don't ask where I read it, I don't remember anymore). So I believe the GTX680 is still a good card then ;)
____________
DSKAG Austria Research Team: http://www.research.dskag.at



frankhagen
Send message
Joined: 18 Sep 08
Posts: 65
Credit: 3,037,414
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 24191 - Posted: 30 Mar 2012 | 17:52:16 UTC - in response to Message 24189.
Last modified: 30 Mar 2012 | 17:53:46 UTC

So I believe the GTX680 is still a good card then ;)


Well, it is - if you know what you're getting.

Taken from the CUDA C Programming Guide in the CUDA 4.2.6 beta:

Ops per clock cycle, per SM (CC 2.0) vs per SMX (CC 3.0):

                                 CC 2.0   CC 3.0
  32-bit floating-point            32       192
  64-bit floating-point            16         8
  32-bit integer add               32       168
  32-bit integer shift, compare    16         8
  logical operations               32       136
  32-bit integer                   16        32

.....

+ the optimal warp size seems to have moved up from 32 to 64 now!

It's a totally different design, and the apps need to be optimized to take advantage of that.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 24192 - Posted: 30 Mar 2012 | 20:56:11 UTC

Another bit to add regarding FP64 performance: apparently GK104 uses 8 dedicated hardware units for this, in addition to the regular 192 shaders per SMX. So they actually spent more transistors to provide a little FP64 capability (for development or sparse usage).

MrS
____________
Scanning for our furry friends since Jan 2002


Message boards : Graphics cards (GPUs) : gtx680
