Graphics Cards

Nvidia Preparing Itself For Fermi, Releases CUDA Toolkit 3.0 Beta


Nvidia releases a preview of its toolkit update, shedding some light on new functionality.

The CUDA Toolkit 3.0 Beta is now available to GPU Computing registered
developers.

Highlights for this release include:

* CUDA Driver / Runtime Buffer Interoperability, which allows
applications using the CUDA Driver API to also use libraries
implemented using the CUDA C Runtime.

* A new, separate version of the CUDA C Runtime (CUDART) for debugging
in emulation-mode.

* C++ Class Inheritance and Template Inheritance support for increased
programmer productivity

* A new unified interoperability API for Direct3D and OpenGL, with
support for:
  * OpenGL texture interop
  * Direct3D 11 interop support

* cuda-gdb hardware debugging support for applications that use the CUDA
Driver API

* New CUDA Memory Checker reports misalignment and out of bounds errors,
available as a debugging mode within cuda-gdb and also as a
stand-alone utility.

* CUDA Toolkit libraries are now versioned, enabling applications to
require a specific version, support multiple versions explicitly, etc.

* CUDA C/C++ kernels are now compiled to standard ELF format

* Support for all the OpenCL features in the latest R195.39 beta driver:
  * Double Precision
  * OpenGL Interoperability, for interactive high performance
    visualization
  * Query for Compute Capability, so you can target optimizations for
    GPU architectures (cl_nv_device_attribute_query)
  * Ability to control compiler optimization settings, etc. via support
    for NVIDIA Compiler Flags (cl_nv_compiler_options)
  * OpenCL Images support, for better/faster image filtering
  * 32-bit Atomics for fast, convenient data manipulation
  * Byte Addressable Stores, for faster video/image processing and
    compression algorithms
  * Support for the latest OpenCL spec revision 48 and latest official
    Khronos OpenCL headers as of 11/1/2009

* Early support for the Fermi architecture, including:
  * Native 64-bit GPU support
  * Multiple Copy Engine support
  * ECC reporting
  * Concurrent Kernel Execution
  * Fermi HW debugging support in cuda-gdb

Well, Nvidia should have some samples up and running just fine, or the software wouldn't be this advanced two whole months (at least) before the release. Despite the faked "Fermi" boards, the actual goal of having a very fast HPC accelerator seems well underway.
Since I'm not a C++ developer, and I can link C kernels to C++ code either way, the most exciting features for me are definitely the ones provided by Fermi. If you care about PhysX, you should also be curious about Concurrent Kernel Execution, as I believe it will bring performance gains with the new chip. How? The current cards can't process more than one kernel at a time, and PhysX is just that: a kernel that runs on the GPU and can't be run at the same time as graphics duties. If you aren't using a separate card, you'll likely notice a drop in graphics performance, and a bigger one than the extra processing load alone would account for, because context switching adds its own overhead - an overhead that has also seen improvement with "Fermi", where it's 20x faster.

Moore's law states that the number of transistors that can be placed inexpensively on a chip doubles every two years. Lately it's been every 18 months, as GPU manufacturers have come to prove, and it was even less than that at the start of this decade.
With that in mind, I'll tell you the 7800GTX was released in June 2005, the 8800GTX in November 2006 and the GTX 280 in... wait for it... June 2008. Eighteen months in between, with the 8800GTX a month sooner and the GTX 280 a month later - not bad execution. The same schedule would put "Fermi" in January 2010, a date that is still manageable by all means. Being the very advanced HPC multicore processor that it is, it's not something that can be manufactured sooner than what has been the norm for the last 4 years. Nvidia's problem right now is that AMD managed to beat them to market by three months - at least.
While the 5870 missed the optimal die size for a $300 GPU that the 4870 set before it, the lack of competition from Nvidia made it very appealing. Despite what I might think about the hastily executed, memory bandwidth starved card, it's still the fastest single GPU card around, and being the first DX11 GPU to be sold didn't bring more serious consequences.
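
The 18-month cadence argued above is easy to check. The sketch below is only a rough illustration: it uses the launch months quoted in the text, pins each date to the first of the month (an assumption, since only months are given), and the helper names `months_between` and `add_months` are made up for this example.

```python
from datetime import date

# Launch months as quoted in the text; the day is pinned to the 1st (assumption).
launches = [
    ("7800GTX", date(2005, 6, 1)),
    ("8800GTX", date(2006, 11, 1)),
    ("GTX 280", date(2008, 6, 1)),
]

def months_between(start, end):
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def add_months(start, months):
    """Return the first day of the month that is `months` after start."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)

# Gap in months between each pair of consecutive launches.
gaps = [
    months_between(launches[i][1], launches[i + 1][1])
    for i in range(len(launches) - 1)
]
average_gap = sum(gaps) / len(gaps)

# Project the next launch by adding the average gap to the latest one.
projected = add_months(launches[-1][1], round(average_gap))

print("gaps in months:", gaps)
print("average gap:", average_gap)
print("projected next launch:", projected.strftime("%Y-%m"))
```

With those months the gaps come out at 17 and 19 months, and adding the 18-month average to June 2008 gives December 2009, within a month of the early 2010 estimate.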

Hastiness implied some sloppiness during the design of the new card, but it certainly is paying off for AMD. If AMD hadn't released the new cards, I would've said Nvidia had everything on track.

Processors

VIA Releases The Nano Series 3000


Yesterday I talked about VIA. Today, they are back in the news by announcing an updated Nano processor which, sadly, is not the anticipated dual core version of the chip.

The new chip is part of the new 3000 series of Nano processors, and VIA is touting some impressive idle power numbers, with clock speeds remaining about the same.


100mW of idle power is a very good thing. The problem is that VIA doesn't quite seem to want to release load power figures. Since the 2000 series are quite power hungry, especially the 1.8GHz L model at 25W, and the PR guys want to keep those figures under wraps, it's probably safe to think that load power wasn't improved. The U2000 models are quite decent though: at 5-8W TDP, performance is still better than the Atom's and you get the x86-64 support that I love so much.

Performance has increased in some applications, although VIA doesn't mention how - probably through some architectural improvements.


The new stepping of the "Isaiah" also supports SSE4 and has support for hardware virtualization, already present in some steppings of the Nano 2000.
The Nano is now an even better alternative to the Atom than before, and the graphics capabilities of the VX800 or VX855 are around double those of Intel's GMA 950 - which is still nothing spectacular.
If you're in the market for netbooks or ultraportables, you might want to take a look at these new processors.

Graphics Cards

Sapphire 5870 Vapor-X 1GiB


Sapphire unleashes an overclocked, better cooled and higher priced 5870.

The new card comes clocked at 870MHz for the core and 5GHz for the RAM, versus 850MHz and 4.8GHz in the standard model. It also comes with a better cooler, one that Sapphire says will shave 15ºC off the load temperature of typical 5870 coolers. It will also heat up your case, but it's a decent compromise: if you had a look at the 5870 article, you know that the card desperately needed a better cooler.
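
To put the factory overclock in perspective, here is a quick sketch of the relative clock increases, using only the figures quoted above. The helper name `percent_increase` is made up for this example, and a clock bump does not translate one-to-one into frame rates.

```python
def percent_increase(base, new):
    """Relative increase of new over base, in percent."""
    return (new - base) / base * 100.0

# Clocks quoted in the text: reference 5870 vs. Sapphire Vapor-X.
core_gain = percent_increase(850, 870)    # core clock, MHz
memory_gain = percent_increase(4.8, 5.0)  # effective memory clock, GHz

print(f"core clock:   +{core_gain:.1f}%")
print(f"memory clock: +{memory_gain:.1f}%")
```

That works out to roughly 2.4% on the core and 4.2% on the memory, so the better cooler is arguably the bigger part of what the premium buys.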


The PCB is still the same, so no surprises here. One of the first manufacturers to come out with a custom design will probably be XFX but we have to wait and see. The PCB is already quite good, featuring solid capacitors and a digital VRM.



A design reminiscent of the good old GeForce 7900GTX, with a big fan to help bring the noise down while ensuring a very good amount of airflow.


Not much air will go out here, especially as the cooler's shield doesn't force any air out of the exhaust.

The card is already available at retail for $409, a premium you pay for better cooling and higher clocks. My choice is the Radeon HD 5850, but the fastest single GPU card is still the 5870. Take a look at Newegg for availability.

Industry

Rumor Says Nvidia To Enter x86 Market


Nvidia looking for some x86 love?

"We believe Nvidia could enter the x86 CPU business," said analyst Doug Freedman of Broadpoint AmTech. "Nvidia could become a supplier of x86 CPUs by necessity to preserve both GPU and chipset revenue."

Ok, so, that much we already knew: it could be a good way to go for Nvidia. Intel has locked Nvidia out of its platforms and will be doing the same for graphics if "Larrabee" ends up being competitive in the long run.
One of the paths Nvidia might take to protect its business is building something that can go together with in-house built chipsets - nothing really surprising nowadays.

I had already mentioned that Nvidia might want to buy VIA soon, given that "Pineview" and the Ion 2 likely won't be going hand in hand like today's Atom and Ion do. EETimes, however, is reporting that Nvidia may very well build the CPU from scratch, by putting to work the engineers it grabbed from Transmeta. Transmeta previously built a VLIW processor that could run x86 code by using a translation layer; it was slow and not as power efficient as the company had hoped.
But there's also a pretty good explanation for having engineers who used to work at a CPU company: they have built a new GPU, one that looks a lot like a CPU - Fermi. For me, that is enough to put the Transmeta rumor to sleep. If they had not built "Fermi" this way, this would be an entirely different scenario.

Now consider this: building a new CPU takes time, too much time. So let's take a look at the other option: by grabbing VIA, Nvidia would have access to a fully functional out-of-order x86-64 CPU, one which should also be available in a dual core variant any time now.
VIA, on its own, is way behind the competition right now. While the Nano is a decent CPU, it's too hot for most netbooks and comes paired with a chipset that's not exactly up to par: the VX855. While it may be good for projects like the OLPC, the VX855 only supports DDR2. Today, that is a big issue. The chipset is already underpowered, and memory prices are already the same for DDR2 and DDR3, with the trend pointing to DDR3 becoming cheaper. Which OEM would want a slower chipset that only takes slower and higher priced memory? Now bring in a dual core Nano and the refresh of the GeForce 9400M chipset used in Ion and you've got yourself an interesting piece of mobile hardware, plus a much stronger company behind it.

I don't know about you, but I certainly would appreciate still having Nvidia around for a "couple more" years, while hoping that Intel doesn't end up with the foothold they seem to be aiming at anytime soon. An Intel that big on all fronts would almost certainly mean bad things for everyone but stockholders.
I really don't buy the "HPC and Tegra will save our business" motto that Nvidia has currently adopted - at least not in the short term for Tegra - so this may very well be their only way to keep the company at its current size.