Slashdot
Sun To Release 8-Core Niagara 2 Processor
Posted by
CowboyNeal
on Fri Aug 03, 2007 03:11 AM
from the screw-everything-we're-going-eight-cores dept.
An anonymous reader writes "Sun Microsystems is set to announce its eight-core Niagara 2 processor next week. Each core supports eight threads, so the chip handles 64 simultaneous threads, making it the centerpiece of Sun's "Throughput Computing" effort. Along with having more cores than the quads from Intel and AMD, the Niagara 2 has dual, on-chip 10G Ethernet ports with cryptographic capability. Sun doesn't get much processor press, because the chips are used only in its own CoolThreads servers, but Niagara 2 will probably be the fastest processor out there when it's released, other than perhaps the also little-known 4-GHz IBM Power 6."
This discussion has been archived.
No new comments can be posted.
214 comments
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Trust me... (Score:4, Insightful)
(http://slashdot.org/ | Last Journal: Saturday November 03, @04:58AM)
Re:yes but ... (Score:4, Funny)
(http://utnapistim.blogspot.com/)
Re:Trust me... (Score:5, Informative)
(http://www.ki.se/ | Last Journal: Tuesday August 28, @07:06AM)
http://www.opensparc.net/ [opensparc.net]
They are openly discussing making the Niagara 2 available as open source as well, but note that there are some roadblocks such as the US government's restrictions [opensparc.net] on crypto technology.
Re:Trust me... (Score:5, Informative)
(http://theravensnest.org/ | Last Journal: Sunday October 07, @07:05AM)
- The T2 is mainly focussed on integer ops with only one floating point pipeline per core. A GPU typically is close to 100% floating point pipelines, and doesn't bother with integer arithmetic.
- The T2 uses multiple contexts to hide memory latency, mostly caused by incorrectly predicted branches. A GPU typically doesn't bother much with branch prediction, since it runs code that is very light on conditional branches (on average, branches happen every 7 ops in general purpose code. In GPU code, they happen every few hundred).
- GPUs usually focus on 4-way vector instructions, since most of their data is of this form (RGBA colours, XYZW vertexes). The T2 only has scalar instructions.
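The second point, using multiple hardware contexts to hide memory latency, can be illustrated with a toy model. This is a minimal sketch, assuming each thread alternates between a fixed run of compute cycles and a fixed memory stall; the function name and the cycle counts are made up for illustration and are not T2 figures.

```python
def core_utilisation(threads: int, compute_cycles: int, stall_cycles: int) -> float:
    """Fraction of cycles in which a core has a ready thread to issue from.

    Toy model: every thread runs for compute_cycles, then waits
    stall_cycles on memory. Other threads can issue during that wait,
    so utilisation grows linearly with the thread count until it hits 1.0.
    """
    if threads < 1 or compute_cycles < 1 or stall_cycles < 0:
        raise ValueError("need threads >= 1, compute_cycles >= 1, stall_cycles >= 0")
    per_thread = compute_cycles / (compute_cycles + stall_cycles)
    return min(1.0, threads * per_thread)


# Hypothetical numbers: 10 cycles of work, then a 70-cycle memory stall.
for n in (1, 2, 4, 8):
    print(n, core_utilisation(n, compute_cycles=10, stall_cycles=70))
```

With these assumed numbers a single thread keeps the core busy one cycle in eight, and eight threads are enough to cover the stall completely.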
I posted in my journal recently suggesting that it would be easier to produce a modern GPU than an older card, since modern GPUs have much less application-specific logic and do more in software, relying on just having lots of cores / pipelines to give speed.
Good floating point too (Score:5, Interesting)
(http://ian.testers.homelinux.net/ | Last Journal: Sunday March 18 2007, @01:47PM)
This processor will also have a floating-point unit for each core, unlike the UltraSPARC T1 (Niagara) which only had one shared amongst all 8 cores. This should make it much more suitable than the T1 for a wide variety of applications. The T1 did great on multithreaded server-type tasks (e.g. web, email, database) but would have been pretty hopeless for anything doing more than a bare minimum of FP work.
Yes, but.. (Score:2, Funny)
Re:Yes, but.. (Score:4, Funny)
It has a Vista emulation mode - move the power switch to OFF and you get something just as useful but more stable.
Re:Yes, but.. (Score:4, Funny)
No, Vista requires 640 cores, which ought to be enough for anybody.
Smokin'... (Score:1)
(http://www.solutium.co.uk/ | Last Journal: Friday June 08, @03:29AM)
Interesting (Score:5, Interesting)
(Last Journal: Tuesday October 30, @04:48AM)
I'd like to see some benchmarks, and more technical specs, on these babies.
Will it be water-cooled? (nt) (Score:3, Funny)
Regurgitating "Quad" market speak (Score:5, Informative)
quad is a quad and I want a cheap 8-way desktop (Score:4, Interesting)
That said I've always wanted to get my hands on some of these new multicore UltraSparcs. I think they have a lot of potential, and the new ones seem extremely powerful.
Now if only Sun would put the low end one in a Mac mini form factor and sell it as a Java developer's kit then maybe I could play with one. The low end Sun Fires are something I could almost afford, but I don't really want to keep a 1U on my desk just to try out the technology.
I think the big 64-bit address space and the ability to run lots of threads fit well with Sun's Java. Not that I am a Java developer, I just think it's a good match, and that seems to be why people were using the older CoolThreads systems: enterprise Java.
Re:quad is a quad and I want a cheap 8-way desktop (Score:5, Insightful)
I agree with you on this point.
I don't agree with you here. What matters to the customer are costs and performance. They shouldn't have to care about how the package works, as long as it works correctly.
From Intel's perspective, they had two options:
From the customer's perspective, those two options correspond to:
What do you think Intel and their customers prefer?
Re:quad is a quad and I want a cheap 8-way desktop (Score:4, Insightful)
Thus it makes it a worthwhile design to go with. I could see it continuing too. Maybe their next gen chips are 4 cores on a single unit which goes mainstream, and then an 8 core 2 unit job for higher end stuff. At some point there may be too many cores per unit to do without bus contention, but then maybe not since the speed of the bus keeps getting increased. Also I could see OSes being made aware of this, if it continues, and knowing that each X number of processors is a unit and you can shuffle all you like within that, but shuffling across units incurs more penalties and thus isn't done unless it has to be. So if a process had 4 threads, and a unit was 4 cores, it'd make sure all the threads were running on the same unit.
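The scheduling idea in the last few sentences, keeping a process's threads inside one multi-core unit where possible, might look something like the sketch below. It is a minimal greedy placement under stated assumptions: `place_threads`, the unit/core numbering and the example workload are all hypothetical, and a real OS scheduler does far more than this.

```python
def place_threads(processes: dict, units: int, cores_per_unit: int) -> dict:
    """Give each thread a (unit, core) slot, keeping a process on one unit if it fits."""
    # Free core numbers per unit (hypothetical numbering, 0-based).
    free = {u: list(range(cores_per_unit)) for u in range(units)}
    placement = {}
    # Place the largest processes first, since they are the hardest to fit.
    for name, nthreads in sorted(processes.items(), key=lambda kv: -kv[1]):
        # Units that can still hold the whole process.
        fitting = [u for u in free if len(free[u]) >= nthreads]
        if fitting:
            # Best fit: the fullest unit that still has room.
            unit = min(fitting, key=lambda u: len(free[u]))
            slots = [(unit, free[unit].pop(0)) for _ in range(nthreads)]
        else:
            # No single unit has room, so spill across units and
            # accept the cross-unit penalty.
            slots = []
            for u in free:
                while free[u] and len(slots) < nthreads:
                    slots.append((u, free[u].pop(0)))
            if len(slots) < nthreads:
                raise ValueError(f"not enough free cores for {name}")
        placement[name] = slots
    return placement


# Two 4-core units, as in the "8 core 2 unit" example above.
demo = place_threads({"db": 4, "web": 2, "cron": 2}, units=2, cores_per_unit=4)
for name, slots in demo.items():
    print(name, slots)
```

In this example the 4-thread process gets a unit to itself and the two smaller ones share the other unit, so no process straddles the shared bus.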
Regardless, you are correct that at this point it is an excellent idea. Doesn't matter if it is the most technically correct solution or not, what matters is that it works well and is cheap.
We make concessions like that all the time in the computer world. Memory would be a good example. For a good while on desktops, memory, the FSB, and the processor ran at the same speed. You had a 30MHz 386, you were running 30MHz memory. Multipliers weren't a thing you worried about. Then, we started to run into limits of what memory could do. We could scale processors faster than RAM, or at least faster than RAM could be done cheaply. Thus the start of clock multiplied chips. This works, but at some point the memory is just too slow. So then we start getting into tricks like DDR RAM, which transfers twice per clock cycle, and interleaving RAM, so that the processor has two channels to get faster access and so on. Currently you can have a CPU at one speed, an FSB at another, and memory at a third. Right now I've got a 2.66GHz CPU, a "1333MHz" FSB (it's not really 1333MHz, FSBs are quad pumped so it really runs at 333MHz) and "667MHz" RAM (again not really, it's DDR so the actual memory clock is 166MHz, bus clock is 333MHz, it just does 667 million data transfers per second hence the rate) and this is not an uncommon setup.
None of this is an ideal setup. Ideally, the FSB would run at the same speed as the processor and so would the RAM. This would lead to the processor having almost no wait time for memory data and very little need for trickery to try and prefetch data and such. However alas, if it were possible at all it would be too expensive to do. Thus we have this somewhat hacked solution. However in reality it matters little, though a hack it may be, it works real well. It has given us memory that can get the data to the CPU in a timely fashion and doesn't break the bank.
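The clock arithmetic in this comment can be checked in a few lines. This is a minimal sketch using the rounded marketing figures quoted above, which is why it uses integer division; the helper name `base_clock_mhz` is made up here, and the halving of the RAM bus clock simply follows the 166MHz/333MHz figures in the comment.

```python
def base_clock_mhz(effective_rate_mhz: int, transfers_per_clock: int) -> int:
    """Underlying clock of a bus that moves several transfers per cycle.

    Integer division because the advertised rates are rounded figures.
    """
    return effective_rate_mhz // transfers_per_clock


fsb_clock = base_clock_mhz(1333, 4)       # quad-pumped FSB
ram_bus_clock = base_clock_mhz(667, 2)    # DDR: two transfers per bus clock
ram_core_clock = ram_bus_clock // 2       # memory clock at half the bus clock, per the comment
cpu_multiplier = round(2660 / fsb_clock)  # 2.66GHz CPU on that FSB clock

print("FSB clock (MHz):", fsb_clock)
print("RAM bus clock (MHz):", ram_bus_clock)
print("RAM core clock (MHz):", ram_core_clock)
print("CPU multiplier:", cpu_multiplier)
```

So the three "different" speeds in that machine all derive from the same roughly 333MHz base clock.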
Oh no, not again.... (Score:5, Insightful)
The quads from Intel provide four physical cores per socket. That is the definition of a quad in this context. The exact workings of how many bits of silicon there are, how they talk to each other and to the rest of the system is, to 99.999% of users and computer buyers, background fluff.
This was the same as when Intel put two single-core chips into a package to release a 'dual core'. Lots of people like you jumped up and down and pointed out it wasn't *real* dual core, and how the FSB issue would cripple performance. Amazingly, it wasn't the case - they sold in droves, and real-world performance was good enough to carry Intel through to the 'true' dual core, the Core 2 Duo.
If the competition had anything out that was the same cost and performed significantly better than the 'fake' quad cores, you would have an argument. But they haven't and you don't. Bear in mind I'm talking about the huge x86/x64 market, not the relatively low volume non-x86 server market.
What Intel did back then and again now is perfectly sensible. They have millions of high yield, robust dual core chips being churned out, and they have built into the infrastructure the ability to put two into a package, lower the speed a bit to drop the per-core heat output, and sell reasonably priced (now) quad core chips. When the drop to 45nm happens, they will release their 'real' quad cores, and pretty quickly put two of those into a package to start selling oct-core (whatever we're going to call them). And so it goes.
What's the alternative? Not sell quads until 45nm comes out? Not working out too well for AMD, is it? I've asked the question before here and on realworldtech.com - at what point will the FSB problem actually become a painful problem for the Intel chips? Well, not yet (4 core) is the answer, despite dire predictions from the AMD camp for years. My guess is that, shock of shocks, Intel have actually thought it through - and that's why CSI is coming. When the number of cores gets to the point where FSB will actually hurt performance relative to the AMD architecture, that's when CSI will kick in. Maybe at 8 cores, maybe at 16.
What, you don't need quad core yet? Fine, stop your bitching and choose what's right for you. Vive la difference, and 3 cheers for a market that gives us the choice.
on-chip 10G Ethernet ports (Score:2)
Re:on-chip 10G Ethernet ports (Score:5, Informative)
In the future, it is likely that all the wired buses in your motherboard will be replaced by an internal Ethernet-like network. We are already seeing a trend towards simpler and faster interconnects such as SATA. The next step is to use Ethernet-style connections for every chip-to-chip link, and within the chips themselves too. If this seems unlikely, consider that your PC's memory bus already is basically a network connection. The device at one end (CPU) is in a different clock domain to the device at the other (memory). Data is sent in packets (called bursts) to offset the latency of setting up a transfer.
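The point about bursts offsetting setup latency can be put in numbers. This is a minimal sketch, assuming a fixed setup cost per transfer followed by one data word per cycle; the function name and the cycle counts are illustrative, not figures for any real memory bus.

```python
def bus_efficiency(setup_cycles: int, burst_length: int) -> float:
    """Share of bus cycles that carry data when each burst pays one setup cost."""
    if setup_cycles < 0 or burst_length < 1:
        raise ValueError("need setup_cycles >= 0 and burst_length >= 1")
    return burst_length / (setup_cycles + burst_length)


# Hypothetical 8-cycle setup cost, with increasingly long bursts.
for burst in (1, 4, 8, 16):
    print(burst, round(bus_efficiency(setup_cycles=8, burst_length=burst), 2))
```

The longer the burst, the more of the fixed setup cost is spread over useful data, which is the same trade-off a packet network makes between header overhead and payload size.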
Freudian Processor? (Score:3, Funny)
Re:Freudian Processor? (Score:5, Funny)
(http://ettlz.blogspot.com/ | Last Journal: Sunday February 12 2006, @06:53PM)
Not going to be the fastest, but... (Score:5, Informative)
(http://stephen.evilcoder.com/)
I think the Niagara is a pretty solid design, but it's not the processor to end all processors. For service workloads, I don't think you can get a better processor, but you probably don't want one of these processors in your workstation. Sun Microsystems is also headed in the right direction, establishing an open community around these processors and Solaris.
Re:Not going to be the fastest, but... (Score:5, Informative)
(http://www.alioth.net/ | Last Journal: Friday November 09, @03:53PM)
Niagara (Score:2, Funny)
(http://cctoide.simguy.net/)
2x10Gb Ethernet (Score:2)
(http://slashdot.org/~Doc%20Ruby/journal | Last Journal: Thursday March 31 2005, @01:48PM)
If they made these CPUs cheap enough, we could put them on PCI-e cards in a Xeon, and run a Linux cluster over the PCI-e, coordinated by apps running on the Xeon. Or maybe stuff a Niagara/PCI-e box with extras, like we used to do with Mac Quadra 950/NuBus cards. But this time with 20Gbps ethernet per node, for a networked grid of nodes.
Price.. (Score:1)
I've got 2 T2000/32GB ram boxes here and if you remember their limitations and run what they are designed for, they are awesome.
The new Sun Motto: (Score:3, Insightful)
(http://utropicmedia.net/)
It's like it's 1999 all over again, except this time Sun actually has revenue in line with expectations. I continue to maintain Sun is this century's Bell Labs and Xerox PARC all rolled into one.
Niagara 2 my precious (Score:1)
These would make great backup servers (Score:2)
I could replace it and get more throughput from a T2000, but the issue was that restores would lose that edge because of poor single-thread performance.
The Niagara 2 series is set to have 1.4X the single thread performance, plus the higher simultaneous threads (Though a slightly longer pipeline).
Since I am moving away from tape and going to Virtual Tape Library tech, I won't be constrained by how many backups I can do and avoid over multiplexing. I plan on doing 24-32 (or even more) simultaneous backups to virtual tape drives without skipping a beat. The only thing then will be keeping the network from being over-saturated.
Don't have any 10GbE switches in house yet, but that can't be too far off. I'd likely put in two 4-port 1GbE cards and pump them like no tomorrow. I'm getting about 20-30MB/sec from each machine, so assuming 140MB/sec on a GigE port, and 8 of them, I can handle over 1100MB/sec, but doing 32 backups would be about 950MB/sec. It is close, but should work.
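The bandwidth budget above is easy to re-run. This is a minimal sketch using the figures from the comment; the variable names are made up, and the second capacity line is an added cross-check: a gigabit port carries 1000 Mbit/s, which is 125MB/sec per direction before protocol overhead, so 140MB/sec per port is on the optimistic side.

```python
PORTS = 8                            # two 4-port GigE cards
PER_PORT_ASSUMED = 140               # MB/s, the figure used in the comment
PER_PORT_LINE_RATE = 1000 / 8        # MB/s, raw 1 Gbit/s, one direction
BACKUPS = 32                         # simultaneous backup streams
PER_CLIENT_LOW, PER_CLIENT_HIGH = 20, 30   # MB/s seen from each machine

capacity_assumed = PORTS * PER_PORT_ASSUMED
capacity_line_rate = PORTS * PER_PORT_LINE_RATE
demand_low = BACKUPS * PER_CLIENT_LOW
demand_high = BACKUPS * PER_CLIENT_HIGH

print("capacity at 140 MB/s per port:", capacity_assumed)
print("capacity at raw line rate:", capacity_line_rate)
print("demand range for 32 backups:", demand_low, "to", demand_high)
print("headroom at line rate, worst case:", capacity_line_rate - demand_high)
```

Even at the raw line rate the 32-stream worst case still fits, though with very little headroom, which matches the "close, but should work" conclusion.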
Reminds me of a classic Onion story... (Score:2)
Fuck Everything, We're Doing Five Blades [theonion.com]
(The funniest thing about this article is that a year after they published it, Gillette actually did release a five-bladed shaver!)
Threads Are the Work of the Devil (Score:3, Funny)
(http://www.rebelscience.org/Cosas/Reliability.htm | Last Journal: Wednesday September 05, @12:03PM)
Wow! Only 64 threads, eh? That's the problem with threads, you can't have too many of them because switching from one thread to another is very expensive, cycle-wise. In other words, as long as threads remain the only multitasking mechanism used by the computer industry, super fast, fine-grained multiprocessing will remain a dream. It gets worse. There is another problem with threads that is even worse than this. Threads are inherently asynchronous. Until and unless the computer industry comes to its senses and realizes that asynchronous processing makes it impossible to implement programs with deterministic timing, we will continue to pay the heavy price of software unreliability. Switch to a non-algorithmic, signal-based, synchronous software model (with the supporting CPU architecture), and the problem will disappear. Threads suck! Period. One man's opinion.
OT: IBM Power6 (Score:1)
Sacrifices everything for multithread throughput (Score:2)
The first test was a "make" test. On my desktop machine (generic dual-core Athlon), configure for some large software package (BerkeleyDB, I think, to run more benchmarks on) took a minute, and make -j 3 took 5. On the Niagara, configure took 5 minutes, and make -j 40 took only one.
For high-concurrency database benchmarks, the cost of synchronization made the Niagara slower than a standard AMD-based server. For a less concurrent load, the Niagara was of course much faster. Interestingly, a dual-core server performed much better here than a dual-processor single-core server, because the synchronization cost was lower.
For web applications, the Niagara did well for simple applications, but introduced unacceptable latencies for more CPU-intensive ones.
For anything floating-point, the original Niagara choked due to its single FPU, but that's what the T2 is supposed to fix.
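The make figures from the first test are worth adding up. This is a minimal sketch using the times quoted above, treating configure as the single-threaded step and make as the parallel step, which is an assumption about how those steps behave rather than something measured here.

```python
# Wall-clock minutes quoted in the comment.
timings_min = {
    "dual-core Athlon": {"configure": 1, "make": 5},
    "Niagara": {"configure": 5, "make": 1},
}

# Total build time per machine.
totals = {machine: sum(steps.values()) for machine, steps in timings_min.items()}

# How much slower the Niagara is on the serial step, and faster on the parallel one.
serial_slowdown = timings_min["Niagara"]["configure"] / timings_min["dual-core Athlon"]["configure"]
parallel_speedup = timings_min["dual-core Athlon"]["make"] / timings_min["Niagara"]["make"]

for machine, total in totals.items():
    print(machine, "total minutes:", total)
print("serial slowdown:", serial_slowdown)
print("parallel speedup:", parallel_speedup)
```

On these numbers the two machines finish the whole build in the same time: the gain on the parallel step is cancelled by the loss on the serial one.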
Sun's Home in the Datacenter (Score:1)
Where I work our datacenter is a bit constrained on space, power, and cooling. Adding these bad boys allows us to support many more applications, websites, and whatever else the business wants with less power and cooling and capital cost than what we used last year. And, yes, you can get a three year warranty on brand name Intel servers but the reliability and serviceability of Sun gear lasts way beyond three years.
I think their desktops suck. And I wasn't too much of a fan of Solaris until Sol 10. It was boring. Run Solaris x86 if you want to try it cheap. Linux has made it much better by forcing new features like DTrace and ZFS. The cost of entry is a bit steep (and overpowered) for SME but if you want serious computing power you can do much worse than Sun. They've been written off more times than I care to count (kind of like Apple) but they're still standing.
Netbackup media server (Score:1)
Re:low...... (Score:1, Insightful)
(http://www.dspanel.com/)
Re:Niagara (Score:1)
Re:low...... (Score:2)
Re:Sun doesn't get much processor press (Score:5, Interesting)
Also, if the last thing you have touched is a V440 then you are not exactly up to speed with the cutting edge of Sun products. I promise you that if you had actually ever seen a system running a T1 chip you would not say "their processor division has been kinda lagging". The cool threads stuff is amazing and they are the only people doing anything quite like it. I am not sure if you picked this up from the article but with one chip you get _64_ hardware based threads.
In our internal benchmarks a £20k T2000 with 1 x 8 core T1 outperformed a £100k+ V880 with 8 x 2 core Sparc. Freakin' cool and excellent value for money. Plus all this fits in two rack units.
Working in small companies is nice but I promise you that out there in the big wide world "most" companies don't think that $US20k is very much at all to spend on a system that will be part of a critical service.
Re:Sun doesn't get much processor press (Score:5, Insightful)
(http://kehoes.org/ | Last Journal: Friday August 10, @04:32AM)
1) Sun is not trying to win the hearts and minds of home users - that is not their market. Sun would see few benefits from pushing their products in the mainstream media. Trade press is where they reach the decision makers. How many Oracle adverts do you see in game magazines and tabloid newspapers? Not very many, they tend to advertise in business oriented outlets such as The Economist.
2) Some small businesses don't care about computers at all. The companies that need Sun will buy Sun. The companies who can run their business out of a box of post-it notes will do that.
3) When you buy mission critical hardware, you don't look for a '3 year warranty'. You look for a service and support contract based on how critical the hardware is to your business. If you can run your business on a home-made 486dx system running Minix then that is probably the best option.
4) Sun being worth 10% of Intel is irrelevant. The Economist sells far fewer copies than The Sun (a pretty terrible UK tabloid) but I know which one I'd choose for a serious overview of world news.
5) This is a techie web site so news like this seems pretty relevant here, even if most of us can't afford to buy the kit.
Re:Sun doesn't get much processor press (Score:5, Interesting)
(Last Journal: Thursday July 07 2005, @09:59AM)
It's all relative. Your 'high performance' Dell or Gateway wouldn't do much other than run bind at one of our locations. You are comparing apples to oranges. These systems are not for you to surf the net with, and as for price, well there is a lot to be gained from stability. I still have SPARC systems with OEM (minus the disks) that are close to 20 years old running at some locations. Bet your Dell can't say that.
Re:low...... (Score:2)
Re:So, will it get rid of Vista/boot delays? (Score:2)
(http://www.vems.co.nz/)
Re:Sun doesn't get much processor press (Score:1)
Re:So, will it get rid of Vista/boot delays? (Score:2, Informative)
(Last Journal: Wednesday September 05, @07:19AM)
Linspire (back in the day - I've been on Ubuntu for quite a while now) worked this way. IIRC you had to hold down a key to rescan for hardware, otherwise it assumed nothing changed and booted very briskly. I'm surprised it didn't catch on with more popular distros.
Also, I thought http://www.linuxbios.org/Welcome_to_LinuxBIOS [linuxbios.org] would get through POST and to the payload in just a couple of seconds.
Re:Sun doesn't get much processor press (Score:2)
Re:So, will it get rid of Vista/boot delays? (Score:2)
(http://honeypot.net/ | Last Journal: Friday April 07 2006, @09:33AM)
One: it will get rid of Vista in the sense that Vista isn't ported to it.
Two: you'd reboot these things once every many months. Who cares if it takes half an hour each time?
Re:So, will it get rid of Vista/boot delays? (Score:1)
(http://carnagepro.com/)
I'm on my 5th one. It's worth turning monitors on and off but not the computer.
Re:Sun doesn't get much processor press (Score:2)
(Last Journal: Tuesday December 31 2002, @08:24AM)