Comment Re:When will China have their 60's? (Score 1) 270

You really think China is going to 'flip a switch' and replace their coal power plants with solar and wind overnight? Even though China uses 50% of the *world's* coal production right now? I think your mental model may be off by a few orders of magnitude in places. China has an estimated 950 gigawatts of electricity generation capacity right now, and 82% of that comes from coal. Of the non-coal 18%, nearly all of that is hydroelectric, with a little bit of nuclear. There just aren't enough rivers left for China to significantly increase their hydro (how many more Three Gorges dams can they build? Zero!). China's renewable energy growth might be 'torrid,' but it's still insignificant. China's energy use more than *doubled* in the last 10 years, and it's not slowing down. And what are they meeting that growth with? More coal, making the pollution worse.

In 10 years, I'd bet you almost anything that China is using even more coal than they are today.

Comment ARTICLE TEXT (Score 4, Informative) 140

As part two (see previous attempt) of my ongoing series in ‘computational necromancy,’ I’ve spent the last year and a half or so constructing my own 1/10-scale, binary-compatible, cycle-accurate Cray-1. This project falls purely into the “because I can!” category - I was poking around the internet one day looking for a Cray emulator and came up dry, so I decided to do something about it. Luckily, the Cray-1 hardware reference manual turned out to be useful enough that implementing most of this was pretty straightforward. The Cray-1 is one of those iconic machines that just makes you say “Now that’s a supercomputer!” Sure, your iPhone is 10X faster, and it’s completely useless to own one, but admit it... you really want one, don’t you?

The Cray-1A Architecture

Now, let’s get down to specs - What is this bad boy running? The original machine ran at a blistering 80 MHz, and could use from 256-4096 kilowords (32 megabytes!) of memory. It has 12 independent, fully-pipelined execution units, and with the help of clever programming, can peak at 3 floating-point operations per cycle. Here’s a diagram of the overall architecture:

cray_architecture

It’s a fairly RISC-y design, with 8 64-bit scalar (S) registers, 8 64-bit/64-word vector (V) registers, and 8 24-bit address (A) registers. Rather than a traditional cache, it uses a ‘software-managed’ cache with an additional 64 64-bit words (T registers) and 64 24-bit words (B registers). There are instructions to transfer data between memory and registers, and then register-to-register ‘compute’ instructions.

One of the coolest aspects of this machine is that everything is fully pipelined. This machine was designed to be fast, so if you’re careful, you can actually get one (or more) instruction every cycle. This has some interesting implications - there’s no ‘divide’ instruction, for instance, because it can take a variable amount of time to finish. To perform a divide, you need to first compute the ‘reciprocal approximation’ (something we *can* do in exactly 13 cycles, it turns out) of the denominator value, and then perform a separate multiply of that result with the numerator.
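To make the divide trick concrete, here is a minimal Python sketch of dividing without a divide instruction. It is not the Cray's actual logic: `reciprocal_approximation` is a software stand-in for the fixed-latency hardware unit (it cheats by using Python's own division), the 30-bit precision is an assumed figure for illustration, and the Newton-Raphson refinement step is my own addition to show how an approximate reciprocal can be sharpened using only multiplies and a subtract.

```python
import math

APPROX_BITS = 30  # assumed precision of the approximation, for illustration only


def reciprocal_approximation(x: float) -> float:
    """Stand-in for the hardware unit: 1/x truncated to APPROX_BITS bits."""
    mantissa, exponent = math.frexp(1.0 / x)
    scale = 1 << APPROX_BITS
    return math.ldexp(math.trunc(mantissa * scale) / scale, exponent)


def newton_refine(x: float, r: float) -> float:
    """One Newton-Raphson step, r * (2 - x*r): multiplies and a subtract only."""
    return r * (2.0 - x * r)


def divide(numerator: float, denominator: float) -> float:
    """Compute numerator / denominator as numerator * (1 / denominator)."""
    r = reciprocal_approximation(denominator)  # fixed-latency step
    r = newton_refine(denominator, r)          # optional accuracy boost (assumption)
    return numerator * r                       # the separate multiply from the text


print(divide(355.0, 113.0))
```

Every step here has a fixed, known cost, which is the property that lets the operation fit into a fully pipelined machine.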

The vector instructions are particularly cool. A vector Add operation might take only 5 cycles to start producing results (remember, each vector can hold 64 values, so it takes 5 + 64 cycles to finish adding). Why wait for it to finish though? We can take the result output from the adder, and “chain” it straight into another vector unit (say a multiplier). And *that* only takes another 10 cycles or so, so we can chain that result into yet another unit (say, reciprocal approximation). Now, rather than waiting for the first operation to finish, we’re computing up to 3 floating-point calculations per cycle. Clever programmers could sustain about 2 floating-point operations per cycle, or 160 million floating-point operations per second at 80 MHz.
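The payoff from chaining is easy to see with a back-of-the-envelope model. The sketch below is a deliberate simplification and not a cycle-accurate simulation: it assumes each unit delivers one result per cycle after its startup delay, and it ignores instruction issue, memory traffic and chain-slot timing. The startup figures (5, 10 and 13 cycles) are the ones quoted in this post; the `Unit` class and function names are made up for illustration.

```python
from dataclasses import dataclass

VECTOR_LENGTH = 64  # elements per vector register
CLOCK_MHZ = 80      # original Cray-1 clock


@dataclass(frozen=True)
class Unit:
    name: str
    startup_cycles: int  # cycles before the first result appears


def unchained_cycles(units, n=VECTOR_LENGTH):
    """Each operation waits for the previous one to finish completely."""
    return sum(u.startup_cycles + n for u in units)


def chained_cycles(units, n=VECTOR_LENGTH):
    """Results stream from unit to unit, so only the startups add up."""
    return sum(u.startup_cycles for u in units) + n


pipeline = [Unit("add", 5), Unit("multiply", 10), Unit("reciprocal", 13)]
flops = len(pipeline) * VECTOR_LENGTH

for label, cycles in (("unchained", unchained_cycles(pipeline)),
                      ("chained", chained_cycles(pipeline))):
    per_cycle = flops / cycles
    print(f"{label}: {cycles} cycles, {per_cycle:.2f} flops/cycle, "
          f"{per_cycle * CLOCK_MHZ:.0f} MFLOPS at {CLOCK_MHZ} MHz")
```

Under these assumptions a single pass over one 64-element vector lands at roughly 2 operations per cycle when chained. The peak of 3 only applies once all three pipelines are full.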

vector_chaining

Vector Chaining in Action!
The Hardware

The actual design was implemented in a Xilinx Spartan-3E 1600 development board. This is basically the biggest FPGA you can buy that doesn’t cost thousands of dollars for a devkit. The Cray occupies about 75% of the logic resources, and all of the block RAM.

spartan3_1600

This gives us a spiffy Cray-1A running at about 33 MHz, with about 4 kilowords of RAM. The only features currently missing are:

-Interrupts

-Exchange Packages (this is how the Cray does ‘context-switching’ - it was intended as a batch-processing machine)

-I/O Channels (I just memory-mapped the UART I added to it).

If I ever find some software for this thing (or just get bored), I’ll probably go ahead and add the missing features. For now, though, everything else works sufficiently well to execute small test programs and such.
The Software

When I started building this, I thought “Oh, I’ll just swing by the ol’ Internet and find some groovy 70’s-era software to run on it.” It turns out I was wrong. One of the sad things about pre-internet machines (especially ones that were primarily purchased by 3-letter Government agencies) is that practically no software exists for them.

***** If Anyone has any Cray-1 software, please contact me!! If you work at one of the National Labs, please take a look!****

After searching the internet exhaustively, I contacted the Computer History Museum and they didn’t have any either. They also informed me that apparently SGI destroyed Cray’s old software archives before spinning them off again in the late 90’s. I filed a couple of FOIA requests with scary government agencies that also came up dry. I wound up e-mailing back and forth with a bunch of former Cray employees and also came up *mostly* dry. My current best hope is a guy I was able to track down who happened to own an 80 MB ‘disk pack’ from a Cray-1 Maintenance Control Unit (the Cray-1 was so complicated, it required a dedicated mini-computer just to boot it!), although it still remains to be seen if I’ll actually get a chance to try to recover it.

Without a real software stack (compilers, operating systems, etc.), the machine isn’t terribly useful (not that it would be all that useful if I did have software for it). All of the opcodes and registers for the Cray-1 are described in base-8 (octal), so I did at least write a little script to translate octal machine code into the hexadecimal format that Xilinx’s tools require. All of my programming so far has just been in straight octal machine code, assembled in my head. I have started work on rewriting the CAL assembler, but that may take a while, as it employs some tricky parsing that I’m having to teach myself.
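For anyone curious what such a translation script might look like, here is a rough Python sketch. It is not the actual script from this project. It assumes the input is whitespace-separated octal values, one 16-bit instruction parcel each, with optional `#` comments, and it emits one 4-digit hex word per value. Both the input layout and the output layout are guesses for illustration, so adjust them to whatever your memory-initialization flow expects. The sample values are arbitrary, not a real program.

```python
from typing import List

PARCEL_BITS = 16  # assumed word size: one 16-bit instruction parcel per value


def octal_to_hex(octal_text: str) -> List[str]:
    """Convert octal parcels (hypothetical input format) to 4-digit hex strings."""
    hex_words = []
    for line in octal_text.splitlines():
        code = line.split("#", 1)[0]  # drop anything after a '#' comment
        for token in code.split():
            value = int(token, 8)     # raises ValueError on non-octal digits
            if value >= 1 << PARCEL_BITS:
                raise ValueError(f"{token} does not fit in {PARCEL_BITS} bits")
            hex_words.append(f"{value:04X}")
    return hex_words


SAMPLE = """
071200  # arbitrary example values, not a real program
177777 000001
"""

print("\n".join(octal_to_hex(SAMPLE)))
```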
Makin’ it look pretty

What’s the point of owning a Cray-1 if it doesn’t *look* like a Cray-1?? Unfortunately, the square-shaped FPGA board isn’t conducive to actually making it the traditional “C” shape, but I think it turned out pretty cool anyway. My friend Pat was nice enough to let me use his CNC milling machine to cut out the base pieces (and help with assembly). It’s a combination of MDF, balsa wood and pine. There was also a healthy dose of blood, sweat and tears (and gorilla glue) involved.

Some random photos from the build process:

Finally, Computer Engineer Barbie has an appropriate place to sit down!

This is awesome! How can I build my own?

This is very much a work-in-progress, but if you’d like to join in the fun, feel free! All you need is a copy of the RTL (almost all Verilog-2001) and a Spartan-3E 1600 or equivalent FPGA board. The code is likely riddled with bugs and questionable implementation choices at this point, so on the off-chance anyone actually downloads this, feel free to lend a hand and send me any bug fixes you might make!

Technology

Submission + - Homebrew Cray-1 (chrisfenton.com)

egil writes: Chris Fenton built his own fully functional 1/10 scale Cray-1 supercomputer. True to the original, it includes the couch-seat, but is also binary compatible with the original. Instead of the power-hungry ECL technology, however, the scale model is built around a Xilinx Spartan-3E 1600 development board. All software is available if you want to build one for your own living room. The largest obstacle in the project is to find original software.

Submission + - Homebrew working Cray-1 (chrisfenton.com)

Brietech writes: This is a working binary-compatible, cycle-accurate 1/10th-scale Cray-1A, completely reverse-engineered from the hardware reference manual. The machine boasts 4 kilo-words of memory, a 33 MHz clock speed and its own tiny, pleather sitting-area. This fully-pipelined, 64-bit 1970's era supercomputer can peak at 3 floating point operations per cycle!
Hardware

Submission + - Hacker creates 1/10th scale Cray-1 supercomputer (geek.com)

An anonymous reader writes: Chris Fenton, an electrical engineer living in New York, always wanted his very own supercomputer, but did not have the space or money to acquire one. So he took a different approach and decided to build his own supercomputer from the 70s in scaled-down form based on the Cray-1.

The end result of his hacking produced a homebrew Cray-1A, which is 1/10th the size of the original. Unfortunately, getting software for the machine proved almost impossible. Fenton searched everywhere and eventually found out that SGI destroyed all the old software archives. Not even former Cray employees could help him.

Comment I'll probably sign up for this (Score 4, Insightful) 488

It obviously depends how much they try to charge, but I'll probably sign up for this. I really like reading the NYT (I actually live in NYC) - they provide an incredibly valuable service, which at the moment they basically give away. Realistically, though, I don't really buy the things they advertise. Half the time when I'm reading their site, it's on a computer with adblock installed so I don't even *see* the ads they have up. I was all about the "everything should be free" movement when I was a student, but now that I have a job, I don't mind compensating people for their work. Especially if the alternative is a world where the only 'news' comes from crappy bloggers that can't spell or do legitimate research.

Comment I got a bit stung (Score 2, Interesting) 1231

I upgraded from 9.04 to 9.10, and everything went smoothly except for the following:

1. My sound hardware is no longer recognized for some reason. I have a Dell Dimension computer with integrated audio, and it had worked fine after installing 9.04, but stopped working when I upgraded. It now claims I have no sound hardware installed, and I'm not entirely sure how to correct it.

2. After rebooting, the screen now goes blank (video card stops outputting) when X should start and bring up the login screen. I'm also not sure what caused this. I dropped down to a console, tried to kill the running X process, and then things seemed to miraculously work. I actually had to get something done, so I just went with it, but I'm not sure exactly what happened (or what I did to fix it). Maybe this is related to the proprietary Nvidia drivers I'm using?

Everything else seemed to work just fine as far as I can tell. When I have a few hours to dig through forums, I'll try to fix the sound and the screen blanking thing.

Comment Re:Transputers, anyone? (Score 2, Insightful) 115

Well, if you take that idea to the limit using modern technologies, you basically wind up with rockin' new Nehalem processors using QuickPath Interconnect (QPI) between them, with PCI Express (serial links) to peripherals. But that's huge, is incredibly power hungry, and is basically the opposite of this architecture. But let's think this over some more. To access L1 cache, you can do it in a single cycle. L2 might be 10-20 cycles, etc. Now going over PCIe, the fastest thing going besides QPI, has a latency of like 400-800 ns. Even on a lowly 1 GHz processor, that's up to 800 clock cycles, so you might as well be watching grass grow while you try to do something that's not embarrassingly parallel. As soon as you pump up the clock rate more, and add large caches and DRAM and all that, then you have *huge* power problems, and you still have somewhat crappy performance. Large-scale multi-core basically *does* use this architecture, only it's all on one die. It also uses an interconnect that doesn't suck, and manages to be cache coherent, so you can actually use it. Each core has its own cache though (which is larger than the RAM on these chips), and the clock is nearly 2 orders of magnitude higher. Like I said, this is fun for a microcontroller project, but the performance would be atrocious for anything except embarrassingly parallel problems (and even then it would suck using these microcontrollers).
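To put rough numbers on that latency argument, here is a tiny Python sketch that converts a link latency into stalled clock cycles. The 400-800 ns range and the 1 GHz clock are the figures from this comment; the 3 GHz row and the function name are just added for illustration.

```python
def latency_in_cycles(latency_ns: float, clock_ghz: float) -> int:
    """Cycles spent waiting: nanoseconds times cycles per nanosecond (GHz)."""
    return round(latency_ns * clock_ghz)


for clock_ghz in (1.0, 3.0):
    low = latency_in_cycles(400, clock_ghz)
    high = latency_in_cycles(800, clock_ghz)
    print(f"{clock_ghz:.0f} GHz: a PCIe access costs roughly {low}-{high} cycles")
```

Compare that with 1 cycle for L1 and 10-20 for L2, and the off-chip penalty is pretty obvious.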

Comment Re:Transputers, anyone? (Score 5, Insightful) 115

The Connection Machine was still SIMD, even though it did have 64k (1-bit!) processors. This is just like the transputer architecture though! There are a couple of *really* big problems with this:

1) None of their microcontrollers are individually capable of running a large modern program. They have a few kilobytes of code, and no large backing RAM.

2) How do you get to I/O devices? If you need shared access to devices, this just makes all the problems of a normal computer enormously worse.

3) What about communication latency (and bandwidth) between nodes? They're using serial communications between 72 MHz processors. We're probably talking several microseconds of latency, minimum, and low-bandwidth (just not enough pins, and not nearly fast enough links) communication between nodes.

As fun as something like this would be to build and play around with, there are reasons architectures like the transputer died out. The penalty for going 'off-chip' is so large (and orders of magnitude larger nowadays than it was back then), and the links between chips suck so much, that a distributed architecture like this just can't compete with a screaming fast 3 GHz single-node (especially multi-core).

Comment Re:Ideas aren't worth anything (Score 1) 539

To be fair, Japan has a ton of examples of how *not* to run a business. They've spent nearly the last 2 decades in economic stagnation, they have an incredibly inflexible labor market (their employees are basically either hired for life or never get beyond "temp" status with essentially no protections and crappy pay/benefits), and they have a culture that discourages disagreeing with your superiors. As a country, they also have an incredibly dysfunctional government and a demographic (more specifically aging) problem that is rapidly destroying their competitiveness.

Comment Re:Shoe on the other foot (Score 3, Insightful) 263

It might be more like if John McCain had won San Francisco with 70% in November, and the Democrats took to the streets to protest a rigged election. The Libertarian party has not shown itself capable of becoming a mass movement in any real sense. In regards to the last part of your comment, 1) I'm pretty sure the Constitution doesn't define a "fiscal system," even though you probably meant economic system, and 2) the violence has largely been on the part of the Basij, a super-nationalist militia, against the fairly peaceful protesters.
