Become a fan of Slashdot on Facebook

 



Forgot your password?
typodupeerror
×

Comment Re:Not sure whats more impressive... (Score 1) 150

That's less of an issue if your throughput comes from thread-level parallelism. There are some experimental architectures floating around that get very good i-cache usage and solid performance from a stack-based ISA and a massive number of hardware threads.

The other day, I had the mildly insane idea that perhaps our abilities to explore the architectural space are limited by all existing architectures having been painstakingly handcrafted. Thus, if it were possible somehow to parametrically generate an architecture, and then synthesize a code generator backend for a compiler and a suitable hardware implementation, we might be able to discover some hidden gems in the largely unexplored universe of machine architectures. But it sounds like a pipe dream to me...

Comment Re:The 19 year old is a lunatic (Score 1) 150

You may be underestimating the level of effort to which some people will go to get at the performance. Right now they're running nVidia cards, for pete's sake. Show a way and they will come. After all, that "full ecosystem of tools and vendors" can be ultimately achieved with an open development model.

Comment Re:Not sure whats more impressive... (Score 1) 150

In either case, it sounds like a hell of a challenge, as (if I understand correctly) you'd presumably need to pre-evaluate logic flow and track how resources are accessed in order to embed the proper data cache hints. However, those sort of access patterns can change depending on the state of program data, even within the same code. Or, I suppose you could "tune" the program for optimal execution through multiple pass evaluation runs, embedding data access hints in a second pass after seeing the results of a "profiling" run.

Sounds like LuaJIT on steroids.

Comment Re:Not sure whats more impressive... (Score 1) 150

From what I can tell, his design looks like it might be flops per watt comparable to GPUs, but with different memory abstractions that result in similar limitations. I suspect that if you write custom code for it, and have the right kind of problem, it will do significantly better than available options, but in the general case and/or non specialized code it won't do anything much better than a GPU, but it it may be competitive.

In other words, very much like Chuck Moore's Forth cores. ;-) Which is fine, there's quite a range of applications for hardware like that. Especially in the military.

Slashdot Top Deals

To program is to be.

Working...