
Comment: Re:What am I missing here... (Score 2) 166

by Skowronek (#37240212) Attached to: Like a Redstone Cowboy

Dunno, I really enjoy Minecraft otherwise (building, pretty landscapes, running around with friends), but redstone logic feels not only like work, but like the most unpleasant part of my work - what I'd call "the drudgery".

And not only is it like work, it's also ultimately pointless. You are not exploring new frontiers; you are trying to recreate 1960s tech with wooden sticks.

Comment: Re:What am I missing here... (Score 2) 166

by Skowronek (#37239938) Attached to: Like a Redstone Cowboy

I'm very much an addictive personality *and* an actual digital designer. My personal reaction to Minecraft's "redstone" crap was a resounding "meh", though.

In the same amount of effort and time that it takes to build a slow, useless piece of Minecraft logic, I can build something actually interesting in an FPGA by instantiating and placing LUTs (a marginally higher abstraction level than redstone), and have it run at 500 MHz.

After doing ASIC design, even FPGA design feels a bit like playing with Lego bricks after building real bridges (there's nothing wrong with Lego bricks, but they do not have the same mechanical qualities). But at least it's not plasticine, like Minecraft (again - nothing wrong with plasticine, but it really does limit what you can do).

People should be dissuaded from doing otherwise unremarkable things in the most painful way possible (16-bit CPU in Redstone - great, but you could achieve the same CPU in 100-200 lines of Verilog, and it would be quite fast). Those guys are probably quite bright, and could do much more impressive things if they didn't restrict themselves to plasticine. Start on the Lego :)

Comment: This has been done before - AMD XGP (Score 1) 207

by Skowronek (#37030768) Attached to: External Thunderbolt Graphics Card On Its Way

AMD released this kind of product before - http://en.wikipedia.org/wiki/ATI_XGP . It was generally considered a failure, partially because the software support was not perfect, and partially because people just didn't want to lug a dock / GPU box around. The hardware bandwidth was more satisfying, though, at 8 PCIe lanes rather than Thunderbolt's 4.

I find it amusing that the same ideas return and return in this industry, presented as an innovation every single time.

Comment: Re:So much for build quality... (Score 1) 531

by Skowronek (#35343308) Attached to: New MacBook Pro Teardown Reveals 'Shoddy Assembly'

There have been much better screens in laptops, long in the past.

For instance, my IBM ThinkPad T60p has a 15", 2048x1536 screen. This was, as a matter of fact, a factory option on that model.

IBM pushed screen technology so far that we still have not reached that high-water mark. Their 22" monitors with 3840x2400 resolution hit the market in 2001. Of course, when Apple releases a 3840x2160 panel (16:9, of course, so it's a bit cheaper) as a product, nobody will even remember they existed...

Comment: Re:Open Source drivers? (Score 4, Informative) 240

by Skowronek (#32456944) Attached to: AMD's Fusion Processor Combines CPU and GPU

The documentation needed to write 3D graphics drivers has been consistently released by ATI/AMD since R5xx. In fact, yesterday I was setting up a new system with an RV730 graphics card, which was both correctly detected and correctly used by the open source drivers. Ever since AMD started supporting the open source DRI project with money, specifications, and access to hardware developers, things have improved vastly. I know some of the developers personally; they are smart, and I believe that given this support, they will produce an excellent driver.

It's sad to see that with Poulsbo Intel did quite an about-face, and stopped supporting open source drivers altogether. The less said about nVidia the better.

In conclusion, seeing who is making this Fusion chip, I would have high hopes for open source on it.

Comment: Re:people who do less useful work earn more (Score 2, Interesting) 172

by Skowronek (#32063620) Attached to: Open Source vs. Wall Street Bonuses

50% management? That would imply that, on average, every manager has almost 2 underlings (for a large company the average tends to 2 from below - proof left to the reader). The conclusion, by Dirichlet's pigeonhole principle, is that if any manager manages 2 or more underlings, there is at least one manager who manages no more than 1 person. And that's terrifying.

Comment: Re:Could be worse (Score 1) 307

by Skowronek (#31904562) Attached to: Cross With the Platform

I did not mention command buffers as a way to submit IM operations; I agree with you entirely that it would result in very sub-par performance.

Some machines, generally older than the R300 or indeed any consumer hardware (these are the machines OpenGL came from), were optimized for IM and executed it through direct MMIO writes from userland. This avoids both the command buffer size limitation and the high cost of ioctl-style command buffer submission.

Comment: Re:Could be worse (Score 1) 307

by Skowronek (#31904510) Attached to: Cross With the Platform

No, but there's this argument going around that "immediate mode necessarily performs worse" which is simply not true, if your hardware is not constrained. My view comes from spending a good few years designing GPU chips - we could do whatever we wanted as long as there was enough demand for it.

There isn't enough demand for immediate mode, and for OpenGL in general. That doesn't mean it's not possible to make it perform well.

Additionally, Apple likes to make their own hardware (just look at their recent ex-ATI employee acquisitions) so they can do whatever they please with their chips. Especially in portables.

Comment: Re:Could be worse (Score 1) 307

by Skowronek (#31898218) Attached to: Cross With the Platform

Actually, there are several operations that make the vertex fetch operation less parallel than you might think. In particular, vertex sharing by multiple primitives is usually handled with a small-depth (32-64 vertices) buffer. In other words, if a vertex index occurs twice in your object, but those occurrences are >128 locations apart in your index buffer, the vertex will be processed twice and you won't get peak performance. Another example is the primitive assembly stage, where processed vertices output from the Vertex Shader are merged into primitives according to the data in the index buffer. This is a significant performance bottleneck that can leave VBO vertex throughput below what the fragment shaders could consume.

The assumption that improving immediate mode performance would necessarily come with a large area hit is not supported. Most of the GPU area is not control units (which this would end up being, since the datapaths are already in place); it's the memory controller, raster backends and shaders. A quick look at the floorplan of a modern (>2008) GPU proves this quite readily. Even the primitive assembly / triangle setup unit usually occupies less than 2% of the GPU, and it is the current VS rate bottleneck.

Comment: Re:Could be worse (Score 5, Interesting) 307

by Skowronek (#31894550) Attached to: Cross With the Platform

Entirely correct about shaders.

However, I have to take exception to your description of immediate mode - the reason it performs so poorly now is that modern graphics chips are designed pretty much exclusively for DirectX (at least, this goes for ATI).

On machines where immediate mode performance actually was some kind of a priority (for instance, the SGI Octane IMPACTSR and its relatives), executing a glVertex command amounted to 3 memory writes into a command FIFO mapped at a fixed address in userspace, reachable with the short form of the SW (store word) opcode (remember, this is MIPS: there is a range of 64K addresses, -32768 to 32767 off a base register, that can be accessed without loading a new base).

The hardware even managed the high-water/low-water status of the FIFO, and notified the kernel to context-switch to a non-gfx process when the gfx process was filling up the command FIFO. Those switches were, as a matter of fact, "virtualized" (before it was cool) by a combination of hardware, kernel (when hardware contexts were exceeded) and userspace - not entirely unlike what DX10 ADM was supposed to be, except this was in 1995.

For large static meshes (only transforms applied with Vertex Shaders), buffers are definitely going to perform better, because the meshes can be located in local memory (VRAM). However, if the geometry is dynamically generated, immediate mode in a good implementation is no slower than a memcpy, and it does not require a kernel transition to submit a command buffer to the card's ring (as modern cards like to do).
