Comment Re:Reimagine for touch (Score 1) 85

Some people are also retarded. Try redesigning a first person shooter for a touch screen. Every touch-screen FPS I have tried is beyond terrible, requiring tons of aim assist and alteration of game mechanics to suit the reduction in player control. Removing content or mechanics to suit an unsuitable control scheme is not "redesigning" anything, it's crippling it.

Comment Re:Too much competition (Score 1) 85

Not necessarily. Smartphones and tablets handle traditional control schemes very poorly. Try playing an FPS on a smartphone, or anything that requires a degree of precision and/or responsiveness. If a game can be designed/redesigned for a touch-screen interface, great. However, many genres simply play better using mechanical controls and the PSP excels at this.

Comment Re:Blizzard Shizzard (Score 1) 252

That's possible, and I wouldn't be surprised if some do, but it is my understanding that most cheats inject themselves into the program code at runtime rather than replace the program code entirely. It may be more appropriate to say that they are carefully crafted to work with the copyrighted binaries rather than ship with the copyrighted binaries themselves.

Comment Re:RTFA (Score 1) 345

SMT on a dynamically scheduled architecture requires resolving and tagging data dependencies between instructions from two or more contexts as those instructions enter the reservation station, are dispatched to execution units, and eventually retire through the reorder buffer. Speculative and cancelled instructions from two or more contexts also need to be resolved at once. That's not particularly easy to do, and the difficulty grows with the number of execution ports and accompanying execution units. Intel has been working at this for many, many years. Even when HT was absent from x86 (the Core 2 era), it was still being developed for the Itanium family of microprocessors.

If the ideality condition for efficiency is having all backend execution ports busy on every cycle, CMT can theoretically reach parity with SMT by running a demanding thread on each logical processor, provided other factors such as cache latency don't become a bottleneck (assume ideality there as well). Outside of synthetic benchmarks, however, this is incredibly hard to accomplish. As soon as one thread blocks (an IO syscall, for example) or hits a long-latency event (a page fault, for example), the operating system can either toss the thread on the wait queue and context switch, or simply do nothing and issue stalls until the event is resolved. Excessive context switches cause overhead and should be avoided, and stalls are inefficient by definition. If no context switch is performed, the CMT frontend must stall, which means its backend execution units sit idle. An SMT frontend must also stall, but the backend execution units can still be used unless the complementary thread hits a long-latency event at the same time.
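
For what it's worth, the backend sharing is easy to observe on a Linux box: pin two compute-bound threads either to a pair of SMT siblings or to two separate physical cores and compare the wall-clock times. This is only a rough sketch, assuming GNU extensions and that logical CPUs 0/1 are siblings while 0/2 sit on different cores, which varies with the machine's topology; the file name smt_test.c is just illustrative.

    /* Rough experiment: time two busy-loop threads pinned to SMT siblings vs.
     * two separate physical cores. Linux-specific (GNU extensions); assumes
     * logical CPUs 0 and 1 are siblings and 0 and 2 are separate cores --
     * check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list first.
     * Build with: gcc -O2 -pthread smt_test.c */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <time.h>

    static void *spin(void *arg)
    {
        /* Four independent multiply chains so one thread keeps the integer
         * multiplier busy; the volatile sink keeps the loop from being
         * optimized away. */
        unsigned long a = 1, b = 2, c = 3, d = 4;
        for (long i = 0; i < 1000000000L; i++) {
            a = a * 2654435761u + 1;
            b = b * 2246822519u + 2;
            c = c * 3266489917u + 3;
            d = d * 668265263u  + 4;
        }
        volatile unsigned long sink = a ^ b ^ c ^ d;
        (void)sink;
        return NULL;
    }

    static double run_pair(int cpu_a, int cpu_b)
    {
        int cpus[2] = { cpu_a, cpu_b };
        pthread_t t[2];
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < 2; i++) {
            pthread_attr_t attr;
            cpu_set_t set;
            pthread_attr_init(&attr);
            CPU_ZERO(&set);
            CPU_SET(cpus[i], &set);
            pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
            pthread_create(&t[i], &attr, spin, NULL);
            pthread_attr_destroy(&attr);
        }
        for (int i = 0; i < 2; i++)
            pthread_join(t[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(void)
    {
        printf("SMT siblings   (0,1): %.2f s\n", run_pair(0, 1));
        printf("separate cores (0,2): %.2f s\n", run_pair(0, 2));
        return 0;
    }

Because the loop is port-bound (four independent multiply chains competing for the multiplier), the sibling pairing typically finishes noticeably slower than the separate-core pairing; a latency-bound loop would show far less difference, which is itself an illustration of how SMT hides latency.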

Intel's performance advantage most certainly does erode when highly concurrent tasks are employed, but AMD's microprocessors require significantly more transistors and significantly more power to reach the same level of performance.

Comment Re:RTFA (Score 1) 345

Not everyone runs workloads that are poorly vectorized and parallelized, you insensitive clod.

Very few consumers actually run workloads that are properly vectorized and parallelized. Explicit vectorization requires manually writing multiple codepaths based on the level of vector support in the hardware. Autovectorization is substantially more flexible but requires a decent compiler to pick it up; ICC is by far the best at this, and GCC still struggles after many years of development. Furthermore, the proliferation of virtual-machine-based languages means that consumer application developers have largely absolved themselves of responsibility for writing code that is vectorized, much less code that a JIT can autovectorize. Heck, SIMD support in Javascript is only just beginning to materialize.
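
To illustrate what "multiple codepaths" means in practice, here's a minimal sketch, assuming GCC or Clang on x86 (the function names are made up): a hand-written AVX path selected at runtime, with a plain scalar loop as the fallback that an autovectorizer can chew on.

    /* Sketch of manual multi-codepath vectorization with runtime dispatch.
     * Assumes GCC/Clang on x86 (__builtin_cpu_supports, target attribute). */
    #include <immintrin.h>
    #include <stddef.h>

    __attribute__((target("avx")))
    static void add_avx(float *dst, const float *a, const float *b, size_t n)
    {
        size_t i = 0;
        for (; i + 8 <= n; i += 8) {       /* 8 floats per 256-bit AVX register */
            __m256 va = _mm256_loadu_ps(a + i);
            __m256 vb = _mm256_loadu_ps(b + i);
            _mm256_storeu_ps(dst + i, _mm256_add_ps(va, vb));
        }
        for (; i < n; i++)                 /* scalar tail */
            dst[i] = a[i] + b[i];
    }

    static void add_scalar(float *dst, const float *a, const float *b, size_t n)
    {
        /* Plain loop; a decent autovectorizer will emit SIMD for this anyway. */
        for (size_t i = 0; i < n; i++)
            dst[i] = a[i] + b[i];
    }

    void add(float *dst, const float *a, const float *b, size_t n)
    {
        __builtin_cpu_init();
        if (__builtin_cpu_supports("avx"))
            add_avx(dst, a, b, n);
        else
            add_scalar(dst, a, b, n);
    }

Multiply that by SSE2, SSE4, AVX2, and so on, and the maintenance cost of doing it by hand becomes obvious.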

Hyperthreading (Intel's implementation of SMT) is what gives Intel's i7 series microprocessors a huge advantage

The P4 had hyperthreading too. If that really would be such a huge advantage, one would think it would have been a bit more competitive than it was...

Netburst had a ton of issues that crippled performance across the board, and the HT design was also rather immature. The implementation of HT in later releases of the Itanium series and in Nehalem was vastly improved.

Disabling one of the CMT frontends...

...assuming the workload is not keeping all the frontends busy most of the time.

There are only a handful of common consumer applications that keep 6 or even 8 frontends busy at all times. AMD's FX series microarchitecture tends to keep up with Intel's Core microarchitectures in such applications, yet falls behind in the ones consumers spend most of their time running. Javascript, the language that powers the web for some strange reason, is inherently single-threaded.

...only reduces competition for resources that are shared, which on AMD FX series microprocessors includes some of the cache and floating point hardware.

Not with AVX-intensive workloads; there, a single thread can keep the whole shared FPU busy with AVX instructions.

That's correct. The vector unit in both AMD's FX series and Intel's Core series microprocessors is shared between two frontends, although on Intel's architecture the add and multiply vector EUs sit on separate ports and can accept issues from separate threads in the same cycle (albeit in lieu of two scalar arithmetic instructions); I'm not sure whether AMD's architecture works the same way (though I think the instruction latency is longer). What I was discussing is that under AMD's CMT design a thread cannot issue instructions to the ALUs on the module's paired core, whereas SMT allows this by virtue of a completely common backend with a unified reservation station. If one of the frontends on an AMD FX series microprocessor is disabled, its two ALUs are disabled along with it, and the result is a typical 4-way SMP with 2 ALUs per logical processor. If Hyperthreading is disabled, the result is a 4-way SMP with 4 ALUs per logical processor, since all of the ALUs can still be issued instructions from the unified reservation station. SMT allows for flexibility that simply doesn't exist under CMT.

CMT is inherently less efficient than SMT. It's also a simpler design that's easier for a smaller company to implement.

{citation needed} on both accounts.

There are piles upon piles of benchmarks out there demonstrating this. Intel's architecture excels in instruction throughput, transistor budget, and power efficiency.

Look at the price of AMD's microprocessors on any online retailer's website. Intel's i7-3930k still sells for around $600 and its successor is around $630. AMD's flagship FX-9590 fell from $1000... to $600... to $300 in a matter of weeks as it just can't keep up where it counts.

Comment Re:RTFA (Score 1) 345

Hyperthreading (Intel's implementation of SMT) is what gives Intel's i7 series microprocessors a huge advantage over AMD's FX series equivalents.

In terms of pure scalar arithmetic, an i7-4700 series (Haswell) microprocessor and an FX-8300 series (Piledriver) microprocessor have nearly identical clock-for-clock capabilities. Haswell has 4 scalar ALUs per core, whereas Piledriver has 2. SMT allows the backend execution resources (which include the scalar ALUs) to be balanced between the two SMT frontends, whereas CMT does not. Disabling or idling one of the SMT frontends improves the instruction throughput of the complementary frontend, which gets greater access to the shared resources thanks to reduced contention. Disabling one of the CMT frontends only reduces competition for the resources that are shared, which on AMD FX series microprocessors includes some of the cache and the floating point hardware. The unshared arithmetic hardware, which handles the bulk of the instructions in most programs, is idled along with the frontend it belongs to.

CMT is inherently less efficient than SMT. It's also a simpler design that's easier for a smaller company to implement.

Comment Re:Because C and C++ multidimensional arrays suck (Score 1) 634

A contiguous multi-dimensional array in C/C++ is just a block of data with unit stride along the columns and n times unit stride along the rows.

Calculating offsets for matrix multiplication is incredibly easy as long as the base, dimensions, and stride of each input matrix are known. Now, for particularly large matrices the memory access pattern can be improved through some clever transposing, but that's a different matter.
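
As a minimal sketch of that offset arithmetic (the function and parameter names are illustrative; row-major layout assumed, with strides given in elements):

    /* Sketch: C = A * B using explicit base + row*stride + col offsets.
     * A is m x k with row stride lda, B is k x n with row stride ldb,
     * C is m x n with row stride ldc. Row-major layout assumed. */
    #include <stddef.h>

    void matmul(double *c, const double *a, const double *b,
                size_t m, size_t n, size_t k,
                size_t ldc, size_t lda, size_t ldb)
    {
        for (size_t i = 0; i < m; i++) {
            for (size_t j = 0; j < n; j++) {
                double sum = 0.0;
                for (size_t p = 0; p < k; p++)
                    sum += a[i * lda + p] * b[p * ldb + j];  /* offset = row*stride + col */
                c[i * ldc + j] = sum;
            }
        }
    }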

Comment Re:Because C and C++ multidimensional arrays suck (Score 1) 634

No language has real multidimensional arrays because most computer address spaces are linear by design. Multidimensional arrays are arrays of arrays, either by index or by reference; there is no other way to do it. Mimicking multi-dimensional arrays in C/C++ is trivial if the layout, dimensions, and stride are known: just allocate an appropriately sized block of data and dereference it with the appropriate offsets (or dereference twice if it's jagged). The code generator converts [x][y] notation into this form anyway, and most autovectorizers should pick it up.
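
A minimal sketch of both flavors, with illustrative names and row-major layout assumed: a single flat block indexed as row*cols + col, and a jagged array of row pointers that takes two dereferences (error checking omitted for brevity):

    /* Sketch: two ways to fake a 2-D array in C. Row-major layout assumed. */
    #include <stdlib.h>

    /* Flat block: one allocation, element (r, c) lives at offset r*cols + c. */
    double *make_flat(size_t rows, size_t cols)
    {
        return malloc(rows * cols * sizeof(double));
    }
    /* access: flat[r * cols + c] -- the same address arithmetic the compiler
     * emits for a true arr[r][c] with static dimensions */

    /* Jagged: array of row pointers, dereferenced twice as jag[r][c]. */
    double **make_jagged(size_t rows, size_t cols)
    {
        double **jag = malloc(rows * sizeof(double *));
        for (size_t r = 0; r < rows; r++)
            jag[r] = malloc(cols * sizeof(double));
        return jag;
    }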
