
Intel's Dual-core strategy, 75% by end 2006

Posted by CmdrTaco
from the thats-a-lotta-core-ap dept.
DigitumDei writes "Intel is moving ahead rapidly with their dual-core chips, anticipating that 75% of their chip sales will be dual core by the end of 2006. With AMD also starting to push their dual-core solutions, how long until applications make full use of this? Some applications already make good use of multiple CPUs, and of course multiple applications running at the same time benefit instantly. Yet the most CPU-intensive applications for the average home machine, games, still mostly do not take advantage of this. When game manufacturers start to release games designed to take advantage of dual core, are we going to see a huge increase in game complexity/detail, or is this benefit going to be less than Intel and AMD would have you believe?"

  • Dual Core Gaming (Score:2, Interesting)

    by carninja (792514) on Wednesday March 02, 2005 @09:44AM (#11822499)
One has to wonder if this is going to give Intel a competitive edge against Sony's Cell processor on the gaming front...
  • by Gopal.V (532678) on Wednesday March 02, 2005 @09:48AM (#11822541) Homepage Journal
AMD demo'd [amd.com] their dual-core x86 a year ago. Also, from what I've read, the Pentium Extreme is NOT going to share the memory controller - which means that, unlike with AMD, we might need a new motherboard for the dual-core parts (AMD promised we wouldn't). So this is costlier, uglier and more power hungry.

All in all, I see Intel going down unless they do something quick. And remember: competition is good for the customer.

  • by Anonymous Coward on Wednesday March 02, 2005 @09:52AM (#11822563)
I find this interesting: every machine Apple sells, except at the definite low end, is dual-CPU SMP now, and it's been that way for a while. Now Intel/AMD seem to be realizing "oh yeah, dual CPUs, maybe that's something we should start targeting at the mass market instead of just the high end" (though AMD seems pretty comfy with the idea already). I wonder why Apple doesn't seem interested in dual cores, though. Intel/AMD seem to be treating multicore tech as their way of getting SMP out of the power-user range; Apple doesn't seem to want anything to do with it, even though POWER has had multicore capability for a really long time. What's up with this? Is there something I'm missing?
  • by svanstrom (734343) <tony@svanstrom.org> on Wednesday March 02, 2005 @09:53AM (#11822575) Homepage
    It isn't really the game itself that needs to be written to take advantage of a second CPU (or whatever), it's the code that's always being reused (either something inhouse, or the engine that they're using).

People are lazy, and when things work as they are today, most companies would rather focus on releasing the game ASAP than spend a lot of time recoding what they've already got...

    It comes down to how much money they can make in as little time as possible.

    But, of course, once a company starts pushing their better performance/more features/more "beautiful" games, then everyone else has to catch up instantly; or else their profit goes down.

    It's a "all of us or none of us"-kind of a game, with really really high stakes.
  • by TimeTraveler1884 (832874) on Wednesday March 02, 2005 @09:58AM (#11822609)
For example, on Intel HT processors, all I have to do is write my applications to use multiple threads for CPU-intensive operations and voila! I have almost doubled the speed of my app. Otherwise, a single-threaded app will only use one of the cores.

Often, it's almost trivial to write an app as a multi-threaded app. The only difficult part is when the problem your application is solving does not lend itself well to parallelization. Purely sequential problems don't really benefit from it.

However, there is almost always *something* that can be done in parallel. Even if the problem the app is solving is highly sequential, if you need to read the disk or anything, you can always implement look-ahead and caching code that runs in a different thread. Or whatever. It's rare that you will just crunch numbers and not display them, require data, or send them across a network. Usually, the GUI itself will have its own thread and benefit from a dual-core processor.
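
    To make that point concrete, here is a minimal sketch of splitting a CPU-bound loop across two threads (using C++11's std::thread, which postdates this discussion; the kernel is a made-up stand-in):

        #include <thread>
        #include <vector>

        // Hypothetical CPU-intensive kernel applied to a slice of the data.
        void processRange(std::vector<float>& data, std::size_t begin, std::size_t end) {
            for (std::size_t i = begin; i < end; ++i)
                data[i] = data[i] * data[i]; // stand-in for real number crunching
        }

        void processAll(std::vector<float>& data) {
            std::size_t mid = data.size() / 2;
            // A second core can take the upper half while this thread does the lower.
            std::thread worker(processRange, std::ref(data), mid, data.size());
            processRange(data, 0, mid);
            worker.join(); // wait for the other half before using the results
        }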

  • by bigtallmofo (695287) on Wednesday March 02, 2005 @09:59AM (#11822621)
    Check your licensing agreements before you buy one of these dual-core processors. Make sure that your software vendor isn't going to double the price on you.

Oracle and others [com.com] have announced plans to increase their revenue by charging customers extra for multiple cores in a single processor.
  • OpenGL Performer (Score:2, Interesting)

    by Anonymous Coward on Wednesday March 02, 2005 @10:00AM (#11822629)
This problem has already been solved by OpenGL Performer [sgi.com].

    Applications, even 'games', written using Performer, will immediately benefit from multiple CPUs.
  • Games and multi core (Score:5, Interesting)

    by Anonymous Coward on Wednesday March 02, 2005 @10:03AM (#11822645)
As already mentioned, games already make use of both the GPU and the CPU, so we're fairly used to some multiprocessor concerns.

To say that most PC games are GPU-bound, however, is a mistake - most games I've come across (and worked on, as a games core technology/graphics programmer) are CPU-bound, often in the rendering pipeline trying to feed that GPU.

Anyhow, games are already becoming dual-core aware. Most if not all multiplayer games use threads for their network code - go dual core (or hyperthreading) and you get a performance win. Likewise, most sound systems are multi-threaded, often with a streaming/decompression thread - again a win on multi-core. These days, streaming of all manner of data is becoming more important (our game worlds are getting huge), so again we will be (are) making use of dual core there too.

I personally have spent a fair amount of time performance-tuning our last couple of games (mostly for HT, but the same applies to true dual core) to make sure we get the best win we can. For example, on dual-core machines our games do procedural texture effects on the second core that you just don't get on a single-core machine, and still get a 20%-odd win over single core. I'm sure most software houses take this as seriously as we do. It's very prudent for us to do so - the writing's been on the wall for a while that multiprocessing is the future of top-end performance.

At the end of the day, though, we games developers have little choice but to embrace multi-core architectures and get the best performance we can. We always build software that pushes the hardware to the full extent of its known limits, because that's the nature of the competition.

Just think what the next generation of consoles is going to do for games programmers' general knowledge of concurrent programming techniques. If we're not using all of the cores on our next-gen Xbox or PS3, then our competition will be, and our games will suck in comparison.
  • by TheRaven64 (641858) on Wednesday March 02, 2005 @10:04AM (#11822648) Journal
    The problem is that x86 is a horrible architecture for multithreading. On a standard x86 chip, a context switch is about 100 times more expensive than a function call (on an architecture like PowerPC or SPARC function calls and context switches have similar overheads). Designing a multithreaded game is relatively easy, but would give a huge performance penalty on uni-processor machines.
  • by TomorrowPlusX (571956) on Wednesday March 02, 2005 @10:06AM (#11822656)
Actually, for what it's worth, I'm writing a game in my free time that already splits rendering and physics/game logic into two threads. The idea is that the physics runs while the rendering thread is blocking on an OpenGL vsync. While the behavior is synchronous, it runs beautifully on both single- and dual-processor machines.

In principle this could have detrimental effects on single-processor machines, but my relatively meager 1.3 GHz PowerBook plays beautifully at 30 fps and 60 physics frames per second.

    Anyway, this isn't *really* what's being discussed since the behavior is in fact synchronous, but I'm just saying it ain't hard. I'm surprised more games aren't multithreaded. It's not as if MT is hard. You just have to be careful.
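
    Roughly, such a split can look like this (a sketch with placeholder stubs, not the poster's actual engine code):

        #include <atomic>
        #include <chrono>
        #include <thread>

        std::atomic<bool> running(true);

        void stepPhysics() { /* advance the simulation one fixed tick (stub) */ }
        void renderFrame() { /* draw the scene, then block on vsync (stub) */ }

        int main() {
            // Physics thread: fixed ~60 Hz timestep, independent of render rate.
            std::thread physics([] {
                auto next = std::chrono::steady_clock::now();
                while (running) {
                    stepPhysics();
                    next += std::chrono::milliseconds(16); // ~60 physics frames/sec
                    std::this_thread::sleep_until(next);
                }
            });

            // Render loop on the main thread. On a dual-core machine the physics
            // work genuinely overlaps the time spent blocked waiting for vsync.
            for (int frame = 0; frame < 1000; ++frame)
                renderFrame();

            running = false; // tell the physics thread to stop
            physics.join();
        }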
  • by jbb999 (758019) on Wednesday March 02, 2005 @10:07AM (#11822666)
Do these new chips share the highest-speed cache? I can think of several ways to make use of them without using traditional threading. For example: set up a pool of threads, each of which just reads a function address from a queue of work and then calls that function, waiting when there is no work. The main program can then just push function pointers onto the queue, knowing that a thread will pick up the work.
I'm thinking that instead of writing something like

    for (int i = 0; i < NumberOfModels; i++) {
        UpdateModelAnimation(i);
    }

you could write

    ThreadPool* pool = new ThreadPool();
    for (int i = 0; i < NumberOfModels; i++) {
        pool->QueueAsyncCall(UpdateModelAnimation, i);
    }
    pool->WaitForAllToFinish();

The queueing of work could be made pretty low-overhead, so if there were only a few thousand CPU instructions in each call you'd get a big speedup - but only if each processor already had the data it was working on in cache. If each core has a separate cache this would be a lot less efficient. Does anyone know?
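
    A minimal sketch of such a pool, matching the interface above (using C++11 primitives, which postdate this discussion; shutdown and exception handling are deliberately left out):

        #include <condition_variable>
        #include <functional>
        #include <mutex>
        #include <queue>
        #include <thread>

        class ThreadPool {
        public:
            explicit ThreadPool(unsigned n = std::thread::hardware_concurrency()) {
                for (unsigned i = 0; i < n; ++i)
                    std::thread([this] { workerLoop(); }).detach(); // sketch: no shutdown
            }

            // Queue any callable plus its arguments, as in the usage above.
            template <typename F, typename... Args>
            void QueueAsyncCall(F f, Args... args) {
                {
                    std::lock_guard<std::mutex> lock(m);
                    jobs.push(std::bind(f, args...));
                    ++pending;
                }
                workAvailable.notify_one();
            }

            void WaitForAllToFinish() {
                std::unique_lock<std::mutex> lock(m);
                allDone.wait(lock, [this] { return pending == 0; });
            }

        private:
            void workerLoop() {
                for (;;) {
                    std::function<void()> job;
                    {
                        std::unique_lock<std::mutex> lock(m);
                        workAvailable.wait(lock, [this] { return !jobs.empty(); });
                        job = std::move(jobs.front());
                        jobs.pop();
                    }
                    job(); // run the queued work item
                    std::lock_guard<std::mutex> lock(m);
                    if (--pending == 0) allDone.notify_all();
                }
            }

            std::queue<std::function<void()>> jobs;
            std::mutex m;
            std::condition_variable workAvailable, allDone;
            unsigned pending = 0;
        };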
Not necessarily: since both cores share the same memory controller and memory, latency from core to core is essentially zero. I wonder if someone could write some really smart code that has one core doing all the memory prefetching and the second core doing the actual computations. Could be interesting.
  • by Anonymous Coward on Wednesday March 02, 2005 @10:09AM (#11822692)
A lot of pieces have to be in place first. Multi-core CPUs have to exist first; that's just starting to happen. You have to have decent OS and API support for multiprocessing that exploits it, rather than putting in locks to make it seem single-threaded, which slows things down considerably. Then you get the apps to start using it. Big learning curve on that last bit. Pretty spectacular program crashes when it's done wrong. Lots of gibbage, which makes debugging from a core dump challenging.
  • by JSBiff (87824) on Wednesday March 02, 2005 @10:11AM (#11822710) Journal
    I would like to see a more multi-threaded approach to game programming in general, and not all the benefits would necessarily be about performance.

One thing that has bugged me for a long time about a lot of games (this has particular relevance to multi-player games, but also single-player games to some extent) is the 'game loading' screen. Or rather, the fact that during the 'loading' screen I lose all control of, and ability to interact with, the program.

It has always seemed to me that it should be possible, with a sufficiently clever multi-threaded approach, to create a game engine where I could, for example, keep chatting with other players while the level/zone/map I'm transitioning to is being loaded.

Or maybe I really want to just abort the level load and quit the game, because something important in Real Life has just started occurring and I want to kill the game and move on. With most games, you have to wait until the level is done loading before you can quit out of the game.

In other words, even ignoring performance benefits for a moment, if a game engine is correctly multi-threaded, I could continue to have 'command and control' and chat functionality while the game engine, in another thread, is loading models and textures.
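
    A sketch of that idea (hypothetical names, using C++11's std::async, which is newer than this discussion): the load runs on a second thread/core while the main loop keeps servicing chat and input:

        #include <chrono>
        #include <future>
        #include <string>

        struct Level { /* models, textures, ... */ };

        // Stub: the slow disk reads and decoding happen here, off the main thread.
        Level loadLevel(const std::string& name) { return Level{}; }

        void pumpChatAndInput() { /* stub: keep chat, input and a Quit option alive */ }

        void loadingScreen() {
            std::future<Level> pending =
                std::async(std::launch::async, loadLevel, std::string("zone42"));

            // The main thread stays responsive while the level streams in.
            while (pending.wait_for(std::chrono::milliseconds(16)) !=
                   std::future_status::ready) {
                pumpChatAndInput();
            }
            Level level = pending.get(); // hand the loaded level to the game proper
            (void)level;
        }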
  • by Anonymous Coward on Wednesday March 02, 2005 @10:13AM (#11822722)
    They haven't released any machines with dual core CPUs, because none are available.

    None are available?

IBM's had multicore POWERs forever, at least since 2002 [top500.org], and I think before. The G4 and G5 have both had the technical capacity to be made in a multiple-core configuration. I think Apple isn't interested in dual core because getting dual-core PPCs would have been as simple as asking IBM "hey, could you start making us some dual-core PPCs", but they haven't bitten.
  • by barrkel (806779) on Wednesday March 02, 2005 @10:17AM (#11822749) Homepage
    I believe that we're going to see a performance plateau with processors and raw CPU power for the next 5 years or so.

The only way CPU manufacturers are going to get more *OPS in the future is with many cores, and that's going to mean clock speeds (GHz-wise) the same as or slower than today's. To get programs to run faster under these circumstances, you need some kind of explicitly parallel programming.

We haven't seen the right level of parallelism yet, IMHO. Unix started out with process-level parallelism, but it looks like thread-level parallelism has beaten it, even though it is much more prone to programmer errors.

On the other end of the scale, EPIC architectures like Itanium haven't been able to outcompete older architectures like x86, because explicit parallelism can be made implicit with clever run-time analysis of code. Intel (and, of course, AMD) are their own worst enemy on the Itanium front: all the CPU hardware prediction etc. removes the benefit of the clever compiler that EPIC needs.

Maybe some kind of middle ground can be reached between the two. Itanium instructions work in triples, and you can effectively view the instruction set as programming three processors working in parallel but with the same register set. This is close (but not quite the same) to what's going to be required to efficiently program multi-core CPUs, beyond simple SMP-style thread-level parallelism. Maybe we need some kind of language with concurrency built in (something akin to Concurrent Pascal, but much more up to date), or one that has no shared data and can be decomposed and analyzed with complete information via the lambda calculus. I'm thinking of the functional languages, like ML (consider F#, which MS Research is working on) or Haskell.

With a functional language, different cores can work on different branches of the overall graph and resolve them independently, before they're tied together later on.

    It's hard to see the kind of mindset changes required for this kind of thinking in software development happening very quickly, though.

    We'll see. Interesting times.
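
    As a loose analogy in imperative terms (not the functional approach itself), independent branches of a pure computation can be farmed out to another core, e.g. with C++11's std::async:

        #include <future>

        // A pure function: no shared mutable state, so its two recursive
        // branches can be evaluated on different cores and joined afterwards.
        long fib(int n) {
            if (n < 2) return n;
            if (n < 25) // small branches aren't worth a thread; stay sequential
                return fib(n - 1) + fib(n - 2);
            auto left = std::async(std::launch::async, fib, n - 1);
            long right = fib(n - 2); // this core evaluates the other branch
            return left.get() + right;
        }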
  • by jest3r (458429) on Wednesday March 02, 2005 @10:18AM (#11822755)
AMD will be releasing quad-core chips as early as 2007, according to Ars Technica. Where does that leave dual core?

http://arstechnica.com/news.ars/post/20040813-4099.html
  • by Rhys (96510) on Wednesday March 02, 2005 @10:20AM (#11822773) Homepage
    Beyond the GPU, any intensive computation application gets benefits from the second CPU.

Our local (to UIUC) parallel-software master, working on the Turing Xserve cluster, is pulling about 95% (I think, don't quote me) of theoretical peak performance in Linpack running on one CPU of one Xserve. Bring that up to both CPUs in one box, and he said it dropped to around 50%.

    Why? The OS has to run somewhere. When it's running, that processor is stuck with it. The other processor is stuck waiting for the OS, and then things can pick up again.

Now, we haven't yet finished tuning the systems to make the OS do as little as possible (they're still running GUIs, so we can remote-desktop into them, among other things). But still, that's quite a performance hit!

He said two machines running one CPU each over Myrinet were still at 90%-ish of theoretical peak.

So can we quit rehashing this stupid topic every time dual-core CPUs come up? Yes, it'll help. No, it won't double your game performance (unless it's written for a dual-core CPU), and it probably won't even double it then, because there's still TeamSpeak/Windows/AIM/virus scan/etc. running and needing CPU time.
  • by nounderscores (246517) on Wednesday March 02, 2005 @10:29AM (#11822865)
    In other words, even ignoring performance benefits for a moment, if a game engine is correctly multi-threaded, I could continue to have 'command and control', and chat, functionality while the game engine, in another thread, is loading models and textures.

    That would put the pressure back where it should be - on the level designers - to make sure that each segment was challenging enough so that a player couldn't pass through two loadzones simply by running so fast that the first zone hasn't fully loaded yet and wind up in a scary blank world full of placeholder objects.
  • by amorsen (7485) <benny+slashdot@amorsen.dk> on Wednesday March 02, 2005 @10:37AM (#11822928)
    (on an architecture like PowerPC or SPARC function calls and context switches have similar overheads).

    I have no idea where you got that from. x86 is a relatively fast architecture when it comes to context switches. SPARC has the huge register file to save and reload.

    I can't find recent results though. If anyone has recent comparative lmbench numbers I'd like to see them.

  • by Anonymous Coward on Wednesday March 02, 2005 @10:49AM (#11823026)
    Unless you have an inside track on Apple's plans that no one else has, you cannot know what they are intending. Apple is notoriously tight-lipped. That said, Apple's dual chip implementations are meant for the high-end, too. Yes, Apple's dual CPU machines generally cost much less than the Dell equivalents, but they are still intended for the high-end user.

Still, I am curious why everyone mentions Intel and AMD efforts toward dual-core processors and never says a word about the fact that IBM has had dual-core processors for quite a while now. Yes, they are server processors, but IBM, at least, has working units today. And they have announced intentions to bring that technology to the desktop. Hmmmm... very strange. Well... maybe not; /. does tend to be a Microsoft/Linux site more than a general high-end computing site.
  • by swb (14022) on Wednesday March 02, 2005 @11:06AM (#11823185)
    Where are AMD's dual core chips? Sure as hell can't buy them today...

    I had a vendor's SE tell me that AMD's dual core chips are "practically sitting in boxes at a warehouse" so that the day Intel starts shipping developer samples they can start shipping actual products to end users immediately, giving them a huge head start in terms of marketing and, if you believe they've already been manufacturing them, the ability to discount them faster than Intel can.

    I think that's a strange strategy, but I was also told that AMD has gotten burned by being too far ahead of the curve before (Athlon?); apparently having Intel do it, too, lends credibility and mindshare to technologies, enabling greater acceptance of an AMD solution.

Of course this is conspiracy theory and marketing speak from an SE, so who knows, but it's not completely implausible. Having a huge supply of readily-MB-compatible dual-core CPUs you can start shipping immediately, while your competitor's product is just beginning production (and requires new mainboard designs to boot), could let you steal their marketing hoopla for your _available_ product.
  • by shapr (723522) on Wednesday March 02, 2005 @11:27AM (#11823460) Homepage Journal
This is discussed in great detail in this thread on lambda-the-ultimate.org: The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software [lambda-the-ultimate.org]. The summary as I see it is:
    • declarative parallelism will always scale better than threads or whatever else
• micro-optimizations will always be faster than declarative parallelism
    Manual parallelism won't scale well from one core to sixty-four cores, but will be faster in static situations like running on one Cell CPU in the PS3 where the configuration is known at design time of the app.
This is the same trade-off as manual memory allocation versus garbage collection. Garbage collection is easier and more automatic than manual memory control in C, but if you put enough effort in, a C program will be more efficient than a GC-using program.
So the essence of the answer is that declarative parallelism gives you development speed, but manual parallelism gives you execution speed. You choose what you want.
    I have a two CPU machine right now, and I'm very much looking forward to the rumored SMP version of Haskell that uses Software Transactional Memory [microsoft.com]. That's gonna rock!
  • by ivan256 (17499) * on Wednesday March 02, 2005 @11:53AM (#11823751)
    The cpu is left to do tasks such as opponent AI

    Funny that you call that a "basic task."

Game AI can easily use all the computing power you can throw at it. Look at how much CPU it takes to beat the best players at chess... And that has significantly less potential computational strategy involved than, say, a realistic tactical war sim...

The problem is that most current games are tests of reflexes and memory. Few games employ adaptive strategy, and of those that do, I can't think of any that use it in real time. That's probably why you see AI as a simple task... In reality, game AI is simple because computers are too slow for it to be complex.
  • by Ulric (531205) on Wednesday March 02, 2005 @12:01PM (#11823839) Homepage
    It is small for certain applications.

    Quoting Twelve Ways to Fool the Masses When Giving Performance Results on Parallel Computers [pdc.kth.se]:

    2. Present performance figures for an inner kernel, and then represent these figures as the performance of the entire application.

    It is quite difficult to obtain high performance on a complete large-scale scientific application, timed from beginning of execution through completion. There is often a great deal of data movement and initialization that depresses overall performance rates. A good solution to this dilemma is to present results for an inner kernel of an application, which can be souped up with artificial tricks. Then imply in your presentation that these rates are equivalent to the overall performance of the entire application.

    4. Scale up the problem size with the number of processors, but omit any mention of this fact.

    Graphs of performance rates versus the number of processors have a nasty habit of trailing off. This problem can easily be remedied by plotting the performance rates for problems whose sizes scale up with the number of processors. The important point is to omit any mention of this scaling in your plots and tables. Clearly disclosing this fact might raise questions about the efficiency of your implementation.

    8. If MFLOPS rates must be quoted, base the operation count on the parallel implementation, not on the best sequential implementation.

We know that the MFLOPS rates of parallel codes are often not very impressive. Fortunately, there are some tricks that can make these figures more respectable. The most effective scheme is to compute the operation count based on an inflated parallel implementation. Parallel implementations often perform far more floating point operations than the best sequential implementation. Often millions of operations are masked out or merely repeated in each processor. Millions more can be included simply by inserting a few dummy loops that do nothing. Including these operations in the count will greatly increase the resulting MFLOPS rate and make your code look like a real winner.

    10. Mutilate the algorithm used in the parallel implementation to match the architecture.

    Everyone is aware that algorithmic changes are often necessary when we port applications to parallel computers. Thus in your parallel implementation, it is essential that you select algorithms which exhibit high MFLOPS performance rates, without regard to fundamental efficiency. Unfortunately, such algorithmic changes often result in a code that requires far more time to complete the solution. For example, explicit linear system solvers for partial differential equation applications typically run at rather high MFLOPS rates on parallel computers, although they in many cases converge much slower than implicit or multigrid methods. For this reason you must be careful to downplay your changes to the algorithm, because otherwise the audience might wonder why you employed such an inappropriate solution technique.

  • Re:Boon for Game AI (Score:3, Interesting)

    by tartley (232836) <user tartley at the domain tartley.com> on Wednesday March 02, 2005 @12:06PM (#11823899) Homepage
While I agree very strongly with the sentiment that improvements in games have to go beyond tarting up graphics, if considered carefully this exposes a fundamental problem.

Any aspect of a game may be programmed to scale with the hardware the game is run on (e.g. graphics get more detailed, framerates improve, physics is more realistic, AI gets smarter).

    However, the problem here is that if these improvements to the game are in any way substantial rather than superficial - if they actually affect the gameplay in any way - then users playing the game on a high-end machine will end up playing a substantially different game than users on a low-end machine.

In the case of more detailed graphics or better framerates, the changes are superficial enough that this does not matter. But for anything deeper - such as AI - the developer has to ask whether it is really desirable for the intelligence of in-game aliens to depend on the PC the game is run on. Will a low-end PC make the aliens so stupid that the game is substantially easier? Will a high-end PC result in aliens that consistently frustrate the player?

To fix this, developers might consider preventing the software from running on systems deemed 'too slow', or they might disable features such as 'AI scaling' on systems that are 'too fast' - ironically, these desperate measures would of course be in direct opposition to the original intent of making the game scale well across a wide variety of hardware.

  • by adam31 (817930) <adam31@NOSpaM.gmail.com> on Wednesday March 02, 2005 @12:44PM (#11824326)
    But there are some tasks that can be done by both CPU and GPU but are generally assigned to the GPU. For instance, you can generate a stencil shadow volume in a vertex shader... it's just very wasteful of the GPU. You can also animate characters on the GPU, but they have to be retransformed to do multi-pass effects. So if the game is GPU-bound, a good idea is moving these tasks to the CPU.

Honestly, working on a dual-core CPU, you could create two threads: one that just does character animation and silhouette/shadow-volume generation, and another that does physics/AI. You'd have very well-balanced processor usage and better GPU efficiency (depending on the game, of course).

  • by edxwelch (600979) on Wednesday March 02, 2005 @10:30PM (#11830461)
If I remember correctly, Intel's dual-core debut is a workstation processor, while AMD will have their Opteron dual cores first.
Dual-core processors make more sense in a server than in a workstation.
