Continuing the series, we now turn to the problems concurrency brings. I think anyone who has experience in the area will agree with me in saying: concurrency is hard. The bugs can be vicious, since they are often timing related. It is also possible to run into situations that really are not bugs but still substantially interfere with enjoyment of the game, for example overloading one part of the system so that it becomes a bottleneck.
It is hard to predict which part of the program will be a bottleneck, and programmers are infamous for guessing incorrectly. Worse, these problems can be extraordinarily transient, depending on the interactions of competing threads, up to and including classic concurrency failures such as deadlock and priority inversion.
Additionally, dealing with concurrency means making a number of trade-offs at a high level: choosing a strategy and coming up with an architecture that embodies those choices. Which processes and data live on which thread? How is communication handled between the different threads of execution?
Ultimately, moving to a concurrent system is not an instantaneous win like getting a processor that executes more instructions per second. There are inefficiencies in concurrent systems, like synchronization and buffering, that simply don't have to be dealt with in a single-threaded context.
How a game approaches concurrency is another challenge. One obvious approach is to use threads as mini "servers." The best candidates for servers are tasks that manage output that doesn't have to be fed back into the game immediately: a "sound" server, a "graphics" server, or a "network" server (for outgoing traffic). Finding high-level tasks for each server is paramount: building up the message for a task and submitting it to the server must be cheaper than performing the task directly, or there is no win to realize.
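As a minimal sketch of the server idea, here is a hypothetical "sound server": a worker thread draining a queue of fire-and-forget commands. The names (`SoundServer`, `PlaySound`) are illustrative, not from any real engine; the point is that the game thread only pays for building and enqueueing a small message, never for the work itself.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical "sound server" thread: consumes commands from a queue.
// The game thread never waits on the sound work itself.
class SoundServer {
public:
    SoundServer() : worker_(&SoundServer::Run, this) {}
    ~SoundServer() { Stop(); }

    // Cheap for the caller: build a small message and push it.
    void PlaySound(const std::string& name) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(name);
        }
        cv_.notify_one();
    }

    // Drain remaining commands and join the worker thread.
    void Stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

    std::vector<std::string> played;  // stands in for real audio output

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;  // done_ must be true here
            std::string cmd = queue_.front();
            queue_.pop();
            // A real server would mix/submit audio here (possibly with the
            // lock released); this sketch just records the command.
            played.push_back(cmd);
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so all state exists before Run()
};
```

Note that the condition variable lets the server sleep when idle instead of spinning, which matters on a console where every cycle is contended.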
This approach works fine for the Xbox 360, but not for the PS3. The PS3 contains several special-purpose processors, the SPUs, which essentially require small programs that DMA memory in and out to produce their output. One approach to utilizing the SPUs is to build a job: a batch of data bundled with a program, executed on whichever SPU in the pool is free. The coordination effort involved in teasing out the concurrency here is significantly more difficult. For example, one might have a job that evaluates animation on a mesh. At some point there needs to be a return trip: for the job to do useful work, it must finish and its result must be fed into another process to cause the desired side-effect. That wasn't necessary in the server case; in fact, the servers were explicitly designed to avoid round-trip communication. The limiting factor on performance is that the driver thread must not spend its time waiting for the job to complete. That is not an easy feat, since the whole point of sending the work to another processor was that it takes a while. Finding useful work to do while the job completes requires some significant cleverness.
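The kick-then-overlap pattern can be sketched in portable C++, with `std::async` standing in for submission to the SPU pool. Everything here is hypothetical (`SkinningJob`, `KickJob`, the kernel itself); the shape to notice is that the driver kicks the job, does other work, and only then pays for the round trip by collecting the result.

```cpp
#include <future>
#include <numeric>
#include <vector>

// Hypothetical "job": a batch of input data plus the kernel that runs on it.
// On the PS3 this bundle would be DMA'd to an SPU; std::async stands in
// for the SPU pool here.
struct SkinningJob {
    std::vector<float> weights;   // input batch
    std::future<float> result;    // the "return trip": filled on completion
};

// Stand-in for evaluating animation on a mesh.
float SkinKernel(const std::vector<float>& weights) {
    return std::accumulate(weights.begin(), weights.end(), 0.0f);
}

// Kick the job off to the pool and return immediately.
SkinningJob KickJob(std::vector<float> weights) {
    SkinningJob job;
    job.weights = std::move(weights);
    job.result = std::async(std::launch::async, SkinKernel, job.weights);
    return job;
}
```

The driver thread's job is then to keep busy between kick and collect:

```cpp
SkinningJob job = KickJob({0.25f, 0.5f, 0.25f});
// ...driver finds other useful work here instead of blocking...
float blended = job.result.get();  // round trip: result feeds the next stage
```

The hard part the article describes is exactly that middle comment: scheduling enough independent work that `get()` almost never actually waits.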
Getting peak performance out of the PS3 is going to take a while. Unfortunately, PS3 consoles aren't exactly selling like hot-cakes, which brings up a dilemma: good games will drive PS3 sales, but good games will take a long time to develop. Will the PS3 survive the lull? Only time will tell, but Sony does face a chicken-and-egg problem. It is hard for 3rd parties to justify large development budgets for a console that doesn't control a large percentage of the market. However, it is hard to control a large percentage of the market without the games to drive console sales (which is why a good launch is crucial). At this point, Sony's best bet is to drive PS3 adoption themselves with 1st- and 2nd-party titles. Sony might get lucky, but counting on luck is probably not the best business strategy.
In conclusion, concurrency is hard. It has a large impact on architecture, forcing migration to or adoption of new technologies, and it opens up new classes of bugs that can be hard to kill. These forces conspire to push development targets out, making games more costly to develop. We have looked at two architectures one might adopt: "server" based and "job" based. A server-based architecture will work well on the Xbox 360, but not so well on the PS3, because the SPUs are not symmetric with the main core. A job-based architecture will work on both platforms, and on that ground the PS3 should start outperforming the Xbox 360 because of its higher theoretical peak performance, although a job-based architecture is harder to implement. Sony in particular will need to justify this higher development cost; their best remedy is killer 1st- and 2nd-party titles, which will drive extra console sales and justify the additional expense to 3rd parties.