Special architectures: PlayStation 2
EE6442 Assignment 3
Fan Zhang
University of Limerick
MEng. Computer and Communication Systems
ID: 0526401
Abstract: I am a video game fan, but not an addict. Because this topic attracted me, I chose it for the third assignment of the Processor Architecture module. I have played video games since I was five. While playing, I found the game console itself a mystery: how does it turn our actions on the controller into such amazing pictures on the TV? Although I have read a lot about consoles in game magazines, I admit that I did not try to find the answer until I came across this topic. This is a great chance for me to answer the question myself. At the same time, I want to present this paper to you, and I hope it will be fun.
This paper concerns the architectural differences between the PC and the PlayStation 2. The purposes of the PC and the PlayStation 2 are different (or perhaps I should say the purposes of the PC include those of the PlayStation 2), and different objectives lead to different design orientations. I think the PlayStation 2 is a good game console for this comparison. First, a lot of documentation about the PlayStation 2’s Emotion Engine can be found on the Internet. Second, as far as I know, the PlayStation 2’s design has two straightforward purposes, 3D games and multimedia, and the console seems to have been born for them. Current PCs also do very well in these two areas, but the cost is a never-ending upgrade of hardware. The PlayStation 2 is a product born 5 years ago. Today tens of millions of people still enjoy PlayStation games at home, whereas 5-year-old PCs have already been retired.
Keywords: PC, processor, video card, system controller, bus, Emotion Engine, Vector Unit, Graphics Synthesizer.
1. INTRODUCTION
1.1 The evolution of game performance
Computer technology has evolved rapidly in recent years. Figures 1.1 to 1.4 show how greatly game performance has changed in almost twenty years, on both PCs and game consoles.
Figure 1.1: Final Fantasy I (FC) 1987 by SQUARE
Figure 1.2: Final Fantasy XII (PlayStation 2) 2006 by SQUARE ENIX
Figure 1.3: Prince of Persia (PC) 1989 by Broderbund
Figure 1.4: Prince of Persia: The Two Thrones (PC) 2006 by Ubisoft
The screenshots above are evidence of technical development. Over these twenty years computers have become orders of magnitude faster than they were in the 1980s, while the cost of buying a computer has fallen. However, the development orientations of the PC and of game consoles have not changed much during these 20 years. My point here is that game consoles and PCs are different, even though both can be classified as ‘computers’ and even though a PC covers all the functions of a game console (although their software is not compatible). The differences span many areas: the architecture, the media, the model for producing and selling software, and the customers.
1.2 Why are they different?
I would say it is because of their distinct purposes. Of course a PC can play games and can do anything that a game console does. Conversely, the PlayStation 2, the most famous game console in the world, can now connect to the Internet, print documents, and even run a complete Linux operating system. But the PC is general purpose, which means it has to take care of a great many things and be good at almost all of them. For instance, a PC should be good at text processing, games, printing, and Internet connectivity, and a huge number of protocols have been defined for it; a PC also needs to be compatible with all components and software designed and implemented to current standards. Game consoles are different. They need only care about games, which means most of the design is flexible. The standards that a PC has to obey do not affect a console at all. No extra cost, no burden, only a focus on games.
Figure 1.5: Sony’s PlayStation 2
1.3 Multimedia
Since the late 20th century, multimedia has become one of the main purposes of the PC, and new technology for enhancing multimedia processing on the PC has been developed accordingly. However, the transmission-speed bottleneck has not changed much. In 1997 Keith Diefendorff and Pradeep K. Dubey published an article named “How Multimedia Workloads Will Change Processor Design”. They argued that dynamic media processing would be a big challenge for the processor architectures of the time, and that it would force fundamental changes in processor design.
Before the Pentium 4, processors shared the same characteristic: their data cache was big, but their instruction cache was relatively small. This suited most uses, for instance word processing, e-business, and stock information processing. However, Diefendorff did not think it was useful or efficient enough for multimedia processing: multimedia data streams in and out constantly, so there is no need to set aside a huge block of storage for information that rarely has a chance of being reused. On the other hand, multimedia processing requires more calculation than other workloads. So, for multimedia calculation, the instruction cache should become larger, and both caches need higher transfer speeds as well. We shall see that this prediction has largely been realized in both the Pentium 4 and the PlayStation 2.
1.4 The purpose and the brief layout of the article
This paper mainly discusses the architectural differences between the PC and the PlayStation 2, the most famous game console in the world. It covers several aspects: the overall architecture, the CPU, the motherboard, and the graphics. The second section compares the overall architectures. The third section discusses and compares two processors, Intel’s Pentium 4 and the PlayStation 2’s Emotion Engine. The fourth section compares buses and caching. The fifth section deals with the graphics devices of the PC and the PlayStation 2, the video card and the Graphics Synthesizer. The conclusion is given in the last section.
2. WHOLE ARCHITECTURE COMPARISON
2.1 PC architecture
The basis of the PC can be traced back to the 1940s. John von Neumann (1903-57), who laid out a very basic structure for the computer, wrote his name into history forever. The architecture of the modern PC is still mainly based on his design. Let’s look at a diagram of the PC architecture as the basis for illustrating later how a PC works when running games.
Figure 2.1: PC architecture
Different regions in the diagram run at different clock speeds. We can see that the system controller is the heart of the whole PC system. It carries data between the processor and the other components of the PC over bridges. A bridge connects interfaces and buses. Two kinds of bridge exist in a PC, the north bridge (the system controller) and the south bridge (the bus bridge). The system controller provides an interface between the processor and external devices, both memory and I/O, and works with the processor to perform bus cycles.
From the diagram we can see that the system controller makes the whole diagram complicated. This is because the system controller has to adjust the bus cycles between the processor and the external device that it wants to access. Briefly, the PC’s working procedure can be described as follows:
PC executes commands → accesses data with the help of the system controller → returns the execution result → executes commands → …
The system controller also controls DMA (Direct Memory Access), which is the ability to transfer data between memory and I/O without processor intervention.
2.2 PlayStation 2 Overview
Let’s first look at the architecture of the PlayStation 2.
Figure 2.2: The architecture of PlayStation 2
The PlayStation 2 is composed of a Graphics Synthesizer, the Emotion Engine, the I/O Processor (IOP), and a Sound Processor Unit (SPU). The IOP controls peripheral devices such as the controller and the disk drive and detects controller input, which is sent to the Emotion Engine. According to this signal, the Emotion Engine updates the internal virtual world of the game program within each video frame. Many physical equations need to be solved to determine the behavior of the characters in the game world. Once this is determined, the calculated object positions are transformed according to the viewpoint, and a drawing command sequence (display list) is generated. When the Graphics Synthesizer receives the display list, it draws the primitive shapes, based on connected triangles, into the frame buffer. The contents of the frame buffer are then converted from digital to analogue, and the video image appears on the TV. Finally, the Sound Processor does the job of a sound card: it outputs 3D digital sound using AC-3 and DTS. This is an overview of the PlayStation 2’s working procedure.
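To make this working procedure concrete, here is a minimal sketch of one video frame in Python. It is only an illustrative model: the function names, the one-dimensional game world, the movement speed, and the 1/60 s frame time are all assumptions made for the sketch and have nothing to do with Sony’s real programming interface.

```python
# Illustrative model of one PlayStation 2 video frame.
# All names and numbers are hypothetical, not a real PS2 API.

def read_controller(pressed):
    """IOP stage: turn raw controller input into a signal for the EE."""
    return {"dx": 1.0 if pressed else 0.0}

def update_world(position, signal, dt):
    """EE stage: advance the (1-D) game world by one frame."""
    speed = 60.0  # assumed units per second
    return position + signal["dx"] * speed * dt

def build_display_list(position, camera):
    """EE stage: transform to view space and emit drawing commands."""
    return [("triangle", position - camera)]

def rasterize(display_list, frame_buffer):
    """GS stage: draw each primitive into the frame buffer."""
    for kind, x in display_list:
        frame_buffer.append((kind, x))
    return frame_buffer

def run_frame(position, pressed, camera=0.0, dt=1.0 / 60.0):
    """IOP -> EE -> display list -> GS, once per video frame."""
    signal = read_controller(pressed)
    position = update_world(position, signal, dt)
    frame = rasterize(build_display_list(position, camera), [])
    return position, frame
```

Calling `run_frame` repeatedly with the position it returns plays the role of the game loop; the digital-to-analogue conversion and the sound path are left out.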
2.3 Comparison
Comparing Figure 2.1 and Figure 2.2, we can see that the PC’s architecture is far more complex than that of the PlayStation 2. There are several reasons. One is that the PC has more devices to take care of. The PlayStation 2’s I/O processor plays the same role as the system controller in the PC: its chief responsibility is to manage the devices attached to the PS2. These are two PlayStation controller ports, a MagicGate-compatible memory card interface, two USB ports, and a full-speed 400Mbps IEEE 1394 port, far fewer than on a PC. The other main reason is that processor speed has increased much faster than that of other devices, and the devices themselves have sped up unevenly, which the PC’s system controller has to reconcile. In general, the PlayStation 2 has a simpler architecture and fewer components and devices.
3. ALL ABOUT PROCESSORS
3.1 Pentium 4 Processor
The Pentium 4 adopts Intel’s 7th generation architecture, shown in detail in the diagram below. The PlayStation 2 was launched on 4th March 2000, when the Pentium 4 had not yet been released, so the comparison is unfair to the PlayStation 2. However, the Pentium 4 is the most popular processor at present, and the PlayStation 2 is the most popular game console worldwide, so I compare them anyway.
Figure 3.1: Pentium 4 processor architecture
Since the previous generation architecture (Pentium III), Intel has used a hybrid CISC/RISC design. The processor has to accept CISC instructions because it has to be compatible with all existing software (most software is written using CISC instructions). Internally, however, the Pentium 4 processes RISC-like instructions, while its front end accepts only CISC x86 instructions. A decoder is in charge of the translation. Intel provides no path for programs to use pure RISC instructions.
CISC instructions are rather complex, and decoding one may cost several clock cycles. In the Pentium III era, when a CISC instruction needed to be processed several times (e.g. in a small loop), the decoder had to decode the instruction again and again. The Pentium 4 improves this situation by replacing the Pentium III’s L1 instruction cache with a Trace Cache, which is placed behind the decoder and so holds instructions that have already been decoded. The trace cache ensures that the processor pipeline is continuously fed with instructions, decoupling the execution path from possible stalls in the decoder units. After the decoding stage, the Renamer/Allocator unit maps the 32-bit registers named by the program onto the 128 internal registers available. This allows an instruction to run at the same time as another instruction that uses exactly the same standard register, or even out of order, i.e. the second instruction can run before the first even if they touch the same register.
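The benefit of putting a cache behind the decoder can be sketched with a toy model that simply counts decoder runs for a small loop. The three-instruction loop body, its addresses, and the dictionary standing in for the trace cache are assumptions for illustration; a real trace cache stores traces of microinstructions and has limited capacity.

```python
def decode_count(loop_body, iterations, use_trace_cache):
    """Count how many times the decoder runs for a small loop."""
    trace_cache = {}  # address -> already decoded micro-ops
    decodes = 0
    for _ in range(iterations):
        for address, instruction in loop_body:
            if use_trace_cache and address in trace_cache:
                continue  # hit: the decoded micro-ops are reused
            decodes += 1  # miss (or no cache): decode the instruction
            if use_trace_cache:
                trace_cache[address] = "uops(" + instruction + ")"
    return decodes

# Hypothetical 3-instruction loop executed 1000 times.
loop = [(0x100, "add"), (0x102, "cmp"), (0x104, "jne")]
without_cache = decode_count(loop, 1000, use_trace_cache=False)  # 3000 decodes
with_cache = decode_count(loop, 1000, use_trace_cache=True)      # 3 decodes
```

Without the cache every pass through the loop pays the decode cost again; with it, each instruction is decoded only once.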
The other big advance of the Pentium 4 is SSE2, the new double-precision Streaming SIMD Extensions. The 128-bit SIMD package offers 144 new instructions. Intel provides two SIMD units for the Pentium 4 (64-bit each), one for instructions and the other for data. Recalling Section 1.3, the Pentium 4’s 128-bit SIMD extension is Intel’s effort to meet the future requirements of multimedia applications. Because of it, video and game performance has been greatly strengthened.
The Pentium 4’s pipeline is its most controversial feature. When it was announced, the 20-stage pipeline surprised a lot of people. Intel chose it because a pipeline with more stages allows a higher processor clock rate. However, whenever the pipeline does not contain the instructions the processor needs, refilling it means a long wait. In fact, the Pentium 4 is faster than the Pentium III only because it works at a higher clock rate. At the same clock rate, a Pentium III would be faster than a Pentium 4.
Figure 3.2: Pentium 4 Pipeline
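The trade-off of a deep pipeline can be shown with a rough back-of-the-envelope model. It assumes an ideal rate of one instruction per cycle and a refill penalty equal to the pipeline depth every time a branch is mispredicted. The 10-stage comparison pipeline, the clock rates, the branch fraction, and the misprediction rate are illustrative assumptions, not measurements.

```python
def time_per_instruction_ns(clock_ghz, stages, branch_fraction, mispredict_rate):
    """Average time per instruction in nanoseconds under a simple refill model."""
    # Ideal CPI of 1 plus the refill penalty (one cycle per stage)
    # paid on every mispredicted branch.
    cpi = 1.0 + branch_fraction * mispredict_rate * stages
    return cpi / clock_ghz

# Same clock: the shorter pipeline wins (about 1.2 ns against about 1.4 ns).
short_same_clock = time_per_instruction_ns(1.0, 10, 0.2, 0.1)
long_same_clock = time_per_instruction_ns(1.0, 20, 0.2, 0.1)

# The deep pipeline pays off only once its clock is raised (about 0.7 ns).
long_fast_clock = time_per_instruction_ns(2.0, 20, 0.2, 0.1)
```

Under these assumptions the 20-stage design is slower at equal clock rate and faster only when its clock is raised, which is the behavior described above.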
The scheduler is the heart of the out-of-order engine in the Pentium 4. It organizes all microinstructions (in other words, uops) and dispatches them in a suitable order to the execution engines.
Figure 3.3: Pentium 4 scheduler
Four kinds of scheduler deal with different kinds of microinstruction to keep the processor busy all the time. The ports are the Pentium 4’s dispatch ports. If you read the diagram carefully, you can see that Port 0 and Port 1 are each assigned a floating-point scheduler: Port 0 is assigned the Simple FP Scheduler (simple floating-point microinstructions) and Port 1 the Slow / Floating Point Scheduler (complex floating-point microinstructions). Port 0 and Port 1 also accept microinstructions from the Fast Scheduler. Because a floating-point microinstruction may run for several clock cycles, the Pentium 4’s scheduler sends the microinstruction to Port 1 if Port 0 is busy, and vice versa. Port 2 is in charge of Load microinstructions and Port 3 deals with Store microinstructions.
3.2 PlayStation 2’s Emotion Engine
The PlayStation 2’s designers focused deeply on 3D games. At the same time, they had to ensure the console was completely compatible with DVD video. To run 3D games well, the PlayStation 2 has to have excellent video and audio capabilities. The Emotion Engine acts as the geometry calculator (transforms, translations, etc.) and the behavior/world simulator (enemy AI, calculating the friction between two objects, calculating the height of a wave on a pond, etc.). It is also in charge of the secondary job of miscellaneous functions (program control, housekeeping, etc.). In general, the Emotion Engine is a combination of a CPU and a DSP.
Figure 3.4: The architecture of Emotion Engine
The basic architecture of the Emotion Engine is shown in Figure 3.4. It is composed of the following units:
(1) MIPS III CPU core
(2) Vector Unit (two vector units, VU0 and VU1)
(3) Floating-Point Coprocessor (FPU)
(4) Image Processing Unit (IPU)
(5) 10-channel DMA controller
(6) Graphics Interface Unit (GIF)
(7) RDRAM interface and I/O interface.
You may have noticed something interesting in the diagram. First, inside the Emotion Engine there is a main bus that connects all components for data communication. However, between the MIPS III core and the FPU, between VU0 and the MIPS III core, and between VU1 and the GIF, there are dedicated 128-bit buses. Second, VU0 and VU1 have a particular relationship, shown in the diagram. This design greatly enhances the flexibility of programming the Emotion Engine.
The MIPS III core connects directly to the FPU and VU0 over the dedicated buses. The MIPS III pipeline has 6 stages. The MIPS III core is the primary, controlling part, and VU0 and the FPU are its coprocessors. They compute behavior and emotion synthesis, physical calculations, etc. For example, in a football game, the trajectory of the ball, the wind effect, and the friction between the ball and the ground need to be calculated. At the same time, the AI of 21 players needs to be run (the last player is controlled by the user), including their activity, the lineup, etc. After the calculation, the MIPS III core sends the display list to the GIF.
VU1 has a dedicated 128-bit bus to the GIF, which is the interface between the GS (Graphics Synthesizer) and the EE (Emotion Engine). VU1 can independently generate a display list and send it to the GIF over its dedicated bus. Together these relationships form a dedicated yet flexible structure. The final goal of the EE is to generate display lists and send them to the GS. The programmer can either program the two groups (MIPS III + FPU + VU0, and VU1 + GIF) separately and send their display lists in parallel, or deliberately make the MIPS III + FPU + VU0 group a “coprocessor” of VU1, for instance generating physical and AI information and sending it to VU1, which then produces the corresponding display list. The diagram below shows the two programming methods.
(a) (b)
Figure 3.5: Two programming methods of Emotion Engine
The MIPS ISA is an industry-standard RISC ISA that is found in applications almost everywhere. Sony’s MIPS III implementation is a 2-issue design that supports multimedia instruction set enhancements. It has
(1) 32 128-bit general purpose registers
(2) 2 64-bit integer ALUs
(3) 1 Branch Execution Unit
(4) 1 FPU coprocessor (COP1)
(5) 1 vector coprocessor (COP2)
What I really want to cover are the two vector processors, VU0 and VU1. They are the main reason why the PlayStation 2 is powerful.
VU0 is a 128-bit SIMD/VLIW design. Its main job is to act as a coprocessor of the MIPS III core. It is a powerful floating-point coprocessor that deals with the complex computation of emotion synthesis and physical calculation.
The instruction set of VU0 consists of 32-bit MIPS COP instructions, mixing integer, FPU, and branch instructions. The VIF is in charge of unpacking the floating-point data on the main bus into 4 × 32-bit words (w, x, y, z) for processing by the FMACs. VU0 also has 32 128-bit floating-point registers and 16 16-bit integer registers.
VU0 is pretty strong. It is equipped with 4 FMACs, 1 FDIV, 1 LSU, 1 ALU and 1 random number generator. An FMAC can do a floating-point multiply-accumulate or a minimum/maximum in 1 cycle; the FDIV can do a floating-point divide in 7 cycles, a square root in 7 cycles, and an inverse square root in 13 cycles. In fact, as a coprocessor of the MIPS III core, VU0 uses only its four FMACs. However, VU0 does not have to stay in coprocessor mode all the time. It can also operate in VLIW mode by calling a micro-subroutine of VLIW code (as a MIPS III coprocessor, VU0 takes only 32-bit instructions; in VLIW mode, the instruction is extended to 64 bits). In this case, it splits the 64-bit instruction into two 32-bit MIPS COP2 instructions and executes them in parallel, just like VU1.
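Why four FMACs suit 3D work can be seen from a vertex transform. The sketch below models the four FMACs as four lanes that each compute acc + a * b in one step, so a 4 × 4 matrix times a vertex takes four such steps. The function names, the column-wise matrix layout, and the (x, y, z, w) component order are assumptions for the illustration; this is plain Python, not VU0 code.

```python
def fmac4(acc, a, b):
    """One step of four parallel multiply-accumulate lanes: acc + a * b."""
    return [acc_i + a_i * b_i for acc_i, a_i, b_i in zip(acc, a, b)]

def transform(matrix_columns, vertex):
    """Multiply a 4x4 matrix (given as four columns) by a vertex (x, y, z, w)."""
    acc = [0.0, 0.0, 0.0, 0.0]
    for column, component in zip(matrix_columns, vertex):
        # Broadcast one vertex component to all four lanes.
        acc = fmac4(acc, column, [component] * 4)
    return acc

# Hypothetical translation by (2, 3, 4), written column by column.
translate = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [2.0, 3.0, 4.0, 1.0],
]
moved = transform(translate, [1.0, 1.0, 1.0, 1.0])  # [3.0, 4.0, 5.0, 1.0]
```

Each call to `fmac4` corresponds to one cycle in which all four FMACs are busy, which is the kind of work geometry transforms consist of.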
VU1 has an architecture very similar to that of VU0. As the diagram below shows, VU1 has all the functions of VU0, plus some enhancements. First, VU1 is a fully independent SIMD/VLIW processor that deals with geometry processing. Second, VU1 has more capacity than VU0: it has a 16K-byte instruction memory and a 16K-byte data memory, whereas VU0 has only 4K bytes of each. VU1 plays the role of geometry processor, so it has to handle more instructions and data. Third, VU1 has three different paths to the GIF. It can transmit the display list over the 128-bit main bus, just as VU0 + CPU + FPU do; it can transmit over the direct 128-bit bus between its VIF and the GIF; and, most interestingly, a third path leads from the lower execution unit (which I will talk about later) directly to the GIF. Three separate paths ensure that two main problems of PC 3D game programming do not arise: first, the bus bandwidth bottleneck; second, having only a single way of programming.
Figure 3.6: The architecture of VU1
VU1’s VIF does much more than VU0’s. The VIF takes and parses what Sony calls the 3D display list. The 3D display list consists of two types of data: the VU1 program instructions (which go to instruction memory) and the data that the instructions operate on (which goes to data memory). Each instruction can be divided into two parts, an Upper instruction and a Lower instruction, which operate directly on two different execution units, the Upper execution unit and the Lower execution unit. A 64-bit VLIW instruction can thus perform two operations in parallel. Recall that VU0 has the same capability, but most of the time it acts only as a coprocessor of the MIPS III core, and in that mode it can execute only 32-bit SIMD instructions. Programmers also rarely ask VU0 to do the things that VU1 is good at.
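Splitting a 64-bit VLIW word into its two halves is simple bit manipulation, sketched below. Placing the Upper instruction in the high 32 bits is an assumption of this sketch, and the sample word is an arbitrary bit pattern, not a real VU instruction encoding.

```python
def split_vliw(word64):
    """Split a 64-bit VLIW word into (upper, lower) 32-bit instructions."""
    upper = (word64 >> 32) & 0xFFFFFFFF  # assumed: Upper instruction in the high half
    lower = word64 & 0xFFFFFFFF          # assumed: Lower instruction in the low half
    return upper, lower

def issue(word64):
    """Send each half to its own execution unit in the same cycle."""
    upper, lower = split_vliw(word64)
    return {"upper_unit": upper, "lower_unit": lower}

# Arbitrary example bit pattern.
upper, lower = split_vliw(0x1234ABCD0000BEEF)
```

Here `upper` is `0x1234ABCD` and `lower` is `0x0000BEEF`; because the two halves go to different execution units, one 64-bit word does two operations per issue.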
3.3 Comparison
I strongly agree if you think the Emotion Engine is more flexible than the Pentium 4. The design of the Emotion Engine is built entirely around 3D game performance. The two vector units, VU0 and VU1, contribute a lot to game performance. The Pentium 4’s architecture is straightforward: you can trace the path of data from the very beginning, and soon you will easily understand how the Pentium 4 works. For the Emotion Engine, unless you are the game designer, you will never know exactly.
I did not put many figures in this section, because comparing figures here does not make sense. A comparison between two PC processors can rely on figures, because they are of the same kind and work in the same environment. For game consoles, free of the burden of compatibility, the designers put a lot of thought into making the components cooperate perfectly. This results in better performance at lower cost. Unfortunately, programmers do not think it is such a good idea, since it costs them a lot of time to investigate the processor and figure out how it works.
4. BUSES AND CACHING
4.1 PC Motherboard
Multimedia processing requires massive quantities of data to move rapidly throughout the system, and the speed difference between the processor and external devices is the main bottleneck of the PC. Processor companies like Intel have put a lot of energy into getting the rest of the system components to run faster, even when other vendors provide these components. Improving the performance of the motherboard is a good idea. Figure 4.1 is the main structure diagram of the GIGABYTE GA-8TRX330-L Pentium 4 motherboard. The bandwidth between the processor and the system controller, and between main memory and the system controller, has reached an equally incredible 6.4GB/s. However, memory latency is still impossible to remove. Here I want to say something about the processor caching mechanism.
At present, the motherboard’s FSB (Front Side Bus) frequency exceeds 800 megahertz. However, this is slower than the Pentium 4 itself, which runs at over 3 gigahertz. The processor runs at a multiple of the motherboard clock speed and is closely coupled to a local SRAM cache (L1 cache). If the processor requires data, it first looks in the L1 cache. If the data is there, the processor reads it at high speed and needs no further search. If it is not, the processor sadly has to slow down to the motherboard clock speed (what a drastic brake!) and contact the system controller. The system controller checks whether the L2 cache has the required data. If so, the data is passed to the processor. If not, the processor has to access the DRAM, which is a relatively slow transfer.
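The lookup order just described (L1, then L2, then DRAM) and its cost can be sketched as follows. The cycle counts and hit rates are assumed values chosen only to show the shape of the calculation; they are not figures for any particular Pentium 4.

```python
def lookup(address, l1, l2, latency):
    """Return (level, cycles) for one read, following L1 -> L2 -> DRAM."""
    if address in l1:
        return "L1", latency["L1"]
    if address in l2:
        return "L2", latency["L1"] + latency["L2"]
    return "DRAM", latency["L1"] + latency["L2"] + latency["DRAM"]

def average_access_time(hit_l1, hit_l2, latency):
    """Average cycles per read for given L1 and L2 hit rates."""
    miss_l1 = 1.0 - hit_l1
    miss_l2 = 1.0 - hit_l2
    return latency["L1"] + miss_l1 * (latency["L2"] + miss_l2 * latency["DRAM"])

latency = {"L1": 2, "L2": 10, "DRAM": 200}  # assumed cycle counts

# With assumed hit rates of 90% (L1) and 80% (L2): about 7 cycles on average.
average = average_access_time(0.9, 0.8, latency)
```

Even with most reads hitting the caches, the rare trips to DRAM dominate the average, which is why memory latency matters more than peak bandwidth here.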
4.2 About PlayStation 2’s buses and caching.
Recalling Figure 2.2, we can see 32-bit interfaces between the processor and the I/O Processor and between main memory and the I/O Processor, which can achieve a bus speed of 3.2GB/s. Although this is slower than on the Pentium 4, the Emotion Engine itself is relatively slow as well, a 300MHz MIPS III processor. However, the PlayStation 2’s 32-bit interface, 10-channel DMAC, 128-bit internal bus, and small cache memories combine into an incredibly effective caching arrangement. Any necessary data can be stored or loaded in time. This strategy uses 90% of the DMA capability. It makes the latency of main memory acceptable for the Emotion Engine.
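The two bandwidth figures quoted in this section follow from bus width times transfer rate. In the sketch below the 64-bit width of the PC front side bus and the 800 million transfers per second on the PlayStation 2 memory interface are assumptions that the text does not state; they are chosen because they reproduce the quoted 6.4GB/s and 3.2GB/s.

```python
def peak_bandwidth_gb_s(width_bits, mega_transfers_per_s):
    """Peak bandwidth in GB/s = bytes per transfer * transfers per second."""
    return width_bits / 8 * mega_transfers_per_s / 1000

pc_fsb = peak_bandwidth_gb_s(64, 800)      # 6.4 (assumed 64-bit bus at 800 MT/s)
ps2_memory = peak_bandwidth_gb_s(32, 800)  # 3.2 (32-bit interface, assumed 800 MT/s)
```

These are peak numbers only; they say nothing about latency, which is the problem the caches and the DMA controller are there to hide.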
4.3 Comparison
This time we can talk about figures a little more. Let’s look at the Pentium 4’s cache memory:
L1 trace cache: 150K
L1 data memory: 16K
L2 memory: 256K ~ 2MB
Total: 422K ~ 2214K
Next, the PlayStation 2:
VU0 data memory: 4K
VU0 instruction memory: 4K
VU1 data memory: 16K
VU1 instruction memory: 16K
MIPS III data memory: 2-way 8K
MIPS III instruction memory: 2-way 16K
Total: 64K
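The totals can be checked by summing the sizes listed above, taking 2MB as 2048K. The dictionary keys are just labels for this sketch.

```python
# Sizes in kilobytes, as listed above.
pentium4_l1_kb = {"L1 trace cache": 150, "L1 data": 16}
pentium4_l2_options_kb = (256, 2 * 1024)  # smallest and largest L2

ps2_kb = {
    "VU0 data": 4,
    "VU0 instruction": 4,
    "VU1 data": 16,
    "VU1 instruction": 16,
    "MIPS III data": 8,
    "MIPS III instruction": 16,
}

pentium4_totals_kb = [sum(pentium4_l1_kb.values()) + l2
                      for l2 in pentium4_l2_options_kb]  # [422, 2214]
ps2_total_kb = sum(ps2_kb.values())                      # 64
```

So the smallest Pentium 4 configuration has between six and seven times the on-chip memory of the whole Emotion Engine.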
Compared with the Pentium 4, the cache memory of the PlayStation 2 is very small. Its capacity is indeed ‘weak’ by present standards. The Pentium 4 is able to hold more data and do more computations in parallel. However, the PC architecture has not improved along with the processor. No matter how fast the Pentium 4 is, the present bus architecture will never let it perform at 100% of its capability (or maybe I should say that because the Pentium 4 is so fast, memory is relatively too slow). The PlayStation 2 achieves a nearly perfect structure and mechanism, which helps it deliver as much as it can. Besides, this remarkably lowers the cost: you can afford a PlayStation 2 plus a controller for the price of a single Pentium 4 chip.
5. VIDEO PERFORMANCE
5.1 Comparison of performance between PC and PlayStation 2
Figure 5.1: Need for Speed Most Wanted (PlayStation 2) 2006 by EA GAMES
PlayStation 2 Graphics Synthesizer (GS)
· 150 MHz (147.456 MHz)
· 16 Pixel Pipelines
· 2.4 Gigapixels per Second (no texture)
· 1.2 Gigatexels per Second
· Point, Bilinear, Trilinear, Anisotropic Mip-Map Filtering
· Perspective-Correct Texture Mapping
· Bump Mapping
· Environment Mapping
· 32-bit Color (RGBA)
· 32-bit Z Buffer
· 4MB Multiported Embedded DRAM
· 38.4 Gigabytes per Second eDRAM Bandwidth (19.2 GB/s in each direction)
· 9.6 Gigabytes per Second eDRAM Texture Bandwidth
· 150 Million Particles per Second
· Polygon Drawing Rate:
· 75 Million Polygons per Second (small polygon)
· 50 Million Polygons per Second (48-pixel quad with Z and Alpha)
· 30 Million Polygons per Second (50-pixel triangle with Z and Alpha)
· 25 Million Polygons per Second (48-pixel quad with Z, Alpha, and Texture)
· 18.75 Million Sprites per Second (8 x 8 pixel sprites)
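Several of the figures in this list are consistent with each other, which a little arithmetic shows. The sketch assumes that the fill rate is simply pipelines × clock and that the texel rate is half the untextured pixel rate; the halving is my reading of the listed 2.4 against 1.2, not something the specification list states.

```python
CLOCK_MHZ = 150
PIXEL_PIPELINES = 16

# Fill rates in millions per second.
untextured_mpixels = PIXEL_PIPELINES * CLOCK_MHZ  # 2400, i.e. 2.4 Gigapixels/s
textured_mtexels = untextured_mpixels // 2        # 1200, assuming half rate with texture

def mpixels_drawn(million_primitives_per_s, pixels_each):
    """Millions of pixels per second needed to draw the given primitives."""
    return million_primitives_per_s * pixels_each

quads_z_alpha = mpixels_drawn(50, 48)     # 2400: matches the untextured fill rate
quads_textured = mpixels_drawn(25, 48)    # 1200: matches the texel rate
sprites_8x8 = mpixels_drawn(18.75, 64)    # 1200.0: matches the texel rate
```

In other words, the quoted polygon and sprite rates for larger primitives are fill-rate limits, not geometry limits.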
Figure 5.2: Need for Speed Most Wanted (PC) 2006 by EA GAMES
PC Graphics Chip RADEON X300 SE PCI Express
· Bus type PCI Express (x16 lanes)
· Maximum vertical refresh rate 85 Hz
· Display support Integrated 400 MHz RAMDAC
· Display max resolution 2048 x 1536
· Board configuration
· 64 MB frame buffer
· Graphics Chip RADEON X300 SE PCI Express
· Core clock 325 MHz
· Memory clock 200 MHz
· Frame buffer 64 MB DDR
· Memory I/O 64 bit
· Memory Configuration 4 pieces 8Mx16 DDR
· Board configuration
· 128 MB frame buffer
· Specification Description
· Graphics Chip RADEON X300 SE PCI Express
· Core clock 325 MHz
· Memory clock 200 MHz
· Frame buffer 128 MB DDR
· Memory I/O 64 bit
· Memory Configuration 4 pieces 16M x 16 DDR
· Memory type DDR1
· Memory 128 MB
· Operating systems support Windows 2000, Windows XP, Linux XFree86 and X88.
· Core power 16 W (Max board power)
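The memory figures of the card can be cross-checked in the same way. The sketch assumes that “8M x 16” means 8 mega-words of 16 bits per chip and that DDR memory transfers data twice per clock.

```python
def capacity_mb(chips, mega_words, bits_per_word):
    """Total capacity in megabytes of several identical memory chips."""
    return chips * mega_words * bits_per_word // 8

def ddr_bandwidth_gb_s(clock_mhz, bus_bits):
    """Peak DDR bandwidth in GB/s: two transfers per clock."""
    return clock_mhz * 2 * bus_bits / 8 / 1000

small_board_mb = capacity_mb(4, 8, 16)        # 64
large_board_mb = capacity_mb(4, 16, 16)       # 128
card_bandwidth = ddr_bandwidth_gb_s(200, 64)  # 3.2
```

The card therefore has 16 to 32 times the memory of the GS, but at roughly 3.2GB/s its memory bandwidth is far below the 38.4GB/s quoted for the GS embedded DRAM.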
From the data we can see that the GS is weak compared even with a low-end PC video card. However, the performance of the PlayStation 2 is not that bad. I do not want to analyze the data here. What I am interested in discussing is the performance itself.
Let’s look at Figure 5.2 in detail. The textures are very clear and exquisite. This is what a big video memory offers. The tree leaves in the distance need a lot of polygons to build. The video card itself is low-end and provides no special effects for game rendering. No reflections or other sparkling highlights can be found. In general, the game performance is only OK.
Figure 5.3: PC game rendering related architecture
Now let’s look at the PlayStation 2’s performance, shown in Figure 5.1. We see a good image. If you look at the image in detail, you may find that the mountain beside the road is weird: its shape is not very natural and looks like a spectrum graph. This is done by VU1, which draws Bézier curves and builds 3D graphics based on the curves. Although it is not good enough, how many people will actually notice when dashing along at over 200km/h in a virtual car? VU1 does a lot of jobs like that, and it can generate a lot of shapes without needing too many polygons. Now look at the car: the reflection on the car is a true reflection (that is, not a fake texture pretending to be a reflection), and we can make out the mountains behind, although they are very blurred. The whole image is not as clear as Figure 5.2 because of the limited video memory of the GS (4MB). However, this image is good enough for most PlayStation 2 players.
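How a shape can be generated from a curve rather than stored as polygons can be sketched with de Casteljau’s algorithm, a standard way to evaluate a Bézier curve. Whether the game or VU1 microcode uses this particular algorithm is not known to me; the control points for the “ridge” are invented for the example.

```python
def lerp(p, q, t):
    """Linear interpolation between points p and q."""
    return tuple(a + (b - a) * t for a, b in zip(p, q))

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] with de Casteljau's algorithm."""
    points = list(control_points)
    while len(points) > 1:
        # Repeatedly interpolate between neighbouring points.
        points = [lerp(p, q, t) for p, q in zip(points, points[1:])]
    return points[0]

def tessellate(control_points, segments):
    """Turn the curve into segments + 1 vertices that can be drawn."""
    return [bezier_point(control_points, i / segments)
            for i in range(segments + 1)]

# Hypothetical mountain ridge described by only four control points.
ridge = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
outline = tessellate(ridge, 8)  # 9 vertices generated on the fly
```

Only the four control points have to be stored and transferred; the vertices are computed when needed, trading calculation for memory and bus traffic.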
5.2 Some more about the video performance
Although the Pentium 4 is capable of processing images in real time, the way games are implemented has not changed. The video card reads the textures into its local memory, and the processor deals only with data and instructions. After the calculation, the processor stores the display list (a list recording the details of all elements, for instance a single polygon’s position and texture code) back in main memory. The video card then accesses the lists and processes them, generates the picture, converts it to an analogue signal and outputs it. Most special effects depend on the video card. So, no good card, no good performance.
Looking at Figure 2.2, we see that there is no direct connection between the GS and main memory. From the PC’s point of view, 4MB of video memory is not enough to show a single frame of 1024*768 pixels. How is the PlayStation 2 able to perform like that? The answer is the bus, so we come back to Section 4 again. The specialized display list (which Sony calls the 3D display list) is sent directly to the GS, along with the required textures. The GS has a huge bandwidth (38.4GB/s, as listed in Section 5.1), and its local memory can work as fast as the GS itself (maybe it would be more suitable to call this memory a cache). The GS itself supports only a few special effects. However, this can be compensated for by simulation calculations done by the Emotion Engine. Again, the PlayStation 2’s elegant design makes all its components work as a whole.
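The memory needed for one frame is easy to estimate. The sketch uses the 32-bit color and 32-bit Z buffer listed in Section 5.1; the 640 × 480 television resolution and the double buffering are assumptions added for comparison.

```python
def buffer_mb(width, height, bits_per_pixel):
    """Size of one full-screen buffer in megabytes."""
    return width * height * bits_per_pixel / 8 / (1024 * 1024)

# PC-style frame: 1024 x 768 with 32-bit color and a 32-bit Z buffer.
pc_color = buffer_mb(1024, 768, 32)  # 3.0
pc_depth = buffer_mb(1024, 768, 32)  # 3.0
pc_frame = pc_color + pc_depth       # 6.0, more than the 4MB of the GS

# Assumed TV-style frame: 640 x 480, two color buffers plus one Z buffer.
tv_buffer = buffer_mb(640, 480, 32)  # 1.171875
tv_frame = 3 * tv_buffer             # 3.515625, fits in 4MB with little room left
```

Under these assumptions the frame itself fits at television resolution, but almost nothing is left for textures, so they have to be streamed in over the bus while the frame is being drawn.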
6. CONCLUSION
Hopefully you now have an idea of how the PlayStation 2 and PC architectures differ. Let’s go through it again.
General architecture. The PC’s architecture is more complex to read, but easier to implement. The system bus directly manages all communication between devices. The PlayStation 2’s architecture is easy to read, but much harder to implement. Communication between its components is convenient.
Processor architecture. The trend in processor architecture design is towards meeting the requirements of multimedia. Both the PC’s Pentium 4 and the PlayStation 2’s Emotion Engine are able to run multimedia applications efficiently. The Pentium 4 is much stronger than the Emotion Engine, but its architecture is very ‘straight’ and it has to do the extra job of translating instructions to stay compatible with existing applications. The Emotion Engine has no such burden, and its design, specialized for 3D game performance, makes it easy to handle complex calculations at a relatively low clock rate.
Buses and caching. The PC has classic bottlenecks and there is no way to overcome them. Current PC buses and caches have improved a lot through increased bandwidth and cache sizes, but the latency of main memory cannot be eliminated. The PlayStation 2 works at nearly full load; almost perfect coordination between components is achieved.
Video. Although the Pentium 4 can run multimedia applications very well, PC game developers do not take advantage of it. They still push the textures and other data into video memory all at once. The awkward result is that when you want to upgrade your PC for demanding games, the first component that comes to mind is the video card, not the processor. It is impossible to ask PlayStation 2 players to upgrade. The Emotion Engine is in charge of many jobs that the PC’s video card does. The good conditions for data transfer make it possible to implement ‘true’ multimedia processing in games, that is, treating game images as media streams, with no need for huge data storage to hold them.
Purpose. The PC is general-purpose, whereas the PlayStation 2 is built for 3D game rendering.
The PlayStation 2 is 6 years old now. According to the usual life expectancy of game consoles, it is time to hand the baton to its offspring, the PlayStation 3. It has been a successful game console for Sony. Compared with the PC it is weird, but all its weird parts seem so reasonable as well. The PC’s architecture is classical; every component has room for upgrade. Maybe it is too early to say that the architecture should evolve. However, the PlayStation 2’s architecture teaches us a good lesson. If you are only interested in games, you should buy a PlayStation, not a PC. At least you need not worry about upgrading your components for the next game. A special architecture can make a machine the best in its specialized field.
7. REFERENCES
[1] William Buchanan and Austin Wilson, “Advanced PC Architecture”, ISBN: 0 201 39858 3
[2] John L. Hennessy and David A. Patterson, “Computer Architecture: A Quantitative Approach”, ISBN: 1 55890 724 2
[3] Keith Diefendorff and Pradeep K. Dubey, “How Multimedia Workloads Will Change Processor Design”, Computer, September 1997
[4] Jon "Hannibal" Stokes, “Sound and Vision: A Technical Overview of the Emotion Engine”, Wednesday, February 16, 2000
[5] K. Kutaragi et al., “A Micro Processor with a 128b CPU, 10 Floating-Point MACs, 4 Floating-Point Dividers, and an MPEG2 Decoder”, ISSCC (Int’l Solid-State Circuits Conf.) Digest of Tech. Papers, Feb. 1999, pp. 256-257.
[6] Jon "Hannibal" Stokes, “SIMD architectures”
arstechnica88/articles/paedia/cpu/simd.ars
[7] “Graphics Synthesizer – Features and General Specifications”
arstechnica88/cpu/1q99/playstation2-gfx.html
[8] “The Technology behind PlayStation 2”
88ieee88.uk/docs/sony.pdf
[9] Michael Karbo,“PC Architecture“
88karbosguide88/books/pcarchitecture/start.htm
[10] Gabriel Torres, “Inside Pentium 4 Architecture”
88hardwaresecrets88/Article/235/1
[11] Thomas Pabst, “Intel’s new Pentium 4 Architecture”
tomshardware.co.uk/2000/11/20/intel/
[12] KuaiLeDaYuShu, “Video Card Parameters Analysis”
blog.yesky88/Blog/joyelm/archive/2005/07/30/253803.html
[13] Howstuffworks, “How PlayStation 2 Works”
entertainment.howstuffworks88/ps21.htm
[14] Craig Steffen, “Scientific Computation on PlayStation 2 home page”
arrakis.ncsa.uiuc.edu/ps2/background.php
royzhang1980.spaces.live88/default.aspx