Report - Danny Matthews
This PDF has been downloaded from http://www.dmatthews.co.uk. Please feel free to make use of any of the content of this document (including source code) as part of any publication or piece of software provided that it is to be freely distributed. I'd appreciate a reference (with a link back to the site) and an E-Mail if you do make use of anything. Cheers! Danny

UNIVERSITY OF SUSSEX

Emulating the Nintendo Entertainment System

Danny Matthews 42762
Computer Science BSc
Department of Informatics
Supervised by Dr Des Watson
2008

This report is submitted as part requirement for the degree of Computer Science BSc at the University of Sussex. It is the product of my own labour except where indicated in the text. The report may be freely copied and distributed provided the source is acknowledged.

This project uses small amounts of code from the NESCafe (http://www.nescafeweb.com/) and FC64 (http://www.osflash.org/fc64) emulators. Both are released under the GNU GENERAL PUBLIC LICENSE Version 2, permitting use of “pieces of it (the software) in new free programs”. Code use is documented within the report and source code.

Danny Matthews

Summary

The purpose of this project was to research, design, implement and test a Nintendo Entertainment System (NES) emulator with software development facilities. The system was developed with the goal of satisfying the needs of both game players who wish to experience their collections in a portable and convenient form and those who develop for the system. It is intended that the software be portable and feature-rich so that it can compete with the systems currently available.

The NES is a games console released worldwide in the mid-1980s by hardware and software giant Nintendo. The system was supported up until the end of 2007 in Japan and is heralded by many as the most successful games console released to date.

Emulators allow software to be executed on a system for which it was not written.
This is achieved by writing a piece of software which precisely simulates the original machine. The NES has four main units which must be emulated. These are the Central Processing Unit (CPU), the Picture Processing Unit (PPU), the Audio Processing Unit (APU) and the input units (otherwise known as "controllers"). These four units, when combined, provide a fully functional NES.

The software executed on the NES was distributed in the form of cartridges containing program code and graphical data. An emulator cannot execute software in this form without specialist equipment. Instead, a hardware tool is used to "dump" the cartridge to hard disk (the resulting file is referred to from here on as a ROM). In order for an emulator to execute ROMs correctly, it must have access to certain information not included in the "dump". The most common representation for this information, and the one chosen for this project, is the iNES format, which places a sixteen-byte header at the beginning of the ROM. The emulator parses this header to provide the correct behaviour.

The major development tool provided is a debugger with four major functions: a system status viewer, providing an at-a-glance status summary; a breakpoint system incorporating code stepping; a disassembler, which transforms assembled ROM files into a form close to their un-assembled form; and a memory viewer to examine machine memory during execution. The other two tools provided are a name table viewer and a pattern table viewer. Essentially, these tables are the means used by the NES to store and display graphics. These tools are intended to simplify development by providing useful information to the user, for example the memory locations of graphics and colour information.

The project was largely a success, meeting almost all objectives. The only requirement not met is a scanline-based background rendering routine in the PPU (discussed within the report). I feel that, given additional development time, this limitation could be removed.
A number of extensions were added to the software, including additional debugging capabilities and a sprite viewer (sprites being images capable of independent movement).

Table of Contents

1. Introduction
1.1. Aims and Objectives
1.2. Introduction to the problem area
1.2.1. Java Applications
1.2.2. Emulation
1.2.3. Debuggers
1.3. Report Overview
2. Requirements Analysis
2.1. Professional Considerations
2.1.1. Code of Conduct
2.1.2. Code of Practice
2.2. Needs of Intended Users
2.2.1. Current Solutions
2.3. Proposed Solution
2.3.1. Primary Objectives
2.3.2. Extensions
2.4. Requirements Specification
2.4.1. Overall
2.4.2. CPU
2.4.3. PPU
2.4.4. APU
2.4.5. Input/Output
2.4.6. Development Tools
2.4.7. Further Specifications
3. Design
3.1. Overall System Design
3.2. CPU Design
3.3. PPU Design
3.4. APU Design
3.5. Input/Output Design
3.6. Development Design
3.7. GUI Design
4. Implementation
4.1. Common Tactical Policies
4.2. Software Re-use
4.3. Threading
4.4. Outputting Sound
4.5. Testing
5. Conclusion
5.1. Finished Software Screenshots
5.2. Success of the finished product
5.3. Future Extensions
5.4. Alternative Methodologies
6. Works Cited
7. Appendices
Appendix A: Cartridge Specification
Appendix B: File Format Specification
Appendix C: Regional Differences Specification
Appendix D: Input Devices and Other Peripherals
Appendix E: Background Rendering in Detail
Appendix F: Low Level Designs
Appendix G: Test Specification
Appendix H: Project Logs
Appendix I: GNU GENERAL PUBLIC LICENSE Version 2
Appendix J: Source Code

A Nintendo Entertainment System (NES) Emulator
By Danny Matthews
Supervised by Des Watson

1. Introduction

1.1. Aims and Objectives

1.1.1. Purpose

The purpose of this project is twofold. Firstly, it aims to provide an end user with a convenient and enjoyable means of playing the games of yesteryear on their computer. The second major objective is to provide development tools for those who write software for the Nintendo Entertainment System. This functionality will include a debugger and disassembler amongst other things.

A further, more academic purpose is that the completed software could allow for the analysis of program execution: for example, how fast the CPU is required to run for games to be played successfully, and the amount of time spent performing graphical operations.

The system will provide many additional benefits to software developers (of which there are still surprisingly many for the NES) other than those discussed above.
These benefits include:
- A substantially more accessible test platform
- Greatly accelerated development due to this increased accessibility
- A reduction in costs (e.g. limiting the need for NES flash carts and other peripherals)

1.1.2. Motivations

I have several motivations for writing a NES emulator. First and foremost, they stem from a long-standing interest in the emulation of all types of computer system. The NES is of particular interest to me for reasons of nostalgia, having spent many an hour using the system in my youth. Additionally, an interest in the implementation and logic of hardware drew me to the area, as did the opportunity to research, design and implement a piece of software significantly larger than anything previously required.

I decided to incorporate development tools into the system with the aim of making the lives of Nintendo developers easier. This grew out of a respect for their work, constrained as they are in how they design and code their software by such limited hardware. Finally, the great possibilities for extending the functionality of the software appeal (several of these possible extensions are discussed below).

1.1.3. Relevance

This project relates to several areas of the degree programme. The main areas of relevance are those modules which focus on systems architecture (Computer Systems Architecture), programming (Introduction to Programming, Further Programming) and data structures (Data Structures, parts of Introduction to Operating Systems). Also, because it involves the writing of a substantial piece of software, Algorithmics will be useful, as will Software Engineering and Software Design. The software will require a GUI, meaning that Human Computer Interaction will prove helpful in making that GUI as intuitive and user friendly as possible.
Computability and Complexity has proven helpful in that it has helped me to understand why emulation is theoretically possible. Finally, Professional Issues in Computing will help ensure understanding of what is expected of me in terms of conduct and practice, and Technical Communication Skills will undoubtedly help with the presentation aspect of the project.

1.1.4. Intention

I intend to write an emulator for the Nintendo Entertainment System (NES) games console, implemented in the Java programming language. One of the major benefits of using Java is its platform-independent nature. The system will be deployable on all the major platforms (PC, Macintosh, and Linux) as well as any other system for which a Java Virtual Machine has been written (see the "Introduction to the problem area" for an explanation of the above).

The NES consists of three main units (all present on the motherboard). These are:
- CPU: a modified 6502 processor,
- PPU: a chip providing the graphical capabilities of the system,
- APU: a third chip providing all audio processing capabilities.

These three units, working in parallel, perform all the main tasks required of the machine. The emulator is additionally required to provide input support so that the software can be controlled by the user.

Finally, I intend to implement a set of development tools to aid NES developers. This will include:
- A debugger and disassembler,
- A pattern table viewer,
- A name table viewer.

Additional functionality can be added if time permits. This functionality is discussed in the Objectives section of the report.

1.1.5. Resources Required For Development

- A computer with internet access
- A copy of the Java Runtime Environment (JRE) v1.5
- ROM files of games using varying MMCs and other features

All required resources are available.

1.2. Introduction to the problem area

1.2.1. Java Applications

Java is an object-oriented programming language developed to be multi-platform without the need for separate compilation. It achieves this in two steps. Firstly, Java programs are compiled into byte code rather than into code for a target architecture. This byte code acts as an intermediate language. Secondly, virtual machines (VMs) are written for each target platform; these interpret the intermediate language into statements executable by the target machine.

It should be noted that it is possible to compile Java applications into native code via third party compilation solutions. Also, VMs no longer simply interpret the intermediate language, instead using the "Just in Time" paradigm, caching the code as it is compiled into native form and thus allowing for much faster execution speeds.

1.2.2. Emulation

The British Computer Society (BCS) defines emulation as follows:

"Emulation is a very precise form of simulation which should mimic exactly the behaviour of the circumstances that it is simulating. An emulator may enable one type of computer to operate as if it were a different type of computer." (1) (2)

It is possible to emulate any computer system on another. This is due to the following (assuming the Church-Turing Thesis to be correct):
1. All current computer systems are Turing Complete. That is, it is possible to completely emulate the universal Turing Machine (Lu) on all current systems. Lu is a special multi-taped Turing machine which is capable of emulating the behaviour of any other Turing Machine.
2. All current systems can be emulated by Lu.

Thus, in principle, it is possible to emulate any system on another (3).

Emulators are available for a vast number of systems. For example, there exist emulators for the Amiga (4), Commodore 64 (5) and even pocket calculators (6).

1.2.3. Debuggers

Debuggers are software tools designed to help in the elimination of programming errors from code. Debuggers usually offer a large amount of functionality.
This includes:
- "Breakpoints", where program execution is paused whenever particular conditions are met (for example, when a particular line in the code is reached or a variable reaches a certain value).
- Memory viewers, allowing the viewing of memory during execution.

1.3. Report Overview

The remainder of this report is split into several major sections. Firstly, the project is discussed in regard to the conditions which must be met by a member of the British Computer Society (BCS) before, during and after development. This is followed by a discussion of the tactical policies to be followed during development and an analysis of project requirements. This includes a discussion of the intended user group for the software and an evaluation of currently available solutions. Project objectives are also detailed.

A requirements analysis follows, detailing all that must be implemented to achieve the project objectives. This is divided into seven major sections, each describing one of the major system objectives.

The design follows. Again, the sections largely mirror the system objectives, documenting the design to be followed when implementing the system. The first section of the design provides an overview of the system design in terms of package and class structure.

Following the design is a discussion of the implementation. This covers issues which occurred during implementation as well as design decisions which could not be discussed in an implementation-free way within the design.

System testing follows. Most of the testing followed a unit testing methodology, although performance and portability were also tested.

The final sections critically analyse the project in terms of its design and implementation, concluding with a discussion of the project's success, including ideas for future extension and alternative design and implementation methodologies.

2. Requirements Analysis

2.1. Professional Considerations

In undertaking any project it is important to ensure that the software and development process are ethically sound. To this end, the British Computer Society (BCS) provides guidelines to be followed. These come in the form of two published documents. The project is discussed in relation to these documents below.

2.1.1. Code of Conduct

Arguably the most relevant considerations are to "have regard to the legitimate rights of third parties" and to have "knowledge and understanding of relevant legislation, regulations and standards" (points 2 and 3). Although I have not received permission from Nintendo to write an emulator for their system, the law supports the development of emulators provided that no proprietary information is used in the development of the system (all information has been obtained from the internet, where it was ascertained by reverse engineering the NES). Several court cases support this (7) (8).

Additionally, it should be stressed that at no time will pirated ROM files be used with the system. Firstly, legal copies of many titles are already owned, therefore providing sufficient testing material. Secondly, the emulator will not be released in any form (apart from in the form of source code in the final report).

In short, this project is perfectly legal as:
1. The system is being built entirely without the use of proprietary documentation,
2. All ROM files used will either be dumped from a legally owned copy of the software or will be non-proprietary.

No information will be "misrepresented or withheld" in this report (point 9). All clauses deemed relevant from the BCS Code of Practice will be observed (point 16, see below).

2.1.2. Code of Practice

I have attempted to keep my workload such that I will be able to successfully meet all goals within the time available and have ensured that I have access to all necessary resources ("Manage your workload efficiently").
An acceptance strategy will be devised that "will fairly demonstrate that the requirements of the project have been met" ("When defining a new project").

When the project is completed, I will be sure to "honestly summarise the mistakes made, good fortune encountered and lessons learned" and to "recommend changes that will be of benefit to later projects" ("When Closing a project").

It will be ensured that the analysis and design specifications provide as accurate a representation of the system as possible, and every attempt will be made to ensure that all programming practices used help to produce efficient code that is easy to maintain. It is intended that the documentation will be written to a level of detail "that others could take over the work if need be" and will be kept up to date ("When writing technical documentation").

2.2. Needs of Intended Users

It is intended that there will be two distinct groups of users for the finished software.

Firstly, there will be those who use the emulator simply to allow them to play their NES games in a convenient and portable form. To this end, the software should be written so as to provide maximum compatibility with the game images. Ideally, the software should support as many of the system's peripherals as possible and should include additional features (possibly not present in the original system) which would enhance the user's play (e.g. save states). It should also ideally be sufficiently efficient to allow the games to be played at close to original speed.

The second intended user group is those developing software for the NES. At a minimum, the system should provide a number of development tools to aid those developing for the system (detailed in the "Proposed Solution"). Ideally, a great number of additional tools will be provided to allow efficient development (such as hex editors, trace loggers etc).

2.2.1. Current Solutions

2.2.1.1. Java Solutions

The quality of NES emulators written in Java varies greatly.
The Jamicom emulator (9) lacks sound support and executes so quickly that it is rendered unusable. It also lacks GUI support, requiring command line operation.

Several emulators are incomplete (such as Animosity (10), currently consisting of little more than a 6502 emulator). However, one very complete emulator known as NESCafe (11) stands out above all others in quality. It is available in both applet and application form, fully emulating the graphics and sound of the NES. It additionally provides a rudimentary debugger.

2.2.1.2. Non-Java Solutions

The quality of non-Java emulators tends to be much better. These include Nestopia (12), a Windows emulator of high quality (although it provides no software development aids), and FCEUXD (13), which, again, is an excellent emulator. In contrast to Nestopia, FCEUXD provides excellent development tools which include a debugger, a hex editor and a RAM filter.

2.2.2. Ideal Solution

Whilst FCEUXD would seem to be the ideal choice, providing both a robust emulator and many excellent development aids, it suffers from its lack of portability (with only a Windows version released to date). NESCafe does not suffer from this but lacks any kind of substantial development tool. An ideal solution would be one that incorporates the strengths of both these programs. An outline of the functionality of such a system can be seen below (Proposed Solution).

2.3. Proposed Solution

2.3.1. Primary Objectives

O 1. Emulating a fully functional 2A03 (NTSC) processor
  O 1.1. Providing a means of pausing and continuing CPU execution and resetting the CPU.
O 2. Emulating a fully functional NTSC Picture Processing Unit (PPU)
  O 2.1. Allowing alterable Tint and Hue of system colours.
O 3. Emulating a fully functional NTSC Audio Processing Unit (APU)
  O 3.1. Providing the ability for the user to modify the output volume of the APU (including sound muting).
O 4. Emulating the standard NES control pad (allowing user input), using the computer keyboard to control this interaction.
  O 4.1. Providing a means of changing the keyboard keys which correspond to control pad inputs.
O 5. Implementing a development environment to aid NES programmers.
  O 5.1. Debugger
    O 5.1.1. Present system information in an easily readable way.
    O 5.1.2. Present a disassembled version of the loaded ROM.
    O 5.1.3. Provide a breakpoint system to stop CPU execution on certain register value conditions being met (with step based execution capabilities).
    O 5.1.4. Memory Viewer
  O 5.2. Name Table Viewer
    O 5.2.1. Display the name tables graphically (in 2 bit and 4 bit colour modes).
    O 5.2.2. Display attribute table values.
    O 5.2.3. Display the scroll line values graphically.
    O 5.2.4. Allow alterable name table refresh rates.
  O 5.3. Pattern Table Viewer
    O 5.3.1. Display the pattern tables graphically.
    O 5.3.2. Display the palette values graphically.
    O 5.3.3. Display details of individual pattern table tiles and palette entries.
    O 5.3.4. Display the numeric contents of the name tables in a grid format. This will provide the contents of the name table memory to the user in a much more accessible and readable format.
O 6. Implementing a Graphical User Interface to facilitate easy use of the system.

2.3.2. Extensions

E 1. Implementing additional Memory Management Controllers (MMC).
E 2. Implementing a tool for easy analysis of program execution. For example, providing statistics of instruction and addressing mode use.
E 3. Extending debugging facilities. A variety of functionality could be provided here. Possibilities include:
  E 3.1. A trace logger, which would allow the user to view all (or a set number) of the instructions as they are processed. Register values could also be included. This could be a useful means of viewing the execution of the program.
  E 3.2. Palette altering facilities. One palette of colours exists for sprites and one for the background.
  Allowing the altering of these palettes would give the user a very efficient way of changing their games' colour schemes significantly.
  E 3.3. Cartridge details could be made available to the user. For example, the type of MMC used.
  E 3.4. Implementing visual system state, providing an at-a-glance view of the system state (e.g. illustrating whether the CPU is executing via a symbol).
  E 3.5. Implementing code highlighting. The user can highlight specified strings within the disassembled code. For example, they may wish to highlight all JMP instructions.
E 4. Implementing additional peripherals. A significant number of alternative input devices exist for the NES, as do a number of other peripherals designed to enhance the use of the system. These include the "Zapper" (a gun shaped peripheral) and the Game Genie (which allows the user to alter the execution of software). Additional input devices and peripherals are detailed in Appendix D.
E 5. Implementing undocumented opcodes. There are many opcodes available for use on the 6502 which were not documented by the manufacturers. These include opcodes which perform several defined instructions at once. (14)
E 6. Implementing Saved Game functionality. Some games allow you to save your progress through the game so that it is possible to return to that point at a later date. This could be implemented allowing saving only in those games which allowed it, or it could be implemented allowing the user to save the game state at any time and in any game they wish.
E 7. Implementing a Sprite Viewer. This would allow the user to view all the sprites present in their project at once so as to ensure that they will be rendered as intended.
E 8. Implementing a Frames Per Second counter so as to give the user a means of measuring the performance of their software.

2.4. Requirements Specification

2.4.1. Overall

2.4.1.1. Development Model

For the most part, the waterfall model will be employed in development.
This will work well because of the nature of the project and my lack of familiarity with the system to be emulated. That is, it will be necessary to carry out substantial research into the subject area before being able to design the system (requirements analysis), and it would be advisable to produce a design to guide the implementation process.

Figure 1: The Waterfall Model (modified from (15))

However, I feel that it would be impractical to follow this model too closely due to the large number of corner cases and the amount of detail involved in the system. Thus, once the main analysis and design processes have taken place, a more ad-hoc method of analysis and design will be used to fill in the inevitable holes in the main analysis and design documentation. Both formal and informal testing will be carried out throughout development.

2.4.2. CPU

2.4.2.1. General

The CPU consists of a modified 6502 processor distributed by Ricoh. This modified CPU was known as the 2A03 in NTSC systems (1.79 MHz) and the 2A07 in PAL systems (1.66 MHz). The 2A0X series was identical to the 6502 in every way except that it lacked a Binary Coded Decimal mode and included 22 memory mapped registers to assist in a multitude of tasks including sound generation and joypad reading. (16)

Figure 2: The 6502 CPU (17)

2.4.2.2. Memory Mapping

The 2A0X interacts with external devices via memory mapping. Reads from and writes to CPU memory locations above 401F (hexadecimal) actually perform operations on some external device. For example, reading from location C000 will read the first byte from the upper bank of cartridge ROM (discussed later).

2.4.2.3. Memory Mirroring

A significant amount of memory mirroring also occurs. Memory mirroring is a technique where multiple addresses map to the same location in memory. For example, if locations 6-10 mirrored locations 1-5, a write to 7 would be the same as a write to 2 (the same applies to the reading of data).
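A minimal sketch of mirroring, written in Java to match the project's implementation language: ignoring the upper bits of an address makes several address ranges refer to the same bytes. The class name `MirroredRam` and the sizes used (a 2 KB RAM visible across addresses 0000-1FFF, as in commonly documented NES memory maps) are assumptions for illustration and are not taken from this report's design.

```java
/**
 * Illustrative sketch only (hypothetical class, not the report's code):
 * a small RAM whose contents repeat across a larger address window.
 */
final class MirroredRam {
    // Assumed sizes: 2 KB of physical RAM visible through 0x0000-0x1FFF.
    static final int RAM_SIZE = 0x0800;
    static final int WINDOW_END = 0x1FFF;

    private final byte[] ram = new byte[RAM_SIZE];

    /** Map a CPU address onto a physical RAM index by masking off the upper bits. */
    static int physicalIndex(int address) {
        if (address < 0 || address > WINDOW_END) {
            throw new IllegalArgumentException("address outside the mirrored window");
        }
        return address & (RAM_SIZE - 1);
    }

    /** Read an unsigned byte; every mirror of an address returns the same value. */
    int read(int address) {
        return ram[physicalIndex(address)] & 0xFF;
    }

    /** Write a byte; the write is visible through every mirror of the address. */
    void write(int address, int value) {
        ram[physicalIndex(address)] = (byte) value;
    }

    public static void main(String[] args) {
        MirroredRam memory = new MirroredRam();
        memory.write(0x0802, 0xAB);                      // write through a mirror
        System.out.printf("%02X%n", memory.read(0x0002)); // read the underlying location
    }
}
```

Because only the low eleven bits of the address are used, no table of mirrored ranges is needed; the mask does the work.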
Mirroring is used to cut down on the hardware required by the system. Using mirroring, only part of the address in question needs to be decoded. This would be a problem if all the available memory locations were required. However, the CPU is able to reference 64KB of memory, with the NES needing only a fraction of this. Thus, mirroring can be used without issue.

2.4.2.4. Opcodes

The 2A0X has 56 official instructions, with many of these supporting multiple addressing modes. This results in the 6502 supporting a total of 151 official opcodes. (2) There also exist 105 undocumented opcodes, many of which perform multiple official instructions at once. (14) The number of CPU cycles that each opcode takes varies and is largely dependent on the addressing mode used.

2.4.2.5. Addressing Modes (17)

Addressing modes specify how the addresses given to the opcode will be interpreted. Thirteen such modes exist in the 2A0X. Each addressing mode will not be discussed here; a discussion of these can be found in (2).

2.4.2.6. Memory (17)

The maximum amount of memory which can be addressed by the 2A0X is 64K. This is because the processor has a 16 bit address bus and thus cannot reference any memory beyond this boundary.

2.4.2.7. Page System (17)

Memory locations are divided into pages of 256 bytes. Crossing a page boundary often adds an extra cycle to the execution of an instruction. Two pages of memory are used for specific purposes. These are:

Page 0: It is possible to read from and write to page 0 memory faster than any other memory, as only one byte need be specified for the address (the speed advantage comes from needing fewer cycles for execution; the memory itself is no faster). Thus, it is usually used as "working memory", storing those data which will be accessed regularly.

Page 1: Page 1 consists of the stack. This is predominantly used for storing data that should be preserved whilst sub-procedure calls take place.
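The page arithmetic above can be sketched in a few lines of Java. This is a hedged illustration rather than the emulator's actual code: the class `PageMath` and all of its method names are hypothetical, and the stack helper assumes the stack pointer is treated as an 8-bit offset into page 1.

```java
/**
 * Illustrative sketch only (hypothetical helper, not the report's code):
 * page calculations for a 16-bit address space with 256-byte pages.
 */
final class PageMath {
    static final int PAGE_SIZE = 256;
    static final int STACK_PAGE_BASE = 0x0100; // first address of page 1

    /** The page number is simply the high byte of the 16-bit address. */
    static int pageOf(int address) {
        return (address & 0xFFFF) >>> 8;
    }

    /** True if adding an index to a base address lands in a different page. */
    static boolean crossesPage(int base, int index) {
        int effective = (base + index) & 0xFFFF;
        return pageOf(base) != pageOf(effective);
    }

    /** Page 0 addresses fit in a single byte, which is why they are cheaper to use. */
    static boolean isZeroPage(int address) {
        return pageOf(address) == 0;
    }

    /** Assumed mapping: an 8-bit stack pointer selects a byte within page 1. */
    static int stackAddress(int stackPointer) {
        return STACK_PAGE_BASE | (stackPointer & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(pageOf(0x01FF));            // page containing address 511
        System.out.println(crossesPage(0x10F0, 0x20)); // indexed access leaving its page
        System.out.println(stackAddress(0xFF));        // top of page 1
    }
}
```

A CPU core could call something like `crossesPage` when deciding whether to charge the extra cycle for an indexed access, although which instructions incur that penalty is not specified in this section.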
All other pages have no specific purpose.

2.4.2.8. ALU

The Arithmetic-Logic Unit is used to perform the arithmetic operations of the processor (addition, subtraction etc.) as well as the logical operations (AND, OR etc.). It takes two inputs and returns a single output.

Figure 3: The operation of the ALU unit (modified from (18))

2.4.2.9. Registers (17)

The processor has seven internal registers (all 8 bits wide). These are as follows:

1. X (index register)
This register can be used as a high-speed counter but is predominantly used as an index into a block of memory. It can also be used to get and set the stack pointer.

2. Y (index register)
This register is used in the same way as the X register (the only difference being that Y cannot be used to set or get the stack pointer).

3. S (Stack Pointer)
The stack pointer points to the current top of the stack. This register is effectively nine bits wide, with the ninth bit always set to "1". Thus, the stack pointer is only able to access memory locations in the range 256-511 (page 1 of memory). The stack pointer begins at location 511 and the stack grows "backwards" in memory.

4. PCH (Program Counter High)
This register stores the higher 8 bits of the Program Counter (where the Program Counter points to the next address in memory to be executed by the CPU).

5. PCL (Program Counter Low)
This register stores the lower 8 bits of the Program Counter.

6. A (Accumulator)
The Accumulator is tied to the left input of the ALU, with the right input typically being a memory location. When the result is computed, it is deposited into the Accumulator. This is why the 2A0X is said to use an Accumulator based design (the Accumulator being used both as an input and as the output), and it also explains the register's name (it accumulates results).

7. P (Status Register)

Figure 4: The 6502's status register (17)

The status register consists of seven one-bit flags (and an unused bit always set to one). They are defined as in the image above. Note that the Decimal flag serves no purpose in the 2A0X series (although the bit is still included in the status register and is available for programmer use).

2.4.2.10. Subroutines

When subroutine calls are made, there must be a means of returning to the previous point of execution once the subroutine has come to an end. This is achieved by pushing the return address (minus one) onto the stack before changing the PC to the first location of the subroutine.

When the subroutine comes to an end, the value previously pushed onto the stack is popped, incremented by one and becomes the new value of the PC. This way, execution continues from the point it was at prior to the subroutine call.

2.4.2.11. Interrupts (17)

There are three kinds of interrupt in the 2A0X:

1. IRQs (Interrupt Requests)
IRQs are the standard type of interrupt on the processor. This type of interrupt may be masked (that is, ignored by the processor) depending on the state of the interrupt flag in the status register.
When an IRQ is activated, the interrupt flag is set (so that all subsequent IRQs are ignored) and the PC and status register are pushed onto the stack (but not before the Break flag of the status register is set to "0"). Execution then jumps to the address stored at memory locations FFFE and FFFF. These two locations contain the IRQ interrupt vector.
When the interrupt is completed, the PC and status register are popped off the stack and execution continues as before.

2. NMIs (Non-Maskable Interrupts)
NMIs are of higher priority than IRQs and cannot be masked by the interrupt flag. NMIs are otherwise identical to IRQs except that they instead jump to the address stored at locations FFFA and FFFB.

3. BRK (Break)
The equivalent of a software interrupt, the BRK command behaves identically to an IRQ except that the Break flag of the status register is set to "1" as opposed to "0" (to allow the processor to differentiate between interrupt types).

In the case that multiple interrupts occur at once, the following rules are used:

1. The type of the interrupt is checked. If it is not an NMI, it is ignored.
2. If the interrupt is an NMI, the PC and status register of the currently executing interrupt are pushed onto the stack and the address for the new interrupt is loaded into the PC.

Other than these special rules, the normal rules of operation for interrupts are followed.

2.4.2.12. Reset

When the 2A0X is reset:

- The interrupt flag in the status register is set
- The PC is set to the contents of memory locations FFFC and FFFD
- The registers should be reset to their default values. (19)

Figure 5: The three vectors used by the 6502 (17)

2.4.3. PPU (2) (20)

2.4.3.1. General

The PPU generates a 256x240 pixel colour image (cropped to 256x224 on NTSC televisions). The PPU is external to the CPU and has its own memory. This memory cannot be directly accessed by the CPU; access is restricted to the manipulation of eight memory mapped registers.

2.4.3.2. The Rendering Process

Graphics are rendered onto the screen line by line (each line being a scanline), beginning at the top left of the display. Each line has a height of 1 pixel and a width of 256 pixels.

Two types of delay occur during the rendering process. These are HBlank and VBlank, discussed below:

2.4.3.2.1. HBlank

HBlank is the name given to the time that it takes to travel from the right hand side of the display back to the left after the current line has been rendered. This adds a delay to the rendering of scanlines.

2.4.3.2.2. VBlank

VBlank refers to the time that it takes to travel from the bottom right hand side of the display back to the top left. VBlank occurs once per frame.
The VBlank period is instrumental in allowing interactive programs to be written for the NES. It is during the VBlank period that most operations are performed (e.g. checking for user input, updating the scrolling etc.).

2.4.3.3. Sprites

Sprites are images which can be moved independently around the screen without causing damage to the background. Examples include "characters" (some of which are controllable by the user).

The hardware allows sprites to be either 8x8 or 8x16 pixels in size, and the maximum number of sprites is 64. 8x16 sprites are dealt with slightly differently from 8x8 sprites (discussed later).

All sprites have a priority which determines the order in which they are drawn by the PPU, with higher priority sprites being drawn last to ensure that they are drawn "on top".

The hardware allows all sprites to be flipped horizontally and vertically, and allows the programmer to specify whether a sprite should appear in front of the background or behind it. The NES allows a maximum of 8 sprites per scanline.

Most "characters" are constructed from multiple sprites.

Figure 6: Mario (above) is made up of 8 sprites (2)

2.4.3.4. Memory

The PPU has access to 16KB of memory. There also exists a separate 256 byte area of memory dedicated to storing sprite data, known as Sprite RAM. Each area of memory has a different purpose. These will now be explained.

Figure 7: Summary of PPU Memory (2)

2.4.3.4.1. Colour Palettes

The NES has three colour palettes. These are the master, sprite and image palettes. The master palette contains all the colours that the NES is capable of displaying (52, with space for 64). The sprite and image palettes each contain 16 colours. The sprite palette contains the colours that can be used for sprites, and the image palette contains the colours usable for the background tiles.
Both the sprite and image palettes are subdivided into groups of four colours (thus, each palette consists of four sub-palettes of four colours each). A further complication comes from the fact that the first element of each sub-palette is always transparent. This is achieved by mirroring the first palette entry every four bytes.

If, for a given pixel location on the screen, neither a sprite nor the background points to a non-transparent colour, a "fallback" background colour stored at location 3F00 is used.

The reasons why the palettes are constructed this way will hopefully become clear soon.

Figure 8: All 8 sub-palettes begin with a transparency element (adapted from (20))

2.4.3.4.2. Pattern Tables

The PPU has two pattern tables. The pattern tables store the 8x8 pixel tiles which make up the graphics used for both the sprites and the background.

NES graphics use 4-bit colour. However, only the two least significant bits of this colour are stored in the pattern tables. The attribute tables contain the remaining two bits for each pixel.

Each tile consists of two planes. The first contains the least significant bit of the colour and the second contains the most significant bit. These values are combined to make up the 2-bit colour (see below for a visual example).

Figure 9: An illustration of how tiles are made up (the remaining two bits are in the attribute tables) (2)

The two bits stored in the pattern tables determine which of the four colours in the currently active sub-palette should be used for the pixel. For example, if the graphic was a sprite, the two bits were '11' and the currently active sub-palette was the third (which sub-palette is active is determined by the attribute table value, see later), the pixel colour would be brown (using the above palettes).

2.4.3.4.3. Name Tables/Attribute Tables

Name tables and attribute tables are closely related.
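As a brief aside on the two-plane tile format described in 2.4.3.4.2, the combination of planes can be sketched as below. The sketch assumes the usual NES storage layout, which this section does not spell out: each tile occupies 16 bytes, with eight bytes of plane 0 followed by eight bytes of plane 1, one byte per row and bit 7 as the leftmost pixel. The function names are illustrative.

```python
def decode_tile(tile: bytes) -> list[list[int]]:
    """Combine the two bit planes of a 16 byte tile into 2 bit pixel values.

    Assumed layout: bytes 0-7 are plane 0 (least significant bit), bytes
    8-15 are plane 1 (most significant bit), bit 7 is the leftmost pixel.
    """
    if len(tile) != 16:
        raise ValueError("a tile is expected to be exactly 16 bytes")
    rows = []
    for y in range(8):
        low, high = tile[y], tile[y + 8]
        row = []
        for x in range(8):
            bit = 7 - x
            row.append((((high >> bit) & 1) << 1) | ((low >> bit) & 1))
        rows.append(row)
    return rows


def four_bit_colour(upper_bits: int, pixel: int) -> int:
    """Join the two upper bits (sub-palette) with the 2 bit tile value."""
    return ((upper_bits & 0x03) << 2) | (pixel & 0x03)
```

The result of `four_bit_colour` is an index into the 16 entry sprite or image palette: the upper two bits pick the sub-palette and the lower two pick the colour within it.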
Name tables consist of a 32x30 matrix where each element points to a tile in the pattern tables. As each tile is 8x8 pixels, the name table is 256x240 pixels in size (the size of the image generated by the PPU).

Attribute tables provide the upper two bits of colour for the tiles in the name tables. Each byte provides the colour information for a 4x4 block of tiles (32x32 pixels). Each pair of bits in the byte represents a quarter of this block (a 2x2 block of tiles, or 16x16 pixels). These two bits select the sub-palette to be used for that block of 16x16 pixels. Thus, for every block of 16x16 pixels in the name tables, only four colours are available.

Figure 10: An illustration of the colour palettes in action (interacting with the pattern and attribute tables)

2.4.3.4.4. Sprite RAM (SPR-RAM) (21)

Sprite RAM is a separate 256 byte area of memory which has the same purpose as the attribute tables, except that it is used to construct the sprites rather than the background.

The way that the graphical elements obtain the colour information for each of their pixels is summarised in the image below:

Figure 11: A summary of the interactions between the pattern tables and the Sprite RAM/Attribute Tables (Pattern table image from (22))

2.4.3.4.5. Name Table Mapping

The NES is capable of handling up to four name tables/attribute tables but only has enough memory to store two. However, via the use of mirroring it is possible to address all four. This capability is important when it comes to scrolling the screen. The four name tables are arranged in a 2x2 pattern as illustrated below.

Figure 12: A visual representation of the name tables' layout

In describing the mechanics of name tables, the concepts of physical and logical name tables will be used. Physical name tables are the actual memory used to represent the name tables, whereas logical name tables are those which are addressable via PPU memory.
Each mirroring technique is described below and is immediately followed by an image illustrating the technique in use. The images used are real name tables taken from Super Mario Brothers 2 (23).

Horizontal mirroring maps 0x2000 and 0x2400 to the first physical name table and 0x2800 and 0x2C00 to the second physical name table.

Figure 13: Horizontal Mirroring

Vertical mirroring maps 0x2000 and 0x2800 to the first physical name table and 0x2400 and 0x2C00 to the second physical name table.

Figure 14: Vertical Mirroring

Single-screen mirroring maps all four logical name tables to the same physical name table.

Figure 15: Single-Screen Mirroring

Four-screen mirroring allows each logical name table to map to a separate physical name table. An additional 2KB of memory (on the game cartridge) is required to store the two extra physical name tables.

Figure 16: Four-Screen Mirroring

2.4.3.5. Registers

2.4.3.5.1. Control Register

This register controls the operation of the PPU by allowing various parameters needed for operation to be specified, for example the sprite and background pattern table addresses and the base name table address.

2.4.3.5.2. Mask Register

This register controls the behaviour of the graphics rendering process. Amongst other things, it allows background and sprite rendering to be enabled or disabled, allows rendering in greyscale mode and allows certain colours to be intensified.

2.4.3.5.3. Status Register

Like the CPU and APU, the PPU has a status register, which is used in much the same way.

2.4.3.5.4. SPR-RAM Address and Data Registers

These registers allow reading from and writing to Sprite RAM. The address to be accessed is written into the address register, and it is then possible to read from or write to this address using the data register.

2.4.3.5.5. Scroll Register

This register is used for specifying horizontal and vertical scroll offsets, determining in what direction and by how much the screen should scroll.

2.4.3.5.6. Pattern Tables Address and Data Registers

These work in the same way as the SPR-RAM address and data registers except that they allow access to pattern table addresses rather than those of Sprite RAM.

2.4.3.6. Direct Memory Access (DMA) (2) (24)

During the execution of a program, large quantities of data will need to be periodically transferred between CPU memory and Sprite memory. To make this as efficient as possible, a DMA controller exists. Using DMA, large amounts of memory can be transferred from one place to another without the need to go through the processor.

The CPU must be informed before DMA begins and again once all the data has been transferred. This ensures that the DMA controller has exclusive access to the memory bus whilst copying.

In the case of the NES, access to the DMA controller is achieved by writing to a special register (4014). The value written is multiplied by 0x100 to give the starting address, and 256 bytes are then transferred from this address onwards.

2.4.3.7. Background Rendering

2.4.3.7.1. Overall

A high level look at rendering a scanline to the display follows:

1. Obtain the pattern table information relevant for the current line, split into two layers.
2. Convert the information into 2 bit colour form by combining the two layers.
3. Retrieve the attribute table information for the current line and combine it with the 2 bit colour form to make the complete 4 bit colour form.
4. Output the line to the display using the relevant palette entries.

2.4.3.7.2. Scrolling

Scrolling is achieved via the use of multiple name tables. When the user's "character" moves sufficiently close to the edge of a name table, the name table next to the currently displayed one (which one will vary depending on the mirroring technique chosen) will begin to be used for screen rendering (producing a composite of the two name tables).
This continues until the first name table has been "scrolled" out of view, resulting in the rendered image coming entirely from the second name table. To maintain the ability to scroll, when a name table is not visible during rendering, the PPU will often be filling the table with the graphics to be viewed when scrolling begins again.

This process is illustrated below:

Figure 17: A summarisation of the NES scrolling mechanism (adapted from (2))

2.4.3.7.3. Rendering Modes

Rendering modes influence the image rendered to the display. The possible modes follow:

- Greyscale
- Background/sprite clipping
- Background/sprite rendering turned on/off
- Colour intensity (RGB)

Specifying rendering modes is a simple process of writing to certain bits of PPUMASK, which are then used by the PPU to decide exactly how the rendering should take place.

The background rendering process is explained in far greater detail discussing the registers used along with how they are interpreted and manipulated by the machine.

2.4.3.8. Sprite Rendering

2.4.3.8.1. Sprite Evaluation (21)

During each scanline (341 clocks), the PPU accesses the Sprite RAM in a particular pattern. This procedure determines which sprites are on the current scanline, making them available for drawing to the screen. In more detail:

- There are two types of Sprite Memory: main and secondary. The sprites held in secondary memory after sprite evaluation are the ones which are to be rendered to the display.
- Each sprite's Y co-ordinate is stored in secondary memory and evaluated.
- If the sprite is present on the current scanline, the remainder of the sprite's attribute data is copied into secondary memory.
- This process continues for all sprites in memory until either:
  o Eight sprites have been copied to secondary memory. At this point, the evaluation logic becomes very erratic (having eight sprites on a scanline effectively "breaks" the logic).
  o All sprites have been checked.
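The evaluation steps listed above can be approximated with the following Python sketch. It is a deliberate simplification rather than a model of the hardware: it stops cleanly at eight sprites instead of reproducing the erratic behaviour, it assumes four bytes per sprite with the Y co-ordinate first, and it treats that byte as the sprite's top line. All names are hypothetical.

```python
SPRITE_COUNT = 64
BYTES_PER_SPRITE = 4
MAX_SPRITES_PER_SCANLINE = 8


def evaluate_sprites(sprite_ram, scanline: int, sprite_height: int = 8):
    """Return the entries of sprites that overlap the given scanline.

    The returned list plays the role of secondary sprite memory. Sprites
    are checked in order and evaluation stops once eight have been found.
    """
    if len(sprite_ram) != SPRITE_COUNT * BYTES_PER_SPRITE:
        raise ValueError("sprite RAM is expected to be 256 bytes")
    secondary = []
    for i in range(SPRITE_COUNT):
        start = i * BYTES_PER_SPRITE
        entry = sprite_ram[start:start + BYTES_PER_SPRITE]
        top = entry[0]  # byte 0: Y position of the top of the sprite
        if top <= scanline < top + sprite_height:
            secondary.append(bytes(entry))
            if len(secondary) == MAX_SPRITES_PER_SCANLINE:
                break
    return secondary
```

Passing `sprite_height=16` would cover the 8x16 sprite mode under the same assumptions.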
Only the data used to determine where on the display the sprites should be rendered are stored in Sprite Memory. The actual sprite data is stored in one of the pattern tables.

(2.9) A flow chart representing this sprite evaluation logic.

2.4.3.8.2. Sprite Rendering

Each sprite takes up 4 bytes of memory.

Byte 0: The Y position of the top of the sprite (used to determine whether the sprite is present on the current scanline, see later).
Byte 1: The tile index number used for this sprite.
Byte 2: Sprite attributes.
Byte 3: The X position of the left side of the sprite.

Sprite evaluation essentially works by checking whether each sprite is present on the current scanline, using the sprite's top Y co-ordinate (byte 0) to determine this. Any sprites which should be rendered to the display are stored in secondary Sprite RAM.

Much of the above logic is unnecessary in achieving an accurate emulation. Also, much of the more complicated logic occurs only once more than 8 sprites have been found which should be rendered on the scanline. This behaviour will not be emulated for two reasons:

1. Original NES titles did not use more than 8 sprites per scanline because of the machine's inability to handle any larger number in a deterministic manner. Thus, original titles will emulate correctly.
2. Removing the 8 sprite restriction will allow current day developers additional freedom in writing their applications.

Removing this restriction makes the use of secondary Sprite RAM impractical, as there is no way of gauging how much memory needs to be set aside for sprite storage. Sprites will instead be rendered to the display as they are found.

2.4.3.8.2.1. The Rendering Process

For each sprite in secondary Sprite RAM (provided that the sprite is set to be rendered above the background):

1. Obtain the pattern table information relevant for the current sprite and line, split into two layers.
2. Convert the information into 2 bit colour form by combining the two layers.
3. Use byte 2 of the current sprite's information to obtain the 2 high colour bits and combine them with the 2 bit colour form to make the complete 4 bit colour form.
4. Output the sprite to the display at the correct X position (using byte 3) using the relevant palette entries.

2.4.3.8.2.2. Sprite Size

NES sprites can be either 8x8 or 8x16 pixels. This is handled by byte 1 of the sprite data and PPUCTRL.

By manipulating bit 5 of PPUCTRL (counting from bit 0, the least significant bit), the programmer can decide which size of sprite to use. The PPU uses the value of this bit to determine how byte 1 should be interpreted.

8x8: Byte 1 specifies the tile number to be used.
8x16: Byte 1 specifies which tile bank should be used and the tile number of the top of the sprite. The bottom half of the sprite uses the next tile in the bank.

2.4.4. APU (25) (26)

2.4.4.1. General

The NES APU (Audio Processing Unit) consists of five sound channels. These are as follows:

- Pulse 1 (Pulse wave)
- Pulse 2 (Pulse wave)
- Triangle (Triangle wave)
- Noise (Pseudo-Random)
- DMC (Delta Modulation)

With the exception of the DMC channel (which plays samples), all channels play waveforms.

The unit is made up of many smaller units which interact with one another in order to achieve the audio capabilities of the NES. This interaction is achieved largely via clocking; that is, when one unit wishes to interact with another, the first clocks the second.

2.4.4.1. Component Building Blocks

Most of the components in the APU use the following "building blocks", combined in different ways so as to achieve varying effects. Descriptions of these base units follow:

2.4.4.1.1. Divider

A divider outputs a clock every n clocks, where n is the divider's period. It contains a counter which is decremented on the arrival of each clock. When this counter reaches zero, the counter is reloaded with the period and an output clock is generated.
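A divider as described here is small enough to sketch in full. The Python class below follows the report's wording literally (an output clock on every n-th input clock); the class and method names are illustrative assumptions rather than the emulator's actual code.

```python
class Divider:
    """Outputs one clock for every `period` input clocks."""

    def __init__(self, period: int) -> None:
        if period < 1:
            raise ValueError("period must be at least 1")
        self.period = period    # changing this later leaves counter untouched
        self.counter = period

    def clock(self) -> bool:
        """Consume one input clock; return True when an output clock fires."""
        self.counter -= 1
        if self.counter == 0:
            self.counter = self.period  # reload with the period
            return True
        return False

    def reload(self) -> None:
        """Restart the count immediately without generating an output clock."""
        self.counter = self.period
```

With `Divider(4)`, the first three calls to `clock()` return False and the fourth returns True, after which the pattern repeats.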
It is possible to force a divider to reload its counter immediately. If this is done, an output clock is not generated. Changing the divider's period does not affect the current count.

2.4.4.1.2. Sequencer

Sequencers continuously loop over a sequence of values or events. When clocked, the next item in the sequence is generated. For example, a sequencer could have a sequence of events where each item causes specific units in the APU to be clocked.

A table showing each channel and their use of these building blocks can be seen below:

2.4.4.2. Channel Units

There are several components used by the various channels in order to alter the output of that channel (for example, increasing or decreasing the volume). A description of each follows.

2.4.4.2.1. Building Block Usage

Several of the channel units use the building blocks defined above. A summary table follows:

Building Block / Unit: Timer, Length Counter, Envelope, Sweep, Linear Counter, Shift Register w/ feedback, Memory Reader, Sample Buffer, Output Unit, Frame Counter, Status Register
Divider: X
Sequencer: X X X X

Table 1: A summarisation of the use of the "building blocks" in "channel units"

2.4.4.2.2. Timer

A timer is equivalent to a divider.

2.4.4.2.3. Length Counter

The length counter provides automatic duration control for the waveform channels of the APU. The channel can be set to continue playing until it is told to stop, or it can be set to play for a certain amount of time and then fall silent.

2.4.4.2.4. Envelope

The purpose of the Envelope unit is to control the volume of the waveform channels. The volume can be set to be constant or to decrease gradually over time. It also provides the capability for sound looping.

2.4.4.2.5. Sweep

The sweep unit allows the frequency of the Pulse channels' output to be increased or decreased periodically. This allows a variety of effects to be achieved.

2.4.4.2.6. Linear Counter

This is simply an additional duration timer which is similar to the Length Counter except that it is of higher accuracy (being 7 bits as opposed to the Length Counter's 5 bits).

2.4.4.2.7. Shift Register w/ feedback

This unit provides a mechanism for generating pseudo-random bit sequences for use in the noise channel. The bit sequences are generated using the current state of the register, the exclusive-ORing of bits and a right shift.

2.4.4.2.8. Memory Reader, Sample Buffer and Output Unit

These three units work together to achieve the functionality of the DMC channel. The sample buffer stores one byte of sound from the currently playing sample (with the data retrieved from memory). The sample buffer is populated via the memory reader whenever it is emptied. The output unit continuously outputs complete sample bytes to the mixer.

Figure 18: A visual representation of the DMC channel's operation (Mixer image, see (27))

2.4.4.2.9. Frame Counter

The frame counter acts as a master unit, periodically clocking various channels and other units. It has two sequences at its disposal, each of which clocks different units at different steps within the sequence.

For example, if the first sequence is chosen and the frame counter is clocked, it will clock all envelopes and the triangle's linear counter. When the frame counter is clocked again, it will clock all length counters and sweep units. It continues in this fashion, looping back to step 1 once the end of the sequence is reached.

This is an extremely important unit, playing a major part in ensuring that the sound is synchronised correctly.

2.4.4.2.10. Status Register

The status register provides a means of enabling/disabling audio channels and querying the current playing status of the channels. It also allows various interrupt flags to be read.

2.4.4.3. Channels

2.4.4.3.1. Mixer

Output from all channels is sent to the mixer.
The mixer then combines the channel outputs so as to produce a single output value. This is the value that is used for playing the sound.

2.4.4.3.2. Channel Unit Usage

Each channel uses several of the units defined above. A summary table follows:

Units: Timer, Length Counter, Envelope, Sweep, Linear Counter, Shift Register w/ feedback, Memory Reader, Sample Buffer, Output Unit, Frame Counter, Status Register
Channels: Pulse, Triangle, Noise, DMC
Pulse X X X X Triangle Noise X X X X X DMC X X X X X X X X X X X X X X

Table 2: A summarisation of the use of "channel units" in the sound channels

2.4.4.3.3. Pulse

The Pulse channel outputs a pulse wave and is capable of outputting one of four waveform sequences at a time. The Pulse channels have four registers at their disposal which allow the output to be altered in several ways:

- The waveform sequence to be used for output
- The frequency that the sound should be played at
- The amount of time that the sound should be played for (or indefinitely)
- The volume that the sound should be played at (constant or decreasing over time)
- The rate at which the channel's frequency should be increased/decreased over time (if at all).

The two Pulse channels differ only in the way that periodic frequency shifting (sweep unit) is calculated.

2.4.4.3.4. Triangle

The triangle channel outputs a pseudo-triangle wave. The triangle channel has access to three registers which allow the output to be altered in the following ways:

- The frequency that the sound should be played at
- The amount of time that the sound should be played for (or indefinitely)

The triangle channel also contains an additional counter to allow the output to play for a longer period of time than the Pulse channel.

2.4.4.3.5. Noise

The noise channel outputs pseudo-random 1 bit noise at 16 different frequencies.
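The pseudo-random bits come from the shift register with feedback described in 2.4.4.2.7. A minimal Python sketch follows. The register width (15 bits) and the tap positions (bits 0 and 1, or bits 0 and 6 in the shorter mode) are taken from commonly published NES APU documentation, not from this report, so they should be read as assumptions; the function names are illustrative.

```python
def lfsr_step(register: int, short_mode: bool = False) -> int:
    """Advance a 15 bit shift register by one clock.

    Feedback is bit 0 XOR bit 1 (or bit 0 XOR bit 6 in short mode). The
    register is shifted right and the feedback bit enters at bit 14.
    """
    tap = 6 if short_mode else 1
    feedback = (register ^ (register >> tap)) & 1
    return (register >> 1) | (feedback << 14)


def cycle_length(seed: int = 1, short_mode: bool = False) -> int:
    """Count the steps taken before the register returns to its seed."""
    state = lfsr_step(seed, short_mode)
    steps = 1
    while state != seed:
        state = lfsr_step(state, short_mode)
        steps += 1
    return steps
```

Under these assumptions `cycle_length(1)` gives 32767, and bit 0 of the register would serve as the channel's 1 bit output.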
The noise channel has access to three registers used to achieve the following capabilities:

- The amount of time that the sound should be played for (or indefinitely)
- The volume that the sound should be played at (constant or decreasing over time)
- The frequency that the sound should be played at
- The mode to be used:
  o Long mode: 32767 bit long sequences
  o Short mode: 93 bit long sequences

2.4.4.3.6. Delta Modulation Channel (DMC)

The DMC channel outputs samples retrieved from memory. It uses four registers as well as a memory reader, sample buffer and output unit (described above) to provide the following, in addition to the basic ability to stream samples to the mixer:

- The frequency that the sound should be played at
- The ability to directly load the values to be output to the mixer
- The ability to loop the sample
- The ability to use the CPU's IRQ interrupt mechanism to make playback more flexible.

Note
Complex details concerning how the APU channels work are not included in this report (for example, the exact sequence information of the Frame Counter and the lookup tables used by various channels). However, this information is freely available from (28) (26).

2.4.5. Input/Output (2) (27)

2.4.5.1. Ports

The NES has two input ports (both of which can be read simultaneously) and an expansion port underneath the system.

The expansion port can be written to via the lowest three bits of register 4016.

The state of the input devices is accessible by reading the bottom five bits of 4016 for the first device and 4017 for the second. Reading from the expansion port is also possible from either 4016 or 4017.

2.4.5.2. Standard Control Pad

The most widely used input device is the standard rectangular control pad. This is the input device which will be emulated (emulation of further devices is a possibility if time permits).

The annotated control pad can be seen below:

Figure 19: An annotated Standard NES control pad (unannotated image from (2))

2.4.5.2.1. Determining Input State

In order for programs to be interactive, there needs to be a mechanism for determining the actions the user has performed so that a response can be made. The standard control pads achieve this by requiring one read of the appropriate register (4016 for pad 1, 4017 for pad 2) for each of the pad's buttons. That is:

Read #   Button
1        A
2        B
3        Select
4        Start
5        Up
6        Down
7        Left
8        Right

Table 3: The state of each button is obtained by reading the controller registers a certain number of times

Before these reads can be made, however, it is necessary to strobe the controllers.

2.4.5.2.2. Strobing

When the controllers are strobed, their current status is stored. It is then obtainable via the method above. Thus, strobing is required before every read of the controller state.

A strobe is initiated by writing a '1' to the lowest bit of the 4016 register. Writing a '0' to this bit then stores the states and allows them to be read. If 4016 is read after '1' has been written, it will return the current state of the A button (not the state stored upon a complete strobe).

Figure 20: Controller Strobe State Diagram

2.4.6. Development Tools

2.4.6.1. Debugger

The debugger should have the following features:

2.4.6.1.1. Register State Pane

2.4.6.1.1.1. Functionality

1. CPU Register values
2. PPU Register values
3. PPU State
   a. Base Name Table
   b. VRAM address increment amount
   c. Sprite Pattern Table Address
   d. Background Pattern Table Address
   e. Sprite Size
   f. NMI generation active status
4. Scroll Values
   a. X Tile
   b. Y Tile
   c. X Fine Positioning (X position within the current tile)
   d. Y Fine Positioning (Y position within the current tile)

The Register State Pane should display information about the various registers in the NES as the program executes. This includes both displaying the register contents as held in memory (1 and 2) and displaying state based on interpreting certain bits of said registers (3 and 4).
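To illustrate what interpreting certain bits of a register might look like for the PPU State items (3a to 3f), here is a hedged Python sketch that decodes a PPUCTRL value. The bit assignments follow the commonly documented layout of the PPU control register and are not taken from this report; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PpuControlState:
    base_name_table: int           # 0x2000, 0x2400, 0x2800 or 0x2C00
    vram_increment: int            # 1 or 32
    sprite_pattern_table: int      # 0x0000 or 0x1000
    background_pattern_table: int  # 0x0000 or 0x1000
    sprite_size: str               # "8x8" or "8x16"
    nmi_enabled: bool


def decode_ppuctrl(value: int) -> PpuControlState:
    """Interpret a PPUCTRL byte (assumed standard bit layout)."""
    value &= 0xFF
    return PpuControlState(
        base_name_table=0x2000 + 0x400 * (value & 0x03),             # bits 0-1
        vram_increment=32 if value & 0x04 else 1,                    # bit 2
        sprite_pattern_table=0x1000 if value & 0x08 else 0x0000,     # bit 3
        background_pattern_table=0x1000 if value & 0x10 else 0x0000, # bit 4
        sprite_size="8x16" if value & 0x20 else "8x8",               # bit 5
        nmi_enabled=bool(value & 0x80),                              # bit 7
    )
```

A debugger pane could call something like this after each step and display the fields next to the raw register value.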
It is hoped that this information will prove helpful in the debugging of applications. For example, use of the scroll values to help correct any scrolling related issues in the users' code. 45 2.4.6.1.2. Breakpoints Pane 2.4.6.1.2.1. Functionality 1. Break upon NMI 2. Break upon BRK 3. Register value a. Equals b. is Greater Than c. is Greater Than or equal d. is Less Than e. is Less Than or Equal 4. Remove Breakpoints 5. Remove all Breakpoints The most important function here is 3, allowing the user to break execution upon a particular register meeting a condition. Most likely used much of the time to stop execution at a particular line of code (PC equals …). 2.4.6.1.3. Tools Pane 2.4.6.1.3.1. Functionality 1. Step Executes one instruction and then pauses execution. 2. Resume Leaves Step mode and continues execution only breaking when the next breakpoint is met. 3. Seek PC Jumps to and highlights the instruction held at a given location in memory (see code pane). 2.4.6.1.4. Code Pane This should display the code of the currently loaded ROM. This will require a disassembler to be written. 46 2.4.6.1.5. Memory Pane This should display the contents of the CPU and PPU memory. The memory should be divided into sections (Stack, Zero Page etc) and also be divided by Unit (CPU and PPU). 2.4.6.2. Pattern Table Viewer 1. The tiles stored in the pattern tables should be visible. 2. The palette colours used by the sprite and image palettes should be visible. 3. Tile Information: a. Pattern Table Number b. Tile Number (in the order the tiles are stored in the tables). 4. Palette Information: a. Which Palette (image or sprite). b. Entry number within the palette. c. Entry number within the master palette. 5. Display options. a. Automatic Table refresh? b. Table refresh rate (between 0 and 5 seconds). 2.4.6.3. Name Table Viewer 1. The images stored in the name tables should be visible. 2. The attribute table data should be visible. 3. Name Table numeric data display. 
   This should allow the user to view the actual data stored in the name tables in an accessible form. That is, the data should be displayed in a grid, making it easier to identify particular parts of the name table than a flat display of memory would.
4. Display Options
   a. Scroll Lines should be visible on the name tables, representing the scroll register visually.
   b. Automatic Table refresh?
   c. Table refresh rate (between 0 and 5 seconds).
   d. Two bit colour display: displays the tables as if the attribute bits were not added to the pattern table tiles which make up the name tables.

2.4.7. Further Specifications

In addition to the analysis above, several other aspects of the system have been specified:

Cartridge Specification: documents the workings of the NES cartridges (containing the executed software).
File Format Specification: a header at the beginning of the ROM files is necessary to execute them on foreign hardware.
Regional Differences Specification: NES hardware varies based on region. These differences are discussed here.

3. Design

3.1. Overall System Design

The system can be broken down into several logical units. This overall structure is detailed in the following sections.

3.1.1. Packages

Figure 21: Package Diagram

Note that the above diagram only indicates the important package dependencies.

3.1.2. Classes

3.1.2.1. CPU

The CPU uses the Memory package to provide objects representing the whole CPU memory system (one object per memory type, e.g. Stack, CartRAM etc.). The Memory package is discussed in detail later in the report.

3.1.2.2. PPU

The PPU holds a reference to a Palette object. The Palette object allows the easy manipulation and recalculation of Palette colours. It also provides an easy means of updating the palette entries for the image and sprite palettes.

3.1.2.3. Inputs

Input devices for the NES follow a common format.
Thus, an interface would seem desirable to allow the generalization of all inputs (allowing simpler, better structured code).

3.1.2.4. APU

Figure 22: APU Class Diagram

3.1.2.5. GUI

Figure 23: GUI Class Diagram

3.1.2.6. Development

Figure 24: Development Class Diagram

3.2. CPU Design

3.2.1. General (29)

The general method to be used for emulating the CPU core is a common one. It will consist of an infinite loop which continues to run until the user chooses to halt the execution. The body of the loop will have two main purposes:

1. Executing the 6502 instructions of the running program.
2. Dealing with the cyclic tasks which must be performed.

Cyclic tasks are those tasks which need to be performed periodically. Examples include the refreshing of the screen and input state.

The CPU will execute machine instructions for a pre-determined number of CPU cycles before breaking to deal with the aforementioned cyclic tasks. An integer interrupt period will be used to keep track of the number of cycles left to be executed. This variable will be decremented by the number of cycles consumed after each instruction is executed. Once the cyclic tasks have been dealt with, the interrupt period will be restored. The interrupt period should be set to the greatest common divisor of the number of cycles required for each task.

Booleans will be used to allow the CPU to be paused and stopped. If the CPU is paused, it will simply not execute any further instructions (whilst remaining in the loop) until the CPU is un-paused. To aid efficiency, it will only be possible to stop the CPU (breaking out of the loop) once the interrupt period is exhausted.

It should be noted that this efficiency measure results in a quirk in the way the CPU runs. It is necessary to ensure that the CPU is not paused before being able to stop it:

Figure 25: CPU State Transitions

Appendix F (1.1)

3.2.1.1.
Data Mirroring

Because the NES mirrors much of the data stored in memory, addresses sometimes need to be converted into physical addresses before they are used. This can be achieved by performing a bitwise AND between the address and the maximum physical address for that type of memory. Pseudo code illustrating this follows:

    int : actualAddress;
    if (address < 0x2000) {
        // Mirroring of zero page, stack and the CPU RAM
        actualAddress = address & 0x7FF;
    } else if (address < 0x4000) {
        // Mirroring of PPU I/O registers (0x2000 to 0x2007, repeated every 8 bytes)
        actualAddress = address & 0x2007;
    } else {
        // No other memory types involve mirroring
        actualAddress = address;
    }

3.2.2. Memory

Different ranges of memory addresses provide different effects and types of values when written to or read from. For example, some memory cannot be written to and some changes the behaviour of other units (such as the PPU) when written to. In order to handle this, several classes will be created, each representing a different range of memory. All classes will inherit from a Memory super-class. All memory will be represented as arrays of primitive integers inside Memory objects. This scheme should also help to simplify the implementation of memory mappers (MMCs).

The size of the memory in bytes will be specified to the Memory object's constructor. The integer array representing that range of memory will then be created. For several of the Memory classes, this will be all that is required as they inherit the reading/writing behaviour from their super class "Memory". Exceptions are where a range of memory should not be written to or some other behaviour should be performed instead. These exceptions are:

- Cart and Expansion ROM should not be written to (override writeToMemory with an empty method).
- Reading and writing to the IO should instead read and write to the PPU.
- The Stack will require two additional methods: push and pop.

Figure 26: Memory Class Hierarchy

Appendix F (1.2)

3.2.3. Registers

3.2.3.1.
General

All registers shall be stored as primitive integers. The program counter will be stored as a single 16-bit value instead of two 8-bit values for simplicity and efficiency.

3.2.3.2. Status Register

The status register must be kept up to date at all times to ensure that programs are executed correctly. The CPU itself manipulates the overflow, carry, zero and negative flags (although user code can switch all other bits in the register). Efficient ways of setting/clearing these flags have already been developed and are in wide use. These methods are explained below (with appropriate referencing).

3.2.3.2.1. Zero and Negative Flags

A common approach to dealing with the setting/clearing of the Zero and Negative bits is the use of a lookup table (30). This avoids a pair of comparisons every time a register changes. The lookup table is typically stored as an array with 256 elements. When a register's value is changed, its new value is used as an index into the array and a bitwise OR is performed between the returned value and the status register (whose Zero and Negative bits must first be cleared, since an OR alone can only set bits). This ensures that the negative and zero flags are always set to the correct values (as long as the register used is 8-bit and in 2's complement format).
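The lookup described above might be built and applied along the following lines. This is only a sketch: the class and method names are hypothetical, and the step that clears the two flag bits before the OR is an assumption added here rather than something the text specifies.

```java
/** Zero/Negative flag lookup for an 8-bit register (hypothetical helper). */
final class ZnFlags {
    static final int ZERO_FLAG = 0x02;      // set when the value is 0
    static final int NEGATIVE_FLAG = 0x80;  // set when bit 7 of the value is 1

    // One entry per possible 8-bit register value.
    static final int[] ZN_TABLE = new int[256];
    static {
        ZN_TABLE[0] = ZERO_FLAG;
        for (int value = 0x80; value <= 0xFF; value++) {
            ZN_TABLE[value] = NEGATIVE_FLAG;
        }
        // Entries 1..127 stay 0: neither flag is set.
    }

    /** Returns the status register with Z and N recomputed for the given value. */
    static int apply(int status, int value) {
        int cleared = status & ~(ZERO_FLAG | NEGATIVE_FLAG);
        return cleared | ZN_TABLE[value & 0xFF];
    }
}
```

An opcode such as LDA would then finish with something like status = ZnFlags.apply(status, accumulator).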
    Status Register |= znTable[Register]

Index        Value
0            002
1 to 127     000
128 to 255   128

Table 4: A lookup table allowing efficient setting of the Zero and Negative bits (znTable)

3.2.3.2.2. Overflow Flag

An efficient way of determining whether the overflow flag should be set/cleared is by performing an Exclusive OR between the carry-in and carry-out of bit 8 (the most significant bit) of the register whose value was just changed (17). This fact was used to create the efficient overflow code below for addition (30):

    If ((comp (accumulator EOR operand) & (accumulator EOR result) & 0x80) != 0) {
        Status register |= bit 7; // Set overflow bit (mask 0x40)
    }
    // comp RHS = The complement of the RHS, i.e. all bits are inverted.
    // & = Bitwise AND

In other words, overflow has occurred when the accumulator and the operand have the same sign but the result has a different one.

3.2.3.2.3.
Carry Flag

The carry bit can be set if either of two conditions is met.

1. If addition is taking place, carry has occurred if the result of the addition is more than 255.

    If result > 255 {
        Status register |= bit 1; // Set carry bit (mask 0x01)
    }
    // LHS |= RHS = Shorthand for LHS = LHS (bitwise OR) RHS

2. If subtraction is taking place, carry has occurred (no borrow was needed) if the result of the subtraction is more than or equal to zero.

    If result >= 0 {
        Status register |= bit 1; // Set carry bit (mask 0x01)
    }

3.2.3.2.4. Other Flags

All other flags are simply set/cleared by use of certain instructions.

3.2.4. Addressing Modes (17) (2) (19)

This will involve the implementation of several methods, each one emulating the function of one of the 6502's addressing modes. Each of the addressing modes manipulates and uses the addresses given in a different way (for example, one takes two eight bit values and forms a sixteen bit address, another simply uses the eight bit address given as is).

Appendix F (1.3)

All methods will be situated in the main CPU class. Most will make use of the getMemory() utility function (discussed later) in order to easily obtain the data from memory. The methods will simply return an integer value of the address to be used.

3.2.5. Opcodes

All opcode execution will be dealt with via a simple switch/case setup.

    switch (opcode) {
        case (0x00) : // Code to emulate opcode execution
        case (0x01) : // …
    }

The number of cycles that each opcode takes to complete varies. A lookup table will be implemented where the cycle count can be determined by using the opcode value as the index. This will be implemented as a 256 element primitive integer array (providing space for information about all opcodes, documented or not).

3.2.6. Interrupts

Upon being triggered, the interrupts simply carry out a number of actions on the system's memory and registers. These actions are best explained in the form of high level pseudo code below:

3.2.6.1.
IRQs

    void : IRQ() {
        // Returning from the interrupt is achieved via the RTI opcode.
        if (interrupt flag is NOT set) {
            Clear Break Flag
            Push PC onto Stack
            Push Status Register onto Stack
            Set Interrupt Flag
            PC = loadWord(0xFFFE);
        }
    }

3.2.6.2. NMIs

    void : NMI() {
        // Returning from the interrupt is achieved via the RTI opcode.
        // NMIs are non-maskable, so the interrupt flag is not checked.
        Clear Break Flag
        Push PC onto Stack
        Push Status Register onto Stack
        Set Interrupt Flag
        PC = loadWord(0xFFFA);
    }

3.2.6.3. BRKs

    void : BRK() {
        // Returning from the interrupt is achieved via the RTI opcode.
        // BRK is a software interrupt and is not masked by the interrupt flag.
        Set Break Flag
        Push PC onto Stack
        Push Status Register onto Stack
        Set Interrupt Flag
        PC = loadWord(0xFFFE);
    }

In each case the status register is pushed before the interrupt flag is set, so that RTI restores the flag to its state prior to the interrupt. When the RTI instruction is encountered (the interrupt has come to an end), the status register and PC are popped off the stack and execution continues as before.

3.2.6.4. System Reset

    void : reset() {
        Reset all registers and memory
        Set the interrupt flag
        PC = loadWord(0xFFFC);
    }

3.2.7. Direct Memory Access (DMA)

This will be implemented as a method within the CPU class. It will take the page to begin reading from as a parameter (the value should be multiplied by 0x100 to obtain the start address) and will fill the Sprite Memory by looping through the CPU memory, copying 256 consecutive values across.

For the sake of efficiency, the data should be written to a temporary array within the CPU class before passing this array across to the PPU. The PPU sprite memory object (an integer array) will then be made to hold a reference to this new array. This will save on the method invocation overhead of writing each byte to sprite memory individually.

Appendix F (1.4)

3.2.8. Utility Methods

A number of utility methods should be written so as to simplify implementation and eliminate the duplication of code:

1. branch(int : branchOpCode). This method will deal with determining whether or not branching should occur from one memory location to another dependent on the state of the status register flags.
   It returns a Boolean.
2. checkPageBoundary(int : oldAddress, int : newAddress). It is often the case that if a memory operation crosses the current page boundary (each page being 256 bytes) an additional cycle is necessary for computation. This method will return a Boolean indicating whether a page boundary has been crossed.
3. loadWord(int : lowestByte). This is a convenience method which returns a 16-bit value when given the address of the lowest byte of the value.
4. getMemory(int : memory). This method will return the data stored at the memory address given. In order to achieve this, it will need to take account of memory mirroring and will need to determine which section of memory the address corresponds to (which Memory object the data should be obtained from).

Appendix F (1.5)

3.3. PPU Design

3.3.1. General

The execution of the PPU will be dealt with on a per scanline basis. That is, the graphical data will be written to the display one line of pixels at a time. Specifically, the process will take the following form:

1. When the CPU has executed a defined number of cycles, all cyclic tasks will be dealt with (see CPU design).
2. If the number of cycles required for the output of a scanline has been met, the PPU will generate and output the next scanline.

3.3.2. Registers

The first three registers of the PPU represent how various aspects of the PPU behave. In general, each bit of each register determines the behaviour of one aspect. For example, one bit in PPUMASK determines whether background rendering should occur. Another specifies whether the display should be rendered in greyscale or not.

These PPU behavioural aspects are to be stored as simple variables within the PPU class. The updating of these values will be dealt with within a single method (updateState()). updateState() should be called just prior to performing any rendering to ensure correct rendering. Details of the effects of manipulating the bits of these registers are provided below.
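As a rough sketch of what an updateState() method might look like for one of these registers, the following decodes PPUMASK ($2001) into simple variables. The class and field names are hypothetical, and only PPUMASK is shown; the bit positions follow the commonly documented layout of the register rather than anything stated in this report.

```java
/** Behavioural flags derived from PPUMASK ($2001) (hypothetical class). */
final class PpuMaskState {
    boolean greyscale;           // bit 0: render in greyscale
    boolean showBackgroundLeft;  // bit 1: show background in leftmost 8 pixels
    boolean showSpritesLeft;     // bit 2: show sprites in leftmost 8 pixels
    boolean showBackground;      // bit 3: background rendering enabled
    boolean showSprites;         // bit 4: sprite rendering enabled
    int emphasisBits;            // bits 5-7: colour emphasis

    /** Recomputes every flag from the raw register value; call before rendering. */
    void updateState(int ppuMask) {
        greyscale          = (ppuMask & 0x01) != 0;
        showBackgroundLeft = (ppuMask & 0x02) != 0;
        showSpritesLeft    = (ppuMask & 0x04) != 0;
        showBackground     = (ppuMask & 0x08) != 0;
        showSprites        = (ppuMask & 0x10) != 0;
        emphasisBits       = (ppuMask >> 5) & 0x07;
    }
}
```

Decoding once per scanline, rather than testing bits inside the pixel loop, keeps the rendering code free of masking operations.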
Figure 27: Changing the values of certain sets of bits alters image output (21)

All the registers below are to be represented as integer variables. The behaviours required for reading and writing to each register are to be handled via individual methods within the PPU class.

3.3.2.1. OAMADDR ($2003)

As this register is simply used to specify the address to be read from or written to in Sprite Memory when using the OAMDATA register, it only requires a simple 'setter' method.

3.3.2.2. OAMDATA ($2004)

Represented by two methods: one for reading and one for writing. Reading will simply return the data present at the given address (OAMADDR) in Sprite Memory. Writing will write the data given to Sprite Memory at the given address (OAMADDR) and then increment OAMADDR.

3.3.2.3. PPUSCROLL ($2005) and PPUADDR ($2006)

These two registers share an address latch (in one of two states). What occurs when writing to the registers depends on the state of this latch. The latch will be represented as a Boolean. A simple If construct will be used to determine what writing to these registers will do. Writing to the registers will set either the top or bottom 8 bits of the register being written to.

3.3.2.4. PPUDATA ($2007)

Reading from and writing to this register will simply retrieve or set the memory location in PPU memory specified by PPUADDR.

All Registers: Appendix F (2.1 – 2.6)

3.3.3. Start-Up

Upon starting up the PPU (and on resets), both the PPUCTRL and PPUMASK registers should be set to 0 (disabling NMIs and rendering).

    PPUCTRL = 0x00; // disables NMIs.
    PPUMASK = 0x00; // disables rendering.

3.3.4. Memory

3.3.4.1. General

The PPU memory will be represented as simple arrays for each part of memory. It does not require the additional structure that is present for CPU memory as all reads and writes are handled through registers. Thus, the registers are able to handle read and write issues before accessing the memory.

3.3.4.2.
Colour Palettes

A near perfect algorithm for determining the colours used in the NES master palette, which also supports alterable hue and tint, was devised by Kevin Horton (31). This mathematical algorithm has been implemented in Java by David de Niese (32) with additional palette manipulation possibilities (colour emphasis and a black and white display, both necessary for a complete emulation). De Niese's code was distributed under the GNU General Public License, allowing the code to be used in other projects. His code will be reused in this project and comments will be put in place to indicate ownership.

3.3.4.2.1. Master Palette

A master palette which is used by the image and sprite palettes will be maintained. The palette will be stored in a separate Palette class, allowing the palette to be passed around the system by reference (allowing easy updating and data consistency). Representing the palette like this will also allow very easy modification of palette Tint and Hue.

3.3.4.2.2. Image and Sprite Palettes

The image and sprite palettes are to be represented using two arrays for each:

1. The first will store the location of the colour in the master palette.
2. The second will store the integer value representing the colour in RGB (using the values in the first array).

When a pixel is to be rendered to the display, the RGB arrays will be updated. This will allow colour information to be maintained efficiently, only requiring updates to the information when it is needed.

Appendix F (2.7)

3.3.4.2.3. Scrolling and background rendering

Scrolling will be implemented using the method set out by Loopy in his document "The skinny on NES Scrolling" (33) detailed in the analysis.

3.3.4.2.4. Sprite Rendering

The following algorithm details a methodology for dealing with the slightly altered sprite rendering routine discussed in the analysis.

1. Using byte 0 of the sprite memory, determine whether the sprite is on the current scanline.
2. If Yes:
   a.
      Copy the next three bytes and use them to render the sprite to the display.
   b. If the number of sprites rendered to the display >= 8, set the sprite overflow flag.
3. If No:
   a. If all sprites have been evaluated, finish.
   b. If some sprites remain, return to 1.

Appendix F (2.8)

3.3.4.2.5. Name Table Mapping

Depending on the value of a bit in one of the PPU status registers, different name table mapping modes may be used (horizontal or vertical). As mentioned above, the state of this bit will be stored in a variable (in this case, a Boolean). Where data is stored during writes to the name table memory depends on the state of this variable. This will be handled via simple nested Ifs.

Appendix F (2.10)

3.4. APU Design

3.4.1. Common Features

Several aspects of the design will be common to most units. These will be discussed here.

3.4.1.1. Clocking

The NES APU uses clocking between the units in order to handle the outputting of sound. It would seem good practice to do the same when emulating the system. All the APU units will include a clock() method which performs the relevant clocking activities for the unit.

3.4.1.2. Resets

It must be possible to reset all units of the APU, in case a ROM is reset or a new ROM is loaded.

3.4.2. Interfaces

Due to the common behaviour of the APU units, it would seem appropriate to define an interface which all units will implement. This is added less for functional advantages than to highlight the common behaviours of the units within the code.

    interface APUUnit {
        public void clock();
        public void reset();
    }

3.4.3. Divider

The divider's purpose is to output a clock every time a counter reaches zero, allowing control of the duration of sounds among other things. It would seem reasonable to have three methods and an integer counter to implement this unit. The three methods would allow:

1. Clocking (decrementing the counter and outputting a clock whenever it reaches zero)
2.
   Forcing a Divider Reload
3. Changing the Divider's period (the number of clocks the counter requires before it reaches zero).

Appendix F (3.1)

3.4.4. Sequencer

The sequencer simply runs through a sequence of values. When the sequencer is clocked, it will move control to the next element of the sequence. Once the end of the sequence is reached, control is returned to the head of the sequence.

This can be implemented straightforwardly using a variable to hold the sequence number and an array structure to hold the sequence. Continuous looping of the sequencer can be handled via modulo on the length of the sequence.

The Sequencer will be implemented in a general way, allowing the class to be extended, substituting in alternative values for the Sequencer's sequence.

Figure 28: Sequencer Example Generalisations

This setup will be used to create the Sequencers for the Triangle and Square waveform channels.

Appendix F (3.2)

3.4.5. Timer

The timer is nothing more than a divider under another name.

3.4.6. Envelope

There exists only one Envelope in the APU, its purpose being to control the volume of the sounds output by the waveform channel. The operation of the Envelope is summarised in the Activity Diagram below. The implementation shall follow this closely.

Figure 29: Operation of the Envelope Unit

3.4.7. Sweep

Only one Sweep unit exists within the APU, being used to adjust the periods of the Pulse channels. This period adjustment occurs when the internal divider outputs a clock (provided certain conditions are met).

The behaviour of the unit is described in the following two diagrams:

Figure 30: Operation of the Sweep Unit (1)

The reload flag is set every time the Sweep unit registers are written to. As a result, writing to these registers has the effect of restarting the unit for another sweep.
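To make the interaction between the Divider of section 3.4.3 and the Sweep unit's reload flag more concrete, a simplified sketch follows. It is not a transcription of Figures 30 and 31: the class names, the write() signature and the channelPeriod field are hypothetical, the muting limits (period below 8 or target above 0x7FF) are taken from commonly available APU documentation, and the slight difference in how the two Pulse channels negate the shifted value is ignored.

```java
/** Divider with the three operations listed in 3.4.3 (hypothetical class). */
final class Divider {
    private int period;
    private int counter;

    Divider(int period) { this.period = period; this.counter = period; }

    /** Returns true (an output clock) when the counter was zero, then reloads it. */
    boolean clock() {
        if (counter == 0) {
            counter = period;
            return true;
        }
        counter--;
        return false;
    }

    void reload() { counter = period; }

    void setPeriod(int period) { this.period = period; }
}

/** Simplified Sweep unit adjusting an 11-bit channel period (hypothetical class). */
final class SweepUnit {
    private final Divider divider = new Divider(0);
    private boolean enabled;
    private boolean negate;
    private boolean reloadFlag;
    private int shift;
    int channelPeriod; // period of the Pulse channel being swept

    /** Models a write to the sweep register: stores settings and requests a reload. */
    void write(boolean enabled, int dividerPeriod, boolean negate, int shift) {
        this.enabled = enabled;
        this.negate = negate;
        this.shift = shift;
        divider.setPeriod(dividerPeriod);
        reloadFlag = true;
    }

    /** Called by the Frame Counter whenever Sweeps are clocked. */
    void clock() {
        boolean tick = divider.clock();
        if (tick && enabled && shift > 0) {
            int delta = channelPeriod >> shift;
            int target = negate ? channelPeriod - delta : channelPeriod + delta;
            // Leave the period alone when the channel would be silenced anyway.
            if (channelPeriod >= 8 && target <= 0x7FF) {
                channelPeriod = target;
            }
        }
        if (reloadFlag) {
            divider.reload();
            reloadFlag = false;
        }
    }
}
```

Repeated calls to clock() with negate cleared lengthen the period (lowering the pitch), while setting negate shortens it (raising the pitch).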
Figure 31: Operation of the Sweep Unit (2)

The above essentially modifies the Pulse Channel's period in such a way as to produce effects such as a gradually increasing pitch of the output, allowing for more interesting sounds than the Pulse would allow alone.

3.4.8. Shift Register with Feedback

Generates the bit sequences output by the Noise channel.

Figure 32: Shift Register with Feedback

Each run through of the above generates one bit of the bit sequence used by the Noise Channel.

Appendix F (3.3)

3.4.9. Frame Counter

This unit will be implemented simply as a switch/case construct. The current step through the appropriate sequence will be kept track of and units will be clocked based on the value of this step counter.

    switch (step) {
        case 1: Clock Envelopes and Linear Counters
        case 2: Clock Envelopes, Linear Counters, Length Counters and Sweeps.
        …
    }

Appendix F (3.4)

3.4.10. Mixer

The mixer is to be implemented simply as two lookup tables (28) (which use the outputs of the channels as the lookup values). This closely approximates the actual working of the mixer. This method was chosen over the slightly more accurate purely mathematical algorithm for efficiency:

Lookup Tables

    pulse_table [n] = 95.52 / (8128.0 / n + 100)    // 31 entry table, used by the Pulse channels
    tnd_table [n] = 163.67 / (24329.0 / n + 100)    // 203 entry table, used by the other channels

Mixer Behaviour

    pulse_out = pulse_table [pulse1 + pulse2]                 // Pulse Channels Output
    tnd_out = tnd_table [3 * triangle + 2 * noise + dmc]      // Other Channels Output
    output = pulse_out + tnd_out                              // Mixer Output

Figure 33: Mixer Formula

3.4.11. Channels

Both Pulse and Triangle waveform channels require sequencers. As noted above, their sequencers will extend a core Sequencer class.
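A core Sequencer class and its two specialisations could be sketched roughly as follows, using the duty patterns and the triangle ramp listed in Figure 34. The class and method names are hypothetical, and the triangle sequence assumes the values run from 15 down to 0 and then from 0 up to 15.

```java
/** General sequencer: steps through a fixed sequence and wraps around. */
class Sequencer {
    private final int[] sequence;
    private int position;

    Sequencer(int[] sequence) { this.sequence = sequence.clone(); }

    /** Moves to the next element, returning to the head via modulo. */
    void clock() { position = (position + 1) % sequence.length; }

    /** The element the channel is currently operating on. */
    int current() { return sequence[position]; }

    void reset() { position = 0; }
}

/** 32-step triangle ramp: 15 down to 0, then 0 up to 15 (assumed ordering). */
final class TriangleSequencer extends Sequencer {
    TriangleSequencer() { super(buildRamp()); }

    private static int[] buildRamp() {
        int[] ramp = new int[32];
        for (int i = 0; i < 16; i++) {
            ramp[i] = 15 - i;   // falling half
            ramp[16 + i] = i;   // rising half
        }
        return ramp;
    }
}

/** 8-step pulse sequencer; the duty value (0-3) selects the pattern. */
final class PulseSequencer extends Sequencer {
    private static final int[][] DUTY = {
        {0, 1, 0, 0, 0, 0, 0, 0},
        {0, 1, 1, 0, 0, 0, 0, 0},
        {0, 1, 1, 1, 1, 0, 0, 0},
        {1, 0, 0, 1, 1, 1, 1, 1},
    };

    PulseSequencer(int duty) { super(DUTY[duty & 0x03]); }
}
```

Because the subclasses only supply data, any behaviour added to Sequencer later (for example a reset on channel restart) is shared by both channels.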
Pulse
    Duty 0: 01000000
    Duty 1: 01100000
    Duty 2: 01111000
    Duty 3: 10011111

Triangle
    15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0,
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

Figure 34: Waveform Sequences

All three waveform channels share a common design, being tweaked in certain ways to provide the different behaviour required.

Each channel contains a Timer. This holds the number of clocks from the Frame Counter required before the sequencer held by the channel is clocked. Thus, whilst the Timer is more than 0, the channel will be performing its operations on the same element of the sequencer's sequence. When the channel is clocked, it will start receiving the next value from its related sequencer. When the Timer reaches zero, it is reset.

Each channel also contains a length counter which dictates how long the channel should produce output. When the counter reaches zero, the channel will no longer output to the Mixer.

The diagram below illustrates the common behaviour of the channels pictorially:

Figure 35: Waveform Channels (General Case)

The ways in which the channels vary will now be discussed.

3.4.11.1. Pulse Channel

This channel's sequencer is more complicated than the others. It contains four sequences. Which is used depends on the Duty value given (0-3). It is assumed that a Duty value has been specified and the channel is using a particular sequence.

An additional condition for output is required in step 4 of the above diagram. The sequence value currently used must not equal 0. If it does, nothing is output to the Mixer. In step 5, the value output to the Mixer (if execution reaches this far) is the current envelope volume.

Figure 36: Pulse Channel

3.4.11.2. Triangle Channel

This channel contains an additional timer: the linear counter. This allows the Triangle channel to output sound data to the Mixer for longer than the other waveforms. Thus, an additional condition must be added to step 4: that the linear counter > 0.
This channel does not contain an envelope. Instead, it outputs the actual sequence value.

Figure 37: Triangle Channel

3.4.11.3. Noise Channel

This channel produces random noise. To achieve this, it contains a shift register which is manipulated in such a way as to produce apparently random activity. This behaves like the sequencer, with bit 0 of the register taken to be the current sequence value.

In step 2, rather than clocking a sequencer, this shift register is clocked. In step 4, the additional condition that bit 0 of the Shift Register must be > 0 is added. When step 5 is reached, it is the current envelope volume which is output.

Figure 38: Noise Channel

Pulse, Triangle and Noise Channels: Appendix F (3.5 – 3.7)

3.4.11.4. DMC Channel

Most of the detail of how the DMC channel behaves internally (see analysis) can be disregarded for implementation. Loosely, the channel performs as follows:

Figure 39: DMC Channel Operation

Every time sample data is returned from the Memory, the Sample Address should be incremented and a "Bytes remaining" variable should be decremented.

3.5. Input/Output Design

3.5.1. Structure

All inputs to the system follow a common pattern. Thus, they can be represented with a common structure. Because of this, using a simple interface would seem to be desirable as the handling code can then be more generalized.

    interface InputDevice {
        int : read();
        void : write (int : data);
    }

3.5.2. Determining Input State

Upon being strobed, the controllers on the real NES store the state of each button in an internal 8-bit shift register (one bit per button state). Each read of the controller returns the lowest bit of this shift register. The register is then shifted to the right by one each time.

It shall be implemented here in a near identical fashion using a simple integer variable as the shift register.

Appendix F (4.1)

3.6. Development Design

3.6.1. Debugger

3.6.1.1.
Register State

This can be handled via the use of simple text fields and labels along with accessor methods in the CPU and PPU classes. It would seem to be a good idea to have an update method which updates the state of all the text fields when called. The method could then be called periodically to keep all fields up to date.

3.6.1.2. Breakpoints

The breakpoint system is to be handled by allowing the CPU to be in one of two modes:

- Normal Mode
- Step Mode

Figure 40: CPU Modes

Modes will be set using a simple Boolean in the CPU. When the CPU is in Normal mode, it will behave as detailed in the CPU sections of this report. When in Step mode, it will execute one instruction and then pause the CPU. The user then chooses to either 'step' through execution or 'continue' execution. Continuing execution will return the CPU to Normal mode. Stepping through execution will leave the CPU in Step mode but will un-pause the CPU. The CPU will execute one instruction and then pause again.

Figure 41: CPU Step Mode Execution

Each breakpoint is to be represented as a Breakpoint object. This object will encapsulate all the information required (register, condition and address). All breakpoints will be stored in the BreakpointHandler (in a suitable Collection). This class will provide the "add", "remove" and "remove all" functionality as well as being used to determine whether execution should break at a given point (by iterating over the Breakpoints).

The Register and Conditional information is to be stored in the form of enumerations to ensure code is as readable as possible and to allow the use of a simple switch/case construct when checking breakpoint state in the BreakpointHandler.

Figure 42: Breakpoint System Design

The above setup should be reasonably extensible, allowing simplified further development.

3.6.1.3. Tools

'Step' and 'Continue' functionality will be simple to implement. It simply requires setting the Step Boolean in the CPU for Step and clearing it for Continue.
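The Breakpoint and BreakpointHandler design from section 3.6.1.2 could be sketched along the following lines. This is an illustrative outline only: the enumeration constants, method names and the use of a Map to pass register values are assumptions made here, not details from Figure 42.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Registers a breakpoint may watch (hypothetical set). */
enum Register { PC, A, X, Y, SP, STATUS }

/** Comparisons offered by the Breakpoints Pane. */
enum Condition { EQUALS, GREATER_THAN, GREATER_OR_EQUAL, LESS_THAN, LESS_OR_EQUAL }

/** Encapsulates the register, condition and value of one breakpoint. */
final class Breakpoint {
    final Register register;
    final Condition condition;
    final int value;

    Breakpoint(Register register, Condition condition, int value) {
        this.register = register;
        this.condition = condition;
        this.value = value;
    }

    /** True when the watched register's current value satisfies the condition. */
    boolean matches(int registerValue) {
        switch (condition) {
            case EQUALS:           return registerValue == value;
            case GREATER_THAN:     return registerValue > value;
            case GREATER_OR_EQUAL: return registerValue >= value;
            case LESS_THAN:        return registerValue < value;
            case LESS_OR_EQUAL:    return registerValue <= value;
            default: throw new IllegalStateException("Unknown condition");
        }
    }
}

/** Stores breakpoints and decides whether execution should break. */
final class BreakpointHandler {
    private final List<Breakpoint> breakpoints = new ArrayList<>();

    void add(Breakpoint breakpoint) { breakpoints.add(breakpoint); }

    boolean remove(Breakpoint breakpoint) { return breakpoints.remove(breakpoint); }

    void removeAll() { breakpoints.clear(); }

    /** Iterates over the breakpoints, checking each against the current registers. */
    boolean shouldBreak(Map<Register, Integer> registers) {
        for (Breakpoint breakpoint : breakpoints) {
            Integer current = registers.get(breakpoint.register);
            if (current != null && breakpoint.matches(current)) {
                return true;
            }
        }
        return false;
    }
}
```

The CPU loop would call shouldBreak() after each instruction and, on a true result, set its Step Boolean and pause.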
The Breakpoint system in place above will take care of the rest.

The 'Seek PC' functionality will be handled by the focus handling methods of the JList object in Java (see GUI Design). Users will be permitted to enter the desired address with or without the '0x' Hex prefix.

3.6.1.4. Code

This will require the implementation of a disassembler. When assembled, each opcode is represented in the ROM by an integer value (as are all the operands). It is possible to determine the opcode used from this value.

The disassembler will make use of many OpCodeInfo objects. One will exist for each opcode of the CPU. These objects are to hold information (pertinent to the disassembly process) about the opcode they represent. They are to be stored in an array in ascending order of opcode number. This will allow for easy retrieval of opcode information for each assembled opcode by using the assembled opcode's numeric value as an index into the array.

This retrieved information can then be used to rebuild the original code (as best as possible). The addressing mode will be used to determine how the data should be interpreted and the reassembled code formatted.

Figure 43: Disassembler Design

3.6.1.5. Memory

This will be simple to implement, requiring accessor methods in the CPU and PPU to retrieve the memory. For efficiency, memory should only be updated in the debugger when it is visible to the user (see GUI Design).

3.6.2. Pattern Table Viewer

3.6.2.1. Tiles Display

The technique that must be employed to form the pattern table tiles is discussed in the analysis. The algorithm to be employed is detailed below:

Figure 44: Pattern Table Bit Combination Algorithm

Essentially, it runs through each pair of bytes in the pattern table memory, combining their bits to make up the pattern table tiles.

3.6.2.2. Palettes Display

These are obtainable from the Palette object.

3.6.2.3.
Tile and Palette Information The tile and palette entry numbers can be derived by performing some simple maths on the images on display (see GUI Design). For example, say that each tile is 8x8 pixels. Divisions by eight can be employed to determine the number of the tile currently hovered over. Whether the display should be refreshed or not can be handled simply by use of a Boolean. The refresh rate can be implemented by requiring that the thread executing the viewer sleep for the determined time before updating the display. 3.6.3. Name Table Viewer Screen refresh controls can be handled in the same way as for the Pattern Table Viewer. 3.6.3.1. Name Table Rendering Rendering should be handled using the following abstracted model: Figure 45: Name Table Rendering Model Once both the tile data and attribute bit for the current name table element is retrieved, it is possible to display the tile on screen in full four bit colour. Rendering the table in two bit colour is a simple case of not applying the attribute bit when displaying on screen. 86 3.6.3.2. Scroll Line Display This can be achieved by obtaining the X Tile, X Fine, Y Tile and Y Fine values from the PPU and simply adding the X‘s together and the Y‘s together: Scroll X = X Tile + X Fine Scroll Y = Y Tile + Y Fine This will display the scroll lines at the exact pixel locations of the scroll. 3.6.3.3. Attribute Table Information Obtainable via access to the PPU. 3.6.3.4. Name Table numeric data display A mock up of this display is available in the GUI design. Again, the name table data is easily accessible from the PPU. The data can be displayed in grid formation by applying some simple maths to Java‘s graphics system. drawLineX draws a solid line from the top to the bottom of the display given an X Coordinate. drawLineY draws a solid line from the left to the right of the display given a Y Co-ordinate. Neither method exists in Java. They are provided here to simplify the code. 
    int tileSize = 24;
    // 32 tiles across: one vertical line at each column boundary.
    for (int i = 0; i <= 32; i++) {
        drawLineX(i * tileSize);
    }
    // 30 tiles down: one horizontal line at each row boundary.
    for (int i = 0; i <= 30; i++) {
        drawLineY(i * tileSize);
    }

The Name Table values can be drawn in the appropriate places using a similar method.

3.7. GUI Design

3.7.1. The Main Window

Figure 46: The main window GUI Designs

The Set Keys menu item and the Sound menu will both bring up the same tabbed window. Depending on the item selected, the appropriate tab will be displayed. The Graphics menu will not initially do anything. Graphics options may be incorporated if time permits.

3.7.2. The Settings Menu

Figure 47: The Key Mapping Tab
Figure 48: The Sound Tab
Figure 49: The Graphics Tab

The settings menu consists of all input and sound options split into appropriate tabs.

3.7.2.1. The Key Mapping Tab
Changing the input device in the drop down box will change the panel below it to display the input options available for that device. Unless time permits, the only device available here will be the standard controller. The implementation of the key mapping will be based on a solution suggested in the book "Developing Games in Java" by David Brackeen (34).

3.7.2.2. The Sound Tab
The sound tab allows the sound volume to be changed, either for all sound channels at once or for each channel individually. It is also possible to mute individual channels or all sound.

3.7.3. The Debugger

Figure 50: The Debugger

The debugger works as described in the analysis and design.

3.7.4. The Name Table Viewer

Figure 51: The Name Table Viewer

Clicking the buttons in the "Name Table View" will result in the appropriate name table data being displayed on screen using the GUI shown below.

Figure 52: The Name Table Number View

3.7.5. The Pattern Table Viewer

Figure 53: The Pattern Table Viewer

Tile and Palette information should be displayed for the tile or palette entry currently hovered over by the mouse.
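The hover lookup used by the Pattern Table Viewer, and the bit combination step it relies on (sections 3.6.2.1 and 3.6.2.3), might look something like the sketch below. It is not taken from the project's source. It assumes the standard NES pattern table layout of 16 bytes per tile (eight bytes for the low bit plane followed by eight for the high bit plane), and the class and method names are hypothetical.

```java
// Illustrative helper; names are assumptions made for this example.
final class PatternTableSketch {
    static final int TILE_PIXELS = 8;      // tiles are 8x8 pixels
    static final int BYTES_PER_TILE = 16;  // 8 low-plane bytes + 8 high-plane bytes

    // Combines each pair of bytes (low plane, high plane) into 2-bit pixel values.
    static int[][] decodeTile(byte[] patternTable, int tileNumber) {
        int base = tileNumber * BYTES_PER_TILE;
        int[][] pixels = new int[TILE_PIXELS][TILE_PIXELS];
        for (int row = 0; row < TILE_PIXELS; row++) {
            int low = patternTable[base + row] & 0xFF;
            int high = patternTable[base + row + TILE_PIXELS] & 0xFF;
            for (int col = 0; col < TILE_PIXELS; col++) {
                int shift = 7 - col; // bit 7 is the leftmost pixel
                pixels[row][col] = ((low >> shift) & 1) | (((high >> shift) & 1) << 1);
            }
        }
        return pixels;
    }

    // Maps a mouse position to a tile number by integer division.
    static int tileAt(int mouseX, int mouseY, int tileSizeOnScreen, int tilesPerRow) {
        return (mouseY / tileSizeOnScreen) * tilesPerRow + (mouseX / tileSizeOnScreen);
    }
}
```

With tiles drawn at their native 8 pixel size and 16 tiles per row, tileAt(17, 9, 8, 16) selects the tile in the second row, third column.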
Note: The Pattern Table Viewer's GUI design is based heavily on the Pattern Table viewer of FCEUXD.

4. Implementation

4.1. Common Tactical Policies

4.1.1. Package Naming
This project will use the package naming conventions documented in the Java Language Specification (Third Edition) (35) in an attempt to eliminate package name conflicts. The package hierarchy used will be as follows:

uk.ac.sussex.drm24

4.1.2. Commenting Policy
In an effort to make the code clear, two types of comments will be used. Comments of the following form will be used to describe each major block within the code. For example, if a class contains several lookup tables, they will be labelled as such above the first table within the code:

    //=====================================================================
    // LOOKUP TABLES
    //=====================================================================

For code within these blocks, normal commenting will be used. For example:

    // This variable represents the Accumulator.

or

    /*
     * This variable represents the Accumulator.
     */

4.1.3. Debugging Conventions
Any software project invariably requires a great deal of debugging. In an effort to allow for the insertion of debugging code without it becoming intrusive or resource hungry, a simple debugging system will be put in place. This will involve the creation of a Debug class containing many Boolean class variables. Each of these variables will correspond to a certain class of debug code. This will allow for large chunks of debug code to be switched on or off with ease. This class will also contain any methods which carry out a function helpful to the debugging process (also class level).

For example:

    class Debug {
        // When true, any general CPU debug code is executed.
        public static final boolean CPU_DEBUG = false;
        // When set, any PPU debugging information will be shown.
        public static final boolean PPU_DEBUG = false;
        // Debugging specific to display rendering.
        public static final boolean PPU_RENDERING_DEBUG = false;

        // Generic debug method given as example.
        public static void debugHelp() {}
    }

When some debug code is added, the appropriate flag must be checked before carrying out the actions contained within. For example, if it is desired to print a message when the PPU starts, this would be achieved as follows:

    if (Debug.PPU_DEBUG) {
        System.out.println("PPU Started");
    }

It should be noted that implementing the debugging in this way will not result in any additional overhead to the execution time. This is because the Java compiler will exclude any debug code whose flag is clear, and will drop the conditional around the debug code should the flag be set (the test itself will not be present in the compiled byte code). It can do this safely due to the "final" status of the flags: it is not possible to reassign their value.

4.1.4. GUI Structure
The project GUIs are to contain a great many listeners (to listen for user interaction with the GUI). While there are several possible ways of writing listeners, the method to be used in this project is to make each listener an inner class of the GUI for which it is listening. For the most part, each component will have its own listener rather than many components being handled within one. These two choices should help to keep the class hierarchy manageable and ensure the code is clear respectively.

4.1.5. JavaDoc Documentation
The important methods and fields in the project will be documented in such a way as to allow automatic JavaDoc documentation generation. The benefits of this will be two-fold. Firstly, the documentation will help during development by allowing just the important information from each class to be viewed, making it quicker and easier to find the desired information. Secondly, the documentation will prove helpful if other developers wish to expand or modify the code at a later date.

4.2. Software Re-use
Small pieces of code have been incorporated into the software from other projects.
In most cases, this was in an effort not to "re-invent the wheel". All code used was released under the GNU General Public License Version 2.

Palette colour generation routine: documented within the PPU design.
Code to carry out low pass filtering and note smoothing on the audio data as it passes through the mixer was taken from David de Niese's NESCafe emulator. (11)
Two lookup tables within the CPU: the first is used to set/clear the values of the zero and negative flags given a value, the second to determine the number of cycles a given instruction takes to execute. (11)
An opcode information table was modified from Darron Schall and Claus Wahlers' CPU emulator. It was used in the Disassembler. (30)

4.3. Threading
The software is multi-threaded. This implementation decision was made mainly out of necessity.

4.3.1. Thread Creation
Because of the way the CPU was designed (executing within a never-ending loop, except when stopped), it was necessary to run it in its own thread so as to allow the rest of the program to continue executing at the same time as the CPU. It was decided that whilst the CPU would be created at start-up, it would only begin instruction execution upon being started in a new thread. This new thread is created upon a ROM being loaded:

Figure 54: Loading a ROM creates a new CPU Thread

When a Java application is loaded, all execution occurs in the "Main" thread. If a GUI is created, an additional "AWT" thread is created which is used to listen for user interaction with the GUI. Thus, when a ROM is loaded, there will be three executing threads.

New threads are also created upon opening the name table and pattern table viewers. This is a necessity here also, as they use a similar means of execution to the CPU.

Figure 55: Opening the Name Table and Pattern Table Viewers (excluding threads managed by Java). All created threads are members of the same Thread Group.

Note that the threads are all assigned to the "NESThreads" thread group upon creation.
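As a rough illustration of this arrangement, the sketch below starts a worker in a thread group named "NESThreads" and stops it cooperatively with a stop flag of the kind described under Thread Destruction. Only the group name comes from the report; StoppableWorker, NesThreadsSketch and their methods are hypothetical names, and the real CPU loop would of course do actual work on each iteration.

```java
// Hypothetical worker standing in for the CPU or a viewer loop.
final class StoppableWorker implements Runnable {
    // volatile so that a write from another thread is seen by the loop.
    private volatile boolean stop = false;

    @Override
    public void run() {
        while (!stop) {
            // One unit of work (e.g. one CPU instruction) would go here.
            Thread.yield();
        }
    }

    void requestStop() {
        stop = true;
    }
}

final class NesThreadsSketch {
    static final ThreadGroup NES_THREADS = new ThreadGroup("NESThreads");

    // Creates the thread as a member of the shared group and starts it.
    static Thread startInGroup(String name, Runnable work) {
        Thread thread = new Thread(NES_THREADS, work, name);
        thread.start();
        return thread;
    }

    // Sets the stop flag, then blocks until the thread has finished.
    static void stopAndWait(StoppableWorker worker, Thread thread) {
        worker.requestStop();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Every thread started through startInGroup belongs to the same NESThreads group.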
This is to simplify the process of destroying the threads later (see below).

4.3.2. Thread Destruction
When a ROM is closed, all members of the NESThreads thread group are destroyed. This is handled by setting a "stop" Boolean which is checked periodically in each thread (the built-in Java methods for destroying threads are inherently unsafe and their use was avoided). It is ensured that all threads are destroyed before execution continues.

Figure 56: All three created Threads are destroyed when a ROM is closed.

4.3.3. Thread Priorities
In order to make the system as efficient as possible, thread priorities change dynamically to best suit the current situation. The pattern table viewer, name table viewer, sprite viewer and CPU are all threaded. When the main GUI window has the focus, all threads except the CPU are given the lowest possible priority so as to allow the emulator to run at the best possible speed. All threaded elements of the system have an accompanying GUI (except the CPU). When one of these GUIs is hidden from view, the appropriate thread is set to minimum priority.

4.4. Outputting Sound (36)
There is a huge discrepancy between the rates at which the NES APU and modern PC hardware output sound samples: the APU outputs approximately 1,789,772 samples a second, as opposed to sample rates of between 11,025 and 192,000 samples a second on a PC. A sampling rate of 44,100 is used in this project. In order to bridge this gap, one sample is output for every 41 APU outputs, made up of the average of those 41 APU outputs. This figure was decided upon via the following calculation:

1,789,772 / 44,100 ≈ 40.6, which rounds to 41

Thus, there are approximately 41 APU samples output per PC sample output.

4.5. Testing
The tests below use the project's test specification. Each test is labelled using the references documented within that specification.

4.5.1. Pre-Written Test Files
These files are ROM images written with the intent of testing specific aspects of the system.
These are documented and referenced within the test specification.

CPU
  CPU Timing                Pass
  Branch Timing             Pass
  CPU Operation             Pass
  CLI Latency               Pass
  CLI and Related           Pass
  Overflow                  Pass
PPU
  PPU General               Partial
  Scanline Rendering        Fail
  Sprite Overflow           Partial
  Sprite Hit                Fail
  PPU Miscellaneous         Partial
APU
  APU Miscellaneous         Partial
  Sound Test                Pass
I/O
  Joypad                    Pass

4.5.2. Project Specific Testing

4.5.2.1. Performance Testing
Partial Success. The frame rate is reasonable, running at between about 47 and 53 frames per second. This does not quite meet the desired run-time speed.

4.5.2.2. Portability Testing

  Windows                   Pass
  Ubuntu                    Pass
  Mac                       Pass

4.5.3. Unit Testing

Overall
  Load                      Pass
  Exit                      Pass
CPU
  Pause                     Pass
  Stop                      Pass
  Reset                     Pass
APU
  Volume Control            Pass
I/O
  Key Mapping and input recognition   Pass
PPU
  Graphical Effects         4 of 6
  Name Table Mappings       Pass
  Pattern Table Use         Pass
Pattern Table Viewer
  Many                      Pass
  Refresh                   Pass
Debugger
  Register State            Pass
  Breaks                    Pass
  Breakpoint Manager        Pass
  Step and Resume           Pass
  Seek PC                   Pass
  Disassembled              Pass
  Memory Display            Pass
Name Table Viewer
  Tiles and Scroll Lines    Pass
  Attribute Information     Pass
  Refresh                   Pass
  Two Bit Colour            Pass
  Numeric Name Tables       Pass

4.5.4. Integration Testing
The need for integration testing is minimal: if all units work correctly independently, they will work correctly when combined. It is of utmost importance that the CPU operates correctly, as all other major units (APU, PPU, I/O) rely heavily upon it.

5. Conclusion

5.1. Finished Software Screenshots

Figure 57: Illustrating the Background and Sprite Rendering
Figure 58: Space Invaders in Action
Figure 59: Examples of software being emulated (TV Mode)
Figure 60: Examples of software being emulated (Non-TV Mode)
Figure 61: Pattern Table and Name Table Viewers in action
Figure 62: Input and Graphic Settings Dialog
Figure 63: Two GUI Modes
Figure 64: Frame per Second Indicator and Sprite Viewer
Figure 65: About (provides cartridge information) and Help menus
Figure 66: Debugger

5.2. Success of the Finished Product

5.2.1. Objectives Completion Summary

CPU Emulation (O 1)               Done
PPU Emulation (O 2)               Partial
  Palette Handling                Done
  Register Handling               Done
  Name Table Handling             Done
  Attribute Table Handling        Done
  Sprite Rendering                Done
  Background Rendering            Partial
  Scrolling                       Partial
  Alterable Palettes (O 2.1)      Done
APU Emulation (O 3)               Done
Control Pad Emulation (O 4)       Done
Development Environment (O 5)     Done
GUI Implementation (O 6)          Done

5.2.2. Project Evaluation

5.2.2.1. Objective Completion
All requirements of the CPU, APU, control pad emulation, development environment and GUI are completed. Most capabilities of the PPU are complete but the unit lacks:

5.2.2.1.1. Scanline Based Background Rendering
A routine which paints the tiles of the name tables to the display has been provided in lieu of the scanline based renderer, and for some ROMs this works without fault.

5.2.2.1.2. Scrolling
The scrolling of the playfield has been partially implemented. Whilst the scrolling is not visible during the rendering of the display, the logic which lies behind the scrolling is fully implemented.

Other minor issues also exist within the PPU which occasionally result in incorrect rendering. These issues are predominantly a result of unimplemented corner cases of PPU behaviour.

Obviously, these issues restrict the use of the emulator. However, I believe these limitations could be overcome if more time were permitted for development.
5.2.2.2. Comparison Against Existing Software

Emulator              Graphical Support   Audio Support   Usability           Development Aids
Jamicom (Java)        Limited             None            Command Line only   None
Animosity (Java)      Very Little         None            N/A                 None
NESCafe (Java)        Full                Full            Reasonable          A Rudimentary Debugger
Nestopia (Non-Java)   Full                Full            Reasonable          None
FCEUXD (Non-Java)     Full                Full            Good                Extensive

In a comparison against existing software on the market, I feel that my emulator fares well. The current Java emulators available lack any real development aids whereas this project offers a number. Also, although it is not as feature rich as FCEUXD, it has the benefit of being cross-platform (FCEUXD executes on Windows machines only). Admittedly, the majority of the emulators above offer superior graphical support but in my opinion are largely inferior in regard to usability. I feel that the developed software certainly has its place within the market, offering good emulation and development capabilities and availability on all major operating systems.

5.2.2.2. Design Issues
I feel that the design is flawed in several areas. Firstly, I made the false assumption at an early stage that the three major units (CPU, PPU, and APU) were largely independent with minimal interaction between them. Although the units are largely self-contained, the interactions are significant enough that they should have been considered in greater depth at the design stage so as to allow these interactions to be defined more cleanly.

Secondly, I feel that although the CPU memory structure documented provides some advantages, its complexity makes its use undesirable. This is discussed further in Alternative Methodologies below.

The original design choice of emulating the APU's Mixer unit via the use of two lookup tables was abandoned after implementation because their use produced an inferior sound quality. Instead, the values output from the channels were simply fed directly to the mixer.
Although less efficient, the sound quality tends to be significantly higher.

5.2.2.3. Implementation Issues
The system suffers from issues of efficiency. This lack of efficiency stems largely from the background rendering implementation in place. Whilst background rendering is enabled, the sound will occasionally be output from the system at an incorrect rate. This issue would almost certainly disappear if a correct scanline based background renderer was put in place.

One aspect of the implementation which would benefit from improvement is the method used to determine four bit colour for the name tables. The approach is long-winded and inefficient, and should ideally be re-written to make it more compact.

5.2.3. Extensions Achieved
All the cartridge details have been made accessible to the user via the 'About' option on the GUI. These are grouped into three sections: details about the file and its condition, details about the memory held on the cartridge, and all other details such as the mirroring mode and region (E 3.3).

A panel was added to the debugger to display the state of execution visually to the user. The CPU state (whether it is executing or not) is represented by either a green or red circle, and an exclamation mark with explanatory text is used to alert the user when a breakpoint is hit (and which one) (E 3.4).

The ability to highlight specified text within the code panel (which displays the disassembled program) has been added. This allows the user to select text to highlight from several predetermined options or to specify the text themselves. It is possible to add and remove any number of strings to match at a time (E 3.5).

A sprite viewer was added which allows all the sprites currently in memory to be viewed on screen at once (E 7).

A Frames per Second counter was added to allow the user to measure the performance of their software (E 8).

5.2.4. Conclusion
In my opinion, the software fully satisfies the requirement of an integrated development environment, meeting the needs of the target user group.

Because of the limitations in the PPU, the needs of the second user group, those who intend to use the software to play games, have not been fully met. However, the system is fully usable for many titles. I believe the only major limitation of the system to be the lack of an efficient background rendering routine.

Whilst it is unfortunate that the PPU remains incomplete, I feel confident that its issues could and will be remedied with additional time spent.

In conclusion, I believe that this project has been largely a success. It has resulted in a piece of software that I view as useful to both target user groups and has been a very rewarding learning experience.

5.3. Future Extensions

5.3.1. Additional Debugger Functionality
I feel that if the system was to be extended, most effort should be concentrated on the development of additional debugger capabilities. For example, in the current debugger, the APU is largely excluded. Much useful APU information could be added to the application; for instance, the values of length counters, duty cycles etc. could be displayed. This would help to make the process more user friendly.

A trace logger to log the execution of the CPU could also be invaluable to users attempting to debug their code. On the simplest level, a trace logger could be implemented by writing register and instruction values to a buffer and then outputting them to a file once the logger is stopped.

5.3.2. Sound Quality
Whilst the sound output by the system is of a reasonable quality, it could be much improved through the use of more complicated sampling techniques. One such technique which would result in greatly improved sound is band limited synthesis. This technique is too complicated to discuss here but is outlined at (37) and would be a worthwhile extension to the system.

5.3.3. Save State Functionality
This is a feature which could significantly improve the user experience. State saving could be employed in several areas of the system:

Saving user preferences (palette alterations, volume levels, key configurations etc.),
Saving execution state, allowing the user to return to the running title later,
Saving debugger state (breakpoints, text highlights, PPU manipulations).

Java provides several methods for implementing state save functionality. Three possibilities follow:

1. The simplest method conceptually is to write the values to be saved to a text file, with names by which to identify them. A parser would then need to be written to retrieve the values.
2. A second approach is to use the Serialisation capabilities of Java, which allow objects to be flattened, written to disk and then reconstructed.
3. Finally, the Properties object (in the java.util package) could be used. This object has been written specifically for the purpose of saving state to a text file.

The third option is perhaps the easiest, but storing the data in an object may be the more flexible option as all the capabilities of Java are then available for storage and retrieval.

5.4. Alternative Methodologies
In retrospect, I believe that the system design could be improved upon in several areas.

5.4.1. CPU Memory Structure

Figure 67: The CPU Memory Structure Used

The initial intention behind this structure was to provide a clear and simple way of modelling the various types of memory within the CPU. For example, if a type of memory was read only, it was possible to emulate this by simply overriding the 'writeToMemory' method with an empty method. Although this system has its advantages, it was too complex overall and required a fair amount of "housekeeping" to keep track of the objects to be used for particular reads and writes. Another limitation is the overhead produced.
This overhead includes the method invocations required when reading or writing to memory and the overhead produced by the abovementioned "housekeeping" methods.

If I was to redesign the system, I would instead use a single array to represent the memory. What it would lose in code conciseness, it would more than make up for in efficiency and ease of access. This would also be a much more flexible setup, allowing for a dynamic memory system by the simple inclusion of a number of pointers into the array (making the emulation of memory mapper hardware significantly simpler).

5.4.2. Palette Implementation
The NES colour palette was implemented using a mathematical algorithm devised by Kevin Horton and implemented by David de Niese. Whilst this provides the advantage of very accurate colours and a simple means of changing hue and tint, it is inflexible in that providing users with the ability to change the colour palettes to their liking would be difficult. Thus, I have concluded that the preferable approach would be to implement the palette entries using simple RGB values. This would likely result in a less accurate colour scheme but would make user manipulation much simpler to support.

6. Works Cited
1. Burdett, A, et al. A Glossary of Computing Terms. 9th Edition. s.l.: Longman, 1998, pp. 30-31.
2. Diskin, Patrick. Nintendo Entertainment System Documentation. NesDev. [Online] [Cited: 12 October 2007.] http://nesdev.parodius.com/NESDoc.pdf.
3. Various. Turing Complete. Wikipedia. [Online] [Cited: 23 February 2008.] http://nostalgia.wikipedia.org/wiki/Turing-complete.
4. Wilen, Toni. WinUAE. WinUAE. [Online] [Cited: 16 October 2007.] http://www.winuae.net.
5. Sundell, Per Hakan. CCS64 - A Commodore 64 Emulator - By Per Håkan Sundell. Computerbrains. [Online] [Cited: 16 October 2007.] http://www.computerbrains.com/ccs64.
6. Rechlin, Eric. HP Calculator Emulators for the PC. hpcalc.org. [Online] [Cited: 16 October 2007.]
http://www.hpcalc.org/hp49/pc/emulators.
7. Smith, Tony. Bleem beats Sony. The Register. [Online] [Cited: 15 October 2007.] http://www.theregister.co.uk/1999/04/12/bleem_beats_sony.
8. Smith, Tony. Playstation emulator wins first round against Sony. The Register. [Online] [Cited: 15 October 2007.] http://www.theregister.co.uk/1999/02/05/playstation_emulator_wins_first_round.
9. Ninn. Emulators. Patent Pending. [Online] [Cited: 12 November 2007.] http://patpend.net/emulators/NES/OS/.
10. Ani. Java NES Emulators. Zophar's Domain. [Online] [Cited: 12 November 2007.] http://www.zophar.net/java/nes.html.
11. De Niese, David. Download NESCafe. NESCafe Web. [Online] [Cited: 12 November 2007.] http://www.nescafeweb.com/main.download.php.
12. Freij, Martin. Nestopia Index. Nestopia. [Online] [Cited: 12 November 2007.] http://nestopia.sourceforge.net/.
13. Porst, Sebastian. Release of FCEUXD SP 1.07. Programming Stuff. [Online] [Cited: 12 November 2007.] http://www.the-interweb.com/serendipity/.
14. Vardy, Adam. Extra Instructions of the 65XX Series CPU. FC64 Wiki. [Online] [Cited: 12 October 2007.] https://mirror1.cvsdude.com/trac/osflash/fc64/wiki/6502Extras.
15. Berthouze, Luc. Software Engineering: Product and Processes. Software Engineering. [Online] [Cited: 25 March 2007.] http://www.sussex.ac.uk/informatics/syllabus/current/15626.html.
16. University of Illinois. The Nintendo Entertainment System. UIUC Computer Science Department. [Online] [Cited: 12 October 2007.] http://www.cs.uiuc.edu/homes/luddy/PROCESSORS/Nintendo.pdf.
17. Zaks, Rodney. Programming the 6502. Berkeley CA: Sybex Inc, 1983.
18. Burnett, Colin. ALU Symbol. Arithmetic Logic Unit. [Online] [Cited: 25 October 2007.] http://en.wikipedia.org/wiki/Image:ALU_symbol.svg.
19. Bluechip. Rockwell 6502 Programmers Reference. Cyborg Systems. [Online] [Cited: 16 October 2007.] http://homepage.ntlworld.com/cyborgsystems/CS_Main/6502/6502.htm.
20. Rost, Bob. Nintendo.
Game Development for the 8-bit NES - A class by Bob Rost. [Online] [Cited: 28 October 2007.] http://bobrost.com/nes/lectures/NES_January_21.pdf.
21. Green, Shay and Disch. NES PPU. NES Dev Knowledge Base. [Online] [Cited: 26 November 2007.] http://nesdevwiki.org/wiki/index.php/NES_PPU.
22. Fry, Ben. deconstructulator. ben fry. [Online] [Cited: 30 October 2007.] http://acg.media.mit.edu/people/fry/deconstructulator/.
23. Moby Games. Super Mario Bros. 2. MobyGames. [Online] [Cited: 13 November 2007.] http://www.mobygames.com/game/super-mario-bros-2.
24. Fayzullin, Marat. Nintendo Entertainment System Architecture. Computer Emulation Resources. [Online] [Cited: 30 October 2007.] http://fms.komkon.org/EMUL8/NES.html.
25. Green, Shay. NES APU Sound Hardware Reference. Blargg's Home. [Online] [Cited: 19 October 2007.] http://www.slack.net/~ant/nes-emu/apu_ref.txt.
26. Taylor, Brad. 2A03 Sound Channel Hardware Documentation. Game Development for the 8-bit NES - A class by Bob Rost. [Online] [Cited: 19 October 2007.] http://bobrost.com/nes/files/nessound.txt.
27. Kemp, Kevin. Mixer Image. Introduction to Digital Home Recording. [Online] [Cited: 1 November 2007.] http://www.kevinkemp.com/homerecordingtutorial/images/softwaremixer.JPG.
28. Green, Shay and Disch. APU Mixer. NES Dev Knowledge Base. [Online] [Cited: 27 November 2007.] http://nesdevwiki.org/wiki/index.php/APU_Mixer.
29. Fayzullin, Marat. How To Write a Computer Emulator. Computer Emulation Resources. [Online] [Cited: 14 November 2007.] http://fms.komkon.org/EMUL8/HOWTO.html.
30. Schall, Darron and Wahlers, Claus. CPU Core. fc64 C64 Emulator Source Code. [Online] [Cited: 10 November 2007.] http://svn1.cvsdude.com/osflash/fc64/trunk/projects/fc64/core/cpu/CPU6502.as.
31. Horton, Kevin. NES Palette Generator. Bluetech. [Online] [Cited: 14 November 2007.] http://nesdev.parodius.com/kevin_palette.txt.
32. De Niese, David. NESCafe Nintendo Emulator for Java. The David de Niese Homepage.
[Online] [Cited: 14 November 2007.] http://www.daviddn.com/nescafe/index.asp.
33. Loopy. The Skinny on Scrolling. NES DEV. [Online] [Cited: 27 November 2007.] http://nesdev.parodius.com/loopyppu.zip.
34. Brackeen, David. Developing Games in Java. s.l.: New Riders Games, 2003.
35. Sun Microsystems. Packages. The Java Language Specification Third Edition. [Online] [Cited: 26 September 2007.] http://java.sun.com/docs/books/jls/third_edition/html/packages.html#7.7.
36. Disch. APU Sound Frequencies. NesDev. [Online] [Cited: 15 February 2008.] http://nesdev.parodius.com/bbs/viewtopic.php?t=4011.
37. Blargg. Band-Limited Sound Synthesis. Blargg's Home. [Online] [Cited: 17 February 2008.] http://slack.net/~ant/bl-synth/.
38. Disch. NES Memory Mapping Version 1.0. Romhacking dot net. [Online] [Cited: 12 October 2007.] http://www.romhacking.net/docs/353/.
39. Firebug. Comprehensive NES Mapper Document v0.80. TuxNES. [Online] [Cited: 12 October 2007.] http://tuxnes.sourceforge.net/mappers-0.80.txt.
40. Hunsinger, Ed. How does the Nintendo Light Gun work? Geeked.info. [Online] [Cited: 12 October 2007.] http://www.geeked.info/how-does-the-nintendo-light-gun-work/.
41. Gamers Graveyard. NES Four Score. Gamers Graveyard. [Online] [Cited: 12 October 2007.] http://www.gamersgraveyard.com/repository/nes/peripherals/fourscore.html.
42. Gamers Graveyard. Power Pad/Family Fun and Fitness/Family Trainer. Gamers Graveyard. [Online] [Cited: 12 October 2007.] http://www.gamersgraveyard.com/repository/nes/peripherals/powerpad.html.
43. The Mighty Mike Master. NES Game Genie Technical Notes. TuxNES. [Online] [Cited: 12 October 2007.] http://tuxnes.sourceforge.net/gamegenie.html.
44. Nick M. Introducing the Miracle System. The Warp Zone. [Online] [Cited: 12 October 2007.] http://thewarpzone.classicgaming.gamespy.com/piano.html.
45. Green, Shay. NES Tests/CPU. Blargg's Home. [Online] [Cited: 12 November 2007.] http://www.slack.net/~ant/nes-tests/.
46. Green, Shay. NES Tests/Branch. Blargg's Home.
[Online] [Cited: 12 November 2007.] http://www.slack.net/~ant/nes-tests/branch_timing_tests.zip.
47. Horton, Kevin. NES. BlueTech. [Online] [Cited: 12 November 2007.] http://tripoint.org/kevtris.
48. Green, Shay. NES Tests/CLI. Blargg's Home. [Online] [Cited: 12 November 2007.] http://www.slack.net/~ant/nes-emu/cli_latency_tests.zip.
49. Bridgewater, Alastair. NES Programs. Nes Dev. [Online] [Cited: 27 November 2007.] http://nesdev.parodius.com/overtest.zip.
50. Green, Shay. NES Tests/PPU. Blargg's Home. [Online] [Cited: 12 November 2007.] http://www.slack.net/~ant/nes-tests/blargg_ppu_tests.zip.
51. Green, Shay. NES Tests/PPU Overflow. Blargg's Home. [Online] [Cited: 12 November 2007.] http://www.slack.net/~ant/nes-emu/sprite_overflow_tests.zip.
52. Green, Shay. NES Tests/PPU Hit. Blargg's Home. [Online] [Cited: 12 November 2007.] http://www.slack.net/~ant/nes-tests/sprite_hit_timing.zip/http://www.slack.net/~ant/nestests/sprite_hit_tests.zip.
53. Green, Shay. NES Tests/PPU More. Blargg's Home. [Online] [Cited: 13 November 2007.] http://www.slack.net/~ant/nes-emu/vbl_nmi_timing.zip.
54. Green, Shay. NES Tests/APU 2005. Blargg's Home. [Online] [Cited: 12 November 2007.] http://www.slack.net/~ant/nes-tests/blargg_apu_2005.07.30.
55. Fry, Ben. Programs. Moogle Charm. [Online] [Cited: 21 November 2007.] www.morganleahrecords.com/mooglecharm/programs.html.
56. Gough, Paul. Computer Systems Architecture Notes. G6015 Computer Architectures. [Online] [Cited: 30 October 2007.] http://www.informatics.sussex.ac.uk/users/michaelg/computerarchitectures/course_notes.pdf.
57. fluBBa. FluBBas TechDocs. GBARetro.com. [Online] [Cited: 12 November 2007.] http://www.ndsretro.com/download/NEStress.zip.
58. Green, Shay. NES Tests/CLI More. Blargg's Home. [Online] [Cited: 13 November 2007.] http://www.slack.net/~ant/nes-tests/cli_latency_tests.zip.
59. Firebug. NES ROMS - Starting with j. Rom Hustler. [Online] [Cited: 31 November 2007.]
60. Oorni, Lasse. Roms NES. Consolemul. [Online] [Cited: 21 November 2007.]
http://roms.consolemul.com/index.php?machine=20. 120 7. Appendices Appendix Appendix Appendix Appendix Appendix Appendix Appendix Appendix Appendix Appendix A: B: C: D: E: F: G: H: I: J: Cartridge Specification File Format Specification Regional Differences Specification Input Devices and Other Peripherals Background Rendering in Detail Low Level Designs Test Specification Project Logs GNU GENERAL PUBLIC LICENSE Version 2 Source Code 121 Appendix A: Cartridge Specification 122 General All software for the NES came encased in a plastic cartridge external to the system (in the form of ROM) which was executed by slotting the cartridge into a 72 pin connector on the NES and turning on the power. Figure 68: A NES game cartridge. Used by slotting into the cartridge slot in the NES hardware (adapted from (2)) Basic cartridges contain two types of ROM, CHR-ROM and PRG-ROM. The CHR-ROM contains the pattern tables of the game whilst the PRG-ROM contained the actual program code. Cartridges contained either 16 or 32KB of PRG-ROM depending on the size of the program. Figure 69: The annotated insides of a standard NES cartridge (un-annotated image from (2)) 123 Additional Hardware It is possible for cartridges to contain additional hardware (and very often did) which provide additional functionality. These enhancements will be briefly summarised below: WRAM WRAM allowed for information to be saved which would allow the user to return to a previous state in the execution of the program. For example, returning to the beginning of a level in a game when the player ―dies‖ with the same statistics they had when they first entered the level. This RAM may retain data even when the console is switched off via the use of a small battery maintaining the current through the memory. Memory Management Chips (MMC‘s) (2) (38) (39) Memory management chips (also commonly known as memory mappers) were used to counteract the limitations in the NES hardware. 
They allowed the use of a larger number of both PRG-ROM and CHR-ROM banks, thus allowing larger programs with superior graphics. This was achieved by the executing program indicating the need for data from a ROM bank not currently loaded into memory. The MMC would then swap the required data into a defined page of memory for use by the program. Some mappers also provided additional functionality such as the ability to trigger IRQs and enhanced graphical manipulations (such as the ability to scroll only certain areas of the screen).

Appendix B: File Format Specification

In order to parse the ROM files that make up the NES software, a file format has to be decided upon. There are two main formats in use by NES emulators today: the iNes format and UNIF. The iNes format was the first to be proposed but has been criticised for being ambiguous and for storing little data about the software title, making correct emulation more difficult. UNIF attempts to fix these issues and more. Whilst UNIF would seem the natural choice, practically all ROM files use the iNes format. For this reason, iNes has been chosen to provide maximum compatibility.

iNes (24)

The iNes format consists of a 16 byte header situated at the beginning of ROM files. It identifies which MMC is used (if any) via an 8-bit number. The numbers for each mapper were decided upon by Marat Fayzullin (the creator of the iNes format). After the header, the ROM banks should be stored in the file in ascending order. If a trainer exists, its 512 bytes will precede the ROM banks in the file. The format makes reference to both trainers and the VS System. These are briefly discussed here:

Trainers

Trainers are 512 bytes of code which were used to allow the copying of cartridges. They allowed the bypassing of the normal MMC used by the cartridge, instead using the MMC defined by the copier.
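The iNes header described in this appendix (a 16 byte header, an 8-bit mapper number split across bytes 6 and 7, and an optional 512 byte trainer stored before the ROM banks) could be read along the following lines. This is only a minimal sketch: the class name INesHeader, its field names and the prgRomOffset() helper are hypothetical and are not taken from the emulator's source code.

```java
/** Sketch of an iNes header reader. All names here are illustrative assumptions. */
final class INesHeader {
    static final int HEADER_SIZE = 16;   // bytes in the header
    static final int TRAINER_SIZE = 512; // bytes in an optional trainer

    final int prgBanks;              // byte 4: number of 16KB PRG-ROM banks
    final int chrBanks;              // byte 5: number of 8KB CHR-ROM banks
    final boolean verticalMirroring; // byte 6, bit 0
    final boolean batteryRam;        // byte 6, bit 1
    final boolean trainerPresent;    // byte 6, bit 2
    final boolean fourScreen;        // byte 6, bit 3
    final int mapper;                // 8-bit mapper number
    final boolean pal;               // byte 9, bit 0

    private INesHeader(byte[] h) {
        prgBanks = h[4] & 0xFF;
        chrBanks = h[5] & 0xFF;
        verticalMirroring = (h[6] & 0x01) != 0;
        batteryRam = (h[6] & 0x02) != 0;
        trainerPresent = (h[6] & 0x04) != 0;
        fourScreen = (h[6] & 0x08) != 0;
        // Low nibble comes from bits 4-7 of byte 6, high nibble from bits 4-7 of byte 7.
        mapper = (h[7] & 0xF0) | ((h[6] & 0xF0) >> 4);
        pal = (h[9] & 0x01) != 0;
    }

    /** Checks the 'NES' + $1A signature and decodes the header fields. */
    static INesHeader parse(byte[] file) {
        if (file.length < HEADER_SIZE
                || file[0] != 0x4E || file[1] != 0x45 || file[2] != 0x53 || file[3] != 0x1A) {
            throw new IllegalArgumentException("Not an iNes file");
        }
        return new INesHeader(file);
    }

    /** Offset of the first PRG-ROM bank: after the header and any trainer. */
    int prgRomOffset() {
        return HEADER_SIZE + (trainerPresent ? TRAINER_SIZE : 0);
    }

    public static void main(String[] args) {
        byte[] rom = new byte[HEADER_SIZE];
        rom[0] = 0x4E; rom[1] = 0x45; rom[2] = 0x53; rom[3] = 0x1A;
        rom[4] = 2;    // two PRG-ROM banks
        rom[5] = 1;    // one CHR-ROM bank
        rom[6] = 0x15; // mapper low nibble 1, trainer present, vertical mirroring
        rom[7] = 0x40; // mapper high nibble 4
        INesHeader header = parse(rom);
        System.out.println("mapper=" + header.mapper + " prgOffset=" + header.prgRomOffset());
    }
}
```

Masking each byte with 0xFF (or a single-bit mask) before use avoids sign extension, since Java bytes are signed. A real loader would also need to decide how to treat files whose reserved bytes are not zero, which this sketch ignores.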
The use of trainers will not be emulated as cartridge copiers no longer need to resort to using them. Thus, the only possible purpose for emulating this functionality would be to allow the proper handling of illegally obtained ROM files.

VS System

The VS System series is a collection of arcade games designed for competitive play between two people, which were based on many NES titles. This functionality will not be implemented as it has very little scope for use and would leave less time available for more useful implementations.

The iNes format is summarised below:

Byte    Contents
0-3     The string 'NES' followed by the MS-DOS end-of-file character (in hex: $4E $45 $53 $1A)
4       The number of 16KB PRG-ROM banks (program code)
5       The number of 8KB CHR-ROM banks (pattern tables)
6       Bit 0: Indicates the name table mirroring scheme used
            0 - Horizontal mirroring
            1 - Vertical mirroring
        Bit 1: Presence of battery backed RAM
            0 - No battery backed RAM at $6000-$7FFF
            1 - Battery backed RAM at $6000-$7FFF
        Bit 2: Presence of a trainer (see Trainers above)
            0 - No 512 byte trainer present at $7000-$71FF
            1 - A 512 byte trainer present at $7000-$71FF
        Bit 3: Presence of four screen name table mirroring
            0 - The name table mirroring scheme indicated in bit 0 is used
            1 - Four screen name table mirroring is used
        Bits 4-7: Four lower bits of the ROM Mapper type
7       Bit 0: VS System cartridge (see VS System above)
            0 - The cartridge is not of the VS-System type
            1 - The cartridge is of the VS-System type
        Bits 1-3: Reserved for later use. They should all be set to 0.
        Bits 4-7: Four higher bits of the ROM Mapper type
8       The number of 8KB RAM banks present (when zero, this should be read to mean that one bank exists. This is for compatibility reasons).
9       Bit 0: Indicates the region of the original cartridge
            0 - NTSC
            1 - PAL
        Bits 1-7: Reserved for later use. They should all be set to 0.
10-15   These bytes are reserved for later use and should all be set to 0.

Table 5: A summarisation of the iNes header format (24)

Appendix C: Regional Differences Specification

It is important to note that all three of the main units to be implemented vary in minor ways depending on the region that the NES is manufactured for (NTSC or PAL). These differences are noted below (2) (25):

                            NTSC                 PAL
CPU Clock Speed             1.79 MHz             1.66 MHz
PPU Clock Speed             21.477272 MHz / 4    26.601712 MHz / 5
Frames Per Second           60                   50
Visible Screen Resolution   256x224              256x240

Table 6: Differences between the NTSC and PAL NES

Additionally, the sound frequencies emitted by the noise and DMC sound channels tend to be higher for the NTSC APU.

I intend to write the emulator to conform to the NTSC NES specification as this will allow a wider array of software to be used on the system than if PAL were chosen. PAL ROMs will still be usable but will execute in a way not intended by the developers (due to the discrepancies in machine specification).

Appendix D: Input Devices and Other Peripherals

Input Devices

E 4.1. The NES Zapper. This is a peripheral shaped like a gun and used as such. The user points it at the screen and presses the "fire" button, with the aim of "shooting" targets on screen. (40)

E 4.2. The NES Four Score. This peripheral allows up to four people to play the same game simultaneously. This is achieved via inputs on the device allowing four control pads to be connected. (41)

E 4.3. The Power Pad. This peripheral consists of a mat with inputs which the users are supposed to press with their feet. It was designed as a way of getting fit whilst playing games. (42)

Other Peripherals

E 4.4. The NES Game Genie. This peripheral allows the user to alter the way that the software used in the NES is executed via codes input into the Game Genie. The NES cartridge is inserted into the Game Genie which is then, in turn, inserted into the NES cartridge slot.
The codes input by the user translate into addresses and data in a game's program space which the Game Genie tricks the CPU into using instead of the data that should be there. (43)

E 4.5. The Miracle Piano Teaching System. This peripheral can be used to learn basic skills on the piano. A keyboard with pedals is provided, as is software for teaching. The software's AI alters lessons based on how the user plays the keyboard. (44)

Appendix E: Background Rendering in Detail

Display Rendering

The rendering process relies on the values in three internal registers:

    PPUADDR
    XFine
    loopyT (named after the person who first identified the scrolling behaviour)

PPUADDR is used and manipulated throughout the rendering process in order to render the image in the name tables to the display. Thus, it should not be altered by the programmer during rendering so as to avoid rendering issues. It should, however, be possible to write to PPUADDR during rendering without altering the rendering behaviour. loopyT is used for this purpose and is written to via writes to PPUCTRL and PPUSCROLL. PPUADDR is updated with the value in loopyT once every frame. This allows the updating of PPUADDR without interfering with the rendering.

The bits of loopyT and XFine are interpreted in such a way as to always point to a particular pixel within a particular name table. The meaning behind the bits in these registers is illustrated in the diagram below:

Figure 70: Display Rendering Register Interpretations

X Tile Position - The first tile from the X axis of the current name table which should be rendered for the current scanline.
Y Tile Position - The first tile from the Y axis of the current name table which should be rendered for the current scanline.
X Fine Position - The first column of the selected tile which should be rendered for the current scanline.
Y Fine Position - The first row of the selected tile which should be rendered for the current scanline.
Name Table Base Address - Dictates the name table that rendering should begin at.
Name Table Address (in red) - Contains the actual address of the next name table element to be rendered.

The writes to PPUADDR are carried out by the user. They are included in the diagram to show that loopyT is copied into PPUADDR upon the second write, and that bit 15 is always set to zero. This is to prevent the user from attempting to reference memory locations not present (14 bits gives a maximum address of 0x3FFF, the highest referenceable memory location in the PPU).

The use of X tile, Y tile, X fine and Y fine is illustrated below:

Figure 71: Applying the Register Interpretations to the Name Tables

Rendering Behaviour

After each tile of a scanline is rendered to the display, the value of X Tile is incremented, allowing the PPU to render the next tile on the name table's X axis. X Tile wraps from 31 to 0 (the end of the name table has been reached). This will result in bit 11 of loopyT being inverted, switching the horizontal name table which will be used for rendering from now on.

After every complete scanline has been rendered to the display, Y Tile is incremented, allowing rendering from the next row of tiles in the name table. Y Tile wraps from 29 to 0 (the end of the name table has been reached). This will result in bit 12 of loopyT being inverted, switching the vertical name table which will be used for rendering from now on.

Figure 72: Name Table Switching

As can be seen from the above diagram, this bit inverting process ensures that the PPU will never "run out" of name table to render, simply alternating between tables each time the current name table comes to an end.

Ordinarily, the X Fine value is not changed during the rendering of a frame. This is so that each scanline begins rendering at the same point, maintaining the image stored in the name table during scrolling.
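The tile wrapping described under Rendering Behaviour above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the emulator's actual code: it assumes the bit layout commonly documented for this register (X Tile in the lowest five bits, Y Tile in the next five, and the horizontal and vertical name table select bits directly above them, which corresponds to the report's bits 11 and 12 when counting from one). The class, constant and method names are hypothetical.

```java
/** Sketch of tile advancement with name table switching. Names and masks are assumptions. */
final class ScrollAddress {
    static final int X_TILE_MASK = 0x001F; // five bits of X Tile
    static final int Y_TILE_MASK = 0x03E0; // five bits of Y Tile
    static final int H_TABLE_BIT = 0x0400; // horizontal name table select
    static final int V_TABLE_BIT = 0x0800; // vertical name table select

    /** Advance to the next tile on the X axis, wrapping from 31 to 0. */
    static int nextTile(int address) {
        if ((address & X_TILE_MASK) == 31) {
            address &= ~X_TILE_MASK; // X Tile back to 0
            address ^= H_TABLE_BIT;  // switch horizontal name table
        } else {
            address++;
        }
        return address;
    }

    /** Advance to the next row of tiles, wrapping from 29 to 0. */
    static int nextTileRow(int address) {
        int yTile = (address & Y_TILE_MASK) >> 5;
        if (yTile == 29) {
            yTile = 0;
            address ^= V_TABLE_BIT;  // switch vertical name table
        } else {
            // Rows 30 and 31 lie outside the visible table; in this sketch
            // they simply wrap within five bits without switching tables.
            yTile = (yTile + 1) & 31;
        }
        return (address & ~Y_TILE_MASK) | (yTile << 5);
    }

    public static void main(String[] args) {
        int address = 0x001F; // X Tile = 31 in the first name table
        System.out.println(Integer.toHexString(nextTile(address)));
    }
}
```

Toggling the select bit with an exclusive OR, rather than setting it, is what lets rendering alternate between the two tables indefinitely each time a boundary is crossed.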
However, some programs manipulate this value to achieve various effects (such as split screen).

At the beginning of each scanline, certain bits of loopyT are copied into PPUADDR to achieve two purposes:

    It "resets" the X Tile value. This will ordinarily copy across the X Tile value present before the scanline began (resulting in the next scanline beginning its horizontal rendering at the same point as the one before it).
    It sets bit 11 of PPUADDR to bit 11 of loopyT. This will return to the horizontal name table being used before the scanline began in case a horizontal name table boundary was crossed during the previous scanline (resulting in the alternate horizontal table being set for use).

Appendix F: Low Level Designs

1. CPU

1.1 General CPU Design

    Field : stopCPU : boolean
    Field : pauseCPU : boolean
    Field : counter : int
    Field : interruptPeriod : int

    counter = interruptPeriod;
    stopCPU = false;
    pauseCPU = false;
    for (;;) {
        if (!pauseCPU) {
            // To ensure the CPU can be stopped whilst paused, the CPU
            // must be un-paused before attempting to stop it.
            Deal with the next instruction
            counter = counter - number of cycles for current instruction
            if (counter <= 0) {
                if (stopCPU) {
                    break;
                } else {
                    Deal with all cyclic tasks
                    counter = counter + interruptPeriod;
                }
            }
        }
    }

1.2. Memory

    Class Memory {
        Field: memory: int Array

        Constructor Memory(int: memorySize) { // in bytes
            memory = new Array[memorySize];
        }

        void: writeToMemory(int: address, int: data) {
            memory[actualAddress] = data;
        }

        int: readFromMemory(int: address) {
            return memory[actualAddress];
        }
    }

    Class ZeroPage extends Memory {
        // Requires no overriding.
        Constructor ZeroPage() {
            Super(256);
        }
    }

    Class Stack extends Memory {
        // The stack grows "backwards" in memory.
        Constructor Stack(int: memorySize) {
            Super(256);
        }

        void: writeToMemory(int stackPointer, int data) {
            memory[stackPointer] = data;
            decrement stackPointer;
        }

        int: readFromMemory(int: stackPointer) {
            increment stackPointer;
            return memory[stackPointer];
        }
    }

    Class CartRAM extends Memory {
        Constructor CartRAM(int: memorySize) {
            Super(8192); // 8KB
        }
    }

    Class CartROM extends Memory {
        /* NOTE: Even though writing to ROM does nothing, a method for writing
           should still be provided as some programs include instructions which
           write to ROM as an anti-piracy measure (if the write is successful,
           the program ends execution because it knows it is not executing on
           official hardware). */
        Constructor CartROM(int: memorySize) {
            Super(16384); // 16KB
        }

        void: writeToMemory(int: address, int: data) {
            // Writing to ROM is not allowed.
        }
    }

    Class CPURAM extends Memory {
        Constructor CPURAM(int: memorySize) {
            Super(1536);
        }
    }

    Class ExpansionROM extends Memory {
        Constructor ExpansionROM(int: memorySize) {
            Super(8160);
        }

        void: writeToMemory(int: address, int: data) {
            // Writing to ROM is not allowed.
        }
    }

    Class IO extends Memory {
        /* No actual data is stored. Just provides a means of interacting with
           external devices. */
        Constructor IO(int: memorySize) {
            Super(0);
        }

        void: writeToMemory(int: address, int: data) {
            switch (address) {
                case (0x2000) : // PPU Control Register 1
                case (0x2001) : // PPU Control Register 2
                …
            }
        }

        int: readFromMemory(int: address) {
            switch (address) {
                case (0x2000) : // return PPU Control Register 1
                case (0x2001) : // return PPU Control Register 2
                …
            }
        }
    }

1.3. Addressing Modes

    // Throughout this section, PC++ is taken to stand for the next byte of the
    // instruction stream, after which PC is advanced.

    int : absolute() { // Returns the operand found at a 16-bit address.
        int : lowEightBits = PC++;
        int : highEightBits = PC++;
        int : address = (highEightBits << 8) | lowEightBits; // full 16-bit address
        return getMemory(address);
    }
    // absoluteY also present.
    int : absoluteX() { // Returns the operand found at a 16-bit address plus X
        int : lowEightBits = PC++;
        int : highEightBits = PC++;
        // full 16-bit address + X (with wrapping to ensure a valid address)
        int : address = (((highEightBits << 8) | lowEightBits) + X) & 0xFFFF;
        return getMemory(address);
    }

    int : zeroPage() {
        return getMemory(PC++);
    }

    int : zeroPageX() { // zeroPageY also exists. Identical to zeroPageX
        int : address = PC++;
        address = (address + X) & 0xFF; // Logical AND to keep the address in zero
                                        // page (wraparound)
        return getMemory(address);
    }

    int : indirect() {
        int : address = PC++;
        int : lowEightBits = getMemory(address);
        int : highEightBits = getMemory((address + 1) & 0xFF); // Possible wraparound.
        return getMemory((highEightBits << 8) | lowEightBits);
    }

    int : indexedIndirect() {
        int : address = PC++;
        address = (address + X) & 0xFF; // Wraparound possible.
        int : lowEightBits = getMemory(address);
        int : highEightBits = getMemory((address + 1) & 0xFF); // Possible wraparound.
        return getMemory((highEightBits << 8) | lowEightBits);
    }

    int : indirectIndexed() {
        int : address = PC++;
        int : lowEightBits = getMemory(address);
        int : highEightBits = getMemory((address + 1) & 0xFF); // Possible wraparound.
        int : baseAddress = ((highEightBits << 8) | lowEightBits);
        baseAddress = (baseAddress + Y) & 0xFFFF; // Wraparound possible.
        return getMemory(baseAddress);
    }

1.4. DMA Access

    // The DMA controller is used to write 256 bytes from CPU memory to Sprite
    // memory.
    // The transfer takes a total of 512 cycles to complete.
    // The CPU is unable to access memory while this process takes place.
    // The DMA controller is started by a write to register $4014. The operand given
    // specifies the memory address to begin copying from (multiplied by 0x100).
    void : DMA(int: start) {
        start = start * 0x100;
        for (i = 0; i < 256; i++) {
            spriteMem.writeToMemory(i, CPUMem.readFromMemory(start + i));
        }
    }

1.5. Utility Methods

Branch

    switch (branchOpCode) {
        case (0xB0) : // BCS - Branch on carry set
            return Carry Set?
        case (…
    }

CheckPageBoundary

    return (newAddress EOR oldAddress) & 256; // (30)

LoadWord

    int : loadWord(int : lowestByte) {
        int : lowEightBits = getMemory(lowestByte);
        int : highEightBits = getMemory((lowestByte + 1) & 0xFFFF); // wraparound.
        int : address = (highEightBits << 8) | lowEightBits; // 16-bit address
        return getMemory(address);
    }

2. PPU

2.1. PPUCTRL ($2000)

    // Certain bits of the registers will cause different effects in the rendering of the
    // image to the screen.

    // Bits 0 and 1. Give the base name table address to be used.
    int : nameTableToUse = PPUCTRL & 0x03;
    switch (nameTableToUse) {
        Case (0) { baseNameTable = 0x2000; }
        Case (1) { baseNameTable = 0x2400; }
        Case (2) { baseNameTable = 0x2800; }
        Case (3) { baseNameTable = 0x2C00; }
    }

    // Bit 2. Internal PPU RAM Address increment per CPU read/write of
    // PPUDATA. 0: Increment by 1 (going across), 1: Increment by 32 (going
    // down).
    int : VRAMIncrement = PPUCTRL & 0x04;
    if (VRAMIncrement == 0) {
        addressIncrement = 1;
    } else {
        addressIncrement = 32;
    }

The above code is not representative of how it will be written in the finished software. This is because it will be necessary for the code to be spread out throughout the PPU class, so it would be impractical to show here. Additionally, pseudo code has only been provided for the first three bits of register PPUCTRL. Any further code would be superfluous as the remaining bits are handled very similarly.

2.2 OAMADDR ($2003)

    // The value written to this register specifies the location of sprite memory you
    // wish to access (read from or write to). The address written can then be
    // accessed via the OAMDATA ($2004) register.
    void : writeToOAMADDR(int : address) {
        OAMADDR = address;
    }

2.3 OAMDATA ($2004)

    // Behaviour of this register depends on whether it is being written to or read from.
    // Reading from this register simply returns the data at the address in Sprite RAM
    // specified by OAMADDR.
    // Writing to this register writes the data and then increments OAMADDR.
    void : writeToOAMDATA (int : data) {
        SpriteRAM[OAMADDR] = data;
        OAMADDR++;
    }

    int : readFromOAMDATA() {
        return SpriteRAM[OAMADDR];
    }

2.4 PPUSCROLL ($2005)

    // The first write to PPUSCROLL sets the horizontal scroll offset.
    // The second write sets the vertical scroll offset.
    boolean : horizontalScrollNext = false;

    writeToPPUSCROLL (int : data) {
        horizontalScrollNext = !horizontalScrollNext;
        if (horizontalScrollNext) {
            hScroll = data; // offsets range from 0 to 255.
        } else {
            vScroll = data; // offsets range from -16 to 239.
        }
    }

2.5 PPUADDR ($2006)

    // The first write to PPUADDR sets the upper byte of the address in PPU internal
    // memory that you wish to access.
    // The second write sets the lower byte of the address.
    int : PPUADDRWord;
    boolean : PPUADDRfirstByte = false;

    writeToPPUADDR(int : address) {
        PPUADDRfirstByte = !PPUADDRfirstByte;
        if (PPUADDRfirstByte) {
            PPUADDRWord &= 0x00FF;
            PPUADDRWord |= (address << 8); // Highest byte written.
        } else {
            PPUADDRWord &= 0xFF00;
            PPUADDRWord |= address; // Lowest byte written.
        }
    }

2.6 PPUDATA ($2007)

    // Allows you to read from or write to PPU internal memory at the address specified
    // by PPUADDR
    // NOTE: Reads are delayed by one cycle.
    void : writeToPPUDATA (int : data) {
        PPUInternal[PPUADDRWord] = data;
    }

    int : readFromPPUDATA() {
        return PPUInternal[PPUADDRWord];
    }

2.7 Image and Sprite Palette Representation

The code includes Java-specific objects (Color). Note that every fourth element in both palettes contains the same colour. The object created by the use of "Color(0,0,0,0)" represents transparency.

    Color[16] imagePalette;
    Color[16] spritePalette;
    IMAGEPALETTE = 0x3F01;  // first non-transparent element of image palette
    SPRITEPALETTE = 0x3F11; // first non-transparent element of sprite palette

    // Set the transparent elements of imagePalette.
    imagePalette[0] = new Color(0,0,0,0);
    imagePalette[4] = new Color(0,0,0,0);
    imagePalette[8] = new Color(0,0,0,0);
    imagePalette[12] = new Color(0,0,0,0);

    // backgroundColour used if both sprite and image colours are transparent.
    Color : backgroundColour = new Color(masterPalette.get(memory[0x3F00]));

    // sub-palette 1
    imagePalette[1] = new Color(masterPalette.get(memory[IMAGEPALETTE]));
    imagePalette[2] = new Color(masterPalette.get(memory[IMAGEPALETTE+1]));
    imagePalette[3] = new Color(masterPalette.get(memory[IMAGEPALETTE+2]));
    // sub-palette 2
    imagePalette[5] = new Color(masterPalette.get(memory[IMAGEPALETTE+4]));
    imagePalette[6] = new Color(masterPalette.get(memory[IMAGEPALETTE+5]));
    imagePalette[7] = new Color(masterPalette.get(memory[IMAGEPALETTE+6]));
    // sub-palette 3
    imagePalette[9] = new Color(masterPalette.get(memory[IMAGEPALETTE+8]));
    imagePalette[10] = new Color(masterPalette.get(memory[IMAGEPALETTE+9]));
    imagePalette[11] = new Color(masterPalette.get(memory[IMAGEPALETTE+10]));
    // sub-palette 4
    imagePalette[13] = new Color(masterPalette.get(memory[IMAGEPALETTE+12]));
    imagePalette[14] = new Color(masterPalette.get(memory[IMAGEPALETTE+13]));
    imagePalette[15] = new Color(masterPalette.get(memory[IMAGEPALETTE+14]));

    // An identical process is followed for the sprite palette (using the
    // SPRITEPALETTE constant).

2.8 Sprite Evaluation Routine

    int : numSprites = 0; // Still need to raise the Sprite Overflow flag.
    int : currentSecondary = 0;
    for (i = 0 to 63) {
        int : yPos = spriteMem[i*4];
        int : difference = yPos - scanline; // scanline == current scanline
        int : ySize = -8;
        if (8x16Sprites) { // Used to determine if Y pos in range later.
            ySize = -16;
        }
        // Is the sprite in range of the current scanline?
        if (difference <= 0 && difference > ySize) {
            numSprites++;
            if (numSprites > 8) {
                Set Sprite Overflow Flag // Not actually adhered to.
            }
            // Sprite found to be in range. Render it.
            int : byte1 = spriteMem[(i*4)+1];
            int : byte2 = spriteMem[(i*4)+2];
            int : byte3 = spriteMem[(i*4)+3];
            // Using the three bytes retrieved above, render the sprite to the display.
        }
    }

2.9 Sprite Evaluation Flowchart

[Figure: sprite evaluation flowchart]

2.10 Name Table Mapping

    // Depending on the name table mirroring scheme being used, writes to
    // addresses in the range of the name table memory (0x2000 - 0x3000) will
    // be treated differently. Each physical table is 1KB, so the offset within
    // a table is given by the low ten bits of the address.
    if (horizontalMirroring) {
        if (address >= 0x2000 && address < 0x2800) {
            physicalTable1[address & 0x03FF];
        } else {
            physicalTable2[address & 0x03FF];
        }
    } else if (verticalMirroring) {
        if ((address >= 0x2000 && address < 0x2400) ||
            (address >= 0x2800 && address < 0x2C00)) {
            physicalTable1[address & 0x03FF];
        } else {
            physicalTable2[address & 0x03FF];
        }
    }

3. APU

3.1 Divider

    int : period = n; // The period will be specified by writing to a sound register.
    int : counter = period;

    void : clock() { // Called each time the CPU clocks.
        if (--counter <= 0) {
            Output a clock.
            counter = period;
        }
    }

    void : forceReload() { // Reload the divider's counter with the period.
        counter = period;
    }

    void : changePeriod(int : newPeriod) {
        period = newPeriod;
    }

3.2 Sequencer

    // The below code represents the sequencer present in the Triangle sound channel
    // (simplified slightly so as to make the code as general as possible for illustration
    // purposes).
    field : sequence : int array =
        [15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15];
    field : current = 0;

    int : clock() {
        int : value = sequence[current];
        current = (current + 1) MOD sequence.length; // Keep looping.
        return value;
    }

3.3 Shift Register with Feedback

    int : shiftReg
    // Clock() results in a pseudo-random bit sequence.
    // Exclusive OR of bit 0 and either bit 6 or bit 1 (depending on status of loop bit).
    // The result of this EOR replaces bit 15 of the shift register.
    // Finally, shift the shift register 1 bit to the right.
    void : clock() {
        int : firstBit = shiftReg & 0x0001; // Get bit 0.
        int : secondBit;
        if (loopSet) { // If loop has been set.
            secondBit = (shiftReg >>> 6) & 1; // Get bit 6.
        } else {
            secondBit = (shiftReg >>> 1) & 1; // Get bit 1.
        }
        int : eorBits = firstBit EOR secondBit;
        shiftReg = (shiftReg & 0x7FFF) | (eorBits << 15); // Replace bit 15 with the EOR result.
        shiftReg = shiftReg >>> 1; // Shift 1 right w/o sign extension.
    }

3.4 Frame Counter

    int : mode = 0;
    int : steps = 4;   // number of steps for the sequence. Depends on the mode.
    int : current = 0; // Current position in the sequence.

    void : setMode(int : mode) {
        this.mode = mode;
        current = 0;
        if (mode == 1) {
            steps = 5; // A 5 step sequence.
            clock();   // clock immediately if mode is 1.
        } else {
            steps = 4;
        }
        divider.forceReload(); // The divider is what clocks the frame counter. This
                               // is left out for simplicity.
    }

    void : clock() {
        current = (current + 1) MOD steps;
        if (mode == 0) { // 4 step sequence.
            if (current == 1) {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
            } else if (current == 2) {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
                pulse1.lengthCounter.clock(); pulse2.lengthCounter.clock();
                noise.lengthCounter.clock(); triangle.lengthCounter.clock();
                pulse1.sweep.clock(); pulse2.sweep.clock();
            } else if (current == 3) {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
            } else {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
                pulse1.lengthCounter.clock(); pulse2.lengthCounter.clock();
                noise.lengthCounter.clock(); triangle.lengthCounter.clock();
                pulse1.sweep.clock(); pulse2.sweep.clock();
                if (interrupt inhibit clear) {
                    frame interrupt flag = true;
                }
            }
        } else { // 5 step sequence.
            if (current == 1) {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
                pulse1.lengthCounter.clock(); pulse2.lengthCounter.clock();
                noise.lengthCounter.clock(); triangle.lengthCounter.clock();
                pulse1.sweep.clock(); pulse2.sweep.clock();
            } else if (current == 2) {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
            } else if (current == 3) {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
                pulse1.lengthCounter.clock(); pulse2.lengthCounter.clock();
                noise.lengthCounter.clock(); triangle.lengthCounter.clock();
                pulse1.sweep.clock(); pulse2.sweep.clock();
            } else if (current == 4) {
                pulse1.envelope.clock(); pulse2.envelope.clock(); noise.envelope.clock();
                triangle.linearCounter.clock();
            }
        }
    }

3.5 Pulse Channel

    // adapted from C code written by Blargg.
    // The timer variable in this code will be a Timer object in the implementation. It
    // has been kept as an int here for simplicity.
    field : $4000 : int;
    field : $4002 : int;
    field : $4003 : int;
    field : timer : int = 0;
    field : phase : int;
    field : waves[4][8] : int array = {
        {0,1,0,0,0,0,0,0},
        {0,1,1,0,0,0,0,0},
        {0,1,1,1,1,0,0,0},
        {1,0,0,1,1,1,1,1}};

    // The below outputs waveform values based on the state of three APU registers.
    int : clock() {
        if (--timer <= 0) {
            int : raw = (($4003 & 7) << 8) | $4002;
            timer = (raw + 1) * 2;
            phase = (phase + 1) & 7;
        }
        return waves[($4000 >> 6) & 3][phase];
    }

3.6 Triangle Channel

    field : sequence : int array =
        {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
    field : timer : int = 0;
    field : $400A : int;
    field : $400B : int;
    field : current : int; // Current sequencer value.

    int : clock() {
        if (--timer <= 0) {
            int : raw = (($400B & 7) << 8) | $400A; // timer high and timer low,
                                                    // plus one.
            timer = (raw + 1);
        }
        current = (current + 1) MOD sequence.length; // Looping.
        return sequence[current];
    }

3.7 Noise Channel

    field : timerPeriods : int array = {4, 8, 16, 32, 64, 96, 128, 160, 202, 254,
                                        380, 508, 762, 1016, 2034, 4068};
    field : timer : int = 0;
    field : $400E : int;
    field : SRWF : ShiftRegisterWithFeedback;

    int : clock() {
        if (--timer <= 0) {
            int : timerPeriodIndex = ($400E & 15);
            timer = timerPeriods[timerPeriodIndex];
            SRWF.clock();
        }
        return current SRWF value;
    }

4. Input/Output

4.1 Determining Input State

    // This code assumes that the control pads have already been strobed via a write of
    // 1 followed by 0 to the lowest bit of register $4016. It also only handles one pad.
    // Strobing will set button back to 0 and fill the shiftRegister with the button states.
    int : shiftRegister;
    int : button = 0; // The next button's state to check.

    int : read() { // Read the next button state from pad.
        if (button != 8) {
            int : state = shiftRegister & 0x01;
            shiftRegister = shiftRegister >>> 1; // Shift right 1 to allow getting of
                                                 // next button's state next time.
            button++;
            return state;
        }
    }

Appendix G: Test Specification

Overview

This test specification will be referred to within the main report. To this end, each test or set of tests discussed below will be labelled to allow easy referencing. These references will be provided after the test title in the form ('reference') or in their own explicitly labelled column.

Pre-Written Test Files

In addition to project specific testing, there are a number of pre-written tests which will be used to help verify the soundness of the software.

CPU

Cpu_timing_test ('CPU Timing'): Tests instruction timing for most 6502 instructions. This consists of four separate test files, all available at (45).
Branch_timing_tests ('Branch Timing'): Tests correct emulation of the 6502 branch instructions. (46)
NES Test ('CPU Operation'): Thoroughly tests the operation of the CPU. (47)
cli_latency_tests ('CLI Latency'): Tests for the correct operation of the CLI instruction.
(48) cli_tests (‗CLI and Related‘) Tests CLI and related instructions. overtest (‗Overflow‘) Tests that the CPU‘s overflow flag works correctly. (49) PPU blargg_ppu_tests_2005.09.15b (‗PPU General‘) Tests several aspects of the NTSC PPU. (50) 159 scanline.nes (‗Scanline Rendering‘) Tests for the correct operation of the scanline rendering process. (50) sprite_overflow_tests (‗Sprite Overflow‘) Tests for the correct operation of the sprite overflow flag. (51) sprite_hit_timing (‗Sprite Hit‘) Tests for the correct operation of sprites. (52). vbl_nmi_timing (‗PPU Miscellaneous‘) Additional PPU tests (53) APU blargg_apu_2005.07.30 (‗APU Miscellaneous‘) Tests the Frame Counter operation and the first square wave‘s length counter. (54) sndtest.nes (‗Sound Test‘) A NES ROM file which allows the testing of the APU via allowing manipulation of the APU registers and outputting the resulting sound. (55). I/O Sndtest.nes (‗Joypad‘) In order to test the sound unit using this ROM, values on screen must be changed using the input device. If this works correctly, it can be concluded that the input mechanism works correctly. Project Specific Testing Performance Testing Even on modern hardware, few (if any) NES emulators achieve the full 60 Frames per Second of the original system. Thus, it would seem reasonable to accept a frame rate of anything over 50 as being acceptable performance. 160 Portability Testing To test portability, it will be tested on the following systems: Windows XP Pro (Service Pack 2) (‗Windows‘) Ubuntu Linux 7.10 (‗Ubuntu‘) Mac OS X (‗Mac‘) Unit Testing Overall Test Case A ROM file can be loaded and executed The program can be exited Acceptance Condition The ROM chosen will begin executing and displaying Choice of the ―Quit‖ option in the ―File‖ menu closes the application Reference ‗load‘ Acceptance Condition Paused instruction execution upon selection of the ―Pause‖ option in the ―CPU‖ menu. 
Reference: 'pause'

Reference (Overall, the program can be exited): 'exit'

CPU

Test Case: Can be paused and continued.
Acceptance Condition: Instruction execution continues upon selection of the "Continue" option in the "CPU" menu.

Test Case: Can be stopped.
Acceptance Condition: The CPU stops execution upon selecting the "Close ROM" option in the "File" menu.
Reference: 'stop'

Test Case: Can be reset.
Acceptance Condition: The display rendering starts from the beginning of the ROM file and continues to render as if opened for the first time.
Reference: 'reset'

Test Case: Correct working of the DMA.
Acceptance Condition: Write 256 bytes of data from a memory location to the sprite RAM, then compare the sprite RAM against the location copied from. If the two are the same, the copy has worked correctly.

PPU

Test Case: Graphical effects the emulator should be capable of:
    greyscale/colour modes
    enable/disable background clipping
    enable/disable sprite clipping
    enable/disable background rendering
    enable/disable sprite rendering
    set/unset colour emphasis
Acceptance Condition: These can be tested by flipping the bits responsible for these effects and visually checking that the intended effect occurs.
Reference: 'Graphical Effects'

Test Case: Name table mappings should work correctly.
Acceptance Condition: This can be tested entirely visually. If the mappings are not working correctly, the display will not render correctly.
Reference: 'Name Table Mappings'

Test Case: Pattern table use:
    correct use in rendering sprites
    correct table used for displaying sprites and background
Acceptance Condition: This too can be tested visually.
Reference: 'Pattern Table Use'

APU

Test Case: The volume and mute features of the system work correctly.
Acceptance Condition: This will be tested purely by inspecting the output at a variety of volume levels (0%, 25%, 50%, 75%, 100%).
Reference: 'Volume Control'

Correct APU operation will be verified largely through comparison testing against the output of the FCEUXD SP emulator. The "sndtest.nes" ROM will be used for this purpose.

I/O

Test Case: The key mapping facility in the GUI works correctly.
Test Case: Input from the user is picked up and interpreted correctly.
Acceptance Condition (key mapping): This will be tested by using the GUI to change the key mappings for the standard control pad. The Joypad Test Cartridge will then be used to confirm that the keys just mapped do indeed correspond to standard control pad buttons.
Acceptance Condition (input recognition): The above method of using the test cartridge can be used here also.
Reference: 'Key Mapping and input recognition'

Debugger

Test Case: The register state fields show the correct values.
Acceptance Condition: This is confirmable by tracing the execution of the CPU for a time, ensuring that the register display matches the actual register values.
Reference: 'Register State'

Test Case: The debugger breaks program execution when it is required to do so.
Acceptance Condition: Confirmable by entering breakpoints of all types and ensuring that the system breaks at the set points.
Reference: 'Breaks'

Test Case: The breakpoint manager works as desired, i.e. it should allow:
    the removal of a particular breakpoint
    the removal of all breakpoints
    the addition of an additional breakpoint
Acceptance Condition: Remove a breakpoint and check that the program no longer halts at this point. Remove all breakpoints and check that the program no longer halts at all. Add a breakpoint and check that it halts where desired.
Reference: 'Breakpoint Manager'

Test Case: Upon pressing "Step", the CPU should execute one instruction and then break. Upon pressing "Resume", execution should continue until the next breakpoint (if any).
Acceptance Condition: Both by observation.
Reference: 'Step and Resume'

Test Case: "Seek PC" should highlight the line in the disassembled code with the same PC number.
Acceptance Condition: By observation.
Reference: 'Seek PC'

Test Case: The "Code" pane should display the disassembled source code for the running ROM.
Acceptance Condition: Comparison with the disassembled code produced by FCEUXD SP.
Reference: 'Disassembled'

Test Case: Memory panes should display the correct values for all memory locations.
Acceptance Condition: Check memory locations immediately after the code has written to them. If they hold the correct values, the display is taken as correct.
Reference: 'Memory Display'
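The 'Breakpoint Manager' checks in the Debugger table (add a breakpoint, remove one, remove all) lend themselves to automation. The sketch below is illustrative only: the BreakpointManager class and its method names are hypothetical and are not taken from the emulator's source, and it assumes breakpoints are plain execution breakpoints keyed on a 16-bit program counter, ignoring the other breakpoint types the debugger supports.

```java
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical breakpoint store keyed on 16-bit program counter values.
 * Only execution breakpoints are modelled in this sketch.
 */
class BreakpointManager {
    private final Set<Integer> breakpoints = new TreeSet<>();

    /** Adds a breakpoint at the given address (wrapped to 16 bits). */
    void add(int pc) {
        breakpoints.add(pc & 0xFFFF);
    }

    /** Removes the breakpoint at the given address, if one is set. */
    void remove(int pc) {
        breakpoints.remove(Integer.valueOf(pc & 0xFFFF));
    }

    /** Removes every breakpoint. */
    void removeAll() {
        breakpoints.clear();
    }

    /** Returns true if execution should halt before the instruction at pc. */
    boolean shouldBreak(int pc) {
        return breakpoints.contains(pc & 0xFFFF);
    }
}
```

Each acceptance condition then maps onto a single check: after add(pc) the CPU loop should see shouldBreak(pc) return true, after remove(pc) it should return false for that address only, and after removeAll() it should return false for every address.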
Pattern Table Viewer

Test Case: It should be capable of displaying:
    the image tiles in the pattern table
    the palettes used by the pattern table entries
    tile information (table num, tile num)
    palette information (palette type, entry num, master palette entry num)
Acceptance Condition: Most of these requirements can be tested via comparison with the output of the FCEUXD SP emulator with a given ROM. The remainder can be tested via observation (table num, palette type).
Reference: 'Many'

Test Case: Display options:
    toggle automatic refresh
    alterable refresh rate
Acceptance Condition: By observation.
Reference: 'Refresh'

Name Table Viewer

Test Case: It should be capable of displaying:
    the image made up of pattern table entries in the name tables
    scroll lines
Acceptance Condition: Accept if the visuals match those seen in FCEUXD SP for several ROMs.
Reference: 'Tiles and Scroll Lines'

Test Case: Should display correct attribute table values.
Acceptance Condition: Accept if the name table colours are displayed correctly. The attribute values must be correct if these are displayed correctly.
Reference: 'Attribute Information'

Test Case: Display options:
    toggle automatic refresh
    alterable refresh rate
Acceptance Condition: By observation.
Reference: 'Refresh'

Test Case: Display the name tables in two bit colour.
Acceptance Condition: Compare the tile colours used in the name tables against the colours of the appropriate tiles in the pattern table viewer. If these match, two bit colour display is working correctly.
Reference: 'Two Bit Colour'

Test Case: The capability to view the name table in numeric form.
Acceptance Condition: Compare the values shown in this display against the appropriate locations in memory.
Reference: 'Numeric Name Tables'

Appendix H: Project Logs

Before 1st October 2007 – Preliminary research of the NES and the 6502.
1st October 2007 – NES and 6502 processor information researched. Mainly "Programming the 6502" by Rodnay Zaks.
4th October 2007 – Project requested via the project database. Also, PERT charts developed.
8th October 2007 – First meeting with project supervisor.
12th October 2007 – Much online material consulted.
Mainly related to project extension possibilities.
18th October 2007 – Meeting with project supervisor. Project Proposal given to project supervisor.
18th October 2007 – Alterations to report suggested by supervisor implemented. Continued research on the NES APU and began documenting APU.
19th – 24th October 2007 – APU documentation continues.
24th October 2007 – Restructured document to help readability.
26th October 2007 – Beginning to document PPU for analysis.
27th October 2007 – Continued documentation of PPU.
28th October 2007 – PPU documentation.
29th October 2007 – Documentation of PPU registers and name tables.
30th October 2007 – PPU documentation completed. Input documentation completed.
2nd November 2007 – Continued Interim report.
7th November 2007 – Continued editing of interim document. Began work on design.
9th November 2007 – Began CPU design doc.
14th November 2007 – Completed CPU design. Beginning PPU design doc.
16th November 2007 – Continued work on PPU design.
22nd November 2007 – Completed PPU design doc.
25th November 2007 – Beginning APU design.
27th November 2007 – Continued APU design.
28th November 2007 – Completed I/O design spec.
29th November 2007 – GUI designs completed.
30th November 2007 – Testing section added.
31st November 2007 – Report cleanup, project proposal added to appendix.
1st December 2007 – Interim report completed.
15th January 2008 – CPU mainly operational.
22nd January 2008 – PPU internal operation functional.
29th January 2008 – Pattern Table Viewer functional.
2nd February 2008 – Name Table Viewer functional.
8th February 2008 – Debugger partially functional.
12th February 2008 – CPU timing locked to 60 FPS.
10th March 2008 – APU mostly functional.
19th March 2008 – APU functional bar the DMC channel.
26th March 2008 – CPU fully operational.
5th April 2008 – Sprite rendering functional.
7th April 2008 – Sprite rendering at both priority levels working.
17th April 2008 – Programming Completed.
19th April 2008 – Report Finished.

Appendix I: GNU GENERAL PUBLIC LICENSE Version 2

GNU GENERAL PUBLIC LICENSE
Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.

Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not covered by this License; they are outside its scope.
The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.

1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.

You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.

2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:

a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.

b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.

c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License.
(Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.

In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.

3.
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:

a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.

4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.

6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.

7.
If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.

This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.

8.
If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.

9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.

10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.

NO WARRANTY

11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

<one line to give the program's name and a brief idea of what it does.>
Copyright (C) 19yy <name of author>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) 19yy name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker.

<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice

This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License.