SC123 Assembly Language Manual

SC123 Website: http://www.cs.csustan.edu/~rrsilver/html/sc123.html

Dr. Robert Silverman
Computer Science Department
California State University Stanislaus
Turlock, CA 95382
rsilverman@csustan.edu

Created date: May 1, 2006
Revised date: April 29, 2007
Revised date: June 8, 2007
Revised date: June 16, 2007
Revised date: July 6, 2007
Revised date: July 20, 2007
Revised date: August 27, 2007
Revised date: November 9, 2007
Revised date: January 7, 2008
Revised date: October 13, 2008
Revised date: November 20, 2008
Revised date: May 11, 2009
Revised date: November 17, 2009
Revised date: July 30, 2010
Revised date: October 5, 2010
Revised date: October 25, 2010
Revised date: October 7, 2011
Revised date: February 5, 2012
Revised date: August 26, 2012
Revised date: October 2, 2012
Revised date: October 15, 2012
Revised date: January 20, 2013
Revised date: July 20, 2014

Note: Key changes at architecture level in January 2012:
● User space allocation.
● Stack behavior. Old programs that use the stack will not work; users must read this document and then modify their programs.

This document and system is copyrighted by Robert R. Silverman, 2002, ... current date. No part of this system or information presented here can be reproduced using any means, such as: book print, physical, electronic, chemical, biological, mathematical, telepathic, or other not yet discovered means without the explicit permission of Dr. Robert R. Silverman, CSU Stanislaus, rsilverman@csustan.edu

Explicit permission is given to all my students to copy this material as needed for educational noncommercial purposes. This document can be obtained for free from Dr. Silverman for educational noncommercial purposes.
Robert Silverman Page 1 8/1/2014

TOC

SC123 Assembly Language Manual
Introduction
Overview of SC123 System
System Organization Diagrams
Instruction Set
Address Space Partition
Registers and Memory
Addressing Modes
Operand Addressing Mode Usage
Machine Instruction Format
Instruction Semantics
    Basic Instructions (group 0)
    Logic and Shift Instructions (group 1)
    Input/output Instructions (group 2)
    Jump Instructions (group 3)
    Subroutine Call Instructions (group 4)
Data Definition
Data Semantics
I/O Device Programming
Example of a Simple Program
Program Example
Programming Fragments Examples

Introduction

This is the Assembly Language Manual for SC123, a new educational computer developed at CSU Stanislaus. This manual has four main parts:
● System overview
● SC123 organization overview
● Assembly language syntax and semantics
● Code tutorial

The SC123 is a new computer architecture. This manual is for the ID3E programming and execution environment. The ID3E consists of an assembler, an IDE, and an execution engine, all packaged in a friendly, GUI-accessible system. The ID3E provides an assembly language teaching/learning environment that will:
1. Provide instructors with the capability of staging the learning process of the instruction architecture from basic operations to more complex operations to a complete architecture.
2. Provide instructors with a convenient tool within which to test students' programs.
3. Provide students with a realistic assembly language programming experience that is readily transferable to any industry setting in which they would need to use assembly language.
4. Provide students with a convenient tool to learn, write, test, debug, and execute their programs.

Overview of SC123 System

The machine has a simple architecture to make the language easy to learn. Here is the register view of the architecture. The functional view is discussed in the ISA section. The machine contains the following user-program-accessible registers. Register SP is mapped into r3. We will refer to it as SP or r3 as appropriate.

Register  Name                       Size     Functionality
PC        program counter            16 bits  holds the address of the current instruction
IR        instruction register       48 bits  holds the instruction, including two addresses (if needed)
r0-r3     general purpose registers  16 bits  hold operands for computation or addresses of operands
PSR       processor status register  16 bits  currently contains only CCstate, mapped into the five least significant bits of PSR
CCstate   condition code state                negative (N), positive (P), zero (Z), carry (C), and overflow (O)
SP/r3     stack pointer/r3           16 bits  points to the top of the stack (empty slot)
inport    input register             16 bits  brings data from input devices
outport   output register            16 bits  sends data to output devices

The stack grows down. In other words, pushing a word onto the stack causes SP to decrement. This semantic was changed in January 2012.

CCstate[N], CCstate[Z], CCstate[P], CCstate[C], and CCstate[O] are condition code bits which are affected by arithmetic operations. They are used by the conditional jumps (jumpct, jumpcf), described in the jump section.
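To make the relationship between a 16-bit result and the N, Z, and P bits concrete, here is a minimal Python sketch. It is not part of the SC123 tools. The helper names (`to_signed16`, `ccstate_for`, `describe`) are hypothetical, the bit positions follow the CCstate numbering given later in this manual (N = bit 0, Z = bit 1, P = bit 2, C = bit 3, O = bit 4), and the carry and overflow bits are simply passed in, since their exact rules are not spelled out here.

```python
# Bit positions of the condition codes inside CCstate (PSR bits 0-4).
N_BIT, Z_BIT, P_BIT, C_BIT, O_BIT = 0, 1, 2, 3, 4


def to_signed16(word):
    """Interpret a 16-bit word as a two's complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def ccstate_for(result_word, carry=False, overflow=False):
    """Build a 5-bit CCstate value for a 16-bit result (hypothetical helper).

    N, Z, and P are derived from the signed value of the result;
    carry and overflow are supplied by the caller (assumption).
    """
    value = to_signed16(result_word)
    cc = 0
    if value < 0:
        cc |= 1 << N_BIT
    elif value == 0:
        cc |= 1 << Z_BIT
    else:
        cc |= 1 << P_BIT
    if carry:
        cc |= 1 << C_BIT
    if overflow:
        cc |= 1 << O_BIT
    return cc


def describe(cc):
    """List the names of the condition codes that are set in cc."""
    return [name for bit, name in enumerate("NZPCO") if (cc >> bit) & 1]


print(describe(ccstate_for(0xFFFF)))  # 0xFFFF is -1 in two's complement
```

Exactly one of N, Z, and P is set for any result in this model, which is why a jump condition such as "P or Z" covers every non-negative result.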
The valid address space is 2^15 words; hence the valid address range is 0x0000-0x7FFF. If you want to know why the valid space is not 2^16 words, see how real computers deal with negative addresses.

System Organization Diagrams

The following is the SC123 organization diagram. The organization is simple and can be used in a first course on assembly language, a course on computer architecture, or a course on computer organization. For clarity, the PSR and its logic are omitted from the diagram. PSR.CCstate is affected by the results of arithmetic operations. We will call PSR.CCstate the "condition code state".

Figure: SC123 Organization Diagram (components shown: Main Memory (MM), CPU regs, IR-Dec, R3-Inc, PC-Inc, MAR, MDR, ALU, SEQ, Inport, Outport, Input/Output; connected by the data bus, address bus, and control bus)

Instruction Set

For pedagogical purposes, the SC123 instruction set is divided into a number of disjoint op-code groups. This allows the instructor to teach the instructions in logically related groups. It is tempting to describe these instruction set divisions as hierarchical, adding layers of sophistication. However, this would be misleading. Different teachers might present concepts in a different order, so the instruction set might grow perfectly logically in more than one direction. While one instructor might want to teach arithmetical operations first, followed by logical operations, and then subroutines, another might leave logical operations out of the course entirely, but introduce indirect addressing.

In addition to the partitioned op-code set, the SC123 uses an orthogonal set of addressing modes which are available to most instructions. Addressing modes which are orthogonal to the op-code set are a feature of most (though not all) modern computer architectures. They simplify the instruction set architecture while enhancing the computational power and efficiency. Pedagogically, orthogonality allows the two concepts (operations and addressing) to be taught separately.
Address Space Partition

The computer address space is partitioned into three subspaces.

[0x00-0x0F]          Interrupt space
[0x10-0x3F]          I/O space
[0x40-(MMsize-1)]    User program space

Registers and Memory

The SC123 CPU includes four general purpose registers (one of which, r3, also serves as the stack pointer), a program counter, and a processor status register which contains the condition code register. Each of these, except the condition code register, is sixteen bits in size. The condition code register (CCstate, the condition code state) has separate bits for the "negative," "zero," "positive," "carry," and "overflow" conditions:

Name      Meaning                                                 CCstate register bit
Negative  Operation result is less than zero                      0
Zero      Operation result is zero                                1
Positive  Operation result is greater than zero                   2
Carry     Operation resulted in an arithmetic carry               3
Overflow  Operation result is larger than will fit in destination 4

Memory consists of sixteen-bit words, addressable on a word basis. Memory is used to store data, program, and stack.

Addressing Modes

The bulk of the SC123 instruction set uses two operands, designated as the source and destination operands, with the following "basic addressing modes". If the information is in main memory, we use the concept of an effective address (ea) to describe how the data is accessed. The data is always accessed in main memory using the ea, but the ea is computed differently in different modes.

Register: This addressing mode specifies one of the four general purpose registers. If the source is a register, the information is in the register. If the destination is a register, the data will be deposited in the register.

Memory: This mode indicates a location in main memory. When the memory addressing mode is used, the instruction is followed by a sixteen-bit fixed address. Note that although the machine-level architecture has a single memory addressing mode, the assembler makes distinctions between general memory addresses, data variables, and program labels.
If an operand uses memory mode, then ea = address. Variable names must be at least 2 characters long.

Literal: This addressing mode specifies the data as a part of the instruction, rather than indicating where to find it. When it is used, the instruction is followed by a sixteen-bit fixed data value. This addressing mode is only permitted with source operands, because a literal is a value and not a location that can receive a result. If the source uses literal mode, then the information is in the instruction.

Register Indirect with Offset: This addressing mode specifies a register that contains a base address, plus an offset to be added to that base address to produce the memory address of the operand. When used, the instruction is followed by a sixteen-bit fixed offset. Register indirect with offset addressing can be used in a wide range of applications, including:
● register indirect addressing (with a zero offset),
● stack operations (using a register as the stack pointer, and incrementing or decrementing the register using the add and subtract instructions), and
● record references (with the base address of the record in the register and the in-record offset of the data in the offset).
For stack operations, r3 is used as the stack pointer by convention. The effective address is computed as follows: ea = r + offset. The data is contained in MM[ea]. The offset can be positive or negative.

Control flow instructions (jumpa, the conditional jumps jumpct and jumpcf, and jump to subroutine) use the "jump addressing mode" for their address.

Jump addressing mode: This addressing mode specifies a memory location, and is used exclusively as the target of a control flow jump. When it is used, the instruction is followed by a sixteen-bit fixed address. Although this mode is similar to the previously described memory mode, we consider it a separate mode for the sake of clarity, since it references a memory location rather than its contents.

Operand Addressing Mode Usage

Some instructions have no operands, such as nop and halt.
Some instructions have operands and need to use addressing modes to access the operands. The addressing modes are as general as possible, but clearly some addressing modes cannot be applied in certain situations. For example, a literal cannot be a destination. NA means not applicable.

Instruction                         Op1 permitted modes    Op2 permitted modes
nop, halt, clearcc                  NA                     NA
move, add, sub, and, nor, shl, shr  rg, mm, lit, rg(off)   rg, mm, rg(off)
in, pop                             rg, mm, rg(off)        NA
out, push                           rg, mm, lit, rg(off)   NA
jumpa                               jump-mode              NA
jumpct, jumpcf                      rg, mm, lit, rg(off)   jump-mode
jsr                                 jump-mode              NA
ret                                 NA                     NA

Machine Instruction Format

An SC123 instruction word is divided into four nibbles (a nibble is four bits): the op-code group, the op-code, the source operand, and the destination operand. For pedagogical reasons, the instruction set consists of several instruction groups: the basic group (G0), the logical and shift group (G1), the input/output group (G2), the control flow group (G3), and the subroutine call group (G4). Based on our experience in the classroom, we may decide to add additional groups.

Group | Op-code | Source | Destination

The group number and op-code fields are simple four-bit numbers. The source and destination fields contain addressing modes. For the four basic addressing modes and the jump mode, each field consists of two two-bit sub-fields: the select mode bits and the select register bits.

Address mode: select mode bits | select register bits

The address mode field can have the following values:

Address Mode                    Select mode bits  Select register bits
Register                        00                00-11
Memory                          01                00
Literal (source operands only)  10                00
Register indirect with offset   11                00-11
Jump                            01                01

Note that the memory, literal, and register indirect with offset address modes involve an additional word appended to the instruction to hold the memory location, literal value, or offset. The control flow instructions have only one mode for their address, the jump address mode.
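The nibble layout above can be sketched in Python as follows. This is only an illustration under stated assumptions: the manual lists the fields in the order group, op-code, source, destination, and the sketch assumes that order runs from the most significant nibble to the least significant one. The function names are hypothetical, and the opcode numbers used in the example (move = 3 and add = 4 in group 0) come from the basic instruction table in this manual.

```python
# Select mode bits for the four basic addressing modes (from the table above).
MODE_BITS = {"rg": 0b00, "mm": 0b01, "lit": 0b10, "rg(off)": 0b11}
JUMP_NIBBLE = 0b0101  # jump mode: mode bits 01, register bits 01


def operand_nibble(mode, reg=0):
    """Return the 4-bit operand field: mode bits, then register bits."""
    if mode == "jump":
        return JUMP_NIBBLE
    if mode in ("mm", "lit"):
        reg = 0  # these modes always use register bits 00
    return (MODE_BITS[mode] << 2) | (reg & 0b11)


def encode_first_word(group, opcode, src, dst):
    """Pack the four nibbles into one 16-bit word.

    Assumption: group occupies the most significant nibble and
    destination the least significant one.
    """
    return (
        ((group & 0xF) << 12)
        | ((opcode & 0xF) << 8)
        | ((src & 0xF) << 4)
        | (dst & 0xF)
    )


# add #20 %r1 -> group 0, opcode 4, literal source, register r1 destination
word = encode_first_word(0, 4, operand_nibble("lit"), operand_nibble("rg", 1))
print(f"{word:016b}")
```

Under that assumed bit order, `add #20 %r1` would occupy two words: the word built here, followed by the literal 20.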
The jump address mode: The jump address mode is specified by 0101 (base 2).

Instruction Semantics

In the following sections, we describe the semantics of each of the instructions. The following abbreviations are used:

PSR:     processor status register (currently containing only CCstate)
PC:      program counter
r0-r3:   the four general purpose registers
il:      instruction length
CCstate: condition code register (aka "condition code state"). Note that CCstate is mapped into the five least significant bits of PSR.
N:       negative condition code
Z:       zero condition code
P:       positive condition code
C:       carry condition code
O:       overflow condition code
src:     source operand (see addressing mode discussion above)
dst:     destination operand (see addressing mode discussion above)
addr:    instruction address operand (see addressing mode discussion above)

Most of these are obvious, but the instruction length, il, requires some additional explanation. SC123 instructions are one, two, or three words in length, depending upon the addressing modes used. The basic instruction size is one word. A source operand of memory, literal, or register indirect with offset type adds an additional word (to hold the memory address, literal value, or offset). A destination operand of memory or register indirect with offset type (literal being illegal for destination operands) adds an additional word (to hold the memory address or offset). The il is the total length of the current instruction.

Some instructions affect no condition code state, some affect some bits of the condition code state, and some affect all bits of the condition code state. If an instruction does not affect a bit, the bit stays unchanged.

Basic Instructions (group 0)

The basic instruction set includes the most basic operations available in the SC123.
Operation      Description                Mnemonic  Opcode  Semantics                        CCstate affected
No-op          No operation               nop       0       PC + il -> PC                    None
Halt           Halt computation           halt      1       -                                None
Clear CCstate  Clear condition codes      clearcc   2       0 -> CCstate; PC + il -> PC      All
Move           Move word from src to dst  move      3       src -> dst; PC + il -> PC        None
Add            Add src to dst             add       4       src + dst -> dst; PC + il -> PC  All
Subtract       Subtract src from dst      sub       5       dst - src -> dst; PC + il -> PC  All

Logic and Shift Instructions (group 1)

The logic instructions include basic bit-wise manipulations. The shifts are logical shifts by the number of bits specified by cnt. For shl, bits shifted out on the left are lost and zeroes are injected from the right. For shr, bits shifted out on the right are lost and zeroes are injected from the left.

Operation    Description                  Mnemonic  Opcode  Semantics                                                    CCstate affected
And          Bit-wise logical and         and       0       src AND dst -> dst; PC + il -> PC                            N, Z, P
Nor          Bit-wise logical nor         nor       1       src NOR dst -> dst; PC + il -> PC                            N, Z, P
Shift left   Bit-wise logical left shift  shl       2       src -> cnt; dst shifted left by cnt -> dst; PC + il -> PC    N, Z, P
Shift right  Bit-wise logical right shift shr       3       src -> cnt; dst shifted right by cnt -> dst; PC + il -> PC   N, Z, P

Shift left operation:
Ex: shl #2 val
val:    0100 0000 1000 1100
Result: 0000 0010 0011 0000
In other words, for each bit b[i] of the operand: if (i + cnt) < 16, b[i] moves left by cnt positions; if (i + cnt) >= 16, b[i] is discarded. Zeroes are injected from the right.

Input/output Instructions (group 2)

The input/output instructions allow the SC123 to communicate with the outside world via a single input and a single output channel.

Operation  Description                      Mnemonic  Opcode  Semantics                      CCstate affected
In         Input data from external device  in        0       inport -> dst; PC + il -> PC   None
Out        Output data to external device   out       1       src -> outport; PC + il -> PC  None

Jump Instructions (group 3)

We have the unconditional jump jumpa (jump always). Conditional jumps provide for a condition-based change of control flow. jumpct jumps if the condition is true.
jumpcf jumps if the condition is false. A single conditional jump instruction can test any combination of the processor's condition code state (CCstate).

Description of jumpct: jumpct <CCcond> <address>. The jump condition "CCcond" can specify a logical "or" of multiple primitive conditions. The primitive conditions are the following: N, Z, P, C, and O. We can jump, for example, on "P or Z".

The PSR stores the condition code state (CCstate) as five bits, where the least significant is b0. The bits are b4 = [O], b3 = [C], b2 = [P], b1 = [Z], and b0 = [N].

To determine whether jumpct will take the jump, we need to compute the boolean condTotal. The system hardware evaluates condTotal; if it is true, the jump is taken. Otherwise, the jump is not taken. Two components are needed to compute condTotal: CCstate and CCcond.

The CCcond specified in the jumpct can be a number between 0 and 31 (base 10). This is equivalent to 00000 to 11111 (base 2). Bits higher than b4 are ignored. The CCcond is specified in decimal, but can be converted to a 5-bit binary number. For example, CCcond = #3 (base 10) is equivalent to 00011 (base 2), and CCcond = #1 (base 10) is equivalent to 00001 (base 2).

The condTotal is computed as the logical "or", taken over i = 0 to 4, of (CCstate[i] AND CCcond[i]):

condTotal = OR for i = 0..4 of (CCstate[i] AND CCcond[i])

In other words:

condTotal = (CCcond[4] AND CCstate[4]) OR
            (CCcond[3] AND CCstate[3]) OR
            (CCcond[2] AND CCstate[2]) OR
            (CCcond[1] AND CCstate[1]) OR
            (CCcond[0] AND CCstate[0])

For example:
jumpct #1 200: #1 (base 10) = 00001 (base 2); takes the jump if CCstate[0] == 1, in other words, jump on negative.
jumpct #3 200: #3 (base 10) = 00011 (base 2); takes the jump if CCstate[1] == 1 or CCstate[0] == 1, in other words, jump on zero or negative.
jumpct #16 200: #16 (base 10) = 10000 (base 2); takes the jump if CCstate[4] == 1, in other words, jump on overflow.

jumpcf behaves similarly, except that it jumps if condTotal evaluates to false.
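The condTotal rule can be checked with a few lines of Python. This is a minimal sketch rather than the simulator's actual code; the function names `cond_total` and `takes_jump` are made up for illustration. It relies on the fact that an "or" of bit-by-bit "and" terms is true exactly when the bitwise AND of the two 5-bit values is non-zero.

```python
CC_MASK = 0b11111  # only bits b0..b4 (N, Z, P, C, O) take part


def cond_total(ccstate, cccond):
    """OR over i = 0..4 of (CCstate[i] AND CCcond[i])."""
    return (ccstate & cccond & CC_MASK) != 0


def takes_jump(mnemonic, ccstate, cccond):
    """Decide whether jumpct or jumpcf transfers control."""
    total = cond_total(ccstate, cccond)
    if mnemonic == "jumpct":
        return total
    if mnemonic == "jumpcf":
        return not total
    raise ValueError("expected jumpct or jumpcf")


NEGATIVE = 0b00001  # CCstate with only N set
ZERO = 0b00010      # CCstate with only Z set
POSITIVE = 0b00100  # CCstate with only P set

# CCcond = #3 means "zero or negative"
for name, cc in (("negative", NEGATIVE), ("zero", ZERO), ("positive", POSITIVE)):
    print(name, takes_jump("jumpct", cc, 3))
```

With CCcond = #3 the loop reports that the jump is taken for the negative and zero states and not taken for the positive state, matching the "jump on zero or negative" example above.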
Operation           Description                       Mnemonic  Opcode  Semantics               CCstate affected
Unconditional jump  Jump always                       jumpa     0       addr -> PC              None
Conditional jump    Jump based on condition code      jumpct    1       if (condTotal = true)   None
if condTotal is     values; taken if                                    then addr -> PC
true                condTotal = true                                    else PC + il -> PC
Conditional jump    Jump based on condition code      jumpcf    2       if (condTotal = false)  None
if condTotal is     values; taken if                                    then addr -> PC
false               condTotal = false                                   else PC + il -> PC

Subroutine Call Instructions (group 4)

These instructions allow subroutines to be called, and data to be pushed onto and popped off of the stack. Register r3 is used as the stack pointer. The semantic was changed in January 2012.

Operation           Description                       Mnemonic  Opcode  Semantics               CCstate affected
Jump to subroutine  Call a subroutine, pushing        jsr       0       PC + il -> M[SP]        None
                    a return address on the stack                       SP - 1 -> SP
                                                                        addr -> PC
Return              Return from a subroutine          ret       1       SP + 1 -> SP            None
                                                                        M[SP] -> PC
Push                Push data onto the stack          push      2       src -> M[SP]            None
                                                                        SP - 1 -> SP
                                                                        PC + il -> PC
Pop                 Pop data from the stack           pop       3       SP + 1 -> SP            None
                                                                        M[SP] -> dst
                                                                        PC + il -> PC

Data Definition

Data size is one word. We can define uninitialized and initialized data like so:

cnt    word            //reserve 1 word uninitialized
array  word 10         //reserve 10 words uninitialized
sum    initword #13    //reserve 1 word initialized to 13

Data Semantics

We have one data type: 16-bit integer in 2s complement notation. All arithmetic uses 2s complement rules.

I/O Device Programming

The simulator models the I/O subsystem. The I/O space is memory mapped. The currently implemented I/O device is the Printer. In the future we may add a hard disk. Each I/O device is represented by an interface as follows.
Name       Usage                                                 Rg offset within interface
DevTypeRg  Register contains the type of I/O device (read only)  0
CtrRg      Holds commands (write only)                           1
StatRg     Contains status of the device (read only)             2
DataRg     Data to/from device (read/write)                      3

Device      Address allocated  State of affairs
Interrupts  0x00-0x0F          done
Printer     0x10-0x1F          done
Disk        0x20-0x2F          future
Reserved    0x30-0x3F          future

Printer status register bits mapping. The status register has the following assignment.

Bit  Bit name      Function (bit = 1 / bit = 0)
0    bitgoodproto  the device is within good protocol / otherwise bad protocol
1    bitinproto    the device is within protocol / not in protocol
2    bitrdy        DataRg is ready for new data / not ready
3    bit3          NU
4    bitoper       the device is operational / the device is not operational
5    bitjam        the device has soft error / the device does not have soft error
6    bit6          NU

Device commands (new commands may be added later), followed by special control data.

Value          Command     Function
0xF001         PROTOSTART  start protocol
0xF002         PROTOEND    end protocol
0xF003         PROTORESET  reset protocol
0xF004-0xF00F  reserved

Value   Control data  Function
0x8000  NEWLINE       tells printer to go to new line

For a sample program that prints a character on the printer, see the program in the WrkSpace directory.

Example of a Simple Program

Here is an example program in SC123 assembly language that intentionally uses a broad range of instructions and different addressing modes. In this case we take an array and modify each element by adding 3 to it.

Note that each program has four required blocks: the comment header, data, text, and stack. Each program block is delimited by <X>Start and <X>End assembly directives. Omission of any of them will generate an assembly error. The data, text, and stack block directives include a user-specified start address where the block will be located in main memory. For example, !textStart 90 means that the text block starts at location 90 (base 10) in MM. This teaches students that programs take space.
The program can use decimal and hexadecimal numbers for values and addresses. A hexadecimal number is represented by 0xNNNN, where N is a hex digit, 0...F.

Program Example

!commentStart
//The slash-slash denotes that this line is a comment.
//Comment must be alone on its own line.
//Line that starts with ! is an assembler directive.
//Space between two fields can be single or multiple blank(s) or tab(s).
//Every program consists of four adjacent blocks: comment, data, text and stack.
//There can be blank line anywhere, except between adjacent blocks.
//This program goes through an array and adds 3 to each element.
//This is not written the most efficient way; it is this way, like everything else, for pedagogical reasons.
!commentEnd
!dataStart 0x80
//array data start
myArray     initword #11
            initword #12
            initword #13
            initword #14
//myArrayLen is initialized to length of the array
myArrayLen  initword #4
//add this constant
const       initword #3
//jump on neg
CCNEG       initword #1
!dataEnd
!textStart 0xA0
            move #myArray %r1
//%r1 points to element 0
//check array limit (use myArrayLen as count)
startLoop:  sub #1 myArrayLen
//jump on negative
            jumpct CCNEG endProg
//get next array element
//myArray[i] -> r2
            move %r1(0) %r2
//add 3 to each element of array.
//3 + r2 -> r2
            add const %r2
//modified element -> myArray[i]
            move %r2 %r1(0)
//update ptr address
            add #1 %r1
            jumpa startLoop
endProg:    halt
!textEnd
!stackStart 0xE0
!stackEnd

Programming Fragments Examples

Notes: SP is mapped into r3. Here are some examples using different addressing modes. A literal cannot be a destination.
op1 register-direct:
add %r1 %r1            // r1 + r1 -> r1
add %r1 varx           // r1 + M[varx] -> M[varx]

op1 literal:
add #20 %r1            // 20 + r1 -> r1
add #200 vary          // 200 + M[vary] -> M[vary]
move #myArray %r2      // address of array -> r2

op2 mm-direct:
add varx vary          // M[varx] + M[vary] -> M[vary]
add 200 vary           // M[200] + M[vary] -> M[vary]
add #varx vary         // address of varx + M[vary] -> M[vary]

op2 register-indirect with offset:
add %r1(0) %r2(10)     // M[r1 + 0] + M[r2 + 10] -> M[r2 + 10]

//jsr, jumpct, jumpcf, jumpa use "jump addressing mode"
//jump addressing mode: the address can be specified by a label
//or by a numeric address (with no # symbol).
jsr 200                // means push return address, and 200 -> PC
jsr #200               // assembler error
jsr startSub           // means push return address, and startSub -> PC
jumpa op1              // op1 jump-mode
jumpct op1 op2         // op2 jump-mode
jumpcf op1 op2         // op2 jump-mode
//For jumpct, jumpcf, op1 can be any of the four basic addressing modes.

In/out
in op1                 // op1 can be reg-direct, mm-direct, reg-indirect with offset.
out op1                // op1 can be any of the four basic addressing modes.

Push/pop
pop op1                // op1 can be reg-direct, mm-direct, reg-indirect with offset.
push op1               // op1 can be any of the four basic addressing modes.

Some clarifications
move arr1 %r0          // M[arr1] -> %r0, mem mode
move 200 %r0           // M[200] -> %r0, mem mode
move #200 %r0          // 200 -> %r0, lit mode
move #arr1 %r0         // arr1 -> %r0, lit mode

Control flow instructions:
jumpa 200              // 200 -> PC, jump mode
jumpa loop             // loop -> PC, jump mode
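The difference between the four basic addressing modes in the fragments above can be modelled with a short Python sketch. It is an illustration only and not the ID3E implementation: the function `read_operand`, the dictionary used as memory, and the sample addresses are all hypothetical, and wrapping the effective address to 16 bits is an assumption.

```python
def read_operand(mode, regs, mem, reg=0, extra=0):
    """Return the value of a source operand.

    mode  -- "rg", "mm", "lit", or "rg(off)"
    regs  -- list holding r0..r3
    mem   -- mapping from word address to word value
    reg   -- register number for "rg" and "rg(off)"
    extra -- the extra instruction word: address, literal, or offset
    """
    if mode == "rg":
        return regs[reg]              # data is in the register
    if mode == "lit":
        return extra & 0xFFFF         # data is in the instruction
    if mode == "mm":
        ea = extra                    # ea = address
    elif mode == "rg(off)":
        ea = (regs[reg] + extra) & 0xFFFF  # ea = r + offset (16-bit wrap assumed)
    else:
        raise ValueError("unknown addressing mode: " + mode)
    return mem[ea]                    # data is in MM[ea]


regs = [0, 0x81, 0, 0]                # r1 points into a small array
mem = {0x80: 11, 0x81: 12, 200: 7}

print(read_operand("mm", regs, mem, extra=200))              # like: move 200 %r0
print(read_operand("lit", regs, mem, extra=200))             # like: move #200 %r0
print(read_operand("rg(off)", regs, mem, reg=1, extra=-1))   # like: move %r1(-1) %r0
```

In this model `move 200 %r0` reads the word stored at address 200, while `move #200 %r0` yields the number 200 itself, which is the distinction the "Some clarifications" fragments draw.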