Implementation Examples

Three examples of assemblers for real machines are:
1. MASM assembler
2. SPARC assembler
3. AIX assembler

MASM Assembler

An x86 program views memory as a collection of segments. Each segment belongs to a particular class corresponding to its contents. The commonly used classes are:
1. CODE
2. DATA
3. CONST
4. STACK

During program execution, segments are addressed via the x86 segment registers. In most cases:
* Code segments are addressed using register CS.
* Stack segments are addressed using register SS.
* The loader automatically sets CS and SS when the program is loaded. CS is set to indicate the segment that contains the starting label specified by the END statement of the program.
* SS is set to indicate the last stack segment processed by the loader.
* Data segments are addressed using DS, ES, FS and GS.
* The programmer can explicitly specify the segment register to be used; otherwise the assembler selects one.
* By default the assembler assumes that all references to data segments use register DS. The following statement, with the assembler directive ASSUME, tells the assembler to assume that register ES indicates the segment DATASEG2:
    ASSUME ES:DATASEG2

* Thus any references to labels that are defined in DATASEG2 will be assembled using register ES.
* It is also possible to group several segments together and use ASSUME to associate a segment register with the group.
* ASSUME only informs the assembler; it does not load the register. The following instructions would set ES to indicate data segment DATASEG2:

    MOV AX,DATASEG2
    MOV ES,AX

* The BASE directive tells the SIC/XE assembler the contents of register B.
* Similarly, the ASSUME directive tells MASM the contents of a segment register.

Jump instructions are assembled in two ways:
1. Near Jump
2. Far Jump
Near Jump
* It is a jump to a target location in the same code segment.
* The assembled instruction for a near jump is 2 or 3 bytes long.

Far Jump
* It is a jump to a target location in a different code segment.
* The assembled instruction for a far jump is 5 bytes long.

Pass 1 of x86 assembler
* Pass 1 of an x86 assembler is more complex than Pass 1 of a SIC assembler, because operands must be analyzed in addition to operation codes: the length of an assembled instruction depends on its operands.

Segments of MASM
* A segment of a MASM source program can be written in more than one part.
* If a SEGMENT directive has the same name as a previously defined segment, it is treated as a continuation of that segment. The assembly process combines all the parts of a segment together.
* In this respect segments are similar to program blocks.
* The assembler handles references between segments.
* External references between separately assembled modules are handled by the linker.

MASM directives
* The MASM directive PUBLIC has a function similar to EXTDEF.
* The MASM directive EXTRN has a function similar to EXTREF.

SPARC Assembler

Sections
* A SPARC assembly language program is divided into units called sections.
* The assembler provides a set of predefined section names, such as the following:
    .TEXT
    .DATA
    .RODATA
    .BSS
* The programmer can switch between sections at any time in the source program by using assembler directives.
* The assembler maintains a separate location counter for each named section.

Similarity between sections and program blocks
* Each time the assembler switches to a different section, it also switches to the location counter associated with that section. In this way sections are similar to program blocks.

Difference between sections and program blocks
* References between different sections are resolved by the linker in the case of sections, and by the assembler in the case of program blocks.
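The per-section location counters described above can be illustrated with a small Python sketch. It is a toy model, not the SPARC assembler's actual data structures: the class name SectionCounters, its methods and the 4-byte item sizes are all assumptions made for illustration.

```python
class SectionCounters:
    """Toy model: one location counter per named section (hypothetical)."""

    def __init__(self):
        self.counters = {}   # section name -> next free offset
        self.current = None  # name of the active section
        self.symbols = {}    # label -> (section name, offset)

    def switch(self, name):
        # Switching sections also switches location counters; a section
        # that is re-entered continues where it left off.
        self.counters.setdefault(name, 0)
        self.current = name

    def emit(self, size, label=None):
        # Record the label (if any) at the current offset, then advance
        # the location counter of the active section only.
        if self.current is None:
            raise RuntimeError("no section selected")
        offset = self.counters[self.current]
        if label is not None:
            self.symbols[label] = (self.current, offset)
        self.counters[self.current] = offset + size
        return offset


asm = SectionCounters()
asm.switch(".TEXT")
asm.emit(4, "start")
asm.emit(4)
asm.switch(".DATA")
asm.emit(4, "count")
asm.switch(".TEXT")   # resumes where .TEXT left off, not at 0
asm.emit(4, "loop")
```

Each label is recorded relative to its own section, which is why references that cross sections are left for the linker to resolve.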
Symbols used in the program
* Local symbol
* Global symbol
* Weak symbol

Object file of SPARC
* The object file written by the SPARC assembler contains translated versions of the segments of the program and a list of relocation and linking operations that need to be performed.
* The object program also includes a symbol table that describes the global symbols, weak symbols and section names.

Delayed branch
* SPARC assembly language branch instructions are delayed branches.
* The instruction immediately following a branch instruction (the delay slot) is actually executed before the branch is taken.
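The effect of a delayed branch on execution order can be traced with a short Python sketch. This is a toy machine with a single delay slot, not a SPARC simulator: the instruction tuples, the function name trace_delayed_branch and the use of list indices as addresses are assumptions made for illustration.

```python
def trace_delayed_branch(program, max_steps=20):
    """Return the order in which instruction indices execute.

    Toy model with one delay slot: a branch changes the instruction
    that runs after the NEXT one, so the delay-slot instruction
    still executes before control reaches the target.
    """
    trace = []
    pc, next_pc = 0, 1
    while pc < len(program) and len(trace) < max_steps:
        op, arg = program[pc]
        trace.append(pc)
        following = next_pc + 1      # default: keep running sequentially
        if op == "branch":
            following = arg          # takes effect after the delay slot
        pc, next_pc = next_pc, following
    return trace


program = [
    ("op", None),      # 0
    ("branch", 4),     # 1: branch to instruction 4
    ("op", None),      # 2: delay slot, still executed
    ("op", None),      # 3: skipped
    ("op", None),      # 4: branch target
]
order = trace_delayed_branch(program)
```

In this trace instruction 2 runs after the branch at 1 and before the target at 4, while instruction 3 is never reached.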
AIX Assembler

The AIX assembler supports various models of PowerPC microprocessors as well as machines that implement the original POWER architecture.

.MACHINE assembler directive
* The programmer can declare which architecture is being used with the assembler directive .MACHINE.
* A PowerPC program that contains only instructions that are also in the original POWER architecture would be executable on either type of system.

Base register
* PowerPC load and store instructions use a base register and a displacement value to specify an address in memory. Any general-purpose register except GPR0 can be used as a base register.
* Decisions about which registers to use are left to the programmer.
* The programmer specifies which registers are available for use as base registers, and the contents of these registers, with the .USING assembler directive. Thus the statements

    .USING LENGTH,1
    .USING BUFFER,4

  would identify GPR1 and GPR4 as base registers.
* GPR1 would contain the address of LENGTH.
* GPR4 would contain the address of BUFFER.
* If a base register is to be used later for some other purpose, the programmer uses the .DROP statement, which indicates that the register is no longer available for addressing purposes.

Selection of base register
* The assembler keeps a table of the base registers that are currently available and their contents. For each instruction whose operand is an address in memory, the assembler scans this table to find a base register that can be used to address that operand.
* If more than one register can be used to address the operand, the assembler selects the base register that results in the smallest signed displacement. If no suitable base register is available, the instruction cannot be assembled.
* The AIX assembler language also allows the programmer to write base registers and displacements explicitly in the source program.
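The base-register selection rule above can be sketched in Python. This is a minimal illustration rather than the AIX assembler's implementation: the class UsingTable and its methods are hypothetical, "smallest signed displacement" is interpreted here as smallest magnitude, and the displacement is assumed to be limited to a 16-bit signed field.

```python
class UsingTable:
    """Hypothetical table of available base registers (.USING / .DROP)."""

    def __init__(self):
        self.bases = {}  # register number -> address it is assumed to hold

    def using(self, address, reg):
        if reg == 0:
            raise ValueError("GPR0 cannot be used as a base register")
        self.bases[reg] = address

    def drop(self, reg):
        self.bases.pop(reg, None)

    def select(self, target):
        # Keep only registers whose displacement fits in the assumed
        # 16-bit signed field, then pick the smallest magnitude
        # (ties broken by lower register number).
        candidates = [
            (target - base, reg)
            for reg, base in self.bases.items()
            if -32768 <= target - base <= 32767
        ]
        if not candidates:
            raise ValueError("no suitable base register; cannot assemble")
        disp, reg = min(candidates, key=lambda c: (abs(c[0]), c[1]))
        return reg, disp


table = UsingTable()
table.using(0x1000, 1)   # like .USING LENGTH,1 with LENGTH at 0x1000
table.using(0x2000, 4)   # like .USING BUFFER,4 with BUFFER at 0x2000
near_buffer = table.select(0x1FF0)   # GPR4 is closer than GPR1
near_length = table.select(0x1004)
table.drop(4)
after_drop = table.select(0x1FF0)    # only GPR1 remains
```

The addresses 0x1000 and 0x2000 are invented for the example; in a real assembly they would come from the symbol table.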
Dummy control section
* The AIX assembler provides a special type of control section called a dummy section. Data items included in a dummy section do not actually become part of the object program; they serve only to define labels within the section.
* Dummy sections are most commonly used to describe the layout of a record or table that is defined externally.
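The idea that a dummy section defines labels without contributing bytes can be shown with a small Python sketch. The function name dummy_section_labels and the record fields are hypothetical, chosen only to illustrate how a record layout turns into label offsets.

```python
def dummy_section_labels(fields):
    """Map each field label to its offset within the record.

    Only offsets are computed; no data bytes are produced, which
    mirrors how a dummy section adds nothing to the object program.
    """
    labels = {}
    offset = 0
    for name, size in fields:
        labels[name] = offset
        offset += size
    return labels, offset


# Hypothetical layout of an externally defined record.
record = [("NAME", 20), ("AGE", 4), ("SALARY", 8)]
labels, record_length = dummy_section_labels(record)
```

The resulting offsets can then be combined with a base register that points at the actual record at run time.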
Table of Contents (TOC)
* Using assembler directives, the programmer can create a table of contents (TOC) for the assembled program.
* The TOC contains the addresses of control sections and global symbols defined within the control sections.

The two passes of the AIX assembler

The AIX assembler itself has a two-pass structure.

Pass 1
* The first pass of the AIX assembler writes a listing file that contains warnings and error messages.
* If errors are found during the first pass, the assembler terminates and does not continue to the second pass. If no errors are detected during the first pass, the assembler proceeds to Pass 2.

Pass 2
* The second pass reads the source program again, instead of using an intermediate file. This means that the location counter values must be recalculated during Pass 2.
* Any warning messages that were generated during Pass 1 but were not serious enough to terminate the assembly are lost.
* The assembled control sections are placed into the object program.

Relocation and Linking
* Relocation and linking operations are specified by entries in a relocation table, which is similar to the Modification records for SIC.
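How a relocation table drives relocation can be sketched in Python. This is a deliberately simplified toy format, not the actual AIX object file layout: here each relocation entry is assumed to be just the index of a 32-bit word that holds an address relative to the start of the program, and the function name relocate is hypothetical.

```python
def relocate(words, relocation_table, load_address):
    """Return a copy of the object words adjusted for the load address.

    Each entry in relocation_table names one word to be modified,
    in the same spirit as a Modification record.
    """
    relocated = list(words)
    for index in relocation_table:
        relocated[index] = (relocated[index] + load_address) & 0xFFFFFFFF
    return relocated


object_words = [0x00000010, 0x12345678, 0x00000004]
relocation_table = [0, 2]   # words 0 and 2 hold relative addresses
loaded = relocate(object_words, relocation_table, 0x5000)
```

Words not listed in the table, such as the constant at index 1, are left unchanged.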