Understanding Low-Level Assembly Programming Fundamentals

Delve into the world of low-level assembly programming, exploring concepts like registers, memory, instruction sets, and function call implementations. Discover how assembly code interacts with hardware and the operating system to create efficient programs.

  • Assembly Programming
  • Low-Level Programming
  • Registers
  • Memory
  • Operating System


Presentation Transcript


  1. Low level Programming

  2. Assembly basics
     What makes up assembly code?
       • Instructions: architecture specific
       • Operands: registers, memory (specified as an address), immediates
       • Conventions: rules of the road and/or behavior models

  3. Registers
     General purpose
       • 16 bit: AX, BX, CX, DX, SI, DI
       • 32 bit: EAX, EBX, ECX, EDX, ESI, EDI
       • 64 bit: RAX, RBX, RCX, RDX, RSI, RDI + others
     Environmental
       • RSP (stack pointer), RIP (instruction pointer)
       • RBP = frame pointer, defines local scope
     Special uses
       • Calling conventions
           RAX == return code
           RDI, RSI, RDX, RCX == ordered arguments
       • Hardware defined: some instructions implicitly use specific registers
           RSI/RDI: string instructions
           RBP: leaveq

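  4. Memory
     x86 provides complex memory addressing capabilities
     (In AT&T syntax, $ marks an immediate value, so addresses and displacements are written without it.)
       • Absolute addressing (address given as a constant): mov %rsi, 0xfff000
       • Register-indirect addressing: mov %rsi, (%rbp)
       • Offset addressing (base + displacement): mov %rsi, 0x8(%rax)
       • Base + (Index * Scale) + Displacement, a.k.a. SIB
           Occasionally seen, hardly ever written by hand
           movl %ebp, (%rdi,%rsi,4)      Address = rdi + rsi * 4
       • A more complicated example, the general form: segment:disp(base, index, scale)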
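
To make the base + index * scale form concrete, here is a minimal sketch that loads one array element with a single SIB-addressed instruction. It assumes GCC or Clang on x86-64; the name `load_scaled` and its parameters are hypothetical, and the `#else` branch is only a portable stand-in for other targets.

```c
#include <stdint.h>

/* Hypothetical helper: returns table[index] using (base,index,scale) addressing. */
int32_t load_scaled(const int32_t *table, uint64_t index)
{
    int32_t value;
#if defined(__x86_64__)
    /* Effective address = table + index * 4; the scale of 4 is sizeof(int32_t). */
    __asm__ ("movl (%1,%2,4), %0"
             : "=r"(value)              /* %0: destination register */
             : "r"(table), "r"(index)   /* %1: base, %2: index      */
             : "memory");               /* asm reads memory not named as an operand */
#else
    value = table[index];               /* portable stand-in on other architectures */
#endif
    return value;
}
```

Calling `load_scaled(t, 2)` on an array `t` of four 32-bit integers reads the element 8 bytes past the base, which is what the C expression `t[2]` compiles to as well.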

  5. 8/16/32/64 bit operands
     Programmer explicitly specifies the operand length with a suffix on the instruction
     Example: mov reg, reg
       • 8 bits:  movb %al, %bl
       • 16 bits: movw %ax, %bx
       • 32 bits: movl %eax, %ebx
       • 64 bits: movq %rax, %rbx
     What about movl %ebx, (%rdi)?
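
One way to see what the suffix controls is to store zero into a 64-bit slot filled with ones and check which bytes change. The sketch below assumes GCC or Clang on x86-64 (little-endian); `store_with_suffix` is a hypothetical name, and the `%b`, `%w`, `%k`, `%q` operand modifiers select the 8, 16, 32 and 64 bit name of the register the compiler picked.

```c
#include <stdint.h>

/* Hypothetical demo: overwrite the low `width` bytes of an all-ones slot with zero. */
uint64_t store_with_suffix(int width)
{
    uint64_t slot = 0xFFFFFFFFFFFFFFFFULL;
    uint64_t zero = 0;
#if defined(__x86_64__)
    switch (width) {
    case 1: __asm__ ("movb %b1, %0" : "+m"(slot) : "q"(zero)); break; /* 1 byte  */
    case 2: __asm__ ("movw %w1, %0" : "+m"(slot) : "r"(zero)); break; /* 2 bytes */
    case 4: __asm__ ("movl %k1, %0" : "+m"(slot) : "r"(zero)); break; /* 4 bytes */
    case 8: __asm__ ("movq %q1, %0" : "+m"(slot) : "r"(zero)); break; /* 8 bytes */
    }
#else
    /* Portable stand-in: clear the same low-order bytes by masking. */
    if (width == 8)
        slot = 0;
    else if (width == 1 || width == 2 || width == 4)
        slot &= ~((1ULL << (8 * width)) - 1);
#endif
    return slot;
}
```

With a memory destination, only the bytes named by the suffix are written, so `store_with_suffix(4)` leaves the upper four bytes untouched. This also suggests an answer to the slide's question: in `movl %ebx, (%rdi)` the 64-bit `%rdi` only supplies the address, while the `l` suffix and `%ebx` make it a 4-byte store.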

  6. Function call implementation
     We can now decode what is going on here

         int foo(int arg1, char * arg2) {
             return 0;
         }

         0000000000000107 <foo>:
          107: 55                   push   %rbp
          108: 48 89 e5             mov    %rsp,%rbp
          10b: 89 7d fc             mov    %edi,-0x4(%rbp)
          10e: 48 89 75 f0          mov    %rsi,-0x10(%rbp)
          112: b8 00 00 00 00       mov    $0x0,%eax
          117: c9                   leaveq
          118: c3                   retq

       • Location: address of function + ret instruction
       • Arguments: passed in registers (which ones? And why those?)
       • Return code: stored in register EAX

  7. OS development requires assembly programming
     OS operations are not typically expressible in a higher-level language
       • Examples: atomic operations, page table management, configuring segments, system calls(!)
     How to mix assembly with OS code (in C)
       • Assemble separately and link with the C code: .S files assembled with gas
       • Inline with compiler support: .c files compiled with gcc

  8. Implementing assembler functions
       • C functions: location, args, return code
       • ASM functions: location only
     Programmer must implement everything else
       • Arguments, context, return values
       • Everything in foo() from before + function body
     Programmer takes the place of the compiler
       • Must match calling conventions

  9. Calling assembler functions
     Programmer implements the calling convention
       • Behaves just like a regular function
     Only need location
       • Linker takes care of the rest
       • .globl defines a global symbol

     main.c:
         extern int foo(int, char *);

         int main() {
             int x = foo(1, "test");
         }

     foo.S:
         .globl foo
         foo:
             push %rbp
             mov  %rsp, %rbp
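
As a self-contained illustration of a hand-written function that C code calls only by its symbol, the sketch below places the assembly in a top-level `asm` block of the same C file rather than in a separate .S file (a substitution made so the example fits in one file). It assumes GCC or Clang on x86-64 Linux and the System V calling convention; `add_pair` is a hypothetical name.

```c
/* The C side only knows the symbol name and signature. */
extern int add_pair(int a, int b);

#if defined(__x86_64__) && defined(__linux__)
/* Top-level (basic) asm: registers are written with a single %. */
__asm__ (
    ".pushsection .text\n"
    ".globl add_pair\n"
    ".type add_pair, @function\n"
    "add_pair:\n"
    "    push %rbp\n"          /* prologue: save caller's frame pointer      */
    "    mov  %rsp, %rbp\n"
    "    mov  %edi, %eax\n"    /* first argument arrives in EDI              */
    "    add  %esi, %eax\n"    /* second argument in ESI; result goes in EAX */
    "    pop  %rbp\n"          /* epilogue                                   */
    "    ret\n"
    ".size add_pair, .-add_pair\n"
    ".popsection\n"
);
#else
/* Portable stand-in so the example still builds on other targets. */
int add_pair(int a, int b) { return a + b; }
#endif
```

From the caller's point of view `add_pair(2, 40)` is an ordinary function call; everything the compiler would normally generate (prologue, argument handling, return value, epilogue) is written by hand here.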

  10. Inline
      OS only needs a few full-blown assembly functions
        • Context switches, interrupt handling, a few others
      Most of the time it just needs to execute a single instruction
        • e.g. set a bit in a control register
      GCC provides the ability to incorporate inline assembly instructions into a regular .c file
        • Not a function
        • Compiler handles argument marshaling

  11. Overview
      Inline assembly includes 2 components
        • Assembly code
        • Compiler directives for operand marshaling

          asm ( assembler template
              : output operands               /* optional */
              : input operands                /* optional */
              : list of clobbered registers   /* optional */
              );

  12. Inline assembly execution
      Sequence of individual assembly instructions
        • Can execute any hardware instruction
        • Can reference any register or memory location
        • Can reference specified variables in C code
      3 stages of execution
        1. Load C variables into the correct registers or memory
        2. Execute assembly instructions
        3. Copy register and memory contents into C variables

  13. Specifying inline operands
      How does the compiler copy C variables to/from registers?
        • C variables and registers are explicitly linked in the asm specification
        • Sections for input and output operands
        • Compiler handles copying to and from variables before and after the assembly is executed
        • Assembly code references marshaled values (by index of operand) instead of raw registers
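
The operand marshaling described above can be sketched with a one-instruction example. It assumes GCC or Clang on x86-64; `triple` is a hypothetical name. The template never names a raw register: `%0` and `%1` are the operand indices, and the compiler decides which registers hold `result` and `x`.

```c
/* Hypothetical helper: computes 3 * x with a single lea instruction. */
long triple(long x)
{
    long result;
#if defined(__x86_64__)
    __asm__ ("leaq (%1,%1,2), %0"   /* result = x + x * 2                  */
             : "=r"(result)         /* output operand %0: any register     */
             : "r"(x));             /* input operand %1: any register      */
#else
    result = 3 * x;                 /* portable stand-in on other targets  */
#endif
    return result;
}
```

The three stages map directly onto this function: the compiler first moves `x` into the register it chose for `%1`, then the `leaq` runs, and finally the register chosen for `%0` is copied into `result`.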

  14. Operand Codes
      Wide range of operand codes ("constraints") are available
        • Input:  "code"(c-variable)
        • Output: "=code"(c-variable)
      Explicit register codes
        a = %rax, %eax, %ax
        b = %rbx, %ebx, %bx
        c = %rcx, %ecx, %cx
        d = %rdx, %edx, %dx
        S = %rsi, %esi, %si
        D = %rdi, %edi, %di
      Other operand codes
        r = any register
        q = a, b, c, d regs
        m = memory operand
        f = floating point reg
        i = immediate
        g = anything
      And many more.
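
Explicit register codes matter most for instructions with hardware-defined operands. As a sketch, the one-operand `mull` instruction always multiplies by EAX and writes the 64-bit product to EDX:EAX, so the `a` and `d` codes are used to tie C variables to those registers. This assumes GCC or Clang on x86-64; `mul_wide32` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper: full 64-bit product of two 32-bit values. */
uint64_t mul_wide32(uint32_t x, uint32_t y)
{
#if defined(__x86_64__)
    uint32_t lo, hi;
    __asm__ ("mull %3"               /* EDX:EAX = EAX * operand           */
             : "=a"(lo), "=d"(hi)    /* outputs pinned to EAX and EDX     */
             : "a"(x), "r"(y)        /* x must start in EAX; y anywhere   */
             : "cc");                /* mull modifies the flags           */
    return ((uint64_t)hi << 32) | lo;
#else
    return (uint64_t)x * y;          /* portable stand-in                 */
#endif
}
```

Here the constraint letters do the work that would otherwise require explicit `mov` instructions and a clobber entry for each fixed register.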

  15. Register example

          int foo(int arg1, char * arg2) {
              int a=10, b;
              asm ("movl %1, %%ecx;\n"
                   "movl %%ecx, %0;\n"
                   : "=b"(b)    /* output */
                   : "a"(a)     /* input */
                   : );
              return 0;
          }

          0000000000000107 <foo>:
           107: 55                      push   %rbp
           108: 48 89 e5                mov    %rsp,%rbp
           10b: 53                      push   %rbx
           10c: 89 7d e4                mov    %edi,-0x1c(%rbp)
           10f: 48 89 75 d8             mov    %rsi,-0x28(%rbp)
           113: c7 45 f0 0a 00 00 00    movl   $0xa,-0x10(%rbp)
           11a: 8b 45 f0                mov    -0x10(%rbp),%eax
           11d: 89 c1                   mov    %eax,%ecx
           11f: 89 cb                   mov    %ecx,%ebx
           121: 89 d8                   mov    %ebx,%eax
           123: 89 45 f4                mov    %eax,-0xc(%rbp)
           126: b8 00 00 00 00          mov    $0x0,%eax
           12b: 5b                      pop    %rbx
           12c: c9                      leaveq
           12d: c3                      retq

      What does this do?

  16. Memory example
      x86 can also use memory (SIB, etc.) operands
        • "m" operand code

          int foo(int arg1, char * arg2) {
              int a=10, b;
              asm ("movl %1, %%ecx;\n"
                   "movl %%ecx, %0;\n"
                   : "=m"(b)
                   : "m"(a)
                   : );
              return 0;
          }

          0000000000000107 <foo>:
             0: 55                      push   %rbp
             1: 48 89 e5                mov    %rsp,%rbp
             4: 89 7d ec                mov    %edi,-0x14(%rbp)
             7: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
             b: c7 45 fc 0a 00 00 00    movl   $0xa,-0x4(%rbp)
            12: 8b 4d fc                mov    -0x4(%rbp),%ecx
            15: 89 4d f8                mov    %ecx,-0x8(%rbp)
            18: b8 00 00 00 00          mov    $0x0,%eax
            1d: c9                      leaveq
            1e: c3                      retq

  17. Input/output operands
      Sometimes input and output operands are the same variable
        • Transform the input variable in some way

          int foo(int arg1, char * arg2) {
              int a=10, b=5;
              asm ("addl %1, %0;\n"
                   : "=r"(b)
                   : "m"(a), "0"(b)
                   : );
              return 0;
          }

          0000000000000107 <foo>:
             0: 55                      push   %rbp
             1: 48 89 e5                mov    %rsp,%rbp
             4: 89 7d ec                mov    %edi,-0x14(%rbp)
             7: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
             b: c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%rbp)
            12: c7 45 fc 05 00 00 00    movl   $0x5,-0x4(%rbp)
            19: 8b 45 fc                mov    -0x4(%rbp),%eax
            1c: 03 45 f8                add    -0x8(%rbp),%eax
            1f: 89 45 fc                mov    %eax,-0x4(%rbp)
            22: b8 00 00 00 00          mov    $0x0,%eax
            27: c9                      leaveq
            28: c3                      retq

  18. Input/output operands (2)
      Input/output operands can also be specified with "+"

          int foo(int arg1, char * arg2) {
              int a=10, b=5;
              asm ("addl %1, %0;\n"
                   : "+r"(b)
                   : "m"(a)
                   : );
              return 0;
          }

          0000000000000107 <foo>:
             0: 55                      push   %rbp
             1: 48 89 e5                mov    %rsp,%rbp
             4: 89 7d ec                mov    %edi,-0x14(%rbp)
             7: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
             b: c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%rbp)
            12: c7 45 fc 05 00 00 00    movl   $0x5,-0x4(%rbp)
            19: 8b 45 fc                mov    -0x4(%rbp),%eax
            1c: 03 45 f8                add    -0x8(%rbp),%eax
            1f: 89 45 fc                mov    %eax,-0x4(%rbp)
            22: b8 00 00 00 00          mov    $0x0,%eax
            27: c9                      leaveq
            28: c3                      retq
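
A runnable variant of the read-write operand idea, written as a function that returns the result so the effect is observable. It is a sketch assuming GCC or Clang on x86-64; `add_in_place` is a hypothetical name, and the `"cc"` clobber is added here to record that `addl` changes the flags.

```c
/* Hypothetical helper: b is both read and written by the asm ("+r"). */
int add_in_place(int a, int b)
{
#if defined(__x86_64__)
    __asm__ ("addl %1, %0"
             : "+r"(b)      /* %0: input and output, kept in a register */
             : "m"(a)       /* %1: read directly from memory            */
             : "cc");       /* addl updates the condition flags         */
#else
    b += a;                 /* portable stand-in on other targets       */
#endif
    return b;
}
```

The `"+r"` form and the `"=r"` plus `"0"` form from the previous slide express the same thing: one register carries the old value of `b` into the instruction and the new value out of it.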

  19. Clobbered list
      We cheated earlier

          int foo(int arg1, char * arg2) {
              int a=10, b;
              asm ("movl %1, %%ecx;\n"
                   "movl %%ecx, %0;\n"
                   : "=m"(b)
                   : "m"(a)
                   : );
              return 0;
          }

      How does the compiler know to save/restore ECX? It doesn't
        • We must explicitly tell the compiler which registers have been implicitly messed with
        • In this case ECX, but other instructions have implicit operands (CHECK THE MANUALS)
      Second set of constraints to inline assembly
        • Clobber list: operands not used as either input or output but that still must be saved/restored by the compiler

  20. Why clobber list?
      Why do we need this?
        • Compilers try to optimize performance
        • They cache intermediate values and assume the values don't change
        • Compiler cannot inspect ASM behavior, which is outside the scope of the compiler
      Clobber lists tell the compiler:
        • "You cannot trust the contents of these resources after this point"
        • Or "Do not perform optimizations that span this block on these resources"

  21. Using clobber lists

          int foo(int arg1, char * arg2) {
              int a=10, b;
              asm ("movl %1, %%ecx;\n"
                   "movl %%ecx, %0;\n"
                   : "=m"(b)
                   : "m"(a)
                   : "ecx", "memory" );
              return 0;
          }

      ECX is used implicitly so its value must be saved/restored
      What about "memory"?
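
The two kinds of clobber can be sketched separately. The first function names `ecx` because the template uses that register behind the compiler's back; the second uses an empty template with a `"memory"` clobber, which tells the compiler the asm may read or write memory that is not listed as an operand, so values cached in registers must be written back before it and reloaded after it. Both assume GCC or Clang (the first additionally x86-64); the function names are hypothetical.

```c
/* Hypothetical helper: copies a to b through ECX, declaring ECX as clobbered. */
int copy_through_ecx(int a)
{
    int b;
#if defined(__x86_64__)
    __asm__ volatile ("movl %1, %%ecx\n\t"
                      "movl %%ecx, %0"
                      : "=m"(b)
                      : "m"(a)
                      : "ecx");   /* compiler will not keep live data in ECX here */
#else
    b = a;                        /* portable stand-in on other targets */
#endif
    return b;
}

/* Hypothetical helper: emits no instructions, but the compiler may not move
   or cache memory accesses across it. */
void compiler_barrier(void)
{
    __asm__ volatile ("" ::: "memory");
}
```

Note that `copy_through_ecx` does not need `"memory"`: the only memory it touches is named in its `"m"` operands, so the compiler already knows about it.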

  22. Sneak Preview: The x86 is not atomic!
      A single CISC (x86) instruction:

          asm ("addl %1, %0\n" : "+m"(balance) : "r"(amount) : );

      is split by the HW decoder into load/store architecture micro-ops:

          Load  R1, balance
          Add   R1, amount
          Store R1, balance

      The x86 offers a special instruction mode that forces atomicity
        • Only for a single instruction!!
      Lock prefix: forces all micro-ops of a single instruction to execute atomically
        • Asserts a lock signal on the memory bus
        • Disallows other CPUs/cores from accessing the memory region

          asm ("lock <instr>" ::: );

  23. When can you use lock? Not all instructions support locked (atomic) operation You need to check the ISA manuals

  24. Lock Example
      CISC (x86) instruction:

          asm ("lock addl %1, %0\n" : "+m"(balance) : "r"(amount) : );

      Load/store architecture micro-ops produced by the HW decoder:

          Lock Mem/Cache
          Load  R1, balance
          Add   R1, amount
          Store R1, balance
          Unlock Mem/Cache

      Without lock:

          400564: 8b 45 f4       mov    -0xc(%rbp),%eax
          400567: 01 45 f0       add    %eax,-0x10(%rbp)

      With lock:

          400564: 8b 45 f4       mov    -0xc(%rbp),%eax
          400567: f0 01 45 f0    lock add %eax,-0x10(%rbp)
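
Wrapped in a function, the locked add from this slide might look like the sketch below. It assumes GCC or Clang on x86-64; `atomic_add` is a hypothetical name, the `"cc"` and `"memory"` clobbers are additions made here, and the `#else` branch falls back to the compiler's `__atomic_fetch_add` builtin on other targets.

```c
/* Hypothetical helper: atomically adds amount to *balance. */
void atomic_add(int *balance, int amount)
{
#if defined(__x86_64__)
    __asm__ volatile ("lock addl %1, %0"
                      : "+m"(*balance)      /* read-modify-write memory operand */
                      : "r"(amount)
                      : "cc", "memory");    /* flags change; also acts as a
                                               compiler-level barrier          */
#else
    __atomic_fetch_add(balance, amount, __ATOMIC_SEQ_CST);
#endif
}
```

The only difference from the non-atomic version is the `lock` prefix, which shows up as the extra `f0` byte in the disassembly above.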
