Learning Assembly and C on the Intel 64 Architecture from the book Low-Level Programming
NOTICE: All the code in this repository works only on Mac OS X, not on Linux (unlike the book and the author's GitHub repo).
Assume that we have a program named `prog.asm`. The following steps compile, link, and run this program on Mac OS X.
- Compile: `nasm -f macho64 prog.asm`
  This produces the object file `prog.o`, ready for the linker.
- Link: `ld -macosx_version_min 10.7.0 -lSystem -o prog prog.o`
  This produces the executable file `prog`, ready to run.
- Run: `./prog`
NOTICE: If the `nasm` that ships with Mac OS X doesn't support the `macho64` format, use Homebrew to install the latest version of `nasm`: `brew install nasm`
A few `nasm` commands come in handy:
- version: `nasm -v`
- available formats: `nasm -hf`
- help: `nasm -h`
The system call number is passed in the register `rax` before the `syscall` instruction.
One important thing to keep in mind is that on Mac OS X we need to add `0x2000000` to the actual system call number before copying it into `rax`.
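As a minimal sketch of that rule, the program below does nothing except exit with status 42. It assumes that `exit` is system call number 1 in your kernel's `syscalls.master` and that the entry label is `start`; the file name and the `UNIX_CLASS` and `SYS_EXIT` names are made up for this illustration.

```nasm
; exit42.asm (hypothetical file name): exit with status 42
%define UNIX_CLASS 0x2000000    ; offset added to every Mac OS X system call number
%define SYS_EXIT   1            ; assumed number of exit in syscalls.master

global start

section .text
start:
    mov rax, UNIX_CLASS + SYS_EXIT  ; rax = 0x2000001
    mov rdi, 42                     ; exit status, passed as the first argument
    syscall                         ; exit does not return
```

If it is built with the compile and link steps above, running `./exit42` followed by `echo $?` should print 42.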
How to find a system call number?
- Check the kernel version on your Mac by typing the command `uname -v` in the terminal. On my machine the output looks like this:
    Linhs-MBP:~ linhngoc$ uname -v
    Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64
  In my case, the kernel version part is `xnu-3789.70.16`.
- Go to https://opensource.apple.com/source/xnu/xnu-3789.70.16/bsd/kern/syscalls.master.auto.html. Don't forget to replace the kernel version in the URL with yours.
  This page lists all the system calls available on Mac OS X. The leftmost column contains the system call numbers, and the rightmost column contains the function prototypes.
System call arguments are passed in the registers `rdi`, `rsi`, `rdx`, `r10`, `r8`, and `r9`, for the first through sixth arguments, respectively.
System calls take at most 6 arguments, and no argument is passed on the stack.
As always, the return value is stored in the register `rax`.
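The argument order can be sketched with a `write` call followed by `exit`. This is an illustration under assumptions, not code from the book: it assumes `write` is system call 4 and `exit` is system call 1 in your kernel's `syscalls.master`, and the file name and the `msg` and `msg_len` names are hypothetical.

```nasm
; hello.asm (hypothetical file name): write a message, then exit
global start

section .data
msg:    db "Hello, world!", 10      ; 10 is the newline character
msg_len equ $ - msg                 ; length of the message in bytes

section .text
start:
    mov rax, 0x2000004      ; write (assumed to be system call 4)
    mov rdi, 1              ; 1st argument: file descriptor 1 (stdout)
    lea rsi, [rel msg]      ; 2nd argument: address of the buffer
    mov rdx, msg_len        ; 3rd argument: number of bytes to write
    syscall                 ; on success, rax = number of bytes written

    mov rdi, rax            ; pass the return value on as the exit status
    mov rax, 0x2000001      ; exit (assumed to be system call 1)
    syscall
```

Because the return value of `write` is reused as the exit status, `echo $?` after running the program should print 14 (the length of the message) if the write succeeded.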
The `syscall` instruction changes `rcx` and `r11` (it stores the return address in `rcx` and the flags in `r11`), so if your program uses these two registers, make sure to save them before `syscall` and restore them afterwards.
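A minimal sketch of this save and restore, assuming `write` is system call 4, `exit` is system call 1, and the entry label is `start`: the loop counter lives in `rcx`, so it is pushed before each `syscall` and popped afterwards. The `dot` label is a name chosen for this example.

```nasm
; Print a dot three times, keeping the loop counter in rcx
global start

section .data
dot: db "."

section .text
start:
    mov rcx, 3              ; loop counter
.again:
    push rcx                ; save rcx: syscall overwrites it
    mov rax, 0x2000004      ; write (assumed to be system call 4)
    mov rdi, 1              ; stdout
    lea rsi, [rel dot]      ; address of the one-byte buffer
    mov rdx, 1              ; write one byte
    syscall
    pop rcx                 ; restore the counter
    dec rcx
    jnz .again              ; repeat until rcx reaches 0

    mov rax, 0x2000001      ; exit (assumed to be system call 1)
    xor edi, edi            ; exit status 0
    syscall
```

Without the `push`/`pop` pair, `rcx` would hold whatever `syscall` left in it, and the loop would no longer run three times.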
The callee-saved registers are the 7 registers `rbx`, `rbp`, `rsp`, and `r12`-`r15`. These registers must be restored by the procedure being called. So, if it needs to change them, it has to change them back before returning.
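As a sketch of that convention, the hypothetical procedure `sum_1_to_n` below uses `rbx` and `r12` as scratch space, so it pushes them on entry and pops them in reverse order before returning. The caller keeps a value in `rbx` across the call to show that it survives. The procedure name is invented for this example, and `exit` is assumed to be system call 1.

```nasm
global start

section .text

; sum_1_to_n (hypothetical helper): rdi = n, returns 1 + 2 + ... + n in rax
sum_1_to_n:
    push rbx                ; callee-saved: save before changing
    push r12
    xor ebx, ebx            ; rbx = running total
    mov r12, rdi            ; r12 = counter
.loop:
    test r12, r12
    jz .done
    add rbx, r12
    dec r12
    jmp .loop
.done:
    mov rax, rbx            ; return value goes in rax
    pop r12                 ; restore in reverse order of the pushes
    pop rbx
    ret

start:
    mov rbx, 7              ; value the caller expects to survive the call
    mov rdi, 4
    call sum_1_to_n         ; rax = 1 + 2 + 3 + 4 = 10
    add rax, rbx            ; rbx is still 7, so rax = 17
    mov rdi, rax            ; exit status
    mov rax, 0x2000001      ; exit (assumed to be system call 1)
    syscall
```

Running the program and then `echo $?` should print 17. If the procedure did not restore `rbx`, the caller would add the leftover total instead of 7.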
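On Linux we can declare a global entry label such as `global _start`, but things are a little different on Mac OS X: if we prefix the global label with an underscore, the linker will fail. I still don't quite understand why; a likely explanation is that `ld` looks for an entry symbol named `start` by default, so the label has to be declared as `global start`.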
To enable Intel assembly syntax in `lldb`, add the line `settings set target.x86-disassembly-flavor intel` to the `~/.lldbinit` file.
Let's assume that we have an executable file named `prog`. We can start looking around by invoking the command `lldb prog`. Some useful commands:
1. `b some_label`: sets a breakpoint at a label. For example, `b start` sets a breakpoint at the `start` label.
2. `run`: runs until a breakpoint is hit.
3. `n`: executes the next instruction.
4. `register read some_register`: reads the content of a register. For example, `register read rax` reads the content currently in `rax`.
5. `p $some_register`: for example, `p $rdi`. Similar to item 4 above, but instead of printing the content in hexadecimal format, this command prints it in a human-readable format, e.g. `(unsigned long) $0 = 6`.
6. `memory read --size [sz] --format [x|a|i|c|s] --count cnt $register_name`: reads `cnt` consecutive memory cells, each of size `sz` bytes (1 for 1 byte, etc.), starting at the address stored in `register_name` (which must be prefixed with a `$` sign, e.g. `$rdi`), and prints them as `x` (hexadecimal), `a` (address), `i` (instruction), `c` (char), or `s` (null-terminated string).
   Example 1: in C terms, `test_string` is a `char *` and it points to "abcdef".
       (lldb) register read rdi
       rdi = 0x0000000000002000 test_string
       (lldb) memory read --format s $rdi
       0x00002000: "abcdef"
   Example 2: read the first 3 characters of a null-terminated string.
       (lldb) memory read --size 1 --format c --count 3 $rdi
       0x00002000: abc
7. To do...
8. If you're familiar with `gdb`, you can refer to an lldb vs gdb command comparison for more information.