💾 Archived View for gem.librehacker.com › gemlog › tech › 20220202-0.gmi captured on 2022-03-01 at 15:18:17. Gemini links have been rewritten to link to archived content

Learning x86-64 Assembly

I'm learning how to code in x86-64 assembly, using the GNU assembler (AT&T syntax). My longer-term goal is to learn assembly for some microcontroller platforms, like AVR, but x86-64 seemed like a good place to start, since it is easy to set up my environment and there is lots of documentation.

I want to get low-level, and write code that is purely assembly and not just inline code in a C program. I'm not necessarily against calling external functions from a dynamic library, though, if I can do so with the standard ABI.

At the moment I'm just doing some exercises I've made up myself. I might look for some online programming challenges later. The most complicated program I have written so far is called "sine-ascii" and outputs asterisks to the terminal, following a sinusoidal curve.

/*
sine-ascii.s
gcc -pie -nostdlib sine-ascii.s -o sine-ascii
SPDX-FileCopyrightText: 2022 Christopher Howard <christopher@librehacker.com>
SPDX-License-Identifier: GPL-3.0-or-later
*/

        .global _start
        .data
/*
table generated with guile scheme:
(map (lambda (n)
        (inexact->exact
          (round (+ 20 (* 20 (sin (* (/ 6.28 24) n))))))) (iota 24 0))
*/
margin_table:
        .byte 20, 25, 30, 34, 37, 39, 40, 39
        .byte 37, 34, 30, 25, 20, 15, 10,  6
        .byte  3,  1,  0,  1,  3,  6, 10, 15
/* single byte buffer for printing */
char_buffer:
        .byte 0
        .text
/* print whatever character is in char_buffer */
print_char:
        mov     $1,%rax /* write syscall */
        mov     $1,%rdi /* file descriptor */
        leaq    char_buffer(%rip),%rsi /* buffer to write */
        mov     $1,%rdx /* number of bytes */
        syscall
        ret
/* print N spaces, where N is in rdi */
print_spaces:
        cmp     $0,%rdi
        je      1f
        dec     %rdi
        movb    $0x20,char_buffer(%rip)
        push    %rdi
        call    print_char
        pop     %rdi
        jmp     print_spaces
1:      ret
/* print asterisk and CRLF */
print_marker:
        movb    $0x2a,char_buffer(%rip)
        call    print_char
        movb    $0x0d,char_buffer(%rip)
        call    print_char
        movb    $0x0a,char_buffer(%rip)
        call    print_char
        ret
/*
loop through margin_table, printing margin_table[n] number of spaces,
with an asterisk and CRLF at the end of each line.
*/
table_loop:
        push    %rbx
        mov     $0,%rbx
1:      cmp     $24,%rbx
        je      2f
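        lea     margin_table(%rip),%rcx /* table base; reloaded each pass since syscall clobbers rcx */
        movzbq  (%rcx,%rbx,1),%rdi /* rdi = margin_table[rbx], zero-extended */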
        call    print_spaces
        call    print_marker
        inc     %rbx
        jmp     1b
2:      pop     %rbx
        ret
/* program entry point */
_start:
        call    table_loop
        mov     $0x3c,%rax /* exit syscall */
        mov     $0,%rdi /* exit code */
        syscall

Here is the program output:

$ ./sine-ascii 
                    *
                         *
                              *
                                  *
                                     *
                                       *
                                        *
                                       *
                                     *
                                  *
                              *
                         *
                    *
               *
          *
      *
   *
 *
*
 *
   *
      *
          *
               *

After first coding this program, I learned about position-independent code (PIC), which lets the operating system load a program at whatever address it chooses instead of one fixed at link time. So I recoded it to its current form, for compatibility with the -pie option (Position Independent Executable). The basic difference is that in PIC, all the memory references, including function calls, must be made relative to the position of the calling code rather than using an absolute memory address. On x86-64 this is RIP-relative addressing, as in char_buffer(%rip), where the offset is measured from the address of the next instruction rather than the current one. This allows the code to execute correctly regardless of what memory address it is loaded into.
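
As a rough illustration of that difference, here is a minimal sketch of a separate program. The file name pic-demo.s and the labels msg and msg_len are made up for this example and are not part of sine-ascii. It loads the address of a string in the RIP-relative form, and the comment shows the absolute form, which I would expect the linker to reject when building with -pie.

```
/*
pic-demo.s (hypothetical example, not part of sine-ascii)
gcc -pie -nostdlib pic-demo.s -o pic-demo
*/
        .global _start
        .data
msg:
        .ascii  "hello\n"
        .set    msg_len, . - msg /* length of msg in bytes */
        .text
_start:
        mov     $1,%rax /* write syscall */
        mov     $1,%rdi /* file descriptor (stdout) */
/*
Absolute form, not position independent:
        mov     $msg,%rsi
This embeds the link-time address of msg in the instruction.
*/
        lea     msg(%rip),%rsi /* address of msg, relative to the next instruction */
        mov     $msg_len,%rdx /* number of bytes */
        syscall
        mov     $0x3c,%rax /* exit syscall */
        mov     $0,%rdi /* exit code */
        syscall
```

It builds the same way as sine-ascii.s and should print hello followed by a newline.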

One interesting note about PIE, however: linked without -pie, the program above has no external dependencies (other than the Linux kernel system calls), but linking with -pie adds a dynamic dependency on ld-linux-x86-64.so.2, or the equivalent interpreter on your system.

These are some Web resources I found very helpful for x86-64 assembly programming:

https://www.felixcloutier.com/x86/

https://www.cs.uaf.edu/2017/fall/cs301/reference/x86_64.html

These articles about relocation and PIC are also quite interesting:

https://eli.thegreenplace.net/2011/08/25/load-time-relocation-of-shared-libraries/

https://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/

https://eli.thegreenplace.net/2011/11/11/position-independent-code-pic-in-shared-libraries-on-x64

The folks at the #asm Libera IRC channel (irc.libera.chat) are also very helpful.