
TempleOS programs in Linux user-space, part 3: Stranger in a strange land

10 May 2020

Last time, we took apart the kernel and looked at some binary formats. We will now put our gained knowledge to good use and build something that can be run.

Application Binary Interface

Before we can bring our blessed code into the impure Linux-land, we need to talk a bit about how programs fit within the operating system.

An Application Binary Interface (ABI) encompasses the following (and more):

To bridge the two worlds, we need to adapt all relevant aspects of the ABI. However, in this post we will focus only on the most essential parts to get off the ground, ignoring floating-point concerns, vararg functions and exceptions, for example.

TempleOS

In the interest of simplicity, TempleOS doesn’t attempt to adhere to any pre-existing standard. This makes it interesting from a study perspective, because there is not much need for legacy cruft (ahem, aside from the x86-64 architecture itself). Programs are preferably compiled Just-in-Time, but Ahead-of-Time compilation is also possible and will be our focus from now on. The BIN format that is used for AOT compilation has been described in the previous post, but we will quickly recap the most important points:

The following calling convention is observed in HolyC code and when interacting with it:

Comparison of the two calling conventions

Linux

Linux uses what is known as the System V Application Binary Interface (commonly just SysV ABI). It is specified in the x86-64 psABI document.

SysV ABI uses an executable format called ELF. It is extremely versatile, and for this reason also quite complex and difficult to grasp at first. However, the fundamental principle is not different from other common formats – namely, you have a big blob of code annotated with headers, sections and other data structures that help hook everything up, and sometimes reveal more than would be strictly needed just to run the program.

In terms of calling convention, the rules are as follows:

Hello world

Now is a good time to bring out your homework from last time. In case you slacked off, you can use the following program, which we will compile, dissect, adapt and re-assemble together.

Our minimal program, Example.HC, looks like this:

PutS("Hello world\n");

That’s it! Unlike Print, PutS has a very straightforward calling convention, so it’s perfect for our simple example. We can compile the program like this:

Cmp("Example");

which will produce Example.BIN.Z.

Dissection

Now we will look at the executable from a few different angles. I will use Linux for this, because it gives us a lot of tools for the job. This means that we first have to somehow extract Example.BIN from the TempleOS system.

Having done that, let’s start by simply hex-dumping the contents of the file, to see if there is anything obviously interesting.

$ xxd Example.BIN
00000000: eb1e 0000 544f 5342 ffff ffff ffff ff7f  ....TOSB........
00000010: 3800 0000 0000 0000 6000 0000 0000 0000  8.......`.......
00000020: 680b 0000 00e8 0000 0000 c348 656c 6c6f  h..........Hello
00000030: 2077 6f72 6c64 0a00 1401 0000 0000 0100   world..........
00000040: 0000 1900 0000 0000 0806 0000 0050 7574  .............Put
00000050: 5300 0000 0000 0000 0000 0000 0000 0000  S...............

Swell. We can see the TOSB signature, our string, and also the name of the called function. Presumably, somewhere in the middle of all this is executable code.
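The first 32 bytes of the dump can also be decoded by hand. The following is only a sketch: it assumes the header layout suggested by the dump itself (a 2-byte jmp, one byte holding the alignment as a power of two, one reserved byte, the 4-byte signature, then three little-endian 64-bit fields), and the names BIN_HEADER and parse_bin_header are made up for this example.

```python
import struct

# First 32 bytes of Example.BIN, copied from the hex dump above.
HEADER = bytes.fromhex(
    "eb1e0000544f5342ffffffffffffff7f"
    "38000000000000006000000000000000"
)

# Assumed layout: jmp (2 bytes), alignment exponent (1), reserved (1),
# signature (4), then org, patch_table_offset and file_size (8 bytes each).
BIN_HEADER = struct.Struct("<2sBB4sqqq")


def parse_bin_header(raw):
    """Decode the fixed-size BIN header into a dictionary."""
    jmp, align_bits, _reserved, signature, org, patch_off, size = \
        BIN_HEADER.unpack(raw[:BIN_HEADER.size])
    if signature != b"TOSB":
        raise ValueError("not a TempleOS BIN file")
    return {
        "jmp": jmp.hex(),
        "alignment": 1 << align_bits,   # assumption: stored as a bit count
        "org": org,
        "patch_table_offset": patch_off,
        "file_size": size,
    }


header = parse_bin_header(HEADER)
print(header)
```

The two offsets that matter most later on are patch_table_offset and file_size; everything between the end of the header and the patch table is the program image.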

Next, we make use of the bininfo tool that was introduced last time.

$ bininfo.py Example.BIN
BIN header:
    jmp                 [EB 1E]h
    alignment           1 byte(s)
    org                 7FFFFFFFFFFFFFFF (9223372036854775807)
    patch_table_offset  0000000000000038 (56)
    file_size           0000000000000060 (96)

Patch table:
  entry IET_ABS_ADDR ""
    at        1h
  entry IET_MAIN ""
    main function @        0h
  entry IET_REL_I32 "PutS"
    at        6h

This tells us the following:

Now that we have gathered some meta-data, let’s look at the actual code. First we have to untangle it from the BIN file, though. Observe that the patch table starts at position 56 in the file, and the BIN header is 32 bytes in length. That leaves 24 bytes of program code in between. Let’s extract it:

dd skip=32 count=24 if=Example.BIN of=Example.text bs=1
objdump -b binary -m i386:x86-64 -D -M intel Example.text

Some explanation is probably in order. objdump, which comes from binutils, is a versatile tool for analyzing and disassembling object files. It supports many mainstream formats, including ELF, PE and Mach-O, but unfortunately has no support for TempleOS BIN files. What it can do, however, is to disassemble flat binaries, and this is good enough for us. The first 2 options set up this mode of operation:

The remaining arguments tell objdump to -Disassemble using Intel syntax (-M intel), followed by the name of our image.

The output should look more or less like the following:

   0:	68 0b 00 00 00       	push   0xb
   5:	e8 00 00 00 00       	call   0xa
   a:	c3                   	ret  
   b:	48                   	rex.W
   c:	65 6c                	gs ins BYTE PTR es:[rdi],dx
   e:	6c                   	ins    BYTE PTR es:[rdi],dx
   f:	6f                   	outs   dx,DWORD PTR ds:[rsi]
  10:	20 77 6f             	and    BYTE PTR [rdi+0x6f],dh
  13:	72 6c                	jb     0x81
  15:	64 0a 00             	or     al,BYTE PTR fs:[rax]

That’s a bunch of code for a simple print! What is going on? In fact, only the first 3 instructions are real code (our main function), while the rest correspond to the “Hello world” string. (objdump has no way of distinguishing between code and interleaved data!)
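If you don't trust the disassembler, the split between code and data can be checked with a few lines of Python. This is a sketch that embeds the 96 bytes from the hex dump instead of reading the file; the boundary at 0xb is taken from the operand of the push instruction, and the variable names are my own.

```python
import struct

# Example.BIN, copied byte for byte from the hex dump (96 bytes).
BIN = bytes.fromhex(
    "eb1e0000544f5342ffffffffffffff7f"
    "38000000000000006000000000000000"
    "680b000000e800000000c348656c6c6f"
    "20776f726c640a001401000000000100"
    "00001900000000000806000000507574"
    "53000000000000000000000000000000"
)

HEADER_SIZE = 32

# patch_table_offset is the 64-bit little-endian field at offset 16.
(patch_table_offset,) = struct.unpack_from("<q", BIN, 16)

# Everything between the header and the patch table is the program image.
image = BIN[HEADER_SIZE:patch_table_offset]

# The push instruction refers to offset 0xb, which is where the string starts.
code, data = image[:0x0B], image[0x0B:]

print(len(image), "bytes in the image")
print("code:", code.hex(" "))
print("data:", data)
```

The 11 bytes of code end with c3 (ret), and the remaining 13 bytes are the NUL-terminated string.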

In case you are not fluent in x86 assembly, a quick explanation of what the code does:

Now the fun part: let’s cross-reference the binary code with the patch table!
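To make the cross-referencing tangible, here is a sketch of what a loader would have to do with the two patch entries reported by bininfo. The load address and the address of PutS are hypothetical values picked for illustration, and apply_patches is not a function of any real tool.

```python
import struct

# The 24-byte image (code followed by the string), as extracted above.
IMAGE = bytes.fromhex("680b000000e800000000c3") + b"Hello world\n\x00"


def apply_patches(image, base, puts_addr):
    """Patch a copy of the image as if it were loaded at address `base`."""
    out = bytearray(image)

    # IET_ABS_ADDR at 1h: the 32-bit operand of push holds an offset into
    # the image, so the load address has to be added to it.
    (value,) = struct.unpack_from("<I", out, 0x1)
    struct.pack_into("<I", out, 0x1, value + base)

    # IET_REL_I32 "PutS" at 6h: the operand of call is relative to the
    # address of the next instruction, i.e. the end of the 4-byte field.
    struct.pack_into("<i", out, 0x6, puts_addr - (base + 0x6 + 4))

    return bytes(out)


# Hypothetical addresses, chosen only for illustration.
patched = apply_patches(IMAGE, base=0x10000, puts_addr=0x20000)
print(patched[:11].hex(" "))
```

After patching, the push operand points at the string inside the loaded image and the call lands on PutS, which is all the patch table has to achieve for this program.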

Transforming relocations

Where TempleOS gets by with one universal table, ELF needs at least two structures: a symbol table and a relocation table. The symbol table contains the names of all imported and exported symbols; the relocation table contains relocations (duh), and its entries may refer to symbol table entries in order to use symbol addresses in relocation calculations. Keep the psABI document at hand, because it also documents the applicable relocation types. To speak in concrete terms:
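One plausible mapping for our two patch entries, sketched in Python. The relocation numbers and formulas (S + A and S + A - P) come from the psABI; the choice of R_X86_64_32 and R_X86_64_PC32, the ".text" symbol name and the helper functions are my assumptions and not necessarily what any particular converter emits.

```python
# Relocation type numbers from the x86-64 psABI.
R_X86_64_PC32 = 2    # value = S + A - P
R_X86_64_32 = 10     # value = S + A


def translate(patch_type, offset, name, field_value):
    """Map one BIN patch entry to (offset, type, symbol, addend)."""
    if patch_type == "IET_ABS_ADDR":
        # The field holds an image-relative address: keep it as the addend
        # against a symbol that stands for the start of the code.
        return (offset, R_X86_64_32, ".text", field_value)
    if patch_type == "IET_REL_I32":
        # P is the address of the field itself, but the CPU measures the
        # displacement from the end of the field, hence the addend of -4.
        return (offset, R_X86_64_PC32, name, -4)
    raise ValueError(f"unhandled patch type: {patch_type}")


def resolve(reloc, symbols, section_base):
    """Compute the value a linker would store for one relocation."""
    offset, rtype, symbol, addend = reloc
    s, p = symbols[symbol], section_base + offset
    return s + addend - p if rtype == R_X86_64_PC32 else s + addend


relocs = [
    translate("IET_ABS_ADDR", 0x1, "", 0xB),
    translate("IET_REL_I32", 0x6, "PutS", 0),
]

# Hypothetical final addresses, for illustration only.
symbols = {".text": 0x10000, "PutS": 0x20000}
values = [resolve(r, symbols, section_base=0x10000) for r in relocs]
print([hex(v) for v in values])
```

With these made-up addresses, the linker's arithmetic yields the same two values that a direct application of the patch table would.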

Adapting calling conventions

How to deal with the discrepancy in the calling conventions? We could:

Clearly, you know where this is going.

The task is somewhat complicated by the fact that we always need to know the number of arguments to write a correct thunk – for example, while we could just speculatively copy up to N arguments from the stack into registers to call from HolyC to Linux, the calling code will expect the arguments to be popped from the stack by the called function. Our C function will not do this, so it has to be handled by the thunk. This is not a big deal when writing code by hand, but since we are lazy, we will want to auto-generate as much as possible. Whatever tool we end up using, somehow it will need to be aware of the function signatures – something that is notably not encoded in the BIN file.

A thunk from HolyC to Linux

Let’s now try to write such a translation thunk. Presume that we have changed all symbols in HolyC code to add a distinguishing suffix; for example, PutS becomes PutS$HolyC. Now we want to write a function that will handle HolyC calls to PutS and dispatch them to a function written in C.

First of all, we have to save all registers that HolyC expects to be preserved.

PutS$HolyC:
    push rsi
    push rdi
    push r10
    push r11

At entry, the return address is at the top of the stack and the sole argument sits right after it, at rsp + 8. However, our register preservation effort has shifted the stack by 4 × 8 = 32 bytes, so the argument is now at rsp + 40. C code expects it in RDI (discussed above), so we need to mov it there:

    mov rdi, qword ptr [rsp+40]

(If we had further arguments, we would continue with

    mov rsi, qword ptr [rsp+48]
    mov rdx, qword ptr [rsp+56]
    ...

but that is not the case for PutS)

Registers saved, arguments prepared… time to jump into the native function.

    call PutS

Now a lot of magic happens, but eventually, PutS should return back to the call site. What now?

We don’t care about the return value, but if we did, HolyC and SysV conveniently agree on placing it in RAX. What we do need to do, though, is restore the previously saved registers:

    pop r11
    pop r10
    pop rdi
    pop rsi

And finally, we need to pop the argument(s) pushed by HolyC code, and return to it. Since we only have one argument, we will be dropping 8 bytes from the stack (after the return address):

    ret 8

And there we go! Writing a thunk in the opposite direction is conveniently left as an exercise for the reader.
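Since the thunk depends only on the function name and the number of arguments, generating it is mechanical. Below is a rough sketch of such a generator; make_thunk is a made-up helper that simply reproduces the hand-written thunk above, and it only handles up to six integer or pointer arguments.

```python
SAVED = ["rsi", "rdi", "r10", "r11"]                  # registers the thunk preserves
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]   # SysV integer argument registers


def make_thunk(name, nargs):
    """Return assembly text for a HolyC-to-SysV thunk calling `name`."""
    if not 0 <= nargs <= len(ARG_REGS):
        raise ValueError("only 0..6 integer arguments are handled")

    # The first argument sits above the saved registers and the return address.
    first_arg = 8 * (len(SAVED) + 1)

    lines = [f"{name}$HolyC:"]
    lines += [f"    push {reg}" for reg in SAVED]
    lines += [f"    mov {reg}, qword ptr [rsp+{first_arg + 8 * i}]"
              for i, reg in enumerate(ARG_REGS[:nargs])]
    lines.append(f"    call {name}")
    lines += [f"    pop {reg}" for reg in reversed(SAVED)]
    # The callee pops its own arguments in HolyC, so the thunk does it here.
    lines.append(f"    ret {8 * nargs}" if nargs else "    ret")
    return "\n".join(lines)


print(make_thunk("PutS", 1))
```

The argument count still has to come from somewhere, which is exactly the piece of information the BIN file does not carry.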

Converting object files

We have tackled the 2 main interfacing problems, and what is left now is writing a bunch of code to automate the process. Despair not, for it has been done already. Enter bin2elf.

Invoking

bin2elf.py -h

shows that there are several options that we can specify. We will only need some of them:

The complete invocation will then look like this:

bin2elf.py --import-defs ExampleImportDefs.HH \
           --export-defs ExampleExportDefs.HH \
           --export-main HCMain \
           --thunks-out Example.thunks.s \
           -o Example.o \
           Example.BIN

Example.o is now a standard relocatable ELF file, and can be inspected using the usual tools (including objdump). However, due to a limitation of the library used by bin2elf, the file is in ELF32 format. This is perfectly fine for us, since TempleOS only permits code in the lower 2 GiB, but GCC, for example, will not accept ELF32 objects with 64-bit code. To remedy this, we need to do one more conversion:

objcopy -I elf32-x86-64 -O elf64-x86-64 Example.o Example.o64
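If you want to double-check that the conversion did what it should, the ELF class is stored in the fifth byte of the file (EI_CLASS: 1 for ELF32, 2 for ELF64). A small sketch follows; elf_class is a made-up helper, and the demonstration uses hand-written identification bytes rather than the real object files.

```python
def elf_class(ident):
    """Return 32 or 64 based on the EI_CLASS byte of an ELF header."""
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {1: 32, 2: 64}[ident[4]]


# Synthetic identification bytes standing in for the two object files.
print(elf_class(b"\x7fELF\x01"))
print(elf_class(b"\x7fELF\x02"))
```

To use it on the real thing, read the first 5 bytes of Example.o or Example.o64 in binary mode and pass them to elf_class.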

Now we have everything ready on the HolyC side. Let’s write some C!

#include <stdio.h>

void HCMain(void);

void PutS(const char* message) {
    printf("%s", message);
}

int main(int argc, char** argv) {
    HCMain();
}

(save as example.c, or whatever)

We have 3 components to put together:

Conveniently, we can compile & link everything together with a single line:

gcc -o example example.c Example.o64 Example.thunks.s

And if you feel lucky:

$ ./example 
Hello world

Congratulations, you are now a heretic!

More could be said, but we will end here for the time being.

As homework, you are invited to find and report (or fix!) a bug in bin2elf.