Suppose we want to link this assembly file, rv16.S:
.global _start
_start:
la a0, foobar_ptr
lhu a0, (a0)
1:
j 1b
foobar_ptr:
.hword foobar
.section .bss
.global foobar
foobar:
.word 0
Using this linker script, rv16.ld:
OUTPUT_ARCH( "riscv" )
ENTRY(_start)
SECTIONS
{
. = 0x0;
.text : { *(.text) *(.text.*) }
.rodata : { *(.srodata*) *(.rodata) *(.rodata.*) }
.data : { *(.sdata*) *(.data) *(.data.*) }
.bss : { *(.sbss*) *(.bss) *(.bss.*) }
}
The command line would look something like this:
riscv32-unknown-elf-gcc -c rv16.S
riscv32-unknown-elf-ld -T rv16.ld rv16.o -o rv16.elf
But the assembler reports an error like:
rv16.S:9: Error: cannot represent BFD_RELOC_16 relocation in object file
The assembler is upset about this line:
.hword foobar
This is because we have asked it to create a 16-bit relocation for a
pointer, and it doesn’t know how to do that, or refuses to. This is a
reasonable type of relocation to emit if your binary is linked in the
lower 64 kB of the address space, and you can save a lot of storage for
pointer literals if you don’t store each of them with 16 bonus bits that
are always zero. For example, SoC boot ROMs are usually small,
statically linked at a known address, and extremely sensitive to static
code size. This is easy to do with the arm-none-eabi GNU
toolchain (you just get a linker error if the pointer loses bits due to
truncation), but impossible with RISC-V GNU and LLVM toolchains.
As far as I can tell a suitable relocation does exist in the RISC-V
ELF PSABI: a normal 32-bit data relocation .word foobar
would be emitted as an R_RISCV_32, and here we could just
use an R_RISCV_ADD16 with an initial value of 0. I asked a
friend who is an expert on the RISC-V lld backend and he
mumbled something about linker relaxation while staring into the
distance, his eyes clouded and unfocused. I left him to his
contemplation and figured out a way to make the linker do what I want
using brute force.
gas will happily emit 16-bit relocations for a
difference of two symbols, using a matched pair of
R_RISCV_ADD16 and R_RISCV_SUB16
relocations.
.global _start
_start:
la a0, foobar_ptr
lhu a0, (a0)
1:
j 1b
foobar_ptr:
.hword foobar - wibble
.section .bss.foobar
.global foobar
foobar:
.word 0
.section .bss.wibble
.global wibble
wibble:
.word 0
Assembling and then disassembling with relocations:
luke@cube tmp % riscv32-unknown-elf-gcc -c rv16.S && riscv32-unknown-elf-objdump -dr rv16.o
rv16.o: file format elf32-littleriscv
Disassembly of section .text:
...
00000010 <foobar_ptr>:
10: 0000 .short 0x0000
10: R_RISCV_ADD16 foobar
10: R_RISCV_SUB16 wibble
So the problem is reduced to: “how can we use relative relocations to assemble absolute ones?” My hack is to add this to the linker script:
SECTIONS
{
...
PROVIDE(__opaque_zero_symbol = 0);
}
And modify 16-bit pointers in the source like so:
foobar_ptr:
.hword foobar - __opaque_zero_symbol
This now assembles and links as expected. The relocations in the object file are:
00000010 <foobar_ptr>:
10: 0000 .short 0x0000
10: R_RISCV_ADD16 foobar
10: R_RISCV_SUB16 __opaque_zero_symbol
Subtracting zero is a no-op, so this is effectively an absolute relocation.
⇥ Return to wren.wtf