Source: https://ashw.io/blog/arm64-sysreg-lib/1
Hi there, in this post I’ll be introducing a new project I’ve been working on in my spare time over the last few weeks: Arm64 System Register Library (abbrev. arm64-sysreg-lib), a new header-only C library for reading/writing 64-bit Arm system registers. This first post introduces the library itself, including how to download it and begin using it in your own projects, then future posts will dive into the technical details of how the library actually works “under the hood”.
Key features of the library include:
Bit structs defined for 263 system registers
Safe values defined for all writeable registers, with all currently/previously RES1 bits set to 1 and all currently/previously RES0 bits cleared to 0
Accessors for read, unsafe write, safe write, and read-modify-write sequences defined according to register accessibility
All library calls optimise down to an average of between one and four A64 assembly instructions and are inlined with no branches and no static storage
Support for both gcc and clang, each with -Wall -Wextra -pedantic -Werror flags and supporting both -std=c99 and -std=c11
Automatically generated by parsing the AArch64 System Register XML provided with Arm’s A-Profile CPU Architecture Exploration Tools
Open source project hosted on GitHub with permissive MIT license
All system registers define a union of the form union <reg>, which defines a ._ member for raw access to the underlying register value and an anonymous struct for manipulation of the register’s constituent bit fields.
Example:
union sctlr_el1
{
u64 _;
struct
{
u64 m : 1;
u64 a : 1;
u64 c : 1;
u64 sa : 1;
...
};
};
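To make the layout concrete, here is a minimal host-side sketch of such a union. It is not the generated header: the name sctlr_sketch and the rest padding field are hypothetical, only the four low fields from the example above are modelled, and it assumes a compiler such as gcc or clang on a little-endian target that allocates bit fields starting from bit [0] and accepts 64-bit bit-field types.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Toy stand-in for a generated register union (hypothetical name). */
union sctlr_sketch
{
    u64 _;                /* raw 64-bit register value */
    struct
    {
        u64 m    : 1;     /* bit  [0] */
        u64 a    : 1;     /* bit  [1] */
        u64 c    : 1;     /* bit  [2] */
        u64 sa   : 1;     /* bit  [3] */
        u64 rest : 60;    /* bits [63:4], hypothetical padding, not modelled */
    };
};

/* The union must overlay exactly one 64-bit register. */
static_assert(sizeof(union sctlr_sketch) == sizeof(u64),
              "register union must be 64 bits wide");

/* Set two fields by name, then read the result back through the raw view. */
u64 sketch_fields_to_raw(void)
{
    union sctlr_sketch val = { .m = 1, .c = 1 };
    return val._;
}
```

Under those assumptions, sketch_fields_to_raw() returns 0x5, i.e. bits [0] and [2] set and everything else zero.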
All writeable registers define a safe value of the form <reg>_SAFEVAL that has all currently or previously RES1 fields set to 1 and all currently or previously RES0 fields cleared to 0.
Example:
static const union sctlr_el3 SCTLR_EL3_SAFEVAL =
{
.res1_5_4 = 3,
.eos = 1,
.res1_16 = 1,
.res1_18 = 1,
.eis = 1,
.res1_23 = 1,
.res1_29_28 = 3,
};
Some of these fields are currently RES1 bits, such as .res1_5_4 for bits [5:4] and .res1_16 for bit [16]. Setting these to 1 in the safe value ensures portability when running on future CPU implementations where those bits have been repurposed into new fields: in these cases a value of 1 will give the old behaviour while a value of 0 will give the new behaviour, and we don’t want to inadvertently enable the new behaviour on those future implementations by clearing these bits.
This is why the .eos and .eis fields are also set to 1; these fields were previously RES1 bits but were repurposed into new fields in later revisions of the architecture. The user can choose to clear these to 0 explicitly if they want the new behaviour, but the library defaults to setting them to 1 in the safe value and, by extension, the safe_write_<reg>() convenience macros discussed later.
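As a rough cross-check, the raw value encoded by the SCTLR_EL3_SAFEVAL initialiser above can be reproduced with plain shifts. This is only a sketch: the SKETCH_* names are hypothetical, and the bit positions used for .eos (bit [11]) and .eis (bit [22]) are taken from the Arm architecture documentation rather than from the listing above, so treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: place a single bit or a small value at a bit offset. */
#define SKETCH_BIT(n)          (UINT64_C(1) << (n))
#define SKETCH_FIELD(val, lsb) ((uint64_t)(val) << (lsb))

/* One term per initialiser in SCTLR_EL3_SAFEVAL. */
#define SKETCH_SCTLR_EL3_SAFE_RAW                                        \
    ( SKETCH_FIELD(3, 4)    /* .res1_5_4   = 3, bits [5:4]            */ \
    | SKETCH_BIT(11)        /* .eos        = 1, assumed bit [11]      */ \
    | SKETCH_BIT(16)        /* .res1_16    = 1                        */ \
    | SKETCH_BIT(18)        /* .res1_18    = 1                        */ \
    | SKETCH_BIT(22)        /* .eis        = 1, assumed bit [22]      */ \
    | SKETCH_BIT(23)        /* .res1_23    = 1                        */ \
    | SKETCH_FIELD(3, 28) ) /* .res1_29_28 = 3, bits [29:28]          */

/* Compile-time check of the arithmetic. */
static_assert(SKETCH_SCTLR_EL3_SAFE_RAW == UINT64_C(0x30C50830),
              "safe value should have only the listed bits set");

uint64_t sketch_sctlr_el3_safe_raw(void)
{
    return SKETCH_SCTLR_EL3_SAFE_RAW;
}
```

Clearing .eos or .eis for the new behaviour would simply drop the corresponding term from the expression.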
All readable system registers define a read accessor of the form static inline union <reg> read_<reg>( void ), which reads the current value of the system register into its corresponding union.
Example:
/* C code */
#include "sysreg/mpidr_el1.h"
u64 foo( void )
{
return read_mpidr_el1().aff0;
}
/* Compiler output */
mrs x8, mpidr_el1
and x0, x8, #0xff
ret
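The field extraction in that example can be modelled on a host machine without touching a real system register. In the sketch below the raw value is passed in as a parameter instead of coming from an MRS; the names mpidr_sketch, rest, and sketch_aff0_from_raw are hypothetical, and the same bit-field layout assumptions apply (gcc or clang, little-endian, fields allocated from bit [0]).

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Toy model of the MPIDR_EL1 union: only Aff0 is named. */
union mpidr_sketch
{
    u64 _;                /* raw 64-bit register value */
    struct
    {
        u64 aff0 : 8;     /* bits [7:0], matches the #0xff mask above */
        u64 rest : 56;    /* bits [63:8], hypothetical padding, not modelled */
    };
};

static_assert(sizeof(union mpidr_sketch) == sizeof(u64),
              "register union must be 64 bits wide");

/* On a real target the raw value would come from the read accessor. */
u64 sketch_aff0_from_raw(u64 raw)
{
    union mpidr_sketch val = { ._ = raw };
    return val.aff0;
}
```

For instance, a raw value of 0x80000103 gives an Aff0 of 0x03, which is the same result as the AND with #0xff in the compiler output.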
All writeable system registers define a write accessor of the form static inline void unsafe_write_<reg>( union <reg> val ), which directly writes val into the system register. These functions are prefixed unsafe_ to emphasise the fact that they have no provision for helping to ensure that any currently or previously RES1 fields are set to 1; when using these functions it is the responsibility of the programmer to ensure that any such bits are set appropriately so as to ensure both correct behaviour and future portability. For this reason, it is recommended that you instead use safe_write_<reg>() wherever possible, or use the register’s safe value as a basis for constructing the value passed to unsafe_write_<reg>().
Example of unsafely writing constituent fields:
/* C code */
#include "sysreg/sctlr_el1.h"
void foo( void )
{
union sctlr_el1 val = { .m=1, .c=1, .i=1 };
unsafe_write_sctlr_el1(val);
}
/* Compiler output */
mov w8, #0x1005 // Danger! No RES1 bits set! See safe_write_<reg>()
msr sctlr_el1, x8
ret
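To see what is missing from 0x1005, the value can be compared against the set of currently or previously RES1 bits. The sketch below is not part of the library: the names are hypothetical, and the mask is inferred from the safe_write_sctlr_el1() output shown later in this post (0x30d01985) with the .m, .c, and .i bits (0x1005) removed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SCTLR_EL1 RES1 mask: 0x30d01985 minus the requested 0x1005 bits. */
#define SKETCH_SCTLR_EL1_RES1_MASK UINT64_C(0x30d00980)

/* Sanity check: the requested fields are not themselves RES1 bits. */
static_assert((SKETCH_SCTLR_EL1_RES1_MASK & UINT64_C(0x1005)) == 0,
              "requested fields must not overlap the RES1 mask");

/* Returns the RES1 bits that a raw value fails to set (0 means none missing). */
uint64_t sketch_missing_res1_bits(uint64_t raw)
{
    return SKETCH_SCTLR_EL1_RES1_MASK & ~raw;
}
```

With this mask, 0x1005 is missing every RES1 bit, while 0x30d01985 is missing none.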
Example of unsafely writing the register’s raw value:
/* C code */
#include "sysreg/sctlr_el1.h"
void foo( u64 raw )
{
union sctlr_el1 val = { ._ = raw };
unsafe_write_sctlr_el1(val);
}
/* Compiler output */
msr sctlr_el1, x0
ret
Writeable system registers define a convenience macro of the form safe_write_<reg>( … ), and system registers that are both readable and writeable define a convenience macro of the form read_modify_write_<reg>( … ).
These convenience macros are one of the main highlights of the library and allow for powerful manipulation of 64-bit system registers while still optimising down to only a handful of A64 assembly instructions with no branches and no static storage.
As mentioned earlier, the unsafe_write_<reg>() function does not account for currently or previously RES1 bits, which may lead to portability issues if the programmer does not correctly set these bits based on their desired behaviour. The safe_write_<reg>() convenience macro solves this by allowing you to set a variadic list of fields, with any unspecified fields defaulting to their current/previous RES value, i.e. all currently or previously RES1 fields not specified in the variadic list will be set to 1.
Example:
/* C code */
#include "sysreg/sctlr_el1.h"
void foo( void )
{
safe_write_sctlr_el1( .m=1, .c=1, .i=1 );
}
/* compiler output */
mov w8, #0x1985
movk w8, #0x30d0, lsl #16
msr sctlr_el1, x8
ret
Note how, while we only specified .m=1, .c=1, and .i=1, the value written to SCTLR_EL1 also has all currently or previously RES1 fields set to 1, such as .itd=1, .sed=1, .eos=1, etc.
Repeating the same, but this time explicitly clearing one of those currently or previously RES1 fields to 0 in the variadic list:
/* in C */
#include "sysreg/sctlr_el1.h"
void foo( void )
{
safe_write_sctlr_el1( .m=1, .c=1, .i=1, .itd=0 );
}
/* compiler output */
mov w8, #0x1905
movk w8, #0x30d0, lsl #16
msr sctlr_el1, x8
ret
Here we can see bit [7] corresponding to .itd in the first MOV has been cleared; the value moved into w8 is now 0x1905 vs 0x1985 in the earlier example.
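One way a macro with this "defaults plus overrides" behaviour could be built is with a compound literal whose designated initialisers list the defaults first and the caller's fields last, since in C a later initialiser for the same field overrides an earlier one. The sketch below is an illustration of that idea and not necessarily how the library implements it. All names are hypothetical, only the low 13 bits of SCTLR_EL1 are modelled (so the results match the low half-word of the two examples above), the usual gcc/clang little-endian bit-field layout is assumed, and the pragma is there because gcc and clang may otherwise warn about the overridden initialiser.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Toy model of the low bits of SCTLR_EL1; pad_* and rest are hypothetical. */
union sctlr_low_sketch
{
    u64 _;                    /* raw 64-bit register value */
    struct
    {
        u64 m        : 1;     /* bit  [0]     */
        u64 a        : 1;     /* bit  [1]     */
        u64 c        : 1;     /* bit  [2]     */
        u64 sa       : 1;     /* bit  [3]     */
        u64 pad_6_4  : 3;     /* bits [6:4], not modelled   */
        u64 itd      : 1;     /* bit  [7]     */
        u64 sed      : 1;     /* bit  [8]     */
        u64 pad_10_9 : 2;     /* bits [10:9], not modelled  */
        u64 eos      : 1;     /* bit  [11]    */
        u64 i        : 1;     /* bit  [12]    */
        u64 rest     : 51;    /* bits [63:13], not modelled */
    };
};

static_assert(sizeof(union sctlr_low_sketch) == sizeof(u64),
              "register union must be 64 bits wide");

/* Overriding an earlier initialiser is intentional here. */
#pragma GCC diagnostic ignored "-Woverride-init"

/* Defaults first, caller-supplied fields last so that they win. */
#define SKETCH_SAFE_VALUE(...) \
    ((union sctlr_low_sketch){ .itd = 1, .sed = 1, .eos = 1, __VA_ARGS__ })

/* Only .m, .c and .i requested: the defaults stay at 1. */
u64 sketch_default_value(void)
{
    return SKETCH_SAFE_VALUE( .m=1, .c=1, .i=1 )._;
}

/* Same again, but the caller explicitly clears .itd. */
u64 sketch_itd_cleared_value(void)
{
    return SKETCH_SAFE_VALUE( .m=1, .c=1, .i=1, .itd=0 )._;
}
```

Under those assumptions the first function returns 0x1985 and the second 0x1905, matching the immediate of the first MOV in each example.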
The read_modify_write_<reg>() convenience macro works in a similar way, but instead reads the current value of a system register, overwrites a variadic list of fields, then writes the result back to the system register. The key point is that any fields not specified in the variadic list are left untouched.
Example:
/* in C */
#include "sysreg/sctlr_el1.h"
void foo( void )
{
read_modify_write_sctlr_el1( .m=1, .c=1, .i=1 );
}
/* compiler output */
mrs x8, sctlr_el1
mov w9, #0x1005
orr x8, x8, x9
msr sctlr_el1, x8
ret
Repeating this, but clearing a previously RES1 field with .itd=0:
/* in C */
#include "sysreg/sctlr_el1.h"
void foo( void )
{
read_modify_write_sctlr_el1( .m=1, .c=1, .i=1, .itd=0 );
}
/* compiler output */
mrs x8, sctlr_el1
and x8, x8, #0xffffffffffffff7f
mov w9, #0x1005
orr x8, x8, x9
msr sctlr_el1, x8
ret
Note how the .itd field is cleared using an AND instruction before the other specified fields are ORR’d in. I’ll dive into the technical details of how the variadic macro is able to do that in a future blog post.
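The arithmetic performed by that AND/ORR sequence can be modelled on a host with two masks: one for the fields to clear and one for the fields to set. This sketch only mirrors the compiler output above and says nothing about how the macro derives the masks; the names and the example constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Masks for the example above: clear .itd (bit [7]); set .m, .c and .i. */
#define SKETCH_CLEAR_MASK UINT64_C(0x80)
#define SKETCH_SET_MASK   UINT64_C(0x1005)

/* A field cannot be both cleared and set in the same update. */
static_assert((SKETCH_CLEAR_MASK & SKETCH_SET_MASK) == 0,
              "clear and set masks must not overlap");

/* AND out the cleared fields, then ORR in the set fields; all other
 * bits of the current value pass through unchanged. */
uint64_t sketch_read_modify_write(uint64_t current,
                                  uint64_t clear_mask,
                                  uint64_t set_mask)
{
    return (current & ~clear_mask) | set_mask;
}
```

For example, starting from a current value of 0x30d00980 with the masks above gives 0x30d01905: bit [7] is cleared, bits [0], [2], and [12] are set, and the remaining bits are left as they were.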
Note that when you clone the library from GitHub, it will already have been built for you using the June 2020 (SysReg_xml_v86A-2020-06) release of the AArch64 System Register XML. You can simply add -I/path/to/arm64-sysreg-lib/include to your compiler flags to begin using the library in your own projects straight away. You can also run the run-tests.py script to build the compilation tests using your chosen compiler.
Alternatively, follow the instructions below to build the library yourself.
The prerequisites to build the library are:
Python 3.8+
Beautiful Soup 4 (pip3.8 install beautifulsoup4)
First, download and extract the AArch64 System Register XML from the Arm A-Profile CPU architecture exploration tools page. You can do this manually, or instead use curl:
$ curl -O https://developer.arm.com/-/media/developer/products/architecture/armv8-a-architecture/2020-06/SysReg_xml_v86A-2020-06.tar.gz
$ tar xf SysReg_xml_v86A-2020-06.tar.gz
Then run the provided run-build.py script, pointing it at the extracted XML:
$ python3.8 run-build.py /path/to/SysReg_xml_v86A-2020-06
To test building the generated C headers with your chosen compiler, run the run-tests.py script:
$ python3.8 run-tests.py [--keep] COMPILER_PATH [COMPILER_FLAGS]
For example:
$ python3.8 run-tests.py /path/to/aarch64-none-elf-gcc
It is assumed the compiler uses the same flags/switches as gcc and clang, and the script always invokes the compiler with the following flags:
-Wall -Wextra -pedantic -Werror
You may pass additional flags to the run-tests.py script, which will be passed in turn to the compiler, for example:
$ python3.8 run-tests.py /path/to/aarch64-none-elf-gcc -std=c99 -O3
If no -std flag is provided, the script defaults to -std=c11.
If no -O flag is provided, the script defaults to -O2.
If the compiler path contains the substring clang and no --target flag is provided, the script defaults to --target=aarch64-none-elf.
By default the script will clean up the generated .o object files after finishing the test run; pass the --keep flag before the compiler path if you wish to keep these files.
At time of writing this library is still in development and is not yet able to parse all files included in the AArch64 System Register XML.
Of the 485 AArch64-* files included in the SysReg_xml_v86A-2020-06 release that actually describe system register encodings:
126 are skipped as they correspond to instructions that use the system register encoding space (ic, dc, tlbi, at, cfp, cpp, and dvp)
263 successfully build
96 fail to parse
Of the 96 that fail to parse:
43 fail as their fields vary, such as when a bit in another register is set to 1
19 fail due to being an arrayed register (<n>)
23 fail due to having arrayed fields (<m>, <n>, or <x>)
11 fail due to having variable length fields
Support for these registers will be added in future releases.
Not counted above are external system registers such as GICD_* and GICR_*, support for which will also be added later.
That’s all for now — hope you found it interesting! Please feel free to download the library from GitHub and experiment with using it in your own projects. Any feedback on usability and requests for future functionality would be greatly appreciated! Let me know if you find it useful. Also, keep an eye out for future blog posts in this series where I’ll dive into the technical details of how the library is generated and how features like the variadic convenience macros actually work.