• Bazel for ARM

    Embedded software is usually easy to build. Any vendor will provide some sort of build system that “just works” with their SDK.

    You might be able to get started quickly, but maintaining a project and, most importantly, getting someone else up and running with your toolchain progressively becomes a nightmare.

    Even when a vendor uses standard Makefiles, the build process usually depends on tons of environment variables, user-settable parameters, and system dependencies. You can kiss goodbye to reproducible firmware builds.

    After some experimentation with Docker and other build systems, I found Bazel:

    “Build and test software of any size, quickly and reliably”

    Sounds pretty amazing, right? Not so fast: while Bazel is packed with functionality, getting it to do what you want is not easy.

    Setting up a custom toolchain

    To get started with a Bazel embedded project we’ll set up the different components of the toolchain.

    Folder structure

    We’ll start by creating a project with the following structure:

    
    ├── WORKSPACE
    │
    ├── project
    │   ├── BUILD.bazel
    │   └── /* SOURCE CODE */
    │
    └── toolchain
        ├── BUILD.bazel
        ├── compiler.BUILD
        ├── config.bzl
        └── arm-none-eabi
            ├── darwin
            │   └── /* DARWIN TOOLCHAIN  */
            ├── linux
            │   └── /* LINUX TOOLCHAIN   */
            ├── windows
            │   └── /* WINDOWS TOOLCHAIN */
            └── ...
                └── /* OTHER TOOLCHAIN   */
    

    For simplicity, from now on we’ll look at the darwin (macOS) architecture. The steps are very similar for any other OS, and examples can be found in the GitHub project folder.

    Custom compiler

    First things first. Embedded software usually requires a custom compiler. While Make will let you install any compiler on your system and plug it into its rule system, Bazel is not so kind.

    And it actually makes sense. A compiler is an integral part of a build system. Use different versions of a compiler and you will end up with different binaries from the same source code.

    Bazel solves this problem by allowing you to specify dependencies in the WORKSPACE file. We can download files in the form of an http_archive that will be stored in the external dependencies folder. We can start by downloading the compilers for each OS in the WORKSPACE file of a workspace named “arm_none_eabi”:

    
    # WORKSPACE
    
    workspace(name = "arm_none_eabi")
    
    load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
    
    http_archive(
        name = "arm-none-eabi-darwin",
        build_file = "@arm_none_eabi//toolchain:compiler.BUILD",
        sha256 = "1249f860d4155d9c3ba8f30c19e7a88c5047923cea17e0d08e633f12408f01f0",
        strip_prefix = "gcc-arm-none-eabi-9-2019-q4-major",
        url = "https://developer.arm.com/-/media/Files/downloads/gnu-rm/9-2019q4/gcc-arm-none-eabi-9-2019-q4-major-mac.tar.bz2?revision=c2c4fe0e-c0b6-4162-97e6-7707e12f2b6e&la=en&hash=EC9D4B5F5B050267B924F876B306D72CDF3BDDC0",
    )
    
    http_archive(
        name = "arm-none-eabi-linux",
        build_file = "@arm_none_eabi//toolchain:compiler.BUILD",
        sha256 = "bcd840f839d5bf49279638e9f67890b2ef3a7c9c7a9b25271e83ec4ff41d177a",
        strip_prefix = "gcc-arm-none-eabi-9-2019-q4-major",
        url = "https://developer.arm.com/-/media/Files/downloads/gnu-rm/9-2019q4/gcc-arm-none-eabi-9-2019-q4-major-x86_64-linux.tar.bz2?revision=108bd959-44bd-4619-9c19-26187abf5225&la=en&hash=E788CE92E5DFD64B2A8C246BBA91A249CB8E2D2D",
    )
    
    http_archive(
        name = "arm-none-eabi-windows",
        build_file = "@arm_none_eabi//toolchain:compiler.BUILD",
        sha256 = "e4c964add8d0fdcc6b14f323e277a0946456082a84a1cc560da265b357762b62",
        url = "https://developer.arm.com/-/media/Files/downloads/gnu-rm/9-2019q4/gcc-arm-none-eabi-9-2019-q4-major-win32.zip?revision=20c5df9c-9870-47e2-b994-2a652fb99075&la=en&hash=347C07EEEB848CC8944F943D8E1EAAB55A6CA0BC",
    )
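
    The sha256 attributes pin each archive to an exact download, so swapping in a different compiler release means recomputing the digest. Here is a minimal sketch in plain Python (not part of the Bazel setup; the helper name sha256_of_file is made up for illustration) of how the value could be computed for an archive you have already downloaded:

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large toolchain archives
    do not have to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as archive:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

    Calling sha256_of_file on the downloaded .tar.bz2 or .zip gives the string to paste into the matching http_archive rule.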
    

    The downloaded compiler will need a compiler.BUILD file, where we declare what files are in the downloaded folder in a way that Bazel understands. A good place to put this file is //toolchain:compiler.BUILD, the label that the build_file attributes in the WORKSPACE point to.

    Given that the folder structure of the gcc toolchain is the same on different operating systems, we can use the same BUILD file.

    
    # toolchain/compiler.BUILD
    
    package(default_visibility = ["//visibility:public"])
    
    filegroup(
        name = "gcc",
        srcs = glob(["bin/arm-none-eabi-gcc*"]),
    )
    
    filegroup(
        name = "ar",
        srcs = glob(["bin/arm-none-eabi-ar*"]),
    )
    
    filegroup(
        name = "ld",
        srcs = glob(["bin/arm-none-eabi-ld*"]),
    )
    
    filegroup(
        name = "nm",
        srcs = glob(["bin/arm-none-eabi-nm*"]),
    )
    
    filegroup(
        name = "objcopy",
        srcs = glob(["bin/arm-none-eabi-objcopy*"]),
    )
    
    filegroup(
        name = "objdump",
        srcs = glob(["bin/arm-none-eabi-objdump*"]),
    )
    
    filegroup(
        name = "strip",
        srcs = glob(["bin/arm-none-eabi-strip*"]),
    )
    
    filegroup(
        name = "as",
        srcs = glob(["bin/arm-none-eabi-as*"]),
    )
    
    filegroup(
        name = "size",
        srcs = glob(["bin/arm-none-eabi-size*"]),
    )
    
    filegroup(
        name = "compiler_pieces",
        srcs = glob([
            "arm-none-eabi/**",
            "lib/gcc/arm-none-eabi/**",
        ]),
    )
    
    filegroup(
        name = "compiler_components",
        srcs = [
            ":ar",
            ":as",
            ":gcc",
            ":ld",
            ":nm",
            ":objcopy",
            ":objdump",
            ":strip",
        ],
    )
    

    Note that the glob(["bin/arm-none-eabi-<tool>*"]) pattern is used to match both UNIX and Windows gcc tools, which might have different extensions.
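
    To get a feel for what such a trailing wildcard picks up, here is a small sketch using Python’s fnmatch module. It only approximates Bazel’s glob (fnmatch lets * cross directory separators, Bazel’s glob does not), and the helper name matches_tool is hypothetical:

```python
from fnmatch import fnmatchcase


def matches_tool(filename, tool):
    """Check a path against the same pattern shape used in compiler.BUILD."""
    pattern = "bin/arm-none-eabi-{}*".format(tool)
    return fnmatchcase(filename, pattern)


# A few file names as they might appear in the extracted archives.
candidates = [
    "bin/arm-none-eabi-gcc",      # UNIX binary
    "bin/arm-none-eabi-gcc.exe",  # Windows binary
    "bin/arm-none-eabi-gcc-ar",   # also caught by the trailing wildcard
    "bin/arm-none-eabi-g++",      # different tool, not matched
]

for name in candidates:
    print(name, matches_tool(name, "gcc"))
```

    The trailing * is what lets one BUILD file serve every OS, at the cost of also pulling in sibling tools that share the prefix.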

    Perfect, now we have a compiler fully specified within our dependency tree! Let’s put it to work.

    Custom toolchain

    The next step is to tell Bazel (that already knows how a C compiler works) where to find the different custom tools we are providing.

    To do this we will need three things:

    • A cc_toolchain_suite rule that tells Bazel which toolchain to use based on the host system architecture.
    • An OS-specific cc_toolchain rule for every architecture.
    • A toolchain configuration rule returning a CcToolchainConfigInfo provider, which tells Bazel where each of the important compiler tools is located.

    Toolchain suite

    We’ll place this rule in our toolchain/BUILD.bazel file. Ideally, all of our toolchains will be contained in the //toolchain package, so it makes sense to register this rule here.

    
    # toolchain/BUILD.bazel
    
    load("@rules_cc//cc:defs.bzl", "cc_toolchain_suite")
    
    cc_toolchain_suite(
        name = "arm-none-eabi",
        toolchains = {
            "darwin": "//toolchain/arm-none-eabi/darwin",
            "k8": "//toolchain/arm-none-eabi/linux",
            "x64_windows": "//toolchain/arm-none-eabi/windows",
        },
    )
    

    A cc_toolchain_suite rule is just a mapping between the host architecture and the toolchain rule to use.

    For example, if our system sports a darwin identifier, Bazel will use the cc_toolchain //toolchain/arm-none-eabi/darwin target.

    Toolchain rule

    Now let’s provide the actual cc_toolchain target. Each architecture will have its own, making our toolchain suite modular and ready to support new architectures in the future.

    Each target will be located at toolchain/<toolchain-name>/<architecture>/BUILD.bazel. For example, the arm-none-eabi toolchain for darwin systems will be located at toolchain/arm-none-eabi/darwin/BUILD.bazel.

    An OS-specific toolchain folder will look like this:

    
    toolchain/arm-none-eabi/<architecture>
    ├── BUILD.bazel
    ├── arm-none-eabi-ar
    ├── arm-none-eabi-cpp
    ├── arm-none-eabi-gcc
    ├── arm-none-eabi-gcov
    ├── arm-none-eabi-ld
    ├── arm-none-eabi-nm
    ├── arm-none-eabi-objcopy
    ├── arm-none-eabi-objdump
    └── arm-none-eabi-strip
    

    As you can see, in addition to the BUILD.bazel file, there are also a bunch of arm-none-eabi-* files.

    These are actually wrapper scripts that invoke the externally downloaded compiler components. This workaround is needed since the external folder is not visible from within the project, but becomes visible at build time.

    The wrapper files have the following structure on UNIX:

    
    #!/bin/bash
    set -euo pipefail
    external/arm-none-eabi-darwin/bin/arm-none-eabi-<tool> "$@"
    

    And this structure on Windows:

    
    "external/arm-none-eabi-windows/bin/arm-none-eabi-<tool>.exe" %*
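
    Since every wrapper follows the same template, writing them by hand gets tedious. The sketch below, in plain Python, generates one wrapper per tool listed in the folder above. The function name write_wrappers and the .bat extension for Windows are assumptions made for illustration; use whatever extension you pass as wrapper_ext in your own setup.

```python
import stat
from pathlib import Path

# Tools taken from the wrapper folder listing above.
TOOLS = ["ar", "cpp", "gcc", "gcov", "ld", "nm", "objcopy", "objdump", "strip"]

UNIX_TEMPLATE = (
    "#!/bin/bash\n"
    "set -euo pipefail\n"
    'external/{repo}/bin/arm-none-eabi-{tool} "$@"\n'
)
WINDOWS_TEMPLATE = '"external/{repo}/bin/arm-none-eabi-{tool}.exe" %*\n'


def write_wrappers(directory, repo, windows=False):
    """Write one wrapper script per tool into `directory`.

    `repo` is the external repository name, e.g. "arm-none-eabi-darwin".
    Returns the list of paths that were written.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    written = []
    for tool in TOOLS:
        if windows:
            # Assumption: batch files are used as wrappers on Windows.
            path = directory / "arm-none-eabi-{}.bat".format(tool)
            path.write_text(WINDOWS_TEMPLATE.format(repo=repo, tool=tool))
        else:
            path = directory / "arm-none-eabi-{}".format(tool)
            path.write_text(UNIX_TEMPLATE.format(repo=repo, tool=tool))
            # Wrappers must be executable for the build to invoke them.
            mode = path.stat().st_mode
            path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        written.append(path)
    return written
```

    For example, write_wrappers("toolchain/arm-none-eabi/darwin", "arm-none-eabi-darwin") would populate the darwin folder shown earlier.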
    

    Finally, we need to bundle the external compiler components and the wrapper scripts together, then feed this information to the cc_toolchain rule.

    We can do so by creating filegroups in the toolchain/arm-none-eabi/<architecture>/BUILD.bazel file, like so:

    
    # toolchain/arm-none-eabi/darwin/BUILD.bazel
    
    load("@rules_cc//cc:defs.bzl", "cc_toolchain")
    load("@arm_none_eabi//toolchain:config.bzl", "cc_arm_none_eabi_config")
    
    package(default_visibility = ["//visibility:public"])
    
    compiler = "arm-none-eabi-darwin"
    
    filegroup(
        name = "all_files",
        srcs = [
            ":ar_files",
            ":compiler_files",
            ":linker_files",
            "@{}//:compiler_pieces".format(compiler),
        ],
    )
    
    filegroup(
        name = "compiler_files",
        srcs = [
            "arm-none-eabi-gcc",
            "@{}//:compiler_pieces".format(compiler),
            "@{}//:gcc".format(compiler),
        ],
    )
    
    filegroup(
        name = "linker_files",
        srcs = [
            "arm-none-eabi-gcc",
            "arm-none-eabi-ld",
            "@{}//:ar".format(compiler),
            "@{}//:compiler_pieces".format(compiler),
            "@{}//:gcc".format(compiler),
            "@{}//:ld".format(compiler),
        ],
    )
    
    filegroup(
        name = "ar_files",
        srcs = [
            "arm-none-eabi-ar",
            "@{}//:ar".format(compiler),
        ],
    )
    
    filegroup(
        name = "objcopy_files",
        srcs = [
            "arm-none-eabi-objcopy",
            "@{}//:objcopy".format(compiler),
        ],
    )
    
    filegroup(
        name = "strip_files",
        srcs = [
            "arm-none-eabi-strip",
            "@{}//:strip".format(compiler),
        ],
    )
    
    cc_arm_none_eabi_config(
        name = "darwin_config",
        gcc_repo = compiler,
        gcc_version = "9.2.1",
        host_system_name = "x86-darwin",
        toolchain_identifier = "arm-none-eabi-darwin",
        wrapper_path = "arm-none-eabi/darwin",
    )
    
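    # Empty filegroup referenced by dwp_files below; no .dwp files are produced.
    filegroup(
        name = "empty",
        srcs = [],
    )
    
    cc_toolchain(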
        name = "darwin",
        all_files = ":all_files",
        ar_files = ":ar_files",
        compiler_files = ":compiler_files",
        dwp_files = ":empty",
        linker_files = ":linker_files",
        objcopy_files = ":objcopy_files",
        strip_files = ":strip_files",
        supports_param_files = 0,
        toolchain_config = ":darwin_config",
        toolchain_identifier = "arm-none-eabi-darwin",
    )
    

    We finally have our cc_toolchain rule! But wait, what did we pass to toolchain_config?

    Well, this is actually a custom rule returning a CcToolchainConfigInfo provider, and we have to write it ourselves. Once we have done that, all the components of the toolchain will be in place.

    Toolchain config

    This step is probably the most involved of the entire process. Basically, we’ll be writing a custom rule that describes the toolchain components and plugs them into the knowledge Bazel already has of how a C compiler works.

    Below you can find an example implementation that covers three important points:

    • Specifies the location of the tools in tool_paths using the wrapper_path helper function
    • Specifies the important include directories (include_flags)
    • Specifies the default linked libraries (linker_flags)
    
    # toolchain/config.bzl
    
    load("@bazel_tools//tools/build_defs/cc:action_names.bzl", "ACTION_NAMES")
    load("@bazel_tools//tools/cpp:cc_toolchain_config_lib.bzl", "feature", "flag_group", "flag_set", "tool_path")
    
    def wrapper_path(ctx, tool):
        wrapped_path = "{}/arm-none-eabi-{}{}".format(ctx.attr.wrapper_path, tool, ctx.attr.wrapper_ext)
        return tool_path(name = tool, path = wrapped_path)
    
    def _impl(ctx):
        tool_paths = [
            wrapper_path(ctx, "gcc"),
            wrapper_path(ctx, "ld"),
            wrapper_path(ctx, "ar"),
            wrapper_path(ctx, "cpp"),
            wrapper_path(ctx, "gcov"),
            wrapper_path(ctx, "nm"),
            wrapper_path(ctx, "objdump"),
            wrapper_path(ctx, "strip"),
        ]
    
        include_flags = [
            "-isystem",
            "external/{}/arm-none-eabi/include".format(ctx.attr.gcc_repo),
            "-isystem",
            "external/{}/lib/gcc/arm-none-eabi/{}/include".format(ctx.attr.gcc_repo, ctx.attr.gcc_version),
            "-isystem",
            "external/{}/arm-none-eabi/include/c++/{}/".format(ctx.attr.gcc_repo, ctx.attr.gcc_version),
            "-isystem",
            "external/{}/arm-none-eabi/include/c++/{}/arm-none-eabi/".format(ctx.attr.gcc_repo, ctx.attr.gcc_version),
        ]
    
        linker_flags = [
            "-L",
            "external/{}/arm-none-eabi/lib".format(ctx.attr.gcc_repo),
            "-L",
            "external/{}/lib/gcc/arm-none-eabi/{}".format(ctx.attr.gcc_repo, ctx.attr.gcc_version),
            # -l<name> resolves to lib<name>.a, so libc.a and libgcc.a are spelled -lc and -lgcc.
            "-lc",
            "-lgcc",
        ]
    
        toolchain_compiler_flags = feature(
            name = "compiler_flags",
            enabled = True,
            flag_sets = [
                flag_set(
                    actions = [
                        ACTION_NAMES.assemble,
                        ACTION_NAMES.preprocess_assemble,
                        ACTION_NAMES.linkstamp_compile,
                        ACTION_NAMES.c_compile,
                        ACTION_NAMES.cpp_compile,
                        ACTION_NAMES.cpp_header_parsing,
                        ACTION_NAMES.cpp_module_compile,
                        ACTION_NAMES.cpp_module_codegen,
                        ACTION_NAMES.lto_backend,
                        ACTION_NAMES.clif_match,
                    ],
                    flag_groups = [
                        flag_group(flags = include_flags),
                    ],
                ),
            ],
        )
    
        toolchain_linker_flags = feature(
            name = "linker_flags",
            enabled = True,
            flag_sets = [
                flag_set(
                    # Apply the flags to link actions; linkstamp_compile is a compile action.
                    actions = [
                        ACTION_NAMES.cpp_link_executable,
                        ACTION_NAMES.cpp_link_dynamic_library,
                        ACTION_NAMES.cpp_link_nodeps_dynamic_library,
                    ],
                    flag_groups = [
                        flag_group(flags = linker_flags),
                    ],
                ),
            ],
        )
    
        return cc_common.create_cc_toolchain_config_info(
            ctx = ctx,
            toolchain_identifier = ctx.attr.toolchain_identifier,
            host_system_name = ctx.attr.host_system_name,
            target_system_name = "arm-none-eabi",
            target_cpu = "arm-none-eabi",
            target_libc = "gcc",
            compiler = ctx.attr.gcc_repo,
            abi_version = "eabi",
            abi_libc_version = ctx.attr.gcc_version,
            tool_paths = tool_paths,
            features = [
                toolchain_compiler_flags,
                toolchain_linker_flags,
            ],
        )
    
    cc_arm_none_eabi_config = rule(
        implementation = _impl,
        attrs = {
            "toolchain_identifier": attr.string(default = ""),
            "host_system_name": attr.string(default = ""),
            "wrapper_path": attr.string(default = ""),
            "wrapper_ext": attr.string(default = ""),
            "gcc_repo": attr.string(default = ""),
            "gcc_version": attr.string(default = ""),
        },
        provides = [CcToolchainConfigInfo],
    )
    

    Feel free to modify the above file to match the compiler version you are using.
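
    When changing compiler versions, the easiest thing to get wrong is gcc_version, because it is baked into several include paths. Here is a rough sketch in plain Python (the string formatting mirrors include_flags above; the helper names include_dirs and missing_dirs are made up) that lists which of those directories do not exist under a given base directory, assumed here to be the folder that contains external/, such as the one printed by bazel info output_base:

```python
import os


def include_dirs(gcc_repo, gcc_version):
    """Rebuild the -isystem directories used by the toolchain config."""
    root = "external/{}".format(gcc_repo)
    return [
        "{}/arm-none-eabi/include".format(root),
        "{}/lib/gcc/arm-none-eabi/{}/include".format(root, gcc_version),
        "{}/arm-none-eabi/include/c++/{}".format(root, gcc_version),
        "{}/arm-none-eabi/include/c++/{}/arm-none-eabi".format(root, gcc_version),
    ]


def missing_dirs(base_dir, gcc_repo, gcc_version):
    """Return the include directories that are absent under `base_dir`."""
    return [
        directory
        for directory in include_dirs(gcc_repo, gcc_version)
        if not os.path.isdir(os.path.join(base_dir, directory))
    ]
```

    An empty list from missing_dirs(base_dir, "arm-none-eabi-darwin", "9.2.1") suggests the version string matches the extracted archive; any entries it returns point at the path that needs adjusting.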

    Bazel configuration

    Finally, we need to tell Bazel that we want to use a custom toolchain when compiling our project. We can do it using two command line flags:

    • --crosstool_top
    • --host_crosstool_top

    These flags tell Bazel which toolchains to fetch for the target and host build respectively. Let’s add them to our .bazelrc file to make them defaults.

    
    # .bazelrc
    
    # Target arm-none-eabi toolchain for target builds.
    build --crosstool_top=@arm_none_eabi//toolchain:arm-none-eabi
    
    # Target the default cpp compiler for host builds.
    build --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
    

    Done! Now we can take advantage of our custom compiler. Hopefully this is a useful start for any embedded project that requires binary reproducibility and hermetic builds.
