thoughts and learnings in software engineering by Rotem Tamir

Building a Cgo Dependent Golang Library with Bazel

Credits: The solution described in this post is based on Alex Eagle’s suggestion in: https://github.com/bazelbuild/bazel-gazelle/issues/773

While migrating the Nexar backend mono-repo to Bazel as our main build system, I ran into a particularly painful challenge: compiling Go packages with C dependencies, such as confluent-kafka-go or packages built on libgeos. For others who hit this issue, this is a short, practical post showing what I finally came up with.

First try: let Gazelle figure things out

Gazelle is a really great project from the Bazel community that automatically generates BUILD files for our Golang Bazel projects. It can read the go.mod file that describes our external dependencies and translate it into rules_go’s go_repository invocations, which fetch each external dependency into our WORKSPACE and create a BUILD file for it. For 99% of our mono-repo’s dependencies this worked smoothly; all we had to do was:

bazel run //:gazelle -- update-repos -from_file=go.mod -to_macro=deps.bzl%go_dependencies

And Gazelle would generate a macro that contains invocations to go_repository for each of our module’s dependencies. In this post, I focus on the few dependencies that this did not work well with, and document my way around those edges.

Our go.mod file contains a dependency on:

require (
	github.com/confluentinc/confluent-kafka-go v1.4.2
)

Gazelle analyzes our dependencies and emits this block in the deps.bzl file:

go_repository(
    name = "com_github_confluentinc_confluent_kafka_go",
    importpath = "github.com/confluentinc/confluent-kafka-go",
    sum = "h1:13EK9RTujF7lVkvHQ5Hbu6bM+Yfrq8L0MkJNnjHSd4Q=",
    version = "v1.4.2",
)

Trying to build we get:

$ bazel build @com_github_confluentinc_confluent_kafka_go//kafka:kafka
INFO: Analyzed target @com_github_confluentinc_confluent_kafka_go//kafka:kafka (6 packages loaded, 59 targets configured).
INFO: Found 1 target...
ERROR: /root/.cache/bazel/_bazel_root/f8087e59fd95af1ae29e8fcb7ff1a3dc/external/com_github_confluentinc_confluent_kafka_go/kafka/BUILD.bazel:3:11: GoCompilePkg external/com_github_confluentinc_confluent_kafka_go/kafka/kafka.a failed (Exit 1): builder failed: error executing command bazel-out/host/bin/external/go_sdk/builder compilepkg -sdk external/go_sdk -installsuffix linux_amd64 -src external/com_github_confluentinc_confluent_kafka_go/kafka/00version.go -src ... (remaining 71 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox builder failed: error executing command bazel-out/host/bin/external/go_sdk/builder compilepkg -sdk external/go_sdk -installsuffix linux_amd64 -src external/com_github_confluentinc_confluent_kafka_go/kafka/00version.go -src ... (remaining 71 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
/root/.cache/bazel/_bazel_root/f8087e59fd95af1ae29e8fcb7ff1a3dc/sandbox/processwrapper-sandbox/1/execroot/__main__/external/com_github_confluentinc_confluent_kafka_go/kafka/00version.go:24:10: fatal error: librdkafka/rdkafka.h: No such file or directory
 #include <librdkafka/rdkafka.h>
          ^~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
compilepkg: error running subcommand: exit status 2
Target @com_github_confluentinc_confluent_kafka_go//kafka:kafka failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 6.865s, Critical Path: 0.74s
INFO: 2 processes: 2 internal.
FAILED: Build did NOT complete successfully

Compiling CMake Projects From Within a Bazel Workspace

We need to compile our C dependencies into a static library (.a file) so that we can provide it to the Go linker. We could write new Bazel BUILD files for that project, but that would require a deep dive into the dependency to mimic everything its authors did in the original build. Wouldn’t it make more sense to invoke the CMake build from within Bazel? A crude way to do this is a genrule, which lets us invoke an arbitrary command as part of our action graph, but it turns out there’s a great project called rules_foreign_cc, which describes itself as “[Bazel] Rules for building C/C++ projects using foreign build systems inside Bazel projects.”

We add it to our WORKSPACE file:

http_archive(
    name = "rules_foreign_cc",
    strip_prefix = "rules_foreign_cc-master",
    url = "https://github.com/bazelbuild/rules_foreign_cc/archive/master.zip",
)

load("@rules_foreign_cc//:workspace_definitions.bzl", "rules_foreign_cc_dependencies")

rules_foreign_cc_dependencies()

So instead of figuring out an elaborate, specific BUILD file for the C library we are using, we can simply use the foreign dependency rules to invoke the original CMake build and integrate it natively with our action graph.

We add this to our WORKSPACE:

http_archive(
    name = "librdkafka",
    build_file_content = """load("@rules_foreign_cc//tools/build_defs:cmake.bzl", "cmake_external")

filegroup(
    name = "sources",
    srcs = glob(["**"]),
)

cmake_external(
    name = "librdkafka",
    cache_entries = {
        "RDKAFKA_BUILD_STATIC": "ON",
        "WITH_ZSTD": "OFF",
        "WITH_SSL": "OFF",
        "WITH_SASL": "OFF",
        "ENABLE_LZ4_EXT": "OFF",
        "WITH_LIBDL": "OFF",
    },
    lib_source = ":sources",
    static_libraries = [
        "librdkafka++.a",
        "librdkafka.a",
    ],
    visibility = ["//visibility:public"],
)
""",
    sha256 = "ae27ea3f3d0d32d29004e7f709efbba2666c5383a107cc45b3a1949486b2eb84",
    strip_prefix = "librdkafka-1.4.0",
    urls = ["https://github.com/edenhill/librdkafka/archive/v1.4.0.tar.gz"],
)

Let’s break it down:

  • We use http_archive to pull in a resource from the internet into our Bazel project.
  • We provide a custom BUILD file for Bazel to use to build this target.
  • The invocation of cmake_external tells Bazel how to invoke CMake (cache_entries holds the CMake cache definitions to pass in) and which output files the build provides (static_libraries), so this target can be depended on downstream in the Bazel action graph.
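To make the cache_entries bullet more concrete: each entry is handed to CMake as a -Dkey=value definition, alongside settings that rules_foreign_cc derives itself (such as the toolchain and install location). The shell sketch below only prints an approximate hand-written equivalent for the entries above. It does not run CMake, the variable names are hypothetical, and the command the rule actually generates will differ in its details.

```shell
# Sketch only: prints an approximate manual equivalent of the cmake_external
# configuration above. It does not run CMake.
cache_entries="RDKAFKA_BUILD_STATIC=ON WITH_ZSTD=OFF WITH_SSL=OFF WITH_SASL=OFF ENABLE_LZ4_EXT=OFF WITH_LIBDL=OFF"

cmake_flags=""
for entry in $cache_entries; do
  # Each cache entry becomes a -Dkey=value definition.
  cmake_flags="$cmake_flags -D$entry"
done

# librdkafka-1.4.0 is the top-level directory of the downloaded archive.
cmake_command="cmake$cmake_flags librdkafka-1.4.0"
echo "$cmake_command"
```

Seeing the flags spelled out this way can help when comparing the generated CMake_script.sh in the logs directory against what the upstream project documents.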

We can now build the library, which produces both the header files and the static libraries:

$ bazel build @librdkafka//:librdkafka

INFO: Analyzed target @librdkafka//:librdkafka (3 packages loaded, 530 targets configured).
INFO: Found 1 target...
Target @librdkafka//:librdkafka up-to-date:
  bazel-bin/external/librdkafka/librdkafka/include
  bazel-bin/external/librdkafka/librdkafka/lib/librdkafka++.a
  bazel-bin/external/librdkafka/librdkafka/lib/librdkafka.a
  bazel-bin/external/librdkafka/copy_librdkafka/librdkafka
  bazel-bin/external/librdkafka/librdkafka/logs/CMake_script.sh
  bazel-bin/external/librdkafka/librdkafka/logs/CMake.log
  bazel-bin/external/librdkafka/librdkafka/logs/wrapper_script.sh
INFO: Elapsed time: 3.959s, Critical Path: 0.13s
INFO: 0 processes.
INFO: Build completed successfully, 2 total actions

In my specific use case, I wanted to link my build of the confluent-kafka-go library with zlib, to avoid this error:

bazel-out/k8-fastbuild/bin/external/librdkafka/librdkafka/lib/librdkafka.a(rdgz.c.o):rdgz.c:function rd_gz_decompress: error: undefined reference to 'inflateGetHeader'
bazel-out/k8-fastbuild/bin/external/librdkafka/librdkafka/lib/librdkafka.a(rdgz.c.o):rdgz.c:function rd_gz_decompress: error: undefined reference to 'inflate'
....

To do that, I did much the same as for librdkafka and added this to the WORKSPACE:

http_archive(
    name = "zlib",
    build_file_content = """load("@rules_foreign_cc//tools/build_defs:cmake.bzl", "cmake_external")

filegroup(
    name = "sources",
    srcs = glob(["**"]),
)

cmake_external(
    name = "zlib",
    cache_entries = {},
    lib_source = ":sources",
    static_libraries = ["libz.a"],
    visibility = ["//visibility:public"],
)
""",
    strip_prefix = "zlib-1.2.11",
    urls = [
        "https://github.com/madler/zlib/archive/v1.2.11.zip",
    ],
)

I could now build it:

$ bazel build @zlib//:zlib
INFO: Analyzed target @zlib//:zlib (0 packages loaded, 2 targets configured).
INFO: Found 1 target...
INFO: From CcCmakeMakeRule external/zlib/zlib/include:
Target @zlib//:zlib up-to-date:
  bazel-bin/external/zlib/zlib/include
  bazel-bin/external/zlib/zlib/lib/libz.a
  bazel-bin/external/zlib/copy_zlib/zlib
  bazel-bin/external/zlib/zlib/logs/CMake_script.sh
  bazel-bin/external/zlib/zlib/logs/CMake.log
  bazel-bin/external/zlib/zlib/logs/wrapper_script.sh

INFO: Elapsed time: 11.046s, Critical Path: 9.73s
INFO: 1 process: 1 darwin-sandbox.
INFO: Build completed successfully, 3 total actions
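
Unlike the librdkafka archive, the zlib http_archive above has no sha256 attribute, so Bazel has nothing to verify the download against. A small helper along these lines could compute a digest to paste into a sha256 = "..." line. The function name sha256_of is hypothetical, and the sketch assumes that either sha256sum or shasum is installed.

```shell
# Print the SHA-256 digest of a file, using whichever common tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# Typical use (requires network access, so it is left commented out here):
#   curl -L -o zlib.zip https://github.com/madler/zlib/archive/v1.2.11.zip
#   sha256_of zlib.zip
```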

Linking the Static Libraries with the Go Library

Now that we have our static libraries built, we want to modify the build procedure for the Go library so that it uses them. To see the rule that the go_repository rule generates, we can run:

$ bazel query @com_github_confluentinc_confluent_kafka_go//kafka:kafka

# /root/.cache/bazel/_bazel_root/f8087e59fd95af1ae29e8fcb7ff1a3dc/external/com_github_confluentinc_confluent_kafka_go/kafka/BUILD.bazel:3:11
go_library(
  name = "kafka",
  visibility = ["//visibility:public"],
  generator_name = "kafka",
  generator_function = "go_library_macro",
  generator_location = "kafka_go/kafka/BUILD.bazel:3:11",
  srcs = ["@com_github_confluentinc_confluent_kafka_go//kafka:00version.go", "@com_github_confluentinc_confluent_kafka_go//kafka:adminapi.go", "@com_github_confluentinc_confluent_kafka_go//kafka:adminoptions.go", "@com_github_confluentinc_confluent_kafka_go//kafka:build_darwin.go", "@com_github_confluentinc_confluent_kafka_go//kafka:build_glibc_linux.go", "@com_github_confluentinc_confluent_kafka_go//kafka:config.go", "@com_github_confluentinc_confluent_kafka_go//kafka:consumer.go", "@com_github_confluentinc_confluent_kafka_go//kafka:context.go", "@com_github_confluentinc_confluent_kafka_go//kafka:error.go", "@com_github_confluentinc_confluent_kafka_go//kafka:error_gen.go", "@com_github_confluentinc_confluent_kafka_go//kafka:event.go", "@com_github_confluentinc_confluent_kafka_go//kafka:generated_errors.go", "@com_github_confluentinc_confluent_kafka_go//kafka:glue_rdkafka.h", "@com_github_confluentinc_confluent_kafka_go//kafka:handle.go", "@com_github_confluentinc_confluent_kafka_go//kafka:header.go", "@com_github_confluentinc_confluent_kafka_go//kafka:kafka.go", "@com_github_confluentinc_confluent_kafka_go//kafka:log.go", "@com_github_confluentinc_confluent_kafka_go//kafka:message.go", "@com_github_confluentinc_confluent_kafka_go//kafka:metadata.go", "@com_github_confluentinc_confluent_kafka_go//kafka:misc.go", "@com_github_confluentinc_confluent_kafka_go//kafka:offset.go", "@com_github_confluentinc_confluent_kafka_go//kafka:producer.go", "@com_github_confluentinc_confluent_kafka_go//kafka:testhelpers.go", "@com_github_confluentinc_confluent_kafka_go//kafka:time.go"],
  deps = ["@com_github_confluentinc_confluent_kafka_go//kafka/librdkafka:librdkafka"],
  importpath = "github.com/confluentinc/confluent-kafka-go/kafka",
  cgo = True,
  copts = select({"@io_bazel_rules_go//go/platform:android": ["-Ikafka/kafka"], "@io_bazel_rules_go//go/platform:darwin": ["-Ikafka/kafka"], "@io_bazel_rules_go//go/platform:ios": ["-Ikafka/kafka"], "@io_bazel_rules_go//go/platform:linux": ["-Ikafka/kafka"], "//conditions:default": []}),
  clinkopts = select({"@io_bazel_rules_go//go/platform:android": ["kafka/librdkafka/librdkafka_glibc_linux.a -lm -ldl -lpthread -lrt"], "@io_bazel_rules_go//go/platform:darwin": ["kafka/librdkafka/librdkafka_darwin.a -lm -lsasl2 -lz -ldl -lpthread"], "@io_bazel_rules_go//go/platform:ios": ["kafka/librdkafka/librdkafka_darwin.a -lm -lsasl2 -lz -ldl -lpthread"], "@io_bazel_rules_go//go/platform:linux": ["kafka/librdkafka/librdkafka_glibc_linux.a -lm -ldl -lpthread -lrt"], "//conditions:default": []}),
)

To get our library to properly compile, we want to make two modifications:

  1. Add a cdeps attribute in which we specify our freshly built static libraries as dependencies of our target.
  2. Modify the clinkopts attribute so that it no longer points the linker at a wrong path for the static libs.

Luckily, the go_repository rule can receive a list of patch files and apply them as it is bringing in external dependencies into our WORKSPACE. To find the location of the BUILD file containing our go_library, we can:

bazel query @com_github_confluentinc_confluent_kafka_go//kafka:kafka --output location
/root/.cache/bazel/_bazel_root/f8087e59fd95af1ae29e8fcb7ff1a3dc/external/com_github_confluentinc_confluent_kafka_go/kafka/BUILD.bazel:3:11: go_library rule @com_github_confluentinc_confluent_kafka_go//kafka:kafka
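
Each line of this output has the form path:line:column: description, so the file path is the part before the first colon. A tiny helper like the one below can pull it out when scripting the next step. The function name build_file_from_location is hypothetical, and the sketch assumes the path itself contains no colon.

```shell
# Extract the file path from one line of `bazel query --output location`,
# which has the form <path>:<line>:<column>: <description>.
build_file_from_location() {
  cut -d: -f1
}

# Typical use (needs a Bazel workspace, so it is left commented out here):
#   bazel query @com_github_confluentinc_confluent_kafka_go//kafka:kafka --output location \
#     | build_file_from_location
```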

Next, we copy the file, edit the copy to our liking, and use diff -Naur to produce this patch file:

--- kafka/BUILD.bazel
+++ kafka/BUILD.bazel
@@ -28,19 +28,20 @@ go_library(
         "testhelpers.go",
         "time.go",
     ],
+    cdeps = ["@librdkafka//:librdkafka", "@zlib//:zlib"],
     cgo = True,
     clinkopts = select({
         "@io_bazel_rules_go//go/platform:android": [
-            "kafka/librdkafka/librdkafka_glibc_linux.a -lm -ldl -lpthread -lrt",
+            "-lm -ldl -lpthread -lrt",
         ],
         "@io_bazel_rules_go//go/platform:darwin": [
-            "kafka/librdkafka/librdkafka_darwin.a -lm -lsasl2 -lz -ldl -lpthread",
+            "-lm -lsasl2 -lz -ldl -lpthread",
         ],
         "@io_bazel_rules_go//go/platform:ios": [
-            "kafka/librdkafka/librdkafka_darwin.a -lm -lsasl2 -lz -ldl -lpthread",
+            "-lm -lsasl2 -lz -ldl -lpthread",
         ],
         "@io_bazel_rules_go//go/platform:linux": [
-            "kafka/librdkafka/librdkafka_glibc_linux.a -lm -ldl -lpthread -lrt",
+            "-lm -ldl -lpthread -lrt",
         ],
         "//conditions:default": [],
     }),
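
For readers who have not produced such a patch before, here is a minimal sketch of the copy, edit and diff workflow. It runs against a heavily shortened stand-in for the generated BUILD.bazel rather than the real file, and the awk and sed commands stand in for editing by hand. The --label option is assumed to be available (GNU diff and recent BSD diff accept it); it keeps both header lines as kafka/BUILD.bazel, matching the patch above.

```shell
work="$(mktemp -d)"
mkdir -p "$work/kafka"

# Stand-in for the generated kafka/BUILD.bazel (heavily shortened).
cat > "$work/kafka/BUILD.bazel.orig" <<'EOF'
go_library(
    name = "kafka",
    cgo = True,
    clinkopts = [
        "kafka/librdkafka/librdkafka_glibc_linux.a -lm -ldl -lpthread -lrt",
    ],
)
EOF

# Produce the edited copy: add cdeps before the cgo line and drop the
# bundled static library path from clinkopts.
awk '/cgo = True/ { print "    cdeps = [\"@librdkafka//:librdkafka\", \"@zlib//:zlib\"]," } { print }' \
  "$work/kafka/BUILD.bazel.orig" \
  | sed 's|kafka/librdkafka/librdkafka_[a-z_]*\.a ||' \
  > "$work/kafka/BUILD.bazel"

# diff exits with status 1 when the files differ, which is the expected case.
(
  cd "$work"
  diff -Naur --label kafka/BUILD.bazel --label kafka/BUILD.bazel \
    kafka/BUILD.bazel.orig kafka/BUILD.bazel > go.patch || [ "$?" -eq 1 ]
)

cat "$work/go.patch"
```

Running diff from the directory that contains kafka/ keeps the paths in the patch relative to the repository root, which is what a patch applied without stripping path components expects.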

Finally, we need to update our go_repository rule to use our patch (I’ve saved mine under $(WORKSPACE)/bazelbuild/third_party/librdkafka:go.patch):

go_repository(
    name = "com_github_confluentinc_confluent_kafka_go",
    build_file_proto_mode = "disable",
    importpath = "github.com/confluentinc/confluent-kafka-go",
    patches = ["//bazelbuild/third_party/librdkafka:go.patch"],  # keep
    sum = "h1:13EK9RTujF7lVkvHQ5Hbu6bM+Yfrq8L0MkJNnjHSd4Q=",
    version = "v1.4.2",
)

(The # keep comment tells Gazelle not to overwrite this line when regenerating the deps.bzl file.)

We can now compile our target:

$ bazel build @com_github_confluentinc_confluent_kafka_go//kafka:kafka

INFO: Analyzed target @com_github_confluentinc_confluent_kafka_go//kafka:kafka (42 packages loaded, 7770 targets configured).
INFO: Found 1 target...
Target @com_github_confluentinc_confluent_kafka_go//kafka:kafka up-to-date:
  bazel-bin/external/com_github_confluentinc_confluent_kafka_go/kafka/kafka.a
INFO: Elapsed time: 16.464s, Critical Path: 3.20s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action

We did it, hooray! 🎉