While the fix was trivial in hindsight, tracing this bug back to its source was less so.
It started with a bug report from a user who was experiencing a complete crash on their device. Working with the user, I quickly narrowed the affected range down to all devices running Android <= 6 on 64-bit ARM (arm64-v8a). Since the only change was an upgraded toolchain, attention quickly turned towards the compiler itself.
However, the error only occurred on actual hardware, and no amount of emulator testing managed to reproduce it. I then sent the user a debug package containing a debug script and binaries compiled with various versions and configurations. Running this, we managed to uncover a regression in the Go compiler itself.
Unfortunately, the underlying Git repository had received thousands of commits between the working and non-working versions, and sending thousands of binaries (> 100 GB) to the affected users was deemed infeasible. However, assuming an actual regression, I theorized that git bisect should be able to reduce this effort massively by employing its binary search approach.
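To illustrate why a binary search cuts the effort so sharply, here is a minimal Go sketch of the idea behind git bisect. It is not how git implements it, and it is not the script used in this investigation: the function name firstBad, the commit count of 4000, and the culprit index are hypothetical values chosen for illustration. It assumes a true regression, meaning every commit from the culprit onwards fails.

```go
package main

import "fmt"

// firstBad narrows down the first failing commit between a known-good
// index and a known-bad index. It assumes that commit `good` passes,
// commit `bad` fails, and that once a commit fails, all later ones fail
// too. It returns the culprit index and the number of tests it ran.
func firstBad(good, bad int, fails func(commit int) bool) (int, int) {
	steps := 0
	for bad-good > 1 {
		mid := good + (bad-good)/2
		steps++ // one build/test cycle
		if fails(mid) {
			bad = mid // culprit is at mid or earlier
		} else {
			good = mid // culprit is after mid
		}
	}
	return bad, steps
}

func main() {
	// Hypothetical history: 4000 commits, regression introduced at 2871.
	const culprit = 2871
	fails := func(commit int) bool { return commit >= culprit }

	commit, steps := firstBad(0, 4000, fails)
	fmt.Printf("first bad commit: %d, found after %d tests\n", commit, steps)
}
```

Each test halves the remaining range, so the number of build/test cycles grows with the logarithm of the commit count rather than with the count itself. In practice the number of cycles can be higher than this ideal, for example when some commits cannot be built and have to be skipped.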
Still, two problems remained: I didn’t have the hardware to reproduce the issue, nor did I have a simplified test case to hand to someone who did. Basically, I had to get access to the affected hardware. By analysing telemetry data and cross-referencing the results with device farm inventory lists, I tracked down a service provider offering an affected device for scripted testing.
Using git bisect, and about 20 build/test cycles later, I managed to identify the offending commit. The commit had assumed specific versions of other toolchain components without verifying that assumption. I fixed the bug by explicitly setting the required component, which also produces a meaningful error message for older or incompatible toolchain versions.
Due to the significance of the fix (and the slight behaviour change it introduced), it was also mentioned in the changelog of the next release.
Project link: https://github.com/golang/go/issues/38838