How to Solve Python 2.7 Bus Error on Variscite Boards with Yocto

I’ve been working with a Variscite board running a Yocto-based Linux distribution and Python 2.7.3. For the most part, my program, which talks over serial, a bit of USB, and some TCP sockets, runs fine. But every so often, I hit a nasty wall:

Bus error

Once that happens, restarting the program immediately reproduces the error until I reboot the board. Even more suspicious: Python’s self-test suite fails in a few places (test_builtin, test_codecs, and especially ctypes), with messages like Segmentation fault or Bus error.

That’s when I knew I needed to dig deeper.

My Investigation

From what I’ve learned, here are the most common suspects when ARM boards throw bus errors in Python:

  • ABI mismatch: Python or its libraries were compiled with the wrong CPU/float flags (hard vs soft float, NEON/VFP mismatches).
  • libffi/ctypes: _ctypes.so depends on libffi, and if they’re out of sync or built incorrectly, you’ll get random crashes.
  • Unaligned access: ARM is much stricter about memory alignment than x86. Multi-word and doubleword loads and stores in particular fault on unaligned addresses, so a wrong pointer or misdeclared callback can raise SIGBUS.
  • Filesystem corruption: eMMC/SD wear or power issues can silently corrupt libraries.
  • Kernel/driver quirks: Misbehaving DMA or USB/serial drivers can corrupt memory.
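
One way to test the first suspect is to read the float-ABI bits straight out of an ELF header. The sketch below is a minimal version of that idea, assuming a 32-bit ARM EABI v5 binary; the function names are mine, and the flag values are the EF_ARM_ABI_FLOAT_SOFT and EF_ARM_ABI_FLOAT_HARD constants from the ARM ELF definitions. Older toolchains may set neither bit, so "unspecified" is not proof of a problem.

```python
import struct

EM_ARM = 40                      # e_machine value for 32-bit ARM
EF_ARM_ABI_FLOAT_SOFT = 0x200    # EABI v5 soft-float flag
EF_ARM_ABI_FLOAT_HARD = 0x400    # EABI v5 hard-float flag


def arm_float_abi(header):
    """Classify the float ABI from the first 40 bytes of a 32-bit ELF file."""
    if len(header) < 40 or header[:4] != b"\x7fELF":
        return "not-elf"
    ei_class, ei_data = struct.unpack_from("BB", header, 4)
    if ei_class != 1:            # 1 = ELFCLASS32
        return "not-elf32"
    endian = "<" if ei_data == 1 else ">"
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    if e_machine != EM_ARM:
        return "not-arm"
    # In an ELF32 header, e_flags sits at byte offset 36.
    (e_flags,) = struct.unpack_from(endian + "I", header, 36)
    if e_flags & EF_ARM_ABI_FLOAT_HARD:
        return "hard"
    if e_flags & EF_ARM_ABI_FLOAT_SOFT:
        return "soft"
    return "unspecified"


def float_abi_of_file(path):
    """Hypothetical helper: classify a file on disk."""
    with open(path, "rb") as f:
        return arm_float_abi(f.read(40))
```

Running float_abi_of_file() over sys.executable, _ctypes.so, and the libffi shared object should give the same answer for all three; a mix of "hard" and "soft" would point at an ABI mismatch.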

The Diagnostic Project

To get a grip on what’s happening, I wrote a Python 2.7 diagnostic script: diag_bus_error.py.

This script does three things:

  1. Baseline checks – confirms Python version, linked libraries, and ARM alignment handling.
  2. Defines & explains an error – runs a correctly declared ctypes callback through qsort and explains how a misdeclared one could trigger bus errors.
  3. Adds practice functionality – runs a light socket stress test and hashes key standard library files so corruption can be spotted.

Here’s the full code:

# -*- coding: utf-8 -*-
"""
diag_bus_error.py  (Python 2.7)
"""

from __future__ import print_function
import ctypes
import ctypes.util
import hashlib
import os
import platform
import socket
import subprocess
import sys
import time

def sh(cmd):
    try:
        out = subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)
        return out.decode('utf-8', 'ignore')
    except Exception as e:
        return "ERROR running {!r}:\n{}\n".format(cmd, e)

def md5sum(path):
    try:
        h = hashlib.md5()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(65536), b''):
                h.update(chunk)
        return h.hexdigest()
    except Exception as e:
        return "ERR:{}".format(e)

def print_hdr(title):
    print("\n" + "="*8 + " " + title + " " + "="*8)

def baseline_checks():
    print_hdr("Baseline: Python/Platform")
    print("sys.version  :", sys.version)
    print("platform     :", platform.platform())
    print("machine      :", platform.machine())
    print("uname        :", platform.uname())

    print_hdr("Linked libs for python")
    print(sh("ldd '{}'".format(sys.executable)))

    print_hdr("Key shared objects")
    for mod in ("_ctypes", "codecs"):
        try:
            m = __import__(mod)
            print("{:<10} -> {}".format(mod, getattr(m, "__file__", "(built-in)")))
        except Exception as e:
            print("{:<10} -> IMPORT ERROR: {}".format(mod, e))

    print_hdr("libffi presence")
    print("ctypes.util.find_library('ffi') ->", ctypes.util.find_library('ffi'))

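    print_hdr("/proc alignment")
    if os.path.exists("/proc/cpu/alignment"):
        # Close the file explicitly instead of relying on garbage collection.
        with open("/proc/cpu/alignment") as f:
            print(f.read())
    else:
        print("(not available on this kernel)")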

def demonstrate_ctypes_callback_issue(safe=True):
    print_hdr("ctypes callback demo")
    libc_name = ctypes.util.find_library('c')
    if not libc_name:
        print("libc not found; skipping callback demo")
        return
    libc = ctypes.CDLL(libc_name)
    CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p, ctypes.c_void_p)

    def cmp_ints(a_ptr, b_ptr):
        a = ctypes.cast(a_ptr, ctypes.POINTER(ctypes.c_int))[0]
        b = ctypes.cast(b_ptr, ctypes.POINTER(ctypes.c_int))[0]
        return (a > b) - (a < b)

    cmp_cb = CMPFUNC(cmp_ints)
    libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]
    libc.qsort.restype  = None
    arr = (ctypes.c_int * 5)(5, 1, 4, 2, 3)
    libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), cmp_cb)
    print("SAFE qsort result:", list(arr))
    print("Explanation: misdeclared argtypes/restype can corrupt the stack on ARM → Bus error")

def stdlib_integrity_check():
    print_hdr("Stdlib integrity")
    for p in [
        os.path.join(sys.prefix, 'lib', 'python2.7', 'lib-dynload', '_ctypes.so'),
        os.path.join(sys.prefix, 'lib', 'python2.7', 'encodings', 'idna.py'),
    ]:
        print("{:<70} {}".format(p, md5sum(p) if os.path.exists(p) else "MISSING"))

def light_stress(seconds=10):
    print_hdr("Light socket stress")
    end = time.time() + seconds
    payload = b"x"*1024
    ok = 0
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]
    while time.time() < end:
        cli = socket.socket()
        cli.connect(("127.0.0.1", port))
        conn, _ = srv.accept()
        cli.sendall(payload)
        conn.recv(1024)
        cli.close()
        conn.close()
        ok += 1
    srv.close()
    print("Completed {} handshakes.".format(ok))

if __name__ == "__main__":
    baseline_checks()
    demonstrate_ctypes_callback_issue()
    stdlib_integrity_check()
    light_stress()
    print_hdr("Done")
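
The script prints /proc/cpu/alignment as raw text, which is hard to compare between runs. Here is a rough sketch of a parser, assuming the usual "Name: number" layout the ARM kernel uses for that file; parse_alignment, alignment_delta, and read_alignment are names I made up for illustration.

```python
import re


def parse_alignment(text):
    """Turn 'Name:  <number> ...' lines into a {name: int} dict."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:]+):\s*(\d+)", line)
        if m:
            counters[m.group(1).strip()] = int(m.group(2))
    return counters


def alignment_delta(before, after):
    """Return only the entries that increased between two snapshots."""
    return {k: after[k] - before.get(k, 0)
            for k in after if after[k] > before.get(k, 0)}


def read_alignment(path="/proc/cpu/alignment"):
    """Snapshot the counters; returns {} where the file does not exist."""
    try:
        with open(path) as f:
            return parse_alignment(f.read())
    except IOError:
        return {}
```

Taking one snapshot before a workload and one after, then printing alignment_delta(before, after), shows whether the "User" counter moved. Note that the "User faults" line is a mode setting rather than a counter, so it should stay constant.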

What I Learned About the Error

On x86, you can sometimes get away with sloppy ctypes declarations. On ARM, you usually can’t: misaligned pointers and mismatched calling conventions are far more likely to end in SIGBUS. Python’s test_callbacks stresses exactly these callback paths.
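
To make the alignment point concrete, here is a small sketch that never dereferences a misaligned pointer. It checks whether an address is suitably aligned for a type, and it reads a 32-bit value from an odd offset by copying bytes with struct instead of casting to POINTER(c_uint32). The helper names and the 64-byte buffer are illustrative assumptions.

```python
import ctypes
import struct


def is_aligned(address, ctype):
    """True if address satisfies the natural alignment of ctype."""
    return address % ctypes.alignment(ctype) == 0


def read_u32(buf, offset):
    """Read a little-endian uint32 from a ctypes buffer by copying bytes.

    This avoids a direct 32-bit load from an address that may be unaligned.
    """
    return struct.unpack_from("<I", buf.raw, offset)[0]


buf = ctypes.create_string_buffer(64)
buf[1:5] = b"\x01\x02\x03\x04"           # place a value at an odd offset
odd_address = ctypes.addressof(buf) + 1  # not a multiple of 4

print("aligned for c_uint32:", is_aligned(odd_address, ctypes.c_uint32))
print("value via byte copy : 0x%08x" % read_u32(buf, 1))
```

Casting odd_address to a c_uint32 pointer and dereferencing it is the kind of access that x86 tolerates and that can fault on ARM, which is why the sketch copies bytes instead.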

In my case, the self-test crashes confirmed the issue wasn’t my app logic, but something deeper: _ctypes or libffi was compiled with the wrong ABI.
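
To see which libffi a given _ctypes.so actually resolves to, the ldd output can be parsed into a dictionary. This is a sketch under the assumption that ldd is on the PATH and prints the common "name => path (address)" layout; parse_ldd, ldd, and find_libs are hypothetical helper names.

```python
import re
import subprocess


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path.

    Libraries reported as 'not found' map to None.
    """
    libs = {}
    for line in output.splitlines():
        m = re.match(r"\s*(\S+)\s+=>\s+(.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$", line)
        if m:
            name, target = m.groups()
            libs[name] = None if target == "not found" else target
    return libs


def ldd(path):
    """Run ldd on path and return its parsed output (assumes ldd exists)."""
    out = subprocess.check_output(["ldd", path], stderr=subprocess.STDOUT)
    return parse_ldd(out.decode("utf-8", "ignore"))


def find_libs(libs, prefix):
    """Keep only the entries whose library name starts with prefix."""
    return {name: path for name, path in libs.items() if name.startswith(prefix)}


# Typical use on the board (commented out so the sketch runs anywhere):
# import _ctypes
# print(find_libs(ldd(_ctypes.__file__), "libffi"))
```

An empty result does not necessarily mean trouble: Python 2.7 can build _ctypes against its bundled copy of libffi, in which case no separate libffi shared object shows up.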

My Debugging Checklist

Here’s how I went about isolating the root cause:

  • Checked /proc/cpu/alignment logs for unaligned access reports.
  • Ran ldd _ctypes.so to confirm it linked against the correct libffi.
  • Rebuilt libffi and Python in Yocto with proper TUNE_FEATURES.
  • Enabled core dumps and got backtraces with gdb. Sure enough, the crashes were in the libffi callback trampolines.
  • Verified my filesystem integrity by hashing key stdlib files across boots.
  • Stress-tested with sockets to ensure no driver/memory corruption was creeping in.
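
The "hash across boots" step from the checklist can be sketched as a small manifest tool: hash a list of files, save the result as JSON, and diff it against the manifest from a previous boot. This is my own illustration rather than part of diag_bus_error.py; it uses SHA-256 where the script uses MD5, and all function names are assumptions.

```python
import hashlib
import json


def sha256_file(path):
    """Return the hex SHA-256 of a file, or None if it cannot be read."""
    try:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()
    except (IOError, OSError):
        return None


def build_manifest(paths):
    """Map each path to its digest (None for unreadable files)."""
    return {p: sha256_file(p) for p in paths}


def changed_files(old, new):
    """Sorted paths whose digest differs, or that exist in only one manifest."""
    return sorted(p for p in set(old) | set(new) if old.get(p) != new.get(p))


def save_manifest(manifest, path):
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)


def load_manifest(path):
    with open(path) as f:
        return json.load(f)
```

On a read-only or rarely updated root filesystem, any path that changed_files() returns between two boots deserves a closer look.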

Final Thoughts

This project taught me that bus errors on ARM are rarely random. They’re usually the result of misalignment, bad ABI settings, or mismatched libraries. In my case, a rebuild of libffi and Python with the correct Yocto tuning fixed the issue. If you’re stuck with Python 2.7, at least move to 2.7.18, the final 2.7 release; it has far fewer ctypes pitfalls than 2.7.3. If you can, upgrade to Python 3.x, where ARM support is actively maintained and generally more robust.
