Learn Zig Series (#17) - Packed Structs and Bit Manipulation

@scipio 71

3 months ago

StemSocial

What will I learn

You will learn how packed structs lay out fields with exact bit widths;
the difference between packed struct and regular struct memory layout;
bitwise operations in Zig: AND, OR, XOR, shifts, and where to use them;
reading and writing individual bits and bit ranges using packed structs;
@bitCast for reinterpreting bit patterns between types;
endianness considerations with std.mem.nativeToBig and friends;
real-world example: parsing a DNS header with packed structs;
common packed struct pitfalls and how to avoid them.

Requirements

A working modern computer running macOS, Windows or Ubuntu;
An installed Zig 0.14+ distribution (download from ziglang.org);
The ambition to learn Zig programming.

Difficulty

Intermediate

Curriculum (of the `Learn Zig Series`):

Learn Zig Series (#17) - Packed Structs and Bit Manipulation

Welcome back! In episode #16 we explored sentinel-terminated types and C strings -- the :0 in [:0]u8, how std.mem.span() converts C string pointers to Zig slices, allocSentinel for heap-allocated null-terminated strings, and the coercion chain from *const [N:0]u8 down to [*c]const u8. That covered the boundary between Zig's safe slices and C's null-terminated world.

Now we're going lower. Way lower. Down to individual bits.

If you've ever needed to parse a binary protocol (DNS, TCP, MQTT, USB descriptors), talk to hardware registers on a microcontroller, or pack boolean flags into a single byte to save memory -- this episode is for you. Regular structs in Zig have padding between fields for alignment (as we saw in ep008 when we looked at memory layout). Packed structs eliminate that padding entirely and let you specify field sizes down to the individual bit. A 1-bit field takes one bit. A 4-bit field takes four bits. No waste.

Here we go!

Solutions to Episode 16 Exercises

Before we get into bit manipulation, here are the solutions to last episode's exercises on sentinel types and C strings. Complete code, copy-paste-and-run.

Exercise 1 -- cstrlen and cstrcmp without std library functions:

const std = @import("std");
const testing = std.testing;

fn cstrlen(s: [*:0]const u8) usize {
    var i: usize = 0;
    while (s[i] != 0) : (i += 1) {}
    return i;
}

fn cstrcmp(a: [*:0]const u8, b: [*:0]const u8) i32 {
    var i: usize = 0;
    while (a[i] != 0 and b[i] != 0) : (i += 1) {
        if (a[i] < b[i]) return -1;
        if (a[i] > b[i]) return 1;
    }
    // One or both strings ended
    if (a[i] == 0 and b[i] == 0) return 0;
    if (a[i] == 0) return -1; // a is shorter
    return 1; // b is shorter
}

test "cstrlen basic" {
    try testing.expectEqual(@as(usize, 5), cstrlen("hello"));
    try testing.expectEqual(@as(usize, 0), cstrlen(""));
    try testing.expectEqual(@as(usize, 11), cstrlen("hello world"));
}

test "cstrcmp ordering" {
    try testing.expect(cstrcmp("apple", "banana") < 0);
    try testing.expect(cstrcmp("banana", "apple") > 0);
    try testing.expectEqual(@as(i32, 0), cstrcmp("hello", "hello"));
    try testing.expect(cstrcmp("", "") == 0);
    try testing.expect(cstrcmp("", "a") < 0);
    try testing.expect(cstrcmp("abc", "abcd") < 0);
}

The key insight: both functions scan byte-by-byte using the sentinel many-pointer. cstrlen counts until it hits zero. cstrcmp walks both strings in parallel, comparing character by character, and handles the case where one string is a prefix of the other (the shorter string sorts first).

Exercise 2 -- CStringBuilder with ArrayList:

const std = @import("std");
const testing = std.testing;

const CStringBuilder = struct {
    list: std.ArrayList(u8),

    pub fn init(allocator: std.mem.Allocator) CStringBuilder {
        return .{ .list = std.ArrayList(u8).init(allocator) };
    }

    pub fn append(self: *CStringBuilder, bytes: []const u8) !void {
        try self.list.appendSlice(bytes);
    }

    pub fn toOwnedSlice(self: *CStringBuilder) ![:0]u8 {
        try self.list.append(0); // add sentinel
        const slice = try self.list.toOwnedSlice();
        // slice includes the sentinel at the end
        // Return as [:0]u8 with len = total - 1 (excluding sentinel)
        return slice[0 .. slice.len - 1 :0];
    }

    pub fn deinit(self: *CStringBuilder) void {
        self.list.deinit();
    }
};

test "CStringBuilder assembles fragments" {
    var builder = CStringBuilder.init(testing.allocator);
    defer builder.deinit();

    try builder.append("hello");
    try builder.append(" ");
    try builder.append("world");

    const result = try builder.toOwnedSlice();
    defer testing.allocator.free(result);

    try testing.expectEqualStrings("hello world", result);
    try testing.expectEqual(@as(usize, 11), result.len);
    try testing.expectEqual(@as(u8, 0), result[result.len]);
}

test "CStringBuilder empty string" {
    var builder = CStringBuilder.init(testing.allocator);
    defer builder.deinit();

    const result = try builder.toOwnedSlice();
    defer testing.allocator.free(result);

    try testing.expectEqual(@as(usize, 0), result.len);
    try testing.expectEqual(@as(u8, 0), result[0]);
}

The trick is in toOwnedSlice: you append a zero byte, call toOwnedSlice() on the ArrayList (which gives you []u8), then sentinel-slice it with [0..len-1 :0]. The sentinel assertion holds because you just put that zero byte there.

Exercise 3 -- calling C's getenv("PATH") and splitting by ':':

const std = @import("std");
const c = @cImport({
    @cInclude("stdlib.h");
});

pub fn main() void {
    const path_ptr = c.getenv("PATH");
    if (path_ptr) |ptr| {
        const path = std.mem.span(ptr);
        var iter = std.mem.splitScalar(u8, path, ':');
        var count: usize = 0;
        while (iter.next()) |dir| {
            std.debug.print("[{d}] {s}\n", .{ count, dir });
            count += 1;
        }
        std.debug.print("\nTotal: {d} directories in PATH\n", .{count});
    } else {
        std.debug.print("PATH is not set\n", .{});
    }
}

Build with exe.linkLibC() in your build.zig. The translated getenv returns a C pointer ([*c]u8) that may be null -- the if unwraps it, std.mem.span converts it to a slice, and splitScalar handles the rest. On Windows you'd split by ; instead of :.

Right, on to packed structs ;-)

Regular structs vs packed structs

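When you define a regular struct in Zig, the compiler decides the memory layout, and fields get padded for alignment. We covered this in ep008 -- a struct { a: u8, b: u32, c: u8 } doesn't occupy 6 bytes as you might expect. Laid out in declaration order (which is what C does, and what extern struct guarantees in Zig), there are 3 bytes of padding after a to align b on a 4-byte boundary, then another 3 after c for struct alignment. Total: 12 bytes for 6 bytes of actual data. A plain Zig struct makes no layout promises, so the compiler may reorder the fields to cut the padding, but the result is still larger than the data.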
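To check those numbers yourself, here is a minimal sketch. The type names are made up for illustration. It only asserts the layouts Zig actually pins down (the 12-byte check assumes a typical target where u32 is 4-byte aligned) and just prints the size of the plain struct, since that one is the compiler's choice.

```zig
const std = @import("std");

// Same three fields, three layout strategies (names are illustrative).
const Auto = struct { a: u8, b: u32, c: u8 }; // layout chosen by the compiler
const CLayout = extern struct { a: u8, b: u32, c: u8 }; // declaration order, C ABI padding
const Bits = packed struct { a: u8, b: u32, c: u8 }; // 48 bits, no gaps between fields

test "layout comparison" {
    // 1 + 3 (padding) + 4 + 1 + 3 (padding) = 12 where u32 is 4-byte aligned
    try std.testing.expectEqual(@as(usize, 12), @sizeOf(CLayout));
    // The packed version carries exactly 8 + 32 + 8 bits of information
    try std.testing.expectEqual(@as(usize, 48), @bitSizeOf(Bits));
    // Not asserted: the compiler is free to reorder the fields of Auto
    std.debug.print("auto layout: {d} bytes\n", .{@sizeOf(Auto)});
}
```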

A packed struct eliminates all of that. Fields are placed consecutively in memory with no padding whatsoever, and you can specify field widths down to individual bits:

const std = @import("std");
const testing = std.testing;

const RegularFlags = struct {
    enabled: bool,     // 1 byte (bool is 1 byte in Zig)
    priority: u8,      // 1 byte
    mode: u8,          // 1 byte
};

const PackedFlags = packed struct {
    enabled: bool,     // 1 bit
    priority: u3,      // 3 bits (values 0-7)
    mode: u4,          // 4 bits (values 0-15)
};

test "size comparison" {
    // Regular struct: 3 bytes (no padding needed here since all u8)
    try testing.expectEqual(@as(usize, 3), @sizeOf(RegularFlags));

    // Packed struct: 1 byte! (1 + 3 + 4 = 8 bits = 1 byte)
    try testing.expectEqual(@as(usize, 1), @sizeOf(PackedFlags));
}

test "packed struct usage" {
    var flags = PackedFlags{
        .enabled = true,
        .priority = 5,
        .mode = 12,
    };

    try testing.expectEqual(true, flags.enabled);
    try testing.expectEqual(@as(u3, 5), flags.priority);
    try testing.expectEqual(@as(u4, 12), flags.mode);

    flags.priority = 7; // max for u3
    try testing.expectEqual(@as(u3, 7), flags.priority);
}

That PackedFlags struct is exactly 1 byte. One bool (1 bit), one u3 (3 bits), one u4 (4 bits) -- they pack together into 8 bits with zero waste. You read and write the fields like any normal struct; the compiler generates the bitwise masking and shifting behind the scenes.

This is the core value proposition of packed structs: you describe the layout declaratively, and the compiler handles the bit twiddling. No manual & 0x0F or << 4. No error-prone shift constants that break when you add a field. The type system knows exactly where each field is and how wide it is.

Bitwise operations refresher

Before we go further with packed structs, let's make sure the foundational bitwise operations are clear. You'll need these whenever you work with raw bit patterns, even outside of packed structs:

const std = @import("std");
const testing = std.testing;

test "bitwise AND -- masking bits" {
    const value: u8 = 0b1101_0110;
    const mask: u8 = 0b0000_1111;
    // AND keeps only bits that are 1 in BOTH operands
    try testing.expectEqual(@as(u8, 0b0000_0110), value & mask);
    // Common use: extract the lower 4 bits
}

test "bitwise OR -- setting bits" {
    const value: u8 = 0b1100_0000;
    const flags: u8 = 0b0000_0011;
    // OR sets bits that are 1 in EITHER operand
    try testing.expectEqual(@as(u8, 0b1100_0011), value | flags);
    // Common use: turn on specific bits without affecting others
}

test "bitwise XOR -- toggling bits" {
    const value: u8 = 0b1010_1010;
    const toggle: u8 = 0b1111_0000;
    // XOR flips bits that are 1 in the second operand
    try testing.expectEqual(@as(u8, 0b0101_1010), value ^ toggle);
    // Common use: toggle flags, simple checksums
}

test "bitwise NOT -- inverting all bits" {
    const value: u8 = 0b1010_1010;
    try testing.expectEqual(@as(u8, 0b0101_0101), ~value);
}

test "left shift -- multiply by powers of 2" {
    const value: u8 = 0b0000_0011; // 3
    try testing.expectEqual(@as(u8, 0b0000_1100), value << 2); // 3 * 4 = 12
    // Each left shift doubles the value
}

test "right shift -- divide by powers of 2" {
    const value: u8 = 0b0000_1100; // 12
    try testing.expectEqual(@as(u8, 0b0000_0011), value >> 2); // 12 / 4 = 3
    // Each right shift halves the value (integer division)
}

Zig uses the same operators as C: & (AND), | (OR), ^ (XOR), ~ (NOT), << (left shift), >> (right shift). One important difference: Zig's shift operators require the shift amount to be a u3 (for u8), u4 (for u16), u5 (for u32), etc. -- meaning you CAN'T shift by the full bit width of the type or more. In C, shifting a 32-bit unsigned int by 32 or more is undefined behavior. In Zig a comptime-known shift amount that is too large is a compile error, and a runtime amount has to be narrowed to the small type explicitly first. Safety first.
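A small sketch of what that means for a shift amount that is only known at runtime. shiftLeftBy is a made-up helper for this example; @intCast narrows the amount with a safety check, and std.math.shl is the standard library routine that accepts any amount.

```zig
const std = @import("std");

// Shift a u8 left by a runtime amount. The amount must be a u3,
// so a wider integer has to be narrowed explicitly first.
fn shiftLeftBy(value: u8, amount: usize) u8 {
    const narrowed: u3 = @intCast(amount); // safety-checked: amount must be 0-7
    return value << narrowed;
}

test "runtime shift amounts" {
    try std.testing.expectEqual(@as(u8, 0b0001_1000), shiftLeftBy(0b0000_0011, 3));
    // std.math.shl takes any amount and returns 0 once every bit is shifted out
    try std.testing.expectEqual(@as(u8, 0), std.math.shl(u8, 0xFF, @as(usize, 9)));
}
```

Which one to reach for depends on whether an out-of-range amount is a bug (use @intCast and let the safety check catch it) or a legitimate input (use std.math.shl).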

Field sizes and integer types

One of the beautiful things about Zig is that integer types aren't limited to 8, 16, 32, 64. You can have u1, u3, u7, u12, u24 -- any width you need. Packed structs take full advantage of this:

const std = @import("std");
const testing = std.testing;

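const Color = packed struct {
    // First field = lowest bits, so blue comes first to get the
    // standard RGB565 bit layout (red << 11) | (green << 5) | blue.
    blue: u5,       // bits 0-4   (0-31)
    green: u6,      // bits 5-10  (0-63)
    red: u5,        // bits 11-15 (0-31)
};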

test "RGB565 color format" {
    // RGB565 is a common 16-bit color format in embedded displays
    try testing.expectEqual(@as(usize, 2), @sizeOf(Color));

    const bright_red = Color{ .red = 31, .green = 0, .blue = 0 };
    const pure_green = Color{ .red = 0, .green = 63, .blue = 0 };
    const white = Color{ .red = 31, .green = 63, .blue = 31 };

    try testing.expectEqual(@as(u5, 31), bright_red.red);
    try testing.expectEqual(@as(u6, 63), pure_green.green);
    try testing.expectEqual(@as(u5, 31), white.blue);
}

const Permissions = packed struct {
    read: bool,      // 1 bit
    write: bool,     // 1 bit
    execute: bool,   // 1 bit
    _padding: u5 = 0, // 5 bits to fill the byte
};

test "Unix-style permission bits" {
    try testing.expectEqual(@as(usize, 1), @sizeOf(Permissions));

    const rwx = Permissions{ .read = true, .write = true, .execute = true };
    const rx = Permissions{ .read = true, .write = false, .execute = true };

    try testing.expect(rwx.read);
    try testing.expect(rwx.write);
    try testing.expect(!rx.write);
}

That Color struct maps directly to the RGB565 pixel format used in many embedded LCD displays. 5 bits red, 6 bits green (human eyes are more sensitive to green), 5 bits blue. 16 bits total, exactly 2 bytes, no waste. If you were writing a display driver in C, you'd be doing (red << 11) | (green << 5) | blue everywhere. In Zig you just assign to .red, .green, .blue and the packed struct handles the layout.

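The _padding field with a default value of 0 is a common pattern. Zig doesn't strictly require a packed struct to add up to a whole number of bytes (three bools on their own would simply be backed by a u3), but registers and wire formats come in whole bytes, and @bitCast to or from a u8 only works when the struct is exactly 8 bits wide. So if your fields don't naturally sum to a byte boundary, you add explicit padding. The underscore prefix tells the reader (and some linters) this field is intentionally unused.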

@bitCast -- reinterpreting bit patterns

@bitCast is how you convert between types that have the same bit layout. It doesn't change any bits -- it just tells the compiler "interpret these exact bits as a different type":

const std = @import("std");
const testing = std.testing;

const StatusByte = packed struct {
    error_code: u4,
    warning: bool,
    ready: bool,
    busy: bool,
    valid: bool,
};

test "bitCast between packed struct and integer" {
    // Build a status byte from fields
    const status = StatusByte{
        .error_code = 3,
        .warning = false,
        .ready = true,
        .busy = false,
        .valid = true,
    };

    // Reinterpret as a raw u8
    const raw: u8 = @bitCast(status);
    // error_code=3 is 0011, warning=0, ready=1, busy=0, valid=1
    // Layout (LSB first): 0011 0 1 0 1 = 0b1010_0011 = 0xA3
    try testing.expectEqual(@as(u8, 0b1010_0011), raw);

    // Go the other direction -- raw byte to structured fields
    const parsed: StatusByte = @bitCast(@as(u8, 0b1010_0011));
    try testing.expectEqual(@as(u4, 3), parsed.error_code);
    try testing.expect(!parsed.warning);
    try testing.expect(parsed.ready);
    try testing.expect(!parsed.busy);
    try testing.expect(parsed.valid);
}

test "bitCast for floating point inspection" {
    // You can also bitCast between float and int of the same size
    const pi: f32 = 3.14159;
    const bits: u32 = @bitCast(pi);
    // IEEE 754 representation of pi
    try testing.expectEqual(@as(u32, 0x40490FD0), bits);

    // And back
    const reconstructed: f32 = @bitCast(bits);
    try testing.expectApproxEqAbs(pi, reconstructed, 0.00001);
}

This is incredibly powerful. You receive a byte from a hardware register or network packet, @bitCast it to a packed struct, and now you have named fields. No manual masking. No shift arithmetic. No "wait, is the ready bit at position 5 or 6?" -- the struct definition IS the documentation.

The rule for @bitCast is simple: both types must have exactly the same size in bits. A u8 can be cast to any packed struct that's 8 bits. A u16 can be cast to any 16-bit packed struct. The compiler checks this at compile time.

Endianness: when byte order matters

Here's where it gets tricky, and where I've seen a lot of confusion (including from myself, honestly, the first time I dealt with network protocols in a systems language). Packed structs in Zig use the native byte order of your CPU. On x86 and ARM (which is nearly everything these days), that's little-endian -- the least significant byte comes first in memory.

But network protocols (TCP/IP, DNS, HTTP/2) use big-endian (most significant byte first). So when you receive bytes from the network and want to interpret them as a packed struct, you might need to swap bytes first:

const std = @import("std");
const testing = std.testing;

test "endianness with std.mem" {
    // Network byte order (big-endian) value: 0x1234
    const network_value: u16 = 0x1234;

    // Convert from big-endian to native (little-endian on x86/ARM)
    const native = std.mem.bigToNative(u16, network_value);
    // On little-endian: bytes are swapped
    // On big-endian: no change

    // Convert from native back to big-endian for sending
    const back_to_network = std.mem.nativeToBig(u16, native);
    try testing.expectEqual(network_value, back_to_network);
}

test "byte swapping a u32" {
    const big_endian: u32 = 0xDEADBEEF;
    const swapped = @byteSwap(big_endian);
    try testing.expectEqual(@as(u32, 0xEFBEADDE), swapped);
}

std.mem.bigToNative and std.mem.nativeToBig are the functions you'll use most often. They're no-ops on big-endian architectures and byte-swaps on little-endian. @byteSwap is the raw builtin that always swaps, regardless of platform.

The important rule: always convert to native byte order BEFORE @bitCasting to a packed struct, and convert back to network byte order AFTER @bitCasting from a packed struct. If you forget this step, all your multi-byte fields will have their bytes reversed and nothing will make sense.

For single-byte packed structs (like our StatusByte earlier), endianness doesn't matter -- there's only one byte, so there's nothing to reorder.
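Here is a minimal sketch of that rule as a round trip. VersionLength is a hypothetical wire field invented for this example, not part of any real protocol. std.mem.readInt and std.mem.writeInt with .big do the byte-order conversion, so the value is in native order exactly at the point where @bitCast happens.

```zig
const std = @import("std");

// Hypothetical 16-bit wire field, sent big-endian:
// version in the top 4 bits, length in the low 12 bits.
const VersionLength = packed struct(u16) {
    length: u12, // bits 0-11 (first field = lowest bits)
    version: u4, // bits 12-15
};

// Wire bytes -> native u16 -> packed struct
fn decode(wire: *const [2]u8) VersionLength {
    return @bitCast(std.mem.readInt(u16, wire, .big));
}

// Packed struct -> native u16 -> wire bytes
fn encode(value: VersionLength) [2]u8 {
    var out: [2]u8 = undefined;
    std.mem.writeInt(u16, &out, @as(u16, @bitCast(value)), .big);
    return out;
}

test "round trip through network byte order" {
    const wire = [2]u8{ 0x40, 0x1C }; // 0x401C: version 4, length 28
    const parsed = decode(&wire);
    try std.testing.expectEqual(@as(u4, 4), parsed.version);
    try std.testing.expectEqual(@as(u12, 28), parsed.length);
    try std.testing.expectEqualSlices(u8, &wire, &encode(parsed));
}
```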

Real-world example: parsing DNS headers

Let's put it all together with a real protocol. A DNS header is 12 bytes with a very specific bit layout. Every DNS query and response starts with this header. The structure (from RFC 1035) is:

                                1  1  1  1  1  1
  0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                         |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Here's how we model this in Zig:

const std = @import("std");
const testing = std.testing;

const DnsFlags = packed struct(u16) {
    // Bits are ordered LSB-first in packed structs, but DNS is big-endian.
    // We define fields in order from LSB to MSB of the native representation.
    // After byte-swapping the raw u16 from big-endian to native, the layout is:
    rcode: u4,       // Response code
    z: u3,           // Reserved (must be 0)
    ra: bool,        // Recursion Available
    rd: bool,        // Recursion Desired
    tc: bool,        // Truncation
    aa: bool,        // Authoritative Answer
    opcode: u4,      // Operation code
    qr: bool,        // Query (0) or Response (1)
};

const DnsHeader = struct {
    id: u16,
    flags: DnsFlags,
    qdcount: u16,
    ancount: u16,
    nscount: u16,
    arcount: u16,

    pub fn parse(bytes: *const [12]u8) DnsHeader {
        // readInt with .big already returns the value in native order, so no
        // extra bigToNative is needed (that would swap the bytes a second
        // time on little-endian CPUs).
        return .{
            .id = std.mem.readInt(u16, bytes[0..2], .big),
            .flags = @bitCast(std.mem.readInt(u16, bytes[2..4], .big)),
            .qdcount = std.mem.readInt(u16, bytes[4..6], .big),
            .ancount = std.mem.readInt(u16, bytes[6..8], .big),
            .nscount = std.mem.readInt(u16, bytes[8..10], .big),
            .arcount = std.mem.readInt(u16, bytes[10..12], .big),
        };
    }

    pub fn isResponse(self: DnsHeader) bool {
        return self.flags.qr;
    }

    pub fn isAuthoritative(self: DnsHeader) bool {
        return self.flags.aa;
    }

    pub fn getResponseCode(self: DnsHeader) u4 {
        return self.flags.rcode;
    }
};

test "parse a DNS query header" {
    // A real DNS query for example.com
    // ID=0x1234, QR=0(query), Opcode=0(standard), RD=1, QDCOUNT=1
    const raw = [12]u8{
        0x12, 0x34, // ID
        0x01, 0x00, // Flags: RD=1, rest=0
        0x00, 0x01, // QDCOUNT=1
        0x00, 0x00, // ANCOUNT=0
        0x00, 0x00, // NSCOUNT=0
        0x00, 0x00, // ARCOUNT=0
    };

    const header = DnsHeader.parse(&raw);

    try testing.expectEqual(@as(u16, 0x1234), header.id);
    try testing.expect(!header.isResponse());
    try testing.expect(header.flags.rd);
    try testing.expect(!header.flags.aa);
    try testing.expectEqual(@as(u4, 0), header.getResponseCode());
    try testing.expectEqual(@as(u16, 1), header.qdcount);
    try testing.expectEqual(@as(u16, 0), header.ancount);
}

test "parse a DNS response header" {
    // A DNS response: QR=1, AA=1, RD=1, RA=1, RCODE=0 (no error)
    const raw = [12]u8{
        0x12, 0x34, // ID (matches the query)
        0x85, 0x80, // Flags: QR=1, AA=1, RD=1, RA=1
        0x00, 0x01, // QDCOUNT=1
        0x00, 0x02, // ANCOUNT=2
        0x00, 0x00, // NSCOUNT=0
        0x00, 0x00, // ARCOUNT=0
    };

    const header = DnsHeader.parse(&raw);

    try testing.expect(header.isResponse());
    try testing.expect(header.isAuthoritative());
    try testing.expect(header.flags.rd);
    try testing.expect(header.flags.ra);
    try testing.expectEqual(@as(u4, 0), header.getResponseCode());
    try testing.expectEqual(@as(u16, 2), header.ancount);
}

Notice a few things happening here. The DnsFlags is a packed struct(u16) -- the (u16) tells Zig this packed struct is backed by a u16 (16 bits). Fields are defined from LSB to MSB in the native representation. Because DNS uses big-endian, the two flag bytes have to be turned into a native-order u16 before @bitCasting to DnsFlags.

The DnsHeader is a regular struct (not packed) that contains the parsed fields. std.mem.readInt with .big interprets the raw bytes as big-endian and returns the value in the platform's native order, so one call handles both the read and the endianness conversion. Wrapping that result in bigToNative as well would swap the bytes a second time on little-endian machines; bigToNative is for values that are still sitting in memory in wire order, such as a u16 loaded straight from a buffer through a pointer cast.

Compare this to the C approach where you'd have a mess of ntohs() calls and manual bitmask operations: (flags >> 15) & 1 for QR, (flags >> 11) & 0xF for opcode, etc. With packed structs, you just say header.flags.qr and the compiler does the right thing.
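To make that comparison concrete, the sketch below puts the C-style extraction next to the packed struct and checks that both agree. The manual* helper names are invented for this example, and the raw value is assumed to be in native order already.

```zig
const std = @import("std");

// Same field order as the DnsFlags struct above
const DnsFlags = packed struct(u16) {
    rcode: u4,
    z: u3,
    ra: bool,
    rd: bool,
    tc: bool,
    aa: bool,
    opcode: u4,
    qr: bool,
};

// C-style extraction with shifts and masks (helper names are illustrative)
fn manualQr(raw: u16) bool {
    return ((raw >> 15) & 1) == 1;
}

fn manualOpcode(raw: u16) u4 {
    return @truncate(raw >> 11); // keep the low 4 bits after shifting
}

fn manualRcode(raw: u16) u4 {
    return @truncate(raw);
}

test "manual masks agree with the packed struct" {
    const raw: u16 = 0xA903; // QR=1, Opcode=5, RD=1, RCODE=3
    const flags: DnsFlags = @bitCast(raw);
    try std.testing.expectEqual(manualQr(raw), flags.qr);
    try std.testing.expectEqual(manualOpcode(raw), flags.opcode);
    try std.testing.expectEqual(manualRcode(raw), flags.rcode);
}
```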

Packed struct quirks and limitations

Packed structs are powerful but they have some restrictions you need to know about:

You can't take normal pointers to fields. Because fields might not be byte-aligned (a u3 field starts at bit 5, for instance), &my_packed.field on such a field doesn't give you an ordinary pointer. Zig hands back a special pointer type that carries the bit offset, and the compiler rejects it wherever a plain *T is expected:

const std = @import("std");

const Packed = packed struct {
    a: u3,
    b: u5,
    c: u8,
};

test "pointer restrictions" {
    var p = Packed{ .a = 1, .b = 2, .c = 3 };

    // This works -- c is byte-aligned (starts at bit 8)
    const ptr_c: *u8 = &p.c;
    ptr_c.* = 42;

    // This would NOT compile:
    // const ptr_a: *u3 = &p.a;  // error: type mismatch, &p.a is a bit-offset pointer, not a *u3
    // Because a is only 3 bits wide, &p.a is not an ordinary byte pointer.

    // You can still read and write through the struct though
    p.a = 7;
    _ = p.a;
}
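When a helper needs to modify a sub-byte field, one way around the restriction is to pass a pointer to the whole packed struct and let the compiler do the masking. A minimal sketch (bumpA is a made-up helper for this example):

```zig
const std = @import("std");

const Packed = packed struct {
    a: u3,
    b: u5,
    c: u8,
};

// Takes a pointer to the whole struct instead of a pointer to the 3-bit field.
fn bumpA(p: *Packed) void {
    p.a +%= 1; // wrapping add on the 3-bit field
}

test "modify a sub-byte field through the struct pointer" {
    var p = Packed{ .a = 7, .b = 2, .c = 3 };
    bumpA(&p);
    try std.testing.expectEqual(@as(u3, 0), p.a); // 7 + 1 wraps around to 0
    try std.testing.expectEqual(@as(u5, 2), p.b); // neighbouring bits untouched
    try std.testing.expectEqual(@as(u8, 3), p.c);
}
```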

Packed structs can contain other packed structs (nesting), but NOT regular structs. The inner struct must also be packed because the compiler needs to guarantee the bit layout:

const Inner = packed struct {
    x: u4,
    y: u4,
};

const Outer = packed struct {
    header: Inner,    // OK -- inner is packed
    data: u8,
};
// Total: 16 bits (4 + 4 + 8)
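A quick check of how the nested fields land in the backing integer, using the same Inner and Outer shapes. The values are arbitrary and chosen so each field is easy to spot in the hex result.

```zig
const std = @import("std");

const Inner = packed struct { x: u4, y: u4 };
const Outer = packed struct { header: Inner, data: u8 };

test "nested packed struct layout" {
    const o = Outer{ .header = .{ .x = 0x1, .y = 0x2 }, .data = 0xAB };
    const raw: u16 = @bitCast(o);
    // header fills bits 0-7 (x in the low nibble, y in the high nibble),
    // data fills bits 8-15
    try std.testing.expectEqual(@as(u16, 0xAB21), raw);
}
```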

Default values work as usual. You can give packed struct fields default values, and they behave exactly as they do in a regular struct:

const std = @import("std");

const ConfigRegister = packed struct {
    enabled: bool = false,
    mode: u3 = 0,
    speed: u4 = 8,     // default speed setting
};

test "defaults in packed struct" {
    const cfg = ConfigRegister{};
    // enabled=false, mode=0, speed=8
    try std.testing.expect(!cfg.enabled);
    try std.testing.expectEqual(@as(u3, 0), cfg.mode);
    try std.testing.expectEqual(@as(u4, 8), cfg.speed);
}

The backing integer type. You can explicitly declare what integer type backs a packed struct with packed struct(uN). If you don't specify, Zig infers it from the total bit width. For 8 bits you get u8, for 16 bits u16, etc. If your fields don't add up to a standard width (8, 16, 32, 64), Zig still supports it -- but you lose the ability to easily @bitCast to and from standard integers.

Working with hardware registers

If you're doing embedded programming, packed structs are how you define memory-mapped hardware registers. Here's a simplified example modeled after a typical UART peripheral:

const std = @import("std");
const testing = std.testing;

const UartStatus = packed struct(u8) {
    tx_empty: bool,     // Transmit buffer empty
    tx_complete: bool,  // Transmission complete
    rx_ready: bool,     // Data received and ready
    overrun: bool,      // Receive buffer overrun error
    framing_err: bool,  // Framing error detected
    parity_err: bool,   // Parity error detected
    _reserved: u2 = 0,  // Reserved bits
};

const UartControl = packed struct(u8) {
    tx_enable: bool,    // Enable transmitter
    rx_enable: bool,    // Enable receiver
    parity: u2,         // 0=none, 1=odd, 2=even
    stop_bits: u1,      // 0=one stop bit, 1=two
    word_length: u2,    // 0=5bit, 1=6bit, 2=7bit, 3=8bit
    _reserved: u1 = 0,
};

fn readRegister(addr: usize) u8 {
    // In real embedded code, this would be a volatile read from hardware
    // For demonstration, we simulate with a value
    _ = addr;
    return 0b0000_0101; // tx_empty=true, rx_ready=true
}

test "interpret hardware register" {
    const raw = readRegister(0x4000_1000);
    const status: UartStatus = @bitCast(raw);

    try testing.expect(status.tx_empty);
    try testing.expect(!status.tx_complete);
    try testing.expect(status.rx_ready);
    try testing.expect(!status.overrun);
}

test "configure UART" {
    const ctrl = UartControl{ // const: never mutated, and Zig rejects an unmutated var
        .tx_enable = true,
        .rx_enable = true,
        .parity = 0,       // no parity
        .stop_bits = 0,    // one stop bit
        .word_length = 3,  // 8-bit data
    };

    const raw: u8 = @bitCast(ctrl);
    // In real code: write raw to the control register address
    // writeRegister(0x4000_1004, raw);

    try testing.expect(ctrl.tx_enable);
    try testing.expectEqual(@as(u2, 3), ctrl.word_length);
    _ = raw;
}

In production embedded Zig, you'd use @as(*volatile u8, @ptrFromInt(address)) to read/write the actual hardware registers. The packed struct gives you named access to individual bits and bit-fields without manual masking. When you write status.rx_ready the compiler generates the same kind of shift-and-mask code as (raw >> 2) & 1 -- but your source code is readable and maintainable.
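A sketch of what the volatile side could look like, with the pointer typed as the packed struct itself so the read-modify-write stays readable. enableTransmitter is a made-up helper, and the test uses an ordinary variable as a stand-in for the hardware, since there is no real register to talk to here.

```zig
const std = @import("std");

const UartControl = packed struct(u8) {
    tx_enable: bool,
    rx_enable: bool,
    parity: u2,
    stop_bits: u1,
    word_length: u2,
    _reserved: u1 = 0,
};

// Read-modify-write through a volatile pointer typed as the packed struct.
fn enableTransmitter(reg: *volatile UartControl) void {
    var value = reg.*; // one volatile read of the whole byte
    value.tx_enable = true;
    reg.* = value; // one volatile write of the whole byte
}

test "volatile register access (simulated)" {
    // Stand-in for the hardware register. On a real target the pointer would
    // come from something like
    // @as(*volatile UartControl, @ptrFromInt(0x4000_1004)).
    var fake_register: UartControl = @bitCast(@as(u8, 0b0110_0010));
    enableTransmitter(&fake_register);
    try std.testing.expectEqual(@as(u8, 0b0110_0011), @as(u8, @bitCast(fake_register)));
}
```

Copying the register into a local, changing the field, and writing it back keeps the hardware access to exactly one read and one write, which is usually what a peripheral expects.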

Combining packed structs with comptime

Remember comptime from ep009? You can generate packed struct types at compile time, which is useful for protocol parsers that need to handle variable-width fields:

const std = @import("std");
const testing = std.testing;

fn BitField(comptime widths: []const u8) type {
    comptime var total: usize = 0;
    for (widths) |w| total += w;

    if (total % 8 != 0)
        @compileError("total bit width must be a multiple of 8");

    // For simplicity, return a u8/u16/u32 based on total width
    return switch (total) {
        8 => u8,
        16 => u16,
        32 => u32,
        else => @compileError("unsupported total width"),
    };
}

test "comptime bit width calculation" {
    const T = BitField(&.{ 1, 3, 4 }); // 8 bits total
    try testing.expectEqual(@as(usize, 1), @sizeOf(T));

    const T2 = BitField(&.{ 4, 4, 8 }); // 16 bits total
    try testing.expectEqual(@as(usize, 2), @sizeOf(T2));
}

This is a simplified example, but the pattern extends to generating full packed struct types at comptime using @Type and std.builtin.Type. Real protocol libraries use this to define message formats from schema descriptions at compile time -- the schema drives the type generation, and you get zero-cost parsing with full type safety.

Common mistakes with packed structs

Here are the mistakes I've made (and seen others make) most often:

Mistake 1: Forgetting about endianness. If your packed struct represents network data, you MUST convert the raw value to native byte order before @bitCast. If it represents a hardware register on the local CPU, you probably don't need to swap (the CPU reads its own registers in native order). Get this wrong and every multi-byte field is garbled.

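Mistake 2: Fields not summing to a byte boundary. If your fields add up to, say, 11 bits, the struct still compiles (Zig backs it with a u11), but it no longer matches the 16-bit value you actually read from the wire or register, and @bitCast to or from a u16 is rejected. Add explicit padding to reach 16:

// WRONG for a 16-bit register -- only 11 bits, can't be @bitCast to/from u16
// const Bad = packed struct { a: u3, b: u4, c: u4 };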

// CORRECT -- pad to 16 bits
const Good = packed struct(u16) {
    a: u3,
    b: u4,
    c: u4,
    _pad: u5 = 0,
};
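The difference is easy to verify. In this sketch Odd is the unpadded 11-bit version (a name made up for this example) and Good is the padded struct from above; the field values are arbitrary.

```zig
const std = @import("std");

const Odd = packed struct { a: u3, b: u4, c: u4 }; // 11 bits, backed by a u11
const Good = packed struct(u16) { a: u3, b: u4, c: u4, _pad: u5 = 0 };

test "bit width decides what you can bitCast to" {
    try std.testing.expectEqual(@as(usize, 11), @bitSizeOf(Odd));
    try std.testing.expectEqual(@as(usize, 16), @bitSizeOf(Good));

    // The unpadded struct only converts to the odd-sized integer...
    const odd_raw: u11 = @bitCast(Odd{ .a = 1, .b = 2, .c = 3 });
    // ...while the padded one converts to a plain u16.
    const good_raw: u16 = @bitCast(Good{ .a = 1, .b = 2, .c = 3 });

    // a = 1 (bits 0-2), b = 2 (bits 3-6), c = 3 (bits 7-10): 1 + 16 + 384
    try std.testing.expectEqual(@as(u11, 401), odd_raw);
    try std.testing.expectEqual(@as(u16, 401), good_raw);
}
```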

Mistake 3: Assuming field order matches bit order. In Zig's packed structs, the first field occupies the LOWEST bits. So in packed struct { a: u4, b: u4 }, field a is bits 0-3 and field b is bits 4-7. This is important when you @bitCast to an integer -- a is in the low nibble, not the high nibble. C bitfields have implementation-defined ordering (seriously -- different compilers can lay out the same bitfield struct differently). Zig's ordering is always defined and consistent.
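A two-field sketch makes the ordering visible. Nibbles and pack are names invented for this example; pack is the manual shift-and-OR that the packed struct is equivalent to.

```zig
const std = @import("std");

const Nibbles = packed struct(u8) { a: u4, b: u4 };

// The manual equivalent: a in the low nibble, b in the high nibble
fn pack(a: u4, b: u4) u8 {
    return (@as(u8, b) << 4) | a;
}

test "first field lives in the lowest bits" {
    const raw: u8 = @bitCast(Nibbles{ .a = 0x1, .b = 0x2 });
    try std.testing.expectEqual(@as(u8, 0x21), raw);
    try std.testing.expectEqual(pack(0x1, 0x2), raw);
}
```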

Mistake 4: Using packed structs when you don't need them. Packed structs have restrictions (no field pointers for non-byte-aligned fields, must use packed inner structs). If you're just grouping related data and don't need a specific memory layout, use a regular struct. Only reach for packed struct when you actually need bit-level control over the layout.

Exercises

Define a packed struct IpFlags that represents the 3-bit flags field and 13-bit fragment offset from an IPv4 header (16 bits total). The flags are: reserved (1 bit, always 0), don't fragment (1 bit), and more fragments (1 bit). The fragment offset is 13 bits. Write a function parseIpFlagsOffset(raw: u16) IpFlags that takes a big-endian u16 and returns the parsed struct (remember to handle endianness). Test with known values: a packet with DF=1 should have dont_fragment == true, and a packet with fragment offset 185 should parse correctly.
Build a BitVector backed by a u64 that supports set(bit_index), clear(bit_index), isSet(bit_index) bool, and count() u7 (popcount -- number of set bits). Use bitwise operations directly (AND, OR, shifts) rather than packed structs for this one. Test by setting bits 0, 5, 63, verifying they're set, clearing bit 5, and checking the count. Bonus: implement iterator() that yields the indices of all set bits in ascending order.
Define a packed struct for an I2C transaction byte that has: a 7-bit device address and a 1-bit read/write flag (0=write, 1=read), totaling 8 bits. Write buildAddress(device: u7, read: bool) u8 that constructs the address byte, and parseAddress(raw: u8) struct { device: u7, rw: bool } that decodes it. Then use @bitCast to verify that your manual construction matches the packed struct interpretation. Test with device address 0x50 (a common EEPROM address) and both read and write modes.

So, what did we learn?

Packed structs eliminate padding and let you specify field sizes down to individual bits. A packed struct { a: u3, b: u5 } is exactly 1 byte, and you read/write fields by name while the compiler generates the bit masking.
Regular structs add padding for alignment; packed structs pack fields consecutively. Use packed struct only when you need exact control over the memory layout (hardware registers, binary protocols, compact flags).
Zig supports arbitrary integer widths (u1, u3, u5, u12, etc.) which packed struct fields take full advantage of. The compiler enforces that values fit -- assigning 8 to a u3 is a compile error.
@bitCast reinterprets bits between types of the same size without changing any data. Cast a u8 to a packed struct(u8) and you get named fields; cast the struct back and you get the raw byte. Zero cost.
Endianness matters when parsing network or file data. Use std.mem.bigToNative / nativeToBig before/after @bitCast. Single-byte packed structs don't need byte swapping.
Bitwise operators (&, |, ^, ~, <<, >>) work like C but with the shift amount constrained by its type. Zig won't let you shift by the type's bit width or more.
Pointers to non-byte-aligned packed struct fields are not ordinary pointers (the compiler rejects them where a plain *T is expected). Packed structs can nest other packed structs but not regular structs.
The combination of packed structs, @bitCast, and Zig's arbitrary-width integers makes parsing binary protocols significantly cleaner than the C approach of manual masks and shifts -- and the compiler generates identical machine code.

Next time we're changing gears entirely. We've been dealing with synchronous, one-thing-at-a-time code this entire series. But real programs need to handle multiple things concurrently -- waiting for network data while processing user input, managing timeouts, multiplexing I/O. We'll start looking at how Zig approaches concurrency and what event-driven programming looks like in a language without a runtime ;-)

Thanks, and until next time!

@scipio
