Why you shouldn’t use __attribute__((packed))

gcc supports an extension that allows structures or structure members to be marked with __attribute__((packed)), which tells gcc to leave out all padding between members. Sometimes you need this, when you want to make sure of the layout of a structure. For example, you might have something like

    struct my_struct {
            uint8_t  field1;
            uint16_t field2;
    } __attribute__((packed));

Without the packed attribute, the struct will have padding between field1 and field2, and that’s no good if this struct is something that has to match hardware or be sent on a wire.

However, it’s actively harmful to add the attribute to a structure that’s already going to be laid out with no padding. Sometimes, when a structure needs to be laid out without padding (because of hardware or wire protocol), people are tempted to add the attribute to a struct like the following “just to let the compiler know”

    struct my_struct {
            uint32_t  field1;
            uint32_t field2;
    };

But adding __attribute__((packed)) goes further than just telling gcc that it shouldn’t add padding – it also tells gcc that it can’t make any assumptions about the alignment of accesses to structure members. And this leads to disastrously bad code on some architectures.

To see this, consider the simple code

    struct foo { int a; };
    struct bar { int b; } __attribute__((packed));

    int c(struct foo *x) { return x->a; }
    int d(struct bar *x) { return x->b; }

On architectures like x86, x86-64 and powerpc, both functions generate the same code. But take a look at what happens on ia64:

   0000000000000000 <c>:
      0:       13 40 00 40 10 10       [MBB]       ld4 r8=[r32]
      6:       00 00 00 00 10 80                   nop.b 0x0
      c:       08 00 84 00                         br.ret.sptk.many b0;;

   0000000000000010 <d>:
     10:       09 70 00 40 00 21       [MMI]       mov r14=r32
     16:       f0 10 80 00 42 00                   adds r15=2,r32
     1c:       34 00 01 84                         adds r32=3,r32;;
     20:       19 80 04 1c 00 14       [MMB]       ld1 r16=[r14],1
     26:       f0 00 3c 00 20 00                   ld1 r15=[r15]
     2c:       00 00 00 20                         nop.b 0x0;;
     30:       09 70 00 1c 00 10       [MMI]       ld1 r14=[r14]
     36:       80 00 80 00 20 e0                   ld1 r8=[r32]
     3c:       f1 78 bd 53                         shl r15=r15,16;;
     40:       01 00 00 00 01 00       [MII]       nop.m 0x0
     46:       e0 70 dc ee 29 00                   shl r14=r14,8
     4c:       81 38 9d 53                         shl r8=r8,24;;
     50:       0b 70 40 1c 0e 20       [MMI]       or r14=r16,r14;;
     56:       f0 70 3c 1c 40 00                   or r15=r14,r15
     5c:       00 00 04 00                         nop.i 0x0;;
     60:       11 00 00 00 01 00       [MIB]       nop.m 0x0
     66:       80 78 20 1c 40 80                   or r8=r15,r8
     6c:       08 00 84 00                         br.ret.sptk.many b0;;

gcc gets scared about unaligned accesses and generates six times as much code (96 bytes vs. 16 bytes)! sparc64 goes similarly crazy, bloating from 12 bytes to 52 bytes:

   0000000000000000 <c>:
      0:       81 c3 e0 08     retl
      4:       d0 42 00 00     ldsw  [ %o0 ], %o0
      8:       30 68 00 06     b,a   %xcc, 20 <d>

   0000000000000020 <d>:
     20:       c6 0a 00 00     ldub  [ %o0 ], %g3
     24:       c2 0a 20 01     ldub  [ %o0 + 1 ], %g1
     28:       c4 0a 20 02     ldub  [ %o0 + 2 ], %g2
     2c:       87 28 f0 18     sllx  %g3, 0x18, %g3
     30:       d0 0a 20 03     ldub  [ %o0 + 3 ], %o0
     34:       83 28 70 10     sllx  %g1, 0x10, %g1
     38:       82 10 40 03     or  %g1, %g3, %g1
     3c:       85 28 b0 08     sllx  %g2, 8, %g2
     40:       84 10 80 01     or  %g2, %g1, %g2
     44:       90 12 00 02     or  %o0, %g2, %o0
     48:       81 c3 e0 08     retl
     4c:       91 3a 20 00     sra  %o0, 0, %o0
     50:       30 68 00 04     b,a   %xcc, 60 <d+0x40>

So the executive summary is: don’t add __attribute__((packed)) to your code unless you know you need it.

3 Responses to “Why you shouldn’t use __attribute__((packed))”

  1. Andre says:

    in fact, that’s what __attribute__((aligned(x))) is for

  2. yuanpu says:

    andre can you explain us the _attribute__((aligned(x))) in the way of this

  3. olden says:

    I suspect André meant using e.g:
    __attribute__((packed, aligned(x)))