I finally had time to finish looking at this, using the mess at the bottom, which does more than it needs to because I was trying to break it. Indeed, uint32_t pointers are 4-byte aligned, which is what you'd expect. 'third' shows up at offset 12 from 'first', so it is word-aligned but not double-word aligned. Compiled for armv7 with -Ofast, the two assignments to w and x come down to
@DEBUG_VALUE: test:x <- R1
@DEBUG_VALUE: test:w <- R0
.loc 1 39 0 @ /Users/rols/Code/C/TestAlign/TestAlign/Test1.c:39:0
ldrd r0, r1, [r4, #12]
which handily tells you R0 is w and R1 is x, and in this case r4 happens to be t; yes, the compiler removed q. So that's a double-word load from byte offset 12 of t, i.e. t->third. And when you run it, that address is indeed 4-byte aligned but not 8-byte aligned (in my case it was 0x17548b2c), and the ldrd works just fine.
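The alignment of that address is easy to check by hand or in code. This is only a minimal sketch: is_aligned is a hypothetical helper, not part of the test code, and it assumes the alignment passed in is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if addr is a multiple of alignment.
   Only valid when alignment is a power of two. */
bool is_aligned( uintptr_t addr, size_t alignment )
{
    return ( addr & ( alignment - 1 ) ) == 0;
}
```

For the address above, is_aligned( 0x17548b2c, 4 ) is true and is_aligned( 0x17548b2c, 8 ) is false, since the low byte 0x2c is 44, which is a multiple of 4 but not of 8.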
Note: In ARMv7-M, LDRD (PC-relative) instructions must be on a word-aligned address.
So that seems to be most of the pieces. The compiler expects a uint32_t to be word-aligned because it carefully aligns every one it creates that way, ldrd works on word-aligned addresses on armv7 (which isn't what I wrote the first time around), and -Ofast thus uses it.
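The layout half of that can also be checked at compile time. This is a rough sketch, assuming a C11 compiler and an ABI where uint32_t is 4-byte aligned (true for armv7 and the usual desktop targets, but an assumption all the same). The struct is copied from the test code and the assertions are my addition.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct testStruct {
    uint32_t first[ 1 ];
    char second[ 5 ];
    uint32_t third[ 4 ];
} testStruct;

/* Assumption: the target ABI aligns uint32_t to 4 bytes. */
_Static_assert( _Alignof( uint32_t ) == 4, "uint32_t is word-aligned" );

/* first takes bytes 0-3, second bytes 4-8, then 3 bytes of padding. */
_Static_assert( offsetof( testStruct, third ) == 12, "third at offset 12" );

/* 12 is a multiple of 4 but not of 8: word-aligned, not double-word aligned. */
_Static_assert( offsetof( testStruct, third ) % 8 != 0, "third not 8-aligned" );
```

If any of these assumptions does not hold on a given target, the compile fails instead of the layout silently differing.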
Roland
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct testStruct {
    uint32_t first[ 1 ];
    char second[ 5 ];
    uint32_t third[ 4 ];    // ends up at offset 12, after 3 bytes of padding
} testStruct;

uint32_t test( int first )
{
    (void)first;    // unused
    testStruct s = { { 1 }, { 'a', 'b', 'c', 'd', 'e' }, { 2, 3, 4, 5 } };
    testStruct *t = calloc( 1, sizeof( testStruct ) );
    if ( t == NULL )
        return 0;
    memcpy( t, &s, sizeof( testStruct ) );
    // Byte offsets via char * arithmetic (void * arithmetic is a GNU extension)
    printf( "t at %p, length %zu, first offset %td, second offset %td, third offset %td\n",
            (void *)t,
            sizeof( testStruct ),
            (char *)t->first - (char *)t,
            (char *)t->second - (char *)t,
            (char *)t->third - (char *)t
    );
    uint32_t *q = t->third;
    printf( "q at %p\n", (void *)q );
    uint32_t w = q[ 0 ];    // these two loads are what became the single ldrd
    uint32_t x = q[ 1 ];
    free( t );
    return w + x;
}