In practice? Maybe. In theory? No. If allocateObject gets inlined,
the compiler _or_ the processor could reorder the write into myVariable
(making it non-NULL) ahead of the allocation steps needed to complete
the object. This is legal, since no ordering information has been
specified in the source.
If pre-emption occurs and another thread gets scheduled _on the same
processor_ and enters the same function, it could see a non-NULL
myVariable but an incomplete initialization.
Since inter-procedural and inter-module inlining is one
of the best ways to expose optimization opportunities, you can bet
that compilers will use it increasingly.
This practice is only valid if the compiler/processor guarantees
that all or part of the ordering implied by the source is honored
at execution time. Some platforms guarantee that, some don't.
The problem arises when there are multiple CPUs with caches between
them and the shared memory. The combined compiler, OS, and hardware
must ensure that a variable, specifically myVariable above, is treated
atomically across the processors. That is, if myVariable is changed on
one processor and stored to memory, the change must be seen on any
other processor, even if the stale value is still in that processor's
cache. Secondly, the lock must work atomically across all processors.
The second requirement is a must for any multi-processor OS, so we can
assume it holds.
Given our experience with ACE across many processors, we have seen
only one on which we had to add a machine instruction to satisfy the
first requirement. However, I am not a Darwin developer, and I would
like assurance from them that this holds, in both of the examples shown.
I don't know enough about the PPC semantics, but I can tell you
that this is not valid on Itanium CPUs. Even if we disregard reordering
(legally) performed by the compiler, the processor is free to perform
loads and stores out of order, _unless_ they are marked as ordered
using the 'acquire' and 'release' qualifiers (ld.acq, st.rel), or
unless a (full) memory fence instruction is used.
'st.rel' and 'ld.acq' work in pairs:
Some thread Another thread
st .... ld.acq rx=[addr]
st .... ld ...
st.rel [addr]=newVal ld ...
The semantics are that if a ld.acq sees the new value, then loads
following that ld.acq will also see the values that were set by
stores preceding the st.rel. st.rel acts as a write barrier,
while ld.acq acts as a load barrier.
To use the pattern above in a valid way, myVariable should be
declared volatile. The Itanium compilers I am familiar with will
emit a ld.acq on reading a volatile variable, and a st.rel when
setting a volatile variable.
My understanding is that the Alpha processor had similar requirements
to guarantee correctness. Unfortunately, the C standard does not offer
enough guarantees to make this usage trivially portable; one has
to check the semantics of each target platform (compiler + processor)
to validate the idiom. At this point I do not know of any platform
where the use of volatile won't work, yet I'll repeat that this
is not guaranteed by any standard (C<year>/Posix/Unix <year>/etc.).
Apologies for using examples not immediately relevant to Darwin,
but I don't want this idiom to become more prevalent. It is broken,
even if in not-so-obvious ways.
Regards - Eric