Re: suboptimal code-gen of decrement in GCC 4.2.1
- Subject: Re: suboptimal code-gen of decrement in GCC 4.2.1
- From: Jens Alfke <email@hidden>
- Date: Wed, 14 Oct 2009 14:16:19 -0700
On Oct 14, 2009, at 2:07 PM, Alastair Houghton wrote:
Not impossible, no, but there *is* a weird irregularity with DEC and
INC, namely that they don't touch the carry flag. That, of course,
creates a dependence on previous values of the flags register, which
means that testing the flags after a DEC or INC is potentially
expensive (it could force the processor to wait for anything that
might have affected the flags to complete).
Oh, good point. I saw the bit about the carry flag, but hadn't
considered what that meant in terms of pipelining.
The Intel Architecture Optimization Manual actually suggests that
INC and DEC should be replaced with ADD or SUB instructions for this
very reason.
My guess (and it is just a guess) is that GCC is generating the TEST
instruction to work around this problem in a different way by
resetting the flags register before testing it.
If I force the compiler to use a SUB instruction instead, by changing
"--mRefCount" to "mRefCount -= 1", it still generates similar code,
including the unnecessary TEST.
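
For reference, the two source spellings being compared can be sketched roughly as below. Only the name mRefCount comes from this thread; the Counted struct and the releaseDec/releaseSub helpers are hypothetical stand-ins for the real class. The two functions are semantically identical, so any difference shows up only in the generated assembly (for example in the output of "g++ -O2 -S").

```cpp
#include <cassert>

// Hypothetical stand-in for the ref-counted class discussed in the thread;
// only the member name mRefCount is taken from the message.
struct Counted {
    int mRefCount;
};

// Spelling 1: pre-decrement. A compiler may lower this to DEC.
inline bool releaseDec(Counted &c) {
    assert(c.mRefCount > 0);   // releasing a dead object is a caller bug
    return --c.mRefCount == 0; // true when the last reference is dropped
}

// Spelling 2: explicit subtraction, intended to nudge the compiler toward SUB.
inline bool releaseSub(Counted &c) {
    assert(c.mRefCount > 0);
    return (c.mRefCount -= 1) == 0;
}
```

Neither spelling guarantees a particular instruction; the optimizer is free to pick DEC or SUB for either one.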
Is it possible to use inline assembly to force the optimal
instructions to be generated?
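
A minimal sketch of what that might look like with GCC extended asm on x86, under several assumptions: the helper name decrementAndTestZero is made up for illustration, the counter is a plain int, and the operation is deliberately not atomic (no LOCK prefix). On other compilers or architectures it falls back to plain C++.

```cpp
// Hypothetical helper (not from the thread): subtracts 1 from *count with an
// explicit SUB and reports whether the result is zero. Not thread-safe.
inline bool decrementAndTestZero(int *count) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char isZero;
    __asm__ volatile(
        "subl $1, %0\n\t"   // SUB writes all arithmetic flags, including CF
        "setz %1"           // capture ZF into a byte register
        : "+m"(*count), "=q"(isZero)
        :
        : "cc");
    return isZero != 0;
#else
    // Portable fallback: let the compiler choose the instructions.
    return (*count -= 1) == 0;
#endif
}
```

Note that this may not be a win in practice. As far as I know, GCC 4.2.1 has no way to branch directly on flags left behind by an asm block, so the result has to be materialized with SETZ and then tested again by the caller, which can cost as much as the redundant instruction it was meant to remove.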
—Jens