Re: Interpreting Shark Results
- Subject: Re: Interpreting Shark Results
- From: Wade Tregaskis <email@hidden>
- Date: Tue, 18 Aug 2009 08:53:43 -0700
Currently, I am trying to optimize some routines. Using Shark, I
noticed that there are lots of cache misses -- an obvious
performance problem. But then again, I am unsure how to interpret
what Shark tells me. For example, in this case:
0x1000aea5a movups +216(%rsp), %xmm2
0.6% 0.6% 0x1000aea62 dpps $255, %xmm15, %xmm5
5.9% 5.9% 0x1000aea69 dpps $255, %xmm15, %xmm2
0.0% 0.0% 0x1000aea70 movaps %xmm3, %xmm0
0x1000aea73 leal (%rcx, %r9), %eax
0x1000aea77 movaps %xmm4, %xmm6
About a thousand events are recorded at an instruction that does not
reference memory in any way. Even the instructions immediately
adjacent to it only reference CPU registers. Is there something
obvious I am missing?
That profiling method, PMI (performance-monitoring interrupt) sampling,
is not precise on Intel: by the time the interrupt is actually handled
and Shark can record the state of the machine, a few extra instructions
have likely slipped by. Out-of-order execution also blurs things
substantially. Luckily, cache misses tend to stall the core, so the
results are still relatively accurate. I'd hazard a guess that the
misses were caused by the movups, the only instruction in that listing
that actually loads from memory.
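
To make the "slipped by a few instructions" reasoning concrete, here is a rough sketch in Python. It is not a Shark feature, just a heuristic written for illustration: it moves each row's samples back to the nearest earlier instruction that references memory. The names touches_memory, reattribute and max_skid are made up for this example, the 4-instruction window is an arbitrary assumption, and rows that show no percentage in the listing are entered as 0.0.

```python
import re

# Rows from the Shark listing in the question: (address, instruction, % of samples).
# Rows with no percentage shown are entered as 0.0 (an assumption).
LISTING = [
    (0x1000aea5a, "movups +216(%rsp), %xmm2", 0.0),
    (0x1000aea62, "dpps $255, %xmm15, %xmm5", 0.6),
    (0x1000aea69, "dpps $255, %xmm15, %xmm2", 5.9),
    (0x1000aea70, "movaps %xmm3, %xmm0", 0.0),
    (0x1000aea73, "leal (%rcx, %r9), %eax", 0.0),
    (0x1000aea77, "movaps %xmm4, %xmm6", 0.0),
]


def touches_memory(instruction):
    """AT&T syntax: a parenthesized operand is a memory reference.

    lea is excluded because it only computes an address and never loads.
    """
    mnemonic = instruction.split()[0]
    if mnemonic.startswith("lea"):
        return False
    return re.search(r"\(.*\)", instruction) is not None


def reattribute(listing, max_skid=4):
    """Move samples to the nearest earlier memory-touching instruction.

    Looks back at most max_skid rows (hypothetical window size). Samples
    stay where they are if no candidate is found in that window.
    """
    totals = {addr: 0.0 for addr, _, _ in listing}
    for i, (addr, text, pct) in enumerate(listing):
        if pct == 0.0:
            continue
        target = addr
        if not touches_memory(text):
            for j in range(i - 1, max(i - max_skid, 0) - 1, -1):
                if touches_memory(listing[j][1]):
                    target = listing[j][0]
                    break
        totals[target] += pct
    return totals


if __name__ == "__main__":
    for addr, pct in reattribute(LISTING).items():
        print(f"{addr:#x}  {pct:.1f}%")
```

With the listing above, both dpps rows fold back onto the movups at 0x1000aea5a. That only restates the guess; it does not prove which instruction missed the cache.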
Wade
P.S. A more appropriate list for questions like this would be email@hidden
_______________________________________________
Xcode-users mailing list (email@hidden)