Lists

Terms and Conditions
Lists hosted on this site
Email the Postmaster
Tips for posting to public mailing lists

Re: DTrace: Probe Performance Hit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: DTrace: Probe Performance Hit

Subject: Re: DTrace: Probe Performance Hit
From: Ken Thomases <email@hidden>
Date: Sat, 8 May 2010 01:10:11 -0500

On May 7, 2010, at 10:57 AM, Eric Gorr wrote:

> 	for ( x = 0; x < 1000000000; x++ )
> 	{
> 		y = [self myMethod:x];
>
> 		if ( DTRACETEST_TEST_MINE_ENABLED() )
> 		{
> 			DTRACETEST_TEST_MINE();
> 		}
> 	}

> If I remove DTRACETEST_TEST_MINE_ENABLED block, the time between the start and end probes is ~8.8 seconds.
> Adding DTRACETEST_TEST_MINE_ENABLED back in, but keeping the probe disabled (i.e. DTRACETEST_TEST_MINE_ENABLED returns false), took ~8.9 seconds.
>
> In the Release build,
>
> If I remove DTRACETEST_TEST_MINE_ENABLED block, the time between the start and end probes is ~5.7 seconds.
> Adding DTRACETEST_TEST_MINE_ENABLED back in, but keeping the probe disabled, took ~6.4 seconds.
>
> The overhead has increased significantly in the release build.
>
> This doesn't seem like it is as close to zero cost as I initially thought it was, leading me to think that i may not be measuring what I think I am measuring.

Well, I'm no expert at DTrace's performance, but you've just measured the cost of DTRACETEST_TEST_MINE_ENABLED() as being a fraction of a nanosecond.  That's pretty close to zero.

This is the worst sort of micro-benchmark.   You could liberally sprinkle such probes throughout a large codebase, run it for years, and never have the overhead accumulate to any significant amount of time.

I suspect what actually happened is not that DTRACETEST_TEST_MINE_ENABLED() cost ~0.7 nanoseconds, but that its presence interfered with the optimization of the loop.  That explains why it appears to only have ~0.1 nanoseconds of overhead in the debug build.  Since this code is completely artificial in nature, you'd be measuring an effect that you wouldn't see in real code.

If you really do expect to have a tight loop doing precious little other than checking DTRACETEST_TEST_MINE_ENABLED(), you could hoist it out of the loop (invoke it once, save its value into a const local variable, and check that variable within the loop).

	const BOOL enabled = DTRACETEST_TEST_MINE_ENABLED();
	for ( x = 0; x < 1000000000; x++ )
	{
		y = [self myMethod:x];

		if ( enabled )
		{
			DTRACETEST_TEST_MINE();
		}
	}

Mind you, that will prevent you from effectively enabling the probe while that loop is running.

You could also use the GCC branch prediction hinting built-in (__builtin_expect()) to say that you expect it to be false.  Ideally, the DTrace-generated macro would already include that, but it doesn't seem to.  (Tested on Mac OS X 10.6.3.)  You could modify or wrap the macro to use that.

> Based on these results, it would seem that one would want to keep probes out of the tightest loops in an application.
>
> But, considering how pervasive such things are even in Mac OS X, my conclusion does not seem to be correct.

Huh?  Tight loops with probes in them, and precious little else, are common in Mac OS X?  How do you figure?

Regards,
Ken

 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Xcode-users mailing list      (email@hidden)
Help/Unsubscribe/Update your Subscription:

This email sent to email@hidden

References:
	>DTrace: Probe Performance Hit (From: Eric Gorr <email@hidden>)

Prev by Date: Re: DTrace Questions
Next by Date: Re: Xcode 3.2.2 Blows Right By Breakpoints. Was: Re-enabling shared library breakpoint...
Previous by thread: DTrace: Probe Performance Hit
Next by thread: Xcode 3.2.2 Blows Right By Breakpoints. Was: Re-enabling shared library breakpoint...
Index(es):
- Date
- Thread