Re: Poor performance accessing wired buffers from user-space on G5.

Are you by any chance specifying the kIOMapInhibitCache flag when you map the memory into userland? I saw a performance hit just like the one you describe when I was doing this for my DMA buffers. It turned out I didn't need that flag; removing it solved the problem. One additional note: besides seeing this performance hit on the G5, I was *also* seeing it on G4s, but only when they were running Panther. It didn't show up on G4s running Jaguar.

If that's not it, you could test your DART theory by disabling the DART. I believe you can do this by setting boot-args:

boot-args="dart=0"

This only works if you have less than 2 GB of RAM installed.

Jim

At 4:22 PM -0400 4/27/05, email@hidden wrote:
Hello all,

I'm trying to overcome a performance problem we're having with our video
capture card drivers on the G5.

The nature of our problem is that we're seeing drastic differences in
performance between G4 and G5 systems, and not the differences we'd expect.
In a G4 system, when we read from or write to a wired, contiguous buffer
allocated by an IOBufferMemoryDescriptor call, we see roughly the advertised
memory performance of the system.  However, on the G5, we are incurring huge
overhead, resulting in significantly slower performance where we would have
expected about a 3X increase.

We have tried literally dozens of things to get around this problem.  In our
testing, we have discovered that in user space, we can allocate two buffers
and copy between them at the expected G5 performance levels.  However, if
one of the buffers is our kernel-allocated buffer, it appears we are taking
a huge hit.  We've set the buffer up as "user/kernel shared", and we've tried
setting the directions to match the direction of the copy.  We've even tried
setting the direction as in/out.  As an experiment, we've allocated two
buffers in the kernel, and done the copy at interrupt level.  In every case,
the performance is an order of magnitude less than the same code on the G4.
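
A minimal sketch of the user-space baseline described above, assuming plain
malloc'd buffers and the standard C clock() timer. The helper name
measure_copy_mb_per_s is hypothetical and not taken from the original code.
It covers only the case where both buffers are allocated in user space;
timing the kernel-allocated buffer would require mapping it through the
driver's user client, which is not shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copies `size` bytes between two malloc'd user-space
 * buffers `iterations` times and returns the rate in MB/s.
 * Returns -1.0 if allocation fails and 0.0 if the run was too short to time.
 * Note that clock() reports CPU time, not wall-clock time. */
double measure_copy_mb_per_s(size_t size, int iterations)
{
    assert(size > 0 && iterations > 0);

    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }

    /* Touch every page before timing so page faults are not measured. */
    memset(src, 0xA5, size);
    memset(dst, 0, size);

    volatile unsigned char sink = 0;
    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        src[0] = (unsigned char)i;   /* vary the input between copies */
        memcpy(dst, src, size);
        sink ^= dst[size - 1];       /* consume the result of each copy */
    }
    clock_t end = clock();

    free(src);
    free(dst);

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0) {
        return 0.0;
    }
    double megabytes = (double)size * iterations / (1024.0 * 1024.0);
    return megabytes / seconds;
}
```

Calling measure_copy_mb_per_s(800 * 1024, 200) on each machine gives a rough
figure to compare against the same loop run with one buffer replaced by the
mapped kernel buffer.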

Some details:
We allocate wired buffers in the kext, of around 800 KB (the size of a
standard-definition video frame) each.  We allocate them as page-aligned and
contiguous so that we can DMA to/from them from our hardware card without
setting up scatter-gather.  This results in almost 200 VM pages per buffer.
We have a QuickTime component which fills these buffers from a given QT
buffer.
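
The page count above can be checked with a short calculation. This is a
sketch that assumes a 4096-byte VM page size; the helper name pages_for is
hypothetical, and the exact frame size is not given in the message, so the
byte counts below are only illustrations of "around 800 KB".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of pages needed to hold `bytes` bytes,
 * rounding up to a whole page. */
size_t pages_for(size_t bytes, size_t page_size)
{
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;
}
```

With a 4096-byte page, pages_for(800 * 1024, 4096) comes to 200 pages and
pages_for(800000, 4096) comes to 196, both in line with the "almost 200"
figure.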

At this point, we are assuming that we are incurring a hit brought on by the
DART reloading its address tables, or perhaps issues with VM on the G5.  The
same code on G4 machines seems fine.

We have recently come across an Apple soft-VDIG example which includes a
kext, and it looks like our implementation is exactly as in the example, and
as you might expect, the example code performs just as poorly on the G5.

- Mike Stroven