Re: RAM Cache vs the Pageout Demon
- Subject: Re: RAM Cache vs the Pageout Demon
- From: Michael Smith <email@hidden>
- Date: Sat, 21 Oct 2006 13:47:39 -0700
On Oct 21, 2006, at 12:02 PM, <email@hidden> wrote:
> My commercial app has a RAM cache system for image sequence playback;
> images can be tens of megabytes and there can be gigabytes of them.
> On OS X, when the RAM cache is large, it looks like the Darwin pageout
> demon is activating prematurely, paging out part of the RAM cache
> preemptively to meet its free-space target, even though there is no
> competing demand. Consequently, the app is brought to its knees by
> catastrophic thrashing.
The Darwin free page reserve target is actually very small; a few
percent of physical RAM in a large configuration.
> The details of system activity seem to matter: I have two different
> possible display paths (CG and GL), one of which transiently allocates
> and frees an extra image buffer. The path that allocates/frees causes
> the machine to thrash at RAM cache sizes that leave the other path
> unaffected, even though the total consumption is about the same.
> I need to better understand/control this issue and develop
> countermeasures.
The problem is simply that you are caching too much data; you need to
understand the actual footprint of your working set so that you can
cache less.
> I think OS X's preemptive pageout to maintain its minimum free list
> (see http://developer.apple.com/documentation/Performance/Conceptual/ManagingMemory/Articles/AboutMemory.html)
> is counterproductive, versus waiting for actual free-space demand.
You do, eh? Well, let's assume just for the moment that the kernel
needs those free pages in order to make forward progress when an
application like yours thinks it needs every last page in the system.
> I'm investigating what the free-space target is (how it is set) and
> whether it might be changed.
The free page reserve levels are not adjustable; they are already
tuned to provide as many pages as possible to applications. This
avenue will not yield you any useful results.
> The minimal fix is clearly to ratchet down the maximum permissible
> RAM cache size to well under the machine's actual physical RAM size
> (50%?). Though this is undesirable, it probably is better than
> letting the UI become unusable.
There are several better techniques that you might consider
applying. First and foremost though, you need to understand your
actual working set across the two display paths you have. Once you
have this in hand, you can more realistically assess the amount of
free memory required to display an image before bringing it into your
cache, and thus whether the sum of cached images poses a threat to
your working set.
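
As a rough starting point, something along these lines can give you a
snapshot of the free page count before you decide to admit another
image into the cache. This is only a sketch: can_cache_image() and the
64 MB margin are placeholders you would replace with values derived
from your measured working set.

#include <mach/mach.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Snapshot of the bytes the VM system currently reports as free.
 * The figure can change the instant the call returns, so treat it
 * as a hint, not a guarantee. */
static uint64_t free_bytes(void)
{
    vm_statistics_data_t vmstat;
    mach_msg_type_number_t count = HOST_VM_INFO_COUNT;

    if (host_statistics(mach_host_self(), HOST_VM_INFO,
                        (host_info_t)&vmstat, &count) != KERN_SUCCESS)
        return 0;

    return (uint64_t)vmstat.free_count * vm_page_size;
}

/* Hypothetical admission test: only cache another image if doing so
 * still leaves a comfortable margin of free memory.  The margin here
 * is an arbitrary placeholder. */
static bool can_cache_image(size_t image_bytes)
{
    const uint64_t margin = 64ULL * 1024 * 1024;
    return free_bytes() > (uint64_t)image_bytes + margin;
}

The same counters (free, active, inactive, wired, pageouts) are what
vm_stat(1) prints, so you can sanity-check the numbers from a shell
while you experiment.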
> Or, can the task priority be raised? Will that give my app priority
> when competing for free and inactive pages?
You are cannibalising your own pages, so this won't help you.
> Even if OS X wants to grab some pages to maintain its free list, I
> need it to grab them from inactive processes, not the ones it thinks
> are inactive in my app.
If those tasks are truly more inactive than you, you will already
have stolen all the pages from them that you can.
> I'm thinking that "wiring" 80% of the RAM into my app is unwise.
Without understanding your application, it's difficult to see how you
can justify holding that much unprocessed image data in memory. It
sounds like you need to start by more carefully examining how you
cache images, and what you think you're trying to achieve.
> Other possibilities include some sort of RAM-tickling demon.
If you do that, be aware that laughing RAM is slower than RAM that is
focussed on the job at hand.
> Suggestions welcome.
In a followup message, you indicated that you are using third-party
image loading libraries, and by this I infer that your interface to
them is along the lines of "render this filename into a buffer".
This robs you of a good deal of control over your working set size;
you can pessimistically assume that the library will consume pages
equal to at least 2x the size of the file: 1x will go straight to the
inactive queue, and 1x will go to the active queue. In addition, it
will require pages to back the buffer it renders into, plus any others
required by intermediate data structures.
If you are willing to spend some time analysing these libraries
(really mandatory, if you care about performance), you should examine
how they read files. You may find a couple of easily-optimised cases:
- The file is opened, read into a buffer, then closed. You can
easily replace this with either a) mmap, or b) a read with F_NOCACHE
set.
- The file is processed as a stream. This is another candidate for
F_NOCACHE, if the stream is not rewound.
In both cases, this can reduce the library's working set substantially.
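
For the first case, the F_NOCACHE variant might look roughly like the
sketch below (error handling trimmed; read_file_uncached() is a name
I've made up, not something your library provides). The mmap() variant
simply maps the file instead of copying it. Note that F_NOCACHE works
best when reads are large and page-aligned; unaligned portions may
still pass through the buffer cache.

#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Read a whole file without leaving its pages in the unified buffer
 * cache: F_NOCACHE asks the kernel not to cache the data, so a
 * one-shot sequential read does not evict anything else. */
static void *read_file_uncached(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    /* Tell the kernel not to cache this file's data. */
    (void)fcntl(fd, F_NOCACHE, 1);

    struct stat st;
    void *buf = NULL;
    if (fstat(fd, &st) == 0 && (buf = malloc((size_t)st.st_size)) != NULL) {
        if (pread(fd, buf, (size_t)st.st_size, 0) != st.st_size) {
            free(buf);
            buf = NULL;
        } else if (len_out != NULL) {
            *len_out = (size_t)st.st_size;
        }
    }
    close(fd);
    return buf;
}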
You also have poor realtime control over this sort of interface,
which is a problem given the only reason I can think of for you to
want many images buffered; I assume you want to be able to flip
rapidly through a set. Given that you can't cache all of them, it
strikes me as being more sensible to cache many fewer and begin your
image flip speed reduction earlier, with a progressive slowdown while
you are reading ahead/behind. Ultimately your long-sequence flip
rate is bounded by how fast you can load images, not how many you can
hold in memory, so wasting large amounts of memory with a partial
subset of the image set is a false optimisation.
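
To make the progressive slowdown concrete, the pacing might be
expressed something like this. The thresholds, delays, and the
flip_delay_ms() name are all invented for illustration; real values
would fall out of measuring how fast you can actually load and decode
images.

#include <stddef.h>

/* Hypothetical pacing helper: given how many frames the read-ahead
 * has decoded beyond the current playback position, return a delay
 * (in milliseconds) to insert between flips. */
static unsigned flip_delay_ms(size_t frames_ahead)
{
    if (frames_ahead > 16)
        return 0;      /* plenty of headroom: flip at full speed */
    if (frames_ahead > 8)
        return 10;     /* ease off before the buffer runs dry */
    if (frames_ahead > 2)
        return 30;     /* noticeably slower, but still interactive */
    return 80;         /* nearly empty: pace flips to the load rate */
}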
Another point to consider: you say "images can be tens of
megabytes". Do you display images at full resolution, or are you
scaling them for display? If the latter, consider scaling at render
time and just keep the display-size version in memory. If you
support resizing, as long as you're not rescaling for live resize you
can just catch the first window resize and re-render the full
resolution version so that when the resize is over you're (nearly)
ready to redraw. Likewise, if your usage model is such that zooming
an image away from its initial display size is rare, you can re-render
on demand as and when zooming is required. This obviously trades a
little lag on the first resize/zoom against constant excess memory
pressure while flipping, so you'll have to tune accordingly.
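
If your formats happen to be ones ImageIO can decode (which may not be
the case, given your third-party libraries), a display-sized decode
might look roughly like the following; create_display_image() is my
own name, and the zero image index is an assumption.

#include <ApplicationServices/ApplicationServices.h>

/* Decode a file straight to a display-sized CGImage rather than
 * keeping the full-resolution pixels resident.  max_pixels would be
 * the longer edge of the destination view; re-decode at full size
 * only when the user zooms or resizes. */
static CGImageRef create_display_image(CFURLRef url, int max_pixels)
{
    CGImageSourceRef src = CGImageSourceCreateWithURL(url, NULL);
    if (src == NULL)
        return NULL;

    CFNumberRef max = CFNumberCreate(NULL, kCFNumberIntType, &max_pixels);
    const void *keys[] = {
        kCGImageSourceCreateThumbnailFromImageAlways,
        kCGImageSourceThumbnailMaxPixelSize
    };
    const void *values[] = { kCFBooleanTrue, max };
    CFDictionaryRef opts = CFDictionaryCreate(NULL, keys, values, 2,
        &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);

    CGImageRef scaled = CGImageSourceCreateThumbnailAtIndex(src, 0, opts);

    CFRelease(opts);
    CFRelease(max);
    CFRelease(src);
    return scaled;
}

The caller owns the returned image and releases it with
CGImageRelease() when the display-size copy is evicted from your cache.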
The bottom line to all of this is that first you must understand what
your application is doing. Once you understand the current state of
things, you can consider what it actually needs to do, and how to get
from where you are to where you need to be.
Don't make assumptions. Measure first.
= Mike