Re: kernel paging (was Hard-limits for kern.maxproc)
- Subject: Re: kernel paging (was Hard-limits for kern.maxproc)
- From: Terry Lambert <email@hidden>
- Date: Fri, 6 Feb 2009 16:39:01 -0800
On Feb 6, 2009, at 11:50 AM, Andrew Gallatin wrote:
Steve Checkoway wrote:
> As a side note, does the kernel not page out its own memory? It seems
> like it should have access to more space using the same mechanism, as
> long as it is careful about which data is truly on disk and which is in
> memory.
Mostly not. Paging kernel memory is a nightmare. I cannot think of
any OS which actually does that off the top of my head, except maybe
AIX. I know that certain kernel-resident portions of processes were
pageable in FreeBSD a long time ago, and it caused quite a few
problems.
Windows since Windows 95 has supported per-page flags in the PE format
(modified COFF) binaries for things like VxDs (the Windows equivalent
of KEXTs). The most common use for this is tagging something as an
"init" section, meaning that the page(s) can be discarded after driver
initialization. In these sections, you can put things like static
constructors and driver global initialization code which is not used
on a per-device-instance basis for stuff like probing, attach, detach,
normal running, etc. Driver writers who know what they are doing can
also tag things as pageable. They also support flagging for
discard-after-attach for things like firmware loads; this is typically
used only for drivers for hardware integrated into the system that
needs new firmware shoved down at boot time (rather than flashing it
once and not paying the overhead each time you boot). Mostly, that
translates to things that will be attached once and never detached,
because to remove them you'd need a soldering iron, or at least to
crack the case and pull a card: integrated SCSI controllers,
Winmodems, some network cards, and things like that. You pay a boot
penalty for it because you didn't just flash the device firmware with
what you wanted it to be.
You could be somewhat more aggressive and tag each kernel page that's
pageable to indicate that it's not in the paging path (i.e. you don't
page things which are needed to swap in or out the pages you are
paging). This typically results in almost everything being marked as
non-pageable, since anything that can be used to access storage
*might* be accessed to do paging, including USB key-fob or network
drivers. You could maybe go even further, and decide to mark as
"precious" only the pages in the *active* paging path, rather than the
pages that *could* be in an active paging path at some point in the
future, but at that point you are making things complex enough that
complexity becomes a problem on its own.
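To make the "almost everything ends up non-pageable" point concrete, here is a minimal sketch in C. It assumes a toy model in which the kernel is a handful of components and a dependency table says which component may call into which. The names kernel_graph, mark_wired and count_wired are hypothetical and do not correspond to any real kernel interface. Anything reachable from the pager has to stay wired.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_COMPONENTS 16

/* Toy model: deps[i][j] is true when component i may call into component j. */
typedef struct {
    size_t count;
    bool deps[MAX_COMPONENTS][MAX_COMPONENTS];
} kernel_graph;

/* Depth-first walk: mark `start` and everything reachable from it as wired. */
static void mark_wired(const kernel_graph *g, size_t start, bool wired[])
{
    if (wired[start])
        return;
    wired[start] = true;
    for (size_t j = 0; j < g->count; j++) {
        if (g->deps[start][j])
            mark_wired(g, j, wired);
    }
}

/* Number of components that must stay non-pageable because the pager
 * (component index `pager`) could end up calling into them. */
size_t count_wired(const kernel_graph *g, size_t pager)
{
    bool wired[MAX_COMPONENTS] = { false };
    size_t n = 0;

    mark_wired(g, pager, wired);
    for (size_t i = 0; i < g->count; i++) {
        if (wired[i])
            n++;
    }
    return n;
}
```

In a graph where the pager calls a block layer, and the block layer can reach both a USB storage driver and a network driver, all four are wired; only components with no path from the pager remain candidates for paging.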
And if you do this, the big complexity penalty is that code and data
with different discardability/pageability criteria can't be coresident
on the same page, so you end up with internal page-attribute-based
fragmentation, which costs you an average (statistically speaking) of
2K for the first conflicting attribute and 4K for each one thereafter.
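The source of that fragmentation can be sketched with a little arithmetic. The C below assumes 4 KiB pages and assumes each attribute class (init, pageable, wired, and so on) is packed into its own page-aligned region; SKETCH_PAGE_SIZE, page_padding and total_padding are hypothetical names used only for illustration. It shows where the waste comes from and does not attempt to reproduce the exact per-attribute figures quoted above.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Bytes of padding needed to round `size` up to a whole number of pages. */
size_t page_padding(size_t size)
{
    size_t rem = size % SKETCH_PAGE_SIZE;
    return rem == 0 ? 0 : SKETCH_PAGE_SIZE - rem;
}

/* Total padding when every attribute class gets its own page-aligned
 * region; class_sizes[i] is the number of bytes tagged with class i. */
size_t total_padding(const size_t class_sizes[], size_t nclasses)
{
    size_t total = 0;

    for (size_t i = 0; i < nclasses; i++)
        total += page_padding(class_sizes[i]);
    return total;
}
```

For class sizes that are effectively random relative to the page size, the last page of each region is on average half empty, which is about 2K per region with 4 KiB pages.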
From what I've investigated, there are not more than a couple of tens
of megabytes to save with discard-after-use pages in Mac OS X, even if
you tag everything it's possible to tag. Unfortunately, the more
aggressive approach itself can't address the immediate problem.
Pageability itself is not a solution, since pageable pages still use
up virtual address space and mappings for the pages. Sure, they
aren't resident in physical RAM if they are paged out, but if you
support demand-paging them in when they are touched, in order to do
that, you have to have an address mapping, whether or not it's backed
by physical pages of RAM. And mappings have to be wired in both the
KVA and the physical RAM.
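As a rough illustration of why mappings are not free, the sketch below estimates the memory taken by leaf page-table entries alone. It assumes a flat table with one fixed-size entry per mapped page and ignores upper-level tables and any other bookkeeping; pte_bytes is a hypothetical helper, and the page and entry sizes are parameters rather than claims about any particular kernel.

```c
#include <stdint.h>

/* Bytes of leaf page-table entries needed to map `mapped_bytes` of
 * virtual address space, assuming one `pte_size`-byte entry per page. */
uint64_t pte_bytes(uint64_t mapped_bytes, uint64_t page_size, uint64_t pte_size)
{
    uint64_t pages = (mapped_bytes + page_size - 1) / page_size;  /* round up */
    return pages * pte_size;
}
```

With 4 KiB pages and 4-byte entries, mapping 1 GiB costs 1 MiB of entries under this model, whether or not any of the mapped pages are resident.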
So there are some small wins, but you could beat those easily by just
making sure your program doesn't unnecessarily use mappings by using a
lot more memory than it needs; it's down in the noise.
-- Terry
_______________________________________________
Darwin-kernel mailing list (email@hidden)