Re: kernel paging (was Hard-limits for kern.maxproc)
- Subject: Re: kernel paging (was Hard-limits for kern.maxproc)
- From: Terry Lambert <email@hidden>
- Date: Sat, 07 Feb 2009 14:15:05 -0800
On Feb 7, 2009, at 3:34 AM, Steve Checkoway <email@hidden> wrote:
On Feb 7, 2009, at 3:14 AM, Terry Lambert wrote:
On Feb 7, 2009, at 12:39 AM, Steve Checkoway <email@hidden> wrote:
Thank you Terry, Michael, and Michael. That's all very
interesting! Kernel design decisions are always fascinating.
Just to be clear, I understand that the kernel virtual address space is
the scarce resource. I was asking about the possibility of using a
mechanism similar to paging that actually reuses portions of the
virtual address space. Of course, demand paging wouldn't be possible
since the same (virtual) addresses would refer to different data,
depending on some context. Given the apparent complexity of normal
paging, my idea seems completely infeasible.
v7 UNIX, CP/M, MP/M, and DOS used an approach called overlays to do
exactly this. v8 and v9 UNIX also did this (they had to fit in the
16-bit address space of the PDP hardware, meaning 64K of code and 64K
of data, at best), but you are unlikely to find them anywhere.
To do this, you still have to break the dependency relationship between
overlays and the overlaid code: code loaded in one overlay can't depend
on code in another overlay that shares the same overlay region.
Interestingly enough, I'm looking at an older 16-bit embedded system
that takes this approach. By carefully maintaining an overlay call
stack (I don't have a better name for it) and a table of functions and
which overlay each is in, it's possible for one function to call a
function in a different overlay: it calls a helper function, always
mapped into the address space, that remaps the other region. In this
case, the code resides in several ROMs and the mapping can be changed
by a single instruction. Obviously it'd be more complicated to do in a
general purpose kernel.
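Roughly, that helper-function mechanism might look like the sketch
below in C. All of the names here (overlay_map, overlay_call, the table
contents) are invented for illustration; the real thing is
firmware-specific, and the bank switch would be a single instruction
rather than a stub.

#include <stdint.h>

#define MAX_OVERLAY_DEPTH 16

typedef void (*ov_fn_t)(void);

/* Hypothetical hardware hook: on the ROM system this is a single
 * bank-switch instruction; stubbed here so the sketch compiles. */
static uint8_t current_overlay;
static void overlay_map(uint8_t overlay) { current_overlay = overlay; }

/* One function that notionally lives in overlay 2. */
static void demo_fn(void) { }

/* Table of callable functions and the overlay each one lives in. */
struct ov_entry { ov_fn_t fn; uint8_t overlay; };
static const struct ov_entry ov_table[] = {
    { demo_fn, 2 },
};

/* The "overlay call stack": which overlay to restore on return. */
static uint8_t ov_stack[MAX_OVERLAY_DEPTH];
static int     ov_sp;

/* Always-mapped helper: save the caller's overlay, map the callee's,
 * call through the table, then restore the caller's overlay. */
void overlay_call(unsigned idx)
{
    ov_stack[ov_sp++] = current_overlay;
    overlay_map(ov_table[idx].overlay);

    ov_table[idx].fn();

    overlay_map(ov_stack[--ov_sp]);
}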
Call graphs are by definition directed graphs. DCGs are not severable,
while DAGs are. It is computationally hard to detect what are called
Hamiltonian cycles - A calls B calls C calls ... calls B - but it's
possible using what's called Warshall's algorithm, and if you are
clever, it's possible to do this incrementally.
So it's not actually that complicated to do, but it does need planning.
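As a rough sketch of the Warshall idea (toy code, nothing from any real
kernel): compute the transitive closure of the direct-call matrix, and
any function that can reach itself is sitting on a call cycle.

/* Toy cycle detection via Warshall's transitive closure.
 * reach[i][j] starts as "i calls j directly"; after the triple loop it
 * means "i can reach j through some chain of calls". */
#include <stdio.h>

#define N 4   /* number of functions in this toy call graph */

int main(void)
{
    /* Toy call graph: A->B, B->C, C->B, so B and C are on a cycle. */
    int reach[N][N] = {
        {0, 1, 0, 0},
        {0, 0, 1, 0},
        {0, 1, 0, 0},
        {0, 0, 0, 0},
    };

    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (reach[i][k] && reach[k][j])
                    reach[i][j] = 1;

    for (int i = 0; i < N; i++)
        if (reach[i][i])
            printf("function %d is on a call cycle\n", i);
    return 0;
}

The incremental variant amounts to updating the closure as call edges
are added, rather than recomputing the whole matrix each time.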
It is possible to do this for third-party drivers using a technique
called static analysis. This is relatively straightforward for
fixed-size, aligned instructions -- call them PPC and ARM -- and a lot
harder for variable-sized instruction set processors. You effectively
look at the code and decide the subtrees of code that will be used for
a given set of code: the functions called by the functions called by
the ..., even through calls through function pointers or vtables.
Certainly this is not decidable in general, by a straightforward
reduction from the halting problem. When is the analysis happening?
When the driver is loaded into the kernel?
It's not analogous to the halting problem. It just becomes 4! harder
when you have to guess the instruction boundaries on a 32-bit
processor, and you end up with false positives when you guess wrong.
It comes down to determining code reachability.
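A toy sketch of that reachability computation (all names invented, and
the direct-call table would have to come from the static analysis
described above, including an over-approximation of calls through
function pointers and vtables):

/* Mark every function transitively reachable from a driver entry point,
 * given a precomputed direct-call edge matrix. Illustration only. */
#include <stdio.h>

#define NFUNCS 6

/* calls[i][j] != 0 means function i contains a direct call to j. */
static const int calls[NFUNCS][NFUNCS] = {
    {0, 1, 1, 0, 0, 0},
    {0, 0, 0, 1, 0, 0},
    {0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 1},   /* 4 and 5 are never reached from function 0 */
    {0, 0, 0, 0, 0, 0},
};

int main(void)
{
    int reachable[NFUNCS] = {0};
    int stack[NFUNCS], sp = 0;

    stack[sp++] = 0;          /* entry point: function 0 */
    reachable[0] = 1;

    while (sp > 0) {          /* depth-first walk of the call edges */
        int f = stack[--sp];
        for (int g = 0; g < NFUNCS; g++)
            if (calls[f][g] && !reachable[g]) {
                reachable[g] = 1;
                stack[sp++] = g;
            }
    }

    for (int f = 0; f < NFUNCS; f++)
        printf("function %d: %s\n", f,
               reachable[f] ? "needed" : "unreachable");
    return 0;
}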
Ideally, you do the work ahead of time so the calculation can be done
once. For dtrace on PPC, we actually analyze at runtime using some
very smart code (my office mate has a patent pending on the technique).
So it's doable, but the amount of effort involved would be more than,
for example, moving the graphics card push pages out of the kernel
address space, which would be an order of magnitude win over that
approach.
I'm not entirely sure what graphics card push pages are, but it seems
pretty clear that an overlay approach is complex, and one that seems
unnecessary given a larger kernel virtual address space.
Graphics cards like to do speculative work while there are cycles
available that aren't doing other things. For this to work, they need
the ability to force eviction of the results when they aren't going to
actually be needed, to make room for new actual or speculative work.
Straight LRU would evict stuff they actually need, since they are
calculating ahead into increasingly less probable futures.
The current implementation for most modern graphics drivers uses
kernel page mappings for this to avoid the overhead of protection
domain crossings (switching between kernel mode - ring 0 - and user
mode - ring 3 - is expensive due to current processor designs). So
they have to be able to push these mappings out when they become an
excluded future, and that means they live in kernel address space,
which makes that space unavailable for other things.
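As a purely illustrative sketch (invented names and fields, not actual
driver code), the eviction policy being described amounts to something
like: throw away the least probable speculative results first, and only
fall back to LRU when nothing speculative is left.

#include <stddef.h>

struct push_page {
    int      in_use;
    int      speculative;   /* result of work that may never be needed */
    unsigned probability;   /* driver's guess it will be needed, 0..100 */
    unsigned last_used;     /* LRU timestamp */
};

/* Pick a victim among n pages; returns an index, or -1 if none is in use. */
int pick_victim(const struct push_page *p, size_t n)
{
    int victim = -1;

    /* Pass 1: the least probable speculative page. */
    for (size_t i = 0; i < n; i++)
        if (p[i].in_use && p[i].speculative &&
            (victim < 0 || p[i].probability < p[victim].probability))
            victim = (int)i;
    if (victim >= 0)
        return victim;

    /* Pass 2: plain LRU over what's left, all of which is actually needed. */
    for (size_t i = 0; i < n; i++)
        if (p[i].in_use &&
            (victim < 0 || p[i].last_used < p[victim].last_used))
            victim = (int)i;
    return victim;
}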
There are tricks you can do in terms of not doing the work for backing
store pages, but they are only useful on benchmarks. If a human clicks
a window forward and you avoided calculating the backing store contents
of that window because it was occluded and you thought you could get
away with it, then you have to eat a user-visible delay while you do
the calculation you could have done speculatively. So if you write to
the benchmarks, the numbers are good but the real world performance is
terrible.
This is one of the reasons why graphics nerds are obsessed with frame
rates past the 60 a second you can possibly ever display on an LCD
panel with a 120Hz refresh rate: they don't know *which* 60 of the
ones they've calculated they will need to display.
As always, thanks for the fascinating look into the computing past.
It's not ancient history, as much as we might wish it were. 8-).
Everything old is new again.
-- Terry