Re: Mac-on-Linux and VM internals
On Tue, Mar 09, 2004 at 12:40:57PM -0500, Jim Magee wrote:
Well, that is not quite true. MOL does take full control of the MMU but
in a completely OS independent manner(*). New versions of
OSX won't break this code. In fact, the same code is used both under
Linux and OSX (so the code is truly kernel independent).
I'm sorry, but not to be picky, exactly how does your KEXT take control
of the exception vectors (after the system is running one OS) in a way
that is "totally independent" of the OS?
MOL relocates the very first instruction of each exception vector (the instructions at 0x200, 0x300 and so on) and inserts a branch to the new exception vector.
What if one of the other CPUs
is accessing the exception vectors you are changing at the time?
That is safe. The other CPU will either run the old code or take the branch to the new code (which essentially just branches back to the old code). Removing the hook is equally safe. After the branch has been overwritten with the old instruction, no CPU can be in the MOL hook once a few CPU cycles have elapsed.
What happens when we load two different MOL-like products' KEXTs at the
same time? Won't they get in each other's way?
Well, they would have to patch in one order and unpatch in the opposite order, which would admittedly be a problem if one didn't coordinate things.
What if we decide, later on, to change the exception handlers we
have installed (maybe to more optimized versions for the platform)?
You mean, on the fly after the system has booted? That sounds like a worse hack than what MOL does, and without MOL's (good) excuse of not being able to easily modify the kernel proper...
Won't we overwrite yours?
Sure... but how realistic is that scenario?
All these issues require your solution to have knowledge of how the
"base" OS works - and usually in ways that the base OS isn't willing to
be exposed
I'm having difficulties seeing a "real world" case here...
(never mind the potential need for your KEXT to be
re-released for every new hardware platform that ships - as not all
these mechanisms are supported in a binary-compatible way between
hardware releases either).
Well, "every new hardware platform" is a gross exaggeration :-). Sure, major new CPU features like AltiVec or the G5 need to be explicitly supported.
I know you probably know all that already. But I just wanted to keep
you a little more honest. ;-)
Well, there is one dependency which I did forget: MOL needs to be able to allocate a few bytes of RAM within the range of an absolute branch (it is impossible to safely patch the exception vectors otherwise).
Actually, if the three points above were provided by the kernel then
the MOL VM engine would be guaranteed to function. I bet that this
is a lot less invasive than the current support for BlueBox :-).
Probably not under my definition of invasive.
If MOL goes out to lunch, what does the machine look like? Is there
any way to recover it?
Well, if there is a bug in the kernel module then it can obviously bring down the system. However, the VM part of MOL is designed to be able to report errors and continue gracefully even if a "catastrophic" event like an unexpected exception should occur. In fact, MOL was able to gracefully recover from the double exception hardware problem (certain 750 models can take a double exception if both the TAU and the DEC exceptions are enabled simultaneously) and print a meaningful error message. I actually tracked down this hardware bug quite some time before it got documented in a chip errata because I got the "MOL Internal Error..." message...
How much of it has to be in the kernel address space?
The CPU and the MMU virtualization. Part of the CPU virtualization (i.e. the emulation of certain supervisor instructions) could go into userspace, but the performance impact due to the double context switch makes it a bit expensive.
Does the user know it is MOL that went out to lunch (when it
does go out to lunch), or does it appear that the whole
system/Darwin/Mac OS X kernel have gone out to lunch?
Rhetorical question :-). Badly written kernel code will always cause stability problems.
How compact is the code that actually intervenes with the rest of the system?
Well, MOL is split into two parts: the kernel module and a userspace process. The userspace code handles hardware emulation (e.g. it emulates the PCI bus), supports client-side drivers, handles I/O and so forth. It constitutes the bulk of the code and runs as an ordinary user process without any magic attributes. The userspace process talks with the kernel module through an ioctl interface (mostly to initialize and configure the virtual machine). It also triggers the switch to virtualization mode (but the switch itself is handled by the kernel module). The kernel module is essentially a black box which does not talk much with the rest of the world. It consists of two parts: the low-level (assembly) code which handles exceptions and (essentially) runs with the MMU off, and the high-level C code which maintains the MMU virtualization. Only the C code invokes external code. The only kernel services used are atomic operations, memory allocation and spinlocks. Well, MOL also has to be able to obtain the physical address of wired-down memory (the assembly code runs with the MMU off...).
Or more importantly, how entangled is it with the rest of the MOL code?
It is clearly separated.
Your original statement about needing to wire down all of MOL to keep the
system running seems somewhat telling in this regard.
No no... not all of MOL. Just the virtualized RAM. The RAM has to be wired down (currently) because the MOL module has no way of knowing that a particular page is about to be swapped out (or whatever). If the kernel module could be told (through some suitable mechanism) about this event, then it would be trivial to flush any references to that page. That is, MOL is designed to function as a software TLB cache with respect to the operating system when pageable memory is used as RAM.
What if someone suspended a thread running inside the MOL
task (maybe with a debugger)? Would the system keep running?
Yes, no problem.
Are there key places where it wouldn't?
Nope.
Almost none of these things are issues with Classic/Virtual-PC, etc...
on Mac OS X. Sure, we pay a small performance hit as a result. But I
consider that "less invasive."
Well, you are comparing Apples with Oranges :-). Classic does not really need a full-blown MMU virtualization since MacOS 9 does not utilize the MMU to any significant degree. As for Virtual-PC... well, the huge performance impact there is the CPU emulation. If the cost of emulating the MMU is of the same magnitude, then it is probably an acceptable overhead. For MOL the situation is very different (assuming an MMU-demanding OS like Linux or OSX is run within MOL). The CPU virtualization has close to zero overhead, so a costly MMU virtualization would really impact performance. The example below should clarify my argument:

Virtual-PC:  1 VPC cycle = 20 cpu emulation cycles + 5 mmu emulation cycles
MOL:         1 MOL cycle =  1 instruction cycle    + 5 mmu emulation cycles

That is, MOL's performance would drop to about 1/6 of what it is today, while VPC would only lose 20% of its speed... No doubt, the real-world numbers are a bit different, but this is the core of the issue.

/Samuel