I'd like to ask a simple question. This question has been asked about
ten times on various mailing lists here in the past and whenever
somebody asked this question directly or indirectly, he never
received a reply. Either the question was ignored in the reply or the
post never received any reply in the first place. This either means
nobody wants to answer it or nobody can answer it... maybe nobody
knows it, which would be very, very sad, as that is one of the most
fundamental things one should know. It's also not documented
anywhere. I even searched many MacOS internals books for an answer,
but had no success. It's simply not explained anywhere. It can't be
that we have to file a support incident just to get a reply to this
very simple question. So I'm trying again, and I will keep trying
until I find someone who can answer it.
Whether you use top or Activity Monitor, you get three interesting
memory values per process. The VM stats are not interesting here
(neither VM private nor VM total). The three values of interest are:
1. The Resident Memory Size (aka Real Memory)
2. The Private Memory Size
3. The Shared Memory Size
So far, all these values seem self-explanatory... but they are not at
all, for one single reason:
How come 2 + 3 != 1 (private + shared != resident)???
There are processes where 2 + 3 is much bigger than 1. How come?
Where is the memory lost? Which pages count towards 2 or 3, but are
ignored for 1? For other processes, 2 + 3 is much smaller than 1. How
come? Which memory is ignored for 2 and 3, but is accounted towards 1?
Is this really so hard to answer?
I looked at the source of top, here's what I found out (some people
might find this useful):
A) The resident memory is obtained using the call task_info(). Since
it is not documented anywhere which memory of the task this call
counts in the resident_size field and which it does not, the value is
basically magic and I can't investigate it any further.
B) The other two values are obtained by walking every memory region
of a process using mach_vm_region() (the same function vmmap uses),
and they show some interesting behavior. Basically all private memory is
summed up as private memory. Real shared memory (e.g. mmap memory) is
summed up as shared memory. This is expected behavior. Memory that is
in theory share-able (Copy on write) is treated differently depending
on whether it is actually shared. E.g. code of the process or
libraries used by the process is counted towards private memory if it
is only referenced once (it could be shared, but currently it isn't).
That means libraries shared with other processes are counted towards
shared memory. This is indeed very clever. If I start a second
instance of my process, my code regions are shared between both
processes, so the private memory decreases and the shared one rises,
which reflects what really goes on within the system. The same goes
for library code: libraries that only my process uses are first
counted towards my private memory. With a second instance using them,
they are counted towards shared memory. You see, pretty clever.
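The rules in (B) can be sketched as a small model. This is not the
code from top, only an illustration of the behavior described above;
the Region type, the share mode labels, the example regions and the
4096-byte page size are all assumptions made up for this sketch.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class Region:
    name: str
    share_mode: str      # hypothetical labels: "private", "shared" or "cow"
    ref_count: int       # number of tasks currently referencing the region
    resident_pages: int  # pages that have actually been touched


def classify(region: Region) -> str:
    """Return 'private' or 'shared' following the rules described in (B)."""
    if region.share_mode == "private":
        return "private"
    if region.share_mode == "shared":
        return "shared"
    # Copy-on-write memory is only shareable in theory, so what counts
    # is whether more than one task references it right now.
    return "private" if region.ref_count == 1 else "shared"


def totals(regions):
    """Sum the resident bytes of all regions per category."""
    result = {"private": 0, "shared": 0}
    for region in regions:
        result[classify(region)] += region.resident_pages * PAGE_SIZE
    return result


def example_regions(instances: int):
    """Made-up regions of one process; 'instances' copies of it are running."""
    return [
        Region("heap", "private", 1, 10),
        Region("mmap'ed file", "shared", 2, 4),
        Region("program text", "cow", instances, 8),
    ]


print(totals(example_regions(1)))  # program text counts as private
print(totals(example_regions(2)))  # program text moves to shared
```

With one instance the eight text pages land in the private total; as
soon as a second instance exists, the same pages move to the shared
total.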
C) Only "touched" memory is accounted at all. This is also very
clever. Just because I load a library does not mean that all library
code is in memory. Only address space for all of it is reserved.
Until I access a page of that library, that page is not really in
memory. Same for my code or malloc'ed memory. This is also desired
behavior.
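To make (C) concrete, here is a toy model of a mapping that reserves
address space but only accounts pages once they are touched. The
LazyMapping class and the 4096-byte page size are assumptions for
illustration; a real kernel does this through page faults.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class LazyMapping:
    """Address space is reserved up front, pages appear only when touched."""

    def __init__(self, size_bytes: int):
        # Round the reservation up to whole pages.
        self.reserved_pages = -(-size_bytes // PAGE_SIZE)
        self.touched = set()

    def touch(self, offset: int) -> None:
        """Simulate an access at a byte offset; faults in the containing page."""
        if not 0 <= offset < self.reserved_pages * PAGE_SIZE:
            raise ValueError("offset outside mapping")
        self.touched.add(offset // PAGE_SIZE)

    def virtual_bytes(self) -> int:
        return self.reserved_pages * PAGE_SIZE

    def accounted_bytes(self) -> int:
        # Only touched pages count, no matter how much is reserved.
        return len(self.touched) * PAGE_SIZE


library = LazyMapping(1_000_000)
for offset in (0, 100, 5000):  # the first two offsets hit the same page
    library.touch(offset)
print(library.virtual_bytes(), library.accounted_bytes())
```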
D) Still top overestimates the real memory my process uses. E.g. I
use a library that another process uses, too. I only need one
function of the library. If I were the only one using it, only the
page containing this function would be in real memory, which means
only one page would be accounted towards my process (the rest would
stay virtual). Now the other process uses all of the library. Because of
this, all pages of the library are loaded to real memory. Since I use
the library, too, now all the loaded pages are accounted to my
process as shared memory, just as they are accounted to the other
process. In fact, only one of the pages is in memory because of my
process and in use by my process. The rest is only in memory because
of the other process. So more shared memory is accounted towards my
process than my process actually uses. But since it is accounted
towards shared memory, this is something I can live with (forcing the
kernel to remember which process caused each virtual page to be
mapped into physical memory would be overkill and would bloat all
memory structures enormously).
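The overestimate in (D) in numbers, as a small model. The process
names, the page indices and the 4096-byte page size are made up for
illustration; the point is only that every user of the library is
charged for all resident pages, not for the pages it touched itself.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def accounted_shared(touched_by_process):
    """Shared bytes charged to each process for one shared library.

    touched_by_process maps a process name to the set of library page
    indices that this process has touched itself.
    """
    # A page is resident as soon as any process has touched it.
    resident = set().union(*touched_by_process.values())
    # Every process mapping the library is charged for all resident pages.
    return {name: len(resident) * PAGE_SIZE for name in touched_by_process}


def caused_by(touched_by_process):
    """Bytes that are resident because of each process's own accesses."""
    return {name: len(pages) * PAGE_SIZE
            for name, pages in touched_by_process.items()}


usage = {
    "mine": {3},              # I only need the page holding one function
    "other": set(range(20)),  # the other process uses the whole library
}
print(accounted_shared(usage))  # both processes are charged 20 pages
print(caused_by(usage))         # but I only caused one page to be resident
```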
After reading the top source, I understand pretty well how the values
Shared and Private memory are calculated (the source is basically
self-explanatory in this respect), what they actually mean and how to
interpret them. But I have no idea what resident memory is, what it
means, where it comes from and why it is so different from the sum of
private and shared for many processes.
Playing around with vmmap could not really answer that either. BTW,
vmmap is pretty much useless for building a tool that determines the
real memory use of a process. This is because a region that is COW
(Copy on Write) stays COW even if some pages within it have already
been copied. These pages would now count towards private, but vmmap
won't tell me how many pages have already been shadowed like this,
and whether a page is shared between many processes or already
private to one of them makes a huge difference. What I'm missing is a
tool that works just like vmmap but reports statistics about every
single *PAGE* of a process's address space, not just every larger
region, and that takes into account whether a COW page has been
copied or not. Then one could really write a tool that prints the
real shared and private memory amounts of my process, except for
issue (D), where every part of the library that has physical memory
is counted towards my shared memory (to correct that, I would have to
eliminate all other processes using the library and then measure
again).
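What such a per-page tool would compute can be sketched like this.
The page states and the example region are hypothetical, and the
"region view" below is a deliberately simplified stand-in for a tool
that only sees the region as a whole, not a description of what vmmap
actually prints.

```python
from collections import Counter

PAGE_SIZE = 4096  # assumed page size in bytes

# Hypothetical per-page states inside one copy-on-write region.
UNTOUCHED, SHARED, COPIED = "untouched", "shared", "copied"


def page_level_totals(page_states):
    """Account every page on its own: copied pages are private."""
    counts = Counter(page_states)
    return {
        "private": counts[COPIED] * PAGE_SIZE,
        "shared": counts[SHARED] * PAGE_SIZE,
    }


def region_level_totals(page_states):
    """Region view: the region is still COW, so all resident pages
    are reported as shared, including the ones already copied."""
    resident = sum(1 for state in page_states if state != UNTOUCHED)
    return {"private": 0, "shared": resident * PAGE_SIZE}


# A made-up 16-page COW region: 5 pages still shared, 3 already copied.
states = [SHARED] * 5 + [COPIED] * 3 + [UNTOUCHED] * 8
print(page_level_totals(states))
print(region_level_totals(states))
```

The two results differ by exactly the copied pages, which is the
information a region-only view cannot give me.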
Okay, your turn, please share your wisdom :)
Darwin-dev mailing list (email@hidden)