Re: Resident Memory, Private Memory, Shared Memory (top, Activity Monitor and vmmap)
- Subject: Re: Resident Memory, Private Memory, Shared Memory (top, Activity Monitor and vmmap)
- From: Markus Hanauska <email@hidden>
- Date: Thu, 6 Dec 2007 14:03:57 +0100
On Sat, 2007-12-01, at 20:07, Michael Smith wrote:
You're making an error here in assuming that Darwin is a "BSD
kernel" in the first place.
I'm sorry for that, you are of course right, it is not a BSD kernel :)
The code is (by definition) a correct and complete description of
what the system does.
Yes, but real documentation is to code what C is to assembler.
Assembler is also a perfectly accurate way to show someone how
something works, but if you can read the C code instead... I think
you get the point :P
You asked "why don't resident (1) plus private (2) equal shared
(3)", not "why don't private (2) plus shared (3) equal resident (1)".
Yes, also my bad. However, I guess the reply would not have been much
different if I had asked it correctly, would it?
You're making a fundamental mistake in thinking that any of this
accounting is being done for your (the application developers')
sake. The VM keeps accounting information for its own purposes,
and the statistics that are maintained and tracked bear for the
most part directly on how the implementation works, and what it
needs to know in order to provide the services it does.
Okay, in that case how about offering a tool that keeps track of the
data the developer actually wants to know? Such a tool might slow
down program execution by 500%, but I would not mind; when looking
for "where does the memory go", speed is not an issue.
Then what is it telling me? If the values are so useless for
everything, then why display them in the first place?
That's a good question, but well outside the scope of this
conversation as it strays into marketing territory and the somewhat
conservative attitudes that many old-school Unix administrators
have towards tools.
Okay, that explains why these numbers are in top, but I had not
expected them in Activity Monitor. All the other numbers in AM are
perfectly accurate and meaningful, and the way you'd interpret them
at first glance is the right way; all except the memory stats.
What you are asking for is what is generally referred to as the
working set size for your application.
Ahhh, now we are where I wanted to go. At least the thing now has a
name: "working set size".
It is a very difficult number to compute for a variety of reasons:
It doesn't need to be easy, only accurate.
- There are adaptive algorithms in play with both time- and space-
related parameters that will cause your working set to expand or
contract based on a number of factors not excluding CPU speed,
other thread activity in the system, actual physical memory, disk
speed, etc.
How do speed parameters influence the memory needs? Does it have to
do with caching? The memory needs of my tool should be constant at
any given time. I'm more interested in the "now" value than in the
"possible" value. IOW, I'd actually freeze the whole system to create
such a snapshot if necessary.
- There are other tasks in the system (server processes) that need
to make timely forward progress in order for your application to do
likewise, and their working sets for both your application and
other applications' work that may gate yours (and thus other
applications' working sets either partially or entire) need to be
considered.
Yes, but that can be ignored. Of course my app might indirectly cause
memory to be used, e.g. for the Keychain Service or within the
kernel, but this memory can be ignored. As a developer I must rely on
the system to work as expected, and this means the system will have
whatever memory demands it must have. They are beyond my control, and
I must trust that the other developers who wrote the system made sure
they are sane and only as big as absolutely necessary.
- The working set for an application can vary wildly based on the
configuration of the system it's running on, the user-set
preferences, the document or data being worked on, and so forth.
Certainly, but on a specific system, at a specific time, I could
freeze everything as said above and say "tell me now how big my
working set is, on that system, with the current state of the app".
- The number varies wildly with time; some applications' working
set size relates fairly directly to what the application is doing
at the time, or has periodic aspects that relate to a long-running
task, but for some applications it is grossly impacted by external
or non-direct factors that don't correlate to the application's
activity at all.
"Freezing" :)
Because this number is very difficult to derive, and because
deriving it at most tells you about limit conditions, it is
generally better as a developer to focus your attention on the
factors that affect the number and which are a consequence of your
application's behaviour instead.
Yes, but these factors are not limited to malloc. Most developers
focus only on malloc. This is wrong for many reasons.
E.g. stack size. Stacks have a maximum size, but they grow towards
that size as needed. Of course you can influence the growth rate as a
developer, e.g. by avoiding recursive implementations where
alternatives (iterative ones) are available; this will reduce stack
usage. Avoid large static buffers on the stack: instead of filling
your stack with a char[1024], do a malloc(1024) and free it before
the function returns. Avoid deep function call chains if there are
alternatives (message passing, etc.). Keeping the stack small is
within my control.
And what is also within my control is how many libraries I link
against. E.g. there might be a system library that offers
functionality I need. I can use it, meaning I link against this lib;
the lib is loaded and maybe shared (if someone else uses it, too) or
not shared (if I'm the only one using it). If this lib is loaded
just because I use it, it's again me influencing how memory is being
used. Instead of loading the lib, I could implement the functionality
myself. Then it won't be shared, but if the lib uses three times the
memory my implementation does, implementing it myself would save
memory again. So memory usage because of libraries is also within my
control.
Not to forget the code of my application itself. I can influence the
code size by making the code more compact, which might make it slower
(e.g. loop unrolling is done for speed, but makes the code bigger),
by using less inlining, by avoiding duplicate code (which will cause
more function calls to happen, and this again increases the stack
size needed), and so on. The code itself is within my control.
Tracing malloc is no problem; it can be done easily. Tracing stacks
is harder, but vmmap will work here with the -resident option. But
how about the library or code issue? The code itself can probably be
monitored with vmmap, too. But how useful is vmmap for libraries? It
would only be useful if it could really tell me how much memory is
lost to the libraries because of my app, and which other processes
besides mine need a given library (I guess linking against a dylib
that is used by three other system services is probably not
problematic; it's in memory anyway).
--
Best Regards,
Markus Hanauska
_______________________________________________
Darwin-dev mailing list (email@hidden)