Re: Resident Memory, Private Memory, Shared Memory (top, Activity Monitor and vmmap)
- Subject: Re: Resident Memory, Private Memory, Shared Memory (top, Activity Monitor and vmmap)
- From: Michael Smith <email@hidden>
- Date: Sat, 1 Dec 2007 11:07:31 -0800
On Nov 30, 2007, at 12:04 PM, email@hidden wrote:
I have only searched Apple's mailing list archive and I have not
found anything; Apple has too many ifdef __APPLE__ in the kernel to
say for sure that xnu behaves like any other BSD kernel.
You're making an error here in assuming that Darwin is a "BSD kernel"
in the first place.
The MacOS VM, which is what you're currently being confused by, is a
distant descendant of the Mach 2.5 VM, which is where the free BSDs
started as well, but it has evolved in very different directions as
the operating system itself has had very different motivating
factors. Other than in the most general sense, it is not very useful
to compare them.
It's also nowhere documented.
This is, technically speaking, not true; it is documented by the
(freely available) source code and the activity of your system.
Well, source code is not real documentation. The difference is that it
took me about 1-2 hours to figure out how top calculates private and
shared memory, something documentation could have told me in 2
sentences within 3 minutes.
The code is (by definition) a correct and complete description of what
the system does. It is the ultimate and most expressive
documentation, although it's not so good at describing intent.
1. The Resident Memory Size (aka Real Memory)
2. The Private Memory Size
3. The Shared Memory Size
So far, all these values seem self-explanatory... but they are not
at all, for one single reason:
How come 1 + 2 != 3???
What makes you think that they would be?
Because all the memory a process "has" (all pages in a process's
address space) is either shared or private.
This depends on your definition of "shared" and "private"; you are
here defining them as complete and exclusive but that is presumptive
and in fact not correct.
So the real memory a process
uses should be the sum of both. That the private and shared values are
calculated in such an unfavourable fashion is another thing.
You asked "why don't resident (1) plus private (2) equal shared (3)",
not "why don't private (2) plus shared (3) equal resident (1)".
"resident memory" is an old OS term, and you will find it defined
in many OS textbooks. If you have reached this point, it's more or
less assumed that you will have read Tanenbaum or something more
recent, and the concept should be self-explanatory.
Resident according to top's own explanation is:
'This column reflects the amount of physical memory currently
allocated to each process. This is also known as the "resident set
size" or RSS. A process can have a large amount of virtual memory
allocated (as indicated by the SIZE column) but still be using very
little physical memory.'
Top is a multi-platform utility with a long history; this definition
predates modern VM systems and is necessarily simplistic. However,
it's not fundamentally in conflict with reality, and had you read this
before making your earlier comments you'd have been less confused.
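The RSS-vs-virtual distinction is easy to demonstrate from user space. Below is a small sketch (Python, standard library only, on a POSIX system; it is not macOS-specific and says nothing about how top itself computes its columns): mapping a large anonymous region grows the virtual size immediately, but the pages only become resident once they are written.

```python
import mmap
import resource

def peak_rss() -> int:
    # Peak resident set size so far (kilobytes on Linux, bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()

# Map 64 MB of anonymous memory. This grows the *virtual* size at once,
# but the kernel hands out physical pages lazily.
region = mmap.mmap(-1, 64 * 1024 * 1024)
mapped_only = peak_rss()

# Touch every page; only now must the pages actually be resident.
region.write(b"\x01" * len(region))
touched = peak_rss()

region.close()
print(before, mapped_only, touched)
```

Run on a quiet interpreter, the jump between the second and third samples is roughly the size of the region, which is exactly the sense in which SIZE can be large while RSIZE stays small.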
MAP_PRIVATE is handled as COW
Which already makes a bit less sense, but it's no issue, since COW
memory with only one reference is still accounted towards private
memory by top (and private memory will only have one reference).
If you've checked the code I'll take it on face value, but I'm not
certain that a COW mapping against an object is actually cross-
referenced against other COW mappings (due to the interplay with copy
objects and the extreme cost involved) to determine whether the range
in question is practically shared or not.
In a pedantic interpretation, COW pages are only 'shared' if someone
else has a mapping for that page, otherwise they are merely 'shareable'.
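That pedantic distinction is observable from user space. Here is a minimal sketch (Python on POSIX; `MAP_PRIVATE` is the same mmap flag being discussed, though the script illustrates only the mapping semantics, not top's accounting): the private mapping reads the file's pages until the first write, at which point the kernel copies the page and the modification never reaches the file.

```python
import mmap
import os
import tempfile

# Back a small mapping with a real file.
fd, path = tempfile.mkstemp()
os.write(fd, b"original")

# MAP_PRIVATE: copy-on-write. Until written, the page is merely
# "shareable" -- it can be the same physical page as any other
# mapping of this file.
m = mmap.mmap(fd, 8, flags=mmap.MAP_PRIVATE)
assert bytes(m[:8]) == b"original"

# The first write forces the copy; the page is now private to us.
m[:8] = b"modified"
mapped = bytes(m[:8])

with open(path, "rb") as f:
    on_disk = f.read()   # the file itself is untouched

m.close()
os.close(fd)
os.unlink(path)
print(mapped, on_disk)
```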
and MAP_SHARED is erroneously always accounted as shared even if
nobody else has that range of the object mapped.
That is a bit stupid, since it does not reflect reality.
As I note above, the alternative would require an exhaustive search of
the arbitrarily large set of possibly overlapping mappings.
It would help this conversation if you would stop saying "stupid" when
you really mean "I do not understand".
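For contrast, here is the unambiguous case, sketched in Python on POSIX (again only the mmap semantics, not xnu's bookkeeping): a MAP_SHARED mapping refers to the same physical pages in every process that maps it, so a write made in a forked child is immediately visible to the parent.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.ftruncate(fd, mmap.PAGESIZE)

# MAP_SHARED: every mapping of this file range uses the same
# physical pages.
shared = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED)

pid = os.fork()
if pid == 0:
    # Child: write through its inherited view of the shared pages.
    shared[:5] = b"hello"
    os._exit(0)

os.waitpid(pid, 0)
seen = bytes(shared[:5])   # the parent sees the child's write

shared.close()
os.close(fd)
os.unlink(path)
print(seen)
```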
OTOH drawing
buffers of Cocoa apps are in fact really shared (between the
application and the window server; that's what I saw in some
technote), and still saying "they are shared" seems awkward: it's
technically correct, but these buffers only exist because of the
application, and without it they will also vanish from the window
server, so they could just as well be accounted to that application
only.
Whilst this is basically an erroneous digression, it's a good place to
hang a point.
You're making a fundamental mistake in thinking that any of this
accounting is being done for your (the application developers') sake.
The VM keeps accounting information for its own purposes, and the
statistics that are maintained and tracked bear for the most part
directly on how the implementation works, and what it needs to know in
order to provide the services it does.
As a consequence, the VM is entirely ignorant (and rightly so) of the
application or framework level semantics applicable to the interaction
between (say) a client and the window server. That interaction uses
Mach shared memory semantics (amongst others), and gets the same
behaviour that anyone else using those interfaces gets.
It matters not that you-the-developer think that the window server
"doesn't count"; from the perspective of the VM and the system as a
whole, for whom that information is maintained, it does, and thus the
numbers reflect that.
As I've previously noted, if you want to look at things from an
application-centric perspective, you should be using tools that are
designed for that purpose.
Top is not telling you, except in the crudest sense, how much real
memory your process uses.
Then what is it telling me? If the values are so useless for
everything, then why display them in the first place?
That's a good question, but well outside the scope of this
conversation as it strays into marketing territory and the somewhat
conservative attitudes that many old-school Unix administrators have
towards tools.
Note that Activity Monitor (which displays the same values as top,
but I have no source to back this up, I can only compare them) is not
a developer tool. It is used by normal, simple-minded users who have
no idea what "virtual memory" actually means. And this utility
displays real, private and shared memory to the user. What do you
expect a simple-minded user will think when seeing these values?
Based on what I'd call reasonable experience, I'd say "not very much".
Simple-minded users are using these values to determine the memory
consumption of a process. Now if these values are so far away from
reality, they are useless and should not even be displayed to
simple-minded users. Instead, this should all be replaced with a
single value that really can give the user a rough estimate of how
much memory this process needs.
That number doesn't exist. Computing the working set size of a given
collection of threads and their dependencies is more expensive than
the system can afford at runtime, when for the most part such a gross
number is not useful or interesting as such.
Understanding the real memory usage of your task is better handled
with different tools.
Now it gets interesting. That would be which tools?
I would encourage you to tinker with Shark and Instruments (edited
"X-Ray" to the shipping name "Instruments") to see if you can't get a
better understanding of what your process is doing and what that
costs.
These tools basically tell me how much memory I have "wasted" using
malloc. Even in the best case, they might take my stack into account.
But what they won't take into account is the memory lost by caching
code pages, the memory lost by loaded libraries and their code pages,
and so on. And only in the rarest cases will they even distinguish
private and shared memory at all.
You are trying to force your perception of the way the system uses
memory onto it and the tools. You might benefit from stepping back
and taking in the way that the tools talk about the system, as they
are the product of many years of experience tuning applications, and
as such they tend to reflect the things that have previously been
worth looking at.
Knowing how a given system reacts to what your application is doing at
a given time can be entertaining, but it's much more interesting to
know how systems in general will react when your application is
running, and this abstract view is more generally useful.
So if I know my process uses malloc to allocate about 800 KB of
memory during runtime, this could not be any further away from the
real amount of memory the system loses by running my process as it
is. This is an approximation of memory usage that is far, far away
from reality, and a much worse approximation than all the values
mentioned before.
If I assume by "loses" you mean "uses", then the number varies wildly
based on what your application and the system are both doing at the
time.
The number I'm looking for is:
How much memory does the system need to keep my process running,
assuming it was not allowed to swap any of it (everything needs to
be in memory)? And here it should take into account any memory that
belongs to my process directly or indirectly, which includes malloc,
stack, code, static memory, code of libraries and static memory of
libraries, and code needed by dyld - and it should clearly
distinguish between shared and unshared memory.
What you are asking for is what is generally referred to as the
working set size for your application.
It is a very difficult number to compute for a variety of reasons:
- There are adaptive algorithms in play with both time- and space-
related parameters that will cause your working set to expand or
contract based on a number of factors not excluding CPU speed, other
thread activity in the system, actual physical memory, disk speed, etc.
- There are other tasks in the system (server processes) that need
to make timely forward progress in order for your application to do
likewise; their working sets, for both your application's work and
other applications' work that may gate yours (and thus those
applications' working sets, either partially or entirely), need to be
considered.
- The working set for an application can vary wildly based on the
configuration of the system it's running on, the user-set preferences,
the document or data being worked on, and so forth.
- The number varies wildly with time; some applications' working set
size relates fairly directly to what the application is doing at the
time, or has periodic aspects that relate to a long-running task, but
for some applications it is grossly impacted by external or non-direct
factors that don't correlate to the application's activity at all.
Because this number is very difficult to derive, and because deriving
it at most tells you about limit conditions, it is generally better as
a developer to focus your attention on the factors that affect the
number and which are a consequence of your application's behaviour
instead. These are generally easier to identify, and since they're
something that you can do something about, a better place to start
anyway.
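One cheap, portable observation along these lines (a sketch using Python's stdlib `resource` module, not one of the Apple tools mentioned above) is that peak resident size only ratchets upward as the working set grows, which is part of why a single point-in-time number tells you so little:

```python
import resource

def peak_rss() -> int:
    # Peak resident set size so far (kilobytes on Linux, bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

samples = [peak_rss()]
blobs = []
for _ in range(3):
    # Grow the working set a little, then sample again.
    blobs.append(bytearray(4 * 1024 * 1024))
    samples.append(peak_rss())

# The peak can only ratchet upward; it never reflects pages given back.
assert samples == sorted(samples)
print(samples)
```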
= Mike
_______________________________________________
Darwin-dev mailing list (email@hidden)