Re: Resident Memory, Private Memory, Shared Memory (top, Activity Monitor and vmmap)
- Subject: Re: Resident Memory, Private Memory, Shared Memory (top, Activity Monitor and vmmap)
- From: Markus Hanauska <email@hidden>
- Date: Thu, 6 Dec 2007 14:03:57 +0100
On Sat, 2007-12-01, at 20:07, Michael Smith wrote:
You're making an error here in assuming that Darwin is a "BSD
kernel" in the first place.
I'm sorry for that, you are of course right, it is not a BSD kernel :)
The code is (by definition) a correct and complete description of
what the system does.
Yes, but real documentation is to code what C is to assembler.
Assembler is also a perfectly accurate way to show someone how
something works, but if you can read the C code instead... I think
you get the point :P
You asked "why don't resident (1) plus private (2) equal shared
(3)", not "why don't private (2) plus shared (3) equal resident (1)".
Yes, also my bad. However, I guess the reply would not have been much
different if I had asked it correctly, would it?
You're making a fundamental mistake in thinking that any of this
accounting is being done for your (the application developers')
sake. The VM keeps accounting information for its own purposes,
and the statistics that are maintained and tracked bear for the
most part directly on how the implementation works, and what it
needs to know in order to provide the services it does.
Okay, in that case how about offering a tool that keeps track of the
data the developer actually wants to know? Such a tool might slow
down program execution by 500%, but I would not mind; when looking
for "where does the memory go", speed is not an issue.
Then what is it telling me? If the values are so useless for
everything, then why display them in the first place?
That's a good question, but well outside the scope of this
conversation as it strays into marketing territory and the somewhat
conservative attitudes that many old-school Unix administrators
have towards tools.
Okay, that explains why these numbers are in top, but I had not
expected them in Activity Monitor. All the other numbers in AM are
perfectly accurate and meaningful, and the way you'd interpret them
at first glance is the right way; all except the memory stats.
What you are asking for is what is generally referred to as the
working set size for your application.
Ahhh, now we are where I wanted to go. At least the thing now has a
name: "working set size".
It is a very difficult number to compute for a variety of reasons:
It doesn't need to be easy, only accurate.
- There are adaptive algorithms in play with both time- and space-
related parameters that will cause your working set to expand or
contract based on a number of factors not excluding CPU speed,
other thread activity in the system, actual physical memory, disk
speed, etc.
How do speed parameters influence the memory needs? Does it have to
do with caching? The memory needs of my tool should be constant at
any given time. I'm more interested in the "now" value than in the
"possible" value. IOW, I'd actually freeze the whole system to create
such a snapshot if necessary.
- There are other tasks in the system (server processes) that need
to make timely forward progress in order for your application to do
likewise, and their working sets for both your application and
other applications' work that may gate yours (and thus other
applications' working sets either partially or entire) need to be
considered.
Yes, but that can be ignored. Of course my app might indirectly cause
memory to be used, e.g. for the Keychain Service or within the
kernel, but this memory can be ignored. As a developer I must rely on
the system to work as expected, and this means the system will have
whatever memory demands it must have. They are beyond my control, and
I must trust that the other developers who wrote the system made sure
they are sane and only as big as absolutely necessary.
- The working set for an application can vary wildly based on the
configuration of the system it's running on, the user-set
preferences, the document or data being worked on, and so forth.
Certainly, but on a specific system, at a specific time, I could
freeze everything as said above and say "tell me now how big my
working set is, on that system, with the current state of the app".
- The number varies wildly with time; some applications' working
set size relates fairly directly to what the application is doing
at the time, or has periodic aspects that relate to a long-running
task, but for some applications it is grossly impacted by external
or non-direct factors that don't correlate to the application's
activity at all.
"Freezing" :)
Because this number is very difficult to derive, and because
deriving it at most tells you about limit conditions, it is
generally better as a developer to focus your attention on the
factors that affect the number and which are a consequence of your
application's behaviour instead.
Yes, but these factors are not limited to malloc. Most developers
focus only on malloc. This is wrong for many reasons.
E.g. stack size. Stacks have a maximum size, but they grow towards
that size as needed. Of course you can influence the growth rate as a
developer, e.g. by avoiding recursive implementations where
alternatives (iterative ones) are available; this will reduce stack
usage. Avoid large static buffers on the stack: instead of filling
your stack with a char[1024], do a malloc(1024) and free it before
the function returns. Avoid deep function call chains if there are
alternatives (message passing, etc.). Keeping the stack small is
within my control.
And what is also within my control is how many libraries I link
against. E.g. there might be a system library that offers
functionality I need. I can use it, meaning I link against this lib;
the lib is loaded and maybe shared (if someone else uses it, too) or
not shared (if I'm the only one using it). If this lib is loaded
just because I use it, it's again me influencing how memory is being
used. Instead of loading the lib, I could implement the functionality
myself. Then it won't be shared, but if the lib uses three times the
memory my implementation does, implementing it myself would save
memory again. So memory usage because of libraries is also within my
control.
Not to forget the code of my application itself. I can influence the
code size by making the code more compact, which might make it slower
(e.g. loop unrolling is done for speed, but makes the code bigger),
by using less inlining, by avoiding duplicate code (which will cause
more function calls to happen, and this again increases the stack
size needed), and so on. The code itself is within my control.
Tracing malloc is no problem; it can be done easily. Tracing stacks
is harder, but vmmap will work here with the -resident option. But
how about the library or code issue? The code itself can probably be
monitored with vmmap, too. But how useful is vmmap for libraries? It
would only be useful if it could really tell me how much memory is
lost to the libraries because of my app, and which other processes
besides mine need a given library (I guess linking against a dylib
that is used by three other system services is probably not
problematic; it's in memory anyway).
--
Best Regards,
Markus Hanauska
_______________________________________________
Darwin-dev mailing list (email@hidden)