Re: Find argument passed to running process

2 Sep 2009


To actually answer the question, look at the source code for "ps":

<http://developer.apple.com/mac/library/qa/qa2001/qa1123.html>

Do you really not understand the concept of a sustainable binary API?
For example, a commercial application sold by your employer.
-- Terry
Darwin-dev mailing list      (Darwin-dev@lists.apple.com)
On Sep 2, 2009, at 2:37 AM, Nico Schmidt <lists@nschmidt.name> wrote:

On 02.09.2009, at 10:07, Terry Lambert wrote:

On Sep 1, 2009, at 9:38 PM, Stephen J. Butler wrote:

On Tue, Sep 1, 2009 at 9:55 AM, Rakesh Singhal<rakesh.singhal@gmail.com
...
wrote:
I can find the pid for any running process using the code given here:
http://developer.apple.com/mac/library/qa/qa2001/qa1123.html. But I
want to find the arguments passed to this running process when it was
launched. I checked the kinfo_proc structure but I could not find
anything related to arguments.

<http://www.opensource.apple.com/source/adv_cmds/adv_cmds-119/ps.tproj/ps.c
...
It calls sysctl with { CTL_KERN, KERN_PROC, what, flags } (look a
little higher up for what gets stuffed in those vars). From the
result, it then looks at "kp[i].kp_proc.p_comm" (saveuser function).
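
The kinfo_proc record returned by that sysctl carries only the short
command name in p_comm, not the argument vector. As far as I know, ps
gets the arguments through a separate sysctl, { CTL_KERN,
KERN_PROCARGS2, pid }. What follows is a sketch under stated
assumptions, not the ps implementation: the helper names
procargs2_argc, procargs2_arg and fetch_procargs2 are hypothetical, and
the buffer layout (an int argc, the executable path, NUL padding, then
the NUL-terminated arguments) is an undocumented detail that may change
between OS releases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>

/* Fetch the raw argument buffer for pid; returns bytes written or -1. */
long fetch_procargs2(pid_t pid, char *buf, size_t cap)
{
    int mib[3] = { CTL_KERN, KERN_PROCARGS2, (int)pid };
    size_t len = cap;

    if (sysctl(mib, 3, buf, &len, NULL, 0) != 0)
        return -1;
    return (long)len;
}
#endif

/* Read the leading argc from the buffer; -1 if the buffer is too short. */
int procargs2_argc(const char *buf, size_t len)
{
    int argc;

    assert(buf != NULL);
    if (len < sizeof argc)
        return -1;
    memcpy(&argc, buf, sizeof argc);
    return argc < 0 ? -1 : argc;
}

/*
 * Return a pointer to argument number `index`, or NULL if it is out of
 * range or the buffer is malformed.
 * ASSUMED layout: [int argc][exec path][NUL padding][arg0\0arg1\0...]
 */
const char *procargs2_arg(const char *buf, size_t len, int index)
{
    int argc = procargs2_argc(buf, len);
    size_t pos = sizeof(int);

    if (argc < 0 || index < 0 || index >= argc)
        return NULL;

    while (pos < len && buf[pos] != '\0')   /* skip the executable path */
        pos++;
    while (pos < len && buf[pos] == '\0')   /* skip the NUL padding */
        pos++;

    for (int i = 0; pos < len; i++) {
        const char *end = memchr(buf + pos, '\0', len - pos);

        if (end == NULL)                    /* unterminated string */
            return NULL;
        if (i == index)
            return buf + pos;
        pos = (size_t)(end - buf) + 1;
    }
    return NULL;
}
```

The buffer passed to fetch_procargs2 can be sized from the KERN_ARGMAX
sysctl. Because the arguments come back NUL-separated, embedded spaces
and newlines survive intact. An empty argv[0] would be swallowed by the
padding skip, so treat this as a starting point only.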

"The UNIX Programming FAQ lists a number of alternative ways to do
this. Of these, the only approach that works on Mac OS X is exec'ing
the ps command line tool. exec'ing ps will require parsing the tool's
output and will not use system resources as efficiently as Listing 1."
...on the other hand, the ps method is more likely to continue to
work in future versions of the OS.  Also note that the Q&A is 7 or 8
years old, depending on how you read the dates.

The problem with parsing the output of ps is that it only works with
the most harmless arguments.

If a command line argument contains funny characters, such as a
carriage return, it will show up as "^M" in ps's output.

I always wondered why there is no way to get the information ps
delivers from a library function. Parsing output may have worked
in the days when a Unix user would not even imagine putting
anything but printable non-whitespace characters in a file name.
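
To make the ambiguity concrete, here is a small sketch of what any
one-line listing does to an argument vector. It is a simplified model,
not ps's actual formatting code, and join_args is a hypothetical helper
name. Once the arguments are joined with spaces, the boundaries between
them cannot be recovered from the text.

```c
#include <stdio.h>
#include <string.h>

/*
 * Join argc arguments with single spaces into out (capacity cap).
 * Returns the length written; stops cleanly if the text would not fit.
 */
size_t join_args(char *out, size_t cap, const char *const argv[], int argc)
{
    size_t used = 0;

    if (cap == 0)
        return 0;
    out[0] = '\0';

    for (int i = 0; i < argc; i++) {
        int n = snprintf(out + used, cap - used, "%s%s",
                         i ? " " : "", argv[i]);

        if (n < 0 || (size_t)n >= cap - used) {
            out[used] = '\0';   /* drop the partially written argument */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

With this model, { "cp", "my file", "dst" } and { "cp", "my", "file",
"dst" } both flatten to "cp my file dst", so a parser reading the line
cannot tell a three-argument command from a four-argument one.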

The answer is that it requires committing to an API which, for
binary compatibility reasons, can't be changed thereafter for close to
a decade without breaking one or more commercial applications.

When you do that, it constrains your underlying implementation
details, and that prevents performance improvements and other positive
changes to architecture.

Publishing an API without considering very long term consequences is
worse than publishing no API.

For example, consider that procfs exposes a flat numeric PID
namespace. If some important product starts relying on it, then until
that product both recognizes a need to get off that API and has had
sufficient time to migrate its customers to a version which actually
gets off that API, whatever paradigm the OS comes up with for
representing things now has to be capable of being jammed, however
inefficiently, into a flat numeric namespace. Or the application
breaks.
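
As an illustration of how that dependency gets baked in, a client that
walks a procfs-style directory typically ends up with a helper along
these lines. This is a sketch only: pid_from_entry is a hypothetical
name, and the directory layout it presumes is exactly the assumption
under discussion.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Interpret a directory entry name as a decimal PID.
 * Returns 1 and stores the value if the name is all digits, else 0.
 */
int pid_from_entry(const char *name, long *pid)
{
    char *end;
    long value;

    if (name == NULL || !isdigit((unsigned char)name[0]))
        return 0;

    errno = 0;
    value = strtol(name, &end, 10);
    if (errno != 0 || *end != '\0')   /* overflow or trailing junk */
        return 0;

    *pid = value;
    return 1;
}
```

Every shipped copy of a function like this is a reason the OS has to
keep producing entry names it can parse.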

Now multiply this problem by the number of application vendors out
there. How likely are they to all move off a "bad" interface within,
say, 2 years of it being deprecated? 5? How do you keep new code,
written either by people who don't know any better, or who know better
but are up against a deadline and willing to do dirty tricks to meet
it, from using the interface?

This is a common problem with all data interfaces, like procfs, which
were not well thought out in terms of long-term consequences. It
doesn't matter if you are talking about a data iterator interface like
opendir/readdir/closedir applied against a Schelling point, like the
known path to the location in a procfs where the directory entries
represent a list of integers-as-ASCII strings in a flat numeric
namespace, or it's something else. Like jamming a PID into the third
32-bit integer value in an OID tuple in that sysctl (above) that was
suggested in answer to this question, thus capping the system's future
PID values because of the OID value field size. Or deciding because an
interface is protected from access by non-root processes that you can
cast a pointer to an int, then pass it back as a token and cast it
back into a pointer, "because everyone knows pointers and integers are
the same size".
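
The pointer-as-token shortcut is easy to demonstrate. The sketch below
uses hypothetical helper names (token_from_ptr, ptr_from_token,
fits_32bit); it shows that a round trip through uintptr_t preserves the
pointer, while a token squeezed into 32 bits only survives when the
upper bits happen to be zero.

```c
#include <stdint.h>

/* Round-trip a pointer through an integer type wide enough to hold it. */
uintptr_t token_from_ptr(void *p)
{
    return (uintptr_t)p;
}

void *ptr_from_token(uintptr_t token)
{
    return (void *)token;
}

/*
 * Model of the "pointers and ints are the same size" shortcut:
 * returns 1 if the token is unchanged after being narrowed to 32 bits,
 * 0 if the narrowing threw away significant bits.
 */
int fits_32bit(uintptr_t token)
{
    uint32_t narrowed = (uint32_t)token;

    return (uintptr_t)narrowed == token;
}
```

On a platform with 64-bit pointers, any address above 4 GB fails
fits_32bit, which is the point at which an interface built on that
assumption breaks for callers that could not have known about it.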

Data interfaces are a bad idea. Procedural interfaces are only barely
better, since you can codify constraints into them with bad API
design, too. Don't get me started on early systems with 32-bit off_t
types as the third argument to the read and write calls, as one
example. But at least a procedural interface, even an iterator,
doesn't _have_ to expose underlying structure, the way opendir() does
when used against another data interface (e.g. the exposed file system
namespace of a procfs, with a well known layout that can't change
after code starts to depend on it).
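
For contrast, here is a minimal sketch of a procedural iterator that
hides its underlying structure. Everything in it is hypothetical: the
proc_iter names are invented for illustration, and a fixed table stands
in for whatever the system would really consult. Callers see only an
incomplete type and three functions, so the representation behind them
is free to change.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Public face: an opaque handle and three calls (hypothetical API). */
typedef struct proc_iter proc_iter;
proc_iter *proc_iter_open(void);
int proc_iter_next(proc_iter *it, char *name, size_t cap);
void proc_iter_close(proc_iter *it);

/* Private side: callers never see this layout. */
struct proc_iter {
    size_t pos;
};

/* Stand-in data; a real implementation would query the system. */
static const char *const fake_table[] = { "launchd", "ps", "sh" };

proc_iter *proc_iter_open(void)
{
    proc_iter *it = malloc(sizeof *it);

    if (it != NULL)
        it->pos = 0;
    return it;
}

/* Copy the next name into `name`; returns 1 on success, 0 when done. */
int proc_iter_next(proc_iter *it, char *name, size_t cap)
{
    size_t count = sizeof fake_table / sizeof fake_table[0];

    if (it == NULL || cap == 0 || it->pos >= count)
        return 0;
    strncpy(name, fake_table[it->pos++], cap - 1);
    name[cap - 1] = '\0';
    return 1;
}

void proc_iter_close(proc_iter *it)
{
    free(it);
}
```

Even here the signatures themselves become commitments: the caller-sized
name buffer and the int return value are now things that cannot change
without breaking someone.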

So far there are no compelling business reasons I'm aware of to design
a sustainable API for this particular information, other than talking
to the system utilities which have their data interfaces more or less
frozen in place by international standards. On the other hand, there
are some pretty compelling engineering reasons to *not* do it,
starting with the constraints it would inevitably place on engineers
trying to make changes later.