Re: (OT) path_helper [was Re: Xterm not reading dotfiles]
Re: (OT) path_helper [was Re: Xterm not reading dotfiles]
- Subject: Re: (OT) path_helper [was Re: Xterm not reading dotfiles]
- From: Vernon Williams <email@hidden>
- Date: Mon, 19 Nov 2007 20:31:59 -0600
Monday, November 19, 2007, 6:14pm
I would be interested in seeing what kind of bookkeeping you would
do to remove the second sort. Not that it would probably make much
difference in speed, but I love algorithms, love coming up with them,
love getting new ideas.
If one wasn't trying to preserve the original order of the remaining
values, one clearly would not need the second sort.
Indeed, I have frequently used such a technique when I did not care
about the order of the remaining values. For example, I have a simple
C filter program to take lines of sorted text and output the lines with
duplicates removed.
The sorted list with duplicates removed, before the second sort,
will not likely have the values in the initial order, and it is hard
(at least for me) to see how to get it back in order without doing
another sort, or some searching through the array that would be
more or less equivalent to a sort. The sort, at least, is guaranteed
to be pretty fast, and is a nice, clean way to do it, with minimal fuss.
And if there are a lot of duplicates, the shortened list will sort even
faster than the original.
Now that I think about it a bit more, I guess you could do an index
sort, where you have an array of indices which gets sorted, so that
the original array remains unchanged. Then you could use the
indices into the original array to find duplicates and mark them in
an auxiliary boolean array. Finally, you could go through the
original array and print only those items not marked as duplicated.
Was this what you had in mind? I actually thought about this
briefly, but then I would have had to use something something
other than qsort, designed explicitly for an index sort, or use
global data to access the original array. Or maybe use an array
of pointers to the original items (though more pointers typically
reduces clarity).
Having thought of this, I just sat down and made a modified
version of my original program to do as I just described above,
with the array of items as a global variable, and it seems to work.
I am glad you mentioned the possibility of eliminating the second
sort, as it prodded me to figure this out. I think I still like the
first
method better, without having to use a global variable (which I
try to avoid when possible), which seems clearer to me even
though longer than the new way, but it always nice to have an
alternative method. I am sometimes amazed at the vastly
different ways many programming problems can be solved,
indicating the many ways creativity can be exercised in
programming or mathematics.
The new version of the program is attached below, for anyone
who might be curious.
Vernon Williams
On Nov 19, 2007, at 8:41 AM, Andrew J. Hesford wrote:
On Nov 19, 2007, at 4:03 AM, Vernon Williams wrote:
Monday, November 19, 2007, 3:16am
For whatever it is worth, attached below is a little C program I just
hacked out to list its command arguments to standard output,
in order, but with any duplicates (except for the first) removed.
It could readily be converted into a filter as well. It is a fast,
general
purpose program for removing duplicates, using two qsorts,
which might be useful in the current problem, at least for the
algorithm. I have not tested it extensively, but it seems to work
for the cases I have tried. If anybody tries it and finds any
problems,
I would be glad to hear about it.
Vernon Williams
Yes, I have used qsort to remove duplicate data from lists before, and
it is quite useful. In fact, you can remove one of the qsort calls, if
you do a little extra bookkeeping while you are identifying duplicates
in the sorted list. When Perl builds a hash, I would assume it is
using something like qsort to order its arrays, so the underlying
mechanisms might be similar.
--
Andrew J. Hesford <email@hidden>
Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
/* C Program remove_dups1 Level 0 Date 11/19/07
/*
/* Purpose: to remove duplicated arguments and write the remaining
/* arguments to standard output, one per line; if an
/* argument is duplicated, then only the first occurrence
/* is kept.
/*
/* Usage: remove_dups1 { arg }
/*
/* Written by Vernon Williams, November 19, 2007.
/*
/*--------------------------------------------------------------------------*/
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
/* types: */
struct sort_item {
char * item;
int index;
};
/* global variables and arrays: */
static int num_items;
static struct sort_item * items;
int main (int num_args, char * * args)
{
/* local variables and arrays: */
int it;
int * indices;
int * keep;
int status;
/* external functions: */
int item_comparison (const void *, const void *);
status = 1;
if (num_args > 1) {
num_items = num_args - 1;
/* allocate the items array (arguments and argument numbers): */
items = (struct sort_item *) malloc (num_items *
sizeof (struct sort_item));
/* allocate the array indices to sort and the array of flags */
/* for items to keep (after removing duplicates): */
indices = (int *) malloc (num_items * sizeof (int));
keep = (int *) malloc (num_items * sizeof (int));
if (items != NULL && indices != NULL && keep != NULL) {
/* allocations succeeded, so set up the arrays: */
for (it = 0; it <= num_items - 1; it++) {
items [it].item = args [it + 1];
items [it].index = it + 1;
indices [it] = it;
keep [it] = 1;
}
/* sort the index array by argument and then argument number: */
qsort (& (indices [0]), num_items, sizeof (indices [0]),
& item_comparison);
/* mark the items not to keep (those duplicated): */
for (it = 0; it <= num_items - 2; it++) {
if (strcmp (items [indices [it]].item,
items [indices [it + 1]].item) == 0) {
keep [indices [it + 1]] = 0;
}
}
/* write the arguments to keep to standard output */
/* in the proper order, one per line: */
for (it = 0; it <= num_items - 1; it++) {
if (keep [it]) {
printf ("%s\n", items [it].item);
}
}
status = 0;
}
}
return status;
}
/*--------------------------------------------------------------------------*/
/* compare the strings in two items, and the indices in the two items */
/* if the two strings are the same: */
int item_comparison (const void * item1_ptr, const void * item2_ptr)
{
/* local variables: */
int * itm1_ptr;
int * itm2_ptr;
int index1;
int index2;
char * str1;
char * str2;
int ind1;
int ind2;
int result;
itm1_ptr = (int *) item1_ptr;
itm2_ptr = (int *) item2_ptr;
index1 = (* itm1_ptr);
index2 = (* itm2_ptr);
str1 = items [index1].item;
str2 = items [index2].item;
result = strcmp (str1, str2);
if (result == 0) {
ind1 = items [index1].index;
ind2 = items [index2].index;
if (ind1 < ind2) {
result = -1;
} else if (ind1 > ind2) {
result = 1;
} else {
result = 0;
}
}
return result;
}
_______________________________________________
Do not post admin requests to the list. They will be ignored.
X11-users mailing list (email@hidden)
This email sent to email@hidden