sem_otime getting reset to 0
After a semaphore is created, the sem_otime member of the semid_ds returned by IPC_STAT is initialized to 0. After the first semop call it is set to a non-zero value and, AFAIK, should never be reset to 0. However, this is precisely what I'm seeing.

Attached is a program [attachment removed and message resent due to list moderation - will send semc.c on demand directly] that performs a variety of semaphore operations. This command opens a semaphore with IPC_CREAT, IPC_EXCL, mode 777, 1 sem in the array, and initializes its value to 10:

    $ ./semc /tmp/foo -o 1 1 777 1 10
    CMD_OPEN: semid: 0 semnum: 0 path: /tmp/foo flags: oflags: O_CREAT|O_EXCL mode: 0777 numsems: 1 value: 10
    393216 = cmd_open("/tmp/foo", 1, O_CREAT|O_EXCL, 0777, 10)

We can query the value with:

    $ ./semc 393216 0 -g
    CMD_GETVAL: semid: 393216 semnum: 0 path: flags: oflags: mode: 0000 numsems: 0 value: 0
    10 = cmd_getvalue(393216, 0, GETVAL)

10 is right. So far so good. But if we try to open the semaphore *without O_EXCL* there's a problem:

    $ ./semc /tmp/foo -o 1 0 777 1 10
    CMD_OPEN: semid: 0 semnum: 0 path: /tmp/foo flags: oflags: O_CREAT mode: 0777 numsems: 1 value: 10
    149: semid_get: Operation timed out
    Command failed

The problem is that it is necessary to check that sem_otime != 0 to avoid a race condition in which caller A creates a sem with semget, but a second caller B swoops in, opens the sem, and performs an operation (e.g. post, wait, getval, etc.) before caller A has had the opportunity to initialize the sem value with semop. The code that does this in the attached semc.c follows:

    130  if ((semid = semget(key, nsems, 0)) != -1) {
    131      struct semid_ds buf;
    132
    133  /* This inner try-loop ensures that the semaphore is initialized before
    134   * we return even if the semaphore has been created with semget but not
    135   * yet initialized with semctl. See Stevens' UNPv2 p274.
    136   */
    137      arg.buf = &buf;
    138      for (max = MAX_TRIES; max; max--) {
    139          if (semctl(semid, 0, IPC_STAT, arg) == -1) {
    140              ERR(errno, "semctl");
    141              return -1;
    142          }
    143          if (buf.sem_otime != 0) {
    144              return semid;
    145          }
    146          sleep(1);
    147      }
    148
    149      ERR(errno = ETIMEDOUT, "semid_get");
    150      return -1;

This hole in the API is described very well in Stevens' Unix Network Programming, Volume 2, page 274.

So the problem with Darwin is that creating a sem, exiting, and then opening the sem a second time appears to reset the sem_otime value. This code works as expected (sem_otime does not get reset to 0) on Linux 2.4, HP-UX, OSF1, and BSD. It's only Darwin that behaves differently.

This is a major problem because a caller opening an existing semaphore will just hang, waiting for some other, nonexistent program to initialize it.

Please advise,
Mike
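For readers without semc.c, the create-then-wait protocol described above can be sketched roughly as follows. This is not the attached program. The names sem_arg, sem_get_otime, sem_create_init and sem_open_existing are hypothetical, and the sketch assumes the creator sets the value with semop (as in Stevens' approach) so that sem_otime becomes non-zero on a system that behaves as expected.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for union semun: Linux leaves that union to the
 * caller, while Darwin/BSD headers already define it, so a differently
 * named union with the same members avoids a redefinition clash. */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Return sem_otime for semid, or (time_t)-1 if IPC_STAT fails. */
static time_t sem_get_otime(int semid)
{
    struct semid_ds ds;
    union sem_arg arg;

    arg.buf = &ds;
    if (semctl(semid, 0, IPC_STAT, arg) == -1)
        return (time_t)-1;
    return ds.sem_otime;
}

/* Creator side: create a one-semaphore set exclusively, zero it with
 * SETVAL, then raise it to `value` with semop. The semop call is what
 * is expected to make sem_otime non-zero. */
static int sem_create_init(key_t key, int value)
{
    union sem_arg arg;
    struct sembuf op;
    int semid;

    assert(value > 0);
    if ((semid = semget(key, 1, IPC_CREAT | IPC_EXCL | 0600)) == -1)
        return -1;

    arg.val = 0;
    op.sem_num = 0;
    op.sem_op = (short)value;
    op.sem_flg = 0;
    if (semctl(semid, 0, SETVAL, arg) == -1 || semop(semid, &op, 1) == -1) {
        int saved = errno;
        semctl(semid, 0, IPC_RMID);   /* do not leave a half-made set behind */
        errno = saved;
        return -1;
    }
    return semid;
}

/* Opener side: open an existing set and poll once per second until the
 * creator's first semop has been recorded in sem_otime. */
static int sem_open_existing(key_t key, int max_tries)
{
    int semid = semget(key, 1, 0);

    if (semid == -1)
        return -1;
    for (; max_tries > 0; max_tries--) {
        time_t otime = sem_get_otime(semid);

        if (otime == (time_t)-1)
            return -1;
        if (otime != 0)
            return semid;
        sleep(1);
    }
    errno = ETIMEDOUT;
    return -1;
}
```

Calling sem_get_otime right after sem_create_init, and again after a later sem_open_existing on the same key, should yield a non-zero time in both cases. A 0 from the second call would match the reset reported here.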
participants (1)
-
Michael B Allen