Hi,

Thanks for your reply.

With the help of the experts, I have finally been able to do cross-boundary communication successfully. Below is the code snippet for your review.

As per Terry's and Michael's comments, I have allocated the command packet memory dynamically, and the response packet is kept statically allocated.

Shared Structure:
=================

    typedef struct {
        unsigned long CmdSize;   // Size of command parameters
        unsigned char InstCode;  // Opcode part of CmdCode (set by driver)
    } CMD_HEAD, *PCMD_HEAD;

    typedef struct {
        unsigned long RspSize;   // Size of data returned
        unsigned char ErrorInfo; // Error information
    } RSP_HEAD, *PRSP_HEAD;

    typedef union _IOCTL_COMMAND {
        PCMD_HEAD pApiCmd;       // Pointer to Command head
        RSP_HEAD  ApiRsp;        // Response head buffer
    } IOCTL_COMMAND, *PIOCTL_COMMAND;

    typedef struct {
        unsigned long pIndex;    // the index required by GetDeviceInfo
        unsigned long iBufferSz; // the command size
        unsigned long oBufferSz; // the expected output buffer size
        IOCTL_COMMAND IOCmd;
    } COMMAND_EXCHANGE, *PCOMMAND_EXCHANGE;

At Application Layer:
=====================

    // assign the input buffer to structIn
    structIn.IOCmd.pApiCmd = iBuffer; // iBuffer is the void* buffer

    // Struct Input and Structure output
    kernReturn = IOConnectMethodStructureIStructureO(*dataPort, myDriverAPICommand,
                                                     inBuffSize, &outBuffSize,
                                                     &structIn, &structOut);

    // copy the output data in oBuffer
    memcpy(oBuffer, &(structOut.IOCmd.ApiRsp), oBufferSize);

At KEXT Layer:
=============

In IOUserClient class:

    IOReturn myDriverUserClient::CommonAPICmd(COMMAND_EXCHANGE* inpkt,
                                              COMMAND_EXCHANGE* outpkt)
    {
        PCMD_HEAD pCmd = NULL;
        PRSP_HEAD pRsp = NULL;

        // command packet is allocated dynamically
        pCmd = (PCMD_HEAD)IOMalloc(inputSize);
        // response packet lives inside the (statically allocated) exchange structure
        pRsp = (PRSP_HEAD)&(inpkt->IOCmd.ApiRsp);

        if ((pCmd == NULL) || (pRsp == NULL)) {
            DBLog(("\nMemory could not be allocated\n"));
            return kIOReturnNoResources;
        }

        // copy the command packet from user space into the kernel buffer
        status = copyin((user_addr_t)inpkt->IOCmd.pApiCmd, pCmd, inputSize);
        ..
        ..
    }

Bulk communication Module:
=========================

In IOService:

And then pass the command info to the BULK Communication module:

    IOReturn bulkDeviceCommuncation()
    {
        IOReturn status;
        UInt32 size = sizeof(RESPONSE);   // number of bytes expected back
        IOMemoryDescriptor *memDescOut = NULL;
        IOMemoryDescriptor *memDescIn = NULL;

        // Validate and allocate
        if (myDataStructure->fOutPipe == NULL) {
            status = kIOReturnNoDevice;
            goto out;
        }
        if ((memDescOut = IOMemoryDescriptor::withAddress(&Command, sizeof(COMMAND),
                                                          kIODirectionOut)) == NULL) {
            status = kIOReturnNoMemory;
            goto out;
        }
        if ((memDescIn = IOMemoryDescriptor::withAddress(&Response, sizeof(RESPONSE),
                                                         kIODirectionIn)) == NULL) {
            status = kIOReturnNoMemory;
            goto out;
        }

        // do outbound command
        if ((status = memDescOut->prepare()) != kIOReturnSuccess)
            goto out;
        status = myDataStructure->fOutPipe->Write(memDescOut, 1000, 2000,
                                                  sizeof(COMMAND), NULL);
        memDescOut->complete();
        if (status != kIOReturnSuccess)
            goto out;

        // do inbound response
        if ((status = memDescIn->prepare()) != kIOReturnSuccess)
            goto out;
        status = myDataStructure->fInPipe->Read(memDescIn, 1000, 2000,
                                                size, NULL, &size);
        memDescIn->complete();

    out:
        if (memDescOut)
            memDescOut->release();
        if (memDescIn)
            memDescIn->release();
        return(status);
    }

Please send your comments on this.

Thanks and Regards,
Rohit Dhamija

Michael Smith writes:
On Jun 5, 2006, at 7:52 AM, Andrew Gallatin wrote:
I'm simply trying to account for why it takes ~8x as long for a simple syscall,
Which "simple" syscall?
I think lmbench uses getppid for its null syscall. Sure, you could hack getppid to run like a bat out of hell, but, as I mentioned in a different branch of this thread, what I really care about is getting into (and out of) the ioctl handler for my driver quickly. I have a hacked version of lmbench which measures this, but that's pretty hard to duplicate. See my original email from 2004 about this: http://lists.apple.com/archives/Darwin-kernel/2004/Jun/msg00099.html
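For readers who want to reproduce the kind of null-syscall number being discussed, here is a minimal user-space sketch in C. It is not lmbench's code and not the hacked ioctl benchmark mentioned above; it simply times a loop of getppid() calls. The names now_ns and null_syscall_ns are made up for this illustration, and the sketch assumes a system that provides clock_gettime with CLOCK_MONOTONIC (which the Mac OS X releases of that era did not).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Current monotonic time in nanoseconds (helper name is hypothetical). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Average wall-clock cost, in nanoseconds, of one getppid() call,
 * measured over `iterations` back-to-back calls.
 * Returns 0.0 if iterations is 0.
 */
static double null_syscall_ns(unsigned long iterations)
{
    volatile pid_t sink = 0; /* volatile keeps the loop from being optimised away */
    uint64_t start, end;
    unsigned long i;

    if (iterations == 0)
        return 0.0;

    start = now_ns();
    for (i = 0; i < iterations; i++)
        sink = getppid();
    end = now_ns();

    (void)sink;
    return (double)(end - start) / (double)iterations;
}
```

Calling null_syscall_ns(1000000) from a small main() and printing the result gives a rough per-call figure. It includes loop overhead and says nothing about ioctl dispatch into a driver, so treat it only as a baseline for comparing two systems on the same hardware.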
and ~4x as long to get an ioctl into our driver when running MacOSX vs ppc64 linux on the same hardware. On x86 linux, when switching from the 1G/3G split to 4G/4G, we see a similar bloat of ioctl times. I had simply assumed that the 4G address space was the issue. Perhaps more of the blame lies elsewhere.
Syscall entry/exit is certainly more expensive on Mac OS than Linux. It comprises such a minuscule part of the elapsed system time on typical systems that, whilst it shows up a lot on micro-benchmarks, it's not a very rewarding target for optimisation.

Our software needs a way to (quickly) request our driver pin and send (or receive) data from some large buffer. The longer this takes, the less efficient HPC computations are.

Drew
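A note on the shared structure posted at the top of this message: IOCTL_COMMAND is a union, so the command pointer (pApiCmd) and the response head (ApiRsp) occupy the same bytes. The following user-space C sketch illustrates why a handler that builds its response in the same union should read the command pointer first. It is only an illustration under assumptions: handle_command is a hypothetical stand-in for the elided part of CommonAPICmd, and the fixed-width field types are chosen for the sketch (the posted code uses unsigned long).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified copies of the posted types; fixed-width fields are an assumption. */
typedef struct {
    uint32_t CmdSize;   /* size of command parameters */
    uint8_t  InstCode;  /* opcode */
} CMD_HEAD;

typedef struct {
    uint32_t RspSize;   /* size of data returned */
    uint8_t  ErrorInfo; /* error information */
} RSP_HEAD;

typedef union {
    CMD_HEAD *pApiCmd;  /* pointer to command head */
    RSP_HEAD  ApiRsp;   /* response head buffer (shares storage with pApiCmd) */
} IOCTL_COMMAND;

/*
 * Hypothetical handler: reads the opcode through the command pointer,
 * then overwrites the union with a response head.
 * Returns 0 on success, -1 if no command pointer was supplied.
 */
static int handle_command(IOCTL_COMMAND *io, uint8_t *opcode_out)
{
    CMD_HEAD *cmd;

    assert(io != NULL && opcode_out != NULL);

    /* Save the pointer BEFORE touching ApiRsp: both live in the same bytes. */
    cmd = io->pApiCmd;
    if (cmd == NULL)
        return -1;

    *opcode_out = cmd->InstCode;

    /* From here on, io->pApiCmd no longer holds the caller's pointer. */
    memset(io, 0, sizeof *io);
    io->ApiRsp.RspSize = (uint32_t)sizeof(RSP_HEAD);
    io->ApiRsp.ErrorInfo = 0;
    return 0;
}
```

In the posted CommonAPICmd, the copyin from inpkt->IOCmd.pApiCmd happens before anything is written through pRsp, which is the safe order; the sketch only shows what would be lost if that order were ever reversed.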
--
Rohit Dhamija
(M) 9818446545
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Darwin-kernel mailing list (Darwin-kernel@lists.apple.com)
Help/Unsubscribe/Update your Subscription: http://lists.apple.com/mailman/options/darwin-kernel/rohit.dhamija%40gmail.c...
This email sent to rohit.dhamija@gmail.com

On 6/6/06, Andrew Gallatin <gallatin@cs.duke.edu> wrote: