On Wednesday, December 11, 2002, at 01:38 PM, Michael George wrote:

> According to specs, the PCI bus in the G4s can do 64-bit transfers, at least for DMA transfers. GCC and the include files on OS X have the "long long" data type, which is 64 bits. However, one of our hardware designers said that when he tested 64-bit IO from the processor to the device on the PCI bus, the high-order bits were all cleared and data was only in the bottom long. I am wondering if it is indeed the case that CPU <--> PCI bus transfers are limited to 4 bytes, or if we're overlooking something.

G4 general purpose registers are all 32-bit. When GCC emits code for a 32-bit target, 64-bit 'long long' variables and operations on them are decomposed into multiple 32-bit ops. So writing a 'long long' to memory, PCI, or anywhere else will probably compile to two 32-bit store instructions, not one 64-bit store (the only true 64-bit store operation available on 32-bit PPC is for floating-point doubles).

GCC could be clever and use the FPRs when copying 'long long' variables from one memory location to another without doing any computation on them in between. I don't know whether it does this. I'd look at the disassembly to find out for sure what it is doing.

If it is doing 32-bit writes, that's not the end of the story. Some recent processors and/or PCI bridges may be capable of coalescing multiple 32-bit writes into 64-bit operations on the PCI bus. But if that didn't happen, it's very likely that the two 32-bit writes made it onto the PCI bus as individual transactions, and therefore one would expect the high-order bits to be unused.

To get maximum processor -> PCI write performance you really need to get the CPU to do burst writes (which of course are inherently 64-bit) instead of single-beat writes. PCI multiplexes address and data on the same pins, so the fewer addresses you send over PCI, the more bus cycles there are which get used for useful data transfer.
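To put rough numbers on the address-overhead point, here is a small C sketch. It assumes a deliberately idealized model in which every PCI transaction costs one address phase plus one clock per data phase; turnaround cycles, wait states, and arbitration are ignored, so a real bus will do worse. The function names are made up for illustration, and the 32-byte burst size assumes a G4 cache-line-sized burst.

```c
#include <assert.h>

/* Idealized cost of one PCI transaction, in bus clocks:
 * one address phase plus one clock per data phase.
 * ASSUMPTION: no turnaround cycles, wait states, or arbitration. */
unsigned pci_clocks(unsigned data_phases)
{
    assert(data_phases > 0);
    return 1u + data_phases;
}

/* Percentage (rounded down) of bus clocks that carry data when each
 * transaction moves bytes_per_txn bytes over a bus that is bus_bytes
 * wide (4 for 32-bit PCI, 8 for 64-bit PCI). */
unsigned data_clock_percent(unsigned bytes_per_txn, unsigned bus_bytes)
{
    assert(bus_bytes > 0 && bytes_per_txn >= bus_bytes);
    unsigned phases = bytes_per_txn / bus_bytes;
    return 100u * phases / pci_clocks(phases);
}
```

Under this model, data_clock_percent(4, 4) gives 50 for single-beat 32-bit writes, while data_clock_percent(32, 4) gives 88 for a 32-byte burst on a 32-bit bus (8 data clocks out of 9). The exact figures matter less than the trend: the longer the burst, the smaller the share of clocks spent on addresses.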
A burst write is usually the maximum amount of contiguous data the processor can send in one operation, and thus is useful for getting the PCI bridge to do longer PCI bursts. This will generally require marking the destination for the writes as cacheable, as most PPCs cannot generate burst writes for anything but cacheable memory. Marking some PCI memory as cacheable means you really have to be careful in the driver, since cache coherence is NOT maintained in hardware for PCI memory.

_______________________________________________
darwin-kernel mailing list | darwin-kernel@lists.apple.com