Re: 64-bit PCI transfers
- Subject: Re: 64-bit PCI transfers
- From: Tim Seufert <email@hidden>
- Date: Wed, 11 Dec 2002 15:46:31 -0800
On Wednesday, December 11, 2002, at 01:38 PM, Michael George wrote:
> According to specs, the PCI bus in the G4s can do 64-bit transfers, at
> least for DMA transfers.
>
> GCC and the include files on OS X have the "long long" data type,
> which is 64-bits. However, one of our hardware designers said that
> when he tested 64-bit IO from the processor to the device on the PCI
> bus, that the high-order bits were all cleared and data was only in
> the bottom long.
>
> I am wondering if it is indeed the case that CPU <--> PCI bus
> transfers are limited to 4 bytes, or if we're overlooking something.
G4 general purpose registers are all 32-bit. When GCC emits code for a
32-bit target, 64-bit 'long long' variables and operations on them are
decomposed into multiple 32-bit ops. So, writing a 'long long' to
memory/PCI/whatever will probably compile to two 32-bit store
instructions, not one 64-bit store (the only true 64-bit store
instruction available on 32-bit PPC is for floating-point doubles).
GCC could be clever and use the FPRs when copying 'long long' variables
from one memory location to another without doing any computation on
them in the middle. I don't know whether it does this. I'd look at
the disassembly to find out for sure what it is doing.
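
A minimal sketch of the two cases, for comparing disassembly. The function
names are hypothetical, and the pointers stand in for a mapped device
address; nothing here is a real driver API. On a 32-bit PPC target one would
expect the first function to compile to two word stores and the second to a
single floating-point double store, but that is an assumption to verify in
the compiler output, not a guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* What a 'long long' store amounts to on a 32-bit target: two separate
 * 32-bit stores. The high word goes first because PPC is big-endian. */
void store64_as_two_words(volatile uint32_t *reg, uint64_t value)
{
    reg[0] = (uint32_t)(value >> 32);          /* high-order word */
    reg[1] = (uint32_t)(value & 0xFFFFFFFFu);  /* low-order word  */
}

/* Route the same 64 bits through a double so the compiler can use a
 * floating-point register and one 64-bit store. memcpy copies the bit
 * pattern without any integer-to-float conversion. */
void store64_via_double(volatile double *reg, uint64_t value)
{
    double bits;

    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&bits, &value, sizeof bits);
    *reg = bits;                               /* single 64-bit store, if the
                                                  compiler cooperates */
}
```

Both functions leave the same bytes at the destination; the difference is
only in how many store instructions, and therefore how many bus
transactions, are needed to get them there.
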
If it is doing 32-bit writes, that's not the end of the story. Some
recent processors and/or PCI bridges may be capable of coalescing
multiple 32-bit writes into 64-bit operations on the PCI bus. But if
that didn't happen, it's very likely that the two 32-bit writes made it
onto the PCI bus as individual transactions, and therefore one would
expect the high order bits to be unused.
To get maximum processor -> PCI write performance you really need to
get the CPU to do burst writes (which of course are inherently 64-bit)
instead of single-beat writes. PCI multiplexes address and data on the
same pins, so the fewer addresses you send over PCI, the more bus
cycles there are which get used for useful data transfer. A burst
write is usually the maximum amount of contiguous data the processor
can send in one operation, and thus is useful for getting the PCI
bridge to do longer PCI bursts.
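
To put rough numbers on the address overhead, here is a small sketch of an
idealized cycle count. The model is an assumption made for illustration: one
address cycle per transaction, one data cycle per bus-width chunk, and no
turnaround cycles, wait states, or arbitration. The function name is
hypothetical, and the 32-byte burst size used in the example is an assumed
cache line size.

```c
#include <assert.h>
#include <stddef.h>

/* Idealized PCI cycle count for moving total_bytes, split into transactions
 * of txn_bytes each, over a data path bus_bytes wide. Each transaction
 * costs one address cycle plus one data cycle per bus-width chunk. */
size_t pci_cycles(size_t total_bytes, size_t txn_bytes, size_t bus_bytes)
{
    size_t full, rest, cycles;

    assert(txn_bytes > 0 && bus_bytes > 0);

    full = total_bytes / txn_bytes;   /* complete transactions       */
    rest = total_bytes % txn_bytes;   /* bytes in a final short one  */

    cycles = full * (1 + (txn_bytes + bus_bytes - 1) / bus_bytes);
    if (rest != 0)
        cycles += 1 + (rest + bus_bytes - 1) / bus_bytes;
    return cycles;
}
```

Under this model, 32 bytes sent as eight single-beat 32-bit writes cost
pci_cycles(32, 4, 4) = 16 cycles, half of them addresses. The same 32 bytes
sent as one burst over a 64-bit data path cost pci_cycles(32, 32, 8) = 5
cycles, only one of which is an address.
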
This will generally require marking the destination for the writes as
cacheable, as most PPCs cannot generate burst writes for anything but
cacheable memory. Marking some PCI memory as cacheable means you
really have to be careful in the driver, since cache coherence is NOT
maintained in hardware for PCI memory.
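
As a sketch of what "being careful" can look like, the routine below pushes
a range of cacheable memory out of the data cache one line at a time. It
assumes a 32-byte cache line and uses the PowerPC dcbf and sync instructions
through GCC inline assembly; on any other target the flush compiles to a
no-op so the loop logic can still be exercised. The names flush_line,
flush_barrier and flush_range are hypothetical, not kernel APIs, and a real
driver should prefer whatever cache-maintenance calls the kernel provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u   /* assumed line size; check your CPU */

/* Write back and invalidate the cache line containing p. */
static void flush_line(const volatile void *p)
{
#if defined(__ppc__) || defined(__powerpc__) || defined(__PPC__)
    __asm__ volatile ("dcbf 0,%0" : : "r" (p) : "memory");
#else
    (void)p;                   /* no-op stand-in on non-PPC hosts */
#endif
}

/* Wait until the preceding flushes have completed. */
static void flush_barrier(void)
{
#if defined(__ppc__) || defined(__powerpc__) || defined(__PPC__)
    __asm__ volatile ("sync" : : : "memory");
#endif
}

/* Flush every cache line that overlaps [start, start + len).
 * Returns the number of lines visited. */
size_t flush_range(const volatile void *start, size_t len)
{
    uintptr_t addr, end;
    size_t lines = 0;

    assert((CACHE_LINE_BYTES & (CACHE_LINE_BYTES - 1)) == 0);
    if (len == 0)
        return 0;

    /* Round down to the start of the first line touched. */
    addr = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE_BYTES - 1);
    end  = (uintptr_t)start + len;

    for (; addr < end; addr += CACHE_LINE_BYTES) {
        flush_line((const volatile void *)addr);
        lines++;
    }
    flush_barrier();
    return lines;
}
```

A driver would call something like this after filling a buffer in cacheable
PCI memory and before telling the device the data is ready. Reads from the
device need the mirror-image care: stale lines have to be invalidated before
the CPU looks at data the device has changed.
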
_______________________________________________
darwin-kernel mailing list | email@hidden