

Possible (difficult to reproduce) Intel memcpy bug



This might not be the right list, but I figure that memcpy technically counts as a performance topic because it uses SSE (and hell, which Darwin list do I pick? x86? kernel? user?). Ordinarily I would submit a bug report, but I haven't been able to reproduce the problem in external code.

Summary: I have some code that does a ton of FFT / DSP processing which started to produce wildly inaccurate results when ported to the Core Duo (the exact same code runs perfectly on PowerPC G4 / G5). I eventually narrowed the problem down to what appears to be memcpy. That said, if I had to bet on whether I'm wrong or Apple's memcpy is bogus, I know where I'd put the money.

Essentially I have now limited the code to something which looks like this (which is still contained within the main part of the application):
unsigned loopCount = 0;
while (1) {
    ++loopCount;

    for (int spectrum = 0; spectrum < numSpectra; spectrum += spectraPerGroup) {
        int offset = spectrum * spectraLen;
        const float *r1 = data.realp + offset;
        const float *i1 = data.imagp + offset;
        float *r2 = chirped.realp + offset;
        float *i2 = chirped.imagp + offset;

        // Copy one group of real and imaginary samples.
        memcpy(r2, r1, groupSize * sizeof(float));
        memcpy(i2, i1, groupSize * sizeof(float));

        // Source and destination should now be identical.
        if (memcmp(r1, r2, groupSize * sizeof(float)) || memcmp(i1, i2, groupSize * sizeof(float))) {
            // Count how many floats differ in each half.
            unsigned numDifferences1 = 0;
            unsigned numDifferences2 = 0;
            for (int i = 0; i < groupSize; ++i) {
                if (r1[i] != r2[i])
                    ++numDifferences1;
                if (i1[i] != i2[i])
                    ++numDifferences2;
            }

            std::cout << "-- finished after loop " << loopCount << " --\n";
            std::cout << "no. differences 1 = " << numDifferences1 << '\n';
            std::cout << "no. differences 2 = " << numDifferences2 << '\n';
            std::cout << " no. bytes copied = " << groupSize * sizeof(float) << '\n';
            std::cout << " offset = " << offset << " (" << spectrum << ")\n";
            std::cout << " groupSize = " << groupSize << '\n';
            std::cout << " spectraPerGroup = " << spectraPerGroup << '\n';
            std::cout << " numSpectra = " << numSpectra << '\n';
            std::cout << " spectraLen = " << spectraLen << '\n';

            std::cout << "r1 = " << r1 << '\n';
            std::cout << "i1 = " << i1 << '\n';
            std::cout << "r2 = " << r2 << '\n';
            std::cout << "i2 = " << i2 << std::endl;

            // Leaves the for loop only; the outer while (1) keeps running.
            break;
        }
    }
}

When run on an Intel Core Duo (MacBook Pro), it produces the following output (and output continues to be produced ad infinitum):
-- finished after loop 181 --
no. differences 1 = 0
no. differences 2 = 16
no. bytes copied = 524288
offset = 917504 (114688)
groupSize = 131072
spectraPerGroup = 16384
numSpectra = 131072
spectraLen = 8
r1 = 0x3388040
i1 = 0x3788040
r2 = 0x3b89040
i2 = 0x3f89040
-- finished after loop 225 --
no. differences 1 = 8
no. differences 2 = 0
no. bytes copied = 524288
offset = 524288 (65536)
groupSize = 131072
spectraPerGroup = 16384
numSpectra = 131072
spectraLen = 8
r1 = 0x3208040
i1 = 0x3608040
r2 = 0x3a09040
i2 = 0x3e09040


No output is produced when run on a PowerPC G4 (1 GHz PowerBook), even after running for a length of time approaching infinity.

Also, note that the data in the DSPSplitComplex structure (from the Accelerate framework) is allocated something like:
data.realp = [ 64 byte aligned (after standard malloc) of length 2097152 (float) ]
data.imagp = data.realp + 1048576
chirped.realp = [ 64 byte aligned (after standard malloc) of length 2097152 (float) ]
chirped.imagp = chirped.realp + 1048576
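
For anyone trying to reproduce this outside the application, the layout above can be sketched roughly as follows. This is only an approximation of the real allocation: allocAligned64, SplitComplex and makeSplit are hypothetical names (SplitComplex stands in for DSPSplitComplex so the sketch builds without the Accelerate framework), and the rounding-up approach is an assumption about what "64 byte aligned (after standard malloc)" means in practice.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: over-allocate with malloc, then round the pointer up
// to the next 64-byte boundary. The unadjusted pointer is handed back through
// rawOut so the caller can pass it to free() later.
float *allocAligned64(std::size_t numFloats, void **rawOut) {
    const std::size_t alignment = 64;
    void *raw = std::malloc(numFloats * sizeof(float) + alignment - 1);
    *rawOut = raw;
    if (raw == nullptr)
        return nullptr;
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw);
    addr = (addr + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);
    return reinterpret_cast<float *>(addr);
}

// Stand-in for DSPSplitComplex (assumption: same two-pointer layout).
struct SplitComplex {
    float *realp;
    float *imagp;
};

// One aligned block of 2097152 floats; the imaginary half starts
// 1048576 floats after the real half, as described above.
SplitComplex makeSplit(void **rawOut) {
    const std::size_t totalFloats = 2097152;
    const std::size_t halfFloats = 1048576;
    SplitComplex s;
    s.realp = allocAligned64(totalFloats, rawOut);
    s.imagp = s.realp ? s.realp + halfFloats : nullptr;
    return s;
}
```

With this layout realp and imagp are exactly 0x400000 bytes apart, which matches the r1/i1 and r2/i2 addresses in the output above.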


Even though I don't have separate code which reproduces the problem, I do have a copy of the code which should be click 'n' build in Xcode (and runnable from the command line). It is open source, so if someone else wants to have a look, that's fine.



The only explanation I can think of is that something goes wrong during the memcpy itself (possibly an interaction between the two cores and the use of the MOVNTDQ instruction...?). The fact that the differences come in multiples of 4 floats (16 bytes, the width of a single MOVNTDQ store) seems to point this way.
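
One way to check the 16-byte hypothesis more directly would be to record where the mismatches fall rather than just counting them. The following is a minimal sketch, not part of the application: DiffRun and findDiffRuns are hypothetical names, and the 16-byte block size is simply the assumed store width. It compares the two buffers in fixed-size blocks with a plain byte loop (so it does not depend on memcmp) and merges adjacent differing blocks into runs.

```cpp
#include <cstddef>
#include <vector>

// A contiguous stretch of differing blocks: byte offset from the start of
// the buffers, and length in bytes.
struct DiffRun {
    std::size_t offset;
    std::size_t length;
};

// Hypothetical helper: walk both buffers in blockSize-byte blocks (blockSize
// must be non-zero) and return the runs of consecutive blocks that differ.
std::vector<DiffRun> findDiffRuns(const void *a, const void *b,
                                  std::size_t numBytes,
                                  std::size_t blockSize = 16) {
    const unsigned char *pa = static_cast<const unsigned char *>(a);
    const unsigned char *pb = static_cast<const unsigned char *>(b);
    std::vector<DiffRun> runs;

    for (std::size_t start = 0; start < numBytes; start += blockSize) {
        std::size_t end = start + blockSize;
        if (end > numBytes)
            end = numBytes;

        // Plain byte comparison of this block.
        bool differs = false;
        for (std::size_t i = start; i < end; ++i) {
            if (pa[i] != pb[i]) {
                differs = true;
                break;
            }
        }
        if (!differs)
            continue;

        // Extend the previous run if this block is adjacent to it.
        if (!runs.empty() && runs.back().offset + runs.back().length == start) {
            runs.back().length += end - start;
        } else {
            DiffRun r = {start, end - start};
            runs.push_back(r);
        }
    }
    return runs;
}
```

Called as findDiffRuns(i1, i2, groupSize * sizeof(float)) in the failing branch, it would show whether the bad floats sit in whole 16-byte blocks and where those blocks fall relative to 64-byte boundaries.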

Any additional debugging advice would be extremely helpful.



r i c k
PerfOptimization-dev mailing list      (email@hidden)

Copyright © 2007 Apple Inc. All rights reserved.