On 6/4/07, Dan Shoop <email@hidden> wrote:
At 7:34 PM -1000 6/3/07, Sergio Trejo wrote:
Has anyone else experienced Apple RAID
errors, specifically one with the code "-9998"?
Most important here is *what* is generating this error code. So,
where are you seeing it?
[
This is especially relevant since this error code reads as:
controllerHasFixedHeight (-9998):
controllerHasFixedHeight = -9998
]
Normally these codes aren't meaningful, but the operation that
failed and generated them is the thing to note, along with actual
textual errors rather than codes.
Thanks for pasting the controllerHasFixedHeight error information above (I was not able to find it yesterday). Is it located in a header (.h) file somewhere within the recesses of Mac OS X Server? Even knowing where to find these error codes, I concur that working out what this error actually means from the constant's name alone is close to impossible at first glance.
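For anyone who wants to hunt for a numeric code like this in header files, here is a minimal sketch. It only shows one way to scan a directory tree of .h files for a given number. The function name find_error_code and the search root in the usage comment are assumptions for illustration; nothing in this thread confirms which header actually defines -9998.

```python
import os
import re


def find_error_code(root, code):
    """Yield (path, line_number, line) for each .h line that mentions `code`.

    `root` is whatever directory you choose to search; it is a hypothetical
    input here, not a path confirmed anywhere in this thread.
    """
    # Match the exact number: not preceded by a word character or another
    # hyphen, and not followed by a further digit (so -99980 is ignored).
    pattern = re.compile(r"(?<![\w-])" + re.escape(str(code)) + r"(?!\d)")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".h"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            yield path, number, line.strip()
            except OSError:
                # Unreadable file: skip it and keep searching.
                continue


# Example usage (the root directory is an assumption, adjust as needed):
# for path, number, line in find_error_code("/System/Library/Frameworks", -9998):
#     print(path, number, line)
```

A hit only tells you that some header assigns that number to a name. As noted above, the same number can be reused by unrelated subsystems, so the operation that failed remains the better clue.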
I have earlier today but have since been
unable to repeat these errors and that's the frustrating part (if I
could repeat the steps to create these errors, then I could document
what to avoid)!
It all started yesterday when I wanted to (before production) do a dry
run of a RAID destruction and restoration process (important to do dry
runs before putting a system into production). So, to start things
off, I took a perfectly fine RAID mirror (both members had Online
status and the mirror itself had Online status according to the
results of "diskutil checkRAID"). I then split one of the
members off of the RAID and checked the status after the split (still
no problem - had Online status for both the RAID and the remaining
included mirror), and then I used the diskutil destroyRAID verb to
destroy the RAID. Simple as pie, no problems were reported after the
destroyRAID verb was executed.
I then erased both of the disc volumes
(which were former members of the now deceased RAID volume) with the
appropriate file system type which in this case was HFS Case-sensitive
Journaled a.k.a. HFSX Journaled.
And you are doing this with what tool(s)?
Disk Utility. (I have the feeling I will be in big trouble for using the convenience of the GUI-based Disk Utility application rather than diskutil on the command line. Ducking.)
No problem erasing these volumes. I
then chose one volume (in the example below the volume associated with
the device assigned to "disk1s9") to restore a (compressed,
checksummed) disc image to. Restoration was perfect as best I could
tell (no complaints from asr), just like this:
# asr -source current-snapshot.dmg -target /dev/disk1s9 -erase -noprompt
Validating target...done
Validating source...done
Erasing target device /dev/disk1s9...done
Retrieving scan information...done
Validating sizes...done
Restoring
....10....20....30....40....50....60....70....80....90....100
Verifying
....10....20....30....40....50....60....70....80....90....100
Next, I attempted to create a new mirror from the just-restored
disc volume. I have done this before, and the enableRAID verb does
not appear to be rocket science to use or understand, so here's
what happened:
# diskutil enableRAID mirror disk1s9
Changing filesystem size on disk 'disk1s9'...
Attempting to change filesystem size from 169261555712 to 169261531136 bytes
Filesystem grow failed, 28
Disk Management could not shrink the filesystem to fit the new RAID headers
Error enabling disk to RAID Invalid request (-9998)
So this error is really not from AppleRAID but diskutil??? Or are
you also seeing it elsewhere?
Most definitely when using the command-line diskutil (not the Disk Utility app). I encountered this error once with the enableRAID verb and another time with the addToRAID verb.
The error would seem self-explanatory here. You don't have room
in the partition map for the necessary RAID headers.
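To put a number on how much room is at stake, the arithmetic can be sketched from the two sizes in the diskutil output. The 512-byte block size is an assumption, as is reading the "28" in "Filesystem grow failed, 28" as a BSD errno value; neither is confirmed in this thread.

```python
import errno

# Sizes copied from the "Attempting to change filesystem size" line.
current_size = 169261555712
target_size = 169261531136

# Space diskutil tried to reclaim from the filesystem for the RAID headers.
shrink_bytes = current_size - target_size
print(shrink_bytes)         # 24576 bytes (24 KiB)

# Assumption: 512-byte blocks.
print(shrink_bytes // 512)  # 48 blocks

# Assumption: the trailing 28 is an errno value. If it is, it maps to
# ENOSPC ("no space left on device").
print(errno.errorcode.get(28))
```

If those assumptions hold, the resize failed for want of only 24 KiB, which is consistent with the point above that there was no room for the headers.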
What would cause my partition map to change unintentionally? The sequence was: I split this partition off from the RAID mirror successfully, made a compressed, checksum-verified disc image snapshot of it (using hdiutil), destroyed the mirror with the diskutil destroyRAID verb, and then successfully performed a device-level asr restore of that image to the same dedicated, now non-RAID, target partition. Does the destroyRAID verb cause the partition map to morph? It was the subsequent use of the enableRAID verb on that target partition (after it had been successfully restored with asr) that generated this error. I have not been able to repeat the error (all re-attempts have so far been successful).
I had never experienced such errors
before from using enableRAID so I then tried to enableRAID from the
other available and opposing disc volume (that was formerly in the
deceased RAID mirror), in this case disk2s9. Similarly I restored
disk2s9 in the exact same manner as I had to disk1s9 using asr as
aforementioned (no problems, the asr restoration succeeded), and then
I used enableRAID on disk2s9 to create a new RAID mirror and had no
problems doing so:
# diskutil enableRAID mirror disk2s9
Changing filesystem size on disk 'disk2s9'...
Attempting to change filesystem size from 169261555712 to 169261531136 bytes
The filesystem may need to be modified to make this partition bootable
Found new RAID Master
Changing filesystem size on disk 'disk4'...
The disk has been converted into a RAID
Presto!
Yes, because this disk's partition map was quite different.
I'd suggest you start looking at the partition maps more closely
and all should be revealed.
Thank you, great advice. Should I expect partition maps to morph throughout the life and usage of disks that have multiple partitions, only one of which is a member of an Apple RAID mirror?
Best regards,
Sergio