RAID Failure
RAID controllers are available from vendors including Mylex, Adaptec, Compaq, HP and IBM. These implementations can rebuild a failed data volume onto a hot standby drive, or onto a replacement drive fitted through a hot swap. A rebuild will fail if two disks or volumes fail simultaneously, or if part of the native configuration is stored on the failed volume. RAIDs can also fail as a result of the following situations, and frequently a combination of them:
- Controller malfunction
- RAID rebuild error or volume reconstruction problem
- Missing RAID partition
- Multiple disk failure in off-line state, resulting in loss of the RAID volume
- Mistaken replacement of a good disk belonging to a working RAID volume
- Power surge
- Data deletion or reformat
- Virus attack
- Loss of RAID configuration settings or system registry
- Inadvertent reconfiguration of the RAID volume
- Loss of RAID disk access after a system or application upgrade
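To illustrate why a rebuild can survive one lost disk but not two, here is a minimal sketch. It assumes a single-parity layout (as in RAID 5), where the parity block of each stripe is the XOR of its data blocks. The function names and the tiny four-byte blocks are hypothetical, chosen only for illustration; real controllers rotate parity across the disks and work on much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the single missing block of a stripe from all surviving blocks
    (remaining data plus parity). Valid only when exactly one block is lost."""
    return xor_blocks(surviving_blocks)


# Hypothetical stripe: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The disk holding data[1] fails: its block is rebuilt from the rest.
recovered = rebuild_missing([data[0], data[2], parity])

# Two disks fail (data[1] and data[2]): the same XOR no longer yields data[1].
not_recovered = rebuild_missing([data[0], parity])
```

With one block missing, the XOR of the survivors reproduces it exactly. With two missing, the result is only the XOR of the two lost blocks, which is why a second failure during a rebuild is fatal for this kind of array.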
If you have a problem with your RAID server, some of the steps listed below may help you to minimize further loss of data, or at least increase your chance of a successful recovery with the right expert.
Case One: Single disk failure where only a number of critical files are required.
This is the most likely situation, and one in which the user may be able to recover the stored data fully without sending the RAID system away for recovery work.
Note: The RAID is still accessible, but it no longer has fault-tolerant redundancy, so the failure of another disk will result in complete RAID server failure.
What to do:
Copy out the critical data as soon as possible, before any rebuild is attempted. At this point the remaining disks in the RAID volume may themselves be close to failure. A rebuild is generally I/O intensive and can stress those disks into complete failure, so you stand a better chance of saving the required data by copying it out first. Once the critical data has been copied out, the standard rebuild process can be carried out.
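The copy-out step can be sketched as follows. This is a minimal illustration, not a recovery tool: the function names are hypothetical, and it assumes the critical files are still readable through the normal file system. Each source file is read only once, to keep I/O on the degraded array low, and the copy is then checked by re-reading the destination rather than the source.

```python
import hashlib
from pathlib import Path


def copy_with_checksum(source, target, chunk_size=1024 * 1024):
    """Copy source to target in one pass; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(target, "wb") as dst:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            digest.update(chunk)
            dst.write(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_critical_files(sources, destination_dir):
    """Copy each file into destination_dir and return (copied, failed).

    An error on one file is recorded and the loop carries on, so a single
    unreadable file does not stop the rest from being saved. Simplification:
    files are copied flat, so two sources with the same name would collide.
    """
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    copied, failed = [], []
    for source in map(Path, sources):
        target = destination / source.name
        try:
            expected = copy_with_checksum(source, target)
            # Verify by re-reading the copy, not the degraded array.
            if sha256_of(target) != expected:
                raise OSError("checksum mismatch after copy")
            copied.append(source)
        except OSError as error:
            failed.append((source, str(error)))
    return copied, failed
```

The destination should be a separate, healthy disk or network share, never the degraded RAID volume itself. Listing the most important files first means they are saved even if the array fails partway through.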
Case Two: Disk failure, RAID severely degraded, and it is important to save everything, including the boot-up operating system and applications.
Under these circumstances, reconfiguring the applications may not be possible or may be too time-consuming.
What to do:
The only feasible option is to rebuild the failed RAID volume. However, it is advisable to back up a disk image of every working disk before the rebuild is performed. If the rebuild fails or more disks fail, you still have the backup disk images to fall back on.
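Imaging a working disk before the rebuild can be sketched as below. This is a minimal illustration under stated assumptions: `image_disk` is a hypothetical helper, the source is a disk device or file that the operating system lets you open for reading (usually this needs administrator rights), and the image is written to a separate healthy disk with enough free space. A purpose-built imaging tool is usually the better choice in practice; the sketch only shows the idea of a chunked copy that records unreadable areas instead of aborting.

```python
import hashlib
import os


def image_disk(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source_path to image_path chunk by chunk.

    Returns (sha256_hex, bad_offsets). A chunk that raises a read error is
    written to the image as zeros and its starting offset is recorded.
    """
    digest = hashlib.sha256()
    bad_offsets = []
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        # Seeking to the end gives the size; on Linux this also works for
        # block devices, where os.path.getsize would report 0.
        total = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < total:
            length = min(chunk_size, total - offset)
            try:
                src.seek(offset)
                chunk = src.read(length)
            except OSError:
                chunk = b""
                bad_offsets.append(offset)
            # Pad failed or short reads so offsets in the image match the disk.
            chunk = chunk.ljust(length, b"\0")
            digest.update(chunk)
            dst.write(chunk)
            offset += length
    return digest.hexdigest(), bad_offsets
```

Keeping the returned checksum and the list of bad offsets alongside each image makes it possible to confirm later that the image is unchanged and to tell a recovery specialist which areas could not be read.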
Case Three: Loss of the RAID disk volume due to multiple disk failures, system crashes, power surge, loss of RAID configuration settings or other unknown reasons.
In such a case, one may need to seek data recovery assistance.
What to do:
Before proceeding, you may want to consider the following:
- Place a value on your data and consider fully the consequences of losing your business or critical data.
- What will be the true cost of recreating the inaccessible data, and how long will the data entry take?
- Who will be affected: yourself, your accountant, your customers, your family?
- Select an established Data Recovery service provider with clean room facilities, experienced technical support staff and a well organised customer services operation.
- Before you part with your system hardware, you may want to image all the working disks. It is better to play safe, so that you always have a backup set to work from in case the original RAID server suffers further corruption of any kind. A good Data Recovery service provider will offer to undertake this for you.
- Carefully note the following information where applicable:
- Stripe block size (normally a multiple of 8K) and the order in which the disks make up the RAID volume. This information can normally be found in the RAID BIOS or the RAID configuration manager.
- Description of problems
- Description of any recovery attempts the user has already made
- List of critical data and folders and any special requirements
- Label each disk before taking them out and carefully note the corresponding position.
- Carefully pack your disks or complete system for delivery to your chosen data recovery expert. Both working and damaged disks are needed.
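The stripe block size and disk order in the checklist above matter because together they determine where each piece of data lives. A minimal sketch, assuming simple striping without parity (RAID 0 style); `locate` and the disk labels are hypothetical, and real controllers may add parity rotation or metadata offsets on top of this:

```python
def locate(logical_offset, stripe_size, disk_order):
    """Map a byte offset in the RAID volume to (disk label, byte offset on that disk)."""
    # Which stripe block the offset falls in, and how far into that block.
    stripe_index, within = divmod(logical_offset, stripe_size)
    # Blocks are dealt out to the disks in order, one row at a time.
    row, column = divmod(stripe_index, len(disk_order))
    return disk_order[column], row * stripe_size + within


stripe = 64 * 1024  # 64K, a multiple of 8K

# The same volume offset lands on a different physical disk when the
# recorded disk order is wrong.
as_labelled = locate(200_000, stripe, ["disk0", "disk1", "disk2"])
mislabelled = locate(200_000, stripe, ["disk2", "disk0", "disk1"])
```

A wrong stripe size or a swapped pair of disks sends every lookup to the wrong place, which is why labelling each disk and noting its position before removal saves time during recovery.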
If the RAID volume is no longer accessible, do not attempt a rebuild: it serves no purpose and will only complicate the situation. A rebuild only makes sense when the RAID volume is degraded but still accessible.
Here at Datlabs, our RAID Guys specialize in the recovery of data stored on failed or inaccessible RAID systems. This recovery work involves rebuilding and restoring individual failed hard disks to a working condition, reconfiguring the data structures, and extracting and repairing the data from the operational system.
