Disk recovery
Depending on the size of your MooseFS instance, you will use anywhere from a few to several hundred hard drives in your Chunkservers. The more drives you use, the more often one of them will fail, because all disk hardware deteriorates over time. You may also encounter other issues, such as a power failure in a machine that affects your drives, or an OS issue that results in logical (filesystem) damage.
Monitoring as prevention of upcoming issues
We strongly recommend you actively monitor the health of all hard drives that store chunks in Chunkservers. The following tools might be of use:
- MooseFS GUI or CLI interfaces will show "last error" information (in the Disks tab or via the -SHD switch, respectively), that is, the date and time of the last I/O error on each disk (if there was no error, the message "no errors" is shown instead); any drive that has reported an error should be checked
- S.M.A.R.T. information should be checked for all disks regularly
- system logs should be checked regularly for any kernel messages concerning hard drive errors
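Checking system logs can be partly automated. The following is a minimal sketch, not a complete monitoring solution: the function name is hypothetical and the two patterns are assumed examples of what kernel disk errors may look like, so adapt them to what your kernel actually logs.

```python
import re

# Assumed, non-exhaustive patterns; adapt them to what your kernel logs.
DISK_ERROR_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"medium error", re.IGNORECASE),
]


def find_disk_errors(log_lines):
    """Return the lines that match any of the assumed error patterns."""
    return [
        line
        for line in log_lines
        if any(pattern.search(line) for pattern in DISK_ERROR_PATTERNS)
    ]


# Made-up sample lines, for illustration only.
sample = [
    "kernel: Buffer I/O error on dev sdb1",
    "kernel: usb 1-1: new high-speed USB device",
]
print(find_disk_errors(sample))
```

The input lines could come from kernel message output or from a log file; any drive named in a matching line is a candidate for a closer S.M.A.R.T. check.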
If a disk throws too many errors in a short period of time, it will be marked as damaged and all the chunk copies stored on this disk will immediately become unavailable to the system, which will create an undergoal situation. The number of tolerated errors and the time period are defined by two variables in mfschunkserver.cfg: up to HDD_ERROR_TOLERANCE_COUNT errors within HDD_ERROR_TOLERANCE_PERIOD seconds are tolerated; any more than that and the disk is marked as damaged. The default values are 2 tolerated errors per 600 seconds.
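The tolerance rule can be illustrated with a small model. This sketch is not the chunkserver's actual implementation: the class and method names are hypothetical, and it assumes that errors are counted in a sliding window of HDD_ERROR_TOLERANCE_PERIOD seconds and that the disk is marked as damaged once the count exceeds HDD_ERROR_TOLERANCE_COUNT.

```python
from collections import deque


class DiskErrorTracker:
    """Toy model of the error-tolerance rule (not MooseFS code)."""

    def __init__(self, tolerance_count=2, tolerance_period=600):
        # Defaults mirror the documented defaults of
        # HDD_ERROR_TOLERANCE_COUNT and HDD_ERROR_TOLERANCE_PERIOD.
        self.tolerance_count = tolerance_count
        self.tolerance_period = tolerance_period
        self.errors = deque()  # timestamps (in seconds) of recent I/O errors
        self.damaged = False

    def record_error(self, now):
        """Register an I/O error at time `now`; return True once damaged."""
        self.errors.append(now)
        # Assumption: errors older than the period no longer count.
        while self.errors and now - self.errors[0] >= self.tolerance_period:
            self.errors.popleft()
        if len(self.errors) > self.tolerance_count:
            self.damaged = True
        return self.damaged


tracker = DiskErrorTracker()
print(tracker.record_error(0))    # False: 1 error in the window
print(tracker.record_error(100))  # False: 2 errors, still tolerated
print(tracker.record_error(200))  # True: 3 errors within 600 seconds
```

Under the same assumptions, errors at 0, 100 and 700 seconds would not mark the disk as damaged, because the first two have left the window by the time the third arrives.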
Replacement of a working, but faulty drive
If at some point you decide that a drive might be faulty and you want to replace it, you should follow these steps:
- mark the drive as MFR (Marked For Removal) by adding an asterisk (*) in front of its path in mfshdd.cfg, then reload the chunkserver process
- wait until your MooseFS instance replicates all the chunks that are stored on this drive to other places in the system; MooseFS tries to use this particular drive as little as possible, i.e. if other copies of the affected chunks exist on other chunkservers, they will be used for making new replicas; the drive marked as MFR is only ever used as a last resort
- to monitor whether the replication process has finished, use the Disks tab in the GUI or the -SHD switch in the CLI
- after the drive shows that it is ready to be removed, edit the mfshdd.cfg file again and remove or comment out (with a # sign) the line with the drive, then reload the chunkserver process
- check the GUI or CLI to confirm that the drive is no longer listed for this chunkserver
- remove the drive physically from the machine, insert a new drive
- make sure that the new drive is properly formatted and mounted
- add the mount path of the new drive to mfshdd.cfg, then reload the chunkserver process
- check the GUI or CLI to confirm that the new drive is listed for this chunkserver
The new drive will be filled with chunks both through internal chunkserver rebalancing (if your chunkserver has more than one drive) and through cluster-wide rebalancing. These processes are fully automatic and require no user input. If desired, fast rebalance can be temporarily enabled on the chunkserver.
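The first step of this procedure, marking a drive as MFR, is a one-character change in mfshdd.cfg. The sketch below shows that edit on the file's text. The function name and the mount paths are hypothetical, and it assumes that each drive line contains only the mount path; writing the file back and reloading the chunkserver process are left out.

```python
def mark_for_removal(cfg_text, drive_path):
    """Return cfg_text with '*' placed in front of drive_path's line."""
    result = []
    for line in cfg_text.splitlines():
        # Assumption: a drive line holds only the mount path.
        if line.strip() == drive_path:
            line = "*" + drive_path
        result.append(line)
    return "\n".join(result) + "\n"


# Hypothetical mount paths, for illustration only.
cfg = "/mnt/hdd1\n/mnt/hdd2\n"
print(mark_for_removal(cfg, "/mnt/hdd2"), end="")
# prints:
# /mnt/hdd1
# */mnt/hdd2
```

Working on the text rather than on the file makes it easy to review the result before it replaces the live configuration.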
Replacement of a damaged drive
If a drive has failed suddenly and was marked as damaged by MooseFS, you simply need to replace it: the system has already started the necessary replications to bring all the chunks back to the desired redundancy level. In this case:
- edit the mfshdd.cfg file and remove or comment out (with a # sign) the line with the failed drive, then reload the chunkserver process
- check the GUI or CLI to confirm that the drive is no longer listed for this chunkserver
- remove the drive physically from the machine, insert a new drive
- make sure that the new drive is properly formatted and mounted
- add the mount path of the new drive to mfshdd.cfg, then reload the chunkserver process
- check the GUI or CLI to confirm that the new drive is listed for this chunkserver
The new drive will be filled with chunks both through internal chunkserver rebalancing (if your chunkserver has more than one drive) and through cluster-wide rebalancing. These processes are fully automatic and require no user input. If desired, fast rebalance can be temporarily enabled on the chunkserver.
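The two configuration changes in this procedure (commenting out the failed drive, then adding the new one) can be sketched as text edits on mfshdd.cfg. The function names and mount paths are hypothetical, and the sketch assumes that each drive line contains only the mount path, optionally preceded by an asterisk. Writing the file back and reloading the chunkserver process after each change are left out.

```python
def comment_out_drive(cfg_text, drive_path):
    """Return cfg_text with the line for drive_path commented out with '#'."""
    result = []
    for line in cfg_text.splitlines():
        # Also match a drive that still carries the '*' (MFR) prefix.
        if line.strip().lstrip("*") == drive_path:
            line = "# " + line.strip()
        result.append(line)
    return "\n".join(result) + "\n"


def add_drive(cfg_text, drive_path):
    """Return cfg_text with drive_path appended as a new line."""
    if cfg_text and not cfg_text.endswith("\n"):
        cfg_text += "\n"
    return cfg_text + drive_path + "\n"


# Hypothetical mount paths, for illustration only.
cfg = "/mnt/hdd1\n/mnt/hdd2\n"
cfg = comment_out_drive(cfg, "/mnt/hdd2")  # step 1, then reload and check
cfg = add_drive(cfg, "/mnt/hdd3")          # after the new drive is mounted
print(cfg, end="")
# prints:
# /mnt/hdd1
# # /mnt/hdd2
# /mnt/hdd3
```

The two edits are kept as separate functions because the procedure reloads the chunkserver process and checks the drive list between them.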
Repairing a drive's content after an incident
Sometimes a drive's content may be affected by a hardware incident, such as a power failure. The drive itself is not faulty, but the recorded data might be. In that case, we recommend a procedure to check (and repair, if needed and possible) the content of the drive before reconnecting it to the MooseFS cluster. The procedure must be performed on a drive that is NOT actively connected to your MooseFS instance, so either the relevant chunkserver process must not be running, or the drive must be temporarily removed from its chunkserver configuration (don't forget to reload the process after each configuration change). Perform these steps:
- run a filesystem integrity checker on your drive (e.g. fsck for ext or xfs_repair for xfs) and deal with any errors
- run mfschunktool on your drive; the safest option is to move all problematic chunk copies to an external directory, restore the disk to the cluster with the remaining chunks, and let the system re-create the problematic chunks from other, unaffected copies; if you experience problems with several disks on several chunkservers at the same time and run the risk of all copies of a chunk being affected, try to repair the chunks instead
- after the disk is repaired, connect it back to the cluster
The command to move all chunks with problems to a different location:
mfschunktool -m /backup/location/for/affected/chunks /path/to/your/disk/with/chunks
The command to attempt to repair all chunks (recalculate the CRC and fix header information; incorrect headers are mostly caused by sudden power loss):
mfschunktool -r /path/to/your/disk/with/chunks
If you use the first option (moving problematic chunk copies) and later realise that you need to repair these chunk copies because you have no other valid copies in the system, you can always run mfschunktool with the -r option on them, repair them and manually move them back to your drive with chunks. If there are only a few, just place them in any directory on the drive, e.g. 00; if there are a lot of them, try to distribute them evenly between all directories from 00 to FF. Remember that this has to be performed on a drive that is NOT actively connected to the MooseFS instance and that you MUST remove the .chunkdb file from the drive before reconnecting it to the cluster.
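Moving many repaired chunk copies back by hand is tedious, so the distribution step can be scripted. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the repaired chunk copies are assumed to sit as regular files directly inside one directory, and "evenly" is implemented as a simple round-robin over the directories 00 to FF. It also removes the .chunkdb file, as required before reconnecting the drive.

```python
import shutil
from pathlib import Path


def restore_chunks(repaired_dir, drive_root):
    """Move repaired chunk files into drive_root/00 .. drive_root/FF."""
    repaired = sorted(p for p in Path(repaired_dir).iterdir() if p.is_file())
    drive = Path(drive_root)
    for index, chunk in enumerate(repaired):
        # Round-robin over the 256 directories "00", "01", ..., "FF".
        subdir = drive / format(index % 256, "02X")
        subdir.mkdir(exist_ok=True)
        shutil.move(str(chunk), str(subdir / chunk.name))
    # .chunkdb must be removed before the drive rejoins the cluster.
    chunkdb = drive / ".chunkdb"
    if chunkdb.exists():
        chunkdb.unlink()
    return len(repaired)
```

A call might look like `restore_chunks("/backup/location/for/affected/chunks", "/path/to/your/disk/with/chunks")`, run only while the drive is NOT connected to the MooseFS instance.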