Converting an Existing Instance to Use Multilocations

An existing MooseFS instance that already uses multiple physical locations – managed via chunkserver labels and storage classes – can be reconfigured to benefit from the Multilocations feature. The steps are similar to setting up a fresh instance, but require additional care to avoid unnecessary chunk replications and disruption to clients.

Two common scenarios are described below.


Scenario 1: Existing Instance with Servers in Two Cities

Current setup: 10 chunkservers, 1 master, and 1 metalogger in each of two cities: Chicago and New York. Chicago chunkservers are labeled C, New York chunkservers are labeled Y. All data is stored in 2 copies on label C and 2 copies on label Y (4 copies total).

Reconfiguration steps

  1. Define two locations: chicago and newyork, both in ON state.
  2. Assign Chicago chunkservers (by IP) to the chicago location and remove their C label.
  3. Assign New York chunkservers (by IP) to the newyork location and remove their Y label.
  4. Assign the Chicago master and metalogger to chicago (by IP), and assign the New York master and metalogger to newyork in the same way.
  5. Modify the existing storage class: instead of "2 copies on label C" and "2 copies on label Y", define "2 copies on any chunkserver in location chicago" and "2 copies on any chunkserver in location newyork".
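
The claim that this reconfiguration leaves placement untouched can be checked with a small model before changing anything. The sketch below is only an illustration: the IP addresses, the `Chunkserver` record, and the rule tuples are hypothetical and are not MooseFS data structures. It compares the set of chunkservers eligible under each old label rule with the set eligible under the corresponding new location rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunkserver:
    ip: str
    label: str      # legacy label ("C" or "Y"), kept here only for comparison
    location: str   # location the server is assigned to in steps 2 and 3


# Hypothetical addresses from documentation ranges; 10 chunkservers per city.
servers = (
    [Chunkserver(f"192.0.2.{i}", "C", "chicago") for i in range(1, 11)]
    + [Chunkserver(f"198.51.100.{i}", "Y", "newyork") for i in range(1, 11)]
)

# Old storage class: (label, copies). New storage class: (location, copies).
old_class = [("C", 2), ("Y", 2)]
new_class = [("chicago", 2), ("newyork", 2)]


def candidates_by_label(label):
    """IPs of chunkservers that an old label rule could place copies on."""
    return {s.ip for s in servers if s.label == label}


def candidates_by_location(location):
    """IPs of chunkservers that a new location rule could place copies on."""
    return {s.ip for s in servers if s.location == location}


def same_placement(old, new):
    """True if every rule keeps its copy count and its candidate server set."""
    return len(old) == len(new) and all(
        old_copies == new_copies
        and candidates_by_label(label) == candidates_by_location(location)
        for (label, old_copies), (location, new_copies) in zip(old, new)
    )


print(same_placement(old_class, new_class))
```

If a chunkserver were mapped to the wrong location, the candidate sets would differ and the check would fail, which is the situation that would trigger unwanted chunk copying.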

After this reconfiguration, data is stored exactly as before – the mechanics changed, but the physical placement did not. More importantly, the setup now unlocks:

  1. EC format per location: with 10 chunkservers per location, EC 8+1, EC 4+1, EC 4+2, or EC 4+3 can be used for cold data, yielding significant space savings.
  2. Single-command maintenance mode: disabling one location for maintenance no longer requires setting maintenance mode on every chunkserver individually.
  3. Resilience without quorum loss: the default location survives a total failure of the other location and can still elect a leader.
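
To put the space savings from item 1 into numbers, the sketch below compares the raw space used by each listed EC format with the 2 full copies per location used in this scenario. It relies on two assumptions that are not stated in the text above: an EC N+M chunk occupies (N+M)/N times its logical size, and each of the N+M parts must sit on a different chunkserver, so a location needs at least N+M chunkservers.

```python
from fractions import Fraction

SERVERS_PER_LOCATION = 10   # from the scenario description
COPIES_PER_LOCATION = 2     # current redundancy within each location


def ec_overhead(data_parts, parity_parts):
    """Raw space used per unit of logical data for EC data_parts+parity_parts."""
    return Fraction(data_parts + parity_parts, data_parts)


def min_chunkservers(data_parts, parity_parts):
    """Assumption: every part is stored on a distinct chunkserver."""
    return data_parts + parity_parts


for d, p in [(8, 1), (4, 1), (4, 2), (4, 3)]:
    overhead = ec_overhead(d, p)
    saving = 1 - overhead / COPIES_PER_LOCATION
    fits = min_chunkservers(d, p) <= SERVERS_PER_LOCATION
    print(
        f"EC {d}+{p}: {float(overhead):.3f}x raw space, "
        f"{float(saving):.1%} less than {COPIES_PER_LOCATION} copies, "
        f"fits in {SERVERS_PER_LOCATION} chunkservers: {fits}"
    )
```

Space is only one side of the choice: the number of parity parts also determines how many chunkserver failures a location can tolerate, so the formats are not interchangeable.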

Hold replications during reconfiguration

During the transition there is a period when locations are defined but storage class definitions still reference the old labels. To prevent unnecessary chunk copying during this window, temporarily block replications by placing a chunkserver in maintenance mode and switching it off. As long as no data exists in only a single copy (RL 0), which is strongly discouraged, this does not make any data inaccessible. Re-enable the chunkserver after reconfiguration is complete.


Scenario 2: Extending an Existing Instance with a Second Data Center

Current setup: 14 chunkservers, 2 masters, 1 metalogger. Several storage classes in use: temporary data (low RL, no cold storage), regular data (medium RL, cold storage after some time), important archival data (high RL, immediate cold storage). The cluster is to be extended with 10 new chunkservers in a second data center that will serve as a backup, with high redundancy for selected data.

Step order matters

The second location created with mfslocadmin create must be the backup location. When a second location is created, the previously hidden default location is revealed – it must then be renamed to main. This ensures that all existing storage class definitions are automatically associated with main, avoiding unnecessary replications.
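
The reasoning behind this ordering can be pictured with a toy model. The `LocationRegistry` class below is purely hypothetical and does not reflect how MooseFS stores locations internally. It only assumes what the paragraph above describes: existing storage classes are tied to an implicit default location, and renaming that location keeps the association.

```python
class LocationRegistry:
    """Toy model: one implicit location, hidden until a second one exists."""

    def __init__(self):
        self._names = {0: "default"}   # id 0 is the implicit default location
        self._next_id = 1

    def visible(self):
        """The default location is listed only once another location exists."""
        return sorted(self._names.values()) if len(self._names) > 1 else []

    def create(self, name):
        loc_id = self._next_id
        self._names[loc_id] = name
        self._next_id += 1
        return loc_id

    def rename(self, old, new):
        for loc_id, name in self._names.items():
            if name == old:
                self._names[loc_id] = new   # same id, new name
                return loc_id
        raise KeyError(old)

    def name_of(self, loc_id):
        return self._names[loc_id]


registry = LocationRegistry()
existing_class_location = 0        # existing classes implicitly use the default

registry.create("backup")          # second location: reveals the default one
registry.rename("default", "main")

print(registry.visible())
print(registry.name_of(existing_class_location))
```

In this model the existing classes end up tied to main without being edited. Had main been created as the new location instead, the existing classes would have stayed with the location holding the new data center, which is the case the ordering is meant to avoid.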

Reconfiguration steps

  1. Define locations: create the backup location (leave it in OFF state for now) and rename the revealed default location to main. The main location is the default.

    mfslocadmin create backup
    mfslocadmin rename default main

  2. Map existing modules to main: assign all current chunkservers, masters, and the metalogger by their IP addresses.

  3. Map future backup chunkservers to backup: add the IP addresses of the new second data center machines.

  4. Update storage classes:

    • Temporary data classes can be left unchanged – there is no value in backing up temporary data.
    • Classes for data that should be backed up need a location-aware update. Typically this means:
      • Adding -C 0 for the backup location (don't create new chunks there).
      • Adding a KEEP or ARCHIVE definition for the backup location (e.g. EC 4+2 or EC 4+3, depending on the desired redundancy and available chunkservers).
      • Deciding whether the KEEP state in backup should be non-zero (replicate while data is still hot) or zero (replicate only once data is archived).
  5. Enable the backup location: start the backup chunkservers, then:

    mfslocadmin state -s ON backup

    The chunkservers will connect and the system will begin copying existing chunks to the backup location according to the updated storage class definitions. Expect higher replication load initially, up to the configured replication limits.
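
The storage class decisions in step 4 can be laid out as data before they are applied. The sketch below is a planning aid, not MooseFS configuration syntax: the `LocationRule` and `StorageClass` types, the class names, and all redundancy values are hypothetical examples. A value of "0" means no copies in that location for that state, mirroring the -C 0 setting for the backup location.

```python
from dataclasses import dataclass, field


@dataclass
class LocationRule:
    create: str    # redundancy for newly written chunks ("0" = none)
    keep: str      # redundancy while data is hot
    archive: str   # redundancy once data is cold


@dataclass
class StorageClass:
    name: str
    rules: dict = field(default_factory=dict)   # location name -> LocationRule


def backup_states(storage_class, location="backup"):
    """States in which the class stores data in the given location."""
    rule = storage_class.rules.get(location)
    if rule is None:
        return []
    states = {"create": rule.create, "keep": rule.keep, "archive": rule.archive}
    return [state for state, value in states.items() if value != "0"]


# Hypothetical values: temporary data is left unchanged, so it has no backup rule.
temporary = StorageClass("temporary", {"main": LocationRule("2", "2", "2")})

# KEEP is zero in backup: data reaches the backup location only on archive.
regular = StorageClass("regular", {
    "main": LocationRule("2", "2", "EC 4+2"),
    "backup": LocationRule("0", "0", "EC 4+2"),
})

# KEEP is non-zero in backup: data is replicated there while still hot.
archival = StorageClass("archival", {
    "main": LocationRule("3", "3", "EC 4+3"),
    "backup": LocationRule("0", "EC 4+3", "EC 4+3"),
})

for cls in (temporary, regular, archival):
    print(cls.name, backup_states(cls))
```

Listing the classes this way makes it easy to see, before the backup location is switched ON, which classes will start copying data there and in which state.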

Note

Unlike Scenario 1, this extension scenario does not require temporarily blocking replications. New chunk copying to the backup location begins only after the location is switched ON, at which point all storage class definitions are already in their final form.