Saturday, November 16, 2024

How to Fix Ceph-MDS Service Issues on CentOS 7 with Ceph Nautilus 14.2.19

If you’re trying to set up a Ceph Metadata Server (MDS) manually on CentOS 7, especially on Ceph Nautilus version 14.2.19, you might encounter issues with starting the MDS daemon. Here’s a breakdown of the potential problems and a solution based on a common setup mistake.

Steps Taken and Setup

  1. Creating the MDS Directory: You first created a folder inside /var/lib/ceph/mds with the naming format <clusterid>-mds.<hostid>.

  2. Generating and Adding the Keyring: Using ceph-authtool, you created a keyring for the MDS and then added the necessary permissions using:


    ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-mds.<hostid>/keyring --gen-key -n mds.<hostid>
    ceph auth add mds.<hostid> osd "allow rwx" mds "allow *" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-mds.<hostid>/keyring
  3. Changing Ownership: You changed the owner and group of everything under /var/lib/ceph/mds/ to ceph:ceph.

  4. Verifying Keyring: You verified that the key in the keyring file matches the entry for the MDS in the output of ceph auth list.
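The check in step 4 can be scripted rather than compared by eye. The following is a minimal sketch, assuming the keyring uses the usual INI-style layout written by ceph-authtool (a [entity] header followed by a "key = ..." line). The function name keyring_key is hypothetical and not part of Ceph.

```shell
#!/bin/sh
# Print the base64 key stored for one entity in a Ceph keyring file.
# Usage: keyring_key <keyring-file> <entity>
keyring_key() {
    awk -v section="[$2]" '
        $0 == section        { found = 1; next }  # entered the wanted section
        /^\[/                { found = 0 }        # another section header ends it
        found && $1 == "key" { print $3; exit }   # line looks like: key = <base64>
    ' "$1"
}

# Example comparison against the cluster (run on a host with admin access):
#   local_key=$(keyring_key "/var/lib/ceph/mds/ceph-mds.<hostid>/keyring" "mds.<hostid>")
#   cluster_key=$(ceph auth get-key "mds.<hostid>")
#   [ "$local_key" = "$cluster_key" ] && echo "keys match"
```

If the function prints nothing, the keyring has no section for that entity name, which is worth checking before looking at anything else.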

After these steps, the ceph-mds service should start without issue. If it doesn’t, you may see the following error in journalctl:


Apr 20 11:38:14 <hostid> ceph-mds[44742]: failed to fetch mon config (--no-mon-config to skip).

And systemctl status ceph-mds@mds.<hostid>.service reports a failure:


Unit ceph-mds@mds.<hostid>.service entered failed state.
start request repeated too quickly for ceph-mds@mds.<hostid>.service
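The "start request repeated too quickly" message comes from systemd's start rate limit rather than from Ceph itself, so the useful detail is usually further back in the journal. Here is a small sketch for collecting it. The helper names mds_unit and mds_diagnose are hypothetical, and the argument is whatever follows the "@" in your unit name.

```shell
#!/bin/sh
# Build the systemd unit name for an MDS instance (the part after "@").
mds_unit() {
    printf 'ceph-mds@%s.service\n' "$1"
}

# Show the most recent log lines for the unit, then clear its failed state
# so that a later start attempt is not refused by the rate limit.
mds_diagnose() {
    unit=$(mds_unit "$1")
    journalctl -u "$unit" --no-pager -n 50
    systemctl reset-failed "$unit"
}

# Example (run as root on the MDS host):
#   mds_diagnose "mds.<hostid>"
```

Clearing the failed state does not fix the underlying problem. It only lets the next systemctl start run once the cause has been corrected.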

Common Cause of the Issue: Directory Naming

This issue is frequently due to an incorrect directory naming convention inside /var/lib/ceph/mds. Ceph requires a specific format, and even a minor deviation can cause the MDS daemon to fail when it cannot locate the configuration or keyring files as expected.
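To see which names the daemon will actually look for, it can help to derive them from the unit name. The sketch below assumes the stock ceph-mds@.service unit, which passes the instance name (the part after "@") to ceph-mds as --id. It also assumes the default data path /var/lib/ceph/mds/$cluster-$id and the default keyring location inside that directory. The function name mds_expected_names is hypothetical.

```shell
#!/bin/sh
# Given the cluster name and the systemd instance name (the part after "@"
# in ceph-mds@<instance>), print the names the daemon is expected to use
# under the default configuration.
mds_expected_names() {
    cluster=$1
    instance=$2
    echo "data_dir=/var/lib/ceph/mds/${cluster}-${instance}"
    echo "keyring=/var/lib/ceph/mds/${cluster}-${instance}/keyring"
    echo "auth_entity=mds.${instance}"
}

mds_expected_names ceph mds1
```

Under these assumptions, an instance name of mds.<hostid> yields the auth entity mds.mds.<hostid>, which is not the mds.<hostid> entity the keyring above was generated for. A mismatch of that kind would be consistent with the "failed to fetch mon config" error.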

Solution

To resolve this issue:

  1. Double-Check Directory Naming: Ensure that the folder name matches the format used in the Ceph manual-deployment documentation, <clusterid>-<hostid>, where <hostid> is the bare daemon ID without an mds. prefix. The mds. prefix belongs only in the auth entity name (mds.<hostid>).

    • For instance, if your cluster ID is ceph and host ID is mds1, the directory should be /var/lib/ceph/mds/ceph-mds1.
  2. Start the MDS Service: After renaming the directory to the correct format, start the ceph-mds service, again using the bare ID as the unit instance:


    systemctl start ceph-mds@<hostid>
  3. Verify Service Status: Check the status to confirm it’s running successfully:


    systemctl status ceph-mds@<hostid>

Final Thoughts

Setting up Ceph manually can be intricate due to the strict directory and naming conventions Ceph expects. Following these conventions carefully, especially for directory names, can save time troubleshooting configuration errors like this one.
