GlusterFS mounts fail at boot on CentOS

After nearly a year of running GlusterFS on my compute cluster, I’m surprised it took me so long to run into this bug. I rebooted one of my compute nodes and found that the machine hung at "Mounting network filesystems" during the boot sequence.

The volume log on the client, /var/log/glusterfs/<volname>.log, makes it look like a networking issue:

[2014-02-28 06:43:15.935153] E [socket.c:2157:socket_connect_finish] 0-glusterfs: connection to 192.168.5.12:24007 failed (No route to host)

The volume in question has _netdev in its mount options in /etc/fstab, and the netfs service is enabled, so in theory this volume should mount only after networking is up and running. For whatever reason, CentOS 6.4 does not behave this way at the time of this writing.
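For reference, an /etc/fstab entry of the kind described above might look roughly like this. It is only a sketch: the server address is the one from the log line, while the /mnt/<volname> mount point and the defaults option are assumptions for illustration, not my actual configuration.

```
# <server>:/<volume>      <mount point>    <type>     <options>         <dump> <pass>
192.168.5.12:/<volname>   /mnt/<volname>   glusterfs  defaults,_netdev  0      0
```

You can confirm that netfs is enabled for your runlevel with chkconfig --list netfs.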

LINKDELAY

The solution is to add a delay to your network interface’s configuration with LINKDELAY, which takes a value in seconds. For example, /etc/sysconfig/network-scripts/ifcfg-eth4:

...
ONBOOT="yes"
MTU=9212
LINKDELAY=20

Thanks to someone on #gluster on Freenode IRC for suggesting it!