Tag: GlusterFS

Replacing Failed Drive in Hardware RAID5 on GlusterFS Replica

The other day a drive died in a hardware RAID5 array on one of my GlusterFS replica servers. I had a spare drive on hand, but I wasn’t sure what the process was for replacing it, or how much downtime my cluster would incur. To my surprise, I took the replica down and […]

Update GlusterFS 3.3.1 to 3.4.0 on CentOS 6.4 cluster

Notes from the GlusterFS 3.3.1 -> 3.4.0 upgrade on my storage / compute cluster at ILRI, Kenya. I referenced Vijay Bellur’s blog post about upgrading to 3.4, then added my own bits using Ansible for my infrastructure (I gave an overview of my Ansible setup here). Our cluster consists of: Three “storage” nodes (gluster […]

Managing Research Computing Clusters with Ansible

Our research computing cluster at work is slowly gathering more users, more storage, more applications, more physical machines, etc. Managing everything consistently and predictably was beginning to get complicated (or maybe I’m just getting old?). There’s lots of buzz in DevOps circles about tools for managing this kind of scenario: Chef, Salt, Puppet and Ansible […]

“interactive” script for SLURM

I recently rolled out a new distributed model for our research computing cluster at work. We’re using GlusterFS for networked home directories and SLURM for job/resource scheduling. GlusterFS allows us to scale storage with minimal downtime or service disruption, and SLURM allows us to treat compute nodes as generic resources for running users’ jobs (i.e., […]

Mjanja Tech

Ujanja Ni Uhai (Hustling Is Life)

Sysadmin is happy when users use SLURM