GlusterFS

Published on Thursday, July 14, 2016

GlusterFS is a scale-out network-attached storage file system that has found applications in cloud computing, streaming media services, and content delivery networks. According to Wikipedia, GlusterFS was originally developed by Gluster, Inc. and then by Red Hat, Inc., after Red Hat acquired Gluster in 2011. It's a distributed file system that runs on multiple hosts, each with “bricks” that physically hold the data on storage. The nodes communicate with each other as peers, and we can create a volume across them using different strategies. Replication is one of them: if chosen, the data gets stored in the bricks of all contributing nodes, much like RAID 1.

For our little project, we will use two Raspberry Pis to create a GlusterFS volume and then mount it in a Docker container.

We need to install glusterfs-server on the Pis; run the following command:

$ sudo apt-get install glusterfs-server

This installed Gluster 3.5.2; we can check the version using gluster --version. Knowing the version is important, as we will need to install the same version in the Docker container; newer client versions don't talk to older Gluster servers, and vice versa.
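
Since the client and server versions have to match, a small helper can make the comparison less error prone. This is only a sketch: parse_gluster_version and same_gluster_version are names I made up, and it assumes the first line of the version output looks like "glusterfs 3.5.2 built on ...", which is what I would expect from the 3.x releases.

```shell
#!/bin/sh
# Hypothetical helper: print the version number from `gluster --version`
# style output read on stdin. Assumes the first line has the form
# "glusterfs <version> built on ...".
parse_gluster_version() {
    awk 'NR == 1 { print $2 }'
}

# Hypothetical helper: succeed only when the two version strings are identical.
same_gluster_version() {
    [ "$1" = "$2" ]
}
```

On a Pi you could run gluster --version | parse_gluster_version, do the same with the client binary inside the container, and pass the two results to same_gluster_version before attempting a mount.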

Once Gluster is installed, probe the peers using gluster peer probe hostname. It's better to have the two Pis in the same subnet, with friendly names added to the /etc/hosts file of each participating node. In my case, I named the two nodes pi and pi2, and was able to run $ sudo gluster peer probe pi2 from pi and probe pi from pi2. Once the probing succeeds, we can create the RAID 1-like replicated volume using gluster volume create. I issued the following command:

$ sudo gluster volume create gv replica 2 transport tcp pi:/srv/gluster pi2:/srv/gluster force

  • /srv/gluster is the directory used as the brick here; I created it on both nodes
  • I used /srv/gluster, which is on the SD card’s storage, and therefore I had to use force; ideally you should mount USB drives and use those instead
  • I am using tcp as the transport, and as I have two nodes I am using replica 2 and giving their names and brick paths accordingly
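
The probe and create steps above can be put together in a small script. This is a sketch rather than the exact sequence I typed: run and create_replica_volume are hypothetical helper names, the script only prints the commands unless DRY_RUN=0 is set, and it adds the gluster volume start and gluster volume info steps so the volume is actually running before anyone tries to mount it.

```shell
#!/bin/sh
# Hypothetical wrapper: print commands by default; set DRY_RUN=0 to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper.
# Usage: create_replica_volume VOLUME BRICK_DIR PEER_HOST LOCAL_HOST
# Run on LOCAL_HOST; the brick directory must also be created on PEER_HOST.
create_replica_volume() {
    volume="$1"; brick="$2"; peer="$3"; local_host="$4"
    run sudo mkdir -p "$brick"
    run sudo gluster peer probe "$peer"
    # "force" is only there because the brick sits on the SD card.
    run sudo gluster volume create "$volume" replica 2 transport tcp \
        "$local_host:$brick" "$peer:$brick" force
    run sudo gluster volume start "$volume"
    run sudo gluster volume info "$volume"
}
```

Calling create_replica_volume gv /srv/gluster pi2 pi on pi prints the command list for review; running it again with DRY_RUN=0 executes it.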

Once the volume is created and started (gluster volume start gv), the two nodes keep the bricks in sync and we can mount the volume using the mount command. On pi I mounted the volume using mount -t glusterfs pi2:gv /mnt/gluster, and on pi2 using mount -t glusterfs pi:gv /mnt/gluster. Once mounted, we can read and write data on GlusterFS just like any file system. If you want, you can add fstab entries; I mounted on each node from its peer just to check things out.
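
For the fstab route, the entry could look like the line this helper prints. It is a sketch under assumptions: gluster_fstab_line is a hypothetical name, the server:/volume spelling is the form I have seen in the Gluster documentation (the pi2:gv form used above works with mount as well), and _netdev is the usual option to delay the mount until the network is up.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab entry for a GlusterFS volume.
# Usage: gluster_fstab_line SERVER VOLUME MOUNTPOINT
gluster_fstab_line() {
    # _netdev asks the system to wait for networking before mounting.
    printf '%s:/%s %s glusterfs defaults,_netdev 0 0\n' "$1" "$2" "$3"
}
```

On pi, something like gluster_fstab_line pi2 gv /mnt/gluster | sudo tee -a /etc/fstab would append the entry; check the line before rebooting.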

Let's create a Docker container in which we will mount this Gluster volume; here’s the Dockerfile:

FROM ubuntu
MAINTAINER Khurram <khuziz@hotmail.com>

RUN apt-get update && apt-get -y upgrade
RUN apt-get -y install software-properties-common python-software-properties
RUN apt-get -y install libpython2.7 libaio1 libibverbs1 liblvm2app2.2 librdmacm1 fuse
RUN apt-get -y install curl nano
RUN curl -sSL https://download.gluster.org/pub/gluster/glusterfs/3.5/3.5.2/Debian/jessie/apt/pool/main/g/glusterfs/glusterfs-common_3.5.2-4_amd64.deb > glusterfs-common_3.5.2-4_amd64.deb
RUN curl -sSL https://download.gluster.org/pub/gluster/glusterfs/3.5/3.5.2/Debian/jessie/apt/pool/main/g/glusterfs/glusterfs-client_3.5.2-4_amd64.deb > glusterfs-client_3.5.2-4_amd64.deb
RUN dpkg -i glusterfs-common_3.5.2-4_amd64.deb
RUN dpkg -i glusterfs-client_3.5.2-4_amd64.deb

  • Notice that I have used the same version of GlusterFS that's running on the Pis

If we run the Docker container in a development environment, it will most probably be behind NAT, and we will not be able to connect to our Pis straight away, as Gluster 3.5.2 doesn't allow requests from clients using non-privileged ports. To fix this, edit /etc/glusterfs/glusterd.vol (at least on the server that you are going to use when mounting) and add option rpc-auth-allow-insecure on. Also run gluster volume set gv server.allow-insecure on, followed by a stop and start of the volume, so that the client can communicate with the GlusterFS daemon and the bricks using non-privileged ports. Also make sure you don't use any authentication for the volume, as it might not work from behind NAT.
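
The glusterd.vol edit can be scripted so that it is safe to repeat. This is a sketch with a hypothetical function name, add_insecure_option, and it assumes the file ends its management block with a line starting with end-volume, as the stock file does on my Pis. The trailing comments list the follow-up commands; the service name is my assumption for Raspbian and may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: insert "option rpc-auth-allow-insecure on" before the
# end-volume line of a glusterd.vol file, unless the option is already there.
# Usage (as root): add_insecure_option /etc/glusterfs/glusterd.vol
add_insecure_option() {
    vol_file="$1"
    if grep -q 'rpc-auth-allow-insecure' "$vol_file"; then
        return 0
    fi
    tmp="$(mktemp)"
    awk '/^end-volume/ { print "    option rpc-auth-allow-insecure on" }
         { print }' "$vol_file" > "$tmp" && cat "$tmp" > "$vol_file"
    rm -f "$tmp"
}

# Follow-up steps, run by hand on the server:
#   sudo service glusterfs-server restart    # assumed service name
#   sudo gluster volume set gv server.allow-insecure on
#   sudo gluster volume stop gv
#   sudo gluster volume start gv
```

Writing the result back with cat keeps the original file's ownership and permissions, which is why the sketch does not simply move the temporary file into place.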

The second thing to take care of before running the Docker container is that the client uses FUSE, so we need to expose the /dev/fuse device and run the container with the SYS_ADMIN capability. If the Docker image is khurram/gluster:work, run it with something like:

docker run --name gluster --cap-add SYS_ADMIN --device /dev/fuse --rm -it khurram/gluster:work

Once you are in the container, add pi and pi2 host entries to /etc/hosts, create a folder where you want to mount the volume, say /gluster, and use the mount command to mount it: mount -t glusterfs pi2:gv /gluster
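
A failed mount inside the container is often just a missing /dev/fuse, so it can help to check for the device first. The sketch below uses two hypothetical helper names, have_fuse_device and mount_gluster, and assumes it runs as root inside the container with the host entries already in place.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the FUSE character device is visible.
# The device path defaults to /dev/fuse.
have_fuse_device() {
    [ -c "${1:-/dev/fuse}" ]
}

# Hypothetical helper.
# Usage: mount_gluster SERVER VOLUME MOUNTPOINT
mount_gluster() {
    if ! have_fuse_device; then
        echo "no /dev/fuse: start the container with --device /dev/fuse" >&2
        return 1
    fi
    mkdir -p "$3" && mount -t glusterfs "$1:$2" "$3"
}
```

Inside the container, mount_gluster pi2 gv /gluster then does the same thing as the manual steps above, with a clearer error when the device was not passed through.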

  • As an exercise, can you customize the Dockerfile or create a docker-compose file that takes care of adding the hosts entries and mounting GlusterFS from the docker run parameters?
  • As an additional exercise, can you customize the Dockerfile or docker-compose file further so that Samba runs in the container and exposes the mounted GlusterFS volume, letting us access it from Windows and read/write data to it?
  • https://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.3/Raspbian/jessie/ has more recent GlusterFS binaries that we can use on the Pis; update the Dockerfile to match the GlusterFS version accordingly
  • You can have one container that mounts GlusterFS and exposes the directory as a Docker volume, and then mount that Docker volume in another container (one running a web server or database server)

Happy Containering