What exactly does Gluster do?

We recently started researching GlusterFS for our own use, so this question interested me. Gluster uses what are called ‘translators’ on the FUSE client to handle how data is stored. There are several types of translators, which are outlined here:

http://www.gluster.com/community/documentation/index.php/GlusterFS_Translators_v1.3

The one you are asking about specifically is called the Automatic File Replication Translator or AFR, and is covered in detail here:

http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator

Looking at the source code, it appears that the client writes the data to all of the replica nodes simultaneously, which is much better than rsync!
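
To make the idea of simultaneous writes concrete, here is a minimal Python sketch. It is not Gluster's code: it only assumes that each replica can be treated as a local directory, and the function name `write_to_replicas` and the file path used are made up for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_to_replicas(replica_dirs, relative_path, data):
    """Write the same bytes under every replica directory concurrently.

    Hypothetical helper for illustration only. Returns a dict mapping each
    replica directory to True (write succeeded) or False (OSError raised).
    """

    def write_one(replica_dir):
        target = os.path.join(replica_dir, relative_path)
        try:
            # Create any missing parent directories on this replica.
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as handle:
                handle.write(data)
            return True
        except OSError:
            return False

    # One worker per replica, so no replica waits for another to finish.
    with ThreadPoolExecutor(max_workers=max(1, len(replica_dirs))) as pool:
        results = list(pool.map(write_one, replica_dirs))
    return dict(zip(replica_dirs, results))


if __name__ == "__main__":
    # Two temporary directories stand in for two replica nodes.
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        print(write_to_replicas([a, b], "docs/example.txt", b"hello"))
```

The contrast with rsync is that the copy happens as part of the write itself, rather than in a later pass that has to discover what changed.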

Regarding recovery from a failure, there is one interesting note I found. Gluster differs from Ceph in that it isn’t actively aware of replication state changes and has to be ‘triggered’. So if you lose a node in your cluster, you have to look up each file in order for Gluster to make sure it’s replicated:

http://www.gluster.com/community/documentation/index.php/Gluster_3.2:_Triggering_Self-Heal_on_Replicate
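
As a rough sketch of what looking up each file could mean in practice, the Python below walks a directory tree and calls stat() on every entry. It assumes that a stat() issued through the Gluster client mount counts as the lookup that triggers the check; the mount point `/mnt/gluster` and the function name `touch_all_entries` are hypothetical.

```python
import os


def touch_all_entries(mount_point):
    """Call stat() on every directory and file under mount_point.

    Hypothetical helper for illustration only. Returns (visited, failed):
    the count of entries stat-ed successfully and the list of paths where
    stat() raised OSError.
    """
    visited = 0
    failed = []
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.stat(path)  # the lookup itself; the result is discarded
                visited += 1
            except OSError:
                failed.append(path)
    return visited, failed


if __name__ == "__main__":
    # Assumed mount point; replace with wherever the volume is mounted.
    count, errors = touch_all_entries("/mnt/gluster")
    print(f"stat-ed {count} entries, {len(errors)} failures")
```

On a large volume this walk touches every file, so it can take a long time and generate a lot of traffic.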

I was unable to find a good page describing how the failure mechanisms work internally, such as how the client detects that things are broken. However, after downloading the source code and looking through the client, it appears there are various timeouts that it uses for commands and a probe that it sends every so often to other systems in the cluster. Most of these appear to have TODO marks and aren’t currently configurable except through source code modification, which may be a concern for you if convergence time is critical.
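
To show why those timeout values matter for convergence time, here is a generic Python sketch of a probe with a timeout. It does not reproduce Gluster's client logic: the function name `probe_peer`, the host name, the port number, and the two-second default are all assumptions chosen for illustration.

```python
import socket


def probe_peer(host, port, timeout_seconds=2.0):
    """Return True if a TCP connection to (host, port) opens within the timeout.

    Hypothetical helper for illustration only. A refused connection, an
    unreachable host, or a timeout all count as a failed probe.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed peer address and port; not taken from any Gluster configuration.
    print(probe_peer("storage-node-1.example", 9999, timeout_seconds=2.0))
```

With a scheme like this, a dead peer that silently drops packets is only noticed after the timeout expires, plus however long the client waits between probes, which is why hard-coded values put a floor on how quickly a failure can be detected.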
