What’s the advantage of synchronizing UID/GID across Linux machines?

technical debt

For the reasons below, it is much simpler to address this problem early, before technical debt accumulates. Even if you already find yourself in this situation, it’s probably better to deal with it soon than to let it keep building.

networked filesystems

This question seems to focus narrowly on transferring files between machines with local filesystems, where each machine can keep its own machine-specific ownership state.

Networked filesystems are easily the biggest reason to keep your UID/GID mappings in sync, because you can usually throw that “achieved otherwise” you mentioned out the window the moment they enter the picture. Sure, you might not have networked filesystems shared between these hosts right now…but what about the future? Can you honestly say that a networked filesystem will never be introduced between your current hosts, or hosts that are created later? It’s not very forward-thinking to assume otherwise.

Assume that /home is a networked filesystem shared between host1 and host2 in the following examples.

  • Disagreeing permissions: /home/user1 is owned by a different user on each system. This prevents a user from being able to consistently access or modify their home directory across systems.
  • chown wars: It’s very common for a user to submit a ticket asking for their home directory permissions to be fixed on a specific system. Fixing the problem on host2 breaks the permissions on host1. Sometimes several of these tickets get worked before someone steps back and realizes that a tug of war is in play. The only solution is to fix the disagreeing ID mappings. Which leads to…
  • UID/GID rebalancing hell: The complexity of correcting IDs later grows quickly with the number of remappings needed to fix a single user across multiple machines. (user1 has the ID of user2, but user2 has the ID of user17…and that’s just the first system in the cluster.) The longer you wait to fix the problem, the more complex these chains become, often requiring application downtime on multiple servers to get things properly in sync.
  • Security problems: user2 on host2 has the same UID as user1 on host1, allowing them to write to /home/user1 on host2 without the knowledge of user1. These changes are then evaluated on host1 with the permissions of user1. What could possibly go wrong? (If user1 is an app user, someone in dev will discover it’s writable and will make changes. This is a time-proven fact.)
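
The first and last cases above are easy to detect ahead of time. Here is a minimal sketch, assuming you have already copied each host’s passwd data somewhere you can read it. The helper names (`parse_passwd`, `find_conflicts`) and the sample entries are made up for illustration; only the colon-separated passwd layout is taken as given.

```python
def parse_passwd(text):
    """Map user name -> numeric UID from passwd(5)-style text."""
    users = {}
    for line in text.splitlines():
        fields = line.strip().split(":")
        # Skip blanks, comments, and entries without a numeric UID field.
        if len(fields) < 3 or fields[0].startswith("#") or not fields[2].isdigit():
            continue
        users[fields[0]] = int(fields[2])
    return users


def find_conflicts(host_a, host_b):
    """Compare two {name: uid} maps taken from different hosts."""
    # Same name, different number: the "disagreeing permissions" case.
    split_names = {
        name: (host_a[name], host_b[name])
        for name in host_a.keys() & host_b.keys()
        if host_a[name] != host_b[name]
    }
    # Same number, different name: the "security problems" case.
    # (If host_b reuses a UID for several names, only the last one is kept.)
    names_by_uid_b = {uid: name for name, uid in host_b.items()}
    shared_uids = {
        uid: (name, names_by_uid_b[uid])
        for name, uid in host_a.items()
        if uid in names_by_uid_b and names_by_uid_b[uid] != name
    }
    return split_names, shared_uids


# Made-up sample data standing in for each host's passwd file.
HOST1 = "user1:x:1001:1001::/home/user1:/bin/sh\nuser2:x:1002:1002::/home/user2:/bin/sh\n"
HOST2 = "user1:x:1002:1002::/home/user1:/bin/sh\nuser2:x:1001:1001::/home/user2:/bin/sh\n"

split_names, shared_uids = find_conflicts(parse_passwd(HOST1), parse_passwd(HOST2))
print(split_names)   # names whose UID differs between the two hosts
print(shared_uids)   # UIDs that belong to different names on each host
```

With the sample data, both users show up in both reports, which is exactly the situation where a file written by user2 on host2 appears to belong to user1 on host1.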

There are other scenarios; these are just the most common ones.
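
To make the rebalancing chains from the list above concrete, here is a rough sketch of how the order of moves on a single host could be planned. It only plans; it does not touch any system, and a real migration would also have to re-own files and stop the affected applications. The function name `plan_moves`, the scratch UID of 60000, and the sample numbers are all assumptions made for the example.

```python
def plan_moves(current, target, scratch_uid):
    """Order UID changes so that no move lands on a UID still in use.

    current and target are {name: uid} maps covering the same users.
    Returns a list of (name, from_uid, to_uid) steps.
    """
    if len(set(current.values())) != len(current):
        raise ValueError("two users already share a UID")
    if len(set(target.values())) != len(target):
        raise ValueError("two users share a target UID")
    if scratch_uid in current.values() or scratch_uid in target.values():
        raise ValueError("scratch UID is not free")

    current = dict(current)  # work on a copy
    pending = {name for name in current if current[name] != target[name]}
    steps = []
    while pending:
        in_use = set(current.values())
        movable = sorted(n for n in pending if target[n] not in in_use)
        if movable:
            name, dest = movable[0], target[movable[0]]
        else:
            # Everyone is waiting on someone else: a cycle.
            # Park one user on the scratch UID to break it.
            name, dest = min(pending), scratch_uid
        steps.append((name, current[name], dest))
        current[name] = dest
        if dest == target[name]:
            pending.discard(name)
    return steps


# user1 holds user2's ID, user2 holds user17's ID, user17 holds user1's ID.
now = {"user1": 1002, "user2": 1017, "user17": 1001}
wanted = {"user1": 1001, "user2": 1002, "user17": 1017}
for step in plan_moves(now, wanted, scratch_uid=60000):
    print(step)
```

Even this three-user cycle needs four moves and a temporary ID on one host. Multiply that by every host that disagrees and it becomes clear why waiting makes things worse.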

names aren’t always an option

Any scripts or config files written against numeric IDs become inherently unportable within your environment. This is generally not a problem, because most people don’t hardcode these unless they’re absolutely required to…but sometimes the tool you’re working with doesn’t give you a choice in the matter. In these scenarios, you’re forced to maintain n different versions of the script or configuration file.

Example: pam_succeed_if allows you to use fields of user, uid, and gid…a “group” option is conspicuously absent. If you were put in a position where multiple systems were expected to implement some form of group-based access restriction, you’d have n different variations of the PAM configs. (Or at least a single GID that you have to keep free of collisions on every system.)
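
If you are stuck with mismatched IDs and a tool that only takes numbers, one way to limit the damage is to keep a single template and resolve the name to the local number on each host. Here is a minimal sketch using Python’s standard `grp` module. The `@GID@` placeholder, the `allowed_gid` setting, and the function name are hypothetical and not tied to any particular tool.

```python
import grp


def render_for_this_host(template, group_name, lookup=grp.getgrnam):
    """Fill a numeric-GID placeholder using this host's own group database."""
    gid = lookup(group_name).gr_gid  # grp.getgrnam raises KeyError for unknown groups
    return template.replace("@GID@", str(gid))


# Any group known to this host will do for a demonstration.
some_group = grp.getgrall()[0].gr_name
print(render_for_this_host("allowed_gid = @GID@", some_group))
```

The rendered file still differs from host to host, so this only moves the n variations from hand-maintained copies to generated ones. Synchronized IDs make the whole step unnecessary.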

centralized management

natxo’s answer has this covered pretty well.
