what is the new feature in c++20 [[no_unique_address]]?

The purpose behind the feature is exactly as stated in your quote: “the compiler may optimise it to occupy no space”. This requires two things:

  1. An object which is empty.

  2. An object that wants to have an non-static data member of a type which may be empty.

The first one is pretty simple, and the quote you used even spells it out an important application. Objects of type std::allocator do not actually store anything. It is merely a class-based interface into the global ::new and ::delete memory allocators. Allocators that don’t store data of any kind (typically by using a global resource) are commonly called “stateless allocators”.

Allocator-aware containers are required to store the value of an allocator that the user provides (which defaults to a default-constructed allocator of that type). That means the container must have a subobject of that type, which is initialized by the allocator value the user provides. And that subobject takes up space… in theory.

Consider std::vector. The common implementation of this type is to use 3 pointers: one for the beginning of the array, one for the end of the useful part of the array, and one for the end of the allocated block for the array. In a 64-bit compilation, these 3 pointers require 24 bytes of storage.

A stateless allocator doesn’t actually have any data to store. But in C++, every object has a size of at least 1. So if vector stored an allocator as a member, every vector<T, Alloc> would have to take up at least 32 bytes, even if the allocator stores nothing.

The common workaround to this is to derive vector<T, Alloc> from Alloc itself. The reason being that base class subobject are not required to have a size of 1. If a base class has no members and has no non-empty base classes, then the compiler is permitted to optimize the size of the base class within the derived class to not actually take up space. This is called the “empty base optimization” (and it’s required for standard layout types).

So if you provide a stateless allocator, a vector<T, Alloc> implementation that inherits from this allocator type is still just 24 bytes in size.

But there’s a problem: you have to inherit from the allocator. And that’s really annoying. And dangerous. First, the allocator could be final, which is in fact allowed by the standard. Second, the allocator could have members that interfere with the vector‘s members. Third, it’s an idiom that people have to learn, which makes it folk wisdom among C++ programmers, rather than an obvious tool for any of them to use.

So while inheritance is a solution, it’s not a very good one.

This is what [[no_unique_address]] is for. It would allow a container to store the allocator as a member subobject rather than as a base class. If the allocator is empty, then [[no_unique_address]] will allow the compiler to make it take up no space within the class’s definition. So such a vector could still be 24 bytes in size.

e1 and e2 cannot have the same address, but one of them can share with c[0] and the other with c1 can some one explain? why do we have such kind of relation ?

C++ has a fundamental rule that its object layout must follow. I call it the “unique identity rule“.

For any two objects, at least one of the following must be true:

  1. They must have different types.

  2. They must have different addresses in memory.

  3. They must actually be the same object.

e1 and e2 are not the same object, so #3 is violated. They also share the same type, so #1 is violated. Therefore, they must follow #2: they must not have the same address. In this case, since they are subobjects of the same type, this means that the compiler-defined object layout of this type cannot give them the same offset within the object.

e1 and c[0] are distinct objects, so again #3 fails. But they satisfy #1, since they have different types. Therefore (subject to the rules of [[no_unique_address]]) the compiler could assign them to the same offset within the object. The same goes for e2 and c[1].

If the compiler wants to assign two different members of a class to the same offset within the containing object, then they must be of different types (note that this is recursive through all of each of their subobjects). Therefore, if they have the same type, they must have different addresses.

Leave a Comment