C++ Reflection TS: A First Look

Recently, Matus Chochlik implemented the “C++ Extensions for Reflection N4856” technical specification (or TS) in a fork of clang, which we can toy around with here. Naturally I was intrigued, but the cppreference page seemed very barebones.

Hence, after spending a bit of time with the TS, I wanted to explain what it’s all about, and how to Actually Do Things™ with it.

In this post, I’ll explain the basic ideas of the specification, show how to write a simple generic “enum to string” function, and go into a bit more detail with a proof-of-concept serialization function.

Note that in all my code, I use namespace reflect = std::experimental::reflect; to shorten things a bit. Also, I’d recommend reading this on a computer rather than a mobile device.

Basic ideas

Pretty much the entire thing is based on concepts, which are essentially constraints on generic types. I won’t go into too much detail, but for the unacquainted this article is a good introduction and you can find more documentation on them here.

The TS adds a new keyword, reflexpr, which returns a so-called “reflection meta-object type”, basically just a type which fits the concept Object. Other concepts refine Object, such as Variable, ObjectSequence, Lambda, etc. In this case, “refine” just means “constrain even more”.

Once you have a meta-object type, you can ask it stuff through meta-functions such as get_name, is_class, get_public_data_members, etc. All of these meta-functions are constrained fairly sensibly: for example, you can only call get_public_data_members on a Record, which is a concept for classes, structs and unions. Note that, as with the rest of the C++ standard meta-functions, these usually have _t or _v shorthands so you can write e.g. stuff_v<T> instead of stuff<T>::value.
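
For anyone who hasn't met the _t and _v shorthands before, this is how they look on ordinary standard library traits. The sketch deliberately uses traits from type_traits rather than the TS's meta-functions so it compiles anywhere, and the Colour enum is made up for the example.

```cpp
#include <cstdint>
#include <type_traits>

//made-up enum for the example
enum class Colour : std::uint8_t { red, green, blue };

//long form: go through the nested ::value and ::type members
static_assert(std::is_enum<Colour>::value);
static_assert(std::is_same<std::underlying_type<Colour>::type, std::uint8_t>::value);

//shorthand form: _v for values, _t for types
static_assert(std::is_enum_v<Colour>);
static_assert(std::is_same_v<std::underlying_type_t<Colour>, std::uint8_t>);
```
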

Enum to string

Enums only have three dedicated meta-functions, and the only one we care about here is get_enumerators, which will give us an ObjectSequence type we can then work on:

template<typename T>
//confusingly, there is also reflect::is_enum(_v), which tells you if a meta-object type is a reflect::Enum
consteval auto enum_names() requires std::is_enum_v<T>
{
    //here's the reflexpr; reflected_enum is a meta-object type reflecting T
    using reflected_enum = reflexpr(T); 

    //enum_enumerators is an ObjectSequence of every enum value's meta-object type
    using enum_enumerators = reflect::get_enumerators_t<reflected_enum>; 
    
    //moar stuff later on
}

ObjectSequence is a concept representing a series of Objects; it itself has a few meta-functions, notably get_element. With this, we can write a function which returns an array filled with the application of another meta-function to each element of the ObjectSequence:

template<
   //the trait we want to apply, don't worry too much about the syntax
   template<typename> typename Trait_t, 
   //the sequence we want to apply it to
   reflect::ObjectSequence Sequence_t,
   //a spicy variadic so we can apply the operation onto each element
   size_t... ints>
consteval auto make_object_sequence_array(std::index_sequence<ints...>)
{
    //Trait_t is expected to have a value member
    return std::array { Trait_t<reflect::get_element_t<ints, Sequence_t>>::value... }; 
}

This is a bit of a mouthful, but essentially it just applies Trait_t to every element of Sequence_t. The index_sequence needs a size we can acquire through the get_size meta-function, which gives us the number of elements in an ObjectSequence. make_object_sequence_array can now be called:

template<typename T>
consteval auto enum_names() requires std::is_enum_v<T>
{
    using reflected_enum = reflexpr(T);
    using enum_enumerators = reflect::get_enumerators_t<reflected_enum>;
    
    //getting the size of the ObjectSequence enum_enumerators
    constexpr auto T_size = reflect::get_size_v<enum_enumerators>; 
    using sequence = std::make_index_sequence<T_size>;

    //this'll return an array with the application of get_name to every element of the ObjectSequence
    return make_object_sequence_array<reflect::get_name, enum_enumerators>(sequence{}); 
}

Testing this in Compiler Explorer, we can see that it works perfectly. We can write our enum_to_string function pretty trivially now:

template<typename T>
constexpr auto enum_to_string(const T value) requires std::is_enum_v<T>
{
    //again, confusingly, there is reflect::get_underlying_type which returns the meta-object type for the underlying type of a reflect::Enum
    using underlying_type = std::underlying_type_t<T>; 
    const auto underlying_value = static_cast<underlying_type>(value);

    //easy peasy lemon squeezy
    return enum_names<T>()[underlying_value]; 
}

And that’s where I left it, until Barry Revzin mentioned that, well, this doesn’t work at all! It assumes the enum starts at zero and that all values are contiguous, which isn’t always the case, so back to the drawing board for a bit. We can fix this by using get_constant in our own meta-function, which will return a pair of the constant and the name, like so:

template<typename T> requires reflect::Constant<T> && reflect::Named<T>
struct get_constant_and_name 
{
    static constexpr auto value = std::pair { reflect::get_constant_v<T>, reflect::get_name_v<T> };
};

Swapping get_name for get_constant_and_name in enum_names, it now returns an array of pairs, which works. Now we need to change our enum_to_string function a little bit, and we’re set:


template<typename T>
constexpr auto enum_to_string(const T value) requires std::is_enum_v<T>
{
    //this could be a std::find_if, or even a ranges::find but let's keep this relatively simple
    for(const auto& pair : enum_names<T>())
    {
        //remember, first is value, second is name
        if(value == pair.first)
        {
            return pair.second;
        }
    }

    //this may happen for flags for example
    return "Unnamed value";
}

Note that this still doesn’t handle the case where an enum has two enumerators with the same value.
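
To see both behaviours without the reflection fork, here is a small sketch where the table is written by hand. Permission, permission_names and permission_to_string are made up for the example, and the table only imitates what enum_names would return after the swap, assuming the enumerators come out in declaration order. The lookup loop is the same idea as above.

```cpp
#include <array>
#include <string_view>
#include <utility>

//made-up enum: non-contiguous values, and modify shares its value with write
enum class Permission { read = 1, write = 4, modify = 4, admin = 64 };

//hand-written stand-in for the array of (constant, name) pairs
constexpr auto permission_names()
{
    using entry = std::pair<Permission, const char*>;
    return std::array {
        entry { Permission::read, "read" },
        entry { Permission::write, "write" },
        entry { Permission::modify, "modify" },
        entry { Permission::admin, "admin" }
    };
}

constexpr std::string_view permission_to_string(const Permission value)
{
    //linear search, the first entry with a matching constant wins
    for(const auto& [constant, name] : permission_names())
    {
        if(constant == value)
        {
            return name;
        }
    }

    return "Unnamed value";
}

//gaps between values are no longer a problem
static_assert(permission_to_string(Permission::admin) == "admin");
//duplicates: modify has the same value as write, so the first name is returned
static_assert(permission_to_string(Permission::modify) == "write");
```
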

Serialization – Proof of concept

This won’t have any actual serialization code – I leave that as an exercise to the reader. That being said, it’ll quickly show how to iterate over a type and break it down into serializable parts fairly automagically. For this, we’ll use a recursive templated function, which we can already type out a draft of:

template<typename T>
void serialize(const T& value)
{
    //Collection is a concept I made, check the Compiler Explorer link later if you're interested in the implementation
    if constexpr(Collection<T>)
    {
        //call serialize on each element, easy enough
        for(const auto& element : value)
        {
            serialize(element); 
        }
    } else if constexpr(std::is_class_v<T>)
    {
        //here we should break T down into its members, isolate the member variables and call serialize on them
    } else 
    {
        //here we should take care of primitive types
    }
}

Taking care of primitive types is essentially just a bunch of if constexprs for strings, arithmetic types, enums (for which we should be using our enum_to_string function perhaps!), etc.

The interesting part is if T is a class. This is where get_public_data_members comes in handy:

if constexpr(std::is_class_v<T>)
{
    //again, this is the meta-object type reflecting T
    using Reflected_t = reflexpr(T);

    //we probably don't want to serialize private stuff, though there is get_data_members for that
    using data_members = reflect::get_public_data_members_t<Reflected_t>;

    //again the size trick to make an index_sequence
    constexpr auto T_size = reflect::get_size_v<data_members>; 
    using sequence = std::make_index_sequence<T_size>;

    //now what?
}

This is tricky. We want to use get_pointer on each element of data_members to get a pointer to the class’ data member, but get_pointer doesn’t have a value member if we use it on a meta-object type which isn’t a Variable. I circumvented this by creating my own meta-function wrapping get_pointer. If its T fits the Variable concept, then it calls get_pointer. Otherwise, it returns std::monostate, signalling that this one should be ignored:

template<typename T>
struct get_pointer_or_monostate
{
private:
    static constexpr auto get_value() 
    { 
        if constexpr(reflect::Variable<T>) 
        {
            return reflect::get_pointer<T>::value; 
        }
        else 
        {
            return std::monostate{};
        } 
    }
public:
    static constexpr auto value = get_value();
};

Armed with this new (and possibly very roundabout, do tell if you find a better way) meta-function, we can now use it to get a tuple of data member pointers and monostates:

//same thing as make_object_sequence_array, except returns an std::tuple
constexpr auto pointer_or_monostate_tuple = make_object_sequence_tuple<get_pointer_or_monostate, data_members>(sequence{});

//this applies the templated lambda onto each element of the tuple (nicer way of writing the std::apply incantation)
apply_operation_on_tuple([&value](auto current_value)
    {
        using current_value_t = decltype(current_value);

        //"not" is cool, sue me; here we check if the type of current_value is not monostate, meaning it's a pointer to a data member of the class
        if constexpr(not std::same_as<current_value_t, std::monostate>)
        {
            //it's all coming together now, we can get the member's value by using the pointer to the class' data member
            const auto& member = value.*current_value;
            //through that we can call serialize on the member
            serialize(member);
        }
    }, pointer_or_monostate_tuple);
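
Neither make_object_sequence_tuple nor apply_operation_on_tuple is shown in this post, so here is a standalone sketch of the same pattern that compiles without the reflection fork. The tuple is written by hand to imitate a mix of data member pointers and monostates, and this apply_operation_on_tuple is one possible way to write such a helper (std::apply plus a fold over the comma operator), not necessarily the version behind the Compiler Explorer link. Point and sum_members are made up, with summing standing in for serializing.

```cpp
#include <tuple>
#include <type_traits>
#include <variant> //for std::monostate

//made-up struct for the example
struct Point { int x; int y; int z; };

//hand-written stand-in for the tuple of pointers and monostates
inline constexpr auto point_members = std::tuple { &Point::x, std::monostate{}, &Point::y, &Point::z };

//hypothetical helper: calls operation once per tuple element, in order
template<typename Operation, typename Tuple>
constexpr void apply_operation_on_tuple(Operation&& operation, const Tuple& tuple)
{
    std::apply([&operation](const auto&... elements)
        {
            (operation(elements), ...);
        }, tuple);
}

constexpr int sum_members(const Point& value)
{
    int total = 0;

    apply_operation_on_tuple([&value, &total](auto current_value)
        {
            using current_value_t = decltype(current_value);

            //skip the monostate placeholders, dereference the data member pointers
            if constexpr(not std::is_same_v<current_value_t, std::monostate>)
            {
                total += value.*current_value;
            }
        }, point_members);

    return total;
}

static_assert(sum_members(Point { 1, 2, 3 }) == 6);
```
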

Et voilà, that’s pretty much that. I haven’t actually written any serialization code, but popping this into Compiler Explorer with a few std::couts in there shows me we’re traversing the data we want to serialize correctly, yay!

Closing thoughts

One of my big frustrations with C++ is the lack of (good – sorry, typeid) reflection as a core part of the language. Simple stuff the compiler is bound to know, like enum names, usually has to be written manually in current C++, or libraries like magic enum have to be relied on, with all the shortcomings that entails.

As far as I can tell the fate of this TS is not yet decided. What is clear is that the code I just showed was not easy to follow, nor was it easy to write. As always with templates (though concepts have made this better), finding out why something is not working is a massive pain. There are also worries about compilation speed, since this is a lot of extremely generic compile-time work.

That being said, the TS provides an incredibly powerful solution to a lot of problems endemic to C++, all in one coherent package which builds on the rest of the standard, and does so in a way which will be fairly recognizable to most template aficionados. I didn’t touch upon reflecting on a namespace for example, but the stuff you can do with that is mind-bogglingly awesome.

As always, the C++ committee’s members are all volunteers and we should laud their efforts to make the language better. I’d therefore like to thank David Sankel and the contributors to this TS, Matus Chochlik for his implementation, Matt Godbolt for his continued work on Compiler Explorer and Guy Davidson for proofreading this article, reviewing its code and spreading the good word about the TS in a Teams chat at work.

Speaking of which, if you’re looking for work, The Creative Assembly is hiring for a bunch of C++ positions! Feel free to ask me any questions about the roles, or about the article, on Twitter or LinkedIn.

Thank you for getting this far without a TL;DR and have a nice day!
