Deepesson's blog

By Deepesson, history, 21 month(s) ago, In English

Are you tired of creating comparators or hashes for your custom class? I have a solution for you.

Sometimes we need to use STL data structures such as map, unordered_map or set using a data type that doesn't have a hash or a comparator. Instead of coding things properly I will show an easy way to achieve the same thing with minimal effort.

The idea is to use pointer casting.

Let's say we want to use pair<int,int> with an unordered_map (notice that there's no pre-made hash for this data type).

The size of pair<int,int> is 8 bytes. Also notice that we have long long: a data type also 8 bytes long.

We can create an unordered_map, and to check our pair we can just do:

typedef std::pair<int,int> pii;
std::unordered_map<long long,int> hashmap;
pii object;
hashmap[*((long long*)&object)] = 24;

We can recast our data type for another data type of the same size, and it will work just fine! But we don't have endless integers to use. Is there an easier way that is able to store any data structure?

Luckily there is!

We can use bitsets, and resize them as we wish. Following our previous example:

typedef std::pair<int,int> pii;
typedef std::bitset<sizeof(pii)*8> bits;
std::unordered_map<bits,int> hashmap;
pii object;
hashmap[*((bits*)&object)] = 24;

Here we create a bitset exactly the size of our class, making us able to use any standard container with our pair<int,int>.

Reason for this blog: I have used this trick for a somewhat long time, but it seems that not many programmers are aware of this method (+ I didn't find anything talking about this technique), so here it is :)

  • Vote: I like it
  • +172
  • Vote: I do not like it

»
11 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by Deepesson (previous revision, new revision, compare).

  • »
    »
    11 months ago, # ^ |
      Vote: I like it +11 Vote: I do not like it

    Also, I think it's worth mentioning that with a couple macros we can make our code really small:

    #include <bits/stdc++.h>
    #define CAST(X) std::bitset<sizeof(X)*8>
    #define ACCESS(X) *((std::bitset<sizeof(X)*8>*)&X)
    typedef std::pair<int,int> pii;
    std::unordered_map<CAST(pii),int> hashmap;
    
    int main(){
        pii object={53,21};
        hashmap[ACCESS(object)] = 24;
        std::cout<<hashmap[ACCESS(object)]<<"\n";
    }
    
»
11 months ago, # |
Rev. 2   Vote: I like it +43 Vote: I do not like it

Can use

*reinterpret_cast<bitset<...>*>(&object) 

for a bit more C++ flavor, instead of C style casting, because technically C style cast can do a bunch of unexpected things.

Also if you are going to use this trick, be aware that you are comparing the bit patterns of the objects, which might not necessarily be the same for equivalent objects. For example, you can't just compare the bit patterns of 2 std::string object to see if they are the same, you have to compare the data.

It might also be possible to create a template wrapper class to automatically do this. It might also be possible to just implement a bunch of structured bindings to compare objects field by field automatically, but it might have high overhead.

  • »
    »
    11 months ago, # ^ |
      Vote: I like it +11 Vote: I do not like it

    I have come to prove myself with the last part. The following code lexicographically compare any 2 objects of the same class. (Need C++20).


    #include <bits/stdc++.h> using namespace std; namespace details { struct anything { template <class T> operator T() const; }; template <class T, class Is, class = void> struct can_construct_with_N : std::false_type {}; template <class T, std::size_t... Is> struct can_construct_with_N<T, std::index_sequence<Is...>, std::void_t<decltype(T{(void(Is), anything{})...})>> : std::true_type {}; } // namespace details template <class T, std::size_t N> using can_construct_with_N = details::can_construct_with_N<T, std::make_index_sequence<N>>; namespace details { template <std::size_t Min, std::size_t Range, template <std::size_t N> class target> struct maximize : std::conditional_t<maximize<Min, Range / 2, target>{} == (Min + Range / 2) - 1, maximize<Min + Range / 2, (Range + 1) / 2, target>, maximize<Min, Range / 2, target>> {}; template <std::size_t Min, template <std::size_t N> class target> struct maximize<Min, 1, target> : std::conditional_t<target<Min>{}, std::integral_constant<std::size_t, Min>, std::integral_constant<std::size_t, Min - 1>> {}; template <std::size_t Min, template <std::size_t N> class target> struct maximize<Min, 0, target> : std::integral_constant<std::size_t, Min - 1> {}; template <class T> struct construct_searcher { template <std::size_t N> using result = ::can_construct_with_N<T, N>; }; template <typename> struct is_tuple : std::false_type {}; template <typename... T> struct is_tuple<std::tuple<T...>> : std::true_type {}; } // namespace details template <class T, size_t Cap> using construct_arity = details::maximize<0, Cap, details::construct_searcher<T>::template result>; template <typename T> int compare(const T& a, const T& b) { static constexpr size_t MAX_ARITY = 7; constexpr size_t arity = std::conditional_t<details::is_tuple<T>::value, tuple_size<T>, construct_arity<T, MAX_ARITY>>(); // tuples are weird // FOR_EACH MACRO #define PARENS () #define EXPAND(...) EXPAND4(EXPAND4(EXPAND4(EXPAND4(__VA_ARGS__)))) #define EXPAND4(...) EXPAND3(EXPAND3(EXPAND3(EXPAND3(__VA_ARGS__)))) #define EXPAND3(...) EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__)))) #define EXPAND2(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__)))) #define EXPAND1(...) __VA_ARGS__ #define FOR_EACH(macro, ...) __VA_OPT__(EXPAND(FOR_EACH_HELPER(macro, __VA_ARGS__))) #define FOR_EACH_HELPER(macro, a1, ...) macro(a1) __VA_OPT__(FOR_EACH_AGAIN PARENS(macro, __VA_ARGS__)) #define FOR_EACH_AGAIN() FOR_EACH_HELPER #define CHECK_FIELD(...) \ __VA_OPT__({ \ int res = compare(a_##__VA_ARGS__, b_##__VA_ARGS__); \ if (res) return res; \ }) if constexpr ((arity <= 1) || (arity == MAX_ARITY - 1)) { if (a < b) return -1; else if (a > b) return 1; } else if constexpr (arity == 2) { auto&& [a_0, a_1] = a; auto&& [b_0, b_1] = b; FOR_EACH(CHECK_FIELD, 0, 1); } else if constexpr (arity == 3) { auto&& [a_0, a_1, a_2] = a; auto&& [b_0, b_1, b_2] = b; FOR_EACH(CHECK_FIELD, 0, 1, 2); } else if constexpr (arity == 4) { auto&& [a_0, a_1, a_2, a_3] = a; auto&& [b_0, b_1, b_2, b_3] = b; FOR_EACH(CHECK_FIELD, 0, 1, 2, 3); } else if constexpr (arity == 5) { auto&& [a_0, a_1, a_2, a_3, a_4] = a; auto&& [b_0, b_1, b_2, b_3, b_4] = b; FOR_EACH(CHECK_FIELD, 0, 1, 2, 3, 4); } else { assert(false); // unhandled arity } return 0; } template <typename T> operator<(const T& a, const T& b) { return compare(a, b) == -1; } class Five { public: int a, b, c, d; string s; }; class FiveOfFive { public: Five a, b, c, d, e; }; int main() { // demo stuff std::cout << construct_arity<Five, 15>() << '\n'; std::cout << compare(4, 4) << '\n'; std::cout << compare(std::pair<string, int>("4a", 5), std::pair<string, int>("4a", 6)) << '\n'; std::cout << compare(std::tuple<int, int, int>(4, 5, 7), std::tuple<int, int, int>(4, 5, 6)) << '\n'; std::cout << compare(std::tuple<int, int, int, int, int>(4, 5, 7, 8, 9), std::tuple<int, int, int, int, int>(4, 5, 6, 8, 9)) << '\n'; // 9)) << '\n'; map<Five, int> m; m[{1, 2, 3, 4, "abc"}] = 12345; m[{2, 3, 4, 5, "def"}] = 23456; for (auto&& [key, value] : m) cout << value << '\n'; m[{1, 2, 3, 4, "abc"}] = 54321; for (auto&& [key, value] : m) cout << value << '\n'; map<FiveOfFive, int> mm; }

    The most significant part of the implementation is the construct_arity template, which I unashamedly copied from somewhere. This count the number of fields inside a class, by binary searching the amount of field and check if we can construct the object from that many field. This is all done at compile time, but there's some problem:

    • If the type is something like std::string, the code return the max value of binary search -1
    • If the type is std::tuple, the code return 3
    • And maybe many other things that I have yet to test.

    After that, you can see the compare function which return -1, 0, 1 when the lhs is less, greater than, or equal to the rhs.

    Currently the code need to manually handle the various arities. I have handled up to 5.

    It is probably possible (and maybe better) to just implement something similar to std::get (magic_get might be one such thing), then we don't have to handle everything manually. In the implementation, I have opted to copy the FOR_EACH macro.

    It's easy to see that this idea can also be expanded to a hashing function (just call std::hash in the base case, and call std::hash on the fields' resulting hashes otherwise).

    Overall, this was a nice exercise that wasted plenty of time for both me and those of you who read until now.

  • »
    »
    11 months ago, # ^ |
      Vote: I like it +10 Vote: I do not like it

    reinterpret_cast is not enough because it breaks strict aliasing rule. It's better to use bit_cast after c++20.

    • »
      »
      »
      11 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      That's true. I have not updated my things to the newest stuff.

»
11 months ago, # |
  Vote: I like it +13 Vote: I do not like it

Nice