nor's blog

By nor, 4 months ago, In English

Someone asked me about my template that used a "cache wrapper" for lambdas, so I decided to write a blog explaining how it works. For reference, here is a submission of mine from 2021 that uses that template.

Here's what you will find implementations for in this blog:

  • Generalized hashing (for tuple types, sequence types and basic types)
  • Convenient aliases for policy based data structures
  • Wrappers for recursive (or otherwise) lambdas that automatically do caching (memoization) for you.

Jump to the Usage section if you only care about the template, though I would strongly recommend reading the rest of the blog too since it has a lot of cool/useful things in my opinion.

If you know how to implement recursive lambdas


Motivation

There are two major ways in which dynamic programming is implemented — recursive and iterative.

Recursive DP has the following advantages:

  • It is (arguably) the cleaner way for beginners, and is often easier to reason about mathematically.
  • It is needed in problems where you want to only traverse the states required for computing your answer and no others.

However, many people switch to iterative DP as their default implementation, for the following reasons:

  • Recursive DP is sometimes slower compared to iterative DP when they compute the same number of states.
  • It is much easier to do push DP iteratively.
  • Recursive DP takes more boilerplate to write (something along the lines of "if computed already, return the stored value; otherwise, do the following").

The template mentioned at the start of this blog aims to tackle the third point by reducing boilerplate code: it automatically manages memoization for you.
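For concreteness, the boilerplate in question usually looks something like the following (a hand-rolled memoized Fibonacci; the names here are illustrative, not from the template):

```cpp
#include <unordered_map>

// Hand-rolled memoization: the "if computed already, return the stored
// value" dance has to be repeated in every recursive DP written this way.
std::unordered_map<long long, long long> memo;

long long fib(long long n) {
    if (n <= 1) return n;
    auto it = memo.find(n);
    if (it != memo.end()) return it->second;  // computed already: return stored value
    long long result = fib(n - 1) + fib(n - 2);
    memo[n] = result;  // store the result for future calls
    return result;
}
```

The cache lookup, the recursive computation, and the store step are identical in every such function, which is exactly what a generic wrapper can factor out.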

Implementation

In Python, you have nice things called decorators. For example, the functools.cache decorator is precisely what we would use for such a purpose:

from functools import cache
@cache
def factorial(n):
    return n * factorial(n - 1) if n else 1

If you call factorial(10), then the internal cache will be populated with the results of all the recursive calls that were required to compute factorial(10). As a result, you can call factorial(5) and it will just look it up in the internal cache and return the value to you without calling the function body.

So I thought, why not try to do the same thing in C++?

Let's get a couple of things out of the way first:

  • Note that Python dictionaries support heterogeneous types, but you can't do this without type erasure (in C++ or otherwise). So I decided to keep types homogeneous, i.e., the internal cache has keys of the same type. This was done to avoid performance hits, and it made sense because in competitive programming, people don't really use DP states that have heterogeneous types.
  • Functions in C++ can take arguments by value or by reference, while in Python, arguments are passed by object reference. For the purposes of DP, since it doesn't really make sense to make states mutable, we decide to pass everything by value (there are a few pointers in the Caveats section if you are worried about the performance implications of copying, but these are often not important, since DP states tend to be integers or other small types most of the time).

So let's start out by trying to design it.

We need the following two parts:

  1. A wrapper around a function that is able to store calls to the function and their results.
  2. Generalized hashing, to be able to store results.

Let's tackle the hashing problem first. Hashing arbitrary structs in C++ is quite hard (though not impossible). But what you can always do (for structs whose intent is just to store elements) is to supply a function that converts them to tuples (i.e., provide a type conversion operator). For our problem, we identify the most important types that we will ever need to hash:

  • Native integral types (including characters)
  • Sequence types (vectors, sets, maps)
  • Tuple types (tuples, pairs)

The last two are uniformly recursive on their inputs (except for vector<bool>, but our implementation doesn't need to treat it separately), so we can do recursive metaprogramming to be able to hash such types.

All in all, we need a hashing function that hashes the "base case" and a hash combination function that aggregates over the sequence/tuple types.

The implementation becomes something like this:

Spoiler
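For readers who skip the spoiler, here is a minimal sketch of the idea (this is not the exact code from the template; generic_hash and hash_combine are illustrative names): a base case that defers to std::hash, a combiner, and if constexpr dispatch that recurses over tuple-like and sequence types.

```cpp
#include <cstdint>
#include <functional>
#include <iterator>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Detect tuple-like types (tuple, pair) via std::tuple_size.
template <typename T, typename = void>
struct is_tuple_like : std::false_type {};
template <typename T>
struct is_tuple_like<T, std::void_t<decltype(std::tuple_size<T>::value)>> : std::true_type {};

// Detect sequence types (vector, set, map, string) via std::begin.
template <typename T, typename = void>
struct is_sequence : std::false_type {};
template <typename T>
struct is_sequence<T, std::void_t<decltype(std::begin(std::declval<T&>()))>> : std::true_type {};

// Boost-style hash combiner that aggregates element hashes.
inline std::uint64_t hash_combine(std::uint64_t seed, std::uint64_t h) {
    return seed ^ (h + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

template <typename T>
std::uint64_t generic_hash(const T& t) {
    if constexpr (is_tuple_like<T>::value) {
        // Recurse over tuple elements with a fold expression.
        std::uint64_t seed = 0;
        std::apply([&](const auto&... xs) {
            ((seed = hash_combine(seed, generic_hash(xs))), ...);
        }, t);
        return seed;
    } else if constexpr (is_sequence<T>::value) {
        // Recurse over the elements in iteration order.
        std::uint64_t seed = 0;
        for (const auto& x : t) seed = hash_combine(seed, generic_hash(x));
        return seed;
    } else {
        return std::hash<T>{}(t);  // base case: native/basic types
    }
}
```

The sketch deliberately omits corner cases (vector<bool>'s proxy references, for instance), which a production template would need to handle.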

Now for the hash table, we could just use std::unordered_map. However, it often turns out to be slower than the GNU policy based data structure gp_hash_table, so we also define some aliases that are helpful.

Spoiler
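Such aliases might look something like this (a sketch, not the exact template; hash_map and hash_set are illustrative names, and the mixing constants follow the well-known splitmix64 trick, which protects against anti-hash hacks):

```cpp
#include <cstdint>
#include <ext/pb_ds/assoc_container.hpp>

// A strong mixer (splitmix64-style) so adversarial inputs can't force collisions.
struct custom_hash {
    std::size_t operator()(std::uint64_t x) const {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }
};

// Convenient aliases over the GNU policy based hash table.
template <class Key, class Value, class Hash = custom_hash>
using hash_map = __gnu_pbds::gp_hash_table<Key, Value, Hash>;

// Mapping to null_type turns the table into a set.
template <class Key, class Hash = custom_hash>
using hash_set = hash_map<Key, __gnu_pbds::null_type, Hash>;
```

gp_hash_table supports the familiar operator[], find and erase, so it usually drops in as a replacement for std::unordered_map with little code change.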

Now the only thing that remains is to actually decide the design of the wrapper.

We make the following choices:

  • As mentioned above, we won't support reference semantics for now.
  • We can choose to use either lambdas or plain old functions. Functions can't "capture" any state, and thus you need to make things global for using them with functions (or pass them as references, which we already disallowed). Hence, we will support lambdas and not functions for the sake of generality.
  • We will choose to implement the wrapper as a struct that stores the lambda as well as the cache (hash table mapping tuples of input arguments to return values) inside it, and has its own operator() (call operator) that looks into the cache and returns the answer if found, else calls the lambda recursively and updates the cache (both recursively and for the current call).

This is done as follows:

Spoiler
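In case the spoiler is unavailable, here is a stripped-down sketch of such a wrapper (illustrative names, not the exact template; it uses std::map as the cache so the sketch does not depend on the hashing machinery, whereas the real template uses a hash table):

```cpp
#include <map>
#include <tuple>
#include <utility>

template <class Signature, class F> struct cached_fn;

// Specialization that picks the signature apart into return and argument types.
template <class R, class... Args, class F>
struct cached_fn<R(Args...), F> {
    F f;                                      // the user's lambda
    std::map<std::tuple<Args...>, R> cache;   // argument tuple -> result

    R operator()(Args... args) {
        auto key = std::make_tuple(args...);
        auto it = cache.find(key);
        if (it != cache.end()) return it->second;  // cache hit
        R result = f(*this, args...);  // pass ourselves in so recursion goes through the cache
        cache.emplace(std::move(key), result);
        return result;
    }
};

template <class Signature, class F>
auto use_cache(F f) {
    return cached_fn<Signature, F>{std::move(f)};
}
```

The lambda receives the wrapper itself as its first argument, so every recursive call made through self goes back through operator() and hence through the cache.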

Usage

The usage is very simple.

Here's the whole template for reference:

Template

Let's first try to replicate the Python example from above:

auto factorial = use_cache<int(int)>([](auto&& self, int n) -> int {
    if (n) return n * self(n - 1);
    else return 1;    
});
std::cout << factorial(10) << '\n';

You can do more complicated things with this template too. Let's say you want to have a recursive function that takes a char and pair<int, string> as a DP state, and stores a bool. It also requires values of some array you have taken as input, so you need to capture it as well.

You write it like this:

vector<int> v;
// read v
auto solve = use_cache<bool(char, pair<int, string>)>([&](auto&& self, char c, pair<int, string> p) -> bool {
    // note that v was captured by &
    // use c, p, v as needed
    // let's say you have c1 and p1 as arguments for a recursive call
    auto x = self(c1, p1); // recursive call - do as many of them as needed
    // use c, p, v as needed
    return y; // return something
});
std::cout << solve('a', pair{1, "hello"s}) << '\n';

Caveats

The most important caveat is that since it uses a hash-table under the hood, you will often miss the performance that you get from just using multidimensional arrays. You can adapt the template to do things like that, but I didn't, for the sake of clarity and ease of implementation.

Since it doesn't make a lot of sense to store function calls with arguments that can change over time, the template doesn't support lambdas that take things by reference either. This is done in order to avoid any possible bugs that might come with references being modified on the fly somewhere in your code.

However, if for some reason (maybe for performance you want const refs of large structs/vectors instead of copies) you really want to use references, you can remedy that by doing the following:

  • Wrap your arguments in std::reference_wrapper (for example, use std::reference_wrapper<int> rather than int&) — this makes a reference behave like a value (it is copyable, for example).
  • Provide a hashing implementation for std::reference_wrapper<T>.
  • Avoid reference invalidation, i.e., the hash table should store values and not references, because references to temporary variables get invalidated during the run of the algorithm.
  • Handle any other issues that might crop up, on your own.

Do let me know if you know of a better way of doing these things, and if there are any errors that might have crept into this blog!


»
4 months ago, # |

It's nice to see something like @lru_cache implemented in C++. Thanks very much, the blog is great!

Btw, what is the version of C++ that you used?

  • »
    »
    4 months ago, # ^ |

    I'm glad to know that you liked it! I used C++20 when I submitted a few years back, but it works in C++17 too (relevant submission).

  • »
    »
    4 months ago, # ^ |

In C++23, you could use 'deducing this' for lambda recursion.

    • »
      »
      »
      4 months ago, # ^ |

      By the way, the technique we use for recursive lambdas in C++17-20 is just the continuation passing style, which is also how the cache in this blog is designed.

      C++23 makes it a language feature (in the form of syntactic sugar), but it still remains a special case. It will definitely help and encourage more people to write recursive lambdas, though.

»
4 months ago, # |

"most important types that we will ever need to hash: ... sets, maps ..."

It is worth noting that for equal unordered_sets, this hasher can return different values.

  • »
    »
    4 months ago, # ^ |

    Thanks for pointing this out. I used the phrase "sequence types" (and not container types) for this precise reason. The (unqualified) default set and map in C++ are ordered and will always give the same hash.

    The implementation, however, doesn't prevent you from using unordered versions of these, so you should indeed take care that you're not plugging them in anywhere.
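    To make the distinction concrete, here is a small illustration (fold_hash is an illustrative order-sensitive hasher, not from the template): equal std::sets always iterate in sorted order, so folding over the iteration order is well-defined for them, whereas equal unordered_sets may iterate in different orders and thus produce different folds.

```cpp
#include <cstddef>
#include <functional>
#include <set>

// Order-sensitive fold over a container's iteration order.
template <class C>
std::size_t fold_hash(const C& c) {
    std::size_t seed = 0;
    for (const auto& x : c)
        seed ^= std::hash<typename C::value_type>{}(x) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    return seed;
}
```

For std::set, insertion order does not matter: equal sets iterate identically, so the fold is a valid hash. No such guarantee exists for unordered containers.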