Writing a memoize function from scratch
2020-10-14
Intro
In this tutorial we'll build our own memoize
function from scratch.
You might be (or might not be!) familiar with the term "memoize". To give a quick tl;dr, "memoize" is kinda a fancy word for "cache". Or to put it a different way, memoizing something is a way of adding caching to it.
When we "memoize" something we cache the results of a function based on its arguments. This can be useful when you have a function that's very performance-intensive.
Chances are you don't need to write your own memoization function for working in most codebases. If you're a lodash
user, there's _.memoize
. I've also used memoize-one
in the past.
But reimplementing things from scratch has other benefits—it's helpful for me when I want to solidify my knowledge of a concept. So here we go, here's memoize
from scratch!
The "specs"
Before we dive right in on coding our memoize
function, let's take a moment and draw up some specifications.
This is how we want memoize
to look and behave:
const sumBelow = memoize(someReallyExpensiveFunction);
// returns `499999999067109000` after delay
sumBelow(1e9);
// returns `1999999998067114000` after delay
sumBelow(2e9);
// immediately returns `499999999067109000`
sumBelow(1e9);
To further flesh out our "acceptance criteria", here's the things that memoize
needs to do:
memoize
needs to accept a function as an argumentmemoize
needs to return a new function that can be used with the same arguments as the original function.When the memoized function is called, it will run the original function.
However, if the memoized function is called again with the same arguments as a previous call, the result of the first call will be returned immediately.
Writing the function
Alright, let's dive into the code for our memoize
function.
First, we'll need to set up memoize
as a function that accepts a function as an argument. (If you're curious, these are often referred to as higher-order functions).
For now, we'll just have memoize
call the original function with the arguments that were passed to it.
const memoize = (fn) => {
return (...args) => {
const result = fn(...args);
return result;
};
};
As we go through this tutorial we can use the following code to test whether our memoize function is working:
// ...memoize function up above
const sumBelow = (limit) => {
const sum = 0;
for (var i = 1; i < limit; i++) {
sum += i;
}
return sum;
};
const memoSumBelow = memoize(sumBelow);
// Returns after a slight delay
memoSumBelow(1e6);
// Also returns after a small delay
memoSumBelow(2e6);
// Should return immediately!
memoSumBelow(1e6);
If we run this with our current memoize
function, we'll see that memoSumBelow
returns the correct value each time, but it doesn't return immediately on the third call (the second one with 1e6
).
Let's add a cache of calls to memoize
so we can make that third call return instantly ⚡️
We'll start by just setting up the cache. We'll use a JavaScript Map
to store the results.
const memoize = (fn) => {
const cache = new Map();
return (...args) => {
const result = fn(...args);
return result;
};
};
It's important that const cache
is outside of the function that we return
. (FYI, this is referred to as a closure if you're interested).
Next, we'll want to create a cache key and put the result of fn(...args)
in our newly created cache. For now we'll use the first argument as a cache key.
const memoize = (fn) => {
const cache = new Map();
return (...args) => {
const cacheKey = args[0];
const result = fn(...args);
cache.set(cacheKey, result);
return result;
};
};
Now there's one thing left to do—we check the cache to see whether the key exists. If it does, we can grab the item out of the cache instead of running fn
. If it doesn't, we run fn(...args)
and cache the result for next time.
const memoize = (fn) => {
const cache = new Map();
return (...args) => {
const cacheKey = args[0];
if (cache.has(cacheKey)) {
return cache.get(cacheKey);
}
const result = fn(...args);
cache.set(cacheKey, result);
return result;
};
};
Bonus: writing unit tests 🤓
Let's write some tests for our memoize
function. I'll be using Jest in these examples.
Before, let's take a step back and brainstorm about the things we want to test. We can use our memoize
specifications that we wrote earlier along with test.todo
.
// memoize.test.js
test.todo('should run the function if it has not been called with the same arguments');
test.todo('should return a cached result if the function has been called with the same arguments');
In the first test, we'll run our memoized function with arguments that it has not been called with. We'll then verify that the underlying fn
argument got called correctly.
First, let's remove test.todo
and replace it with an actual test. At the top of our test we'll create a new mock function using jest.fn()
.
In this case, we don't need something complicated or performance-intensive—we're just testing that the function gets called, so a simple plus1
function will do. We'll pass it into memoize
.
// memoize.test.js
test('should run the function if it has not been called with the same arguments', () => {
const plus1 = jest.fn();
const memoizedFn = memoize(plus1);
});
If you're familiar with the Arrange-Act-Assert pattern for writing tests, this bit we just did is the "Arrange" portion of the test.
Next, let's call plus1
(btw, this is the "Act" portion of Arrange-Act-Assert).
// memoize.test.js
test('should run the function if it has not been called with the same arguments', () => {
const plus1 = jest.fn((a) => a + 1);
const memoizedFn = memoize(plus1);
const result = memoizedFn(1);
});
Finally, we can add our assertion.
// memoize.test.js
test('should run the function if it has not been called with the same arguments', () => {
const plus1 = jest.fn((a) => a + 1);
const memoizedFn = memoize(plus1);
const result = memoizedFn(1);
expect(result).toEqual(2);
expect(plus1).toHaveBeenCalledTimes(1);
expect(plus1).toHaveBeenCalledWith(1);
});
In this test we want to assert three things about our memoize
function:
That
memoizedFn
returned the correct value (.toEqual(2)
in our case).That
memoizedFn
was only called once (.toHaveBeenCalledTimes(1)
). This will matter more in the next test.That
memoizedFn
actually was called with an argument of1
(.toHaveBeenCalledWith(1)
).
For the second test, the first portion of the test looks exactly the same:
// memoize.test.js
test('should return a cached result if the function has been called with the same arguments', () => {
const plus1 = jest.fn((a) => a + 1);
const memoizedFn = memoize(plus1);
});
However, we'll call memoizedFn
slightly differently. We're going to call it three times so that we can test that the cache is working. We'll call it first with 1
, then once with 2
, then again a third time with 1
.
// memoize.test.js
test('should return a cached result if the function has been called with the same arguments', () => {
const plus1 = jest.fn((a) => a + 1);
const memoizedFn = memoize(plus1);
memoizedFn(1);
memoizedFn(2);
const result = memoizedFn(1);
});
Finally, we can write our assertions. We want to make sure that plus1
isn't called the second time.
// memoize.test.js
test('should return a cached result if the function has been called with the same arguments', () => {
const plus1 = jest.fn((a) => a + 1);
const memoizedFn = memoize(plus1);
memoizedFn(1);
memoizedFn(2);
const result = memoizedFn(1);
expect(result).toEqual(2);
expect(plus1).toHaveBeenCalledTimes(2);
});
For this test we only want to assert two things:
That
memoizedFn
still returns the correct result (.toEqual(2)
).That
plus1
was only called twice (once with1
, once with2
, the third time hit the cache).
Bonus BONUS: custom cache keys 🤯
As a second bonus, let's add the ability to customize how the cache key is created.
Currently, memoize
just takes the first argument and uses that as a cache key. This is good enough for a lot of functions, but sometimes you need something a little more robust.
We'll add an optional second argument to memoize
called getCacheKey
. As a default we're going to use the first argument, like we had before.
const defaultGetCacheKey = (...args) => args[0];
const memoize = (fn, getCacheKey = defaultGetCacheKey) => {
const cache = new Map();
return (...args) => {
const cacheKey = args[0];
if (cache.has(cacheKey)) {
return cache.get(cacheKey);
}
const result = fn(...args);
cache.set(cacheKey, result);
return result;
};
};
Then, all that's left to do is swap const cacheKey = args[0]
for our new function.
const memoize = (fn, getCacheKey = (...args) => args[0]) => {
const cache = new Map();
return (...args) => {
const cacheKey = getCacheKey(...args);
if (cache.has(cacheKey)) {
return cache.get(cacheKey);
}
const result = fn(...args);
cache.set(cacheKey, result);
return result;
};
};
Now, instead of only using the first argument to the memoized function as a cache key, we can use whatever we want!
For example, instead of using the first argument to add
, we can create a string containing both numbers.
const add = (a, b) => a + b;
const memoizedAdd = memoize(add, (a, b) => `${a}-${b}`);
memoizedAdd(1, 3);
memoizedAdd(1, 2);
memoizedAdd(1, 4);
// hits the cache for the key "1-2"
memoizedAdd(1, 2);
Finally, let's add a test for this new functionality.
// memoize.test.js
test('should allow customization of the cache key', () => {
const add = jest.fn((a, b) => a + b);
const getCacheKey = (a, b) => `${a}-${b}`;
const memoized = memoize(add, getCacheKey);
memoized(1, 3);
memoized(1, 2);
memoized(1, 4);
const result = memoized(1, 2);
expect(result).toEqual(3);
expect(add).toHaveBeenCalledTimes(3);
});
In this test we set up our memoized function and then run it three times with fresh input. In each of these first three calls, the first argument is the same. This way we can make sure that the cache key is actually changing.
On the fourth time we run the memoized function with arguments that it has already received before.
Lastly, we make our assertions—we'll assert that the fourth time returns the correct result (3
and not the result of memoized(1,3)
), and we'll assert that add
only got called 3 times.
Conclusion
I hope that this was helpful! I know for myself, I learn a ton when I reimplement stuff that I commonly use from scratch.
Feel free to reach out to me on Twitter or my LinkedIn (reference this article so I know you're not a LinkedIn bot 😱 ).