Restoring React Reducer State Across Browser Sessions

This is a story about how we used io-ts to save React useReducer state for later. Saved state may be restored months later, in a different browser, while both the React code and the data used by the reducer have continued to change over time. The solution described here is in production on the site you're looking at. First, a short description of the application itself:

Execute Program is an interactive learning platform made up of courses (like TypeScript or Regular Expressions). Each course is made up of many lessons (like Generic function types or Character classes). And each lesson is made up of interactive code examples mixed with paragraphs of text.

As a user goes through a lesson, they reveal one example or paragraph after another. We call those "steps": each code example is a step and each paragraph is a step. When they reach the last step, the client code tells the server that the user has finished the lesson.

Here's the problem:

  1. A user starts a long lesson that might take 20 minutes to finish.
  2. They close the browser tab for some reason.
  3. The next day, they go back to the lesson.
  4. Now they have to start from the beginning of the lesson.

Here's a video of our solution in action. We advance through a paragraph and an initial, non-interactive code example. Then we reload the page and choose to resume the lesson. Then we reload the page again and choose to restart the lesson.

Let's examine a series of possible solutions, beginning with the simplest.

Solution 1: Store the current step index.

React's useReducer manages large state objects with complex transitions. The developer defines how the object will change in response to user actions. For example, when we clicked the "Continue" button in the example above, our reducer incremented the step index by 1.

In our lessons, the reducer state contains the full step objects as retrieved from the API, including the actual code and paragraph text. It also contains various metadata, including an index variable: a number representing the current step's index in the step array. When the user finishes a step – for example, by completing a code example – we increment the index and the UI shows the next step.

The first possible solution is to save the step index somewhere: maybe on the server, or maybe in the browser's localStorage. This works... but only if our lessons never change.

Our lessons change almost every day as we tweak them in response to user feedback. If we insert, delete, or reorder steps, then the stored index will be wrong. The user will get to a certain point in a lesson, then leave, then come back, and the lesson will resume at the wrong step. We need a solution that always resumes at the correct point in the lesson.
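
To make the failure concrete, here is a minimal sketch. The Step type and the lesson content are hypothetical and much simpler than our real step objects; the only point is that a saved index refers to a different step once the array has been edited.

```typescript
// Hypothetical, simplified step shape; real step objects hold more data.
type Step = { kind: 'paragraph' | 'example'; text: string }

const original: Step[] = [
  { kind: 'paragraph', text: 'Intro' },
  { kind: 'example', text: '1 + 1' },
  { kind: 'example', text: '[1, 2].length' },
]

// The user stopped at the third step, so we saved its index.
const savedIndex = 2

// Later, we insert a new paragraph right after the intro.
const edited: Step[] = [
  ...original.slice(0, 1),
  { kind: 'paragraph', text: 'A new explanation' },
  ...original.slice(1),
]

// The same index now points at an earlier step than the one the user reached.
const stepWhenSaved = original[savedIndex]
const stepWhenResumed = edited[savedIndex]
```

Here stepWhenSaved is the "[1, 2].length" example, but stepWhenResumed is the "1 + 1" example, so the user would be sent back a step.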

Solution 2: Invalidate the step index when the lesson changes.

We can store the step index on the server or in the browser's local storage, but also have a way to invalidate it when the lesson content changes. The easiest solution here is a hash over the entire lesson's content: all of the code examples and all of the explanatory text. We store the hash alongside the step index.

Later, when a user returns to the lesson, we compare the stored hash against the current lesson hash. If they match, we offer to resume the lesson. Any content change will result in a different hash, so we'll ignore the stored step index whenever the lesson changes. This works... but it only restores the step index. What about the other metadata that the client gathers during the lesson?
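
A minimal sketch of that idea follows. Everything in it is an assumption made for illustration: the Step shape, the function names, and the use of Node's crypto module with SHA-256 (browser code would need a different hashing API).

```typescript
import { createHash } from 'node:crypto'

// Hypothetical, simplified step shape.
type Step = { kind: 'paragraph' | 'example'; text: string }

// What we would store: the index plus a hash of the lesson as it was then.
type SavedProgress = { stepIndex: number; lessonHash: string }

// Hash every step's content. Mapping to tuples keeps the serialized form
// independent of object key order.
function hashLesson(steps: readonly Step[]): string {
  const content = JSON.stringify(steps.map((step) => [step.kind, step.text]))
  return createHash('sha256').update(content).digest('hex')
}

function saveProgress(steps: readonly Step[], stepIndex: number): SavedProgress {
  return { stepIndex, lessonHash: hashLesson(steps) }
}

// Returns the saved index only if the lesson content is unchanged.
function resumeIndex(
  steps: readonly Step[],
  saved: SavedProgress
): number | undefined {
  return saved.lessonHash === hashLesson(steps) ? saved.stepIndex : undefined
}
```

Any edit to any step, however small, changes the hash, so resumeIndex returns undefined and the lesson starts over.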

We care about that metadata! For example, we want to know how many attempts the user made at each code example. Aggregating those numbers can show us when certain examples are too difficult.

We need a solution that preserves that metadata in addition to the step index. One idea is to pull out the pieces of the reducer state that matter and store only them. That would include the exact content of the lesson (or a hash of it), the number of attempts the user made at each code example, etc.

Solution 3: Store the entire reducer state.

As it turns out, we care about most of the data in the reducer state; otherwise we wouldn't be tracking it! The easiest solution is to store the entire reducer state, then restore it later when the user resumes the lesson.

Storing the reducer state brings up another data mismatch problem. Last time, the mismatch was in the lesson content. Now, the mismatch is in the schema or structure of the reducer's state object: the actual keys in the state, and the particular types of data in those keys.

Here's the problem that will occur:

  1. The user does part of a lesson.
  2. We store the reducer state.
  3. A month passes. We make code changes that change the structure of the reducer state: adding and removing keys, changing data types, etc.
  4. The user comes back and resumes the lesson.
  5. We restore the reducer state.
  6. Now the user is running a new version of our app, which expects the new reducer state data, but we just restored the old reducer state data.
  7. The app crashes (or worse).

We need a solution that can ignore any saved reducer state that was created by an older, incompatible version of the client-side code.

Solution 4: Version the reducer state.

The obvious solution is to add a version number to the saved state data. When we make an incompatible change to the reducer state's structure, we change the version number. When restoring an old reducer state, we compare its version number against the current code's version number. If they don't match, we ignore the saved state and start from the beginning of the lesson.
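
As a rough sketch, assuming a hypothetical STATE_VERSION constant and a drastically simplified State type, manual versioning might look like this:

```typescript
// Hypothetical version constant, bumped by hand on incompatible changes.
const STATE_VERSION = 3

// Hypothetical, simplified reducer state.
type State = { index: number; attempts: number[] }

type VersionedState = { version: number; state: State }

function serializeState(state: State): string {
  const versioned: VersionedState = { version: STATE_VERSION, state }
  return JSON.stringify(versioned)
}

// Returns the saved state only when it was written by the current version.
// The cast trusts the data completely: nothing here checks that the state
// really has the current structure. For brevity, this sketch also assumes
// that the input is valid JSON.
function restoreState(serialized: string): State | undefined {
  const parsed = JSON.parse(serialized) as VersionedState
  return parsed.version === STATE_VERSION ? parsed.state : undefined
}
```

The comparison itself is trivial. The weakness is that the only link between STATE_VERSION and the actual structure of State is a human remembering to update one when the other changes.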

Unfortunately, humans are bad at remembering to do things. It's likely that we'd change the reducer state's structure at some point, but forget to change the version number. This is especially likely because the state contains the lesson content itself, which comes from our API. We'd have to remember to increment the version number when changing the state, but also whenever we change any of the related API endpoints. That's too error-prone.

We could automate the version number. For example, we could hash the source files that define the relevant parts of the API, as well as the file that defines the reducer state, and use that hash as the "version number". But that has its own problems.

First, even trivial changes like adding or removing whitespace would invalidate all lesson states for all users. Second, what happens if we break one of those API files into two different files? If we forget to add those two new files to our hashing scheme, then the system won't know to invalidate saved states when they change. Hashing the source files would only provide the illusion of safety; in reality, it's probably more error-prone than a manual version number.

We need a solution that can detect changes to the structure of the state data only when that structure genuinely changes, and regardless of why it changed.

Interlude: An introduction to io-ts.

Execute Program is written in TypeScript. It already uses io-ts to do runtime validation of API data. For example, when an API request comes in, we use io-ts to check that all of the keys are present; that keys that should be numbers are numbers; that arrays are arrays; etc.

Our reducer problem is similar to the API problem. In an API, we have network data coming in from another computer. It could be coming from a malicious attacker, or it could be coming from a future version of ourselves who made a mistake and sent the wrong data to our own API. We want to detect those problems and throw an error.

In our reducer state problem, we also have data coming in from "another" computer. In this case, that computer is the user's own computer from some point in the past: the state was saved a month ago and it's being resumed now. Like with the API, we need to check the structure of the data: make sure that all expected keys are present; that keys that should be numbers are numbers; etc. We can use io-ts to solve this, just as we do with our API.

Here's a quick primer on how io-ts works. Suppose that our API has a register endpoint for registering new accounts. It accepts an email address. We can define an io-ts codec that validates the incoming data:

import * as t from 'io-ts'
import { isRight } from 'fp-ts/lib/Either'

const register = t.strict({
  email: t.string,
})

Then, we can use it to check for whether some data matches our codec:

// This returns true.
isRight(register.decode(
  {email: 'amir@example.com'}
))

// Each of these returns false.
isRight(register.decode(
  {email: 1234}
))
isRight(register.decode(
  {email: undefined}
))
isRight(register.decode(
  {userEmail: 'amir@example.com'}
))
isRight(register.decode(
  {}
))
isRight(register.decode(
  null
))

This is how Execute Program checks API data. We define io-ts codecs for every API endpoint. Then the server uses decode and isRight to validate all incoming and outgoing data.

Some of our API data is complex: objects nested in arrays nested in other objects nested in yet other objects, etc. We break those complex io-ts codecs into smaller, more manageable pieces, which are sometimes shared between multiple endpoints.

There's one more important fact about io-ts. We're in TypeScript, so we want static types for our data. For our register codec above, that might look like:

type Register = {
  email: string
}

But that duplicates our io-ts codec. Instead of writing the type manually, we can extract the exact type above from the io-ts validator:

type Register = t.TypeOf<typeof register>

That will look quite arcane if you're new to TypeScript. But this pattern always looks the same when extracting a static type from an io-ts codec, so it becomes routine with some TypeScript and io-ts experience. One short line of code using t.TypeOf can replace hundreds of lines of duplicated type definitions.

(Full disclosure: doing this sometimes makes type error messages more confusing, but the details of that are out of scope here. In some situations, we do define the types explicitly, effectively duplicating the io-ts codec definitions. It's annoying, but the type system will keep us honest: it will error if our types don't match the codec's structure. These definitions are used all over the system but change rarely, so a bit of extra code in a complex codec is often a good trade-off to get better error messages.)

To summarize:

  • io-ts lets us validate data at runtime, ensuring that it has the structure that we expect.
  • We can extract static types from our io-ts codecs, so we don't have to define the types twice unless we choose to.

Solution 5: Validate the reducer state with io-ts.

Now we can talk about our chosen solution for the lesson resuming problem.

We define an io-ts codec that describes the structure of the reducer state. Remember that the reducer state contains the steps (the code examples and paragraphs), which come from the API. Our entire API is defined using io-ts, so we can reuse all of those codecs. Excluding imports and whitespace, the io-ts codecs for our reducer state are 66 lines long.

If we ever change the reducer state's structure, we'll be forced to update the io-ts codec. That's because the TypeScript type system checks all of this: the io-ts codec must match the reducer state types, and the reducer code in our React components must also match those types.
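
Here is a much smaller sketch of that arrangement. The field names (kind, text, attempts, index) and the single "continue" action are hypothetical stand-ins; our real codecs and reducer are larger.

```typescript
import * as t from 'io-ts'

// Hypothetical API codec for one step; the real API codecs are richer.
const apiStep = t.strict({
  kind: t.union([t.literal('paragraph'), t.literal('example')]),
  text: t.string,
})

// Reducer steps are API steps plus reducer-only metadata.
const reducerStep = t.intersection([
  apiStep,
  t.strict({ attempts: t.number }),
])

const state = t.strict({
  index: t.number,
  steps: t.array(reducerStep),
})

// The static type is extracted from the codec, so the two can't drift apart.
type State = t.TypeOf<typeof state>

// Hypothetical action; a real reducer handles many more.
type Action = { type: 'continue' }

// The reducer is typed against State. A structural change to the state has
// to be made in the codec, and the compiler then re-checks this function.
function reducer(current: State, action: Action): State {
  if (action.type === 'continue') {
    return { ...current, index: current.index + 1 }
  }
  return current
}
```

If we renamed index in the codec, the reducer above would stop compiling until it was updated to match, which is exactly the pressure we want.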

Now we can formulate a solution to the problem of deciding whether a lesson state came from an old, incompatible version of the application:

  1. Deserialize the JSON-encoded state. If it's not legal JSON, abort.
  2. Check the deserialized value against our reducer state io-ts codec. If it doesn't conform to the io-ts codec, then it's from an older, incompatible version of the code. Abort.
  3. If we get here then the state is compatible!
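
The steps above can be sketched as two small functions. The function bodies and the tiny state codec here are assumptions written for this article, not our production code, and "abort" is modeled as returning undefined.

```typescript
import * as t from 'io-ts'
import { isRight } from 'fp-ts/lib/Either'

// Hypothetical, much smaller stand-in for the real reducer state codec.
const state = t.strict({
  index: t.number,
  steps: t.array(t.strict({ text: t.string, attempts: t.number })),
})
type State = t.TypeOf<typeof state>

// Step 1: parse the JSON, returning undefined instead of throwing.
function deserializeJson(serialized: string): unknown {
  try {
    return JSON.parse(serialized)
  } catch {
    return undefined
  }
}

// Step 2: check the parsed value against the codec. An undefined input
// fails decoding too, so a failure in step 1 passes straight through.
function validateStateType(value: unknown): State | undefined {
  const result = state.decode(value)
  return isRight(result) ? result.right : undefined
}
```

For example, validateStateType(deserializeJson('not json')) is undefined, and so is the result for a well-formed JSON string whose index is a string rather than a number.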

Now we have to solve the second problem: what if the lesson content has changed? Fortunately, the reducer state contains all of the steps (the code examples and paragraphs) as they existed when the state was saved. The client also has the current examples and paragraphs: it's showing a lesson, so it just got all of the lesson content from the API. We can compare the two to decide whether the content has changed since this state was saved:

  1. Get the steps array from the reducer state.
  2. Use the API's io-ts codecs to "narrow" those steps to only the keys that exist in the API. This removes the metadata fields added by our reducer. (In addition to validating data, io-ts codecs can narrow data down, which removes any object keys that are in the data but aren't part of the codec.)
  3. Use lodash's isEqual to do a deep comparison of the saved state's steps and the current API steps.
  4. If they're not equal, then the content has changed since this state was saved, so abort.
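
A sketch of that comparison follows, with some loud assumptions: the step codec and State type are hypothetical, and Node's isDeepStrictEqual stands in for lodash's isEqual so that the example has fewer dependencies.

```typescript
import * as t from 'io-ts'
import { isRight } from 'fp-ts/lib/Either'
import { isDeepStrictEqual } from 'node:util'

// Hypothetical API step codec; strict codecs strip unknown keys on decode.
const apiStep = t.strict({ kind: t.string, text: t.string })
const apiStepArray = t.array(apiStep)
type ApiStep = t.TypeOf<typeof apiStep>

// Hypothetical reducer state: API steps plus reducer-only metadata.
type State = {
  index: number
  steps: ReadonlyArray<ApiStep & { attempts: number }>
}

function validateSteps(
  currentSteps: readonly ApiStep[],
  saved: State | undefined
): State | undefined {
  // Pass along failures from earlier validation stages.
  if (saved === undefined) return undefined

  // Narrow the saved steps to the API's keys, dropping reducer metadata.
  const narrowed = apiStepArray.decode(saved.steps)
  if (!isRight(narrowed)) return undefined

  // Deep-compare the saved content against the current content.
  // (Stand-in for lodash's isEqual.)
  return isDeepStrictEqual(narrowed.right, currentSteps) ? saved : undefined
}
```

The saved state is returned untouched when the content matches; the narrowed copy is only used for the comparison, so the metadata survives.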

This is a full solution to the problem! The type system ensures that we never create invalid reducer states in our normal React code. The io-ts codecs ensure that we never restore an old state with the wrong structure. And checking the saved steps against the current steps ensures that we don't confuse the user or the app by restoring to the wrong point.

Here's the function that loads the state. It takes a string that (hopefully) contains a JSON-serialized state.

export function loadSerializedState(
  apiSteps: readonly ApiStep[],
  serializedModelState: string
): State | undefined {
  return validateSteps(
    apiSteps,
    validateStateType(
      deserializeJson(serializedModelState)
    )
  )
}

The three functions called here are quite short. Excluding imports and comments, all four functions (loadSerializedState plus the three that it calls) sum to 38 lines of code.

Each of the three functions called here will return undefined if something doesn't match, and each of them passes along an undefined that it receives from an earlier stage. That allows us to write the top-level loadSerializedState function in this simple way.

If loadSerializedState returns an object, then we know it's a valid reducer state that corresponds to the current lesson content and the current client-side code. We feed it into useReducer, and React uses it without knowing or caring that it was created a month ago!

It's important to note that this is a conservative solution. Suppose that we make a typo in the lesson content: we write "proimse" instead of "promise". When we fix that typo, the current lesson content will no longer match any of the saved states for that lesson. Every saved state for that lesson is immediately invalidated.

In the future, we may try to detect trivial content changes and allow those saved states to be restored. But the perfect is the enemy of the good, and we get most of the benefit without doing any deep content comparison.

Where do we store the state?

Now that we can save and restore our reducer state, there's one more question: where do we store it? The initial plan was to store it in the browser's local storage. But that has a big problem. If a user begins a lesson on their phone, then later tries to resume it on their laptop, the state won't be there. The laptop's browser can't see the saved state on the phone.

Our actual solution is to store the state in our server-side database. This feels wrong at first glance: we're storing a client-side React component's state as an opaque JSON blob in our server-side database, and we're doing that knowing that some of those JSON blobs will go out of date and be unusable in the future. But io-ts and the type system keep us honest here! And this is a convenience feature, so we can always change it later, even if that means throwing away all of the saved states. (When a user finishes a lesson, that record is stored in a separate part of the database.)

A simple solution

It took a lot of words to describe this solution, but that's because there was a lot of background to explain. In reality, I went almost directly to the io-ts solution because I could see that this problem was analogous to the API validation problem.

This code change was quite simple: it added 470 lines of code containing 1,102 words. (This article contains almost three times as many words!) Those 470 lines include the new modal UI for resuming, the new API endpoint, the API server handler, the database migration, and the database model class, all of which would exist with or without this solution based on io-ts.

Our tools are a good match for our problems, which allows us to solve problems like this smoothly:

  • React's useReducer represents states as simple JavaScript objects, so we can serialize them to JSON and bring them back later.
  • Defining our API's structure with io-ts makes it easy to validate objects that contain API data.
  • TypeScript statically guarantees that our system's data all has the correct shape. If io-ts says that the restored state object is correct, we can trust TypeScript to keep other parts of the system honest.

If you're interested in trying Execute Program to see this in action, we recommend the TypeScript courses or the Regular Expressions course. Please use the feedback item in the menu to tell us what you think!

This post was written by the Execute Program team. Execute Program teaches TypeScript, JavaScript, Regular Expressions, SQL, and more using thousands of interactive code examples. It has an integrated spaced repetition system to ensure that you don't forget what you've learned!