What’s the point of generators and controls in @wordpress/data?

At the end of the Motivation for Thunks post we arrived at a thunk function that fetches stuff from a REST endpoint and stores it into state by dispatching an action:

function fetchFeatures() {
  return async ( { dispatch } ) => {
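    // window.fetch resolves to a Response; the JSON body has to be parsed separately
    const response = await window.fetch( '/features' );
    const { features } = await response.json();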
    dispatch.receiveFeatures( features );
    return features.length;
  };
}

This is a good JavaScript function that’s going to do the fetching and receiving, and the return value from the thunk is available as the return value from the dispatch call (asynchronously):

const count = await dispatch( 'features' ).fetchFeatures();
console.log( `fetched ${ count } features` );

It all works perfectly. But! For a functional programmer, the fetchFeatures function has a very serious issue: it’s not a pure function. Instead of just returning a value and nothing else, it performs side effects like calling window.fetch or dispatch.receiveFeatures. In a purely functional language like Haskell or Elm, you couldn’t do this at all. So, what if we wanted to write our fetchFeatures JavaScript function in a purely functional way? That looks quite impossible, doesn’t it? We want fetchFeatures to be a pure function that merely returns a value, and at the same time we want it to perform network fetches and store updates. You can’t have both at the same time.

The functional solution, used by Haskell or Elm, and one we’re going to implement now in JavaScript, is to divide the problem into two parts:

a pure function fetchFeatures that returns descriptions of the effects it wants to perform.
an effect runtime that reads these descriptions and performs them.

Now please look carefully at this weird fetchFeatures function:

function fetchFeatures() {
  return {
    type: 'fetch',
    params: { path: '/features' },
    next: ( { features } ) => {
      return {
        type: 'dispatch',
        params: { action: receive( features ) },
        next: () => {
          return {
            type: 'return',
            params: { value: features.length }
          };
        }
      }
    }
  }
}

What does it do? It returns an object with shape { type, params, next }. The type of this object could be called Effect and it contains a description of what to do, and what to do next. We want to perform a fetch effect and when it’s done, to call the next callback with the result.

The next callback again returns the same Effect type, this time requesting a dispatch effect. And so on. Finally the return effect requests to “exit” the program, and to return a certain value to the caller.

This fetchFeatures function is indeed a pure function. It does nothing but return a value of type Effect. You could write this function in Haskell, too, and Haskell programmers really do it this way. The only difference is that Haskell calls the effect type IO instead of Effect.

Now to actually execute the effects, you need an effect runtime that takes an Effect as a parameter and executes it:

function runEffect( effect, next ) {
  switch ( effect.type ) {
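    case 'fetch':
      // parse the JSON body, then run the effect returned by the next callback
      window.fetch( effect.params.path )
        .then( ( response ) => response.json() )
        .then( ( body ) => runEffect( effect.next( body ), next ) );
      break;
    case 'dispatch':
      registry.dispatch( 'features' )( effect.params.action );
      // the next callback returns another Effect, so keep running
      runEffect( effect.next(), next );
      break;
    case 'return':
      // hand the final value back to the caller
      next( effect.params.value );
      break;
    default:
      throw new Error( `unknown effect: ${ effect.type }` );
  }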
}

This little runEffect function will bring life to our inert and purely functional fetchFeatures function. Running them together like this:

runEffect( fetchFeatures(), ( count ) => {
  console.log( 'number of features:', count );
} );

will actually do all the fetching and storing and will print the count of received features.
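If you want to try the idea out without a network or a real store, here is a minimal sketch that runs as-is in Node. The fakeResponses table, the dispatchedActions array and the receive action creator are made-up stand-ins for illustration only, and the fake fetch completes synchronously, which a real runtime would not do:

```javascript
// Hypothetical stand-ins so the sketch runs without a network or a real store.
const fakeResponses = { '/features': { features: [ 'a', 'b', 'c' ] } };
const dispatchedActions = [];

// Hypothetical action creator, assumed to build a plain action object.
const receive = ( features ) => ( { type: 'RECEIVE_FEATURES', features } );

// Pure: only builds a description of the effects.
function fetchFeatures() {
  return {
    type: 'fetch',
    params: { path: '/features' },
    next: ( { features } ) => ( {
      type: 'dispatch',
      params: { action: receive( features ) },
      next: () => ( {
        type: 'return',
        params: { value: features.length },
      } ),
    } ),
  };
}

// Impure: interprets the description, one effect at a time.
function runEffect( effect, done ) {
  switch ( effect.type ) {
    case 'fetch':
      // A real runtime would call window.fetch here and continue asynchronously.
      runEffect( effect.next( fakeResponses[ effect.params.path ] ), done );
      break;
    case 'dispatch':
      dispatchedActions.push( effect.params.action );
      runEffect( effect.next(), done );
      break;
    case 'return':
      done( effect.params.value );
      break;
    default:
      throw new Error( `unknown effect: ${ effect.type }` );
  }
}

let result;
runEffect( fetchFeatures(), ( count ) => {
  result = count;
} );
console.log( 'number of features:', result ); // number of features: 3
```

Note that calling fetchFeatures() on its own touches neither fakeResponses nor dispatchedActions. Only runEffect does.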

This is essentially how Haskell or Elm works, too. The runEffect runtime is hidden from you, because it’s part of the language runtime (or the Elm “kernel”) and is written in the host language: C for GHC’s Haskell runtime, JavaScript for Elm’s kernel. You, as a functional programmer, write purely functional programs that return instances of the IO type (i.e., effects), and the language runtime then looks at what kind of IO you returned, executes it, and calls a next callback, which is encapsulated in a monad type (something like a Promise with a then handler).

A Haskell example if you’re curious

Here is an example of a Haskell program that prints a prompt, then reads a line, and then prints a greeting using the line that was just read:

main = putStrLn "your name?" >>= (
  \_ -> getLine >>= (
    \s -> putStrLn ("Hello " ++ s)
  )
)

The >>= operator (called bind) is something like a .then method on a promise, or the next callback in our fetchFeatures example. The (\_ -> ...) syntax is a lambda function. This program constructs a structure of IO operations, with callbacks saying what to do next, and returns it from the main program. The language runtime is then responsible for executing these IO operations and calling the callbacks with their results.

You can try this program out in an online Haskell REPL.

Doing it with generators

One ugly thing about our purely functional fetchFeatures function is that it contains a lot of nested callbacks, and it’s common knowledge that as your program gets more complex, these nested callbacks turn into callback hell.

So, with a little bit of syntactic magic we can convert these nested callbacks into generators. This is a generator version of the fetchFeatures function:

function* fetchFeatures() {
  const { features } = yield {
    type: 'fetch',
    params: { path: '/features' },
  };
  yield {
    type: 'dispatch',
    params: { action: receive( features ) },
  };
  return features.length;
}

We are still working with Effect objects, but this time we’re yielding them from a generator. The next callbacks are gone. We are still purely functional, just with a bit of syntactic sugar on top.

The effect runtime that works with a generator/iterator is a bit more complex. To follow it, you need to understand generators and iterators in some detail, and you need to notice the recursion through callbacks: nextEffect passes itself to doEffect as the continuation. It looks like this:

function doEffect( effect, next ) {
  switch ( effect.type ) {
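    case 'fetch':
      // parse the JSON body and feed it back into the generator
      window.fetch( effect.params.path )
        .then( ( response ) => response.json() )
        .then( ( body ) => next( body ) );
      break;
    case 'dispatch':
      registry.dispatch( 'features' )( effect.params.action );
      next();
      break;
    default:
      throw new Error( `unknown effect: ${ effect.type }` );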
  }
}

function runEffect( effectIterator, next ) {
  function nextEffect( value ) {
    const nextItem = effectIterator.next( value );
    // process return statement
    if ( nextItem.done ) {
      next( nextItem.value );
      return;
    }
    // process effects
    doEffect( nextItem.value, nextEffect );
  }
  nextEffect();
}

The code that connects the generator function and the runtime and brings them to life is exactly the same as for the first callback version!

runEffect( fetchFeatures(), ( count ) => {
  console.log( 'number of features:', count );
} );

Calling the fetchFeatures() generator returns an iterator (sequence of Effects) and the runtime loops through the iterator and executes the effects.
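To see what the runtime is actually doing with the iterator, it helps to step through it by hand. This is only an illustration: the receive action creator and the canned response are made up, and nothing is fetched or dispatched here.

```javascript
// Hypothetical action creator, assumed to build a plain action object.
const receive = ( features ) => ( { type: 'RECEIVE_FEATURES', features } );

function* fetchFeatures() {
  const { features } = yield { type: 'fetch', params: { path: '/features' } };
  yield { type: 'dispatch', params: { action: receive( features ) } };
  return features.length;
}

const iterator = fetchFeatures();

// First next(): runs the generator up to the first yield and hands out the fetch effect.
const first = iterator.next();

// Second next( value ): `value` becomes the result of the first yield expression,
// and the generator runs on to the second yield.
const second = iterator.next( { features: [ 'a', 'b' ] } );

// Third next(): the generator reaches its return statement, so done is true.
const third = iterator.next();

console.log( first.value.type, first.done );   // fetch false
console.log( second.value.type, second.done ); // dispatch false
console.log( third.value, third.done );        // 2 true
```

The value passed to next() is how the runtime delivers the result of an effect back into the generator, which is the job the next callbacks did in the first version.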

If you’re still interested in analogies with Haskell, this generator syntactic sugar we just described is equivalent to the Haskell do notation. Our example program that reads and prints lines would be rewritten to:

main = do
  putStrLn "your name?"
  s <- getLine
  putStrLn ("Hello " ++ s)

Instead of a series of nested callbacks with the >>= operator, we can write the same program using a do syntax that has a structure similar to async/await.

The connection to @wordpress/data

The fetchFeatures generator probably looks similar to what you’ve seen in @wordpress/data stores, and you’re starting to see the connection.

These generators are pure functions that yield effect descriptions.

The various effect types that the runtime can handle in the big switch statement are controls and they can be registered dynamically in the @wordpress/data store runtime. There are controls for selecting (reading) and dispatching (writing) to a store, the apiFetch control etc.
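As a rough sketch of what “registered dynamically” means, the switch statement can be replaced with a lookup table of handlers. This is not the @wordpress/data API: registerControl, the controls map and the in-memory store object are hypothetical names invented for this example, and both handlers complete synchronously to keep it short.

```javascript
// Hypothetical registry: effect type -> handler( effect, next ).
const controls = new Map();

function registerControl( type, handler ) {
  controls.set( type, handler );
}

// Same loop as the generator runtime, but the switch is now a table lookup.
function runEffect( effectIterator, done ) {
  function nextEffect( value ) {
    const item = effectIterator.next( value );
    if ( item.done ) {
      done( item.value );
      return;
    }
    const control = controls.get( item.value.type );
    if ( ! control ) {
      throw new Error( `unknown effect: ${ item.value.type }` );
    }
    control( item.value, nextEffect );
  }
  nextEffect();
}

// In-memory stand-ins for the network and the store.
const store = { features: [] };
registerControl( 'fetch', ( effect, next ) => {
  next( { features: [ 'x', 'y' ] } );
} );
registerControl( 'dispatch', ( effect, next ) => {
  store.features = effect.params.action.features;
  next();
} );

// Hypothetical action creator, assumed to build a plain action object.
const receive = ( features ) => ( { type: 'RECEIVE_FEATURES', features } );

function* fetchFeatures() {
  const { features } = yield { type: 'fetch', params: { path: '/features' } };
  yield { type: 'dispatch', params: { action: receive( features ) } };
  return features.length;
}

let count;
runEffect( fetchFeatures(), ( value ) => {
  count = value;
} );
console.log( count, store.features ); // 2 [ 'x', 'y' ]
```

Adding a new kind of effect now means registering one more handler, and the runtime loop itself stays untouched.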

What’s the point of this additional complexity? Well, that’s a good question. If you want to write purely functional code without explicit side-effects, then the runEffect or rungen runtime gives you the tools to do exactly that, and that fact alone is probably a sufficient justification for you.

If you’re more pragmatic and believe that even code with explicit side-effects can be good code, the answers are not that clear. Some claim that the purely functional code is easier to test and mock. Instead of mocking window.fetch and other random APIs, you create one super-mock for the runEffect runtime and then test your actions against that. There is a well-known Effects as Data talk by Richard Feldman from the Elm team that explains the case for the functional approach in great detail. But I’m personally not very convinced.
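For what it’s worth, the testing argument can be sketched in a few lines. The runWithMocks helper below is a hypothetical “super-mock” runtime written for this example: it answers every effect from a table of canned results and records what the generator asked for, so the test never touches window.fetch or a real store. The receive action creator is also a made-up stand-in.

```javascript
// Hypothetical action creator, assumed to build a plain action object.
const receive = ( features ) => ( { type: 'RECEIVE_FEATURES', features } );

function* fetchFeatures() {
  const { features } = yield { type: 'fetch', params: { path: '/features' } };
  yield { type: 'dispatch', params: { action: receive( features ) } };
  return features.length;
}

// Hypothetical mock runtime: answers effects from `mocks` and records them.
function runWithMocks( iterator, mocks ) {
  const effects = [];
  let item = iterator.next();
  while ( ! item.done ) {
    effects.push( item.value );
    const mock = mocks[ item.value.type ];
    // Effects without a mock resolve to undefined, like the dispatch effect.
    item = iterator.next( mock ? mock( item.value ) : undefined );
  }
  return { effects, result: item.value };
}

const { effects, result } = runWithMocks( fetchFeatures(), {
  fetch: () => ( { features: [ 'a', 'b', 'c' ] } ),
} );

console.log( effects.map( ( effect ) => effect.type ) ); // [ 'fetch', 'dispatch' ]
console.log( result ); // 3
```

Whether this is nicer than mocking window.fetch directly is exactly the judgment call discussed above.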

Thunks or Generators?

A final note about the relationship between thunks and generators. These two concepts are not on the same level of abstraction. It’s more precise to say that generators are a layer on top of thunks. What I mean by that is that I can write a thunk that is implemented as a generator and an effect runtime:

function* fetchFeatures() {
  const { features } = yield { type: 'fetch', /* ... */ };
  /* ... */
}

function fetchFeaturesThunk() {
  return ( runEffect ) => {
    return runEffect( fetchFeatures() );
  };
}

In other words, runEffect( fetchFeatures() ) is a normal, impure and side-effect-ful function call that can be used anywhere in imperative JavaScript code. The runEffect runtime call is the boundary between the purely functional and imperative world.