
by Adam Berkan
Khan Academy is finishing a huge project to move our backend from Python to Go. Though the primary goal of the project was to migrate off an obsolete platform, we saw an opportunity to improve our code beyond just a straight port.
One big thing we wanted to improve was the implicit dependencies that were all over our Python codebase. Accessing the current request or current user was done by calling global functions. Likewise, we connected to other internal services and external functionality, like the database, storage, and caching layer, through global functions or global decorators.
Using globals like this made it difficult to know if a block of code touched a piece of data or called out to a service. It also complicated code testing since all the implicit dependencies needed to be mocked out.
We considered a number of possible solutions, including passing in everything as parameters or using the context to hold all dependencies, but each approach had failings.
In this post, I’m going to describe how and why we solved these issues by creating a statically typed context. We extended the context object with functions to access these shared resources, and functions declare interfaces that show which functionality they require. The result is we have dependencies explicitly listed and verified at compile time, but it’s still easy to call and test a function.
func DoTheThing(
    ctx interface {
        context.Context
        RequestContext
        DatabaseContext
        HttpClientContext
        SecretsContext
        LoggerContext
    },
    thing string,
) {...}
I’ll walk through the various ideas we considered and show why we settled on this solution. All of the code examples from this article are available at https://github.com/Khan/typed-context. You can explore that repository to see working examples and the details of how statically typed contexts are implemented.
Attempt 1: Globals
Let’s start with a motivating example:
func DoTheThing(thing string) error {
    // Find User Key from request
    userKey, err := request.GetUserKey()
    if err != nil { return err }
    // Lookup User in database
    user, err := database.Read(userKey)
    if err != nil { return err }
    // Maybe post an http if can do the thing
    if user.CanDoThing(thing) {
        err = httpClient.Post("www.dothething.example", user.GetName())
    }
    return err
}
This code is fairly straightforward, handles errors, and even has comments, but there are a few big problems. What is request here? A global variable!? And where do database and httpClient come from? And what about any dependencies that those functions have?
Here are some reasons why we don’t like global variables:
- It’s hard to trace where dependencies are used.
- It’s hard to mock out dependencies for testing since every test uses the same globals.
- We can’t run concurrently against different data.
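The last two points can be illustrated with a minimal sketch. The Database interface, fakeDatabase type, and withFakeDatabase helper below are hypothetical names invented for this illustration, not code from our codebase.

```go
package main

import "fmt"

// Database is a hypothetical dependency, used only for this illustration.
type Database interface {
    Read(key string) (string, error)
}

// fakeDatabase is an in-memory implementation of Database.
type fakeDatabase struct{ users map[string]string }

func (f fakeDatabase) Read(key string) (string, error) {
    name, ok := f.users[key]
    if !ok {
        return "", fmt.Errorf("no user for key %q", key)
    }
    return name, nil
}

// database is a package-level global, as in Attempt 1.
var database Database = fakeDatabase{users: map[string]string{"k1": "prod-user"}}

// LookupName depends on the global implicitly; nothing in its signature says so.
func LookupName(key string) (string, error) {
    return database.Read(key)
}

// withFakeDatabase swaps the global while fn runs and restores it afterwards.
// Two tests doing this concurrently would race on the same variable.
func withFakeDatabase(fake Database, fn func()) {
    saved := database
    database = fake
    defer func() { database = saved }()
    fn()
}

func main() {
    fake := fakeDatabase{users: map[string]string{"k1": "test-user"}}
    withFakeDatabase(fake, func() {
        name, _ := LookupName("k1")
        fmt.Println(name) // test-user
    })
    name, _ := LookupName("k1")
    fmt.Println(name) // prod-user
}
```

Every test that touches LookupName has to remember this save-and-restore dance, and no two of them can safely run in parallel.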
Hiding all these dependencies in globals makes the code hard to follow. In Go, we like to be explicit! Instead of implicitly relying on all these globals, let’s try passing them in as parameters.
Attempt 2: Parameters
func DoTheThing(
    thing string,
    request *Request,
    database *Database,
    httpClient *HttpClient,
    secrets *Secrets,
    logger *Logger,
    timeout *Timeout,
) error {
    // Find User Key from request
    userKey, err := request.GetUserKey()
    if err != nil { return err }
    // Lookup User in database
    user, err := database.Read(userKey, secrets, logger, timeout)
    if err != nil { return err }
    // Maybe post an http if can do the thing
    if user.CanDoThing(thing) {
        token, err := request.GetToken()
        if err != nil { return err }
        err = httpClient.Post("www.dothething.example", user.GetName(), token, logger)
        return err
    }
    return nil
}
All of the functionality that is required to DoTheThing is now very obvious, and it’s clear which request is being processed, which database is being accessed, and which secrets the database is using. If we want to test this function, it’s easy to see how to pass in mock objects.
Unfortunately, the code is now very verbose. Some parameters are common to almost every function and need to be passed everywhere: request, logger, and secrets, for example. DoTheThing has a bunch of parameters that are only there so that we can pass them on to other functions. Some functions might need to take dozens of parameters to encompass all the functionality they need.
When every function takes dozens of parameters, it’s hard to get the parameter order right. When we want to pass in mocks, we need to generate a large number of mocks and make sure they’re compatible with each other.
We should probably be checking each parameter to ensure it’s not nil, but in practice lots of developers would just risk panicking if the caller incorrectly passes nils.
When we add a new parameter to a function, we have to update all the call sites, but the calling functions also need to check if they already have that parameter. If not, they need to add it as a parameter of their own. This results in huge amounts of non-automatable code churn.
One potential twist on this idea is to create a server object that bundles a bunch of these dependencies together. This approach can reduce the number of parameters, but now it hides exactly which dependencies a function actually needs. There’s a tradeoff between a large number of small objects and a few large ones that bundle together a bunch of dependencies that potentially aren’t all used. These objects can become all-powerful utility classes, which negates the value of explicitly listing dependencies. The entire object must be mocked even if we only depend on a small piece of it.
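To make the tradeoff concrete, here is a rough sketch of what such a bundle might look like. The Server struct, its fields, and the Greet function are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// All types below are hypothetical stand-ins for real dependencies.
type Database struct{ users map[string]string }
type HttpClient struct{}
type Secrets struct{}

// Logger records messages in memory so the example can inspect them.
type Logger struct{ lines []string }

func (l *Logger) Log(msg string) { l.lines = append(l.lines, msg) }

// Server bundles every shared dependency into one object.
type Server struct {
    Database   *Database
    HttpClient *HttpClient
    Secrets    *Secrets
    Logger     *Logger
}

// Greet only uses the logger, but its signature claims the whole Server.
// A reader cannot tell which fields it touches without reading the body.
func Greet(s *Server, name string) string {
    s.Logger.Log("greeting " + name)
    return "hello, " + name
}

func main() {
    // A test still has to build a Server and guess which fields may stay nil.
    s := &Server{Logger: &Logger{}}
    fmt.Println(Greet(s, "Ada"))     // hello, Ada
    fmt.Println(len(s.Logger.lines)) // 1
}
```

If Greet later starts using s.Database, the signature does not change, and the test above panics with a nil pointer dereference at runtime instead of failing to compile.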
For some of this functionality, like timeouts and the request, there is a standard Go solution. The context library provides an object that holds information about the current request and provides functionality around handling timeouts and cancellation.
It can be further extended to hold any other object that the developer wants to pass around everywhere. In practice, a lot of code bases use the context as a catch-all bin that holds all the common objects. Does this make the code nicer?
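As a reminder of the mechanism, here is a minimal sketch of storing and retrieving a value with the standard context package. The Request type and the WithRequest and RequestFrom helpers are assumptions made for this illustration. The sketch uses an unexported key type, which is the convention the context package documentation recommends to avoid key collisions; the examples in this post use plain string keys for brevity.

```go
package main

import (
    "context"
    "fmt"
)

// Request is a hypothetical stand-in for the request type used in this post.
type Request struct{ UserKey string }

// ctxKey is unexported so that keys from other packages cannot collide with ours.
type ctxKey string

const requestKey ctxKey = "request"

// WithRequest returns a copy of ctx that carries r.
func WithRequest(ctx context.Context, r *Request) context.Context {
    return context.WithValue(ctx, requestKey, r)
}

// RequestFrom extracts the request, reporting whether one was present.
func RequestFrom(ctx context.Context) (*Request, bool) {
    r, ok := ctx.Value(requestKey).(*Request)
    return r, ok && r != nil
}

// rootCtx stands in for the context a server creates for each request.
var rootCtx = context.Background()

func main() {
    ctx := WithRequest(rootCtx, &Request{UserKey: "k1"})
    if r, ok := RequestFrom(ctx); ok {
        fmt.Println(r.UserKey) // k1
    }
    _, ok := RequestFrom(rootCtx)
    fmt.Println(ok) // false
}
```

Note that ctx.Value returns an untyped value, so every reader has to perform a type assertion and decide what to do when it fails.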
Attempt 3: Context
func DoTheThing(
    ctx context.Context,
    thing string,
) error {
    // Find User Key from request
    userKey, err := ctx.Value("request").(*Request).GetUserKey()
    if err != nil { return err }
    // Lookup User in database
    user, err := ctx.Value("database").(*Database).Read(ctx, userKey)
    if err != nil { return err }
    // Maybe post an http if can do the thing
    if user.CanDoThing(thing) {
        err = ctx.Value("httpClient").(*HttpClient).
            Post(ctx, "www.dothething.example", user.GetName())
        return err
    }
    return nil
}
This is way smaller than listing everything, but the code is very susceptible to runtime panics if any of the ctx.Value(...) calls returns a nil or a value of the wrong type. It’s difficult to know which fields need to be populated on ctx before this is called and what the expected type is. We should probably check these parameters.
Attempt 4: Context, but safely
func DoTheThing(
    ctx context.Context,
    thing string,
) error {
    // Find User Key from request
    request, ok := ctx.Value("request").(*Request)
    if !ok || request == nil { return errors.New("Missing Request") }
    userKey, err := request.GetUserKey()
    if err != nil { return err }
    // Lookup User in database
    database, ok := ctx.Value("database").(*Database)
    if !ok || database == nil { return errors.New("Missing Database") }
    user, err := database.Read(ctx, userKey)
    if err != nil { return err }
    // Maybe post an http if can do the thing
    if user.CanDoThing(thing) {
        httpClient, ok := ctx.Value("httpClient").(*HttpClient)
        if !ok || httpClient == nil {
            return errors.New("Missing HttpClient")
        }
        err = httpClient.Post(ctx, "www.dothething.example", user.GetName())
        return err
    }
    return nil
}
So now we’re properly checking that the context contains everything we need and handling errors appropriately. The single ctx parameter carries all the commonly used functionality. This context can be created in a small number of centralized spots for different situations (e.g., GetProdContext(), GetTestContext()).
Unfortunately, the code is now even longer than if we passed in everything as a parameter. Most of the added code is boring boilerplate that makes it harder to see what the code is actually doing.
This solution does let us work on concurrent requests independently (each with its own context), but it still suffers from a lot of the other problems from the globals solution. In particular, there’s no easy way to tell what functionality a function needs. For example, it’s not clear that ctx needs to contain a “secret” when you call database.Read and that therefore it’s also necessary when you call DoTheThing.
This code suffers from runtime failures if the context is missing necessary functionality. This can lead to errors in production. For example, if CanDoThing rarely returns true, we might not realize this function needs httpClient until it starts failing. There’s no easy way at compile time to guarantee that the context will always contain everything it needs.
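The rarely taken branch problem can be reproduced in a few lines. This is a simplified sketch: the canDoThing flag replaces the user lookup, and the HttpClient type, its Post method, and the key name are assumptions made for the illustration.

```go
package main

import (
    "context"
    "errors"
    "fmt"
)

// HttpClient is a hypothetical client; Post always succeeds in this sketch.
type HttpClient struct{}

func (c *HttpClient) Post(url string) error { return nil }

type ctxKey string

const httpClientKey ctxKey = "httpClient"

// DoTheThing only needs the HTTP client on the rare path.
func DoTheThing(ctx context.Context, canDoThing bool) error {
    if !canDoThing {
        return nil // the common path never touches the context values
    }
    client, ok := ctx.Value(httpClientKey).(*HttpClient)
    if !ok || client == nil {
        return errors.New("missing HttpClient")
    }
    return client.Post("www.dothething.example")
}

// emptyCtx is a context in which nobody remembered to store an HttpClient.
var emptyCtx = context.Background()

func main() {
    fmt.Println(DoTheThing(emptyCtx, false)) // <nil>
    fmt.Println(DoTheThing(emptyCtx, true))  // missing HttpClient
}
```

Both calls compile, and the first one succeeds, so a test suite that only exercises the common path would never notice the missing dependency.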
Our Solution: Statically Typed Context
What we want is something that explicitly lists our function’s dependencies but doesn’t require us to list them at every call site. We want to verify all dependencies at compile time, but we also want to be able to add a new dependency without a massive manual code change.
The solution we’ve designed at Khan Academy is to extend the context object with interfaces representing the shared functionality. Every function declares an interface that describes all the functionality it requires from the statically typed context. The function can use the declared functionality by accessing it through the context.
The context is treated normally after the function signature, getting passed along to other functions. But now the compiler ensures that the context implements the interfaces for each function we call.
func DoTheThing(
    ctx interface {
        context.Context
        RequestContext
        DatabaseContext
        HttpClientContext
        SecretsContext
        LoggerContext
    },
    thing string,
) error {
    // Find User Key from request
    userKey, err := ctx.Request().GetUserKey()
    if err != nil { return err }
    // Lookup User in database
    user, err := ctx.Database().Read(ctx, userKey)
    if err != nil { return err }
    // Maybe post an http if can do the thing
    if user.CanDoThing(thing) {
        err = ctx.HttpClient().Post(ctx, "www.dothething.example", user.GetName())
    }
    return err
}
The body of this function is nearly as simple as the original function using globals. The function signature lists all the required functionality for this code block and the functions it calls. Notice that calling a function such as ctx.Database().Read(ctx, …) doesn’t require us to change our ctx, even though Read only requires a subset of the functionality.
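The following stripped-down sketch shows why passing the context along works without any conversion. It assumes a hypothetical concrete appContext type and a readUser helper, and it reduces the set of interfaces to two; the real interfaces in the sample repository are richer.

```go
package main

import (
    "context"
    "fmt"
)

// Logger and Secrets are hypothetical stand-ins for shared functionality.
type Logger struct{ lines []string }

func (l *Logger) Log(msg string) { l.lines = append(l.lines, msg) }

type Secrets struct{}

type LoggerContext interface {
    context.Context
    Logger() *Logger
}

type SecretsContext interface {
    context.Context
    Secrets() *Secrets
}

// appContext is a hypothetical concrete context providing both interfaces.
type appContext struct {
    context.Context
    logger  *Logger
    secrets *Secrets
}

func (c appContext) Logger() *Logger   { return c.logger }
func (c appContext) Secrets() *Secrets { return c.secrets }

func newAppContext() appContext {
    return appContext{Context: context.Background(), logger: &Logger{}, secrets: &Secrets{}}
}

// readUser needs only logging, so that is all it declares.
func readUser(ctx LoggerContext, key string) string {
    ctx.Logger().Log("reading " + key)
    return "user-" + key
}

// DoTheThing declares a larger interface than readUser requires.
func DoTheThing(ctx interface {
    LoggerContext
    SecretsContext
}, key string) string {
    // ctx is passed as is; the compiler checks that it satisfies LoggerContext.
    return readUser(ctx, key)
}

func main() {
    ctx := newAppContext()
    fmt.Println(DoTheThing(ctx, "k1")) // user-k1
    fmt.Println(ctx.logger.lines)      // [reading k1]
}
```

If LoggerContext were removed from the signature of DoTheThing, the call to readUser would stop compiling, which is exactly the check we want.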
When we need to call a new interface that wasn’t previously part of our statically typed context, we need to add the interface with a single line to our function signature. This documents the new dependency and enables us to call the new function on the context.
If we have callers that don’t have the new interface in their context, they’ll get an error message describing what interface they’re missing, and they can add the same interface to their signature. The developer has a chance while making the change to make sure the new dependency is appropriate. A change like this can sometimes ripple up the stack, but it’s just a one-line change in each affected function until we reach a level that already has that interface. This can be a bit annoying for deep call stacks, but it is also something that could be automated for large changes.
The interfaces are declared by each library and usually consist of a single call that returns either a piece of data or a client object for that functionality. For example, here are the request and database context interfaces in the sample code.
type RequestContext interface {
    Request() *Request
    context.Context
}
type DatabaseInterface interface {
    Read(
        ctx interface{
            context.Context
            SecretsContext
            LoggerContext
        },
        key DatabaseKey,
    ) (*User, error)
}
type DatabaseContext interface {
    Database() DatabaseInterface
    context.Context
}
We have a library that provides contexts for different situations. In some situations, such as at the start of our request handlers, we have a basic context.Context and need to upgrade it into a statically typed context.
func GetProdContext() ProdContext {...}
func GetTestContext() TestContext {...}
func Upgrade(ctx context.Context) ProdContext {...}
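One plausible way to implement such an upgrade is to embed the incoming context in a struct and add accessor methods. The sketch below is an assumption about the shape of the code, not the implementation in our repository: the ProdContext fields, the Logger type, and the newBaseContext helper are all hypothetical, and Upgrade takes the context.Context by value, as is conventional in Go.

```go
package main

import (
    "context"
    "fmt"
    "time"
)

// Logger is a hypothetical logger that returns the line it would write.
type Logger struct{ prefix string }

func (l *Logger) Log(msg string) string { return l.prefix + msg }

type LoggerContext interface {
    context.Context
    Logger() *Logger
}

// ProdContext embeds the plain context, so deadlines, cancellation, and
// values set by earlier middleware keep working after the upgrade.
type ProdContext struct {
    context.Context
    logger *Logger
}

func (c ProdContext) Logger() *Logger { return c.logger }

// Compile-time assertion that ProdContext satisfies LoggerContext.
var _ LoggerContext = ProdContext{}

// Upgrade wraps a basic context with the production dependencies.
func Upgrade(ctx context.Context) ProdContext {
    return ProdContext{Context: ctx, logger: &Logger{prefix: "[prod] "}}
}

// newBaseContext stands in for the context a request handler receives.
func newBaseContext() (context.Context, context.CancelFunc) {
    return context.WithTimeout(context.Background(), time.Minute)
}

func main() {
    base, cancel := newBaseContext()
    defer cancel()

    ctx := Upgrade(base)
    _, hasDeadline := ctx.Deadline()
    fmt.Println(hasDeadline)                 // true
    fmt.Println(ctx.Logger().Log("started")) // [prod] started
}
```

Because the struct embeds context.Context, the upgraded value can still be handed to any standard library function that expects a plain context.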
These prebuilt contexts generally meet all the Context Interfaces in our code base and can therefore be passed to any function. The ProdContext connects to all our services in production, while our TestContext uses a bunch of mocks that are designed to work properly together.
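A test context along these lines might look like the following sketch. The mockDatabase type, the Greeting function, and the simplified Read signature (a plain context and a string key) are assumptions made to keep the example short; they differ from the interfaces shown earlier.

```go
package main

import (
    "context"
    "errors"
    "fmt"
)

// DatabaseInterface is simplified here: plain context, string key and value.
type DatabaseInterface interface {
    Read(ctx context.Context, key string) (string, error)
}

type DatabaseContext interface {
    context.Context
    Database() DatabaseInterface
}

// mockDatabase is a hypothetical in-memory mock.
type mockDatabase struct{ users map[string]string }

func (m *mockDatabase) Read(_ context.Context, key string) (string, error) {
    name, ok := m.users[key]
    if !ok {
        return "", errors.New("user not found")
    }
    return name, nil
}

// TestContext satisfies DatabaseContext using the mock.
type TestContext struct {
    context.Context
    db *mockDatabase
}

func (c TestContext) Database() DatabaseInterface { return c.db }

// GetTestContext builds a context preloaded with fixture data.
func GetTestContext() TestContext {
    return TestContext{
        Context: context.Background(),
        db:      &mockDatabase{users: map[string]string{"k1": "Ada"}},
    }
}

// Greeting is the code under test; it never knows whether the database is real.
func Greeting(ctx DatabaseContext, key string) (string, error) {
    name, err := ctx.Database().Read(ctx, key)
    if err != nil {
        return "", err
    }
    return "hello, " + name, nil
}

func main() {
    ctx := GetTestContext()
    greeting, err := Greeting(ctx, "k1")
    fmt.Println(greeting, err) // hello, Ada <nil>

    _, err = Greeting(ctx, "missing")
    fmt.Println(err) // user not found
}
```

The function under test takes the same parameter in production and in tests; only the value passed in changes.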
We also have special contexts that are for our developer environment and for use inside cron jobs. Each context is implemented differently, but all can be passed to any function in our code.
We also have contexts that only implement a subset of the interfaces, such as a ReadOnlyContext that only implements the read-only interfaces. You can pass it to any function that doesn’t require writes in its Context Interfaces. This ensures, at compile time, that inadvertent writes are impossible.
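As a rough sketch of how a read-only context can rule out writes, consider splitting the database access into separate reader and writer interfaces. All names below (store, ReaderContext, WriterContext, ShowName, Rename) are hypothetical and chosen for this illustration only.

```go
package main

import (
    "context"
    "fmt"
)

// store is a hypothetical in-memory database.
type store struct{ data map[string]string }

func (s *store) Read(key string) string  { return s.data[key] }
func (s *store) Write(key, value string) { s.data[key] = value }

type DatabaseReader interface{ Read(key string) string }
type DatabaseWriter interface{ Write(key, value string) }

type ReaderContext interface {
    context.Context
    Reader() DatabaseReader
}

type WriterContext interface {
    context.Context
    Writer() DatabaseWriter
}

// ReadOnlyContext implements ReaderContext but deliberately not WriterContext.
type ReadOnlyContext struct {
    context.Context
    db *store
}

func (c ReadOnlyContext) Reader() DatabaseReader { return c.db }

func newReadOnlyContext() ReadOnlyContext {
    return ReadOnlyContext{
        Context: context.Background(),
        db:      &store{data: map[string]string{"k1": "Ada"}},
    }
}

// ShowName only reads, so it accepts a ReadOnlyContext.
func ShowName(ctx ReaderContext, key string) string {
    return ctx.Reader().Read(key)
}

// Rename writes, so it demands a WriterContext.
func Rename(ctx WriterContext, key, name string) {
    ctx.Writer().Write(key, name)
}

func main() {
    ctx := newReadOnlyContext()
    fmt.Println(ShowName(ctx, "k1")) // Ada

    // Rename(ctx, "k1", "Grace") // does not compile: ReadOnlyContext has no Writer method

    // The same fact, observed at runtime purely for demonstration.
    _, canWrite := interface{}(ctx).(WriterContext)
    fmt.Println(canWrite) // false
}
```

Uncommenting the Rename call turns the mistake into a compile error rather than a production incident.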
We have a linter to ensure that every function declares the minimum interface necessary. This guarantees that functions don’t just declare they need “everything.” You can find a version of our linter in the sample code.
Conclusion
We’ve been using statically typed contexts at Khan Academy for two years now. We have over a dozen interfaces that functions can depend on. They’ve made it very easy to track how dependencies are used in our code and are also useful for injecting mocks for testing. We have compile-time assurance that all functionality will be available before it’s used.
Statically typed contexts aren’t always amazing. They are more verbose than not declaring your dependencies, and they can require fiddling with your context interface when you “just want to log something,” but they also save work. When a function needs to use new functionality it can be as simple as declaring it in your context interface and then using it.
Statically typed contexts have eliminated whole classes of bugs. We never have uninitialized globals or missing context values. We never have something mutate a global and break later requests. We never have a function that unexpectedly calls a service. Mocks always play well together because we have a company-wide convention for injecting dependencies in test code.
Go is a language that encourages being explicit and using static types to improve maintainability. Using statically typed contexts lets us achieve those goals when accessing global resources.
If you’re also excited about this opportunity, check out our careers page. As you can imagine, we’re hiring engineers!