
Go Context Essentials

Introduction

Developers new to Go quickly encounter the concept of a "context" as they begin exploring the language. It is usually the first parameter of a function, and it is conventionally named ctx. For example, consider the following (non-production) code sample.

package main

import (
 "context"

 "cloud.google.com/go/pubsub"
)

func main() {
 ctx := context.Background()
 client, err := pubsub.NewClient(ctx, "project-id")
 if err != nil {
  // TODO: Handle error.
 }

 // Create a new topic with the given name.
 topic, err := client.CreateTopic(ctx, "topicName")
 if err != nil {
  // TODO: Handle error.
 }
 _ = topic // Placeholder so the sample compiles; real code would use the topic.

 // TODO: etc etc
}

Developers who are new to Go will often see this ctx variable throughout a code base and wonder what it's for. Similarly, experienced developers, despite having an intuitive understanding, may find it difficult to explain exactly what it is and what the best practices are.

In this blog post, we provide an example-based summary of the core features of the context package.

The official definition

The context package documentation states:

context defines the Context type, which carries deadlines, cancellation signals, and other request-scoped values across API boundaries and between processes

Additionally, it defines:

Incoming requests to a server should create a Context, and outgoing calls to servers should accept a Context.

In short, the entry points of our code should create the context. We would typically do this:

  • at the start of a worker service

  • at the entry point to an HTTP handler when processing a request

    • Using net/http, or third party frameworks, this is done for us.
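As a minimal sketch of that last point, the handler below reads the context that net/http attaches to every incoming request via r.Context(). The handler, the helper names, and the httptest-based driver are illustrative assumptions and are not part of the samples in this post.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requestContextIsLive reports whether the request carries a context
// that has not been cancelled yet.
func requestContextIsLive(r *http.Request) bool {
	ctx := r.Context() // supplied by net/http; we did not create it ourselves
	return ctx != nil && ctx.Err() == nil
}

// handler is a hypothetical handler that only inspects the request context.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "context live: %t", requestContextIsLive(r))
}

// callHandler drives the handler in-process and returns the response body.
func callHandler() string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(callHandler())
}
```

In a real server, this request context is cancelled when the client's connection closes or when the handler returns, so passing it to downstream calls lets them stop early.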

Regarding coding conventions, the official documentation reiterates what we see in most idiomatic Go code:

Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx

Deadlines and Timeouts

Let's assume we have a function that makes a call to a third-party API. For demonstration purposes, we supply an empty context and a delayValue that configures how long our API call will take. (This sample uses httpbin to act as our third-party API.) In our sample, we decide that we want a relatively short timeout, after which the request should be cancelled.

Because a Context is immutable, the context package's WithXXX functions return new child contexts rather than modifying the parent.

func getFromUrl(ctx context.Context, delayValue int) error {
  slog.Info("getFromUrl entry: ", slog.Int("delay-value", delayValue))
  start := time.Now()

  // Conventionally, in a function like this, we would just overwrite our parent context
  // So, we'd use `ctx, cancel := context.WithTimeout(ctx, 3*time.Second)`
  // For this sample code however, we'll assign our new child context to a new variable.
  ctxWithTimeout, cancel := context.WithTimeout(ctx, 3*time.Second)
  defer cancel()

  url := fmt.Sprintf("http://0.0.0.0:80/delay/%d", delayValue)
  req, err := http.NewRequestWithContext(ctxWithTimeout, http.MethodGet, url, nil)
  if err != nil {
    return err // We don't expect this to happen
  }

  httpClient := &http.Client{}
  resp, err := httpClient.Do(req)
  if err != nil {
    // Note: ctxWithTimeout.Err() will be set here for the child context.
    slog.Error("httpClient.Do(req) has returned error: ", slog.String("err", err.Error()), slog.Any("ctx-err", ctxWithTimeout.Err()))
    return err
  }
  // Deferred calls run in reverse order: the body is drained first, then closed.
  defer resp.Body.Close()
  defer io.Copy(io.Discard, resp.Body)

  slog.Info("getFromUrl completed: ", slog.Int64("since-ms", int64(time.Since(start)/time.Millisecond)))

  return nil
}

In this function, httpClient.Do attempts to call an API. If no response is received within 3 seconds, the child context's deadline passes and the request is cancelled. As a result:

  • httpClient.Do will return an error.

  • The child context's Err() will return a non-nil error (context.DeadlineExceeded). The parent ctx is unaffected.
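The same behaviour can be reproduced without a network call. The following is a minimal sketch in which slowOperation is a hypothetical stand-in for the API call; the function names and durations are assumptions chosen for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowOperation is a hypothetical stand-in for an API call that takes
// `work` to finish unless its context ends first.
func slowOperation(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWithTimeout wraps parent in a child context with the given timeout and
// returns the operation's error together with the parent's error state.
func runWithTimeout(parent context.Context, timeout, work time.Duration) (opErr, parentErr error) {
	child, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return slowOperation(child, work), parent.Err()
}

// describe runs a fast and a slow operation and returns printable results.
func describe() []string {
	parent := context.Background()
	fastErr, _ := runWithTimeout(parent, 200*time.Millisecond, 10*time.Millisecond)
	slowErr, parentErr := runWithTimeout(parent, 20*time.Millisecond, 2*time.Second)
	return []string{
		fmt.Sprintf("fast: %v", fastErr),
		fmt.Sprintf("slow timed out: %t", errors.Is(slowErr, context.DeadlineExceeded)),
		fmt.Sprintf("parent: %v", parentErr),
	}
}

func main() {
	for _, line := range describe() {
		fmt.Println(line)
	}
}
```

The slow operation fails with context.DeadlineExceeded, while the parent context still reports a nil error.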

New for Go 1.21, WithTimeoutCause

If we want more control over what error is returned when a timeout has occurred, we can make use of new functionality introduced to the context package.

For example, we could define our own custom error type. In the sample below, we store the timeoutDuration that was exceeded in the struct. But, we would be free to add more features to this custom error.

type apiTimeoutError struct {
  timeoutDuration time.Duration
}

func (a *apiTimeoutError) Error() string {
  return fmt.Sprintf("timeout after %v seconds", a.timeoutDuration.Seconds())
}

func newApiTimeout(timeoutAfter time.Duration) *apiTimeoutError {
  return &apiTimeoutError{
    timeoutDuration: timeoutAfter,
  }
}

Having defined our error, we can modify getFromUrl as follows.

func getFromUrl(ctx context.Context, delayValue int) error {
  slog.Info("getFromUrl entry: ", slog.Int("delay-value", delayValue))
  start := time.Now()

  // UPDATED HERE
  timeoutAfter := 3 * time.Second
  ctxWithTimeout, cancel := context.WithTimeoutCause(ctx, timeoutAfter, newApiTimeout(timeoutAfter))
  defer cancel()
  // END OF UPDATED CODE

  url := fmt.Sprintf("http://0.0.0.0:80/delay/%d", delayValue)
  req, err := http.NewRequestWithContext(ctxWithTimeout, http.MethodGet, url, nil)
  if err != nil {
    return err // We don't expect this to happen
  }

  httpClient := &http.Client{}
  resp, err := httpClient.Do(req)
  if err != nil {
    // Note: ctxWithTimeout.Err() will be set here for the child context.
    slog.Error("httpClient.Do(req) has returned error: ", slog.String("err", err.Error()), slog.Any("ctx-err", ctxWithTimeout.Err()))
    return err
  }
  defer resp.Body.Close()
  defer io.Copy(io.Discard, resp.Body)

  slog.Info("getFromUrl completed: ", slog.Int64("since-ms", int64(time.Since(start)/time.Millisecond)))

  return nil
}

Any code that calls getFromUrl can easily check the error returned using errors.As. Normally, you could use this to perform conditional logic, perhaps relating to retries, or some other form of behavior that takes remedial action. For simplicity, we can amend our sample code as follows.

package main

import (
  "context"
  "errors"
  "fmt"
  "io"
  "log/slog"
  "net/http"
  "time"
)

func main() {
  ctx := context.Background()
  checkError(ctx, getFromUrl(ctx, 1))
  checkError(ctx, getFromUrl(ctx, 5))
  slog.Info("completed")
}

func checkError(ctx context.Context, err error) {
  if err != nil {
    var apiTimeoutErr *apiTimeoutError
    if errors.As(err, &apiTimeoutErr) {
      // Was a timeout
      // Note: ctx.Err() will NOT be set in the parent context
      slog.Error("timeout err: ", slog.String("err", err.Error()), slog.Any("ctx-err", ctx.Err()))
    } else {
      // Not a timeout
      // Note: ctx.Err() will NOT be set in the parent context
      slog.Error("unexpected err: ", slog.String("err", err.Error()), slog.Any("ctx-err", ctx.Err()))
    }
  }
}

The end result of executing this code is:

2024/10/12 14:46:38 INFO getFromUrl entry:  delay-value=1
2024/10/12 14:46:39 INFO getFromUrl completed:  since-ms=1029
2024/10/12 14:46:39 INFO getFromUrl entry:  delay-value=5
2024/10/12 14:46:42 ERROR httpClient.Do(req) has returned error:  err="Get \"http://0.0.0.0:80/delay/5\": timeout after 3 seconds" ctx-err="context deadline exceeded"
2024/10/12 14:46:42 ERROR timeout err:  err="Get \"http://0.0.0.0:80/delay/5\": timeout after 3 seconds" ctx-err=<nil>
2024/10/12 14:46:42 INFO completed

You will observe that httpClient.Do(req) returns a wrapped error carrying our custom cause, while the child context's Err() returns context.DeadlineExceeded. Additionally, the parent context's Err() is still nil. (Deadlines are not propagated back up to parent contexts.)
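The custom cause can also be read directly from the context with context.Cause (available since Go 1.20), independently of what the HTTP client returns. This is a minimal sketch; errUpstreamSlow and the helper names are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errUpstreamSlow is a hypothetical sentinel used as the timeout cause.
var errUpstreamSlow = errors.New("upstream took too long")

// errAndCause waits for a short timeout to fire, then returns what
// ctx.Err() and context.Cause(ctx) report for the same context.
func errAndCause() (ctxErr, cause error) {
	ctx, cancel := context.WithTimeoutCause(context.Background(), 10*time.Millisecond, errUpstreamSlow)
	defer cancel()

	<-ctx.Done() // blocks until the deadline passes
	return ctx.Err(), context.Cause(ctx)
}

// summary formats both values for printing.
func summary() string {
	ctxErr, cause := errAndCause()
	return fmt.Sprintf("err=%v cause=%v", ctxErr, cause)
}

func main() {
	fmt.Println(summary())
}
```

Err() keeps returning the generic context.DeadlineExceeded, so existing checks against it continue to work, while Cause exposes the more specific error.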

context.WithDeadline and context.WithDeadlineCause

In principle, WithDeadline and WithDeadlineCause are used in the same way as their Timeout counterparts. The difference is that instead of a time.Duration, we specify the time.Time at which the context should expire.

In fact, looking at the actual implementation in the context package, we can see that WithTimeout is just a function call to WithDeadline.

func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
 return WithDeadline(parent, time.Now().Add(timeout))
}
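As a minimal sketch of working with absolute deadlines, the code below sets one with WithDeadline and reads it back with the Deadline method. It also illustrates the documented rule that a child cannot extend a deadline inherited from its parent. The function names and durations are assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// deadlineMatches creates a context with an absolute deadline and checks
// that ctx.Deadline() reports exactly that time.
func deadlineMatches() bool {
	deadline := time.Now().Add(30 * time.Second)
	ctx, cancel := context.WithDeadline(context.Background(), deadline)
	defer cancel()

	got, ok := ctx.Deadline()
	return ok && got.Equal(deadline)
}

// childCannotExtend asks for a later deadline on a child context and checks
// that the parent's earlier deadline still applies.
func childCannotExtend() bool {
	early := time.Now().Add(time.Minute)
	parent, cancelParent := context.WithDeadline(context.Background(), early)
	defer cancelParent()

	child, cancelChild := context.WithDeadline(parent, early.Add(time.Hour))
	defer cancelChild()

	got, ok := child.Deadline()
	return ok && got.Equal(early)
}

func main() {
	fmt.Println("deadline matches:", deadlineMatches())
	fmt.Println("child keeps parent's deadline:", childCannotExtend())
}
```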

Cancellation Signals

So far, we have looked at scenarios where child contexts are expiring via a timeout or deadline. Another key feature is that contexts can be cancelled, and when they are cancelled, it is propagated to any child contexts. Propagation of cancellation signals will only be applied downwards to children of the cancelled context. Signals are not applied upwards to the parent, or across to any sibling contexts. This can be demonstrated with the following code.

func main() {
  parent := context.Background()

  // Create two child contexts with their own cancel functions,
  // and create two further grandchildren from the first child
  child1, cancel1 := context.WithCancel(parent)
  grandchild1, cancelg1 := context.WithCancel(child1)
  grandchild2, cancelg2 := context.WithCancel(child1)
  child2, cancel2 := context.WithCancel(parent)

  defer cancelg1()
  defer cancelg2()
  defer cancel2()

  // Simulate canceling child1
  cancel1()

  // Check states
  fmt.Println("child1:", child1.Err())       // Outputs: context canceled
  fmt.Println("grandchild1:", grandchild1.Err()) // Outputs: context canceled
  fmt.Println("grandchild2:", grandchild2.Err()) // Outputs: context canceled
  fmt.Println("child2:", child2.Err())       // Outputs: <nil> (still active)
  fmt.Println("parent:", parent.Err())       // Outputs: <nil> (still active)

  time.Sleep(5 * time.Second)
}

It is important to note that a context can only be cancelled once. Additionally, cancellation should be used when we actually want to cancel something. It should not be used to notify downstream processes that an error has occurred.
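To illustrate the "only once" point, here is a minimal sketch: calling a cancel function a second time has no effect, and with WithCancelCause (Go 1.20+) the cause recorded by the first call is the one that sticks. The function names are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// cancelIsIdempotent calls cancel twice and checks the context is simply cancelled.
func cancelIsIdempotent() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	cancel() // subsequent calls do nothing
	return errors.Is(ctx.Err(), context.Canceled)
}

// firstCauseWins cancels the same context twice with different causes and
// returns the cause the context ends up reporting.
func firstCauseWins() string {
	ctx, cancel := context.WithCancelCause(context.Background())
	cancel(errors.New("first"))
	cancel(errors.New("second")) // ignored: the context is already cancelled
	return context.Cause(ctx).Error()
}

func main() {
	fmt.Println("idempotent:", cancelIsIdempotent())
	fmt.Println("cause:", firstCauseWins())
}
```

This is also why it is safe to both defer cancel() and call it explicitly earlier in the same function.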

As an example, let's consider the following scenario.

  • We have a small program that, after starting up:

    • Sets up a context that waits for a SIGTERM

    • Creates a goroutine that loops continuously until its context is cancelled

      • Inside the loop, the code calls an API (it could be polling)

      • Inside the loop, the result of the API call could be processed and written to a DB or a log.

    • When we terminate the program, we want to stop the goroutine gracefully and perhaps perform some cleanup.

To demonstrate this in sample code, we will use the original version of our getFromUrl function (i.e. we only use ctx, cancel := context.WithTimeout(ctx, 3*time.Second) instead of a custom error).

Next, we would write a poller function.

func poller(ctx context.Context) {
  slog.Info("poller entry: ")
  start := time.Now()

  for {
    select {
    case <-ctx.Done():
      // Here we could do some graceful cleanup
      slog.Info("poller received cancellation signal, stopping execution: ",
        slog.Int64("since-ms", int64(time.Since(start)/time.Millisecond)),
        slog.Any("ctx-err", ctx.Err()))
      return
    default:
      err := getFromUrl(ctx, 1)
      if err != nil {
        // Note: ctx.Err() stays nil if only the child context inside getFromUrl timed out.
        // It is set if ctx itself has been cancelled.
        slog.Error("getFromUrl(ctx, 1) has returned error: ", slog.String("err", err.Error()), slog.Any("ctx-err", ctx.Err()))
      } else {
        slog.Info("doing some more work, such as writing to DB: ", slog.Any("ctx.Done()", ctx.Done()), slog.Any("ctx-err", ctx.Err()))
      }
      time.Sleep(1 * time.Second) // Let's slow down the amount of logging
    }
  }
}

This function loops "forever", executing the default block of the select. Once the context is cancelled, the ctx.Done() case is selected instead. This gives our code the opportunity to do things such as connection cleanup. Having done that, we return from the function, ending its execution.
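One caveat with the loop above is that it cannot notice cancellation while it is inside time.Sleep. A possible alternative, sketched here under the assumption that a fixed polling interval is acceptable, is to wait on a time.Ticker and ctx.Done() in the same select. The pollEvery and demo names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollEvery calls work on every tick until ctx is done and returns how many
// times work ran. Waiting on the ticker and ctx.Done() in one select means
// cancellation is noticed between ticks without a blocking sleep.
func pollEvery(ctx context.Context, interval time.Duration, work func(context.Context)) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	runs := 0
	for {
		select {
		case <-ctx.Done():
			return runs
		case <-ticker.C:
			work(ctx)
			runs++
		}
	}
}

// demo polls every 10ms and lets a timeout stop the loop after roughly 105ms.
func demo() (runs int, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), 105*time.Millisecond)
	defer cancel()

	runs = pollEvery(ctx, 10*time.Millisecond, func(context.Context) {})
	return runs, ctx.Err()
}

func main() {
	runs, err := demo()
	fmt.Println("runs:", runs, "err:", err)
}
```

The exact number of runs depends on scheduling, so the sketch does not assume a specific count.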

Subsequently, our main program would be written as follows:

package main

import (
  "context"
  "fmt"
  "io"
  "log/slog"
  "net/http"
  "os"
  "os/signal"
  "syscall"
  "time"
)

func main() {
  ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
  defer stop()

  go poller(ctx)

  // Wait for interrupt signal
  <-ctx.Done()

  // Create a new context, and wait for 10 seconds while we cleanup
  closingCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
  defer cancel()
  time.Sleep(3 * time.Second) // We are doing clean up for example

  slog.Info("completed", slog.Any("closingCtx.", closingCtx.Err())) // Assuming we cleanup in time, we log this
}

Here, we are making use of the standard library signal package, specifically the NotifyContext function. The documentation describes it perfectly.

NotifyContext returns a copy of the parent context that is marked done (its Done channel is closed) when one of the listed signals arrives, when the returned stop function is called, or when the parent context's Done channel is closed, whichever happens first.

Having created this context, we start a new goroutine executing our poller function. Then we wait for cancellation (i.e. until we press Ctrl-C). The code blocks on <-ctx.Done() until the context is done, and the cancellation is also propagated to any child contexts. To give the goroutines time to clean up, we create a new context with a 10-second timeout. The intent is that either our cleanup completes and we exit the program, or the timeout triggers and causes the program to end. In this sample, the cleanup is simulated with a 3-second sleep.
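The sample above simulates cleanup with a fixed sleep and never actually waits on closingCtx. One possible way to wire the two together, shown here as a sketch rather than the definitive approach, is to have the worker close a channel when it finishes and to select between that channel and closingCtx.Done(). The worker, shutdown, and demo functions are hypothetical, and cancel() stands in for the signal arriving.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker blocks until ctx is cancelled, spends `cleanup` tidying up,
// and then closes done to report that it has finished.
func worker(ctx context.Context, cleanup time.Duration, done chan<- struct{}) {
	defer close(done)
	<-ctx.Done()
	time.Sleep(cleanup) // stand-in for real cleanup work
}

// shutdown cancels the worker and waits for it, but no longer than grace.
func shutdown(cleanup, grace time.Duration) string {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go worker(ctx, cleanup, done)

	cancel() // stands in for SIGTERM / Ctrl-C arriving

	closingCtx, stop := context.WithTimeout(context.Background(), grace)
	defer stop()

	select {
	case <-done:
		return "clean exit"
	case <-closingCtx.Done():
		return "gave up waiting"
	}
}

// demo runs one shutdown that finishes in time and one that does not.
func demo() (quick, slow string) {
	quick = shutdown(10*time.Millisecond, time.Second)
	slow = shutdown(time.Second, 10*time.Millisecond)
	return quick, slow
}

func main() {
	quick, slow := demo()
	fmt.Println(quick)
	fmt.Println(slow)
}
```

With several workers, a sync.WaitGroup could take the place of the single done channel.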

When we execute our program, and then terminate it, we see the following logs.

2024/10/12 14:52:36 INFO poller entry:
2024/10/12 14:52:36 INFO getFromUrl entry:  delay-value=1
2024/10/12 14:52:37 INFO getFromUrl completed:  since-ms=1027
2024/10/12 14:52:37 INFO doing some more work, such as writing to DB:  ctx.Done()=0x140000a2000 ctx-err=<nil>
2024/10/12 14:52:38 INFO getFromUrl entry:  delay-value=1
2024/10/12 14:52:39 INFO getFromUrl completed:  since-ms=1017
2024/10/12 14:52:39 INFO doing some more work, such as writing to DB:  ctx.Done()=0x140000a2000 ctx-err=<nil>
2024/10/12 14:52:40 INFO getFromUrl entry:  delay-value=1
2024/10/12 14:52:41 INFO getFromUrl completed:  since-ms=1022
2024/10/12 14:52:41 INFO doing some more work, such as writing to DB:  ctx.Done()=0x140000a2000 ctx-err=<nil>
2024/10/12 14:52:42 INFO getFromUrl entry:  delay-value=1
^C
2024/10/12 14:52:42 ERROR httpClient.Do(req) has returned error:  err="Get \"http://0.0.0.0:80/delay/1\": context canceled" ctx-err="context canceled"
2024/10/12 14:52:42 ERROR getFromUrl(ctx, 1) has returned error:  err="Get \"http://0.0.0.0:80/delay/1\": context canceled" ctx-err="context canceled"
2024/10/12 14:52:43 INFO poller received cancellation signal, stopping execution:  since-ms=7586 ctx-err="context canceled"
2024/10/12 14:52:45 INFO completed closingCtx.=<nil>

When we press Ctrl-C, the context returned by NotifyContext is cancelled. We can also observe the cancellation propagating to the child context, as shown by the error returned from httpClient.Do(req), which uses the child context.

Request Scoped Variables

Finally, we come to a feature that is often misused, or not written in an optimal way.

Consider the following, non-optimal code.

package main

import (
 "context"
 "log/slog"
)

func main() {
  ctx := context.Background()
  ctxWithValue := context.WithValue(ctx, "key", "value")

  slog.Info("ctx",
    slog.Any("context-key", ctx.Value("key")), // nil: the value was added to the child context, not to ctx
    slog.Any("context-not-exist-key", ctx.Value("other"))) // nil: this key was never set

  slog.Info("ctxWithValue",
    slog.Any("context-key", ctxWithValue.Value("key")), // This should be logged
    slog.Any("context-not-exist-key", ctxWithValue.Value("other"))) // nil: this key was never set
}

Running this code logs the following:

2024/10/12 14:56:10 INFO ctx context-key=<nil> context-not-exist-key=<nil>
2024/10/12 14:56:10 INFO ctxWithValue context-key=value context-not-exist-key=<nil>

Remembering that contexts are immutable, and thus we can't directly add a key/value to an existing context, we can create a new child context with a key/value pair.

Best practices

How should we use context values? We should not use them as a dumping ground for every parameter we want to pass to a function.

The official documentation states:

Use context Values only for request-scoped data that transits processes and APIs, not for passing optional parameters to functions.

So a sensible use could be something like adding a trace identifier or correlation identifier at the entry point to an API handler. The trace ID is then propagated through the call chain via the ctx, and can be used for logging and observability purposes.

We should be mindful about performance and not adding too many values to a context. When we create a context WithValue, the presence of a Key and Value parameter seems to suggest we have a map, and that retrieving a value from a context should be O(1) time complexity.

However, remembering that contexts are immutable and child contexts wrap parent contexts, when we look at the source code for context we see the following.

func (c *valueCtx) Value(key interface{}) interface{} {
  if c.key == key {
    return c.val
  }
  return c.Context.Value(key)
}

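It's recursive! Each valueCtx checks its own key and otherwise delegates to its parent. (Newer Go releases implement the same walk with a loop rather than recursion.) Either way, a lookup has O(n) time complexity, where n is the number of nested contexts.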
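To make the cost of nesting concrete, the sketch below builds a chain of 100 value contexts and looks up the newest and the oldest key. It also shows that re-using a key does not overwrite anything; the newer wrapper is simply found first. The layerKey type and function names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// layerKey is a hypothetical key type used only in this sketch.
type layerKey int

// buildChain wraps base in `depth` value contexts, one key per layer.
func buildChain(base context.Context, depth int) context.Context {
	ctx := base
	for i := 0; i < depth; i++ {
		ctx = context.WithValue(ctx, layerKey(i), fmt.Sprintf("layer-%d", i))
	}
	return ctx
}

// lookupSummary looks up the newest key, the oldest key, and a shadowed key.
func lookupSummary() string {
	ctx := buildChain(context.Background(), 100)

	newest := ctx.Value(layerKey(99)) // found in the outermost wrapper
	oldest := ctx.Value(layerKey(0))  // found only after walking all 100 wrappers

	// A new wrapper with an existing key shadows the old value for its own
	// descendants, while the original context is left untouched.
	shadowing := context.WithValue(ctx, layerKey(0), "override")

	return fmt.Sprintf("%v %v %v %v",
		newest, oldest, shadowing.Value(layerKey(0)), ctx.Value(layerKey(0)))
}

func main() {
	fmt.Println(lookupSummary())
}
```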

The other concern we also have is that key and value are interface{} or any (any was introduced in Go 1.18 as an alias for interface{}).

Again, referencing the standard library documentation, we are told:

key must be comparable and should not be of type string or any other built-in type to avoid collisions between packages using context. Users of WithValue should define their own types for keys. To avoid allocating when assigning to an interface{}, context keys often have concrete type struct{}. Alternatively, exported context key variables' static type should be a pointer or interface.
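The collision-avoidance point follows from how keys are compared: two keys are equal only if both their type and their value match. In this minimal sketch, authKey and billingKey are hypothetical types standing in for keys defined by two different packages.

```go
package main

import (
	"context"
	"fmt"
)

// authKey and billingKey stand in for key types owned by two separate packages.
type authKey string
type billingKey string

// lookups stores one value under authKey("id") and then tries three keys
// that all share the underlying string "id".
func lookups() string {
	ctx := context.WithValue(context.Background(), authKey("id"), "auth-user")

	viaAuth := ctx.Value(authKey("id"))       // same type and value: found
	viaBilling := ctx.Value(billingKey("id")) // different type: not found
	viaString := ctx.Value("id")              // plain string: not found

	return fmt.Sprintf("%v %v %v", viaAuth, viaBilling, viaString)
}

func main() {
	fmt.Println(lookups())
}
```

Because an unexported key type cannot be named outside its package, no other package can read or shadow the value by accident.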

In our sample above, if we are using go-staticcheck, then we would already see the following warning about our code.

should not use built-in type string as key for value; define your own type to avoid collisions (SA1029)

As an example, we will write a simple middleware function for an HTTP server. We define a package as follows:

// Assume this is defined in the file <project root>/tracing/tracing.go
package tracing

import (
  "context"
)

type traceIdContextKeyType struct{}

var (
  traceIdContextKey = traceIdContextKeyType{}
)

func ContextWithTraceId(ctx context.Context, traceId string) context.Context {
  return context.WithValue(ctx, traceIdContextKey, traceId)
}

func TraceIdFromContext(ctx context.Context) string {
  if traceId := ctx.Value(traceIdContextKey); traceId != nil {
    value, ok := traceId.(string)
    if !ok {
      return ""
    }
    return value
  }
  return ""
}

traceIdContextKeyType and traceIdContextKey are not exported (i.e. they are private) and are thus not accessible to any code outside of this package. We then define two exported functions that add type safety: one creates a new context carrying the traceId value, and the other obtains the traceId from a context.

In our server code, we would write the following.

package main

import (
  "log/slog"
  "net/http"

  "github.com/google/uuid"

  "projectroot/tracing" // This would depend on our source control and application setup
)

func traceMiddleware(next http.Handler) http.Handler {
  return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
    ctx := tracing.ContextWithTraceId(r.Context(), uuid.New().String())
    next.ServeHTTP(w, r.WithContext(ctx))
  })
}

func messagesHandler(w http.ResponseWriter, r *http.Request) {
  traceID := tracing.TraceIdFromContext(r.Context())

  slog.Info("messages handler", slog.String("trace-id", traceID))

  // Do other business logic
  // result, err := DoThing(r.Context(), ...)

  w.WriteHeader(http.StatusOK)
  w.Write([]byte("Trace-ID logged"))
}

func main() {
  mux := http.NewServeMux()

  mux.Handle("/messages", traceMiddleware(http.HandlerFunc(messagesHandler)))

  slog.Info("Server is running on http://0.0.0.0:8080")
  if err := http.ListenAndServe(":8080", mux); err != nil {
    slog.Error("error from server", slog.Any("err", err))
  }
}

We have a basic web server that exposes a single endpoint. Our middleware (traceMiddleware) adds a trace ID to the context of each incoming request. Then, our handler messagesHandler can use the trace ID in the context, and it's free to pass that context onwards to other downstream code.
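A middleware like this can also be exercised in-process with net/http/httptest, without starting a server. The sketch below is self-contained, so it defines its own key type and takes the ID generator as a parameter to get a predictable value. withTraceID, echoTraceID, and serveOnce are hypothetical names and not part of the tracing package above.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// traceIDKey is an unexported key type local to this sketch.
type traceIDKey struct{}

// withTraceID is a hypothetical middleware variant that receives the ID
// generator as a parameter so a test can supply a fixed value.
func withTraceID(newID func() string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx := context.WithValue(r.Context(), traceIDKey{}, newID())
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// echoTraceID writes the trace ID found in the request context to the response.
func echoTraceID(w http.ResponseWriter, r *http.Request) {
	id, _ := r.Context().Value(traceIDKey{}).(string)
	fmt.Fprint(w, id)
}

// serveOnce sends one in-memory request through the middleware and handler.
func serveOnce() string {
	h := withTraceID(func() string { return "trace-123" }, http.HandlerFunc(echoTraceID))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/messages", nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(serveOnce())
}
```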

If we run our server in one terminal session, and then perform curl http://0.0.0.0:8080/messages in another session, we observe the following.

# SESSION 1
2024/10/12 15:00:49 INFO Server is running on http://0.0.0.0:8080
2024/10/12 15:00:53 INFO messages handler trace-id=ed257941-fe7d-48c9-8f71-73e9c49e8f25

# -----
# SESSION 2
curl http://0.0.0.0:8080/messages
Trace-ID logged

Although this code is very simplistic, it does demonstrate how we should use context values in an idiomatic manner.

Closing

In this blog post, we have reviewed the primary use cases for Go's context package. We have seen some simple demonstrations that show best uses, and covered some subtle details regarding performance and type safety with context values.

Lastly, in the references below, there is recommended reading material that provides additional details.

References
