Handling errors in gRPC and go-kit services

If you look up how to handle custom errors in gRPC/go-kit services, you will probably find a few sketchy instructions but no system. That’s a frustrating omission, since custom errors are useful and common in Go projects. Here’s my attempt:

Define an error message in protobuf with the fields/data needed within and across the services.
Implement Go’s error interface for the generated error so it can be passed around like other errors in Go.
Support stack traces so you can see the source of an error and understand how it came to be.
Respond with the error in a field on the response thereby exposing the error to other services.

Define a useful error message in protobuf

What makes for a useful error? To start, our error needs a message like “user not found” and a code like 404. Nested errors cover situations like validations, where an error consists of subordinate errors: the message could be something like “invalid user” with nested errors {“email”: “must be set”, “password”: “must be stronger”}. A slice of strings catches all other details. For instance, if you wanted a localized, user-facing message, you’d put it in the details.

// example.proto

message Error {
  string message = 1;
  int32 code = 2;
  map<string, string> nested_errors = 3;
  repeated string details = 4;
  bytes stack = 5 [(gogoproto.customtype) = "Stack"];
}
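To make the validation case concrete, here is a rough sketch of building such an error in Go. The `Error` struct is a hand-written stand-in for what the protobuf compiler would generate from the message above (the stack field is left out), and `newInvalidUser`, the 422 code, and the details string are assumptions for illustration only.

```go
package main

import "fmt"

// Error is a hand-written stand-in for the struct generated from the
// Error message in example.proto. The stack field is omitted here.
type Error struct {
	Message      string
	Code         int32
	NestedErrors map[string]string
	Details      []string
}

// newInvalidUser is a hypothetical helper that builds the "invalid user"
// validation error described in the text. The 422 code is an assumption.
func newInvalidUser() *Error {
	return &Error{
		Message: "invalid user",
		Code:    422,
		NestedErrors: map[string]string{
			"email":    "must be set",
			"password": "must be stronger",
		},
		// Catch-all details, e.g. a user-facing message.
		Details: []string{"Please check your email and password."},
	}
}

func main() {
	e := newInvalidUser()
	fmt.Println(e.Message, e.Code)
	for field, problem := range e.NestedErrors {
		fmt.Printf("%s: %s\n", field, problem)
	}
}
```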

Implement the error interface

By implementing the error interface, you can pass around and use the error like any other error in Go. You’ll want your functions to return your custom error as the error interface too, otherwise you’ll hit the confusing behavior described in the Go FAQ entry “Why is my nil error value not equal to nil?”:

It’s a good idea for functions that return errors always to use the error type in their signature (as we did above) rather than a concrete type such as *MyError, to help guarantee the error is created correctly. As an example, [os.Open](https://golang.org/pkg/os/#Open) returns an error even though, if not nil, it's always of concrete type [*os.PathError](https://golang.org/pkg/os/#PathError).
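The nil gotcha is easier to believe once you see it. The sketch below uses a hypothetical `MyError` type and made-up function names; it shows a nil pointer of a concrete type turning into a non-nil `error` once it is stored in the interface.

```go
package main

import "fmt"

// MyError is a hypothetical concrete error type.
type MyError struct{ msg string }

func (e *MyError) Error() string { return e.msg }

// find returns the concrete type; nil means success.
func find() *MyError { return nil }

// lookupBad stores the typed nil pointer in an error interface value.
// The interface now carries a type (*MyError), so it is not equal to nil.
func lookupBad() error {
	err := find()
	return err
}

// lookupGood returns an untyped nil on success, which compares equal to nil.
func lookupGood() error {
	if err := find(); err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(lookupBad() == nil)  // false
	fmt.Println(lookupGood() == nil) // true
}
```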

I’ve read opinions on the web, mostly from the Google camp, saying compiled protobuf code should be kept separate from your own handwritten code, which would make this API impossible in Go. Instead you’d define an error in protobuf, define an error in your code, and convert one to the other as needed. In practice, though, keeping handwritten code alongside the generated protobuf code has not posed a problem, and it enables APIs such as this one as well as conveniences like getters for commonly retrieved nested fields. So go ahead, I say.

// error.go

// assert Error implements the error interface.
var _ error = &Error{}

// Error implements the error interface.
func (e *Error) Error() string {
	b := new(bytes.Buffer)
	e.printStack(b)
	pad(b, ": ")
	b.WriteString(e.Message)
	if b.Len() == 0 {
		return "no error"
	}
	return b.String()
}
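Error() relies on a `pad` helper that isn't shown here. A plausible version, modeled on the helper of the same name in upspin's errors package, only writes the separator when the buffer already has content. Treat this as an assumption about what the helper does; `join` is a hypothetical function added just to exercise it.

```go
package main

import (
	"bytes"
	"fmt"
)

// pad appends str to the buffer only if the buffer already has content,
// so a separator never appears at the start of a message.
func pad(b *bytes.Buffer, str string) {
	if b.Len() == 0 {
		return
	}
	b.WriteString(str)
}

// join is a hypothetical helper that shows pad in use.
func join(parts ...string) string {
	b := new(bytes.Buffer)
	for _, p := range parts {
		pad(b, ": ")
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	fmt.Println(join("get user", "user not found"))
}
```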

Support stack traces

Stack traces make it easy to find the source of the errors and understand how they came to be.

/path/main.go:20: github.com/travisjeffery/example.c:
/path/main.go:16: ...b:
/path/main.go:12: ...a: shit's on fire, yo

I used gogo’s protobuf fork to add the stack field with a custom type, which isn’t supported by the official Golang protobuf. The field being public is annoying and I don’t want its contents marshaled/unmarshaled — the former I can’t do anything about because protobuf gets pissy over private fields, the latter I can fix by implementing the marshaler/unmarshaler interfaces to nop.

// stack.go

type Stack struct {
	Callers []uintptr
}

func (t Stack) Marshal() ([]byte, error) {
	return nil, nil
}

func (t *Stack) MarshalTo(data []byte) (n int, err error) {
	return 0, nil
}

func (t *Stack) Unmarshal(data []byte) error {
	return nil
}

func (t Stack) MarshalJSON() ([]byte, error) {
	return []byte(`null`), nil
}

func (t *Stack) UnmarshalJSON(data []byte) error {
	return nil
}

func (t *Stack) Size() int {
	return 0
}

I stole the code for populating and printing the stack traces from upspin, for the most part anyway. The stack is populated when the errors are created with the WrapErr and E functions and printed with the Error() function. If you wanted an error without a stack trace then you’d instantiate one yourself.
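For a feel of what populating and printing a stack involves, here is a minimal sketch built on the standard library's `runtime.Callers` and `runtime.CallersFrames`. It is not the upspin-derived code used in the example repo: `capture` and `format` are hypothetical names, and the output format is simplified to one "file:line: function" entry per frame.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// Stack mirrors the custom type from stack.go.
type Stack struct {
	Callers []uintptr
}

// capture records the program counters of the current goroutine's stack.
// With skip == 0 the first recorded frame is capture's caller.
func capture(skip int) Stack {
	pcs := make([]uintptr, 32)
	// +2 skips runtime.Callers itself and capture.
	n := runtime.Callers(skip+2, pcs)
	return Stack{Callers: pcs[:n]}
}

// format renders one "file:line: function" line per recorded frame.
func (s Stack) format() string {
	if len(s.Callers) == 0 {
		return ""
	}
	var b strings.Builder
	frames := runtime.CallersFrames(s.Callers)
	for {
		f, more := frames.Next()
		fmt.Fprintf(&b, "%s:%d: %s\n", f.File, f.Line, f.Function)
		if !more {
			break
		}
	}
	return b.String()
}

// a exists only so the captured stack has a recognizable frame.
func a() Stack { return capture(0) }

func main() {
	fmt.Print(a().format())
}
```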

WrapErr is similar to Dave Cheney’s errors.Wrap, taking an error and wrapping it with a message. I use WrapErr at the lowest level of my code to annotate errors returned by the stdlib or a third-party pkg. E takes a list of args of various types and instantiates a corresponding error. For example, E("user not found", 404) would assign the string to the Error’s message and the int to its code.

// error.go

// WrapErr returns an Error for the given error and msg.
func WrapErr(err error, msg string) error {
	if err == nil {
		return nil
	}
	e := &Error{Message: fmt.Sprintf("%s: %s", msg, err.Error())}
	e.populateStack()
	return e
}

// E is a useful func for instantiating Errors.
func E(args ...interface{}) error {
	if len(args) == 0 {
		panic("call to E with no arguments")
	}
	e := &Error{}
	b := new(bytes.Buffer)
	for _, arg := range args {
		switch arg := arg.(type) {
		case string:
			pad(b, ": ")
			b.WriteString(arg)
		case int32:
			e.Code = arg
		case int:
			e.Code = int32(arg)
		case error:
			pad(b, ": ")
			b.WriteString(arg.Error())
		}
	}
	e.Message = b.String()
	e.populateStack()
	return e
}
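Here is how the two functions might layer in practice. This is a trimmed-down sketch: `Error`, `WrapErr`, and `E` are simplified re-implementations without stack traces so the example runs on its own, and `errNoRows`, `findUser`, and `getUser` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Error is a simplified stand-in for the generated type (no stack).
type Error struct {
	Message string
	Code    int32
}

func (e *Error) Error() string { return e.Message }

// WrapErr annotates err with msg; this version skips the stack.
func WrapErr(err error, msg string) error {
	if err == nil {
		return nil
	}
	return &Error{Message: fmt.Sprintf("%s: %s", msg, err.Error())}
}

// E builds an Error from strings, ints, and errors; this version skips the stack.
func E(args ...interface{}) error {
	e := &Error{}
	var parts []string
	for _, arg := range args {
		switch arg := arg.(type) {
		case string:
			parts = append(parts, arg)
		case int:
			e.Code = int32(arg)
		case error:
			parts = append(parts, arg.Error())
		}
	}
	e.Message = strings.Join(parts, ": ")
	return e
}

// errNoRows is a hypothetical low-level error, e.g. from a database driver.
var errNoRows = errors.New("no rows")

// findUser sits at the lowest level and annotates the third-party error.
func findUser() error {
	return WrapErr(errNoRows, "query users")
}

// getUser sits one level up and attaches a message and a code.
func getUser() error {
	if err := findUser(); err != nil {
		return E("user not found", 404, err)
	}
	return nil
}

func main() {
	err := getUser().(*Error)
	fmt.Println(err.Code)    // 404
	fmt.Println(err.Message) // user not found: query users: no rows
}
```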

Respond with an error field

Errors are exposed to other services by putting an error field on each response. At the edges of the service, the endpoints, the error’s type is asserted and set on the response. If you have an HTTP service acting as a gateway to your gRPC service, you can use the error’s code to set the HTTP response’s status code.

// example.proto

service Users {
  rpc GetUser (GetUserRequest) returns (GetUserResponse) {}
}

message GetUserResponse {
  User user = 1;
  Error error = 2;
}

// endpoints.go

func makeGetUserEndpoint(userService user.Service) endpoint.Endpoint {
	return func(ctx context.Context, request interface{}) (interface{}, error) {
		req := request.(*GetUserRequest)
		user, err := userService.GetUser(req.User)
		if err != nil {
			return &GetUserResponse{Error: err.(*Error)}, nil
		}
		return &GetUserResponse{User: user}, nil
	}
}

// transport.go

r.Methods("GET").Handle("/users/{id}", httptransport.NewServer(
	endpoints.GetUserEndpoint,
	decodeGetUserRequest,
	encodeResponse,
	options...,
))

type errorer interface {
	GetError() *example.Error
}

func encodeResponse(ctx context.Context, w http.ResponseWriter, response interface{}) error {
	// Headers must be set before WriteHeader; changes made afterwards are ignored.
	w.Header().Set("Content-Type", "application/json; charset=utf-8")
	if e, ok := response.(errorer); ok {
		if err := e.GetError(); err != nil {
			w.WriteHeader(int(err.Code))
		}
	}
	return json.NewEncoder(w).Encode(response)
}

See it all together

I’ve pushed an example to GitHub so you can check everything out.

This workflow works swell, but if you’ve got opinions on something better I’d like to hear them.

Why not metadata/return the error to gRPC

I tried writing an API where I’d return the error to gRPC rather than setting it on the response. Middleware on the server would encode the error to metadata and middleware on the client would decode the error and return it, making the gRPC call work exactly like a local function call.

// example.proto

service Users {
  rpc GetUser (GetUserRequest) returns (GetUserResponse) {}
}

message GetUserResponse {
  User user = 1;
  // Error's gone
}

// endpoints.go

func makeGetUserEndpoint(userService user.Service) endpoint.Endpoint {
	return func(ctx context.Context, request interface{}) (interface{}, error) {
		req := request.(*GetUserRequest)
		user, err := userService.GetUser(req.User)
		if err != nil {
			return nil, err
		}
		return &GetUserResponse{User: user}, nil
	}
}

// client.go

resp, err := usersClient.GetUser(ctx, &GetUserRequest{Email: "email@example.com"})
if err != nil {
	...
}

The problem is go-kit doesn’t give you an opportunity to encode the error to the context. If your endpoint returns an error, go-kit will send it straight to gRPC, which will only know to take the Error() string of the error, missing the rest of the data. The downside of not supporting this is your client has to check for two errors when it makes a request: the error returned by the call and the error field on the response. I’d prefer go-kit support it, but it seems it’s not gonna happen.

–

Please say hi at @travisjeffery.

Thanks for reading.