Many applications rely on gRPC to connect services, but a number of modern load balancers still do not support HTTP/2 and, in turn, gRPC. In an earlier blog post, we showed a way to take advantage of the gRPC-Web protocol to circumvent this issue. That solution works well for gRPC calls that do not stream from the client (unary and server-streaming calls). With the new approach described here, we can also support client-streaming and bidirectional-streaming calls.

In our earlier writing, we briefly mentioned that WebSockets may actually help us resolve our client/bidi-streaming problem. Since we published that article, we have been able to implement a solution using WebSockets, which we discuss here.

gRPC over WebSocket

The WebSocket protocol is ideal for our needs: it is HTTP/1.x compatible, supported by many modern load balancers, and client/bidi-streaming capable.

Luckily, a comprehensive specification is available for the gRPC protocol, so we have been able to transcode gRPC requests/responses into WebSocket messages without any guesswork. The workflow is as follows:

  1. The client initiates a gRPC request to the server
  2. The client initiates a WebSocket connection with the server
  3. The server accepts the WebSocket connection
  4. The client transcodes the gRPC request on the fly and sends it via the WebSocket connection
  5. The server reads the request off the WebSocket connection and responds via the same connection
  6. The client reads the response off the WebSocket connection
  7. The server closes the WebSocket connection upon completion

Similar to our gRPC-Web “downgrade” implementation, we opt for spawning a local HTTP/2 client-side proxy to handle the transcoding and WebSocket connection. While this approach does add an extra hop in the network communication (albeit via a local in-memory pipe, net.Pipe), it is less invasive than modifying gRPC client-library code. We also opt for using a modified server handler which does not add an extra hop in the network communication. An example workflow follows:

[Figure: example gRPC-over-WebSocket workflow]
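To illustrate why the extra hop through net.Pipe is cheap, here is a minimal sketch of two endpoints talking over an in-memory pipe. The function name roundTrip and the payload are hypothetical and are not taken from the library; the sketch only shows that net.Pipe gives us a connected pair of net.Conn values without opening a socket.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// roundTrip is a hypothetical helper: it writes payload on one end of an
// in-memory pipe and returns everything read from the other end.
func roundTrip(payload []byte) ([]byte, error) {
	// net.Pipe returns two connected, synchronous, in-memory net.Conn values.
	clientSide, proxySide := net.Pipe()
	defer proxySide.Close()

	writeErr := make(chan error, 1)
	go func() {
		_, err := clientSide.Write(payload)
		clientSide.Close() // closing this end makes the reader see io.EOF
		writeErr <- err
	}()

	got, err := io.ReadAll(proxySide)
	if err != nil {
		return nil, err
	}
	return got, <-writeErr
}

func main() {
	got, err := roundTrip([]byte("ping"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("proxy side received %q\n", got)
}
```

In a real setup, one end would be handed to the gRPC client as its transport and the other end would be served by the local proxy.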

Transcoding Requests

The gRPC protocol defines a request as a stream of bytes defined as follows in ABNF form (note: we will show only the subset of definitions required for this discussion):

Request -> Request-Headers *Length-Prefixed-Message EOS

Length-Prefixed-Message -> Compressed-Flag Message-Length Message

Compressed-Flag -> 0 / 1 # encoded as a 1-byte unsigned int

Message-Length -> {message length} # 4-byte big-endian unsigned int

We won’t worry about the Request-Headers, as they remain untouched and are simply forwarded along the wire with the initial WebSocket connection request. Instead, we will focus on Length-Prefixed-Message, which is the request’s body.

We cannot wait and send the entire series of length-prefixed messages as one payload, as the request may be client-streaming. So, we opt for a different approach: send a new WebSocket message one Length-Prefixed-Message at a time. We know how long each message is, as it is length-prefixed, so we may just simply forward each message along the WebSocket connection.

(Side note: we can rely on intermediate proxies to respect WebSocket message boundaries and to not buffer individual WebSocket messages. A WebSocket message can be split up into multiple WebSocket frames, but that is an implementation detail.)

The last concern is to find a way to signal the server that we are done sending messages. The gRPC protocol handles this step by setting the HTTP/2 END_STREAM flag on the final HTTP/2 DATA frame. However, Go’s HTTP/2 library does not give us access to low-level constructs such as HTTP/2 flags or any other part of the HTTP/2 framing. To signal completion, we take inspiration from the gRPC-Web protocol.

Recall that a gRPC request is simply a sequence of bytes, and the body is a series of Length-Prefixed-Messages. The first byte of each Length-Prefixed-Message represents the Compressed-Flag, which will be set to either 0 or 1. This construct leaves us with seven unused bits.

The gRPC-Web protocol uses the most-significant bit (MSB) in the Compressed-Flag byte to indicate that a message contains trailing metadata. We adopt the same approach for the data sent by the client: the final message sent by the client will always have the MSB set. Since the client does not send any trailing metadata, the End-of-Stream (EOS) message is uniquely identified by having the MSB in the first byte set and a Message-Length of zero.
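As a minimal sketch of the end-of-stream marker described above, the following shows how such a message could be built and recognized. The names eosMessage and isEOS are hypothetical and chosen for illustration; the only facts taken from the text are that the MSB of the first byte is set and that the Message-Length is zero.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	// headerLen covers the Compressed-Flag (1 byte) and Message-Length (4 bytes).
	headerLen = 5
	// msbFlag is the most-significant bit of the Compressed-Flag byte.
	msbFlag byte = 0x80
)

// eosMessage returns the 5-byte end-of-stream marker: MSB set, length zero.
func eosMessage() []byte {
	return []byte{msbFlag, 0, 0, 0, 0}
}

// isEOS reports whether msg is exactly a header with the MSB set in the
// first byte and a Message-Length of zero.
func isEOS(msg []byte) bool {
	if len(msg) != headerLen {
		return false
	}
	return msg[0]&msbFlag != 0 && binary.BigEndian.Uint32(msg[1:]) == 0
}

func main() {
	fmt.Println(isEOS(eosMessage()))             // true
	fmt.Println(isEOS([]byte{0, 0, 0, 0, 0}))    // false: an empty data message
	fmt.Println(isEOS([]byte{1, 0, 0, 0, 1, 7})) // false: a compressed data message
}
```

An empty, uncompressed data message has the same length field but a first byte of 0, so the two cannot be confused.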

Transcoding Responses

The gRPC protocol defines a response as follows in ABNF form (note: we will show only the subset of definitions required for this discussion):

Response -> (Response-Headers *Length-Prefixed-Message Trailers) / Trailers-Only

Trailers -> Status [Status-Message] *Custom-Metadata

Trailers-Only -> HTTP-Status Content-Type Trailers

The process for transcoding responses is very similar to transcoding requests. However, instead of sending the headers as normal HTTP response headers, we send all bytes of the response, headers included, as part of the WebSocket stream. The WebSocket protocol is a handshake-based protocol, so the actual HTTP response headers, which initiate the WebSocket connection, are sent immediately upon connection creation. The gRPC server may delay sending header metadata arbitrarily, so the header metadata must be sent via the stream.

Now we have to deal with Trailers, as the Length-Prefixed-Message is the same as before. Trailers (trailing metadata) are handled in exactly the same way as in the gRPC-Web protocol: they are sent as a normal response message with the MSB set in the leading byte. The sole difference is that the message is sent via a WebSocket stream, not as part of a normal HTTP response body.

(Side note: Again, we can rely on intermediate proxies to respect WebSocket message boundaries and to not buffer individual WebSocket messages, as some proxies may do with chunked HTTP/1.x responses.)

Updates to our Open-Source Repository

We recently announced we open-sourced our gRPC via HTTP/1 repository, and we have since updated the repository with the new functionality presented here.

Using the Library in Your Client Code

As mentioned in the previous post, we export a function ConnectViaProxy, which replaces the typical grpc.DialContext. The function signature is the same as before, but we have added a new ConnectOption, UseWebSocket. When set, the library will use WebSockets to connect to the server; otherwise, we default to the gRPC-Web “downgrades.”

(Side note: we introduced two ConnectOptions previously: ForceHTTP2 and ExtraH2ALPNs. These options are ignored when WebSockets are enabled.)

An example usage is:

ctx := context.Background()
targetAddr := "https://my.example.com"
tlsClientConf := &tls.Config{}

// Traditional, non-proxy client
// cc, _ := grpc.DialContext(ctx, targetAddr, grpc.WithTransportCredentials(credentials.NewTLS(tlsClientConf))...)

// With the proxy client
cc, _ := client.ConnectViaProxy(ctx, targetAddr, tlsClientConf, client.UseWebSocket(true)...)

echoClient := echo.NewEchoClient(cc)

...

Why not always use WebSockets?

Though we could use WebSockets for all types of gRPC workflows (streaming and non-streaming), WebSockets come with their own baggage: they require an initial handshake, which adds latency, and they are not compatible with a standard gRPC server.

We leave the choice between the two approaches, gRPC-Web “downgrades” and WebSockets, to the user. We recommend the gRPC-Web “downgrades” if the only requirement is to support unary requests, and WebSockets otherwise. The gRPC-Web “downgrade” solution does support server-streaming requests, since HTTP/1.x supports them; however, an intermediate proxy could choose to buffer chunked responses, which would break server-streams.

(Side note: choosing the WebSockets approach means WebSockets will be used even when no HTTP/2-incompatible intermediate proxy is present. The gRPC-Web “downgrade” solution takes an adaptive approach and only downgrades if an incompatible intermediate proxy exists.)

In the future, we would like to add auto-sensing logic to detect the presence of a proxy and whether it supports server-streaming. We would then be able to automatically determine which solution to use for a given request.

Using the Library in Your Server Code

With these updates, the CreateDowngradingHandler signature needs no changes. However, it’s important to note that if the user wishes to use WebSockets, then the downgrading handler must be used, as traditional gRPC servers do not support WebSockets. Users are still not required to use the downgrading server handler when the client library does not use WebSockets.

An example usage is:

grpcSrv := grpc.NewServer()
echo.RegisterEchoServer(grpcSrv, echoService{})
// 0 means the port is chosen for us.
lis, _ := net.Listen("tcp", "127.0.0.1:0")

// Traditional, non-downgrading-capable gRPC server
// grpcSrv.Serve(lis)

// With downgrading-capable gRPC server, which can also handle HTTP.
downgradingSrv := &http.Server{}

var h2Srv http2.Server
_ = http2.ConfigureServer(downgradingSrv, &h2Srv)
// h2c.NewHandler takes the handler and the HTTP/2 server configuration.
downgradingSrv.Handler = h2c.NewHandler(server.CreateDowngradingHandler(grpcSrv, http.NotFoundHandler()), &h2Srv)

downgradingSrv.Serve(lis)

Conclusion

If you find yourself wanting to use gRPC in an environment where only HTTP/1.x traffic is allowed, you no longer need to fret! We have added WebSocket support to our already open-sourced library, expanding its coverage to client-streaming and bidirectional-streaming calls, and we welcome users and contributors.