The Ballerina Language and Platform Support for WebSockets

WebSocket is a communication protocol used for efficient full-duplex communication between web browsers and servers over Transmission Control Protocol (TCP).

In this article, we will take a look at the history of the technologies used in dynamic websites. Then, we will introduce WebSocket as the modern approach in fulfilling these requirements while fixing the shortcomings of earlier techniques.

We will use the Ballerina language to demonstrate how you can effectively use the WebSocket features.

The Dynamic Web: Looking Back

HTTP is commonly used for typical request/response scenarios. Using JavaScript, the XMLHttpRequest object (and now the Fetch API) made it possible to send requests from the client to servers in the background. This allows us to execute data operations without refreshing or loading another web page.

However, this didn’t support the need for server push scenarios, where requests are initiated from the server and sent to the client. So people came up with workarounds to make it possible.

A couple of those options are polling and long polling.

Regular polling works by creating a new HTTP connection that sends a request to the server looking for new updates. If there is any communication that needs to be done from the server to the client, the server will at this point return the message to the client. In the event there is nothing new, the server will reply saying so. Following the response from the server, the connection will be closed.

Figure 1 shows the high-level operations of polling.

Figure 1: HTTP Regular Polling

Here, after each poll request, we wait for a specific interval to limit the number of requests repeatedly sent to the server. However, this interval also sets the maximum delay in delivering a message from the server to the client: if the server has data to send during the wait, the client only picks it up in the next poll cycle. This delay can be avoided by using long polling.
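To make the delay concrete, here is a minimal sketch that computes how long each server-side event waits before a polling client sees it. The function name, the event times, and the 5-second interval are all hypothetical, and the model assumes polls happen at exact multiples of the interval and that requests take no time.

```javascript
// Sketch: delivery delay of server events under regular polling.
// Assumes the client polls at t = 0, intervalMs, 2 * intervalMs, ...
// and that each request completes instantly (a simplification).
function pollingDelays(eventTimesMs, intervalMs) {
  return eventTimesMs.map((t) => {
    // The first poll at or after the event time picks the event up.
    const nextPoll = Math.ceil(t / intervalMs) * intervalMs;
    return nextPoll - t;
  });
}

// Hypothetical times (in ms) at which data becomes available on the server.
const delays = pollingDelays([0, 100, 4900, 5001], 5000);
console.log(delays); // [ 0, 4900, 100, 4999 ]
```

An event that arrives just after a poll waits almost the full interval, which is the worst case described above.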

In long polling, we follow a similar approach to regular polling, but rather than the server immediately returning a response to the client’s request, it blocks the request until it is ready to send some data to the client.

This process is shown below in Figure 2.

Figure 2: HTTP Long Polling

In this approach, the client initiates a request to the server, and the server holds on to the request until it has data to communicate to the client. The waiting that the client did in regular polling now happens on the server side, so the server can respond to the client as soon as data is available. Once the server has sent a response to the client, the client initiates another request immediately and repeats the same flow.

Also, for long polling, we will use a persistent (keep-alive) HTTP connection, since there is no need to close it as we are always in contact with the server. Compared to regular polling, long polling is a much better approach for real-time communication, since it allows instant communication from the server to the client. However, we need to keep a dedicated HTTP connection active from the client to the server, which is inevitable for this type of communication.

We have now seen how to use an HTTP-based API’s request/response flow, which is half-duplex, to emulate a full-duplex communication channel. Libraries such as Socket.IO can do exactly this, abstracting out the internal operation details and providing the user with an easy-to-use API.

So, if we can use long polling for our operations, what more do we need to improve?

Answer: data efficiency and the processing overhead that servers incur. Every HTTP request carries a set of header values, which becomes a significant data overhead for clients that perform many requests with small payloads. The solution for this is the WebSocket transport.
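As a rough illustration of this overhead, the sketch below compares the size of a small HTTP request head with the size of a WebSocket frame header. The HTTP request shown is a hypothetical, deliberately small example (real browsers usually send more headers), and the frame header sizes follow the frame layout described later in this article.

```javascript
// Hypothetical minimal HTTP request head for sending a 6-byte payload.
const httpRequestHead = [
  "POST /updates HTTP/1.1",
  "Host: example.com",
  "Content-Type: text/plain",
  "Content-Length: 6",
  "Accept: */*",
  "",
  "",
].join("\r\n");

// Size in bytes of a WebSocket frame header: 2 base bytes, plus 2 or 8 bytes
// of extended payload length, plus 4 bytes of masking key when masked.
function frameHeaderSize(payloadLength, masked) {
  let size = 2;
  if (payloadLength > 65535) {
    size += 8; // 64-bit extended payload length
  } else if (payloadLength > 125) {
    size += 2; // 16-bit extended payload length
  }
  if (masked) {
    size += 4; // client-to-server frames carry a masking key
  }
  return size;
}

const httpBytes = new TextEncoder().encode(httpRequestHead).length;
console.log("HTTP request head bytes:", httpBytes);
console.log("WebSocket frame header bytes:", frameHeaderSize(6, true));
```

Even with this stripped-down request, the HTTP head is many times larger than the frame header, and that cost is paid on every message.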

HTTP to WebSocket

WebSocket provides a low-latency communication protocol based on TCP.

The protocol works outside of the HTTP request/response model and adds only minimal framing when sending and receiving messages. WebSocket traffic is handled by the same HTTP servers, because a WebSocket connection reuses the TCP connection that was opened for an HTTP request. This also has the added advantage of being more compatible with infrastructure components such as proxies and firewalls that are already configured to allow the ports HTTP uses.

Let’s take a look at how a WebSocket connection is created via the WebSocket handshake. The HTTP protocol’s upgrade feature is used to do this. A WebSocket client sends an HTTP request with the following header values set:

Connection: Upgrade - The header value here signals to the server that we need to switch the protocol to something else.
Upgrade: websocket - This names the protocol to switch to in the upgrade operation.
Sec-WebSocket-Key: Randomly generated 16-byte value (base64 encoded) - This is used by the client and the server in the WebSocket handshake to explicitly understand that they are requesting a WebSocket connection. The server does a known calculation on this value and returns a new value, which can be verified by the client. This resultant value will be provided in the “sec-websocket-accept” HTTP header in the server handshake response.
Sec-WebSocket-Protocol: The WebSocket sub-protocols that we wish to use - Here, we can provide a list of the possible protocols to use after the WebSocket connection is established. The server selects a single supported protocol and returns it to the client. Some examples include “xml”, “json”, and “mqtt”.
Sec-WebSocket-Version: 13 - The most recent WebSocket protocol version is 13.

A sample WebSocket client handshake request is shown below:

GET /ws/echo HTTP/1.1
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: RmNX6Mdh0cnQqmc8am4nng==

The corresponding server handshake response is shown below:

HTTP/1.1 101 Switching Protocols
upgrade: websocket
connection: upgrade
sec-websocket-accept: 2JZViLsWMSpKcCBZum79UYSVbu0=
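
The “known calculation” the server performs on the key is defined in RFC 6455: append a fixed GUID to the key, take the SHA-1 hash, and base64-encode the result. Below is a minimal Node.js sketch of that calculation. The function name is our own, and the key used in the example is the one from the RFC rather than the one in the sample above.

```javascript
const { createHash } = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the handshake calculation.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Computes the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key:
// base64(SHA-1(key + GUID)).
function computeAcceptValue(secWebSocketKey) {
  return createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Key and expected result are the example values from RFC 6455.
console.log(computeAcceptValue("dGhlIHNhbXBsZSBub25jZQ=="));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client can run the same calculation on the key it sent and compare the result with the “sec-websocket-accept” header to verify the handshake.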

At this point, the TCP connection is not working with the HTTP protocol anymore, but rather it has switched to communicating with the WebSocket protocol.

The WebSocket protocol transmits data as a sequence of frames. A single WebSocket message can be split across multiple frames, which is known as fragmentation. The general format of a frame is shown below.

0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+

Let’s take a look at a few of the most important bits of the frame header. The first is “FIN”, which denotes whether the current frame is the final frame in a sequence of frames. The “opcode” bits represent the type of the frame. The table below lists the opcodes and their descriptions.

Opcode/Description

0 : A continuation frame - it continues a fragmented message and has the same type as the first frame of that message
1 : A text frame
2 : A binary frame
3 - 7 : Reserved
8 : Connection close
9 : Ping - a control frame used as a heartbeat to check the liveness of the peer. The receiver should reply with a “Pong” message.
10 : Pong - The reply to the “Ping” message
11 - 15 : Reserved
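
To see how these header fields fit together, here is a minimal sketch of decoding a single frame from raw bytes. It assumes the whole frame is already in memory and that the payload is shorter than 2^32 bytes. The function name is our own, and the sample bytes are the masked text frame containing "Hello" from RFC 6455, where opcode 1 denotes a text frame.

```javascript
// Sketch: decode one complete WebSocket frame held in a Uint8Array.
function decodeFrame(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fin = (bytes[0] & 0x80) !== 0;    // first bit of byte 0
  const opcode = bytes[0] & 0x0f;         // low 4 bits of byte 0
  const masked = (bytes[1] & 0x80) !== 0; // first bit of byte 1
  let length = bytes[1] & 0x7f;           // low 7 bits of byte 1
  let offset = 2;

  if (length === 126) {
    length = view.getUint16(2);           // 16-bit extended length, big-endian
    offset = 4;
  } else if (length === 127) {
    length = view.getUint32(6);           // low 32 bits of the 64-bit length
    offset = 10;
  }

  let payload;
  if (masked) {
    // The 4-byte masking key precedes the payload; unmask with XOR.
    const key = bytes.slice(offset, offset + 4);
    payload = bytes
      .slice(offset + 4, offset + 4 + length)
      .map((b, i) => b ^ key[i % 4]);
  } else {
    payload = bytes.slice(offset, offset + length);
  }
  return { fin, opcode, masked, payload };
}

// Masked text frame carrying "Hello" (example bytes from RFC 6455).
const frame = decodeFrame(new Uint8Array([
  0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58,
]));
console.log(frame.fin, frame.opcode, new TextDecoder().decode(frame.payload));
// true 1 Hello
```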

Now that we know the basics of how the WebSocket protocol works, let’s take a look at how to write applications using it.

Creating WebSocket Services

In this section, we will take a look at how to implement WebSocket-based services using the Ballerina programming language. Ballerina provides an easy-to-use services abstraction as a first-class language concept.

In WebSocket services, the user needs to be aware of the following primary events:

Connection creation
Data message
Control message
Connection error
Connection close

Each of the events above is delivered to the user through its own remote function in a Ballerina service.

Connection Creation

This state is reached when the WebSocket client establishes a connection after a successful handshake operation. At this moment, the following remote function is called if it is available in the service.

remote function onOpen(websocket:Caller caller);

This remote function provides us with an instance of a websocket:Caller object, which can be used to communicate back with the WebSocket client. A general pattern is to save the caller object when the connection is created; whenever the application wants to send messages to the connected clients, it can use the stored caller objects for the communication.

Let’s look at an example of this, where we implement an HTTP service resource to broadcast a message to all the connected WebSocket clients.

import ballerina/http;
import ballerina/websocket;
 
service /ws on new websocket:Listener(8080) {
 
   resource function get .(http:Request req)
                           returns websocket:Service|
                                   websocket:Error {
       return new WsService();
   }
 
}
 
websocket:Caller[] callers = [];
 
service class WsService {
 
   *websocket:Service;
 
   remote function onOpen(websocket:Caller caller) {
       callers.push(caller);
   }
 
}
 
service /broadcaster on new http:Listener(8081) {
   resource function post broadcast(@http:Payload {} string payload)
                                    returns string|error? {
       foreach var targetCaller in callers {
           check targetCaller->writeTextMessage(payload);
       }
       return "OK";
   }
}

In the code above, the first service is bound to a WebSocket listener. The resource function at /ws signals to the system that it is executing an HTTP upgrade to the WebSocket protocol. After the protocol upgrade is done, the WsService service will assume the functionality of a WebSocket service.

Let’s run the code above to test out the scenario:

$ bal run ws_open.bal
Compiling source
        ws_open.bal
Running executables

[ballerina/http] started HTTP/WS listener 0.0.0.0:8080
[ballerina/http] started HTTP/WS listener 0.0.0.0:8081

Let’s open multiple web browser tabs and open the developer tools JavaScript console in each. Here, enter the following lines to create a WebSocket client that connects to our server and registers a callback to print any message received from the server.

var ws = new WebSocket("ws://localhost:8080/ws");
ws.onmessage = function(frame) {console.log(frame.data)};

Now let’s send an HTTP request to the “broadcaster” service we have deployed, in order to send messages to the WebSocket clients that were stored in our application.

$ curl -d "Hello!" http://localhost:8081/broadcaster/broadcast

We can now see the message given here in all the browser tabs we opened with the WebSocket clients.

Sub-Protocol Handling

When a WebSocket connection is created, we can provide a list of sub-protocols that the client can handle, in priority order. This is done in the following manner when the WebSocket client is created.

var ws = new WebSocket("ws://localhost:8080/ws", ["xml", "json"]);

The sub-protocols are given in the second parameter of the WebSocket constructor, which can be either a single string value or an array of strings. In the statement above, we are requesting either “xml” or “json” to be used as the protocol.

On the server side, a service can be configured to support zero or more sub-protocols. When a client requests sub-protocols, the server walks the client’s list in priority order and checks each entry against the protocols supported by the given service. If it finds a match, it returns this first-matched protocol to the client.
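
The selection rule described above can be sketched in a few lines of JavaScript. This only illustrates the matching logic and is not Ballerina’s actual implementation; the function name is hypothetical.

```javascript
// Sketch: pick the first client-requested sub-protocol that the server supports.
// clientProtocols is in the client's priority order; returns null if none match.
function negotiateSubProtocol(clientProtocols, serverProtocols) {
  const supported = new Set(serverProtocols);
  for (const protocol of clientProtocols) {
    if (supported.has(protocol)) {
      return protocol;
    }
  }
  return null;
}

console.log(negotiateSubProtocol(["xml", "json"], ["mqtt", "json"])); // json
```

Note that the order of the server’s list does not matter in this sketch; only the client’s priority order decides which match wins.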

The server-side configuration of sub-protocols is done using the websocket:ServiceConfig annotation and its “subProtocols” field. An example of this usage is shown below, where we update our earlier /ws service to negotiate a sub-protocol and print the selected one when the connection opens.

// Note: this version also needs `import ballerina/io;` alongside the earlier imports.
@websocket:ServiceConfig {
   subProtocols: ["mqtt", "json"],
   idleTimeoutInSeconds: 120
}
service /ws on new websocket:Listener(8080) {
 
   resource function get .(http:Request req)
                           returns websocket:Service|
                                   websocket:Error {
       return new WsService();
   }
 
}
 
service class WsService {
 
   *websocket:Service;
 
   remote function onOpen(websocket:Caller caller) {
       callers.push(caller);
       io:println("Negotiated sub-protocol: ",
                   caller.getNegotiatedSubProtocol());
 
   }
 
}

Here, we have configured our service to support the “json” and “mqtt” sub-protocols. Let’s consider the scenario of a WebSocket client created with the “xml” and “json” sub-protocols, as shown in the example above. This will print the following in the standard output of the service when a connection is created.

$ bal run ws_open.bal
Compiling source
        ws_open.bal
Running executables

[ballerina/http] started HTTP/WS listener 0.0.0.0:8080
[ballerina/http] started HTTP/WS listener 0.0.0.0:8081
Negotiated sub-protocol: json

The service has negotiated to use the “json” protocol since the client’s highest priority “xml” is not supported, but the second option of “json” is available as a supported sub-protocol in our service.

Data Message

A data message is received when a WebSocket client sends either a text or a binary message to a WebSocket service. The following remote functions are called, if available in the service, to handle text and binary messages respectively.

remote function onTextMessage(websocket:Caller caller, string text);
 
remote function onBinaryMessage(websocket:Caller caller, byte[] data);

To demonstrate the functionality above, let’s write a simple WebSocket service class that will echo the message we send to it:

service class WsService {
 
   *websocket:Service;
 
   remote function onOpen(websocket:Caller caller) {
       callers.push(caller);
       io:println("Negotiated sub-protocol: ",
                   caller.getNegotiatedSubProtocol());
 
   }
 
   remote function onTextMessage(websocket:Caller caller,
                                 string text) returns error? {
       check caller->writeTextMessage("Echo: " + text);
   }
 
   remote function onBinaryMessage(websocket:Caller caller,
                                   byte[] data) returns error? {
       check caller->writeBinaryMessage(data);
   }
 
}

$ bal run ws_demo.bal
Compiling source
        ws_demo.bal
Running executables

[ballerina/http] started HTTP/WS listener 0.0.0.0:8080

Now the program is compiled and the service is up and running at port 8080. We can open up the developer tools in a web browser, such as Firefox or Chrome, and type in the following statements to create a WebSocket client and send some data to the server.

var ws = new WebSocket("ws://localhost:8080/ws");
ws.onmessage = function(frame) {console.log(frame.data)};
ws.send("Hello!");

Executing the statements above prints the following message in the console:

Echo: Hello!

This is the response returned from our WebSocket service.

Control Message

WebSocket defines two control messages for checking connection health: “ping” and “pong”. A WebSocket server or a client can send a “ping” message, and the opposite side should respond with a corresponding “pong” message by returning the same payload sent with the “ping” message. These ping/pong sequences are used as a heartbeat mechanism to check if the connection is healthy.

The user does not need to explicitly control these messages as they are handled automatically by the services and clients. But if required, we can override the default implementations of the ping / pong messages. This is done by providing implementations to the following remote functions in a WebSocket service.

remote function onPing(websocket:Caller caller, byte[] data);
 
remote function onPong(websocket:Caller caller, byte[] data);

An example implementation of the ping/pong functions is shown below:

remote function onPing(websocket:Caller caller,
                       byte[] data) returns error? {
    io:println(string `Ping received with data: ${data.toBase64()}`);
    check caller->pong(data);
}
 
remote function onPong(websocket:Caller caller,
                       byte[] data) {
    io:println(string `Pong received with data: ${data.toBase64()}`);
}
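
At the protocol level, answering a ping means sending back a pong frame that carries the same payload. The sketch below builds such frames by hand to show what the runtime does for us; it handles only unmasked (server-to-client) control frames, and the function names are our own.

```javascript
const OPCODE_PING = 0x9;
const OPCODE_PONG = 0xa;

// Builds an unmasked control frame. Control frame payloads are limited to
// 125 bytes, so the length always fits in the 7-bit length field.
function encodeControlFrame(opcode, payload) {
  if (payload.length > 125) {
    throw new RangeError("control frame payload must be at most 125 bytes");
  }
  const frame = new Uint8Array(2 + payload.length);
  frame[0] = 0x80 | opcode;  // FIN = 1: control frames are never fragmented
  frame[1] = payload.length; // MASK = 0, followed by the 7-bit length
  frame.set(payload, 2);
  return frame;
}

// Replies to an unmasked ping frame with a pong carrying the same payload.
function pongFor(pingFrame) {
  const length = pingFrame[1] & 0x7f;
  return encodeControlFrame(OPCODE_PONG, pingFrame.slice(2, 2 + length));
}

const ping = encodeControlFrame(OPCODE_PING, new TextEncoder().encode("Hello"));
const pong = pongFor(ping);
console.log(Array.from(pong, (b) => b.toString(16).padStart(2, "0")).join(" "));
// 8a 05 48 65 6c 6c 6f
```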

Connection Error

In the event of an error in the WebSocket connection, the connection will be automatically closed by generating the required connection close frame. The following remote function can be implemented in the service to receive the notification that this is going to happen and perform any possible cleanup or custom logging operations.

remote function onError(websocket:Caller caller, error err);

Connection Close

When the connection is closed from the client side, the service is notified through the remote function below:

remote function onClose(websocket:Caller caller, int statusCode,
                        string reason);

A sample implementation of this remote function, which logs the information about the connection closure, is shown below:

remote function onClose(websocket:Caller caller, int statusCode,
                        string reason) {
    io:println(string `Client closed connection with ${statusCode} because of ${reason}`);
}
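
The status code and reason come from the body of the close frame: per RFC 6455, the first two bytes are the status code as a big-endian integer and the rest is a UTF-8 reason string. Here is a minimal sketch of parsing that body. The function name and the sample reason text are our own, and the payload is assumed to be already unmasked.

```javascript
// Sketch: parse the (unmasked) payload of a WebSocket close frame.
function parseClosePayload(payload) {
  if (payload.length < 2) {
    // A close frame may have an empty body. RFC 6455 reserves 1005 to mean
    // "no status code was present".
    return { statusCode: 1005, reason: "" };
  }
  const statusCode = (payload[0] << 8) | payload[1]; // big-endian 16-bit value
  const reason = new TextDecoder().decode(payload.slice(2));
  return { statusCode, reason };
}

// 0x03e8 is 1000, the status code for a normal closure.
const closePayload = new Uint8Array([0x03, 0xe8, ...new TextEncoder().encode("done")]);
console.log(parseClosePayload(closePayload));
// { statusCode: 1000, reason: 'done' }
```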

Securing WebSocket Communication

Whenever possible, we should use WebSocket with TLS. This ensures that our data is encrypted as it travels through the network. To support this, we configure a secure socket for our WebSocket listener. This is the listener used in the WebSocket upgrade, so the upgrade happens on a TCP connection that is already protected by TLS.

Afterward, in our WebSocket client, we can use the “wss” protocol scheme to connect to a secure WebSocket server. An example of a secure WebSocket client initialization is shown below:

var ws = new WebSocket("wss://localhost:8443/ws");

Let’s update our initial WebSocket echo service to enable TLS on the communication channel.

As a prerequisite, we need to first create a certificate to be used in the service and the client. In this scenario, we will be generating a self-signed certificate by using the OpenSSL tool.

$ openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem
$ openssl pkcs8 -topk8 -nocrypt -in key.pem -out pkcs8_key.pem

import ballerina/http;
import ballerina/websocket;
 
websocket:ListenerConfiguration wsConf = {
  secureSocket: {
      certFile: "/path/to/cert.pem",
      keyFile: "/path/to/pkcs8_key.pem"
  }
};

service /ws on new websocket:Listener(8443, config = wsConf) {
 
   resource function get .(http:Request req)
                           returns websocket:Service|
                                   websocket:Error {
       return new WsService();
   }
 
}
 
websocket:Caller[] callers = [];
 
service class WsService {
 
   *websocket:Service;
 
   remote function onTextMessage(websocket:Caller caller,
                                 string text) returns error? {
       check caller->writeTextMessage("Echo: " + text);
   }
 
}

$ bal run ws_messages.bal
Compiling source
        ws_messages.bal
Running executables
 
[ballerina/http] started HTTPS/WSS listener 0.0.0.0:8443

As we can see in the code above, we simply created a WebSocket listener by providing the ListenerConfiguration, which contains the secure socket parameters.

Since we are using a self-signed certificate in this example, web browsers will generally reject secure WebSocket connections. So let’s write a Ballerina WebSocket client to make a connection and send requests to the service above:

import ballerina/websocket;
import ballerina/io;
 
websocket:ClientConfiguration wsConf = {
    secureSocket: {
        trustedCertFile: "/path/to/cert.pem"
    }
};
 
public function main() returns error? {
    websocket:Client wsClient = check new ("wss://localhost:8443/ws",
                                            config = wsConf);
    check wsClient->writeTextMessage("Hello!");
    string resp = check wsClient->readTextMessage();
    io:println("Response: ", resp);
}

$ bal run ws_client.bal

Compiling source
        ws_client.bal

Running executable

Response: Echo: Hello!

In the code above, we have created a websocket:Client by providing the websocket:ClientConfiguration value containing the secure socket parameters. From here onwards, any communication done from the client to the WebSocket server will be done with TLS.

Summary

In this article, we have delved into the historical techniques we used to implement a dynamic web experience for web pages, and introduced WebSockets as a modern approach in full-duplex communication between web pages and servers.

We provided an overview of the Ballerina language and platform support for WebSockets, where the language’s services abstraction maps intuitively onto the events defined for WebSocket communication.

For more information on the Ballerina language and its features, refer to the resources in the Ballerina Learn Page.

Previously published here.