Rethinking Programming: Network-Aware Type System

Written by lafernando | Published 2020/04/04


Introduction

In a programming language, the type system gives you the foundation for representing data in your programs and implementing logic. It provides the means of creating abstractions for the solutions you are building. Some programming languages provide only the most basic functionality, while others offer built-in functionality for specialized domains, such as data frames for data processing jobs, as found in languages such as Python and R, and vectors and complex numbers for numerical computing, as seen in Matlab and Octave.
But how about the domain of networked-services programming? This is not an area that has been ventured into frequently. In the current development landscape, more and more functionality is available in the cloud. This transforms our implementation of business logic into mostly service integration through the network. Basically, our application logic and integration code are converging.
There will no longer be heavy-weight centralized ESBs intercepting all application requests to make integration decisions elsewhere; rather, the endpoints themselves have become smarter. This inversion of responsibility provides much more power and control to the developer and ultimately results in increased productivity and reduced release cycle times.
As the developer takes on the added responsibility of working with networked resources in their code, it is critical that the programming language itself aids in this work. This involves aspects such as network communication resiliency, discovery, data mapping, transformation, and more.
An effective type system can greatly help in the core operations of these features. Ballerina is a programming language that was built from the ground up with these requirements in mind. It has a network-friendly type system that helps the developer a great deal in the domain of network-distributed programming. In the following sections, we will take a look at this in detail.

Statically Typed and Structural

Ballerina is a statically typed language, which means the types of all variables are resolved at compile-time. Languages such as JavaScript and Python are dynamically typed: their variables do not have a specific type assigned, so any value can be assigned to them. Java and C++ are statically typed; as in Ballerina, their types are checked at compile-time and only compatible values can be assigned. Statically typed languages are generally more robust and easier to debug, and they aid in creating better tooling for the language. This is a tradeoff between flexibility for the developer and safety. Ballerina has chosen the path of code safety and better productivity in the long run.
Also, Ballerina’s type system is structural, as opposed to a nominal type system. This means the type compatibility is figured out by considering the structure of the value. In a nominal type system, this is bound to the name of the actual type. Languages such as Java, C++, and C# follow a nominal type system. 
A type in Ballerina works with an abstraction of a value called a shape. A shape basically ignores the storage identity of a value. This means it does not take into account the storage location of a value when it is compared with other values. Let’s understand this more using a record type in Ballerina. Records in Ballerina have storage identity, which means, the respective variables will store a reference to the value rather than storing the actual value in the variable.
This is comparable to references in Java, or pointers in C++. Basic simple types in Ballerina, such as int, float, and boolean, do not have a storage identity and their values are directly stored in the variables.
type DoorState record {|
    boolean open;
    boolean locked;
|};
Let’s create some values of the above DoorState record type. 
DoorState v1 = { open: false, locked: true };
DoorState v2 = { open: false, locked: true };
DoorState v3 = { open: false, locked: true };
The three variables above (v1, v2, and v3) all represent a single state of the door: closed and locked. Nonetheless, we have created three different values, and each variable refers to its own distinct memory location. In this manner, we can create an unlimited number of values of the DoorState type.
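The difference between a value's content and its storage identity can be observed with Ballerina's two equality operators. The following is a minimal sketch, assuming the DoorState record defined above; the helper functions sameShape and sameReference and the variable v1Ref are hypothetical names used only for illustration. The == operator compares the content of two records, while === checks whether both operands refer to the same stored value.

```ballerina
import ballerina/io;

type DoorState record {|
    boolean open;
    boolean locked;
|};

// Deep equality: compares field values and ignores storage identity.
function sameShape(DoorState a, DoorState b) returns boolean {
    return a == b;
}

// Reference equality: true only when both refer to the same stored value.
function sameReference(DoorState a, DoorState b) returns boolean {
    return a === b;
}

public function main() {
    DoorState v1 = { open: false, locked: true };
    DoorState v2 = { open: false, locked: true };
    DoorState v1Ref = v1; // copies the reference, not the record

    io:println(sameShape(v1, v2));        // true
    io:println(sameReference(v1, v2));    // false
    io:println(sameReference(v1, v1Ref)); // true
}
```

Two separately constructed records with the same field values are equal under == but not under ===, which matches the idea that they are different values sharing the same content.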
If we ignore the storage identity of the above values, we simply get the representation of the data they hold, which is { open: false, locked: true }. This is a single shape of the type DoorState. In this way, there are four possible shapes for DoorState. These four shapes are represented by the values referenced in the following variables.
DoorState ds1 = { open: true, locked: true };
DoorState ds2 = { open: true, locked: false };
DoorState ds3 = { open: false, locked: true };
DoorState ds4 = { open: false, locked: false };
A type in Ballerina represents a set of the possible shapes it can have. Any value that has one of the above four shapes is considered to be of the type DoorState.
Figure 1: Set of shapes of the type DoorState
Figure 1 above shows the set of shapes that defines the type DoorState.
Ballerina follows a semantic subtyping mechanism. It is defined by means of shapes: S is a subtype of T if the shapes representing S are a subset of the shapes representing T. Let’s demonstrate this behavior with a few examples.
The type boolean is a simple basic type in Ballerina without a storage identity, so its values are equivalent to its shapes. Therefore, the boolean type is defined as having two shapes: true and false. The boolean type’s shapes can be defined in set notation as sb = { true, false }. This can be visualized as seen in Figure 2 below.
Figure 2: Set of shapes of the type boolean
Now, according to our subtyping rules, we can derive new types based on the boolean type by creating subsets of its shapes. For example, a new type we can create is boolean_false, where its only supported shape/value would be false. The new type is shown in Figure 3 below.
Figure 3: Shapes sets of types boolean and boolean_false
The new type boolean_false can be defined in Ballerina code in the following manner:
type boolean_false false;
Here, we are using the value false in defining the new type boolean_false. In a more practical scenario, we can provide multiple values as a union in defining new types. A type created with a single value is called a singleton type. This new type can be used in the code in the following way.
boolean_false bv1 = false;
boolean bv2 = true;
bv2 = bv1;
Here, we defined a new variable bv1 of type boolean_false, to which we can assign only the value false, since this is the only value supported by the type. We also declare a variable bv2 of type boolean, and later on, we assign the value of bv1 to bv2. This assignment is possible because bv1’s type is a subtype of bv2’s type. In simple terms, all the values that can be held in the variable bv1 can be held in the variable bv2, thus the assignment is possible.
The example above is not really useful in a practical scenario, but it demonstrates the basic idea of subtyping. For a more real-life example, let’s create a new type for holding the day of a month. 
A regular int’s value space is between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807. So the set which represents the shapes/values contains all the integers between those numbers. The new type we create will contain integer values from 1 to 31.
type day_of_month 1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31;
The Ballerina code above defines the new type day_of_month as a union type.
A union type in Ballerina is created by using the syntax T1|T2. This means the type T1|T2 represents the union of the value spaces of the types T1 and T2. The day_of_month type above is defined as a union of 31 singleton types, one for each day of the month. The following code snippet shows the usage of the new type.
day_of_month dm1 = 10;
day_of_month dm2 = 14;
day_of_month dm3 = 31;
int dm4 = dm3;
Here, we see how the day_of_month type is used in defining variables that can only store values from 1 to 31. It also shows how a day_of_month value can be assigned to an int variable: day_of_month is a subtype of int since its value space is a subset of int’s value space.
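Assignment in the opposite direction, from int to a narrower type, cannot be done directly because not every int belongs to the subtype. The following is a minimal sketch of how such a value could be checked first. It uses a smaller hypothetical type Quarter and a hypothetical helper toQuarter, neither of which appears in the original example, and relies on the is type test.

```ballerina
import ballerina/io;

// Hypothetical type for illustration: a union of four singleton types.
type Quarter 1|2|3|4;

// Returns the value as a Quarter when it belongs to that type, or nil otherwise.
function toQuarter(int n) returns Quarter? {
    if n is Quarter {
        // The check above guarantees that this cast cannot fail.
        return <Quarter> n;
    }
    return ();
}

public function main() {
    Quarter q = 2;
    int asInt = q; // always allowed: Quarter is a subtype of int
    io:println(asInt);                   // 2
    io:println(toQuarter(3) is Quarter); // true
    io:println(toQuarter(7) is ());      // true
}
```

The same pattern would apply to day_of_month: an arbitrary int has to be tested against the type before it can be treated as a day of the month.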
We have now seen how Ballerina’s subtyping works in relation to simple types. Let’s take a look at creating subtypes of records by revisiting our DoorState scenario. Here, we are going to create a new type EmergencyDoorState, where the locked field always has to have the value false. The resultant types and their shapes can be seen below in Figure 4.
Figure 4: Shapes sets of types DoorState and EmergencyDoorState
The type definition of EmergencyDoorState is shown below:
type EmergencyDoorState record {|
    boolean open;
    boolean_false locked = false;
|};
In the above type, we have changed the field locked to be of type boolean_false, which allows it to hold only the value false. Here, we have also made use of default values for record fields in Ballerina.
So, in this case, the user does not always have to provide the value of the field; in the absence of an explicit value, the default value given here is used.
In this manner, the type EmergencyDoorState can only have the shapes { open: true, locked: false } and { open: false, locked: false }. These two shapes form a subset of the DoorState shapes set, thus EmergencyDoorState is a subtype of DoorState.
The following code snippet shows a sample usage of the EmergencyDoorState type.
EmergencyDoorState eds1 = { open: true };
DoorState eds2 = eds1;
io:println("Door - Open: ", eds2.open, " Locked: ", eds2.locked);
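One practical consequence of this subtype relationship is that a function written against DoorState also accepts EmergencyDoorState values. The following is a minimal sketch, assuming the type definitions from this section; the function describe and the variable fireExit are hypothetical names introduced only for illustration.

```ballerina
import ballerina/io;

type boolean_false false;

type DoorState record {|
    boolean open;
    boolean locked;
|};

type EmergencyDoorState record {|
    boolean open;
    boolean_false locked = false;
|};

// Written against the supertype only; it knows nothing about the subtype.
function describe(DoorState state) returns string {
    string openText = state.open ? "open" : "closed";
    string lockText = state.locked ? "locked" : "unlocked";
    return openText + ", " + lockText;
}

public function main() {
    // locked is not given, so its default value false is used.
    EmergencyDoorState fireExit = { open: true };
    io:println(describe(fireExit)); // open, unlocked
}
```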

Benefits of a Structural Type System

Having a structural type system is mainly beneficial when you have multiple systems interacting with each other, since data exchange and type compatibility can be resolved more easily. Let’s dive into an example that shows this behavior using Ballerina’s integrated query functionality.
type Result record {|
    string name;
    string college;
    int grade;
|};
...
Result[] results = from var person in persons
                     // let must come before where so that lgrade is in scope
                     let int lgrade = (grades[person.name] ?: 0),
                         string targetCollege = "Stanford"
                     where lgrade > 75
                     select {
                       name: person.name,
                       college: targetCollege,
                       grade: lgrade
                     };
In the example above, we filter records from a list of records and create new values using the select clause, where the structure of each value is defined on the spot as it is created. These values are assigned to an array of Result records, which is possible because the generated values are structurally compatible with the Result record type.
In situations such as the above, a separate system from our core application may be generating the values that we consume. In these cases, instead of worrying about sharing the code for the data type definitions and so on, you can simply concentrate on the compatibility of the data in order to ensure interoperability.
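The same idea can be shown without queries. The following is a minimal sketch in which two record types are defined independently, as if by two unrelated systems, and are still compatible because they describe the same set of shapes. The types GeoPoint and Coordinates and the function toCoordinates are hypothetical and not part of the original example.

```ballerina
import ballerina/io;

// Hypothetical types, imagined as being defined by two unrelated systems.
type GeoPoint record {|
    float lat;
    float lon;
|};

type Coordinates record {|
    float lat;
    float lon;
|};

function toCoordinates(GeoPoint p) returns Coordinates {
    // No conversion code is needed: both types have the same set of shapes.
    return p;
}

public function main() {
    GeoPoint colombo = { lat: 6.93, lon: 79.86 };
    Coordinates c = toCoordinates(colombo);
    // Still the same stored value, only viewed through a different type.
    io:println(c === colombo); // true
}
```

In a nominal type system, the two types would be unrelated despite having identical fields, and explicit mapping code would be needed to move a value from one to the other.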

Open-by-Default

Ballerina’s open-by-default concept is built around the Robustness Principle. Following it, we should design network-aware programs to accept all of the data that is sent to them and make the best effort to understand it, but when sending data, make the best effort to conform to the standard protocols that were agreed upon beforehand. This strategy gives us the best chance of interacting with different systems in a reliable manner.
The main facilitator of this in the type system is the open record concept in Ballerina. So far we have looked at closed records. Let’s take a look at a record type to represent the details of a person.
type Ethnicity "Asian"|"White"|"African American"|"Native American/Alaskan Native"|"Pacific Islander"|"Native Hawaiian";
type Person record {
    string name;
    int birthYear;  
    boolean married = false;
    Ethnicity ethnicity?;
};
Here, the type Person is an open record type, defined with an inclusive-record-type-descriptor by using the “{” and “}” delimiters. This is the default behavior when defining record types in Ballerina. An open record is not limited to the fields that are declared in the record type; we can also set additional fields that are not explicitly mentioned in the type definition.
The earlier DoorState record type was defined explicitly as a closed record type with an exclusive-record-type-descriptor by using the “{|” and “|}” delimiters in the definition. Because it was a closed record, we were able to list out all the possible shapes in the DoorState type. If this type were defined as an open record, we would have an infinite number of shapes, since DoorState values could have any arbitrary field set in the code.
The Person record type above has an optional field, ethnicity. In a record type, a field name with the suffix “?” makes it an optional field. This means a Person record can be created without setting a value for ethnicity. Later on, this field can be accessed using the “?.” operator, which returns a value of type Ethnicity?, which is equivalent to the union type Ethnicity|(). In Ballerina, the nil value and its type are both represented by ().
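The behavior of open records, default values, and optional fields can be tried out together in a short program. The following is a minimal sketch, assuming the Person and Ethnicity definitions above; the helper countryOf and the extra field country are hypothetical and used only for illustration. The return type anydata|error is chosen conservatively, on the assumption that the exact type of undeclared fields may differ between language versions.

```ballerina
import ballerina/io;

type Ethnicity "Asian"|"White"|"African American"|"Native American/Alaskan Native"|"Pacific Islander"|"Native Hawaiian";

type Person record {
    string name;
    int birthYear;
    boolean married = false;
    Ethnicity ethnicity?;
};

// Reads a field that is not declared in Person; the result is nil when absent.
function countryOf(Person p) returns anydata|error {
    return p["country"];
}

public function main() {
    // "country" is not declared in Person, but the open record accepts it.
    Person p = { name: "Tim Kern", birthYear: 1995, "country": "Japan" };

    Ethnicity? eth = p?.ethnicity; // the optional field was never set
    io:println(eth is ());         // true
    io:println(p.married);         // false (the default value)
    io:println(countryOf(p));      // Japan
}
```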
Let’s create a new type Student, which will be a subtype of the Person type.
type Student record {
    string name;
    int birthYear;  
    boolean married;
    Ethnicity ethnicity?;
    string college;
};
The Student type defined above has an extra field college of type string compared to the Person type. All the possible shapes in the Student type are included in the set of shapes in Person as well.
This is possible because the Person type is an open type, and its shapes can have the string field called college as well. If we were to make the Person type a closed record, Student would no longer be a subtype of Person.
Sample usage of the above types is shown below (io:println requires import ballerina/io; at the top of the file):
public function main() {
   Student s1 = { name: "Tom", birthYear: 1990, married: false,
                  college: "Yale" };
   Student s2 = { name: "Anne", birthYear: 1988, married: true,
                  ethnicity: "White", college: "Harvard" };
   Person p1 = s1;
   Ethnicity? eth = p1?.ethnicity;
   if eth != () {
       io:println("P1's ethnicity: ", eth);
   } else {
       io:println("P1's ethnicity: N/A");
   }
   Person p2 = s2;
   eth = p2?.ethnicity;
   if eth != () {
       io:println("P2's ethnicity: ", eth);
   } else {
       io:println("P2's ethnicity: N/A");
   }
   io:println(p2);
}
$ ballerina run sample.bal 
P1's ethnicity: N/A
P2's ethnicity: White
name=Anne birthYear=1988 married=true ethnicity=White college=Harvard

Network Communication with Data Binding

In this section, we will take a look at how the type system features for records are used in implementing network data binding operations with structural validation, data type handling, and payload passthrough operations.
The functionality is demonstrated using an HTTP service in Ballerina (the listing assumes the ballerina/http and ballerina/io modules are imported at the top of the file):
http:Client asianRecordsDB = new("http://example.com/");
@http:ServiceConfig {
   basePath: "/"
}
service RecordService on new http:Listener(8080) {
   @http:ResourceConfig {
       path: "/record",
       body: "entry"
   }
   resource function process(http:Caller caller, http:Request request,
                             Person entry) returns error? {
       if entry?.ethnicity == "Asian" {
           io:println("Asian Record: ", entry);
           json jsonPayload = check json.constructFrom(entry);
           _ = check asianRecordsDB->post("/store",
                                          <@untainted> jsonPayload);
       } else {
           io:println("Non-Asian Record: ", entry);
       }
       check caller->respond();
   }
}
$ ballerina run sample.bal 
[ballerina/http] started HTTP/WS listener 0.0.0.0:8080

Test Scenarios

Scenario 1
A request is sent without the married field. Since the record type defines a default value for married, this default is set in the resultant record value. The ethnicity field is also left unset, which is allowed since it is marked as optional.
Request: curl -d '{ "name": "John Little",  "birthYear": 1855 }' http://localhost:8080/record
Terminal output: Non-Asian Record: name=John Little birthYear=1855 married=false
Scenario 2
A request is sent with a string value given for the integer field birthYear. The service validates the value for the field and the request fails.
Request: curl -d '{ "name": "John Little",  "birthYear": "1855 BC" }' http://localhost:8080/record
Response: data binding failed: error {ballerina/lang.typedesc}
ConversionError message='map<json>' value cannot be converted to 'Person'
Scenario 3
A request is sent with the optional ethnicity field also set.
Request: curl -d '{ "name": "Sunil Perera",  "birthYear": 1950, "married": true, "ethnicity": "Asian" }' http://localhost:8080/record
Terminal output: Asian Record: name=Sunil Perera birthYear=1950 married=true ethnicity=Asian
Scenario 4
A request is sent with a value that is not a member of the Ethnicity union type. This is validated by the service and the request fails.
Request: curl -d '{ "name": "Tim Kern",  "birthYear": 1995, "ethnicity": "Japanese", "country": "Japan", "zipcode": "98101" }' http://localhost:8080/record
Response: data binding failed: error {ballerina/lang.typedesc}
ConversionError message='map<json>' value cannot be converted to 'Person'
Scenario 5
A request is sent with additional fields not explicitly mentioned in the Person type. Since Person is an open record type, the service accepts these extra fields and these are available to be passed through to other systems, e.g. a forwarding service.
Request: curl -d '{ "name": "Tim Kern",  "birthYear": 1995, "ethnicity": "Asian", "country": "Japan", "zipcode": "98101" }' http://localhost:8080/record
Terminal output: Asian Record: name=Tim Kern birthYear=1995 married=false ethnicity=Asian country=Japan zipcode=98101
The above executions demonstrate how the Ballerina type system works hand-in-hand with the network data handling functionality in providing an intuitive experience for developers. 

Summary

In this article, we looked at how the unique features of the Ballerina type system enable us to model network communication patterns in an intuitive way that enables maximum productivity for developers. The seamless integration with the data types and the context awareness allows Ballerina programs to do effective network data operations. 
For a more extensive analysis of the type system, refer to the Ballerina language specification.
Various examples of using built-in data types, such as JSON/XML, and other network-based functionality can be found in the Ballerina by Example pages.

Written by lafernando | Software architect and evangelist @ WSO2 Inc.
Published by HackerNoon on 2020/04/04