If I were to invent a programming language for the 21st century

Written by okaleniuk | Published 2017/05/20
Tech Story Tags: programming | programming-languages | software-development | software-engineering | cobol


The updated version of this piece is available on Words and Buttons Online: https://wordsandbuttons.online/if_i_were_to_invent_a_programming_language_for_the_21st_century.html

A bunch of languages have already appeared in the 21st century; Swift, Kotlin, and Go are probably the most popular of them. However, the distinct feature of 21st-century language design is the absence of any distinct features in the languages themselves. The best thing about any of them is that you can spend a weekend and claim to have learned a shiny new thing without actually learning anything new. There is nothing new in them at all; they are all built by the “something done right” formula, that something being Objective-C, Java, or C.

While “not being new” is indeed a valuable trait in its own right, the question arises: are these really languages for the 21st century, or merely a reflection of 20th-century bad programming habits?

If I were to invent a language, I wouldn’t try to fix the past. I would try to invent a thing that both fits well in the reality of the modern world, and that could evolve properly. If this requires radical design decisions — then so be it.

1. Down with the syntax!

The syntax of modern languages evolved from the freedom of chalk and blackboard put into the shackles of ASCII. While some elements of notation, like arithmetic signs and brackets, are more or less idiomatic, others were made up for no reason at all apart from saving the effort of pressing teletype buttons.

Typing is not an issue anymore, and we are not obliged to play guessing games with our syntax. Things like (($:@(<#[), (=#[), $:@(>#[)) ({~ ?@#)) ^: (1<#) are indeed concise and expressive, and also fun to write. But they do not help readability and, what’s even more important, googleability and stackoverflowability.
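
For contrast, here is the same idea spelled out in Python: a recursive quicksort around a random pivot, which is what that J-like line expresses, as far as I can read it. The function name and the list-comprehension style are my own choices, not a canonical translation.

import random

def quicksort(items):
    # One element or fewer: nothing left to partition.
    if len(items) < 2:
        return items
    pivot = random.choice(items)   # pick a random pivot
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

Longer, yes, but every word in it can be typed into a search engine.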

The same goes for cryptic function names, return-code conventions, and attributes with obscure meanings. They served us well in the past, saving punch-card and display space, but now they deserve retirement.

Ultimately this:

FILE * test_file = fopen("/tmp/test.txt", "w+");

Should become something like this:

create file /tmp/test.txt for input and output as test_file

We don’t need all those brackets, quotes, asterisks, and semicolons (unless they help us express things). Syntax highlighting can do the job of syntax notation just fine.
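
For scale, a mainstream 21st-century language already drops most of that ceremony. This one-line Python counterpart is my own comparison, not part of the proposal, and it still keeps a mode string you memorize rather than read:

# Roughly the same operation as the C line above: open (and truncate)
# a file for both reading and writing.
test_file = open("/tmp/test.txt", "w+")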

Things that are cheap in the 21st century: parsing time, computer memory, online search. Things that are not: development time, programmer’s memory, effort spent online clarifying notation. This kind of coding leans on the cheap things to get things done.

2. Down with the native types!

You probably know this as one of the JavaScript WATs.

> 10.8 / 100
0.10800000000000001

It isn’t, of course, a JavaScript-specific WAT. In fact, it is not a WAT at all; it is perfectly correct behavior backed by the well-respected IEEE 754 standard. It’s just how floating-point numbers are implemented in almost every language. And it’s actually not that bad, considering we are trying to squeeze an infinite amount of real numbers into 32, 64, or even 256 bits.

What mathematicians consider impossible, engineers do by trading off possibility for sanity. IEEE floating-point numbers are in fact not floating point and not even numbers at all. Math requires real-number addition to be associative; floats and doubles do not always hold this property. Math requires the real numbers to include all the integers; this is not true even for a float and a uint32_t of the same size. Math requires the real numbers to have a zero element. Well, at least in this regard IEEE numbers exceed expectations: they have two zero elements instead of one.
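
These quirks are easy to observe in any language that uses IEEE 754 doubles; here is a minimal check in Python, with numbers picked purely for illustration:

# Addition is not associative: grouping changes the last bit.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False

# Not every integer survives the trip into a double.
print(float(2**53 + 1) == 2**53)                # True: the + 1 got rounded away

# And there are two zeros.
print(0.0 == -0.0, str(0.0), str(-0.0))         # True 0.0 -0.0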

And it’s not only about floating-point numbers; native integers are not much better. Do you know what happens when you add up two 16-bit numbers like these?

0xFFFF + 0x0001

Well, nobody actually knows. Intuition tells us that the overflowed number should simply be 0, but this is not guaranteed by any worldwide standard; it’s just how it usually goes with C on Intel x86-family processors. The addition may also saturate at 0xFFFF, or trigger an interrupt, or store some special bit in a special place signaling that the overflow happened.

It is not specified at all; it just differs. While floating-point numbers are merely moderately insane, native integers are entirely unpredictable.
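
The usual wrap-around is easy to imitate, though. Python’s own integers are arbitrary-precision and never overflow, so the 16-bit behavior has to be simulated by hand; this is a sketch of the common case, not of any guaranteed semantics:

import ctypes

# Masking to 16 bits reproduces the wrap-around you usually get from
# unsigned 16-bit arithmetic in C on x86.
print(hex((0xFFFF + 0x0001) & 0xFFFF))         # 0x0

# ctypes applies the same truncation when the value is stuffed into an
# actual 16-bit unsigned integer; no overflow checking is done.
print(ctypes.c_uint16(0xFFFF + 0x0001).value)  # 0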

What I would propose for numeric computations instead is fixed-point, arbitrary-sized data types with standard-defined behavior on underflow, overflow, and precision loss. Something like this:

1.000 / 3.000 = 0.333
0001 + 9999 = overflowed 9999
0.001 / 2 = underflowed 0

Of course, you don’t have to actually write all the trailing zeros; they should be implied by the data type definition. And you should be able to select the maximum and minimum bounds for the type yourself, not just inherit them from the processor architecture.
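
Nothing exotic is needed to get a feel for this. Here is a minimal sketch using Python’s standard decimal module; the bounds below (four significant digits, nothing above 9999, at most three fractional digits) are my own toy choice, and decimal is not truly fixed-point, but overflow, underflow, and precision loss are all defined by a standard and reported through flags:

from decimal import Decimal, localcontext, Overflow

with localcontext() as ctx:
    ctx.prec = 4                 # four significant digits
    ctx.Emax = 3                 # anything above 9999 overflows
    ctx.Emin = 0                 # at most three fractional digits; smaller values underflow
    ctx.traps[Overflow] = False  # record overflow in a flag instead of raising

    print(Decimal("1.000") / Decimal("3.000"))  # 0.333    -- precision loss
    print(Decimal("0001") + Decimal("9999"))    # Infinity -- overflow
    print(Decimal("0.001") / 2)                 # 0.000    -- underflow
    # Every limit that was crossed is recorded in the context's flags.
    print(sorted(s.__name__ for s, hit in ctx.flags.items() if hit))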

But wouldn’t it work much slower then? Yes, it would. But realistically, how often do you have to program high-performance computations? Unless you work in research or engineering that genuinely requires them, in which case you’d be using specialized hardware and compilers anyway, probably not often. I’ll just presume a general 21st-century programmer doesn’t have to solve differential equations very often.

That being said, shouldn’t the fast, complex, and unpredictable native types from the past be an option rather than the default?

3. Down with the metalanguaging!

There are brilliant, wonderful languages designed not to do the task, but to create languages to do the task: Racket, Rebol, and Forth, to name a few. I love them all; they are a pure delight to play with. But as you might guess, being fun is not exactly what makes a language universally popular.

Language leverage, the ability to create new sub-languages for the task at hand, is a great power, and it pays off vastly when you work in research all on your own. Unfortunately, if you have to write code for other people to understand, you have to teach them your language as well. And that’s when it gets ugly.

People are generally interested in getting things done, not in learning a language they’d have to forget anyway once the things are done. For other people, learning your language is just an effort that would hardly pay off, while learning something common and standardized is an investment for life. So people would rather reinvent your language using standard tools than learn it. And there you go: countless dialects for a single domain; people arguing about aesthetics, ideology, architecture, and all the things that are irrelevant; millions of lines of code written just to be forgotten in months.

Lisp guys went through all of that in the 80s. They figured out that the more of the practical part of a language is standardized — the better. And they came up with Common Lisp.

And it’s huge. The INCITS 226-1994 standard is 1,153 pages long; it was only beaten by the C++ standard, ISO/IEC 14882:2011, at 1,338 pages. C++ has a bag of heritage to drag along, though; it was not always that big. Common Lisp was created huge from scratch.

But a language should not be anecdotally huge either. Not at all. It just should have a decent standard library filled with all the goodies people would otherwise have to reinvent themselves.

It is difficult to balance hugeness and applicability; we had to learn that with C++ the hard way. So, in fact, I think the language for the 21st century should be more domain-specific than not. And since business applications are currently the biggest mess, perhaps it should address those, not game development or web design.

So…

The language for the 21st century should be business-oriented, English-like, and not dependent on any native types.

Wait a minute... Did I just reinvent COBOL?

Yes, I did.

[Image: a punched card. Photo by Rainer Gerhards, via Wikimedia Commons (GFDL / CC BY-SA).]

I deliberately described COBOL’s features here as ultra-modern and promising to show you one thing: language features don’t write the code. You do.

It’s naïve to think that the language is responsible for the quality of the code, and that by adding or removing some bells and whistles we can automatically make everything better. We were not happy with Fortran and COBOL, so we invented C++ and Java, only to become unhappy with them too some 20-30 years later.

I feel like the issue here is much more about sociology and psychology than about actual programming. Are we really unhappy with the languages? Aren’t we unhappy with all the software in general? Windows is vulnerable, Studio is sluggish, and Vim is impossible to quit. That’s what is really disappointing, not the creative process per se.

But we have to blame something, and being the software engineers partially responsible for this world of crappy software, we wouldn’t blame ourselves, would we? So let’s blame the tools instead! Let us reinvent COBOL again and again until one day the sun shines, the birds sing, and Windows boots in 2 seconds.

Probably not going to happen.


So if I were to invent a programming language for the 21st century, I would reinvent being responsible instead. I would reinvent learning your tools; I would reinvent being attentive to essential details and merciless toward accidental complexity. All the stuff that actually matters.

