How to Create the Next Great Programming Language

Written by thosakwe | Published 2017/02/18
Tech Story Tags: programming | javascript | programming-languages | compilers | startup


Many of us have, at some point, dreamt of creating a programming language that redefines the way we develop software. And most of us have also come to accept the reality that such a feat, if not entirely impossible, is very difficult to accomplish. Over the past few years, I have read a lot about languages and compilers, and I have identified a list of components that helped the most popular and powerful programming languages become what they are today. While it’s still highly unlikely that you’ll create the next C or Java, you have no realistic chance of achieving such a lofty goal without paying attention to the following list.

So, without further ado… Let’s begin!

#0: The Right Choices are Made Beforehand

Every project needs a defined direction, and if you don’t know what the point of your language is, you’ll end up nowhere. Ask yourself the following questions to determine the worth and scope of your new language:

Why are you designing an entirely new language?

Which problem(s) (ideally a lot more than just one) will be solved by re-inventing the wheel? Are the benefits of your language promising enough to convince businesses and developers to move from established, more mature tools to the ones you are going to build?

How will you finance your project?

If your project is open-source, where will you receive funding? Donations? Will you be backed by a major company? Or will you be supported solely by your motivation to create a viable language?

What is the target of your language?

Web development? Embedded systems? General-purpose?

What styles will your language facilitate?

Will you support multiple programming paradigms, or will you force developers to fit into one?

How are you going to spread the word about your language?

What will you do to promote your effort, and garner public support?

#1: Familiarity and Accessibility

Let’s be honest — nobody wants to learn an entirely new syntax just for the sake of being able to produce a program in your language. Try to adhere to general conventions that appear across a wide array of languages. Many languages have adopted features of C syntax, such as curly braces, parentheses for functions, and keywords such as if or for. Something like the following, while exaggerated, is a total rejection of established conventions, and as a result is hard both to read and write:

FUNCTION main
input <INTEGER argc, STRING argv()>
output <INTEGER>
[
    std>>cout:"Hello, world!" + std>>endl;
]

You don’t want to be too verbose, either (looking at you, Java!):

public static function main
requires input(int, char**)
produces output(int) {
    using System and its "out" property, call println with ("Hello, world!");
    exit with result 0;
}

A language also needs to be easily accessible to anybody who uses it. Whereas a platform-specific tool such as MASM only needs to distribute a Windows installer, a language designed to run on every major operating system needs to provide headache-free installation options for each one. For example, to get started developing PHP on a new computer, all you need to do is run the suitable installer for your OS, and open a text editor.

If you support Windows, it never hurts to provide a good development experience. Ruby famously sucks on Windows (mostly because nobody uses it on Windows), and Dart doesn’t even provide an official installer for Windows.

#2: Actively Maintained

Commit velocities of a few Web languages and tools (credit: 池田 泰延)

Pascal was a great language back in its heyday. So was Ada. Fortress, too! So why aren’t they popular in 2017? A big part of the answer is maintenance. Fortress development was wound down in 2012, and although Pascal and Ada compilers are still being worked on (Free Pascal and GNAT, for example), neither language evolves at the pace, or with the visibility, of today’s mainstream languages. Once a codebase goes static, it can no longer reflect what happens in the future, whether that is a technological advancement or a change in consumer needs. Whichever errors you come across in your development, you are stuck with, or have to work around on your own.

Active maintenance means GitHub issues don’t sit stale for months or years at a time, and it also means that developers and companies can have more confidence relying on your tools. And as an added bonus, people will want to use your project because they can see effort is still being made to keep it up to date!

#3: Fail-fast and Descriptive Error Messages

Elm’s friendly error messages.

Everybody can agree that runtime errors suck. They are costly, hard to track down, and in most cases, entirely preventable. A fail-fast system works to diagnose runtime errors before they ever occur. Ultimately, this can save you time, headaches and money. The better your language toolset is at detecting and preventing errors, the more attractive it will be to new developers.

Elm’s success as a Web development language can be attributed in part to the descriptive error messages produced by its compiler. Not only does it detect type mismatches, but it even detects misspelled variable names. The more detailed the error messages are, the easier it is to mitigate bugs before they even reach your application.
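To make the "misspelled variable name" idea concrete, here is a minimal sketch in Go of how a compiler might produce a "did you mean ...?" hint. It is not Elm’s actual algorithm; it simply assumes the compiler has a list of names in scope and picks the closest one by Levenshtein edit distance. The names `editDistance` and `suggest`, and the cutoff of two edits, are hypothetical choices made for illustration.

```go
package main

import "fmt"

// editDistance returns the Levenshtein distance between a and b,
// counted in runes so that non-ASCII identifiers are handled correctly.
func editDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// suggest returns the known name closest to unknown, provided it is
// within maxDist edits; it returns "" when nothing is close enough.
func suggest(unknown string, known []string, maxDist int) string {
	best, bestDist := "", maxDist+1
	for _, k := range known {
		if d := editDistance(unknown, k); d < bestDist {
			best, bestDist = k, d
		}
	}
	return best
}

func main() {
	scope := []string{"username", "password", "userId"} // hypothetical names in scope
	name := "usernmae"
	if s := suggest(name, scope, 2); s != "" {
		fmt.Printf("error: cannot find variable `%s`\n  hint: did you mean `%s`?\n", name, s)
	}
}
```

A real compiler would probably also weigh scope and type information, but even this small heuristic turns a bare "unknown identifier" into an actionable message.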

#4: Type Safety

Type safety makes it easier for a language to fail fast. How many times have you seen an error like this?

NoSuchMethodError: Class 'Wtf' has no instance method 'IsThisNonsense'.
Receiver: Instance of 'Wtf'
Tried calling: IsThisNonsense()

In statically typed languages, such as Java, errors like this can almost always be caught by static analysis at compile time; the main exceptions are escape hatches such as reflection and unchecked casts.

The debate over statically typed vs. dynamically typed languages will likely never end, but I personally recommend static type-checking. If you can catch every type error at compile time, you won’t need to add the overhead of runtime type checks to your final products.
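As a rough illustration of the difference, the Go sketch below contrasts a statically checked method call with a call made through an untyped value, where the check has to happen while the program runs. The `Greeter` type and the `callGreetDynamic` helper are hypothetical names invented for this example.

```go
package main

import "fmt"

// Greeter is a hypothetical type with a single method.
type Greeter struct{ Name string }

func (g Greeter) Greet() string { return "Hello, " + g.Name }

// callGreetDynamic mimics a dynamically typed call: the concrete type of v
// is unknown until runtime, so the method lookup can fail while running.
func callGreetDynamic(v interface{}) (string, error) {
	g, ok := v.(interface{ Greet() string })
	if !ok {
		return "", fmt.Errorf("%T has no method Greet", v)
	}
	return g.Greet(), nil
}

func main() {
	g := Greeter{Name: "world"}

	// Statically checked: the compiler knows Greeter has a Greet method.
	fmt.Println(g.Greet())

	// g.IsThisNonsense() // would be rejected at compile time: no such method

	// Dynamically checked: the mistake only surfaces when this line runs.
	if _, err := callGreetDynamic(42); err != nil {
		fmt.Println("runtime error:", err)
	}
}
```

The commented-out line never reaches production because the program does not build, whereas the dynamic call needs an explicit runtime check and an error path.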

#5: Versatile Tooling

Microsoft’s Visual Studio line provides top-quality tools for a variety of languages, most notably C#.

Good tools save time. Good tools save money. Good tools save lives.

Ok, maybe good tools don’t really save lives, but it can’t be denied that a language with adequate tooling is more productive to work with, and generally a more enticing choice than a language where you are left virtually on your own.

C# is a great example of language tooling. The .NET framework not only includes a robust compiler, but also tools for debugging and decompiling IL bytecode. Combined with Roslyn, NuGet and Visual Studio, the C# development experience is one where virtually everything is provided for you.

#6: Metaprogramming

The most future-proof languages are capable of evolution over time. If developers have to wait until a new SDK version implements a crucial feature, then they will leave your platform, and choose one where the feature is already present.

According to Wikipedia:

Reflection is the ability of a computer program to examine, introspect, and modify its own structure and behavior at runtime.

Giving users a vehicle through which they can add language features themselves is a good way to keep them around longer. And in some cases, commonly metaprogrammed libraries are adopted as language features themselves. For example, ES6 introduced the extends keyword to JavaScript, and eliminated the need to use third-party libraries to extend object prototypes.
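For a feel of what reflection looks like in practice, here is a small sketch using Go’s standard reflect package. It examines a struct’s fields at runtime and modifies one of them by name. The `Config` type and the helpers `fieldNames` and `setStringField` are hypothetical, made up for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// Config is a hypothetical struct used only for demonstration.
type Config struct {
	Host string
	Port int
}

// fieldNames lists the field names of a struct (or pointer to struct).
// It returns nil for any other kind of value.
func fieldNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	if t != nil && t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return nil
	}
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		names = append(names, t.Field(i).Name)
	}
	return names
}

// setStringField sets the named string field on the struct that ptr points
// to. It reports false if the field is missing, not a string, or not settable.
func setStringField(ptr interface{}, name, value string) bool {
	v := reflect.ValueOf(ptr)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
		return false
	}
	f := v.Elem().FieldByName(name)
	if !f.IsValid() || !f.CanSet() || f.Kind() != reflect.String {
		return false
	}
	f.SetString(value)
	return true
}

func main() {
	c := Config{Host: "localhost", Port: 8080}
	fmt.Println(fieldNames(c))
	setStringField(&c, "Host", "example.org")
	fmt.Println(c.Host)
}
```

Serializers, dependency injectors and ORMs are typically built on exactly this kind of introspection, which is why a language with usable reflection lets the community add such features without waiting for a new SDK release.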

#7: Vibrant Community

This one is mostly out of your control, but is also one of the most important steps on your path. A thriving community can be a magnetic factor that draws people to your language. If it’s all crickets and tumbleweeds, you will face difficulty gathering an audience for your project. Just think about it: who wants to use a language nobody talks about, writes libraries for, or can answer questions about? Not me. Not you. Not anybody else. Ruby is notoriously a pain on Windows because the overwhelming majority of its community uses Mac or Linux. How do I get Jekyll up on Windows 7? Who knows? Nobody.

Today’s trending JavaScript repositories on GitHub.

Community activity is also a decent gauge of the number of people using your language. The more people use your language, the more likely it is that someone will be looking for support or answers, and for somewhere to find them. JavaScript is arguably the most popular programming language on the planet, and it shows. Of GitHub’s trending repositories, at least half are written in JavaScript on any given day. There are thousands of JavaScript Gitter and Slack rooms, and it’s one of the most common languages taught at coding camps.

#8: In-depth Documentation

This one is a no-brainer. Every programmer runs into errors while coding, and those errors are often exacerbated by a lack of sufficient documentation of the APIs that failed. Do everyone a favor: document public APIs.

A language you build should also make documentation a first-class feature, rather than trying to patch it in after releases. Something like Javadoc will work. Dart is a great example of documentation support: the Dart SDK includes a static documentation site generator, and every package uploaded to the Pub repository has documentation generated and hosted on-site.

#9: Stability through Versioning

Nobody is going to migrate to your language if every new change introduced is a breaking change, which can be very expensive. Enforcing strong versioning policies allows developers to update without fear of unholy retribution at the hands of the API gods. SemVer is a popular system of rigid version constraints (MAJOR.MINOR.PATCH, where only a major version bump is allowed to break compatibility), and by following its conventions, your language can become virtually future-proof. For example, Dart’s Pub package manager installs dependencies by resolving SemVer constraints to suitable versions of libraries.

If you choose to commit to SemVer, your package manager (or whatever similar tool you use) can be written to compare library sources to past versions to ensure that SemVer is followed. This is something of an extreme, as many developers would prefer to just publish packages, without an angry tool nagging them for renaming an API.

However, if you do, it would be a great way to force package version numbers to reflect the changes they contain, and prevent unforeseen application breaks after updates.
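To show roughly what "resolving SemVer constraints" involves, here is a simplified sketch in Go. It is not how Pub or any real package manager is implemented: it assumes plain MAJOR.MINOR.PATCH versions with a major version of at least 1, ignores pre-release and build tags, and treats a constraint as "same major version, and not older than the requested one". The names `Version`, `parse`, `compatible` and `resolve` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a plain MAJOR.MINOR.PATCH triple (no pre-release tags).
type Version struct{ Major, Minor, Patch int }

// parse converts a string such as "1.4.0" into a Version.
func parse(s string) (Version, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("invalid version %q", s)
	}
	var n [3]int
	for i, p := range parts {
		x, err := strconv.Atoi(p)
		if err != nil || x < 0 {
			return Version{}, fmt.Errorf("invalid version %q", s)
		}
		n[i] = x
	}
	return Version{n[0], n[1], n[2]}, nil
}

// less reports whether v sorts before o.
func (v Version) less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	if v.Minor != o.Minor {
		return v.Minor < o.Minor
	}
	return v.Patch < o.Patch
}

// compatible reports whether cand can stand in for base under the
// simplified rule: same major version, and not older than base.
func compatible(base, cand Version) bool {
	return cand.Major == base.Major && !cand.less(base)
}

// resolve picks the newest available version compatible with base.
func resolve(base Version, available []Version) (Version, bool) {
	var best Version
	found := false
	for _, c := range available {
		if compatible(base, c) && (!found || best.less(c)) {
			best, found = c, true
		}
	}
	return best, found
}

func main() {
	base, _ := parse("1.2.0")
	available := []Version{{1, 1, 9}, {1, 2, 3}, {1, 4, 0}, {2, 0, 0}}
	if v, ok := resolve(base, available); ok {
		fmt.Printf("resolved to %d.%d.%d\n", v.Major, v.Minor, v.Patch)
	}
}
```

The resolver skips 2.0.0 even though it is newer, because a major bump signals a breaking change. That guarantee only holds if package authors actually follow the convention.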

#10: Library Support

NPM has a massive number of JavaScript packages hosted on its servers.

Aside from syntax and tooling, perhaps the deciding factor in switching to a new language stack is its ecosystem. NPM has over 400,000 JavaScript libraries available to the public. No matter what additional functionality you need in your Node.js apps, you can be confident that somebody else has already implemented it for you.

New languages are sorely disadvantaged here, simply because they have not been around long enough for developers to publish a comparable number of libraries. Thus, if you aim to keep users around, you must provide a wealth of functionality out-of-the-box.

Dart (I think you can see my bias towards this language!) takes a “batteries-included” approach, and provides a massive standard library that removes a lot of the need to have a quarter million packages uploaded to its package manager. Functionality like left-padding strings has been around for ages now, and it’s also an arduous task to delete Pub packages, so if some developer decides to pull a package, the entire Internet will not break.

#11: Efficient Memory Usage and Concurrency

Applications at scale run a high risk of crashing, in general. Simply put, it takes a lot of resources to handle high traffic and expensive operations. To make matters worse, memory management is a PITA to manually implement.

While it might not necessarily be a feature of the language itself, your compiler/interpreter/VM should make an attempt to prevent memory leaks, buffer overflows, and other memory mismanagement errors. Most modern virtual machines, such as the JVM, the .NET runtime, and the Dart VM implement memory management via garbage collection, and as a result, developers of the corresponding languages do not have to worry about allocating memory or releasing pointers.

Multithreading is also a common technique for running asynchronous code in parallel, so concurrent operations should be trivial to implement, and require little to no extra boilerplate code. For example, Google’s Go language has concurrency support baked right into the language itself:

func main() {
    var c chan string = make(chan string)

    go pinger(c)
    go ponger(c)
    go printer(c)

    var input string
    fmt.Scanln(&input)
}
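The snippet above calls pinger, ponger and printer without defining them. As a self-contained sketch of the same idea, the program below starts two goroutines that send messages over a shared channel and collects a fixed number of them. The helpers `sender` and `collect` are hypothetical stand-ins written for this example, not the functions the original snippet refers to.

```go
package main

import "fmt"

// sender repeatedly offers msg on c until done is closed.
func sender(msg string, c chan<- string, done <-chan struct{}) {
	for {
		select {
		case c <- msg:
		case <-done:
			return
		}
	}
}

// collect starts a "ping" and a "pong" goroutine and returns the first
// n messages received. The interleaving is decided by the scheduler.
func collect(n int) []string {
	c := make(chan string)
	done := make(chan struct{})
	defer close(done) // tells both senders to stop once we have enough

	go sender("ping", c, done)
	go sender("pong", c, done)

	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, <-c)
	}
	return out
}

func main() {
	for _, m := range collect(6) {
		fmt.Println(m)
	}
}
```

Starting a concurrent task costs one keyword (`go`), and the channel handles synchronization. That is the "little to no extra boilerplate" the section argues for.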

#12: Testing

It is well known that the maintenance phase is the longest phase in the software development life cycle. Test-driven development is a common process that assures quality of produced software, but it also relies heavily on the ability to write very specific unit tests to verify success in various use cases.

The better suited your language is to testing, the fewer bugs will be encountered at runtime, and the more trusted your language will be for production.

A good idea is for the team behind the language to publish the testing utility, but using community-supported, mature testing libraries (such as JUnit, Mocha or Cucumber) is also a viable strategy.
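To give an idea of how small a first-party testing utility can start out, here is a sketch of a minimal test runner in Go. It is deliberately not modeled on JUnit, Mocha or Go’s own testing package; the types `TestCase` and `Result` and the helpers `runAll`, `runOne` and `expectEqual` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// TestCase pairs a name with a function that returns nil on success.
type TestCase struct {
	Name string
	Run  func() error
}

// Result records the outcome of one test.
type Result struct {
	Name string
	Err  error
}

// expectEqual is a tiny assertion helper for ints.
func expectEqual(got, want int) error {
	if got != want {
		return fmt.Errorf("got %d, want %d", got, want)
	}
	return nil
}

// runOne executes a single test and converts a panic into an error,
// so that one crashing test cannot abort the whole run.
func runOne(tc TestCase) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic: %v", r)
		}
	}()
	return tc.Run()
}

// runAll executes every test and counts the failures.
func runAll(tests []TestCase) (results []Result, failed int) {
	for _, tc := range tests {
		err := runOne(tc)
		if err != nil {
			failed++
		}
		results = append(results, Result{Name: tc.Name, Err: err})
	}
	return results, failed
}

func main() {
	tests := []TestCase{
		{"addition works", func() error { return expectEqual(1+1, 2) }},
		{"deliberate failure", func() error { return expectEqual(2*2, 5) }},
	}
	results, failed := runAll(tests)
	for _, r := range results {
		status := "PASS"
		if r.Err != nil {
			status = "FAIL: " + r.Err.Error()
		}
		fmt.Printf("%-20s %s\n", r.Name, status)
	}
	fmt.Printf("%d of %d tests failed\n", failed, len(results))
}
```

Mature frameworks add discovery, fixtures, mocking and reporting on top, but shipping even a basic runner with the language gives every project the same starting point.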

#13: Portability and Modularity

Java runs on a large variety of platforms, including those smaller than an actual cup of java (image credit: Robert Savage).

Not all languages need to be portable, but those that target multiple platforms need well-designed systems to allow code reuse across systems. Most, if not all, languages have some sort of import keyword that allows you to pull in code from other files. A modern language needs to go beyond solely using imports to split large files apart, and actually implement some kind of module system.

Modularity allows developers to explicitly separate logic, and also makes it possible to use certain parts of libraries across any platform. Dart and ES6, for example, implement module systems.

Module systems also make it easier for compilers to strip out code that is unused, and ultimately reduce the output size of compiled code. This is key for languages that compile to JavaScript, as shipping less code for the browser to download and parse goes a long way toward avoiding one of the main bottlenecks of Web applications.
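Stripping unused code (often called tree shaking) boils down to a reachability search over the dependency graph that a module system makes explicit. The Go sketch below assumes the compiler already has that graph as a map from each symbol to the symbols it uses; the function name `shake` and the symbol names in `main` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// shake returns, in sorted order, every symbol reachable from the entry
// points. Anything not returned is unused and can be dropped from the output.
func shake(deps map[string][]string, entries []string) []string {
	seen := map[string]bool{}
	stack := append([]string(nil), entries...)
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[n] {
			continue
		}
		seen[n] = true
		stack = append(stack, deps[n]...)
	}
	kept := make([]string, 0, len(seen))
	for n := range seen {
		kept = append(kept, n)
	}
	sort.Strings(kept)
	return kept
}

func main() {
	// Hypothetical dependency graph: symbol -> symbols it uses.
	deps := map[string][]string{
		"main":       {"http.get", "json.parse"},
		"http.get":   {"url.parse"},
		"http.post":  {"url.parse"},
		"json.parse": {},
		"url.parse":  {},
		"xml.parse":  {},
	}
	fmt.Println(shake(deps, []string{"main"}))
}
```

Here `http.post` and `xml.parse` are never reached from `main`, so a compiler could leave them out. Without explicit module boundaries, building such a graph reliably is much harder.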

#14: Established Standards

Public standards can go a long way to make your language more stable, and more productive in team environments. A serious language will have a thorough specification available to read on a free online platform (Example: Dart ECMA specification).

Consider implementing a configurable linter and formatter, and providing it as part of your language’s standard toolset. Eventually, the standards will be fried into developers’ brains, and joining new commercial teams or open-source projects will be an easier experience. For example, Dart ships with both a linter and formatter.

You might even consider refusing to compile/run code that is poorly formatted. This is a bit extreme to bake into a compiler, but project authors might consider including formatting/linting checks in presubmit scripts to make sure all commits are readable and understandable. Example with ESLint:

// package.json
{
  "scripts": {
    "prepublish": "eslint **/*.js"
  }
}
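The same kind of check can be sketched for a language whose formatter ships as a library. The Go example below uses the standard go/format package to decide whether a source file is already in canonical style; a presubmit script could reject the commit when it is not. The helper name `isFormatted` and the sample sources are made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
)

// isFormatted reports whether src is already in canonical gofmt style.
// It returns an error if src is not syntactically valid Go.
func isFormatted(src []byte) (bool, error) {
	out, err := format.Source(src)
	if err != nil {
		return false, err
	}
	return bytes.Equal(src, out), nil
}

func main() {
	tidy := []byte("package main\n\nfunc main() {\n\tprintln(\"hi\")\n}\n")
	messy := []byte("package main\nfunc  main( ){\nprintln(\"hi\")\n}\n")

	for name, src := range map[string][]byte{"tidy": tidy, "messy": messy} {
		ok, err := isFormatted(src)
		if err != nil {
			fmt.Println(name, "does not parse:", err)
			continue
		}
		fmt.Println(name, "formatted:", ok)
	}
}
```

Because the formatter is part of the standard toolset, every project can run this check the same way, which is what makes the style stick.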

#15: Unicode Support

Language developers need to remember: Not everyone in the world speaks English. Supporting only Latin characters effectively shuts off millions of potential developers, and also prevents anyone in a country where English is not the predominant language from ever using an application developed in yours.
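As a small sketch of what Unicode support means at the language level, the Go program below uses a non-Latin identifier and treats strings as sequences of code points rather than bytes. The helper `reverseRunes` and the variable name are hypothetical; note that it reverses code points, which is still not the same as reversing user-perceived characters when combining marks are involved.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseRunes reverses a string by Unicode code point instead of by byte,
// so multi-byte characters are not torn apart.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	// Go accepts Unicode letters in identifiers, not just in string data.
	挨拶 := "こんにちは"

	fmt.Println("bytes:", len(挨拶))
	fmt.Println("code points:", utf8.RuneCountInString(挨拶))
	fmt.Println("reversed:", reverseRunes(挨拶))
}
```

A language that only counts bytes would report the wrong length here and could corrupt the text when slicing or reversing it.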

If something like the following (credit: Nick McCurdy) is possible in Java, why shouldn’t your language support Unicode?

#16: Backed by a Major Company

This isn’t necessarily required, but if a major company backs your language, you automatically secure several benefits:

  • Credibility — People will see that a large business is using your language in production, and trust it to be capable of handling modern-day demands.
  • Funding — Because your language is powering multi-million dollar applications, the company has an incentive (really an obligation) to provide resources necessary to ensure the language stays afloat, and adapts to fit growing needs.
  • Emotional Validation :) — Building a language is by no means an easy task; it takes hours upon hours of painstaking work that at times feels thankless. A large company using your language reassures you that your efforts were not in vain, and more importantly, tangibly proves to you that people are actually using your language.

This is a major factor, if not the main factor, in keeping languages like Java, Go, Dart and PHP actively maintained and regularly updated. If your language takes off, you may consider pitching it to companies, giving talks at developer meetups, or even running it at the base of your own applications.

That’s the end of this list. Granted, you might try all of the above suggestions and still fall flat, but it’s still worth a try at the end of the day. Take what you’ve learned, and go make that pipe dream a reality!


About the Author

Tobe O. is a 17-year-old programmer whose favorite language has to be Dart. When he’s not at school, he’s playing sports, making music, or working on the Angel server framework (check it out — no, really!). Find him on Twitter!


Published by HackerNoon on 2017/02/18