The Emergence Of Distributed POSIX

Written by mhausenblas | Published 2016/11/26


In the past couple of months I’ve noticed folks hinting at or suggesting something that, for lack of a better term, I’ll call distributed POSIX, or dPOSIX for short.

What do I mean by this? Well, whatever we end up with for dPOSIX (besides, hopefully, a better name), it will presumably have the following characteristics:

  • Containers as first-class citizens: no matter what the unit of scheduling is (Kubernetes had pods from the get-go, DC/OS is introducing them with 1.9, Docker will likely follow at some point?), dPOSIX will treat containers as first-class entities. For example, we can expect dPOSIX APIs to understand how to express running different types of distributed processes, such as long-running services or batch jobs, using different container runtimes, from the currently dominant Docker to AppC and OCI (see the sketch right after this list).
  • Distributed runtime environment: dPOSIX will assume a bunch of nodes (aka a cluster, cell, or datacenter) as the runtime environment; it doesn’t matter if you’re talking about 10 nodes (most of us) or 100,000 (some big ones). Some of the implications: 1. software-defined networking is front and center, 2. cattle (mostly) with some pets (as a safety blanket, for ‘legacy apps’), and 3. immutable infrastructure on all levels (from node to service/function). The only two questions left are how long you will hold out before going all in on the public cloud (more below), and how to minimize cloud provider lock-in (hint: container orchestration systems let you stay agnostic about it) where possible.
  • DevOps is the default: dPOSIX requires us to rethink the roles and responsibilities around how we build and operate services and applications. Increasingly, cloud-native deployments at least blur the line between developers and operations, and in the extreme case (serverless, erm, Function-as-a-Service) they require that the person who wrote the code is also responsible for operating it in prod.
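
To make the first characteristic a bit more concrete, here is a minimal sketch of what a dPOSIX-style scheduling API could look like, written in Go. Everything in it (ProcessSpec, Scheduler, and friends) is invented for illustration; it is not the API of Kubernetes, DC/OS, Docker, or any other existing system.

```go
// Package dposix is a hypothetical sketch, not a real library.
package dposix

// Runtime identifies the container runtime used to execute a process.
type Runtime string

const (
	RuntimeDocker Runtime = "docker"
	RuntimeAppC   Runtime = "appc"
	RuntimeOCI    Runtime = "oci"
)

// Kind distinguishes the two broad classes of distributed processes.
type Kind string

const (
	LongRunning Kind = "long-running" // e.g. a web service
	Batch       Kind = "batch"        // e.g. a one-off or scheduled job
)

// ProcessSpec is the container-centric unit the scheduler operates on,
// loosely analogous to the argv/env a POSIX exec call takes.
type ProcessSpec struct {
	Name     string
	Image    string   // container image reference
	Runtime  Runtime  // which runtime to launch the image with
	Kind     Kind     // long-running service or batch job
	Replicas int      // desired number of instances across the cluster
	Command  []string // entrypoint override, if any
	Env      map[string]string
}

// Scheduler is the cluster-level analogue of fork/exec: it places
// containers on nodes instead of processes on a single kernel.
type Scheduler interface {
	Run(spec ProcessSpec) (id string, err error)
	Kill(id string) error
	List() ([]ProcessSpec, error)
}
```

The point is not the specific field names but the shape of the contract: the unit being scheduled is a container image plus a desired state spread over a cluster, not a process on one kernel.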

Michael?! Where are the indicators you’ve been talking about earlier on?

Glad you asked:

  • My esteemed colleague Karl Isenberg gave a talk about POSIX for the Datacenter at the recent O’Reilly Software Architecture conference.
  • The magnificent Kelsey Hightower gave a keynote at KubeCon that has the underlying theme of using the familiar concepts from a single machine in a cluster of machines.
  • Also at said KubeCon, Harry Zhang provided a Comparison of Container Orchestration and Management Systems (on the conceptual and API level), and it turns out that the three major players, DC/OS, Kubernetes, and Docker (SwarmKit), increasingly use an overlapping set of primitives such as services and deployments (see the sketch after this list).
  • Subbu Allamaraju, whom I know from my time as a RESTafarian and value a lot as a distributed systems practitioner, published a provocative piece, Don’t Build Private Clouds, earlier this week, making the case that you should be building a private cloud about as much as you should be running your own nuclear power plant.
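
To illustrate the overlap Harry Zhang’s comparison points at, here is a rough mapping from the two process kinds in the earlier sketch to each system’s closest native primitive, again as a Go sketch. The names on the right reflect each project’s public vocabulary around late 2016 and are approximate; the function itself is invented for this post, not part of any of these systems.

```go
package main

import "fmt"

// kind distinguishes long-running services from batch jobs, echoing the
// hypothetical ProcessSpec sketch earlier in this post.
type kind string

const (
	longRunning kind = "long-running"
	batch       kind = "batch"
)

// nativePrimitive names each orchestrator's closest counterpart for a given
// process kind; the mapping is approximate and for illustration only.
func nativePrimitive(system string, k kind) string {
	switch system {
	case "kubernetes":
		if k == longRunning {
			return "Deployment + Service"
		}
		return "Job"
	case "dcos":
		if k == longRunning {
			return "Marathon app (pod as of 1.9)"
		}
		return "Metronome job"
	case "docker-swarmkit":
		if k == longRunning {
			return "service (replicated)"
		}
		return "service with restart-condition=none"
	default:
		return "unknown"
	}
}

func main() {
	for _, sys := range []string{"kubernetes", "dcos", "docker-swarmkit"} {
		fmt.Printf("%-16s long-running: %-30s batch: %s\n",
			sys, nativePrimitive(sys, longRunning), nativePrimitive(sys, batch))
	}
}
```

The exact names differ, but the conceptual model (a declarative spec for a replicated service or a batch job) is what keeps converging, and that convergence is what would make a dPOSIX-style abstraction feasible at all.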

Some background information and further reading on the topic:

  • If you’re not that familiar with POSIX, have a look at some practical definitions.
  • For a laundry list of items needed to actually build a dPOSIX-conforming system, check out Joe Beda’s post Anatomy of a Modern Production Stack (09/2015).
  • In the filesystems world, the distributed POSIX aspect has been around for quite a while.
  • The C programming language played a crucial role for POSIX, and while I don’t think a single language will define or dominate dPOSIX, many (potential) dPOSIX-compliant systems are written in Go (see the small single-machine contrast below).
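
As a closing contrast, the classic POSIX process model is about one kernel: you exec a program and get its output back. The Go snippet below does exactly that on the local machine, using only the standard library (and assuming a `uname` binary is present, as on any Linux or macOS box); the dPOSIX claim is that the distributed analogue of this call targets a cluster scheduler instead of the local kernel.

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Single-machine POSIX: launch a process via fork/exec semantics,
	// wrapped by Go's os/exec, and capture its stdout.
	out, err := exec.Command("uname", "-a").Output()
	if err != nil {
		panic(err)
	}
	fmt.Printf("local process says: %s", out)

	// The dPOSIX analogue would hand a ProcessSpec (see the earlier sketch)
	// to a cluster scheduler rather than to the local kernel.
}
```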
