nuclio: The New Serverless Superhero

Written by yaronhaviv | Published 2017/10/19
Tech Story Tags: serverless | kubernetes | cloud-computing | docker | aws


We’ve gotten used to there always being a trade-off: if something is as abstract and easy to use as serverless, it’s probably slow and inflexible. Either that, or we’ve had to squeeze out maximum performance and address unique application requirements at the expense of usability. But nuclio’s serverless functions (https://github.com/nuclio/nuclio) let us rapidly create and run code without any infrastructure hassle. nuclio runs faster than bare-metal code, addresses a broader set of applications, is simpler to debug and, most importantly, runs ANYWHERE.

A while ago, iguazio needed a way to add elastic and ad hoc data processing capabilities to its real-time data platform, so we developed a high-speed FaaS layer for it. We then took what we learned from that first generation and built a unique open-source serverless platform. We call it nuclio:

  • Delivering real-time performance and maximum parallelism
  • Enabling simple debugging, regression and a multi-versioned CI/CD pipeline
  • Supporting pluggable data/event sources with common APIs
  • Portable across low-power devices, laptops, on-prem and public cloud

Read more on serverless background and current challenges in my post.

nuclio’s Architecture

nuclio’s core component is the function processor (written in Go). This processor works through abstract interfaces and is the function’s “OS,” providing all access to events, data, logs and so forth. The same function code can be fed from a variety of pluggable event sources (currently supporting HTTP, Kinesis, Kafka, RabbitMQ, MQTT, NATS, iguazio’s V3IO, and emulators).

Figure: nuclio’s function processor architecture
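To make the idea of pluggable event sources a bit more concrete, here is a minimal sketch in Python. It is not nuclio’s processor code (that is written in Go); the names Event, EventSource, HTTPEmulator, QueueEmulator and run_processor are hypothetical and only illustrate how the same function can be fed from different sources through one common interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Protocol


@dataclass
class Event:
    """Hypothetical source-agnostic event, as the function would see it."""
    body: bytes
    source: str
    headers: dict = field(default_factory=dict)


class EventSource(Protocol):
    """Anything that can yield events can feed the processor."""

    def events(self) -> Iterator[Event]:
        ...


class HTTPEmulator:
    """Emulated HTTP source: every request body becomes one event."""

    def __init__(self, requests: Iterable[bytes]):
        self._requests = list(requests)

    def events(self) -> Iterator[Event]:
        for body in self._requests:
            yield Event(body=body, source="http", headers={"method": "POST"})


class QueueEmulator:
    """Emulated message-queue source: every message becomes one event."""

    def __init__(self, messages: Iterable[bytes]):
        self._messages = list(messages)

    def events(self) -> Iterator[Event]:
        for message in self._messages:
            yield Event(body=message, source="queue")


def handler(event: Event) -> str:
    # The function code never checks which source produced the event.
    return event.body.decode("utf-8").upper()


def run_processor(sources: Iterable[EventSource],
                  fn: Callable[[Event], str]) -> List[str]:
    """Feed every event from every source to the same function."""
    return [fn(event) for source in sources for event in source.events()]


if __name__ == "__main__":
    print(run_processor([HTTPEmulator([b"hello"]), QueueEmulator([b"world"])],
                        handler))
```

Adding a new source in this sketch means writing one more class with an events() method; the handler stays untouched.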

External data (objects, files, databases and streams) is accessed through a data binding interface which takes care of all data connection, security and caching aspects. We can write a function that will use local files, or access remote data over HTTP, or an extremely fast scale-out DB/stream over TCP or RDMA without changing the code.
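The data binding idea can be sketched in the same spirit. This is an illustration under assumptions, not nuclio’s binding API: DataBinding, LocalFileBinding, InMemoryBinding and count_words are made-up names, and the in-memory class merely stands in for a remote store reached over HTTP or TCP.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class DataBinding(ABC):
    """Hypothetical binding interface: the function only sees get/put."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        ...

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        ...


class LocalFileBinding(DataBinding):
    """Stores each key as a file under a root directory."""

    def __init__(self, root: str):
        self._root = root

    def _path(self, key: str) -> str:
        return os.path.join(self._root, key)

    def get(self, key: str) -> bytes:
        with open(self._path(key), "rb") as f:
            return f.read()

    def put(self, key: str, value: bytes) -> None:
        with open(self._path(key), "wb") as f:
            f.write(value)


class InMemoryBinding(DataBinding):
    """Stand-in for a remote object store or database."""

    def __init__(self):
        self._items = {}

    def get(self, key: str) -> bytes:
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value


def count_words(data: DataBinding, key: str) -> int:
    # Function code: identical no matter which binding is plugged in.
    return len(data.get(key).split())


if __name__ == "__main__":
    remote = InMemoryBinding()
    remote.put("doc", b"fast serverless functions")
    print(count_words(remote, "doc"))

    with tempfile.TemporaryDirectory() as root:
        local = LocalFileBinding(root)
        local.put("doc", b"same code different backend")
        print(count_words(local, "doc"))
```

Swapping the backend is a deployment decision here: count_words receives whichever binding it is given and never imports a storage client itself.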

The nuclio processor is REAL-TIME. Access to its “OS” events and data is done with parallelism in mind, using zero-copy, smart memory/thread reuse and non-blocking IO. Writing your own bare-metal function will likely be slower than running it in the nuclio processor ecosystem. A single nuclio function processor can run 400,000 function invocations per second with a simple Go function, or up to 70,000 events per second using Python (PyPy) or Node.js, and respond with less than 0.1 ms latency. That’s 100x faster than most serverless/FaaS solutions. Access between the Go-based processor and other language runtimes is done through low-latency shared memory to eliminate context switches and process start overhead.

nuclio supports four application modes: sync, async, stream and batch/interactive jobs, and dynamically distributes events, streams and job tasks among processors (the dealer). These make serverless applicable to new workloads including heavyweight backend and analytics tasks.

Figure: nuclio platform services
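How the dealer spreads work is not detailed here, so the following is only a sketch of the general idea, assuming a plain round-robin assignment of stream shards to processors. The function name deal and the processor names are hypothetical, and the real dealer may use a different policy.

```python
from typing import Dict, Iterable, List


def deal(shards: Iterable[int], processors: List[str]) -> Dict[str, List[int]]:
    """Assign stream shards to processors round-robin (assumed policy)."""
    if not processors:
        raise ValueError("at least one processor is required")
    assignment: Dict[str, List[int]] = {name: [] for name in processors}
    for index, shard in enumerate(sorted(shards)):
        owner = processors[index % len(processors)]
        assignment[owner].append(shard)
    return assignment


if __name__ == "__main__":
    # Six shards over two processors, then re-dealt after scaling to three.
    print(deal(range(6), ["proc-0", "proc-1"]))
    print(deal(range(6), ["proc-0", "proc-1", "proc-2"]))
```

Calling deal again with a longer processor list is the simplest way to picture rebalancing when the function scales out: every shard still has exactly one owner, but each owner holds fewer shards.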

nuclio focuses on portability and reusability:

  • Works in low-power devices, Docker, Kubernetes, or inside an IDE using nuclio SDK
  • Events/data sources map to functions or versions/tags at deployment time
  • Logs and statistics can be sent to multiple destination types, or to the IDE screen
  • Function images are stored in a shared repository and pushed to multiple clusters/devices

Portability allows users to test and debug functions on their laptops with the SDK or Docker, ship them to run on Kubernetes clusters in different clouds, or push them down to multiple edge IoT devices. It also simplifies regression testing and diagnostics: functions are fed emulated events and write their output to a structured log that is compared against expected results, and they are automatically promoted to beta if they pass.
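The regression flow described above can be pictured with a small sketch. Everything in it is an assumption made for illustration: the run_regression helper, the JSON log layout and the toy handler are not part of nuclio, and promotion is reduced to a boolean.

```python
import json
from typing import Callable, List, Tuple


def handler(body: str) -> str:
    # Toy function under test: reverses the event body.
    return body[::-1]


def run_regression(fn: Callable[[str], str],
                   cases: List[Tuple[str, str]]) -> Tuple[bool, List[str]]:
    """Feed emulated events to fn and log each result as one JSON line."""
    log: List[str] = []
    all_passed = True
    for body, expected in cases:
        actual = fn(body)
        ok = actual == expected
        all_passed = all_passed and ok
        record = {"event": body, "expected": expected, "actual": actual, "ok": ok}
        log.append(json.dumps(record, sort_keys=True))
    return all_passed, log


if __name__ == "__main__":
    passed, log = run_regression(handler, [("abc", "cba"), ("nuclio", "oilcun")])
    for line in log:
        print(line)
    print("promote to beta" if passed else "keep in test")
```

Because each log line is structured, a failed case can be diagnosed by filtering on "ok": false instead of searching free-form text.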

Get started with nuclio in 60 seconds

You don’t need to deploy a full Kubernetes cluster or a bunch of services to get started; just test drive nuclio with the all-in-one Docker version. Type the following on Linux (assuming Docker is installed):

docker run -p 8070:8070 -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp nuclio/playground:stable-amd64

Then open a browser at http://<machine-ip>:8070 to see the nuclio playground UI.

nuclio’s playground comes with a few built-in examples that explain how to write functions, use logs and add package dependencies through inline comments or defined events.

To get started, select an example from the drop-down list, edit and rename it, and then press deploy. Build errors will show up in the log; fix them and re-deploy. Once you’re done, use the invoke tab to generate manual events and test the function. Use the log level selector to change verbosity on the fly and debug problems. Note that under the hood, nuclio generates Docker containers or Kubernetes deployments from your function.
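For orientation, a Python function of the kind you would paste into the playground might look like the sketch below. The handler(context, event) shape, context.logger.info and context.Response are assumed to match nuclio’s Python examples, so check the documentation for your version; FakeLogger, FakeResponse and make_context are hypothetical stand-ins that only exist so the snippet can run locally without nuclio.

```python
from types import SimpleNamespace


def handler(context, event):
    # Function code: log the size of the event body and return it reversed.
    context.logger.info("received %d bytes" % len(event.body))
    return context.Response(body=event.body[::-1],
                            headers={},
                            content_type="text/plain",
                            status_code=200)


class FakeLogger:
    """Local stand-in for the logger the processor would provide."""

    def __init__(self):
        self.lines = []

    def info(self, message):
        self.lines.append(message)


class FakeResponse:
    """Local stand-in for the response object the processor would provide."""

    def __init__(self, body, headers, content_type, status_code):
        self.body = body
        self.headers = headers
        self.content_type = content_type
        self.status_code = status_code


def make_context():
    # Builds a fake context exposing just what the handler above uses.
    return SimpleNamespace(logger=FakeLogger(), Response=FakeResponse)


if __name__ == "__main__":
    context = make_context()
    response = handler(context, SimpleNamespace(body=b"nuclio"))
    print(response.status_code, response.body, context.logger.lines)
```

Only the handler function would be deployed; the fakes play the role that the invoke tab and the log pane play in the playground.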

You can also develop nuclio functions in your favorite IDE: import or clone the nuclio-SDK and see the README or examples. To learn more, look at nuclio’s official web site or GitHub documentation, where we go over its architecture and provide details about usage and the CLI. You can also get help in the nuclio Slack channel.

nuclio is still under development. It supports Golang, Python and Node.js, with Java coming soon. Give it a star and join us to accelerate the development of new features.


Published by HackerNoon on 2017/10/19