The Meeshkan Keras Backend Covers 100% of the Keras Public API: Why This Matters

tl;dr Meeshkan now covers 100% of the Keras Public API!

When Meeshkan moved from using a DSL to using Keras for Machine Learning ingress, the Meeshkan core covered only 25% of the Keras Public API. To keep the business running without interruption, we used a small script to determine if a Keras model could be deployed on the Meeshkan backend and, if not, fell back to using either Theano or Tensorflow.

What this meant for Meeshkan was higher costs and potentially slower processing for about 75% of our jobs, as these jobs could not take advantage of our distributed infrastructure. We decided not to pass these costs along to our customers but rather keep one low rate for all Machine Learning jobs. Needless to say, we’ve been working tirelessly since the launch of the Meeshkan Public Beta to finish our custom backend so that every Keras job that doesn’t explicitly call K.some_proprietary_method can run on our low-cost, pretty-darn-fast network.

I’m thrilled to report that we have finally reached our goal! We will be rolling out a new version of Meeshkan to production over the next few weeks that covers 100% of the Keras Public API. We’ll start with A/B testing and then roll it out to all customers doing Machine Learning on our network.

Ugh…this is boring…now that Meeshkan covers 100% of the Keras Public API and all of the tests are passing, there is nothing left to fix. How am I gonna spend my Sundays from now on? With my family? Quick, I better write in a couple bugs!

What I’d like to write about today is how we achieved this milestone, what we learned, and why I think this step is particularly important in our company’s evolution.

Getting to 100%

Writing a Keras backend is a pretty daunting task — aside from the 3000ish lines of Python code needed to glue everything together in Keras, the underlying library on which the backend relies needs to be rock solid, as Keras will not test your library’s code and will only report numerical inconsistencies between your computed results and those of other (assumedly correct) backends. Furthermore, while Keras does a lot of helpful things, it is ultimately up to the backend writer to implement RNN, Convolution, Max Pooling and other thorny tensor manipulations from ML’s greatest hits.

Over the course of three months, it took us four rewrites of the Meeshkan core and three rewrites of the backend to get to a version that was production-ready. All the while, CI servers spinning on Amazon made sure that tests were passing and that code coverage was impeccable in the most critical places.

Lessons Learned

For the future Keras backend writer, here are some lessons I learned (the hard way!) about how to put together a backend for Keras. As not all of this information is available in the Keras documentation, I hope it saves people time and provides useful information.

1. Use this gist to organize your work

There is no one central place in the Keras code base that tells you which functions you need to implement for your backend to both pass the tests AND cover the Keras Public API. The closest thing is https://keras.io/backend, which lists certain functions that only work for some backends and a few functions that are not part of the Public API (meaning they are not used in the Keras source outside of the backend code). The dictionary in the gist maps keys to functions in the Meeshkan backend, but you can just grab the keys and grep for them in the various Keras backends (e.g. grep "def normalize_batch_in_training") to see how they are put together. I’m sure this list will change over time, but that’s what it is on Sunday, January 21st at 5:23 PM Helsinki time.
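If grepping by hand gets tedious, the same check can be scripted. Here is a minimal sketch, assuming you have a list of required function names and the text of a backend file. The helper names (defined_functions, missing_functions), the stand-in source and the three-name list are all made up for illustration; they are not the contents of the gist.

```python
import ast


def defined_functions(source):
    """Return the names of top-level functions defined in Python source text."""
    tree = ast.parse(source)
    return {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}


def missing_functions(required, source):
    """Return the required names that the given backend source does not define."""
    return sorted(set(required) - defined_functions(source))


# Hypothetical stand-in for a backend file. In practice you would read the
# real file from disk instead of using an inline string.
example_source = '''
def dot(x, y):
    pass

def relu(x, alpha=0., max_value=None):
    pass
'''

# Illustrative subset of names only, not the full list of keys.
required = ['dot', 'relu', 'normalize_batch_in_training']

print(missing_functions(required, example_source))
# prints ['normalize_batch_in_training']
```

Parsing with ast rather than matching text means a commented-out or nested def does not count as implemented, which is closer to what the tests will see.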

2. Reverse engineer with reckless abandon

For 50ish percent of the Keras backend API surface, you can write a mini-backend in less than a week by reverse engineering the Tensorflow backend and comparing your version to the real one. By doing this exercise, you’ll grok how the internals of Keras work and have a pretty failsafe baseline as you begin to implement a backend for your library of choice.
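The "compare your version to the real one" step boils down to feeding both implementations the same inputs and checking that the numbers agree within a tolerance. Below is a rough sketch of that idea using plain Python lists. The helper names (flatten, agree) and both softmax functions are hypothetical stand-ins, not Keras or Meeshkan code; in a real setup the reference would be the actual Tensorflow backend.

```python
import math


def flatten(values):
    """Yield the scalars of an arbitrarily nested list in row-major order."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from flatten(v)
        else:
            yield v


def agree(candidate, reference, rel_tol=1e-6, abs_tol=1e-9):
    """True if both nested lists hold the same number of scalars, all numerically close.

    Only the scalar count is compared, not the nesting structure.
    """
    a, b = list(flatten(candidate)), list(flatten(reference))
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in zip(a, b)
    )


def softmax_reference(row):
    """Textbook softmax, standing in for the 'real' backend."""
    exps = [math.exp(v) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_candidate(row):
    """Reimplementation under test: subtracts the max first for numerical stability."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


batch = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [-5.0, 0.5, 2.5]]
expected = [softmax_reference(r) for r in batch]
actual = [softmax_candidate(r) for r in batch]
print(agree(actual, expected))  # prints True
```

Comparing with a tolerance rather than exact equality matters here, since two correct implementations rarely produce bit-identical floats.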

3. `_keras_shape, _keras_shape, _keras_shape`

Keras has a more-or-less undocumented attribute called _keras_shape that can basically be read anywhere, anytime to get the shape of a tensor.

$ git grep -c _keras_shape
backend/cntk_backend.py:7
backend/meeshkan_backend.py:1
backend/tensorflow_backend.py:8
backend/theano_backend.py:79
engine/topology.py:23
models.py:4

So attach a _keras_shape to, well, everything. We use a sneaky decorator to tack it on a posteriori.
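To give a flavor of what such a decorator could look like, here is a minimal sketch. It is not the actual Meeshkan decorator: the Tensor class, with_keras_shape and _dot_shape are hypothetical names invented for this example, and the only thing taken from Keras is the _keras_shape attribute itself, a tuple whose batch dimension may be None.

```python
import functools


class Tensor:
    """Hypothetical tensor type standing in for a backend library's own."""

    def __init__(self, data):
        self.data = data


def with_keras_shape(infer_shape):
    """Decorator factory: after the wrapped op runs, set _keras_shape on its result.

    infer_shape receives the same arguments as the op and returns a shape tuple.
    """
    def decorator(op):
        @functools.wraps(op)
        def wrapper(*args, **kwargs):
            result = op(*args, **kwargs)
            result._keras_shape = infer_shape(*args, **kwargs)
            return result
        return wrapper
    return decorator


def _dot_shape(x, y):
    # (batch, n) dot (n, m) -> (batch, m); the batch dimension may be None.
    return (x._keras_shape[0], y._keras_shape[1])


@with_keras_shape(_dot_shape)
def dot(x, y):
    """Plain matrix product of two 2-D tensors stored as nested lists."""
    cols = list(zip(*y.data))
    return Tensor([[sum(a * b for a, b in zip(row, col)) for col in cols]
                   for row in x.data])


x = Tensor([[1.0, 2.0]])
x._keras_shape = (None, 2)
w = Tensor([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
w._keras_shape = (2, 3)

out = dot(x, w)
print(out._keras_shape, out.data)  # prints (None, 3) [[1.0, 2.0, 3.0]]
```

Keeping the shape logic in a separate function means the op itself stays free of Keras bookkeeping, which is handy if the same op is also used outside of Keras.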

4. Use other Keras backends as a template

Keras has three open-sourced fully-featured backends: CNTK, Tensorflow and Theano. Between these three backends, there is almost always one from which you can gain inspiration as you are writing yours. Especially for idiosyncratic Keras conventions like batch_dot that have no real equivalent in other ML projects, scaffolding your implementation from another backend’s code will save you lots of time. It will also allow you to make logical breakpoints in your code to debug at the same place across different backends.

5. Go on, treat yourself to a fully-connected dense neural network

Writing a Keras backend, you will invariably get bogged down in the minutiae of implementation details for core tensor operations and will not see the forest for the trees. When you’re feeling blue, never forget that you are a dot-product, a ReLU and an L2 Normalization away from a fully connected neural network. Of course, don’t neglect the low-hanging fruit! I’d start by just implementing a function that returns your backend’s name.

def __backend(self):
    return 'meeshkan'

But once you’ve done this, don’t hesitate to choose your favorite backend functions so that your library can build cool things with Keras right away. It’s insanely motivating!
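To show how little stands between you and that fully connected layer, here is a toy sketch of the three operations mentioned above, written on plain nested lists. The function names mirror the Keras backend ones (dot, relu, l2_normalize), but the signatures are simplified and dense is a made-up helper; a real backend would operate on its own tensor type.

```python
import math


def dot(x, w):
    """Matrix product of a (batch, n) input with an (n, m) weight matrix."""
    cols = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def relu(x):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in x]


def l2_normalize(x, epsilon=1e-12):
    """Scale each row to unit Euclidean length; epsilon guards against division by zero."""
    out = []
    for row in x:
        norm = math.sqrt(max(sum(v * v for v in row), epsilon))
        out.append([v / norm for v in row])
    return out


def dense(x, w):
    """Forward pass of one bias-free dense layer: dot, then ReLU, then L2 normalization."""
    return l2_normalize(relu(dot(x, w)))


x = [[1.0, -2.0]]
w = [[3.0, 1.0], [0.0, 2.0]]
print(dense(x, w))  # prints [[1.0, 0.0]]
```

In the example, the dot product gives [3.0, -3.0], the ReLU zeroes the negative entry, and the normalization divides the row by its length of 3.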

6. When in doubt, WWND (what would numpy do)

A normal human brain cannot possibly anticipate every tensor operation necessary to implement a Keras backend and must read through the code to know what’s going on. In doing so, you will invariably find a few operations that are not present in your base library and that are inefficient to compose from primitives or are just not possible with primitives. So, you’ll be forced to write new functionality into your core library. When doing so, I found that if you aim for API parity with numpy, you can’t go wrong. There are just a couple functions, like np.gradient, where the numpy implementation doesn’t help much for tensors. Of course, there’s a whole class of functions (like those in tf.nn) that can’t be numpy-fied. But the majority of odds and ends you’ll have to implement to glue everything together can be thought of in a numpy state of mind. So keep your API calls similar to numpy. You’ll be thankful you did when others read your backend code and understand what’s going on out of the box without having to peruse your library’s documentation.

Deploy with Confidence to Meeshkan

Meeshkan has recently achieved a lot of cool milestones: we had some great meetings with partners in San Francisco, kicked off the Accelerace program, and passed the 6,000-student mark on our Udemy course. But the geek in me thinks that this Keras milestone is the most important of all. Here’s why.

Keras is, in my opinion, the de facto lingua franca of AI. Like the English language, it is imperfect and idiosyncratic, but like the English language, so many enthusiasts are inspired by what it offers that they converge around it to communicate. Meeshkan is no different — we are enamored with the Keras project and feel that lots of great ideas in AI can be articulated with ease using their concise API.

More and more companies and hobbyists are using Keras every day to make their AI models. And as AI gets easier, people want to do more of it but are blocked by the limits of their own machines. In order for Meeshkan to have a compelling business argument, it needs to give people a substantial advantage over their PC while at the same time providing the same high-quality AI that other Keras backends do.

Because Meeshkan is tested against all other Keras backends and passes 100% of the tests, you can deploy to the Meeshkan network with confidence knowing that your AI jobs will yield the same results that one would obtain from other backends like Tensorflow and Theano. At the same time, you get the benefit of the blazingly fast Meeshkan backend, which is the best time-to-money bargain on the market and will only get better once the new backend is fully deployed.

I hope that you’ll start building some incredible ML projects on Meeshkan! We can’t wait until our new backend is fully rolled out, at which point we will report back with some neat benchmarks. We hope to earn your trust as we strive to offer the industry’s best mix of quality, speed and affordability.