How to build AI culture: go through the curve of enlightenment

Written by roy100 | Published 2018/04/26
Tech Story Tags: software-development | artificial-intelligence | startup-lessons | ai-culture | ai


Learn from the 90s

In the beginning of the web, there was a ‘developer’ who wrote the ‘code’. This code would get built and then chucked over the wall to the ‘operations’ folks. They — you know, operated the website that their company made money from. Sys admins, data center people, database admins (I forgot they still exist — hi DBAs!). There were strict protocols on managing releases. Testing was a pain, and it took a long time to ship anything.

Remember the Joel test? Yes, that old thing that 30- and 40-something-year-old engineers know about.

Getting a no on any question on the test said something about team culture.

  1. Do you use source control? Engineers spend time worrying about (or breaking) someone else’s code. Losing all the code because a hard drive crashed was a thing.
  2. Can you make a build in one step? Engineers spend non-zero time making builds. Multiple steps introduce mistakes. Fewer builds are made, so new code is tested more slowly.
  3. Do you make daily builds? If something breaks a build, it takes a while to get noticed. (The world has continuous integration now — things have come a long way!)
  4. Do you have a bug database? Emails, post-it notes, phone calls with angry customers and “business” people. Forgotten bugs. Bugs are not well documented and cannot be reproduced.
  5. Do you fix bugs before writing new code? Tech debt slows down how quickly you can ship new features. A small team is spun out to re-architect the codebase. It takes a year.
  6. Do you have an up-to-date schedule? Dates are meaningless. It’s done when it is done. You should have thought about this when you made your stupid forecast.
  7. Do you have a spec? Engineers do the thinking for the PMs. Things get built, “But that’s not what I asked for! Can you make it do this other thing instead?”
  8. Do programmers have quiet working conditions? Let’s-get-coffee. Hey-how-do-I-do-x. Omg-have-you-seen-this-email-from-HR. Some engineers are asking if the company will pay for headphones. No, says HR.
  9. Do you use the best tools money can buy? Grumpy engineers are slower engineers.
  10. Do you have testers? You have more bugs.
  11. Do new candidates write code during their interview? There are a few people on the team who have mastered the art of keyword-stuffing their resumes but can’t learn a new programming language to save their lives.
  12. Do you do hallway usability testing? Engineers build features, only to discover months later that users hate them. A ‘usability issues’ epic is created, if you even have a bug database in the first place.

All of these things have one thing in common — they slow down an engineering team. Working software reaches users slower. User feedback is slower. Product innovation is slower. The business creates customer value slower.

A competitor that delivers customer value faster eventually upstages you. All because you didn’t invest in a build system.

The trouble is small at first. These things have a knack for compounding their effects over time. We have an intuitive, linear view of technological progress: today’s pace is used to project how fast things will be achieved in the future. If software delivery gets exponentially slower over time, that linear view makes us underestimate how much slower it will be.

Measuring speed

So the agile community came up with the concept of velocity. It was a diagnostic metric for how quickly a team can ship complex code on an existing code base. At the end of each sprint, story points are added up and that’s the velocity of the team. If velocity drops, the team ships complex stories slower and you are headed in the wrong direction. Do something about it!
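
For concreteness, here is a minimal sketch of the arithmetic; the sprint names and story point values below are made up for illustration, and any real tracker does this sum for you.

```python
# Velocity is just the sum of completed story points per sprint.
completed_points = {
    "sprint_1": [3, 5, 8, 2],   # hypothetical completed stories
    "sprint_2": [5, 3, 3],
    "sprint_3": [2, 3, 1, 1],
}

velocity = {sprint: sum(points) for sprint, points in completed_points.items()}
print(velocity)  # {'sprint_1': 18, 'sprint_2': 11, 'sprint_3': 7}

# A sustained drop like this is the warning sign: something is slowing the
# team down, even though the metric alone cannot tell you what.
```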

Problem is — velocity depends on a lot of things. It is hard to know what to do to push it in the right direction. There are certainly no quick fixes. The quick fixes that exist (like adding a new member to the team) do not tackle the core problem.

And it takes more than engineers to build a product. PMs, UX folks, designers. It is so difficult to come up with an ‘are we fast enough’ metric that encapsulates all aspects of building a product. Velocity does not fully capture this. We were still building products that people didn’t want.

Then the Lean Startup thing happened.

The Lean Startup methodology

Have an idea for a product? Hypothesis > Build MVP > Validate hypothesis. Get through the whole loop (one iteration) as fast as possible.

The product-building world finally saw the writing on the wall. Speed matters. The speed of iterations matters.

Fast iterations = success. Especially so in ML

Prashast, our CTO, built some of the tooling needed to pull off successful production ML systems at Google. He was convinced that any ML setup needs to allow for fast iterations. This is how he explains it.

You can’t just “do ML” and have it magically work. A train-and-forget mentality means that your model goes stale very quickly. Products change, users change, behaviors change. In reality, it is a long road of constant experimentation and improvements. You need to try simple things first. Then different features in your model. Data is not always clean. You experiment with different models and A/B test them. Things go wrong in production all the time. It takes months of constant tweaking to get things right.

Once you start, you need to think of it like any other software project. It needs building, testing, deployment, iterations. Each iteration cycle makes things just that little bit better. The more iterations you can get through, the faster your ML setup improves.
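
To make the iteration loop concrete, here is a minimal sketch of one offline cycle, assuming scikit-learn and a synthetic dataset (both are illustrative choices, not part of Prashast’s setup at Google): start with the simplest model that could work, then pit a richer candidate against it.

```python
# A minimal sketch of one offline iteration: baseline vs. candidate.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Iteration 1: the simplest model that could possibly work.
baseline = LogisticRegression(max_iter=1_000).fit(X_train, y_train)

# Iteration 2: a candidate with more capacity; in practice this is where new
# features, cleaned data, or tuned hyperparameters would come in.
candidate = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

for name, model in [("baseline", baseline), ("candidate", candidate)]:
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")

# Whichever wins offline still needs an online A/B test before it replaces
# the incumbent, and then the next iteration starts.
```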

To validate this, we spoke with data science teams about how they use ML. As you might expect, there is a wide spectrum, from extremely sophisticated teams processing petabytes of data and delivering billions of predictions every day to teams just getting a grip on training their first model.

Sure enough, mature teams are set up for fast iterations. Uber’s internal ML platform is one such example.

These teams were not always like that though. It seems that teams go through a curve of enlightenment.


There seem to be two types of organizations. One type takes the ‘lean AI’ approach; the other takes the ‘I read somewhere that we need an AI strategy’ approach.

The reason most teams go through this curve of enlightenment is that building an AI culture is a journey. Teams start with something simple they can deliver quickly, show value and then build on it. Most of the time, starting AI efforts means going backwards on the curve: teams spend time getting the data instrumented and cleaned and rethinking data infrastructure, because weaknesses there slow down any AI effort.

Teams that attempt to jump directly into the middle of the curve (“Let’s build out an ML platform because we have an AI strategy now”) usually fail. This approach highlights the disconnect between product teams (including data scientists!) and the boardroom. It’s no wonder that data scientists are frustrated and companies have an AI cold start problem.

Want to build an AI culture? Go through the curve and enable faster iterations.

Some ways that AI teams enable faster iterations are:

  1. Clean, well-labeled, consistent data
  2. One-click model training and deployment (a minimal sketch follows this list)
  3. Self-service data science. Reduce engineering dependencies so that data scientists can iterate on the model themselves: trying out new features, building new models and automating hyperparameter optimization
  4. Scalable, performant systems for data access, data manipulation, distributed model training, model deployment and experimentation, online prediction queries and candidate scoring
  5. Infrastructure peeps treat data scientists as first class citizens who use what they build
  6. The ‘ML dev’ environment mirrors the ‘ML production’ environment
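
Here is a minimal sketch of what point 2 above, one-click training and deployment, can look like. The dataset, file path and quality gate are stand-ins for whatever data source, artifact store and serving layer a team actually uses.

```python
# One command runs the whole train/evaluate/deploy chain.
import pickle
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

MODEL_PATH = Path("models/latest.pkl")   # hypothetical artifact location
MIN_ACCURACY = 0.80                      # hypothetical quality gate


def main() -> None:
    # "Clean, well-labeled, consistent data" (point 1) is assumed here; in
    # reality this step is where most of the iteration time goes.
    X, y = make_classification(n_samples=2_000, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    if accuracy < MIN_ACCURACY:
        raise SystemExit(f"Refusing to deploy: accuracy {accuracy:.2f} < {MIN_ACCURACY}")

    MODEL_PATH.parent.mkdir(parents=True, exist_ok=True)
    MODEL_PATH.write_bytes(pickle.dumps(model))
    print(f"Deployed {MODEL_PATH} with accuracy {accuracy:.2f}")


if __name__ == "__main__":
    main()
```

The point is not the specifics; it is that the whole chain runs as a single command, so an iteration costs minutes rather than an engineering ticket.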

Here is our version of the Joel test to measure the culture of an AI team.

  1. Data pipelines are versioned and reproducible
  2. Pipelines (re)build in one step
  3. Deploying to production needs minimal engineering help
  4. Successful ML is a long game. You play it like it is
  5. Kaizen. Experimentation and iterations are a way of life

This is why we built Blurr. Getting data together, processed, cleaned and mangled for machine learning is not a do-once-and-forget type of activity. This is the base of any AI effort and enabling continuous improvement on the base is critical for a successful AI culture. Blurr provides a high-level YAML-based language for data scientists/engineers to define data transformations. Replace 2 days of writing Spark code with 5 minutes on Blurr.
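
To ground what ‘data transformations’ means here, below is a toy version of the kind of event-to-features aggregation Blurr targets, written in plain pandas so it stays self-contained. The column names and events are invented, and this is not Blurr’s actual YAML syntax (that lives in the GitHub repo); the production equivalent is the hand-rolled Spark job mentioned above.

```python
# Turning raw event logs into per-user features for training.
import pandas as pd

events = pd.DataFrame(
    {
        "user_id": ["a", "a", "b", "a", "b"],
        "event": ["view", "click", "view", "purchase", "view"],
        "amount": [0.0, 0.0, 0.0, 19.99, 0.0],
    }
)

features = (
    events.groupby("user_id")
    .agg(
        n_events=("event", "size"),
        n_purchases=("event", lambda e: (e == "purchase").sum()),
        total_spend=("amount", "sum"),
    )
    .reset_index()
)
print(features)
```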

Blurr is open source because we believe that an open source approach will accelerate innovation in anything AI. We even develop in public — our weekly sprints are there on GitHub for everyone to see.

The DevOps tooling market exists to ship software faster.


Our vision is that there will be an MLOps market that helps teams ship ML products faster, and Blurr is the first technology we are putting out to enable this. Because the biggest problem right now is iterations on data.

“We have a data-driven culture. AI comes from data. Therefore, we have an AI culture!”

No, you don’t.

Being data-driven means removing human biases from decision making. Is a higher load time for the app a bad thing? Is this new model better than the old one? Let’s look at the data and decide!

AI culture is an algorithm-driven culture. Humans build machines that make decisions in a product. Algorithms are deployed to achieve human-crafted aims (improve ad CTR, conversion rate, engagement).

AI culture is being comfortable with probabilistic judgements. A product recommendation has a 60% chance of driving more engagement than this other recommendation. 40% of the time, it is not going to be better. Is that good? Start somewhere and improve it.

AI culture is a state of constant experimentation and iterations. Everything else in an organization needs to support that.

Humans are complicated. We expect deterministic behavior from machines when we ourselves are stochastic decision makers, complete with pattern recognition abilities and cognitive biases. Humans run companies and they play politics, which can be incredibly frustrating when trying to build an AI culture.

This makes me wonder how humans (and human-made societal structures like companies) will behave with super-intelligent machines. 2029, baby!

Blurr is in Developer Preview; be sure to check it out and star the project on GitHub!


