Five Lessons From 3,650 Days of Web Performance Work

Written by dlite | Published 2018/04/20


Disclosure: Manifold, the marketplace for independent developer services, has previously sponsored Hacker Noon. Use code HACKERNOON2018 to get $10 off any service.

I've spent the last decade of my professional life making Rails, Elixir, and Python apps faster and calmer. Below are five language-agnostic lessons I’ve learned the hard way.

1. Don’t motivate via case studies, motivate with the emotional toll of latency

Every 100 ms of latency loses Amazon 1% in sales! An extra 500 ms of page load time drops Google’s traffic by 20%! An electronic trading platform that is 5 ms slower than the competition triggers $4M in lost revenue per millisecond!

Web performance companies love fearmongering. The propaganda is frequently sound bites from case studies of extreme scenarios. You might expect me, a co-founder at Scout, an application performance management (APM) company, to produce infographics showing how your slow, unindexed SQL queries are slowly killing your business. YOU will be responsible for the premature death of your employer if you don’t install Scout!

The reality is that most of us in the Hacker Noon community are not responsible for apps the size of Amazon or Google. We cannot produce a scientifically sound study tying the latency of our apps to revenue. The real problem for every team, be it Google-scale or the 99% of apps that aren’t, is a stream of unpredictable stability problems that spreads distrust. A toxic work environment and a strained customer relationship form when on-call developers are worn out and sales and support teams can’t trust the technical team.

You need a reliable web app in the same way you don’t want to stress about how your car will hold up over a week-long road trip. A reliable app is a core emotional need of a healthy company.

2. Less focus on averages, more focus on outliers

After ten years, it’s still hard not to think of the response time distribution of requests to a web app as a bell curve. What feels natural is not what actually happens in real life: response times usually have a long tail of slow outliers rather than a symmetric curve.

This means that if you are optimizing your app based on aggregate data, you are optimizing a scenario that doesn’t exist. To better understand what’s going on for the slowest requests, you need access to unfiltered events. For example, requests may only be slow for a power user that has a lot of data in your system.
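To make the gap between the aggregate and the outliers concrete, here is a minimal Python sketch. The event data, the account names, and the `percentile` helper are all hypothetical and invented for illustration; they are not taken from Scout or any other tool. It compares the mean against the median and 95th percentile, then uses the raw events to see who sits in the slow tail.

```python
import statistics
from collections import Counter


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list (pct is an int, 1-100)."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    # Integer ceiling of pct/100 * n, which avoids floating point rounding.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


# Hypothetical raw events: (account, response time in ms).
# 90 requests from small accounts are fast, 10 from one power user are slow.
events = [("small_account", 80)] * 90 + [("power_user", 4000)] * 10

durations = sorted(ms for _, ms in events)
mean_ms = statistics.mean(durations)
median_ms = statistics.median(durations)
p95_ms = percentile(durations, 95)

# The unfiltered events let us ask which accounts are in the slow tail.
slow_accounts = Counter(account for account, ms in events if ms >= p95_ms)

print(f"mean={mean_ms} ms, median={median_ms} ms, p95={p95_ms} ms")
print(f"accounts at or above p95: {dict(slow_accounts)}")
```

In this made-up sample the mean lands at 472 ms, a response time that no request actually had, while the median and the 95th percentile show the two real populations. Only the per-event data reveals that the slow requests all belong to one account.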

Tooling is still catching up to handling these high-dimensionality events. Scout is one tool that lets you explore data collected from raw transaction traces via Trace Explorer.

Sending structured logs and events to systems like LogDNA and Honeycomb is also a great way to explore more generic multi-dimensional data.

3. Staging environments are pretty much useless

Outside of practicing for major changes that can’t be simulated in development, I’ve found staging environments to be a wasteland. It’s typically cost-prohibitive to replicate a production environment exactly in staging, and the load on staging is far less than in production. Staging ends up becoming an odd limbo phase halfway between development and production.

4. No shame in paying for database help

For web apps, the database is the most frequent bottleneck. There’s a lot involved in scaling a database. Some of the best money we’ve ever spent at Scout was when we brought in Percona to assist with configuring our databases as we grew. Be okay knowing you don’t possess DBA superpowers.

5. Logging is cheap. Use it.

When a customer reports an issue and all I can say is “works for me”, I feel pretty inept. I’m much more aggressive about logging sensitive areas of code today than I was ten years ago. A robust test suite can’t anticipate all of the odd edge cases that occur in production.
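As one way to picture this, here is a minimal Python sketch using only the standard library's `logging` and `json` modules. The `billing` logger name, the `apply_discount` function, the `WELCOME` code, and the field names are hypothetical stand-ins for whatever a sensitive code path looks like in your app. The idea is to log the inputs and the outcome as one structured line, so a "works for me" report can be checked against what actually happened.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # A dict passed as extra={"context": {...}} becomes record.context.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("billing")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def apply_discount(logger, account_id, total_cents, code):
    """Hypothetical sensitive code path: log inputs and result together."""
    discount_cents = 1000 if code == "WELCOME" else 0
    charged_cents = max(0, total_cents - discount_cents)
    logger.info(
        "discount applied",
        extra={"context": {
            "account_id": account_id,
            "code": code,
            "total_cents": total_cents,
            "charged_cents": charged_cents,
        }},
    )
    return charged_cents


if __name__ == "__main__":
    apply_discount(build_logger(sys.stdout), 42, 2500, "WELCOME")
```

Because each line is a JSON object, the same output can be searched by `account_id` locally or shipped to whichever log service you already use.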

Summary

When you build a calmer web app, you instill those same traits inside your company’s DNA.


Written by dlite | Working on booklet.ai, co-founded Scout Server Monitoring (acq. 2017) & ScoutAPM (acq. 2018).
Published by HackerNoon on 2018/04/20