How and Why I Self Host

Written by ikarus | Published 2020/06/26

TLDR: This is Will Ho's first post on his self-hosted blog. Ho, a Machine Learning Engineer, explains how and why he started self-hosting, which began when he hosted his own WordPress site for shits and giggles in his final year of university.

def main():
    print('Hello World!')
    print('This marks my first post on this blog!')

if __name__ == '__main__':
    main()
As I originally posted this piece on my self-hosted blog, I’ll start with an article on why and how I even embarked on this journey.
Self-hosting is the practice of running and maintaining a website using a private web server. (Wikipedia)
I started self-hosting not because I had a grand vision from the very beginning, but for a ridiculously superficial reason.

Ridiculous origins

In 2017, in my final year of my university education in Computer Science, I was staying in the university hostel and I had a ton of extra time on my hands. I was the most senior in the hostel and I wanted my room to stand out among the rest. Being my narcissistic self then, I wanted to build something ludicrous and flashy that screams handsome and nerdy at the same time, something that screams, ME.
And so I bought 5 used Raspberry Pi 1Bs for cheap and assembled them into a cluster.
I will go into the details of my build process in another post; for now, let's take it for what it is.
Turns out, I never really found a real use for the cluster. Several months flew by, and although I kept the cluster powered, I did nothing more than flash the SD cards with Raspbian. Talk about an expensive Christmas tree!
Then a day came when I thought I should try to host my own WordPress site for shits and giggles. It was a really painful process getting PHP and Nginx to work on such restrictive hardware, but eventually, after a month of tinkering, I made it work.

The Need for Space

With my confidence boosted from this monumental achievement, I started tackling the next problem at hand.
I was running out of space on my Dropbox.
My 25GB of free space from the Dropbox Space Race had expired a year earlier and since then I had been living on a couple of meagre gigabytes of space for all my academic files.
Dropbox Space Race?
For the uninitiated, the Dropbox Space Race was the hottest marketing campaign to rock universities in 2012. Students at each school collectively racked up referrals towards a stretch goal, and depending on their school's position on the leaderboard, each student was granted a certain amount of free space on Dropbox for 2 years.
My university was ranked #1 among the likes of Massachusetts Institute of Technology and UC Berkeley. Shows how crucial cloud storage is to us (or how FOMO we all are).
You may or may not know this, but file sync and cloud storage were incredibly important back then to students in a highly competitive environment such as mine. Given the amount of gaming students do, it is unsurprising for laptops to give up the ghost without warning. Hence, being able to back up and sync all your notes and homework from past semesters across your devices could make the difference between an A- and a B+ (or a C+ and a D depending on where you stand on the bell curve).
Thus I searched for Dropbox alternatives and found Nextcloud, a multi-platform FOSS (Free and open-source software) cloud storage/file-sync. It also doubles as a photo backup solution allowing photos to be synced at full-size without the sketchy 'High Quality' compression of Google Photos.
Coincidentally, it runs on the LAMP stack (Linux, Apache, MySQL, PHP), the same kind of setup as WordPress, which I was already familiar with.
I then proceeded to tinker with it over 2 weeks before I finally got it to work the way I wanted, tweaking maximum file upload sizes, applying some MySQL InnoDB hacks and so on.

Pain-point Discoveries & Patches

Some say self-hosting is an addiction, in that you tend to 'come back' for more after the first. I didn't understand it at first but as I reflected while typing this article, I saw why.
Over the next few years, I gradually discovered more and more pain-points in my day-to-day workflow, and I patched each of those with a self-hosted FOSS alternative to paid software. For example, I realized I needed a password manager, a media library manager, a media streamer to access my media on the go, a cross-platform note-taking/syncing app, etc.
As I delved ever deeper into the world of open source, I found an entire universe of great software maintained by a vibrant community. Deploying new applications and testing them soon became an activity I look forward to every week.
A few of my all-time favourite self-hosted apps:
Before I knew it, my cluster grew from 6 nodes of Raspberry Pi 1B running apps on the bare-metal operating system, to a 10-node hybrid cluster of armv7 and arm64 SBCs running 56 containerized workloads on Kubernetes.
I'm the kind of person that gets a small kick out of giving the finger to those big corps by achieving what their products do at a fraction of the cost. Let's just say I'm the worst person to introduce self-hosting to.

How has it been for me?

It has been an incredibly rewarding journey so far with my cluster(s), and along the way I have picked up a wide range of proficiencies, from Linux, Docker, Docker Swarm and Kubernetes to DNS, TLS and network topology. Being on the cutting edge of FOSS, I was able to discover and report bugs, submit pull requests, and I have since started my own small projects.
Without self-hosting as my motivation, I would have given in to procrastination and never gotten around to actually doing all the things I've done.
To sum it up, it's an experience like no other.
Now that you've heard my story, that brings us to the next question.

Should you self-host?

There are many reasons why you should or should not self-host your applications; I'll attempt to highlight the top few.
You should self-host if:
1. You're still reading patiently at this point
2. You enjoy tinkering with stuff and fussing over minute details, spending many hours or at times even days on the tiniest of problems
3. You are a generalist who wants to learn anything and everything about computers
4. You like the cheap thrills you get from being able to achieve everything big companies do with their products at little to no cost
5. You want to own your data and web services
I put ownership of data as the final point because I believe that complete ownership of your data is a moonshot, if not an impossible goal, in today's highly interconnected world, and it should be the last motivation you have when it comes to self-hosting.
I cannot emphasize point #2 enough. Over the past 3 years of self-hosting, I've run into numerous problems and spent many days debugging, only to find that the problem was a typo in the configuration or, worse, a yet-to-be-discovered bug in the FOSS I was trying to host. I'll illustrate this with an example.

A painful typo you'll never spot

Just 4 weeks ago, I was trying to deploy OpenLDAP, an open-source implementation of LDAP (Lightweight Directory Access Protocol), a client-server protocol for accessing directory services that is commonly used for storing credentials and user metadata. This was meant to be the core of my SSO service, where users can use a single username and password to access all my services. Have a look at my Kubernetes env configuration for osixia/openldap-backup, a Docker image of OpenLDAP.
          containers:
          - name: openldap
            image: osixia/openldap-backup:1.3.0
            env:
            - name: LDAP_ORGANIZATION
              value: ikarus
            - name: LDAP_DOMAIN
              value: ikarus.sg
            - name: LDAP_BASE_DN
              value: 'dc=ikarus,dc=sg'
            - name: LDAP_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: openldap
                  key: admin_password
            - name: LDAP_READONLY_USER
              value: 'true'
For 2 days, I struggled to understand why OpenLDAP did not pick up the LDAP_ORGANIZATION value from the environment variable: deleting, redeploying, changing the order of the variables. Only when I actually dug into the code did I see how trivial this error was.
    slapd slapd/internal/generated_adminpw password ${LDAP_ADMIN_PASSWORD}
    slapd slapd/internal/adminpw password ${LDAP_ADMIN_PASSWORD}
    slapd slapd/password2 password ${LDAP_ADMIN_PASSWORD}
    slapd slapd/password1 password ${LDAP_ADMIN_PASSWORD}
    slapd slapd/dump_database_destdir string /var/backups/slapd-VERSION
    slapd slapd/domain string ${LDAP_DOMAIN}
    slapd shared/organization string ${LDAP_ORGANISATION} ## THIS LINE 
    slapd slapd/backend string ${LDAP_BACKEND^^}
The variable name should have been LDAP_ORGANISATION and not LDAP_ORGANIZATION.
Looking back at the documentation, I couldn't believe I had missed that.
Turns out the organization maintaining the Docker image of OpenLDAP, osixia, is based in Nantes, France, and that was probably why they wrote the variable names in British English instead of American English. 🤦
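A small check along these lines might have caught the mismatch before I lost 2 days to it. Treat it as a minimal sketch rather than anything the osixia image or Kubernetes provides: the function names `referenced_vars` and `find_suspect_names` are hypothetical, and it assumes you have both the template text and the list of env names you configured at hand. It uses Python's standard `difflib` to flag names you set that the template never reads but that closely resemble one it does.

```python
import difflib
import re

# Matches the variable name in ${NAME} and ${NAME^^} style references.
VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)")


def referenced_vars(template):
    """Return the set of variable names the template actually reads."""
    return set(VAR_PATTERN.findall(template))


def find_suspect_names(configured, template, cutoff=0.9):
    """Map each configured name the template never reads to the
    closest name it does read, if one is similar enough."""
    known = sorted(referenced_vars(template))
    suspects = {}
    for name in configured:
        if name in known:
            continue  # the template reads this one, nothing to flag
        close = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
        if close:
            suspects[name] = close[0]
    return suspects


if __name__ == "__main__":
    # Excerpt of the template lines shown above.
    template = """
    slapd slapd/domain string ${LDAP_DOMAIN}
    slapd shared/organization string ${LDAP_ORGANISATION}
    slapd slapd/backend string ${LDAP_BACKEND^^}
    """
    # Names taken from the Kubernetes env configuration shown earlier.
    configured = ["LDAP_ORGANIZATION", "LDAP_DOMAIN", "LDAP_BASE_DN"]
    for wrong, right in find_suspect_names(configured, template).items():
        print(f"{wrong} is never read; did you mean {right}?")
```

The high cutoff is a deliberate choice: names that are set but simply consumed elsewhere (such as LDAP_BASE_DN here) stay quiet, and only near-identical spellings are reported.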
If what you just read terrifies you deeply, I'd suggest you reconsider.
You should NOT self-host if:
1. You prioritize convenience and reliability of services over all else
2. You hate getting stuck on absolutely meaningless problems
3. You are not the least bit concerned about ownership of your data and web services

Closing note

For those who have decided to self-host after reading this, I have to tell you that the reliability of self-hosted services falls far short of what you get from paying for an equivalent service. No matter how much experience you have, you're bound to run into reliability problems at some point in your journey.
If you're still okay with that, then go forth!

What's next for my blog?

In the upcoming posts, I'll showcase my network map, detail the build process and evolution of my clusters, and share some of the things I learnt that you can do to squeeze more out of your Raspberry Pi/SBC.
Originally published at https://ikarus.sg on June 12, 2020.

Written by ikarus | Machine Learning Engineer
Published by HackerNoon on 2020/06/26