Django SaaS Architecture: Single-Tenant vs Multi-Tenant - Which Is Right for You?

Written by pbityukov | Published 2023/11/01
Tech Story Tags: django | multi-tenant | postgres | python | saas | saas-development | single-tenant


Django is a popular framework that you can select to develop an application for your company. But what if you want to create a SaaS application that multiple clients will use? What architecture should you choose? Let’s see how this task can be approached.

Single-Tenant Architecture

The most straightforward approach is to create a separate instance for each client you have. Let’s say we have a Django application and a database. Then, for each client, we need to run its own database and application instance. That means that each application instance has only one tenant.

This approach is simple to implement: you just need to start a new instance of every service that you have. But at the same time, it can cause a problem: each client will significantly increase the cost of the infrastructure. It may not be a big deal if you plan to have just a few clients or if each instance is tiny.

However, let’s assume that we are building a large company that provides a corporate messenger to 100,000 organizations. Imagine how expensive it would be to duplicate the whole infrastructure for each new client! And when we need to update the application version, we have to deploy it for each client, so deployments slow down too.

Multi-Tenant Architecture

There is another approach that can help when we have a lot of clients for the application: a multi-tenant architecture. It means that we have multiple clients, which we call tenants, but they all use a single instance of the application.

While this architecture solves the problem of the high cost of dedicated instances for each client, it introduces a new problem: how can we be sure that the client’s data is securely isolated from other clients?

We will discuss the following approaches:

  1. Using a shared database and shared database schema: We can identify which tenant owns the data by the foreign key that we need to add to each database table.

  2. Using a shared database, but separate database schemas: This way, we won’t need to maintain multiple database instances but will get a good level of tenant data isolation.

  3. Using separate databases: it looks similar to the single-tenant example, but won’t be the same, as we will still use a shared application instance and select which database to use by checking the tenant.

Let’s dive deeper into these ideas and see how to integrate them with the Django application.

A Shared Database With Shared Schema

This option may be the first that comes to mind: to add a ForeignKey to the tables, and use it to select appropriate data for each tenant. However, it has a huge disadvantage: the tenants’ data is not isolated at all, so a small programming error can be enough to leak the tenant’s data to the wrong client.

Let’s take an example of a database structure from the Django documentation:

from django.db import models


class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField("date published")


class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

We’ll need to identify which records are owned by which tenant. So, we need to add a Tenant table and a foreign key in each existing table:

class Tenant(models.Model):
    name = models.CharField(max_length=200)


class Question(models.Model):
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField("date published")


class Choice(models.Model):
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

To simplify the code a little bit, we can create an abstract base model that will be reused in each other model that we create.

class Tenant(models.Model):
    name = models.CharField(max_length=200)


class BaseModel(models.Model):
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)

    class Meta:
        abstract = True


class Question(BaseModel):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField("date published")


class Choice(BaseModel):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

As you can see, there are at least two major risks here: a developer can forget to add a tenant field to the new model, or a developer can forget to use this field while filtering the data.
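The second risk can be seen at the SQL level. The sketch below is only an illustration: it uses Python’s built-in sqlite3 module instead of the Django ORM so that it runs on its own, the table layout loosely mirrors the Question model above, and the helper name questions_for_tenant is hypothetical. An unscoped query returns every tenant’s rows, while a helper that takes the tenant as a required argument cannot be called without the filter.

```python
import sqlite3


def questions_for_tenant(conn, tenant_id):
    """Return the question texts owned by a single tenant.

    tenant_id is a required argument, so callers cannot skip the filter.
    """
    rows = conn.execute(
        "SELECT question_text FROM question WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return [text for (text,) in rows]


# In-memory database with one shared table, as in the shared-schema approach.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question ("
    "id INTEGER PRIMARY KEY, "
    "tenant_id INTEGER NOT NULL, "
    "question_text TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO question (tenant_id, question_text) VALUES (?, ?)",
    [(1, "Tenant 1 question"), (2, "Tenant 2 question")],
)

# Forgetting the tenant filter exposes both tenants' rows.
unscoped = [
    text
    for (text,) in conn.execute("SELECT question_text FROM question ORDER BY id")
]

# The scoped helper returns only the requested tenant's rows.
scoped = questions_for_tenant(conn, 1)
```

In Django terms, the unscoped query corresponds to Question.objects.all() and the scoped one to Question.objects.filter(tenant=tenant).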

The source code for this example can be found on GitHub: https://github.com/bp72/django-multitenancy-examples/tree/main/01_shared_database_shared_schema.

A Shared Database With Separate Schemas

Keeping in mind the risks of the shared schema, let’s consider another option: the database will still be shared, but we’ll create a dedicated schema for each tenant. For the implementation, we can look at the popular library django-tenants (documentation).

Let’s add django-tenants to our small project (the official installation steps can be found here).

The first step is the library installation via pip:

pip install django-tenants

Change the models: the Tenant model will now be in a separate app, and the Question and Choice models won’t have a connection with the tenant anymore. As different tenants’ data will be in separate schemas, we won’t need to link individual records to tenant rows.

The file tenants/models.py

from django.db import models
from django_tenants.models import TenantMixin, DomainMixin


class Tenant(TenantMixin):
    name = models.CharField(max_length=200)

    # default true, schema will be automatically created and synced when it is saved
    auto_create_schema = True


class Domain(DomainMixin):  # a required table for django-tenants too
    ...

The file polls/models.py

from django.db import models


class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField("date published")


class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

Notice that Question and Choice don’t have a foreign key to Tenant anymore!

The other change is that Tenant now lives in a separate app. This is not only a cleaner separation of domains; it also matters because the tenants table must be stored in the shared schema, while the polls tables are created in each tenant’s schema.

Make changes to the settings.py file to support multiple schemas and tenants:

DATABASES = {
    'default': {
        'ENGINE': 'django_tenants.postgresql_backend',
        # ..
    }
}
DATABASE_ROUTERS = (
    'django_tenants.routers.TenantSyncRouter',
)
MIDDLEWARE = (
    'django_tenants.middleware.main.TenantMainMiddleware',
    #...
)
TEMPLATES = [
    {
        #...
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.request',
                #...
            ],
        },
    },
]
SHARED_APPS = (
    'django_tenants',  # mandatory
    'tenants', # you must list the app where your tenant model resides in

    'django.contrib.contenttypes',

    # everything below here is optional
    'django.contrib.auth',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.messages',
    'django.contrib.admin',
)

TENANT_APPS = (
    # your tenant-specific apps
    'polls',
)

INSTALLED_APPS = list(SHARED_APPS) + [app for app in TENANT_APPS if app not in SHARED_APPS]

TENANT_MODEL = "tenants.Tenant"

TENANT_DOMAIN_MODEL = "tenants.Domain"

Next, let’s create and apply the migrations:

python manage.py makemigrations

python manage.py migrate_schemas --shared

As a result, we’ll see that the public schema will be created and will contain only shared tables.

We’ll need to create a default tenant for the public schema:

python manage.py create_tenant --domain-domain=default.com --schema_name=public --name=default_tenant

Set is_primary to True if asked.

And then, we can start creating the real tenants of the service:

python manage.py create_tenant --domain-domain=tenant1.com --schema_name=tenant1 --name=tenant_1
python manage.py create_tenant --domain-domain=tenant2.com --schema_name=tenant2 --name=tenant_2

Notice that the database now has two more schemas, tenant1 and tenant2, and each of them contains its own polls tables.

Now, you’ll get the Questions and Choices from different schemas when you call APIs on the domains that you set up for the tenants - all done!
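Conceptually, the middleware maps the request’s hostname to a tenant and then points the PostgreSQL connection at that tenant’s schema through search_path. The sketch below is a simplified illustration of that idea, not the django-tenants implementation: the function names resolve_schema and search_path_sql and the in-memory domain mapping are assumptions made for the example, and the exact SQL that the library issues may differ.

```python
def resolve_schema(host, domain_to_schema):
    """Map a request's Host header to a tenant schema name.

    The port, if any, is dropped and the hostname is compared in lowercase.
    IPv6 literals are not handled in this simplified sketch.
    """
    hostname = host.split(":", 1)[0].lower()
    try:
        return domain_to_schema[hostname]
    except KeyError:
        raise LookupError(f"No tenant is registered for {hostname!r}") from None


def search_path_sql(schema):
    """Build a statement that makes unqualified table names resolve in
    the tenant's schema first and in the shared public schema second."""
    if not schema.isidentifier():
        raise ValueError(f"Unsafe schema name: {schema!r}")
    return f'SET search_path TO "{schema}", public'


# Hypothetical mapping that mirrors the tenants created above.
DOMAINS = {"tenant1.com": "tenant1", "tenant2.com": "tenant2"}

schema = resolve_schema("tenant1.com:8000", DOMAINS)
statement = search_path_sql(schema)
```

Because the application code keeps querying Question and Choice without any tenant filter, the isolation comes from the schema selected for the connection rather than from each individual query.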

Although the setup is more complicated, and may be harder still if you are migrating an existing app, the approach has clear advantages, such as better isolation of the tenants’ data.

The code of the example can be found here.

Separate Databases

The last approach that we will discuss today is going even further and having separate databases for the tenants.

This time, we’ll have several databases: a default_db that stores the shared data, such as the mapping of tenants to database names, and a separate database for each tenant.

Then we’ll need to set the databases config in the settings.py:

DATABASES = {
    'default': {
        'NAME': 'default_db',
        # ...
    },
    'tenant_1': {
        'NAME': 'tenant_1',
        # ...
    },
    'tenant_2': {
        'NAME': 'tenant_2',
        # ...
    },
}

And now, we’ll be able to get the data for each tenant by calling the QuerySet method using():

Question.objects.using('tenant_1')...
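Passing the alias at every call site is easy to forget, so one option is to let a Django database router pick it. The following is a minimal sketch under stated assumptions, not part of the original example: the names current_tenant_db, SHARED_APP_LABELS and TenantDatabaseRouter are hypothetical, and the sketch assumes that some middleware sets the context variable for each request. Only the method names db_for_read and db_for_write come from Django’s router interface.

```python
from contextvars import ContextVar

# Hypothetical holder for the database alias of the tenant that is being
# served; a middleware would be expected to set it for every request.
current_tenant_db = ContextVar("current_tenant_db", default="default")

# Assumption: models of these apps always live in the shared default database.
SHARED_APP_LABELS = {"tenants"}


class TenantDatabaseRouter:
    """Route shared models to 'default' and all others to the current tenant."""

    def _alias_for(self, model):
        if model._meta.app_label in SHARED_APP_LABELS:
            return "default"
        return current_tenant_db.get()

    def db_for_read(self, model, **hints):
        return self._alias_for(model)

    def db_for_write(self, model, **hints):
        return self._alias_for(model)
```

To try it, the class would be listed in the DATABASE_ROUTERS setting, after which a plain Question.objects.all() goes to whichever alias is stored in current_tenant_db.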

The downside of the method is that you’ll need to apply all migrations on each database by using:

python manage.py migrate --database=tenant_1
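With more than a handful of tenants, this step is usually scripted. The helper below is a rough sketch: the function names migrate_commands and migrate_all are made up for this example, and it assumes that manage.py sits in the current working directory. It builds one migrate command per alias in the DATABASES setting and runs them one after another.

```python
import subprocess
import sys


def migrate_commands(databases, manage_py="manage.py"):
    """Build one 'migrate --database=<alias>' command per configured alias."""
    return [
        [sys.executable, manage_py, "migrate", f"--database={alias}"]
        for alias in databases
    ]


def migrate_all(databases, manage_py="manage.py"):
    """Run the migrations database by database, stopping at the first failure."""
    for command in migrate_commands(databases, manage_py):
        subprocess.run(command, check=True)
```

Calling migrate_all(settings.DATABASES) from a deployment script would then cover the default database and every tenant database in a single pass.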

It may also be less convenient to create a new database for each tenant than to use django-tenants or a foreign key, as in the shared schema approach.

On the other hand, the isolation of the tenants’ data is really good: the databases can be physically separated. Another advantage is that we aren’t limited to PostgreSQL, which django-tenants requires; we can select any engine that suits our needs.

More information on the multiple databases topic can be found in the Django documentation.

Comparison

| | Single-tenant | MT with shared schema | MT with separate schema | MT with separate databases |
|---|---|---|---|---|
| Data isolation | ✅ High | ❌ Lowest | ✅ High | ✅ High |
| Risk of leaking data accidentally | ✅ Low | ❌ High | ✅ Low | ✅ Low |
| Infrastructure cost | ❌ Higher with each tenant | ✅ Lower | ✅ Lower | ✅❌ Lower than single-tenant |
| Deployment speed | ❌ Lower with each tenant | | ✅❌ Migrations will be slower as they need to be executed for each schema | ✅❌ Migrations will be slower as they need to be executed for each database |

Easy to implement: ❌ Requires a lot of changes if the service was already implemented as a single-tenant app

Conclusion

To summarize, there is no silver bullet for this problem: each approach has its pros and cons, so it’s up to the developers to decide which trade-offs they can accept.

Separate databases provide the best isolation for the tenants’ data and are simple to implement; however, maintenance costs more: there are n databases to update, and the number of database connections is higher.

A shared database with separate schemas is a bit complex to implement and might have some problems with migrations.

Single-tenant is the simplest to implement, but it costs you in resource over-consumption, since you run an entire copy of your service per tenant.


Written by pbityukov | Backend Software Engineer.
Published by HackerNoon on 2023/11/01