Multi-tenancy after 10 years of Cloud Computing

It has been ten years since Amazon launched EC2. Cloud is very much real. Thousands of systems run in the Cloud.

Multi-tenancy has been a key concept of the cloud since its inception. It lets multiple parties that do not trust each other share resources while giving each the illusion of having its own space. The best example of this idea is an apartment complex, which gives each tenant his own space while sharing resources such as plumbing, common areas, security, and maintenance. For multi-tenancy to be useful, the resource cost per tenant should be less than the cost of owning the resource (the house) exclusively.

A multi-tenant App in the Cloud needs to manage three types of resources: OLTP executions, data storage, and OLAP batch executions (analytics). For example, think of an App that has a web app or a mobile app as its front end. The App will have a set of services that provide the backend, a database, and a system to process its analytics. Since state-of-the-art architectures are built using stateless servers, separate databases, and separate analytics systems, we can tackle multi-tenancy separately for each aspect: execution, data, and analytics. Let’s explore each of these aspects and discuss techniques for supporting them.

Execution Multi-tenancy

Cloud can implement sharing at different levels.

1. Give each tenant his own machine.
2. Give each tenant his own VM.
3. Give each tenant his own container (e.g. Docker).
4. Let multiple tenants share the same process (e.g. JVM).

Going down the list, each level offers more resource sharing but less isolation (security). IaaS (Infrastructure as a Service) platforms provide level 2 in the above list. PaaS and SaaS providers should choose between levels 2 and 3.

One of the key advantages of the Cloud, if it is properly done, is that a user only pays for what she uses. In other words, there is no cost for merely being available; you pay only when the system is actually used. For IaaS such as AWS, this comes almost for free because hardware is shared across users through VMs.

However, “Pay for what you use” does not come free for PaaS and SaaS. Let’s assume a Cloud provider “XCloud” that has 1 million users, who have deployed their PaaS/SaaS apps in XCloud.

At any given time, only a few hundred to a few thousand of those Apps will be active. However, we do not know which Apps. So when a user arrives at XCloud for AppY, out of the blue, we need to be able to serve him. We have two choices.

1. Keep at least one instance of each App running.
2. Boot up the App fast when a user arrives.

Option one is really wasteful. Suppose AppY is Gmail: Google does not want to keep a VM running for each Gmail account all the time. If a provider keeps one instance running for each user, it is hard to let users pay only when their App is used.

This is the difference between VMs on one side and containers or in-process multi-tenancy on the other. VMs take seconds to minutes to boot, and unless your app is really simple, there is no way you can boot it up fast enough to serve the user who just arrived. Typically, you need to get back to the user in less than ten seconds.

Containers such as Docker solve this problem. They act like lightweight VMs that share the host kernel, and they can often start in well under a second. Thinking about the Gmail example, it is no wonder that much of the early work on containers came from Google, which faces the same problem. It is worth noting that using Docker does not guarantee that your app will start fast; you can very well ruin the start time in your App’s own code. However, with careful coding, you can often hit the mark. Hence, containers remove a major overhead added by the infrastructure.

Another interesting point is that this is why microservices and Docker-based systems care so much about start time. If your system can boot up fast enough to serve the user, that leads to major cost savings because you do not need to keep an instance running all the time for each App. Moreover, it significantly simplifies autoscaling algorithms, because the algorithm no longer needs to predict the load and keep a buffer of instances running to cover the startup time.
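To make the cost argument concrete, here is a minimal sketch of booting an App on demand and stopping it once it goes idle. The class name `OnDemandPool`, its methods, and the idle timeout are all assumptions made for illustration; a real platform would start and stop containers where this sketch only updates a dictionary.

```python
import time
from typing import Callable, Dict


class OnDemandPool:
    """Starts an app instance on first request and stops it after an idle period."""

    def __init__(self, idle_timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.idle_timeout_s = idle_timeout_s
        self.clock = clock
        self.last_used: Dict[str, float] = {}  # app id -> time of last request
        self.cold_starts = 0

    def handle(self, app_id: str) -> str:
        now = self.clock()
        if app_id not in self.last_used:
            # Cold start: this is where the container boot time would be paid.
            self.cold_starts += 1
        self.last_used[app_id] = now
        return f"served {app_id}"

    def reap_idle(self) -> int:
        """Stop instances that have been idle longer than the timeout."""
        now = self.clock()
        idle = [a for a, t in self.last_used.items()
                if now - t > self.idle_timeout_s]
        for app_id in idle:
            del self.last_used[app_id]
        return len(idle)

    def running(self) -> int:
        return len(self.last_used)
```

With this shape, the number of running instances follows the number of recently active Apps rather than the number of deployed Apps. The price is one cold start per idle period, which is only acceptable if the boot is fast.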

Compared with containers, in-process multi-tenancy (level 4 in our list) can be even more efficient. We have done a lot of work in this area [1,2]. However, sharing a process looks scary in retrospect, as you need to trust that the programmers of the platform have not made a mistake. Also, performance isolation is tricky with in-process multi-tenancy.

It seems Cloud computing has settled on container-based multi-tenancy for executions, the middle ground. Containers provide acceptable performance and isolation. In retrospect, it is a sensible choice.

Finally, there is a much simpler solution that works in some cases.

It is worth noting that if your servers are stateless, which is the case in most enterprise deployments, then you can do without execution multi-tenancy altogether. If a server is stateless and can look at the request, figure out the tenant, and do just in time whatever is required for that tenant, then you do not need multi-tenancy. Instead, you can run a pool of servers common to all tenants and assign one to each app as users arrive. For data, you can use a multi-tenant database as described in the next section. If this solution is possible, it will reduce the complexity of your architecture significantly.
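The stateless approach can be sketched as a request handler that resolves the tenant from the request itself. The header name `X-Tenant-ID`, the `TENANTS` registry, and `handle_request` are hypothetical names chosen for this illustration, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Dict, Mapping


@dataclass(frozen=True)
class TenantConfig:
    greeting: str


# Hypothetical tenant registry; in practice this would live in a shared store.
TENANTS: Dict[str, TenantConfig] = {
    "acme": TenantConfig(greeting="Hello from Acme"),
    "globex": TenantConfig(greeting="Welcome to Globex"),
}


def handle_request(headers: Mapping[str, str], path: str) -> str:
    """Serve any tenant from any server: no per-tenant state is kept between calls."""
    tenant_id = headers.get("X-Tenant-ID")  # assumed header name
    if tenant_id is None or tenant_id not in TENANTS:
        return "403 unknown tenant"
    config = TENANTS[tenant_id]  # looked up just in time for this request
    return f"200 {config.greeting} ({path})"
```

Because the handler keeps nothing between requests, any server in the common pool can answer any tenant, and adding capacity simply means adding more identical servers.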

Data Multi-tenancy

After a restart, both VMs and containers lose their local disks. Hence, making stateful servers such as databases and message brokers multi-tenant is much more complicated than just running them in a container. One option is to use mounted storage such as S3 or block devices. However, this can be slow. For most applications, we need a database system that runs on top of proper hardware.

Even on top of hardware, we need to share storage across tenants. There are several choices. The article Multi-Tenant Data Architecture provides a great outline of these choices, except for choice #5 below, which did not exist when that article was written.

1. A database server per tenant
2. A database per tenant (the same server is shared between multiple tenants)
3. A table per tenant
4. A table shared among tenants
5. Multi-tenant aware databases

Approaches #1 and #2, giving each tenant its own database server or its own database, are acceptable in an IaaS setup but prohibitively expensive in a PaaS or SaaS setup. For example, it is not practical to keep a million databases, one for each tenant.

Approach #3, giving a table per tenant, is better than #1 and #2, but still prohibitive if we are talking about a million tenants.

Approach #4, having one table shared between many tenants, provides the necessary performance. However, all tenants need to share the same schema with this method, which is acceptable for most PaaS and SaaS scenarios. Just as with in-process multi-tenancy, for isolation we need to trust that the programmers have not made any mistakes, in this case in the queries that filter rows by tenant. However, it is easier to verify SQL filtering than to verify the Java or C++ code used with in-process multi-tenancy.
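As a minimal sketch of approach #4, the example below keeps all tenants in one table and filters by a tenant column. The table `orders`, the column `tenant_id`, and the helper `orders_for` are assumptions made for illustration, and SQLite stands in for whatever database you actually use.

```python
import sqlite3

# One table shared by all tenants, with a tenant_id column on every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (tenant_id TEXT NOT NULL, item TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO orders (tenant_id, item) VALUES (?, ?)",
    [("acme", "anvil"), ("acme", "rocket"), ("globex", "laser")],
)


def orders_for(conn: sqlite3.Connection, tenant_id: str) -> list:
    """All reads go through this helper so the tenant filter is never skipped."""
    rows = conn.execute(
        "SELECT item FROM orders WHERE tenant_id = ? ORDER BY item",
        (tenant_id,),
    )
    return [item for (item,) in rows]
```

Isolation here rests on a single WHERE clause, which is why it is comparatively easy to review: if every query is funnelled through a helper like this, there are few places where a mistake can hide.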

The fifth approach (e.g. Oracle now has a multi-tenant database server) aims to provide the best of all worlds. If carefully implemented, it can provide performance at the level of #4 or better, along with isolation at the database level. Although we still have to trust the RDBMS developers not to make a mistake, I believe there is a better chance of handling isolation correctly within the RDBMS. Moreover, database developers are likely to understand the domain much better, which lets them do a better job.

Analytics Multi-tenancy

Thanks to Big Data, everything must have analytics. For most SaaS applications, analytics has become a competitive advantage. The multi-tenancy requirements for analytics are different from those of the OLTP use cases we have considered so far. Implementing them in a PaaS or SaaS environment requires answers to several challenges.

Analytics includes data collection, data storage, running the analysis, and providing controlled access to the results. Cloud providers can do the following to facilitate data collection.

Add instrumentation to track transactions
Provide data collectors that users can place within their apps
Provide a data sink API (e.g. REST/JSON API) to which users can publish events

Each method should track tenant and user information with each event.
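For instance, an event published to a data sink could carry its owner as shown in the sketch below. The field names and the `to_json` helper are assumptions made for this example, not a standard event format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    tenant_id: str      # which tenant owns this event
    user_id: str        # which user within the tenant produced it
    event_type: str
    timestamp_ms: int
    payload: Dict[str, Any] = field(default_factory=dict)


def to_json(event: Event) -> str:
    """Serialize an event roughly the way a REST/JSON sink might receive it."""
    if not event.tenant_id or not event.user_id:
        raise ValueError("every event must carry tenant and user information")
    return json.dumps(asdict(event), sort_keys=True)
```

Rejecting events without an owner at the point of collection keeps the later storage and analysis steps simple, since they can rely on every record already having ownership attached.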

For the remaining steps, the solutions for storage, analysis, and data collection are intertwined with each other. In my opinion, the answer depends on several questions about what users need.

Do all tenants share the same schema? Should they be able to define their own event types?
Do users need user-level data isolation, or can they live with tenant-level isolation?
Do users and tenants need to run their own queries (analysis), or can they live with pre-canned queries?

Based on these requirements, solutions change. Let’s look at each solution.

Use Super-tenant Space

If the PaaS controls the data generation, the data analysis, and the presentation of the final results, then the system can push data into a super-tenant store and handle permissions itself. Although this might be a boring solution, it is a common use case for most SaaS apps, which give their users pre-cooked analytics and nothing else. If users want to do their own analysis, the system can offer a way to export their data.

Shared Tables Filtered by a Column

Just as with data storage, if all users can share the same set of tables (schema), we can store the data by simply adding user and tenant columns to the table. This works pretty well. Assigning ownership to the results calculated by the analysis, however, is more complex. Hence, the analysis logic (e.g. Hadoop jobs) has to resolve the ownership of each record by carrying the ownership data as part of the data record. Another advantage of this method is that the PaaS can run a single set of analytics queries once over all the data, partitioned using “group by” operators.
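The idea of one job over all tenants, with ownership carried in each record, can be sketched in plain Python as follows. The record layout and the function names are assumptions made for illustration; a real deployment would express the same grouping in its analytics engine.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each record carries its owner: (tenant_id, user_id, amount).
Record = Tuple[str, str, float]


def totals_by_owner(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """One pass over all tenants' data, grouped by (tenant, user)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for tenant_id, user_id, amount in records:
        totals[(tenant_id, user_id)] += amount
    return dict(totals)


def results_for_tenant(totals: Dict[Tuple[str, str], float],
                       tenant_id: str) -> Dict[str, float]:
    """Hand each tenant only the result rows it owns."""
    return {user: value for (t, user), value in totals.items() if t == tenant_id}
```

The grouping key doubles as the ownership information of each result row, so access control on the results reduces to filtering by that key.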

Tenant’s Own Space

Providing a private space for each tenant is the most flexible option. It gives the end user the most control. Each tenant can have his own schema and his own analytics queries.

As usual, this flexibility has to be paid for. It is expensive for several reasons.

If there are a lot of tenants, giving a database table to each tenant is a problem with conventional databases, as they are not designed to handle millions of tables or databases. As discussed before, one solution is to use a natively multi-tenant database (e.g. Oracle 12c). Otherwise, the PaaS needs to manage many database servers and partition tables between those servers, which is very complicated.
Unlike the earlier method, where the SaaS can run a single analytics job to process data across all tenants (e.g. partitioned using “group by”), giving each tenant his own space requires an analytics job per tenant. Initiating an analytics job often has significant overhead. Hence, running many jobs will be significantly slower than running a single job, and the SaaS will need a much larger computing cluster. Another challenge is that a user can run a heavy job that hogs the cluster and slows down others. Hence, the SaaS needs a way to limit the amount of computing power used by a single tenant.
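One simple way to limit the computing power a single tenant can use is a per-tenant budget that is checked before a job is admitted. The sketch below is only an illustration under that assumption; `TenantQuota`, its limit, and the idea of estimating a job's cost up front are not taken from any specific scheduler.

```python
from typing import Dict


class TenantQuota:
    """Tracks compute seconds reserved per tenant within one accounting window."""

    def __init__(self, limit_s: float):
        self.limit_s = limit_s
        self.used_s: Dict[str, float] = {}

    def try_reserve(self, tenant_id: str, estimated_s: float) -> bool:
        used = self.used_s.get(tenant_id, 0.0)
        if used + estimated_s > self.limit_s:
            # Job is rejected (or could be queued); other tenants are unaffected.
            return False
        self.used_s[tenant_id] = used + estimated_s
        return True
```

Each tenant draws from its own budget, so a heavy job from one tenant cannot consume the capacity set aside for the others.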

It is possible to build a hybrid solution where the SaaS provides a shared schema for shared data and a tenant’s own space for user-defined data. The PaaS can charge users extra for private space and thereby limit the number of users who take it. In either case, the cost is proportional to the data generated by the tenant, and the pricing structure needs to be adjusted accordingly.

Conclusion

After ten years of Cloud computing, thousands of systems run in the Cloud. This article explored multi-tenancy, a key idea in the cloud. Multi-tenancy is the ability to run multiple tenants (users) within the same system. To realize multi-tenancy, a Cloud may choose between multiple levels of isolation, ranging from sharing the same hardware using VMs to sharing the same process through clever programming. We explored how to build a multi-tenant App. A multi-tenant App in the Cloud needs to manage three types of resources: OLTP executions, data storage, and OLAP batch executions (analytics). We explored each and saw that executions are converging towards containers and databases towards multi-tenant databases. Furthermore, the article discussed several solutions for supporting multi-tenant analytics.

References

[1] Milinda Pathirage, Srinath Perera, Sanjiva Weerawarana, Indika Kumara, A Multi-tenant Architecture for Business Process Execution, 9th International Conference on Web Services (ICWS), 2011
[2] Afkham Azeez, Srinath Perera, Dimuthu Gamage, Ruwan Linton, Prabath Siriwardana, Dimuthu Leelaratne, Sanjiva Weerawarana, Paul Fremantle, Multi-Tenant SOA Middleware for Cloud Computing, 3rd International Conference on Cloud Computing, Florida, 2010
