T-SQL Tuesday #85 - Backup and restore for multi-tenant databases (in the cloud)
Backups are one of the most common things DBAs discuss, and they are at once one of the simplest and most complicated parts of our whole job. So let’s hear it for backup and recovery!
Kenneth also asked for blog posts with #EntryLevel content, and I took up the challenge to write some entry level information. This post will then also be my tenth post in a series of twelve in Tim Fords Entry-Level Content Challenge.
I’m going to blog about backup and restore for multi-tenant databases in the cloud. Specifically, I will blog about data architectures described by Chong, Carraro, and Wolter (see link to their article on Multi-Tenant Data Architecture below).
Imagine that you need to design the database backend for an application used by thousands or millions of customers (which we from the application and database also call tenants) worldwide.
A simple first design could be One tenant, one database, where the data for each tenant is saved in its own database:
This model is shared-nothing seen from the database side. With thousands of tenants, the administrative cost of managing these will be high. On the other hand, Backup-restore of a single tenant is easy – we can use standard methods supported directly by SQL Server/Azure SQL database.
In the middle of the isolated-shared spectrum
resides the Shared database, separate schema design. Here data for multiple tenants is stored in the same database, but resides in tenant-specific tables:
Backup-restore is not directly supported by standard methods in SQL Server/Azure SQL database. One possible way to backup a tenant could be to have a script, which could bcp data to text files. Restore could similarly be a script, which could bcp from txt files to tables in the destination database. Both scripts could be auto-generated from tenant metadata. If the schema for a tenant has 100 tables, the number of tables in a database in this model grows quickly, and the administrative cost of maintaining scripts and tenant metadata could be high. As a side note, no query execution plans can be reused across tenants, since table names are different.
In the other extreme, we have the Shared Database, Shared Schema design. Here, data for multiple tenants is stored in the same database and in the same schema:
We don’t have to store all tenants in the same database, but one tenant will have all data in the same database. The attribute TenantID shown in the illustration above is also called as a partition key. In this model, we can reuse execution plans across tenants, and we can put as many tenants into a database as we have CPU resources to serve (now, a bottleneck for our database will probably be the number of concurrent connections the infrastructure can support). For databases in the cloud, we do have standard methods for backing up and restoring tenants: Meet the open source client library Elastic Database. With this library, we can split one shared tenant databases into two shared tenant databases (in case the load on one single tenant database gets to high), move a single tenant into its own database (of which we can take a backup the normal way), or move a single tenant from its own database into a shared tenant database.
That’s it for this months T-SQL Tuesday. Multi-tenant database design is a big topic. I hope this post have encouraged you to want to learn more. Go to the links beneath to explore this exciting topic. Thanks for reading along…
Read more about T-SQL Tuesday #085 here:
Read Frederick Chong, Gianpaolo Carraro, and Roger Wolter’s article on Multi-Tenant Data Architecture
Read more about the bcp utility here:
Read more about Elastic Database client library here:
Or watch the microsoft virtual academy course on Elastic Database
Read more about Elastic Database split-merge: