Scotland, 2019. C. Brady. Used by permission.

Isolated Multitenant SaaS

Kevin Kautz

--

As Software-as-a-Service (SaaS) grows in each industry, in a world where everything is cloud-hosted, there are some intriguing variations on the themes of multitenant software with data isolation.

Fully Isolated Software-Only

Take the time to read the Medium post from Palantir, where they state clearly that they are not a data company but a software company. This is from “Palantir is not a data company” (Palantir Explained, #1):

“We build digital infrastructure for data-driven operations and decision making. Our products serve as the connective tissue between an organisation’s data, its analytics capabilities, and operational execution.”

The company publishes and continually upgrades software in each client’s isolated instance. Their CI/CD pipeline delivers to many client production environments, so data isolation is a given. Each client has its own data stores, and no data from Palantir is shared with multiple clients.
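The separation between a shared software release and per-client data can be sketched in a few lines. This is only an illustration of the idea, not a description of Palantir's actual pipeline: the `ClientInstance` class, the `roll_out` function, and the client names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClientInstance:
    """One client's isolated deployment: its own software version and data store."""
    client: str
    version: str
    data_store: dict = field(default_factory=dict)  # never shared between clients


def roll_out(instances, new_version):
    """Upgrade the software in every instance; data stores are left untouched."""
    for instance in instances:
        instance.version = new_version
    return instances


# Hypothetical clients, used only for illustration.
acme = ClientInstance("acme", "1.0", {"orders": [101, 102]})
globex = ClientInstance("globex", "1.0", {"orders": [900]})

roll_out([acme, globex], "1.1")
```

After the rollout, both instances run the same software version, while each still holds only its own data.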

Isolated Software Instances with Shared Data

A different variation can be found in a financial technology company, Experian. In this case, data from Experian is added to the client’s own data. Experian describes its SaaS product this way (https://www.experian.com.cn/en/experian-aperture-data-studio):

“Our platform’s unique underlying database enables business users to conduct profiling analysis and relationship discovery with incredible speed.”

It’s not immediately obvious, but clients of Experian who use this software also purchase credit reporting data from Experian. Their data studio is preprogrammed to make it easy to import data that you buy from Experian.

This means that in addition to the fully isolated software instances, each client also has a well-paved road to bring in data from Experian as the data source. The Experian software loads the Experian data. Data from Experian is provided to many clients, so this data, at least, is shared data.

Shared Data without SaaS

Another variation allows multitenant shared data without providing software to your clients. Snowflake, the cloud database company, offers a data exchange where the company that owns the shared data can publish that data by simply authorizing clients to link directly to the cloud-hosted database. Snowflake describes it this way (https://www.snowflake.com/workloads/data-sharing/):

“Replace traditional data sharing methods and eliminate the need for copying, transforming, and moving data. Instead, provide direct access to data that remains in place.”

In this variant, the vendor’s shared data can be refreshed continuously, and every client that has been granted access sees the latest data. If a client also stores its own data in a Snowflake cloud database, it can write SQL joins directly between its own tables and the vendor’s shared tables.
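The join across a client database and a vendor database can be illustrated with a small runnable analogy. The sketch below uses Python's built-in `sqlite3` module and SQLite's `ATTACH DATABASE` rather than Snowflake itself, so it shows only the shape of the query, not Snowflake's sharing mechanism. The table names, column names, and sample rows are assumptions made up for the example.

```python
import sqlite3

# The client's own database (the default "main" schema in SQLite).
conn = sqlite3.connect(":memory:")

# Stand-in for the vendor's shared database, attached under the name "vendor".
conn.execute("ATTACH DATABASE ':memory:' AS vendor")

# Client-owned data (hypothetical table and rows).
conn.execute("CREATE TABLE main.customers (customer_id INTEGER, postal_code TEXT)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [(1, "10001"), (2, "94105")],
)

# Vendor-owned reference data (hypothetical table and rows).
conn.execute("CREATE TABLE vendor.postal_regions (postal_code TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO vendor.postal_regions VALUES (?, ?)",
    [("10001", "Northeast"), ("94105", "West")],
)

# One query joins the client's table to the vendor's table; nothing is copied.
rows = conn.execute(
    """
    SELECT c.customer_id, r.region
    FROM main.customers AS c
    JOIN vendor.postal_regions AS r ON r.postal_code = c.postal_code
    ORDER BY c.customer_id
    """
).fetchall()

print(rows)  # [(1, 'Northeast'), (2, 'West')]
```

The point of the analogy is that the vendor's rows stay in the vendor's database, and the client reads them in place at query time.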

Hybrid of Client Isolation Instance and a Multitenancy Instance

In an article on a more traditional multitenant application architecture, Florin Coros describes how you can deploy your software to isolated instances for one or more large clients and serve many smaller clients through a single multitenant instance. The blog post is here: https://oncodedesign.com/data-isolation-and-sharing-in-multitenant-system-part1/

“For example, in a scenario with one big tenant and many other small tenants we might have one database for the big tenant and one shared for the small ones.”
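One way to picture the split between one big tenant and many small ones is a small routing function that picks a database per tenant. This is a minimal sketch under assumed names and is not taken from the Coros article: the tenant IDs, database names, the `orders` table, and the `tenant_id` column are all hypothetical.

```python
# Hypothetical mapping: large tenants that get their own database.
DEDICATED_DATABASES = {"bigcorp": "db_bigcorp"}

# Hypothetical shared database for all remaining (small) tenants.
SHARED_DATABASE = "db_shared"


def database_for(tenant_id):
    """Return the dedicated database for a big tenant, else the shared one."""
    return DEDICATED_DATABASES.get(tenant_id, SHARED_DATABASE)


def scoped_query(tenant_id):
    """Return (database, SQL, parameters) for reading a tenant's orders.

    In the shared database every row carries a tenant_id, so the query must
    filter on it. In a dedicated database all rows belong to one tenant.
    """
    database = database_for(tenant_id)
    if database == SHARED_DATABASE:
        return database, "SELECT * FROM orders WHERE tenant_id = ?", (tenant_id,)
    return database, "SELECT * FROM orders", ()


print(scoped_query("bigcorp"))
print(scoped_query("small-shop"))
```

In the shared database, isolation depends on every query carrying the tenant filter, which is why this sketch builds the filter in one place instead of leaving it to each caller.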

Industry Trends

Given the constraints of regulations surrounding privacy (GDPR, CCPA, CPRA, HIPAA), health-data exchange standards such as FHIR, and data localization (touched on in trade agreements such as NAFTA’s successor, the USMCA, and the TPP, and in the Schrems II ruling), the emerging consensus seems to be that we will need a blend of the above variations.

That is, what is emerging is a way to solve for the following:

  1. Cloud-hosted SaaS instances where compute resources are client-owned and the software is continuously deployed by the SaaS vendor.
  2. Cloud-hosted Data-as-a-Service (DaaS) instances that can be directly accessed in the cloud by multiple clients without cloning or copying data. These might be shared-access tables, as Snowflake provides, or data APIs that the software uses to reach shared data from the vendor within the client’s instance of the software.
  3. Client-owned data stores within the cloud-hosted SaaS instance of the vendor software. These can be fully compliant with privacy and data localization regulations.
  4. Vendor services for SaaS functions or DaaS content that can be purchased and enabled individually as needed using capabilities that the vendor’s SaaS instance provides to make this easy. This is not exactly vendor lock-in, but because it might be unreasonably difficult for SaaS clients to get those same services from third-party vendors, it’s very close to lock-in.

Conclusion

It’s no longer enough to distinguish SaaS and DaaS, to understand multitenancy and isolation, or to evaluate the costs and benefits of buying data and services from the same company that provides your SaaS platform. Instead, we must blend all of the above into a single deployment. That’s the future of cloud-hosted SaaS with client isolation for software and data.

--

Kevin Kautz

Professional focus on data engineering, data architecture and data governance wherever data is valued.