Database-as-a-Service (DBaaS) is a PaaS component that dramatically simplifies the provisioning, configuration and management of databases in an enterprise. DBaaS allows IT organizations to provide databases with an easy-to-consume, self-service user experience. At the same time, it ensures that these databases are operated in a manner that maintains data security and privacy. End users benefit from improved agility, application reliability and performance.
This is no mean feat given the diversity of databases available today. They range from single instances to replicated and clustered databases, and span a variety of relational and NoSQL technologies, each with its own unique attributes and quirks. A DBaaS provides a common set of abstractions through which to consume all of these different databases, and this article describes the architecture(s) that make this possible.
A Simple Block Diagram of a DBaaS
A DBaaS provides users with a well-defined API through which clients can interact with the service.
In response to requests from users, the DBaaS manages resources provisioned from an Infrastructure-as-a-Service (IaaS) abstraction. The actual resources reside on some physical hardware which is managed by the IaaS.
The interactions between the DBaaS and the IaaS are managed by an Orchestration Engine.
A DBaaS allows operators to encode best practices and corporate standards for database operation into a set of policies enforced by a Policy Manager. These are specific database configurations that restrict the extent to which an end user can customize the operating parameters.
A Configuration Manager controls these and allows the end user to manage the configuration of individual databases, while also allowing the operator to enforce changes across multiple databases.
While the system is operating, databases can generate a variety of planned and unplanned events and an Event Manager handles these in conjunction with the policies in effect. Where operators have a requirement for reporting of any kind (this could be periodic reporting, billing or any other summarization of system activity), the Event and Reporting functions of the DBaaS satisfy them.
Let us now look at the key value proposition of a DBaaS and see how each of these components helps deliver it.
The API Service – Common Abstraction
An API service that provides a single common API for all database technologies dramatically simplifies the life of a developer or DevOps user. Whether the request is to create a MySQL version 5.6 or version 5.7 database or, for that matter, a MongoDB version 3.4 instance, the API call is the same: a single REST action with a data payload specifying the database type and version.
Similarly, generating a backup of a database could be exposed as an API call with a data payload specifying the instance and the type of backup.
It would be entirely the responsibility of the DBaaS to do all the right things to provision the database(s) requested, or to generate the backup requested, so that the user need not be concerned with the details.
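To make the common abstraction concrete, here is a minimal sketch of how a client might build such requests. The endpoint paths (/v1/instances, /v1/backups), the payload field names and the build_request helper are all hypothetical, chosen only for illustration; they are not taken from any particular DBaaS product.

```python
import json

def build_request(method, path, payload):
    """Describe a REST call as a (method, path, JSON body) triple."""
    return method, path, json.dumps(payload, sort_keys=True)

def create_instance(db_type, version):
    # The call shape is identical for every database technology;
    # only the payload differs. Path and field names are assumptions.
    return build_request("POST", "/v1/instances",
                         {"type": db_type, "version": version})

def create_backup(instance_id, backup_type):
    # Hypothetical backup endpoint: the payload names the instance
    # and the kind of backup requested.
    return build_request("POST", "/v1/backups",
                         {"instance": instance_id, "backup_type": backup_type})

mysql_req = create_instance("mysql", "5.7")
mongo_req = create_instance("mongodb", "3.4")
backup_req = create_backup("db-1", "incremental")
```

The two create requests share the same method and path, which is the point of the abstraction: the client never needs database-specific provisioning logic.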
A Policy Manager – Enforcing Standards and Best Practices
The Operator specifies policies for new instances and controls the actual provisioning.
Similarly, the mechanism for backup can be specified by the user. Some databases support many different kinds of backup. In this case, the operator can specify how backups are taken (prescriptive) and what kind they would like, while also specifying a default. An operator could, for example, enable incremental backups for some databases and snapshots for others, and these policies would be enforced by the Policy Manager.
The operator may also specify other kinds of policies which control the resources used by the DBaaS. For example, an operator could restrict a developer to only inexpensive instances using traditional disk storage while allowing production users to provision larger, more expensive instances on high-performance storage (SSDs).
The diagram illustrates this. This set of policies is not used only at initial provisioning time; the DBaaS remembers this set of policy choices. If, at a later time, some operation were to be performed on the database by either the operator or the user, the Policy Manager would ensure that the policy is preserved.
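The developer and production example can be sketched as a simple policy check. This is an illustration under stated assumptions, not the design of any real Policy Manager: the role names, the flavor and storage labels, and the Policy and check_request names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    allowed_flavors: frozenset
    allowed_storage: frozenset

# Hypothetical operator-defined policies, keyed by user role.
POLICIES = {
    "developer": Policy(frozenset({"small"}), frozenset({"hdd"})),
    "production": Policy(frozenset({"small", "large"}),
                         frozenset({"hdd", "ssd"})),
}

def check_request(role, flavor, storage):
    """Return a list of policy violations; an empty list means allowed."""
    policy = POLICIES[role]
    violations = []
    if flavor not in policy.allowed_flavors:
        violations.append(f"flavor '{flavor}' not allowed for {role}")
    if storage not in policy.allowed_storage:
        violations.append(f"storage '{storage}' not allowed for {role}")
    return violations

dev_result = check_request("developer", "large", "ssd")
prod_result = check_request("production", "large", "ssd")
```

Because the policy is stored rather than applied once, the same check_request call could be reused for later operations such as a resize, which is how the policy stays in force after initial provisioning.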
A Configuration Manager – Implements Policies, Simplifies Management
While the Policy Manager defines a broad set of policies for operation, specific aspects of these translate into the actual configuration of the database instances. For example, different databases allow the end user to configure the total amount of memory and the maximum number of connections that will be supported. MySQL defines memory usage through a number of independent configuration parameters and uses the max_connections variable to define the maximum number of connections. Another database (say MongoDB) does this differently and uses the maxConns option to define the number of connections.
The Configuration Manager allows an operator to establish a group of configuration options that will be associated with every database instance at launch. It will include a codification of the best practices that the operator wishes to enforce on all instances. The Configuration Manager allows the operator to define these in terms that are database independent, and translates them into the correct configuration files that get injected on instance creation.
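A minimal sketch of this translation step follows. The native option names come from the text above; the OPTION_NAMES table, the render_config function and the simple "name = value" output format are assumptions made for illustration, since real configuration file formats differ between databases.

```python
# Maps a database-independent setting to each database's native option name.
OPTION_NAMES = {
    "max_connections": {"mysql": "max_connections", "mongodb": "maxConns"},
}

def render_config(db_type, settings):
    """Translate generic settings into native 'name = value' lines."""
    lines = []
    for key, value in sorted(settings.items()):
        native = OPTION_NAMES[key][db_type]
        lines.append(f"{native} = {value}")
    return "\n".join(lines)

# The operator states the setting once, in database-independent terms.
generic = {"max_connections": 200}
mysql_conf = render_config("mysql", generic)
mongo_conf = render_config("mongodb", generic)
```

The operator only ever edits the generic settings; adding support for another database means adding a column to the mapping rather than rewriting the operator's best practices.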
Furthermore, while a database may allow the configuration of numerous options, the operator may restrict which ones the user can change, and may further restrict the values that the user may choose. For example, while MySQL allows between 1 and 100000 connections (with a default of 151), an operator may specify a default of 10 and allow the user to change the value only to a number between 5 and 500.
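The connection-limit example can be expressed as a small validation rule. The OptionRule class and its resolve method are hypothetical names used for this sketch; the numbers (default 10, range 5 to 500) are the ones from the example above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class OptionRule:
    default: int
    minimum: int
    maximum: int

    def resolve(self, requested: Optional[int] = None) -> int:
        """Return the value to apply, rejecting out-of-range requests."""
        if requested is None:
            return self.default  # user did not override the option
        if not self.minimum <= requested <= self.maximum:
            raise ValueError(
                f"{requested} is outside the allowed range "
                f"{self.minimum}-{self.maximum}"
            )
        return requested

# Operator's rule for max_connections, narrower than what MySQL permits.
rule = OptionRule(default=10, minimum=5, maximum=500)
```

Here rule.resolve() yields the operator's default, rule.resolve(200) accepts the user's value, and rule.resolve(1000) raises ValueError even though MySQL itself would accept 1000.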
Finally, a DBaaS may be managing thousands of database instances at any given time and the Configuration Manager allows a user to apply a specific configuration change to all his or her instances of a given type (update all MySQL 5.6 instances to a max_connections value of 200), and allows an operator to update all instances of a given type (irrespective of the user who provisioned them) to some specified configuration.
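The difference between a user-scoped and an operator-scoped bulk change comes down to one extra filter. The sketch below assumes a hypothetical in-memory Instance record and bulk_update function; a real DBaaS would also push the change to the running databases.

```python
from dataclasses import dataclass, field

@dataclass
class Instance:
    name: str
    owner: str
    db_type: str
    version: str
    config: dict = field(default_factory=dict)

def bulk_update(instances, db_type, version, changes, owner=None):
    """Apply changes to every matching instance and return their names.

    owner=None means operator scope: instances of all users are updated.
    """
    updated = []
    for inst in instances:
        if inst.db_type != db_type or inst.version != version:
            continue
        if owner is not None and inst.owner != owner:
            continue
        inst.config.update(changes)
        updated.append(inst.name)
    return updated

fleet = [
    Instance("a1", "alice", "mysql", "5.6"),
    Instance("a2", "alice", "mysql", "5.7"),
    Instance("b1", "bob", "mysql", "5.6"),
]
# User scope: only alice's MySQL 5.6 instances are touched.
user_scope = bulk_update(fleet, "mysql", "5.6",
                         {"max_connections": 200}, owner="alice")
# Operator scope: every MySQL 5.6 instance, regardless of owner.
operator_scope = bulk_update(fleet, "mysql", "5.6", {"max_connections": 300})
```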
Events and Reporting – Gaining Visibility Into The Operation of DBaaS
In a running DBaaS system with many users and a large number of databases being provisioned and destroyed on an ongoing basis, an enormous amount of telemetry is generated. Some of this is logging and events from the databases themselves, and it is often important to the end user. Private and public cloud users of DBaaS often wish to impose billing and chargebacks on their users, and the events and reporting function generates the data stream for this purpose. In general, DBaaS does not implement billing and chargeback, as these are handled at a higher level by the operator (potentially at the level of the cloud).
Running database instances sometimes encounter errors and failures, and these notifications are fed back to the Policy Manager, which can respond to them and initiate self-healing activities. For example, if a user provisions a replicated database and specifies that the number of replicas must always be 4 (policy), a failure of a replica must be handled by automatically launching a new one and reinitiating replication. In launching that new replica, the configuration established (through the Configuration Manager) will be used.
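The replica example amounts to reconciling the observed state against the policy. The sketch below is an assumption-laden illustration: reconcile_replicas and the launch callback are hypothetical names, and launching is simulated by handing out a new replica name.

```python
import itertools

def reconcile_replicas(desired, replicas, launch):
    """Return the replica set after self-healing.

    replicas maps replica name -> True if healthy, False if failed.
    launch is a callable that starts a replacement and returns its name.
    """
    healthy = [name for name, ok in replicas.items() if ok]
    missing = max(0, desired - len(healthy))
    # Launch exactly as many replacements as the policy requires.
    return healthy + [launch() for _ in range(missing)]

# Simulated launcher: names new replicas r5, r6, ...
counter = itertools.count(5)
launch = lambda: f"r{next(counter)}"

# Policy says 4 replicas; r3 has failed.
observed = {"r1": True, "r2": True, "r3": False, "r4": True}
healed = reconcile_replicas(4, observed, launch)
```

In a fuller implementation the launch callback is where the stored configuration would be applied, so the replacement replica comes up with the same settings as its peers.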
The Orchestration Engine – Talking to the IaaS
The actual communication of the intent of the Policy Manager to the IaaS is handled by the Orchestration Engine. It receives requests and executes them in an asynchronous manner. While the rest of the DBaaS may be happy to operate in synchronous mode, things that involve physical hardware, or the provisioning of virtual machines, virtual disks and networks, can often take time. An Orchestration Engine therefore exposes an asynchronous set of interfaces that can be consumed by the Policy and Configuration Managers, and implements the polling and error recovery required to handle the vagaries of the actual hardware.
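The asynchronous request-and-poll pattern can be sketched with Python's asyncio. The FakeIaaS class is a stand-in invented for this example (it reports a resource as ACTIVE on the third status poll); the status names, method names and timing values are assumptions rather than any real IaaS API.

```python
import asyncio

class FakeIaaS:
    """Stand-in for an IaaS API; a server becomes ACTIVE after a few polls."""

    def __init__(self, polls_until_active=3):
        self._polls = polls_until_active
        self._remaining = {}
        self._next_id = 0

    def create_server(self):
        self._next_id += 1
        rid = f"server-{self._next_id}"
        self._remaining[rid] = self._polls
        return rid

    def status(self, rid):
        self._remaining[rid] -= 1
        return "ACTIVE" if self._remaining[rid] <= 0 else "BUILD"

async def provision(iaas, poll_interval=0.01, max_polls=10):
    """Request a server, then poll until it is ACTIVE or give up."""
    rid = iaas.create_server()
    for _ in range(max_polls):
        if iaas.status(rid) == "ACTIVE":
            return rid
        await asyncio.sleep(poll_interval)  # yield while the IaaS works
    raise TimeoutError(f"{rid} did not become ACTIVE")

async def main():
    iaas = FakeIaaS()
    # Two provisioning requests make progress concurrently.
    return await asyncio.gather(provision(iaas), provision(iaas))

result = asyncio.run(main())
```

The callers simply await provision; the polling loop and the timeout, which is the simplest form of error recovery, stay hidden inside the orchestration layer.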
In the case of multi-cloud DBaaS solutions, the Orchestration Engine is able to speak with multiple different regions of a single cloud, or multiple different clouds and provision resources in the right location based on policy.
Extending the example of the developer and production user policy earlier, the operator may specify that developers must get virtual machines or containers while production users must get bare metal. Implementing that aspect of policy and handling the various different underlying IaaS components is the responsibility of the Orchestration Engine.
DBaaS – A Lot More Than Just Provisioning
It is a common misconception that DBaaS is merely ‘fancy provisioning’. While provisioning is certainly the first thing that one interacts with, there is a lot more that is required to simplify the usage of databases, and make them truly consumable like a utility service.
Databases are persistent and retain state, and maintaining them over their entire lifecycle is a complex and unforgiving activity. While one could write some custom automation around provisioning, the benefit of a DBaaS solution lies in the fact that the abstractions apply to many different databases, in a myriad of configurations, and throughout the lifecycle of these databases.
Users of DBaaS solutions derive many benefits, including much increased developer agility, DBA productivity, application reliability, security and performance. As you will no doubt appreciate from the above description, delivering all of these requires a complex architecture such as the one outlined here, and the utility of a DBaaS solution is that it provides this in a simple and easy-to-consume form.