IT Controls: How Far to Take the "Version Everything” Mantra
If you work (or are otherwise interested) in the world of IT, chances are you've heard the mantra “version everything". For the last ten years or so the industry has been taken over by talk of Continuous Delivery and DevOps.
Both these IT philosophies are predicated on continuously working your way through a number of recurring tasks under slightly different conditions – such as the build and deployment of applications and packages, while resetting environments and databases as needed. As such, it's hugely important to automate all eligable processes (using either canned scripts or purpose-specific tools) as well as to catalogue and validate all changes throughout the process in a version control system.
Run a Tight Ship: Plug the Versioning Leaks
Unfortunately, not everybody realizes that database code is included in “everything”. That’s why database enforced change management and source code version control solutions complement each other so well.
There are many important differences between working with normal files or scripts and working on a database. As an example, the database is a centralized and shared resource that, by definition, must be homogenized for the entire team. There's also the small matter of database objects having their own syntax for creation and modification.
Either way, the point is that these differences only make it all the more important that database be handled with appropriate IT controls. It's somewhat ironic then that the database is on the short list of IT domains most likely to neglect versioning.
Because of the need for universal database synchronicity, there are several approaches to managing the DB work process. For example, one method is to have each developer work in a sandbox environment, which is similar to having a local directory for the files. Another method is to share database schema(s) between the teams.
Each of these methods has its comparative benefits and drawbacks. Regardless of your preferred approach, it's essential that it follow these version control best practices:
- Every change is versioned. No one can modify an object within the database without documenting the change.
- Synchronization between the team is needed in order to prevent code overrides.
- The merge process of parallel development from different environments must be easy.
- The team must have the ability to collect all changes and wrap them into a deploy unit.
Let’s explore each of these requirements in greater detail:
Every Change is Versioned
Having all changes versioned means that no one can modify any object in the database, not even connections using the command line, without having the change documented in the version control repository.
When you use Perforce, for example, to manage your files, this feature is automatic. It's unlikely that someone will change something directly in the Perforce repository. All changes are submitted to Perforce, otherwise they do not exist.
In order to achieve this, you must have a database enforced change management method that facilitates digital agility while at the same time maintaining prudent IT controls to prevent out-of-process changes.
Synchronization on a Central Resource
Working on central resource requires a method for synchronizing the work. For example, you might have a calendar for a meeting room in order to synchronize the appointments to be held there. Database objects are no different. Without proper synchronization, conflicts can happen if someone doesn't following protocol.
Some synchronization methods provide an indication that an object is being worked on (i.e. a developer will Check-Out the object), much like the “On-Live” light in a TV or radio broadcasting room. The status of the object can be monitored through a version control application that in addition to monitoring and presenting the current status can also save the definition of the objects after the change.
But having an indication in the version control UI is not a silver bullet in terms of IT controls, as another developer can still log in to the database using the database IDE, and start working directly on the object without seeing the in-work indication (Check-Out). Preventing this problem requires integration with the database engine.
Easy Merge Process of Parallel Development
When developers have a sandbox environment, they're required to promote changes into the integration or team environment, and update the sandbox with the latest changes. This process is very easy when a solution such as Perforce is used, because conflict notification and merges happen when changes are submitted. At this point, Perforce checks to see if there is a conflict with the file, and aids in the process of merging the text and updating the local file.
With database development, developers cannot work disconnected from the database without compiling the change in the database and checking for syntax errors. The question therefore is: what script will be stored in the version control repository? Will it be the created object or the accumulated alter statements?
If it's the created statement, then the submit change process is responsible for updating the database environment (which is also the developer sandbox environment).
If it's the accumulated alter statements, then a developer must execute the alter statement in the sandbox and add it to the script.
A conflict can happen when two developers add their own alter statements to the table. No one can execute the script without error, as the script has changes from both developers and each developer needs to execute only the changes made by the other one
In order to streamline the merge process during parallel development, therefore, a three-way analysis comparing the live database schema and a baseline is required.
Collecting all Changes and Wrapping Them into a Deploy Unit
It doesn’t matter if you’re working in Waterfall or Agile, you must have an easy process for collecting all changes from one single source (the source control repository) and wrapping them in to a deployable package.
For native code (such as Java, .NET), this is usually known as the compile or build process. This process normally executes on a special server called build server, which takes the latest revision of the code and initiates the build process.
The same process should exist for database code. Of course, the process for the database is a little bit different, as it must consider the existing schema definition (a live database which has the previous release) and the target schema definition (the definition in the repository).
Note that this is not just a matter of running a copy and replace process as it is with source code version control. Instead, it is a collection of SQL statements that modify existing database schema(s) to create a new structure while maintaining the integrity of the underlying data.
The Makings of Improved IT Controls
Any system designed to support rapid and continuous database deployment, especially within a continuous delivery and DevOps framework, should be easy-to-use, incorporate best practices for IT controls, and be fully automated in order to keep the lifecycle moving smoothly and prevent developers from pulling through the wrong changes.
Indeed, to follow through on the mantra to “version everything”, database code changes should be included. For that, you'd be well-served to make use of a smart enforced change management solution for the database.