In my last post, I wrote an overview of running Db2 in containers. Today, I’d like to talk a bit about where we get Db2 containers and several common ways they are used. Catch my next post about a critical mistake I think IBM is making with containerization.
Ways Db2 Containers are Commonly Used
Replacing Database Servers in Production and Non-Production Environments
Where I work, we run most non-production Db2 environments in containers, and our production and load test environments in EC2 instances (VMs). We spent some time agonizing over this demarcation. It is important to consider your platform and your requirements when deciding what to run in containers and what not to. For me, it's a no-brainer for development environments: containers make it easy for our application support team to include Db2 in their Helm charts and spin up a new environment whenever they need one, with little to no DBA involvement.
On the production end, consider the pros and cons I ran through in my overview of running Db2 in containers. It is actually more complicated to have some environments in containers and some not, because that means maintaining two different build processes and keeping them in sync. Things like patching also require a different process, so if your production environment is not in containers, I highly recommend having at least one non-production environment that is.
Sandboxes for Developers
While it seems like an easy stretch from running anything in containers to offering a container that developers can run locally, there are a few gotchas. The first is that if the custom image for your other containerized environments includes a license, you'll need to handle licensing appropriately on developer machines so you don't violate the licensing agreement. The second is getting an actual copy of the data: the containers you use to spin up a new server environment often rely on persistent storage, so the data is not included in the image. Where I work, we handle this with a purpose-built environment on our K8s infrastructure, where a number of processes copy and clean the data, and a weekly process packages that data into a sidecar container and makes it available to developers. The sidecar approach means developers can tweak their local environment if needed, and when they need a data refresh, they just pull the latest sidecar from our internal Docker registry. This works really well.
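A minimal sketch of what the sidecar pattern described above can look like as a Kubernetes pod spec. All image names, registry hosts, and paths here are hypothetical placeholders, not the actual setup from the post; the idea is simply that a weekly-rebuilt data image seeds the Db2 storage volume before the database container starts.

```yaml
# Hypothetical sketch: a data sidecar seeds the Db2 storage volume
# from a weekly-refreshed image before the Db2 container starts.
apiVersion: v1
kind: Pod
metadata:
  name: db2-dev-sandbox
spec:
  volumes:
    - name: db2-data
      emptyDir: {}          # scratch volume; data comes from the image
  initContainers:
    # "registry.internal/db2-dev-data:latest" is a placeholder for the
    # cleaned, weekly-refreshed data image in an internal registry.
    - name: seed-data
      image: registry.internal/db2-dev-data:latest
      command: ["cp", "-a", "/data/.", "/database/"]
      volumeMounts:
        - name: db2-data
          mountPath: /database
  containers:
    - name: db2
      image: registry.internal/db2-custom:11.5   # placeholder image
      volumeMounts:
        - name: db2-data
          mountPath: /database
```

Pulling a newer tag of the data image is all a developer needs to do to get a refresh, which is what makes this pattern convenient.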
Sandboxes for DBAs
The first time I used Db2 in a container, it was for this. I've nearly always had a local Linux VM with Db2 installed where I could just mess around and try stuff; I learn a lot through experimentation. Now I use a Db2 container for that most of the time, and it takes much less time to build and maintain. I generally don't care about the data in that environment, so this works perfectly for me. When I need something specific to the structure of our databases, though, I just spin up the developer container to get the right data structures to work with.
Database Maintenance
We now run ALL database maintenance from containers. No more cron, no more scripts on the "servers". Every script we use will work from a container. Secrets in Rancher (the orchestration platform we use) make this secure and easy. I can run the same containers from my local machine and just pass in the needed flags and passwords. Sometime in the last six months, I've gotten to the place where this seems super easy and fast to me. It also means we don't copy scripts all over the place, and we don't have to make sure an updated script gets out to every server: the latest script is always in the container we use.
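To make the cron-replacement idea concrete, here is a hedged sketch of scheduled maintenance as a Kubernetes CronJob with the password injected from a secret. The image name, script name, secret name, and database name are all assumptions for illustration, not the actual objects from the post.

```yaml
# Hypothetical sketch: weekly runstats job; every name below is a placeholder.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: db2-runstats
spec:
  schedule: "0 2 * * 0"          # Sundays at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: maintenance
              image: registry.internal/db2-maint:latest
              args: ["/scripts/runstats.sh", "--db", "SAMPLEDB"]
              env:
                # Password comes from an orchestrator-managed secret,
                # never baked into the image or the script.
                - name: DB2_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: db2-credentials
                      key: password
```

Because the script lives in the image, updating it means rebuilding and pushing the image once, rather than distributing a new copy to every server.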
Where to get Db2 Containers
There are several common types of Db2 containers, several places to get them, and different reasonable uses depending on the source.
Db2 Developer Community Edition Container from DockerHub
This is the official IBM container available on Docker Hub. It includes either a free Db2 Developer Community Edition license or a 90-day "try and buy" license, and you should be sure you don't tweak it in ways that violate the licensing agreement. Unfortunately, this container requires privileged access (think root for Docker/K8s) and only allows one instance to run on any given host/node, so it is not something you're going to want to use for most shared environments. It is excellent to pull down and spin up locally to test something at the Db2 level, though, and that's primarily how I use it. It also tends to offer only certain tags/versions of Db2, and I'm guessing IBM won't keep the older ones around, so be sure to cache a copy somewhere if you're using it in any role where the fix pack/version matters.
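For local experimentation, spinning up the official image looks roughly like the following. This is a sketch based on the image's published instructions; the container name, password, database name, and storage path are placeholders you would substitute, and note the `--privileged` flag that the text above warns about.

```shell
# Pull the official Db2 Community Edition image from Docker Hub
docker pull ibmcom/db2

# Run it locally; all values below are placeholders for illustration.
# --privileged is required by this image, which is one reason it is a
# poor fit for shared environments.
docker run -itd \
  --name my-local-db2 \
  --privileged=true \
  -p 50000:50000 \
  -e LICENSE=accept \
  -e DB2INST1_PASSWORD=ChangeMe2024 \
  -e DBNAME=testdb \
  -v /home/me/db2data:/database \
  ibmcom/db2
```

Once the entrypoint finishes initializing, you can `docker exec -it my-local-db2 bash` and connect as the `db2inst1` user to experiment.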
This container may include features you don't use, and may take longer to spin up as a result. I've toyed several times with using this container as a starting point and building customizations on top of it. I still hope to go that way someday, but with the custom method below, I'm able to avoid the privileged-mode requirement and the one-per-host limit.
It is also explicitly stated on the Docker Hub page that this container cannot be used for production and is not supported for such use.
Db2 Warehouse / IIAS
The IIAS product uses containers, but largely at a level you have no control over. The same or similar containers are used by Db2 Warehouse (local). This is the form factor I have the least experience with. IIAS is an appliance that replaces Netezza using Db2 MPP in containers. Db2 Warehouse is a software-defined appliance that you install on a set of dedicated servers: the initial idea, at least, was that you give Db2 Warehouse a set of servers and it manages defining containers across them. While these absolutely represent running Db2 in containers, the amount of control you have at the container level is limited. I believe the code for this is obtained through Passport Advantage if you've purchased Db2 Warehouse.
Create Your Own
Creating your own containers for Db2 is a learning experience, and one I'll share a more detailed blog entry on at a later time. At a high level, you build Dockerfiles and entrypoint scripts that create a container exactly as you need it to be. This offers more flexibility than the other approaches to containerization. One of the reasons we chose this approach is that it is the easiest way to keep our containerized and non-containerized environments as identical as possible: we reuse the scripts called by the entrypoint script in our CloudFormation templates, so the same standards are in place no matter where a database environment lives.
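The skeleton of such a build might look like the sketch below. The base image, install-archive filename, and response file are all assumptions for illustration; a real build depends on the Db2 install media you download from IBM and your own response file options.

```dockerfile
# Hypothetical sketch: filenames, base image, and paths are placeholders.
FROM registry.access.redhat.com/ubi8/ubi:latest

# The Db2 server install archive (downloaded separately from IBM) and a
# response file describing the installation options.
COPY db2_server_linuxx64.tar.gz /tmp/
COPY db2server.rsp /tmp/

# Unpack and run a silent install driven by the response file, then
# clean up the media to keep the image small.
RUN tar -xzf /tmp/db2_server_linuxx64.tar.gz -C /tmp \
 && /tmp/server/db2setup -r /tmp/db2server.rsp \
 && rm -rf /tmp/db2_server_linuxx64.tar.gz /tmp/server

# The entrypoint creates the instance and database on first start; the
# same underlying scripts can be reused by non-containerized builds.
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```

Keeping the instance/database creation logic in scripts called by the entrypoint, rather than in the Dockerfile itself, is what makes it possible to share that logic with VM-based builds.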
Db2 is not fully supported on custom containers: IBM can require you to reproduce a problem on supported hardware to prove that the issue lies with Db2. CentOS support makes building your own containers, and running a mix of containerized and non-containerized Db2, easier, since the OS can be kept similar between VMs (running Red Hat) and containers. At least once, IBM stopped supporting Db2 on CentOS, but started back up after receiving negative feedback.
db2u (OpenShift)
db2u is stunning. It is a truly cloud-native way of implementing Db2. It uses many different containers to run Db2, and I expect IBM will do some really cool things with it as time goes on. I am dying to get my hands on it. I am convinced that if I could use db2u, I could persuade the clients I work with to run some workloads on Db2 that are currently going to MySQL or SQL Server instead.
The biggest problem is that db2u ONLY works on OpenShift. We don’t use OpenShift where I work, and I’m pretty sure we never will. That is a decision made by our platform team, at a level where I will never be able to change it. From what I’ve heard, it wouldn’t be hard technically for IBM to offer db2u outside of OpenShift, but it is a business decision at the highest levels to not do so.
Summary
Consider carefully your use cases and licensing before deciding what containerization strategy and source works best for you.
Originally published on DataGeek Blog