Note: the statements in this entry are my opinions. I quote IBM once in this article, and that statement should be seen as the only pure fact here. I invite IBM and others to add their perspective in the comments below.
I see IBM making a mistake with how they’re handling Db2 in containers. I’ve pushed behind the scenes to try to get them to correct this, but my pushing is not having an effect at this point.
The Beauty of db2u
I first learned about db2u in the fall of 2019. I don’t know if it was publicly available yet at that point, but I was at an IBM Gold Consultant summit at the IBM Toronto lab. One part of that meeting was a “science fair” of different aspects developers were working on for Db2. I learned things I can’t share, things I couldn’t share at the time, but can share now, and some things about products that already existed and were public at the time. There were maybe 10 or 15 different tables in a room, all talking about different things. It really was one of my favorite parts of the summit. I remember about three of them really well now, 1.5 years later, and the clearest in my mind is db2u.
I fully believe db2u could be a huge strategic advantage for IBM. In a world where the traditional enterprise RDBMS vendors are struggling to remain relevant and part of the discussion for many new projects, db2u is a stand-out. db2u, conceptually, is a nearly cloud-native way of doing a relational database management system. While “cloud-native” is a buzzword, it’s one that many organizations are demanding these days. Let’s break it down a little.
Where I work, we score each database platform on several aspects to determine how “cloud-native” it is. We pull this process from the same process used to evaluate potential applications. We also throw in a few of our own evaluations at the end related to the platform we use. Let me go over these briefly.
What is presented to the end-user as a single application is actually delivered as a set of co-operating services. For databases, the “end-user” is often actually the application. Part of the idea here is also that the different components can be independently scaled based on requirements and load. Traditional Db2 doesn’t do well on this item, as there’s a single large install and any components are highly interdependent instead of somewhat independent.
Loosely Coupled Microservices
Microservices composing the solutions should be independently deployable and replaceable. Services should communicate with other services at runtime, dynamically. Again, Traditional Db2 doesn’t do well on this item.
Scale up or down independently in an automated fashion. Db2 would allow online scale-up of storage, cores, and to a limited extent memory, but the full meaning here is often adding or removing nodes. Perhaps PureScale would meet this requirement (though if I recall correctly, scale down may still require an outage), but I can’t actually run PureScale on most clouds.
Responsive to Business Changes
Can be updated and deployed frequently and independently with zero downtime. While Traditional Db2 with HADR can allow patching with minimal downtime (just the time for a failover), I still cannot update a version level without a significant outage. This is even true for PureScale as I understand it. Also, Db2 patches themselves are often still released quite far apart.
Run reliably, securely, and predictably in spite of transient issues in the cloud including network, capacity, and varying loads. Traditional Db2 does get points here. I have been repeatedly shocked over the years by the type of failures Db2 using HADR can survive and remain up. However, until November 2020, there was no supported solution for automating failover in the cloud. TSAMP largely did not work for this. I haven’t had the opportunity to upgrade the OS and Db2 since November to try out the new Pacemaker/Corosync solution for this.
Uniform and discoverable API – designed to be a part of other applications. I would argue that SQL gives any RDBMS some points here. No one runs a database for long without applications that are often designed to go with it. Nearly every language has interfaces for working with Db2, and while we can argue about the quality of particular drivers, Db2 has working drivers for the most popular programming languages.
Free to move as required – not connected to infrastructure constraints. This one drives me nuts because I would argue that very little in infrastructure or applications actually meets it. Nearly everything has a limited subset of platforms on which it works; the question is just how limited that list is.
Built on Open Standards
Extensively leverages open source components and community support. There is an interesting dichotomy here between the old-school requirement for vendor support and the ability to just fix things yourself that open source offers. I’ve seen struggles with other vendor-supplied software where technicians I work with knew exactly how to fix a specific problem they were encountering with enterprise software. Yet even after they provided the vendor with details and code snippets for the parts of the code they could see, it took months or years for the fixes to work their way into the software.
Db2 kind of checks this box because it’s available on the IBM cloud. But if you’re not willing to use the IBM cloud, you’re out of luck.
Available as-a-service on AWS
At one time, IBM announced they were offering Db2 as a service in AWS. However, there was no “click to buy”. When I reached out and tried to get a POC for an OLTP application, they were wholly unable to meet the very reasonable SLAs we were looking for.
Can run on AWS EC2
This is our last-ditch option at this point. If we can’t consume it as a service or run it on EC2, we won’t be using it. Db2 can at least run on AWS EC2, though until November 2020 there was no officially supported solution that worked for automating failover.
As I understand it, db2u is the cloud-native answer to the gaps above. My understanding is that it is composed of multiple containers for multiple purposes, which can be independently managed, allowing it to be more cloud-native than other enterprise RDBMSes out there. The release cycle is also shorter than that of traditional Db2, allowing for frequent fixes and updates.
In addition to this, there’s a nifty new tool IBM is offering called Click-to-Containerize. While I haven’t had a chance to use it yet, it really does sound like an excellent way to move data into a container.
db2u sounds amazing to me. All the stability and proven performance of Db2, but embracing new methodologies and taking advantage of all they have to offer. The problem I run into is that db2u is only available on Red Hat OpenShift. I don’t have any experience with OpenShift to say whether OpenShift is good or bad. I can say that I know some brilliant engineers who evaluated it and chose Rancher instead for our use case. There are some tremendous advantages to having your database containers right next to your application containers. Having a single namespace that everything is in reduces communication overhead and makes spinning up new environments that consist of both application containers and db containers very easy. Even large organizations are likely to select an orchestrator for Kubernetes and stick with it. Developing significant skills in more than one Kubernetes orchestrator is not very likely.
What this means for me is that I don’t get to use db2u, while at the same time needing to containerize all that I can. Consequently, I’m pressured not to use Db2 and to choose database management platforms that score higher on our cloud-native index whenever possible.
I see requiring OpenShift as a big mistake that IBM is making for Db2, and one that they’ve made before. When the cloud was a new thing, IBM refused to offer Db2 on non-IBM clouds, outside of a bring-your-own-license model. IBM has chosen not to offer Db2 on AWS RDS despite repeated pressure from influencers and customers, and this has severely crippled developers’ use of Db2 and exposure to it. IBM has chosen not to offer a reasonable managed Db2 option on other clouds, which is a direction companies like MongoDB have chosen to go.
They’re repeating this mistake with OpenShift. IBM supports Db2 on fewer operating systems than when I started as a DBA. In some respects the operating system has become a slightly less important layer when compared with the virtualization or containerization platform. This leads me to dismiss the excuse that IBM wants to support fewer platforms, rather than more, to reduce the complexity and duration of testing and therefore build more, faster and better. Yes, less complexity is nice, but it also reduces potential market share.
Running Production Db2 in Containers
To compound this mistake, the only place where Db2 in containers is fully supported is on OpenShift. Let me say that again. If you run Db2 in containers and it’s not on OpenShift, IBM will generally not support your Db2 implementation. This is considered an unsupported platform. I suspected this was the case, so I reached out to some IBM contacts and the statement on this that I got was:
You are correct that IBM position is that Db2 in containers is fully supported and certified on RedHat OpenShift.
IBM Db2 Support will accept Support cases for Db2 running in custom containers or open source Kubernetes distributions only if error can be reproduced on Db2 running on Redhat OpenShift, VM or baremetal server running on supported OS and Virtualization infrastructure
So not only do I not get the cool stuff associated with db2u, but without OpenShift, there is no supported way to run Db2 in containers in production. I can’t even build my own container and run Db2 in it for production under the licensing terms. Technically this means that Docker and/or Kubernetes alone are unsupported platforms. I don’t see how IBM can only support Db2 on one type of Kubernetes implementation and claim that means they support Kubernetes.
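Read literally, the support statement quoted above reduces to a simple rule. Here is a minimal sketch of my reading of it in Python. The function name and the platform labels are hypothetical shorthand I made up for illustration, not IBM terminology, and the sketch assumes the VM or bare metal server is on a supported OS and virtualization infrastructure.

```python
# Hypothetical labels for where Db2 runs. These are my own shorthand for the
# categories in the quoted statement, not official IBM terms.
FULLY_SUPPORTED = {"openshift", "vm", "bare_metal"}  # assumes a supported OS
CONDITIONAL = {"custom_container", "open_source_kubernetes"}


def support_case_accepted(platform: str, reproducible_on_supported: bool = False) -> bool:
    """Return True if, by my reading of the statement, a support case is accepted."""
    if platform in FULLY_SUPPORTED:
        return True
    if platform in CONDITIONAL:
        # Accepted only if the error can be reproduced on OpenShift,
        # a VM, or a bare metal server.
        return reproducible_on_supported
    raise ValueError(f"unknown platform label: {platform!r}")


print(support_case_accepted("openshift"))
print(support_case_accepted("custom_container"))
print(support_case_accepted("open_source_kubernetes", reproducible_on_supported=True))
```

The middle case is the one that hurts: a problem that only shows up in my own container, and that I cannot reproduce elsewhere, gets no support case at all.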
Don’t get me wrong, I get the dilemma. I’ve discussed with Db2 developers the absolutely massive set of tests that IBM has to run for the vast array of OS and virtualization platforms that they do support for Db2. I get that they don’t want to pay to train their support staff to troubleshoot Db2 on “unsupported platforms” like building your own Docker container. I have seen their lack of knowledge in a containerized world. The question is, if support isn’t there for where people want to run Db2, why not just go with another RDBMS? That’s the direction I see small and mid-sized companies going: away from enterprise RDBMSes and towards the RDBMSes that will support where they want to be.
If this affects you and/or you agree, please go vote for the AHA Idea to show IBM it’s not just a couple of us.
Edited after publication to add link to the RFE/AHA Idea.
Originally published on DataGeek