Kubernetes and containers get plenty of attention in the IT world today, but managing the stateful applications and data that run on top of cloud native platforms is just as important, especially for operations teams. That work includes managing the data from legacy stateful applications as organizations shift to highly distributed containerized environments.

In this edition of the InApps Makers podcast, Alex Williams, founder and publisher of InApps, discusses the concepts of big data, storage and stateful applications on Kubernetes. Guests Tom Phelan, fellow, big data and storage organization, Hewlett Packard Enterprise (HPE), and Joel Baxter, distinguished engineer, HPE, draw on their deep experience managing stateful applications and data in containerized environments. They also discuss KubeDirector, an open source platform for running non-cloud native stateful applications on Kubernetes.


The Evolution of Stateful Applications on Kubernetes

Also available on Apple Podcasts, Google Podcasts, Overcast, PlayerFM, Pocket Casts, Spotify, Stitcher, TuneIn

One of the original problems Baxter sought to solve was data access in multicluster environments. He explained that, at the time, a typical usage model might have involved pre-ingesting data and then copying it to a cluster environment for processing.

“There are a lot of downsides with that process since you’re adding requirements for more storage and you’re adding time to the whole process. But one nice thing about it is once you have your data in the cluster, you could just let whoever had access to that cluster have access to that data,” said Baxter. “It’s kind of a separate stage where you could have some administrative kind of person say, ‘okay, it’s fine for you to access this data now. So, here’s your copy.’”


The other alternative involves moving the data to a common repository. “You have people access it on demand and now, all of a sudden, you’ve explored all these authentication and authorization issues, such as who’s supposed to have access to which pieces of data and how do you manage that on the fly at runtime,” said Baxter. “So, that was definitely one of the things that stumped us for a while.”

The team eventually found a solution, which consisted of matching the Kerberos identities between the compute clusters. “That was part of the value that could bring in customers,” said Baxter.

KubeDirector is meant to help organizations port large numbers of legacy applications to a containerized environment, filling the gaps that traditional operators leave when making the shift to cloud native.

“There are plenty of operators in the open source community with which you can download and run your apps, but when we’re talking about enterprises with a collection of hundreds — if not thousands — of legacy applications, they want to run on containers. They don’t have the expertise for each of these applications,” said Phelan. “So they are not typically going to be able to write an operator, which is a piece of Go code, that has intimate knowledge of the application and a very good understanding of the Kubernetes API.”

KubeDirector takes a different approach: each application is described in a YAML file, and KubeDirector supplies the operator functionality for it. “This means you can have hundreds of these YAML files and, with a single operator and a single instance of the KubeDirector, manage and control all the different types of legacy applications,” said Phelan. “It’s a really powerful tool that makes it very easy to add support for new applications to Kubernetes.”
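
The idea Phelan describes, one generic operator driven by many declarative application definitions, can be illustrated with a minimal Python sketch. This is not KubeDirector's code or its actual YAML schema: the definitions below stand in for parsed YAML files, and every name in it (APP_DEFINITIONS, PlannedMember, plan_cluster, the image names and field names) is a hypothetical chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for parsed YAML application definitions.
# The field names are illustrative and are NOT KubeDirector's real schema.
APP_DEFINITIONS = {
    "legacy-db": {
        "roles": {
            "primary": {
                "image": "example/legacy-db:1.0",
                "persist_dirs": ["/var/lib/db"],
            },
        },
    },
    "batch-analytics": {
        "roles": {
            "controller": {
                "image": "example/analytics:2.3",
                "persist_dirs": ["/etc/analytics"],
            },
            "worker": {"image": "example/analytics:2.3"},
        },
    },
}


@dataclass(frozen=True)
class PlannedMember:
    """One container the generic operator would create for an application."""
    app: str
    role: str
    index: int
    image: str
    persist_dirs: tuple


def plan_cluster(app_name, replicas, definitions=APP_DEFINITIONS):
    """Turn any registered definition into a list of members to create.

    The same code path serves every application; only the data differs.
    Roles missing from `replicas` default to a single member.
    """
    if app_name not in definitions:
        raise KeyError(f"no definition registered for application {app_name!r}")
    plan = []
    for role, spec in definitions[app_name]["roles"].items():
        for index in range(replicas.get(role, 1)):
            plan.append(
                PlannedMember(
                    app=app_name,
                    role=role,
                    index=index,
                    image=spec["image"],
                    persist_dirs=tuple(spec.get("persist_dirs", ())),
                )
            )
    return plan


if __name__ == "__main__":
    for member in plan_cluster("batch-analytics", {"worker": 3}):
        print(member)
```

In this sketch, supporting a new application means adding another entry to the definitions rather than writing new operator code, which is the property Phelan highlights.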