Kumar Goswami
Kumar Goswami is the CEO of Komprise. He has spent 23+ years delivering products that solve complex IT problems with simplicity and cost-efficiency.

Explosive data growth, niche storage technologies and a plethora of cloud services have created a perfect storm of chaos in enterprise IT. Data management must evolve in step.

The preeminence of the data-driven business has been gaining steam for years; CIOs without a practical plan to get there won’t survive. Yet evolving into a data-driven organization requires a multifaceted strategy, from technology decisions and hiring to setting organizational priorities. A 2021 NewVantage Partners survey of chief data and analytics officers found that only 24% of respondents reported having a data-driven organization, and fewer than 50% said they had achieved data-driven innovation.

Analytics and AI tools, people, skills and culture are of course necessary ingredients for data-driven operations. What may be overlooked lies deeper: how the data itself is stored and managed. Data storage cannot be a set-and-forget exercise; our world has changed too much in the past decade. As the Harvard Business Review put it in a recent article on the mandates for data-driven companies: “Data flows like a river through any organization. It must be managed from capture and production through its consumption and utilization at many points along the way.”

Without a thoughtful plan and process for the continual management of data from a cost, performance and business-need perspective, CIOs face impending disaster:

  • Data is growing at a 24% compound annual growth rate, according to IDC, and will reach 175 zettabytes (ZB) by 2025 (a quick projection follows this list). How will organizations manage it all without breaking the bank and the backs of their IT staff?
  • Employees need the ability to search and quickly access an organization’s data to drive competitive advantage. Due to shadow IT and cloud sprawl, IT leaders often don’t know the full scope of data they have nor where and how it’s stored.
  • CXOs want to see better returns from IT investments — and more guidance on how to turn these mountains of largely unstructured data into revenue streams.
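To put that growth rate in perspective, here is a minimal sketch in Python. The 24% CAGR and the 175 ZB figure for 2025 come from the IDC numbers above; the 2030 projection horizon is an illustrative assumption.

```python
# Compound global data volume forward at a 24% CAGR, anchored to
# IDC's ~175 ZB estimate for 2025. The 2030 horizon is an assumption
# chosen purely for illustration.
def project_volume(base_zb: float, base_year: int, target_year: int,
                   cagr: float = 0.24) -> float:
    """Project data volume from a base year to a target year."""
    return base_zb * (1 + cagr) ** (target_year - base_year)

for year in range(2025, 2031):
    print(f"{year}: {project_volume(175, 2025, year):,.0f} ZB")
# At this rate, the global datasphere roughly triples by 2030 (~513 ZB).
```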

These pressures demand a fresh approach to enterprise data management.

How We Got Here

Twenty years ago, all but the very largest companies had just one or two file servers in the data center, and the data being stored was largely transactional. Data sprawl was not a concern. Yet with the rise of the smartphone, online business models, IoT and cloud apps, a data deluge ensued. At the same time, compute power became cheap enough, thanks in part to the cloud, that companies could analyze massive volumes of data like never before. Data volumes have since grown exponentially as technology has become pervasive in every aspect of work and home life, from smart home devices and industrial automation to medical devices and edge computing.

A host of new storage technologies has emerged as a result, including software-defined storage (SDS), high-performance all-flash NAS arrays, hyperconverged infrastructure (HCI) and many flavors of cloud storage. But storage innovation has not solved the problem, and in some cases has made it worse by multiplying silos. Keeping pace with the growth and diversity of data, which these days is primarily unstructured, has become unfeasible.

The High Price of One-Size-Fits-All Data Management

Despite the data explosion, IT organizations haven’t necessarily changed their storage strategies. They keep buying expensive storage devices because critical or “hot” data demands uncompromising performance. The reality is that not all data is diamonds: some of it is emeralds, and some of it is glass. By treating all data the same way, companies create needless cost and complexity.

For example, let’s look at backups. The purpose of regular backups is to protect hot or critical data, to which departments need reliable, regular access for everyday operations. Yet as hot data continues to grow, the backup process becomes sluggish. So you purchase expensive, top-of-the-line backup solutions to speed things up, but you still need ever more storage for all these copies of your data. The ratio of unique data (created and captured) to replicated data (copied and consumed) is roughly 1:9, and IDC expects it to reach 1:10 by 2024. Most organizations are backing up and replicating data that is in fact rarely accessed and better suited to low-cost archives such as the cloud.
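To make the copy math concrete, here is a back-of-the-envelope sketch. The 1:9 ratio comes from the IDC figure above; the petabyte sizes and the share of cold data are illustrative assumptions.

```python
# With a 1:9 unique-to-replicated ratio, every petabyte of unique data
# drags roughly 9 PB of copies behind it. The sizes and cold-data share
# below are assumptions chosen for illustration.
unique_pb = 1.0    # unique (created and captured) data, in petabytes
copy_ratio = 9     # replicated copies per unit of unique data (IDC ~1:9)
cold_share = 0.7   # assumed fraction of unique data that is rarely accessed

total_pb = unique_pb * (1 + copy_ratio)
print(f"Backing up everything: {total_pb:.1f} PB")        # 10.0 PB

# If cold data is moved to a low-cost archive and kept as one copy
# instead of being backed up and replicated like hot data:
hot_pb = unique_pb * (1 - cold_share)
archived_pb = unique_pb * cold_share
reduced_pb = hot_pb * (1 + copy_ratio) + archived_pb
print(f"Archiving cold data first: {reduced_pb:.1f} PB")  # 3.7 PB
```

Under these assumptions, archiving cold data before it enters the backup cycle cuts the total footprint by almost two-thirds.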

Beyond backup and storage costs, organizations must also secure all of this data. A one-size-fits-all strategy means that all data is secured to the level of the most sensitive, critically important data. Large companies are spending 15% of IT budgets on security, according to a recent survey.

Building a New Model for Data

It’s time for IT execs to create a sustainable enterprise data management model appropriate for the digital age. By doing so, organizations can not only save significantly on storage and backup costs but also better leverage “hidden” and cold data for analytics. Here are the tenets of this new model:

  1. Automation. It is no longer sufficient to do an annual spring cleaning of data assets. Discovery needs to be a continual, automated process, using analytics to deliver insight into data (date, location, usage, file type) and then categorize it as hot, warm, cool or cold (a minimal sketch of such a scan follows this list). Ad hoc conversations with departmental managers are inefficient and no longer scale. Get data on your data.
  2. Segmentation. At a minimum, create two buckets of data: hot and cold. The cold bucket will always be much larger than the hot one, which should remain relatively stable over time. On average, data becomes cold after 90 days, but this varies by industry; for instance, a healthcare organization storing large image files may consider a file warm after three days and cold after 60. Select a data management solution that can automatically move data to the age-appropriate storage device or cloud service.
  3. Dead data planning. It can be difficult to know with confidence when and how IT can eventually delete data, especially in a highly regulated organization. Deleting data is part of the full lifecycle management process, even though some organizations never delete anything. Analytics can often indicate which data can be safely deleted. For instance, an excellent use case relates to ex-employees. Companies are often unknowingly storing large amounts of email and file data from employees who have left the company. In many cases, that data can be purged — if you know where it lives and what it contains.
  4. A future-proof storage ecosystem. New data storage technologies are always around the corner; DNA storage is just one example. Organizations should work toward a flexible, agile data management strategy so that they can move data from one technology or cloud service to the next without the high cost of vendor lock-in, which often takes the form of the excessive cloud egress and rehydration fees common to proprietary storage systems. This viewpoint can meet resistance in IT shops with entrenched vendor relationships and a preference for “one throat to choke,” but over time that can be a limiting strategy with more downsides than upsides.
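As a concrete illustration of the first two tenets, here is a minimal sketch that walks a file tree and segments files into hot, warm, cool and cold buckets by last-access age. The thresholds, the directory path and the reliance on atime are all illustrative assumptions; commercial data management products do far more (usage analytics, policy engines, transparent tiering).

```python
import os
import time
from collections import defaultdict

# Illustrative tier thresholds in days; anything older is "cold".
# As noted above, real cutoffs vary by industry (a healthcare shop
# storing large images might use three days for warm and 60 for cold).
TIERS = [("hot", 90), ("warm", 180), ("cool", 365)]

def classify(age_days: float) -> str:
    """Map a file's last-access age to a temperature tier."""
    for tier, limit in TIERS:
        if age_days <= limit:
            return tier
    return "cold"

def scan(root: str) -> dict:
    """Walk a directory tree and total bytes stored per tier."""
    buckets = defaultdict(int)
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files we cannot stat
            # atime can be unreliable on volumes mounted with noatime;
            # a real tool would also consider mtime and access logs.
            age_days = (now - st.st_atime) / 86400
            buckets[classify(age_days)] += st.st_size
    return dict(buckets)

if __name__ == "__main__":
    for tier, size in sorted(scan("/data").items()):  # "/data" is a placeholder path
        print(f"{tier}: {size / 1e9:.1f} GB")
```

A report like this is the “data on your data” that tenet 1 calls for, and the tier labels drive the automated placement that tenet 2 describes.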

IT leaders have the potential to unleash previously untapped unstructured data stores to enhance employee productivity, improve customer experience and support new revenue streams. Getting there requires evolving beyond traditional data management practices. It’s time to stop treating all data the same and perpetuating the endless cycle of backups, mirroring, replication and storage technology refreshes. By developing a modern information lifecycle management model with analytics, automation and data segmentation, organizations can have the right data in the right place, at the right time and managed at the best price.
