This is Article 3 in a three-part blog series by Daniel Howard (see bio below) on data management, covering the then and now, the anatomy of a data management platform, and making data management work for you.
Previous blogs in this series have discussed the state of data management and described the most important capabilities for a data management platform to possess. This article will move on to answer what is perhaps the most important question about data management: why should you care? Which is to say, what are the benefits of adopting a data management platform, what use cases does data management serve, and what challenges does it enable you to address? As with previous blogs, this will not be a totally comprehensive list of these challenges and use cases, but will instead serve to highlight some of the most significant areas that can benefit from data management.
Regulatory compliance
First, and arguably foremost, there is the matter of regulatory compliance. You will inevitably need to store some data, and more likely than not some sensitive data, to keep your business functional. And unless you want to operate across only a small fraction of the global market – and note that said fraction is getting smaller all the time, thanks to new regulation – there will be government-enforced compliance mandates that you will need to meet, such as GDPR, CCPA, LGPD, and so on. Noncompliance means risking both substantial monetary costs (via fines, in some cases very large ones) and reputational damage (when your noncompliance is disseminated through various news outlets, especially in the aftermath of, say, a data breach).
Hence you will need data management tools to keep your sensitive data safe and your company compliant. These come in a number of different forms, such as – to name only a few – sensitive data discovery, to physically locate your data and determine whether or not it’s sensitive; various types of data anonymization, including encryption, redaction, and data masking for protecting that sensitive data once you’ve identified it; and data governance or policy management, to apply those protective techniques from the top down. Essentially, data management allows you to find and protect sensitive data across your enterprise, in an automated fashion, at a scale that is simply impossible – or at least overwhelmingly impractical – to do manually. Moreover, it also allows you to prove to the appropriate regulator that you are able to do so in a far more effective manner than any sort of manual solution ever could. Given the legally required nature of this topic, it would not be a stretch to say that data management products are not just recommended for regulatory compliance, they are required.
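To make the discovery and masking steps a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions rather than how any particular product works: the two regular expressions, the function names, and the masking rules are all hypothetical, and real discovery tools rely on far richer detection (metadata, dictionaries, checksums, and so on).

```python
import hashlib
import re

# Illustrative patterns only (assumptions); real tools detect far more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def find_sensitive_columns(rows):
    """Return {column: kind} for columns where any value matches a pattern."""
    found = {}
    for row in rows:
        for column, value in row.items():
            text = str(value)
            if EMAIL.search(text):
                found[column] = "email"
            elif CARD.search(text):
                found[column] = "card_number"
    return found


def mask_value(kind, value):
    """Mask a single value according to the kind of sensitive data it holds."""
    if kind == "email":
        # Pseudonymise with a stable hash so joins on the column still work.
        # An unsalted hash is not strong anonymisation; it is used here for brevity.
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
        return f"user_{digest}@example.invalid"
    if kind == "card_number":
        # Redact everything except the last four digits.
        digits = re.sub(r"\D", "", value)
        return "*" * (len(digits) - 4) + digits[-4:]
    return value


def mask_rows(rows, sensitive):
    """Return copies of the rows with every sensitive column masked."""
    return [
        {col: mask_value(sensitive[col], str(val)) if col in sensitive else val
         for col, val in row.items()}
        for row in rows
    ]
```

In use, you would run find_sensitive_columns over a sample of each table and then apply mask_rows before the data leaves its secured environment. A governance layer would decide which masking rule applies to which kind of data.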
Analytics
Analytics is also an important area that can be well-served by data management. It should hardly need explaining why analytics is important, but the essence of it is that it allows you to analyse and thereby understand (and better predict) your organisation, your system, your customer base, and whatever else you care to. For one example of this, customer analytics can be vital for analysing the characteristics, desires, and behaviours of your customer base, especially in the context of a competitive industry, and is therefore invaluable for keeping up with customer expectations in an increasingly accessible and digital world. For another, MDM (Master Data Management) can be employed to enable a unified, trusted 360-degree view across your various, disparate systems. There is also the matter of AI and machine learning to consider. These technologies are potentially very powerful, but must be managed carefully for them to achieve their full potential (for instance, your models will need to be regularly supplied with high-quality, up-to-date training data).
There is much that data management can do to enable analytics. Perhaps most obviously, there are various data movement methodologies, most famously data integration, that will allow you to centralise your data (inside a data warehouse, say) in order to query multiple datasets at once. The same technologies will also allow you to load training data for your machine learning models. Alternatively, data virtualisation or federation will let you build a virtual data warehouse rather than a physical one, allowing you to analyse multiple data sets in a single query without moving the actual data anywhere. Data quality tools help you to hold your data to a good standard, and thereby ensure the accuracy of your analytics. Stream processing and streaming analytics can help you to analyse and/or load data in real-time as it enters your system, while Change Data Capture (CDC) can allow you to detect and propagate changes to your data as soon as they happen. In short, data management has a lot to offer to analytics.
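As a rough illustration of the "detect and propagate" idea behind CDC, the Python sketch below compares two snapshots of a table and replays the differences onto a target. This is a deliberate simplification: production CDC tools typically read the database's transaction log rather than diffing snapshots, and the function names and event format here are assumptions made for the example.

```python
def capture_changes(before, after):
    """Compare two snapshots keyed by primary key and return change events."""
    events = []
    for key, row in after.items():
        if key not in before:
            events.append(("insert", key, row))
        elif before[key] != row:
            events.append(("update", key, row))
    for key in before:
        if key not in after:
            events.append(("delete", key, None))
    return events


def apply_changes(target, events):
    """Propagate captured change events onto a target copy of the table."""
    for operation, key, row in events:
        if operation == "delete":
            target.pop(key, None)
        else:
            # Inserts and updates both overwrite the target row.
            target[key] = dict(row)
    return target
```

The useful property is that only the changed rows travel to the target (a warehouse, say), rather than the whole table being reloaded on every run.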
Migration to the cloud
As discussed at the start of this series, the cloud – and migration to it – is a hot topic at present. This is unsurprising, as the cloud provides a great many benefits in terms of scalability, flexibility, agility, and so on. To quote the first entry in this series, “Cloud storage is highly dynamic and scalable; it’s frequently cost-efficient (although of course this will vary from case to case and can be impacted by egress costs); and most of all, it offloads the burden (and cost) of building and maintaining your storage solution (or a large part of it, in the hybrid case) to your cloud service provider(s).”
But if you want to migrate either part of or all of your system to the cloud, you will need, or at least want, data management to help you out. Most particularly, you will want tools like data integration that will allow you to intelligently migrate your data from on-premises to cloud systems. There are even products that have been specifically designed to support this sort of mass migration of data, as well as the mapping of source to target data models and any necessary cleansing and transformation. You will also need the ability to protect your data once it’s in the cloud, especially given the differing security requirements of cloud and on-prem, as well as a way to bridge the gap between your on-prem and in-cloud data if you ever intend to use both at once, which will be the case more often than not (even if you eventually intend to put everything in the cloud, some of it will probably stay on-prem in the meantime).
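The mapping of source to target data models, together with cleansing and transformation, can be pictured with a short Python sketch. Everything in it is hypothetical: the legacy column names, the target names, and the cleansing rules are assumptions chosen purely for illustration, and a real migration tool would add validation, error handling, and lineage tracking on top.

```python
from datetime import datetime

# Hypothetical mapping: source column -> (target column, cleansing function).
FIELD_MAP = {
    "CUST_NM": ("customer_name", lambda v: v.strip().title()),
    "EMAIL_ADDR": ("email", lambda v: v.strip().lower()),
    "SIGNUP_DT": (
        "signup_date",
        # Assumes the legacy system stores dates as day/month/year.
        lambda v: datetime.strptime(v, "%d/%m/%Y").date().isoformat(),
    ),
}


def transform_record(source):
    """Map one on-premises record onto the target (cloud) data model."""
    target = {}
    for source_column, (target_column, clean) in FIELD_MAP.items():
        value = source.get(source_column)
        # Missing or empty values become None rather than being cleansed.
        target[target_column] = clean(value) if value not in (None, "") else None
    return target
```

Keeping the mapping in one declarative table, rather than scattering it through code, makes it easier to review with the people who own the source and target systems.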
Self-service, collaboration, and access
Finally, there is the matter of self-service, collaboration, and access, sometimes referred to as the democratisation of data. The gist of all this is that there is an increasing desire for data, and indeed data-driven systems like analytics, to be widely available to users across your organisation. Ideally, data should be accessed via self-service, meaning that a user requests access to data and an automated system provides them with it – or at least an appropriately secured version of it – almost immediately. This stands in contrast to the traditional process of sending a request to IT and waiting hours, days, weeks or even months to receive the data you requested, by which point the original window of opportunity for which the request was made has probably long since passed.
Data management can help with this task via the automated data privacy and governance functionality that is needed to make the speedy delivery of secure data a real possibility. After all, these technologies can essentially automate the process that IT previously went through in order to deliver secure data. They are most often provided via a data catalogue, through which users can see and request all of the data in your organisation that they have access to, as well as understand what that data means, where it can be applied, how it relates to other pieces of data, and so on. Data catalogues also frequently contain collaboration-oriented features such as data sharing, comments, and ratings. Some catalogues even offer proactive data delivery functionality, meaning that you can use them to automatically deliver data to your users that you think they’ll need without them even having to request it.
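A self-service request that returns "an appropriately secured version" of a dataset might look something like the Python sketch below. It is only a sketch under stated assumptions: the catalogue structure, the sensitivity tags, the roles, and the request_dataset function are all hypothetical, and real catalogues apply much richer policies than a simple role lookup.

```python
# Hypothetical catalogue entry: each column is tagged with a sensitivity level.
CATALOGUE = {
    "customers": {
        "columns": {"customer_id": "public", "region": "public", "email": "restricted"},
        "rows": [
            {"customer_id": 1, "region": "EMEA", "email": "a@example.com"},
            {"customer_id": 2, "region": "APAC", "email": "b@example.com"},
        ],
    }
}

# Hypothetical policy: the sensitivity levels each role may see unmasked.
ROLE_CLEARANCE = {
    "analyst": {"public"},
    "privacy_officer": {"public", "restricted"},
}


def request_dataset(role, dataset):
    """Return the dataset immediately, redacting columns the role may not see."""
    entry = CATALOGUE[dataset]
    allowed = ROLE_CLEARANCE.get(role, set())  # unknown roles see nothing unmasked
    return [
        {column: (value if entry["columns"][column] in allowed else "REDACTED")
         for column, value in row.items()}
        for row in entry["rows"]
    ]
```

Because the policy is evaluated automatically at request time, an analyst gets usable data straight away, and nobody in IT has to prepare a secured copy by hand.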
The point of this series has been to demonstrate the current state of data management, the most useful potential features of a data management platform, and the way such a platform can be used to address common use cases that exist across multiple industries. As far as this article in particular is concerned, the short version is that data management is not an optional capability but an essential one. Whether you are interested in analytics, self-service data access, machine learning, cloud migration, or regulatory compliance, data management is and will always be far closer to a necessity than a nicety.
About the author
Daniel Howard, Senior Analyst
Bloor Research International Ltd
www.bloorresearch.com
Daniel is an experienced member of the IT industry. In 2014, following the completion of his Master of Mathematics at the University of Bath, he started his career as a software engineer, developer and tester at what was then known as IPL. His work there included all manner of software development and testing, and both Daniel personally and IPL generally were known for the high standard of quality they delivered. In the summer of 2016, Daniel left IPL to work as an analyst for Bloor Research, and the rest is history.
Daniel works primarily in the data space, his interest inherited from his father and colleague, Philip Howard. Even so, his prior role as a software engineer remains with him and has carried forward into a particular appreciation for the development, DevOps, and testing spaces. This allows him to leverage the technical expertise, insight and ‘on-the-ground’ perspective garnered through his old life as a developer to good effect.
Outside of work, Daniel enjoys Latin and ballroom dancing, board games, skiing, cooking, and playing the guitar.