HOW TO UNLOCK YOUR COMPANY DATA’S FULL POTENTIAL: LESSONS FROM TWO DECADES IN IT DATA AND ARCHITECTURE
I’ve worked in IT for more than 20 years, in a variety of data and architecture roles across business intelligence, data management, data migration and data science. The common thread across all the projects I’ve worked on, irrespective of industry or vertical, is the need for trustworthy, high-quality data. That continues to be true for today’s newer initiatives, such as digital twin, digital thread, predictive and prescriptive analytics, and even large language models such as ChatGPT. All of these need accurate, complete, reliable data to reach their full potential.
The majority of organisations these days are data rich. They capture data from a wide variety of sources, ranging from customer interactions and website traffic to manufacturing processes and sensors. That being said, most delivery projects I’ve been involved in have been constrained by data in one way or another: data that is siloed and inaccessible to other teams, data of poor quality, datasets that nobody fully understands, and data that is difficult to get access to.
There are two major initiatives that I think can help mitigate these issues: data governance and data cataloguing.
Data governance is the process of applying controls and processes to manage data through its lifecycle, from creation to archival or deletion. That covers building teams drawn from across the business to define what good-quality data looks like and to proactively ensure that data meets these standards. Those standards include the rules the data must comply with, the fields or attributes that are needed, and how long the data can be retained.
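To make that concrete, here is a minimal sketch of how agreed standards might be turned into an automated check. It is an illustration only, not how any particular governance tool works: the names GovernancePolicy and check_record, the created_on field and the example values are all hypothetical, and real rules would be defined by the cross-business team described above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class GovernancePolicy:
    """Hypothetical policy: the fields a record needs and how long it may be kept."""
    required_fields: tuple[str, ...]
    retention_days: int


def check_record(record: dict, policy: GovernancePolicy, today: date) -> list[str]:
    """Return a list of policy violations for one record (empty list = compliant)."""
    issues = []
    # Rule 1: every required field must be present and non-empty.
    for field_name in policy.required_fields:
        if record.get(field_name) in (None, ""):
            issues.append(f"missing required field: {field_name}")
    # Rule 2: the record must not be older than the retention period.
    # "created_on" is an assumed field name for the creation date.
    created = record.get("created_on")
    if isinstance(created, date) and today - created > timedelta(days=policy.retention_days):
        issues.append("record is past its retention period")
    return issues


if __name__ == "__main__":
    policy = GovernancePolicy(required_fields=("customer_id", "email"), retention_days=365)
    record = {"customer_id": "C1", "email": "", "created_on": date(2020, 1, 1)}
    for issue in check_record(record, policy, today=date(2022, 1, 1)):
        print(issue)
```

A check like this only reports problems. Deciding who fixes them, and who owns the policy, is the organisational part of governance that no script can replace.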
Data governance is not new and is certainly not quick or easy to implement. Many companies I’ve worked with have deployed some form of data governance, but only one pharmaceutical company did it well. The differentiating factor was that the benefits of good-quality data and the impact of poor-quality data were understood at all levels of the business, making the organisation more receptive to implementing controls and management processes.
Data cataloguing involves building a register of the data used in an organisation or process, and then capturing and curating information about it. Data catalogues can hold information about what a piece of data means (i.e., a business glossary), the structure and format it takes, and which system it comes from. Catalogues can be used to signpost which datasets are available, who to contact in order to gain access, and how the data has flowed through the organisation (its lineage).
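As a rough illustration of the idea, a catalogue can be thought of as a searchable register of entries, each recording meaning, format, source system, contact and upstream datasets. The sketch below is a toy in-memory version under that assumption. The class names CatalogueEntry and DataCatalogue, their fields and the sample datasets are all hypothetical and do not reflect any specific catalogue product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record for one dataset."""
    name: str
    description: str      # business glossary definition
    source_system: str    # system the data comes from
    data_format: str      # structure or format, e.g. "CSV" or "table"
    owner_contact: str    # who to contact to gain access
    upstream: list[str] = field(default_factory=list)  # datasets this one is derived from


class DataCatalogue:
    """Toy in-memory register of datasets."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        self._entries[entry.name] = entry

    def find(self, keyword: str) -> list[CatalogueEntry]:
        """Signpost datasets whose name or description mentions the keyword."""
        needle = keyword.lower()
        return [
            e for e in self._entries.values()
            if needle in e.name.lower() or needle in e.description.lower()
        ]

    def lineage(self, name: str) -> list[str]:
        """Walk upstream from a dataset, nearest ancestors first."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue  # guard against cycles and shared ancestors
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


if __name__ == "__main__":
    catalogue = DataCatalogue()
    catalogue.register(CatalogueEntry("raw_orders", "Orders as captured at checkout",
                                      "web_shop", "JSON", "shop-team@example.com"))
    catalogue.register(CatalogueEntry("clean_orders", "Validated, de-duplicated orders",
                                      "warehouse", "table", "data-team@example.com",
                                      upstream=["raw_orders"]))
    print(catalogue.lineage("clean_orders"))
```

Even a register this simple answers the questions that most often stall delivery projects: what exists, what it means, where it came from and who to ask.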
Let’s be honest, everyone wants to work with the newest and shiniest technologies. Much earlier in my career, I’d have leapt to the technology first, looking to implement the next new shiny thing to solve business problems. It’s clear to me now, however, that the success of the shiny and new will be constrained by the quality of data fed into them. Cataloguing and strong data governance are the fundamentals that successful data-driven organisations are built from.
As with the architecture of buildings, data architecture pays off when solid foundations are put in place before delivering the new and shiny stuff.