What is common to all business transformations is that they are more about people than about technology, processes, or data. This is not news to anyone who has worked closely with transformative companywide programs. Perfectly planned does not lead to happily ever after.
Data transformations present new kinds of challenges, two in particular. First, data concerns, touches, and is affected by everyone. Second, data as an organisational capability is still a work in progress for most companies.
We are still discovering and learning that data has value. This is what all the talk about data assets and becoming data-driven is about. We would not purposefully break something of value, yet that is often what happens to data. We enter data into systems and applications because we are asked to do so, not because our first thought is that it will create value and new business opportunities. We see the rationale, but we do not act accordingly.
If we look outside of ourselves, we notice that continuously accelerating digitalisation is creating demand for faster value creation and causing maturity gaps within companies to widen. It will soon be breaking even more business models.
There is a loud call for new ways to organise business and transformation programs, and we all need to answer it. The old ways have been battle-tested and proven to work when implemented properly. The challenge today is that they are becoming too slow to be successful.
Learning 1 – Principle-based transformation designs create room for delivering value
We need to stop paralysing data initiatives by trying to control the things we understand. Centralised data organisations already have both feet in the grave; we just don't see it yet.
For a successful transformation, it is important to define the current state, the target state, and how you plan to get there. The different areas can and should be tackled in the way and order that suit your company specifically.
A data transformation typically covers these seven areas:
- Data vision and data strategy
- Data governance
- Data management and data quality
- Organisation and capability development
- Architecture and technical capabilities
- Information architecture
- Data delivery and building data solutions
What happens, and in what order, depends mostly on the current maturity, level of ambition, targets, and business models. In practice, it will also depend a great deal on who in the organisation is running the show.
At the beginning of the data journey, everyone is learning and iterating together. The need for iteration breaks familiar first-time-right approaches and makes one-size-fits-all approaches inefficient, along with most initiatives to find companywide best practices. Copy-pasting a practice from one context to another is a path to failure, especially if you do not consider the underlying principles.
Sadly, many companies try to scale practices in unstable environments, and it does not work. You need at least to complete the iteration first. Due to the accelerating speed of change, the business environment may never become fully stable again. You need more dynamic approaches to organisation and processes, where change is a built-in part of the design.
Another key ingredient is the need to understand the data. It creates a growing need for subject matter experts who can recognise opportunities and figure things out independently. This will put an end to what is left of the idea of a centralised data organisation. It is a big cultural challenge for organisations where people are used to being told what to do and have narrow, siloed job descriptions.
Freedom to make decisions at the subject matter level will no longer be a philosophical leadership discussion about self-direction, nor a question of doing things in a smarter, more agile way in an environment that must follow corporate rules. It will increasingly become a necessity for getting enough data-capable hands on deck. The real question to discuss should be how distributed and organically formed your decentralised data organisation is going to be.
We have learned that the medicine is to create space for iteration and self-direction with a principle-based design: define a minimal set of companywide guidelines that give freedom for fast iteration at the subject-matter-expert level while ensuring you do not lose sight of company-level targets. You need companywide principles, guidelines, and governance to avoid creating hundreds of pieces that do not fit together.
In practice, this means that you form principles for your data transformation based on your vision, so that they guide you to the desired future state. This makes it possible to iteratively define practices at the local level without messing up the big picture.
In a similar way, you can have principles for your data products. These guide the data product teams while allowing them the freedom to organise, create practices, and make decisions as they see fit. Global guidelines and local practices.
Learning 2 – Targets must be measurable if you want higher ROI
KPIs are everywhere. Still, we seem to forget to measure the value of our data programs.
For a successful data journey, you need to derive the targets from the data vision. The targets and metrics are a key part of communicating what the vision means. You need to determine why the company is starting a data transformation and what the transformation will mean at the corporate level, at the business unit level, and for the people involved. Common data visions like “becoming data driven” or “using more data for decision making” are too broad.
The confusion arising from unclear targets will prevent people from taking action, slowing down your transformation. It pushes back the date for larger-scale value delivery and breaking even.
The lack of measurable business targets also often results in measuring activity instead of value. Unfortunately, the number of workshops held or data assets identified tells us little about business value or whether we are prioritising the right things.
Lack of clearly defined intent also leads to not knowing whether you are going in the right direction. This is an important aspect of the self-direction mentioned earlier. Leadership needs to connect the dots more clearly, even in organisations that are already self-directed. Having great people and plenty of action will not be enough. Transformative leadership is required, as the challenge is to communicate intent clearly without stepping on the teams’ freedom.
One way of approaching the problem is to create a KPI architecture with metrics for achieved, operational, and execution value, connected to the data program’s initiatives.
The metrics for achieved value tell us if we are on the right path to meet our strategic KPIs, for example increased profitability and growth. The operational metrics tell us about improvements in our business operations, for example an increase in the number of customers acquired or improved process efficiency. The execution metrics are mostly about measuring learning and behavioural change. They should react fast and connect to the actions we want to see happening.
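To make this concrete, here is a minimal sketch of such a KPI architecture in Python. The metric names, targets, and the linkage between tiers are illustrative assumptions, not a prescribed model.

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    """One metric in the KPI architecture."""
    name: str
    tier: str                    # "achieved", "operational" or "execution"
    target: float
    current: float = 0.0
    driven_by: list = field(default_factory=list)  # metrics expected to move this one

    def on_track(self) -> bool:
        return self.current >= self.target

# Hypothetical chain: a behavioural-change metric (execution) should move
# an operations metric (operational), which supports a strategic KPI (achieved).
trainings_applied = Metric("Data trainings applied in daily work", "execution", target=200)
customers_acquired = Metric("New customers acquired", "operational", target=500,
                            driven_by=[trainings_applied])
profit_growth = Metric("Profitability growth %", "achieved", target=8.0,
                       driven_by=[customers_acquired])
```

The point is the explicit linkage: every execution metric should trace, through an operational metric, to an achieved-value KPI, so that activity is never measured for its own sake.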
The number of documented data assets, the activity level of the data community, trainings held, or data quality improvement actions completed are not the best metrics for determining the business value of a data journey, but they can be good indicators of what is going on.
If relevant and meaningful KPIs are not set well at the beginning of a data transformation, it can be difficult to backtrack and justify the investments later, especially if there is dissatisfaction or division between stakeholders. You can be sure there will be; it is only a question of when.
Learning 3 – Trainings are not enough for people to act
If you are pouring more money and time into your transformation program, yet not getting any more value, you are likely in a capability trap. Stop and think.
Trainings are a good tool for communication, improving awareness, and increasing general data literacy. The problem with trainings is that data is often too far a stretch from what we know, and we therefore fail to ground the lessons in our own experience. People end up understanding everything on a theoretical level but much less about how it relates to their daily work, which leaves them with limited or no ability to execute their assigned data responsibilities.
To support the change, you need to approach people holistically and be present in the context relevant to the person you are trying to help. Sitting in a centralised ivory tower will not help anyone. Instead, help people take the steps you want to see and make data topics relevant by bringing them close to their own subject matter. By design, this will also make the change less invasive and decrease friction within the organisation and between data and IT.
An additional benefit of being present in the data community is that you gain valuable insight into what is going on and learn how to support people even better. This will greatly help you understand maturity-related problems and performance gaps, and so refine your transformation roadmap.
Learning 4 – Focus on data entities when you structure your data
Less than half of the data has added value outside its original scope of use. Forgetting this makes us direct our attention away from delivering value.
Data entities represent the real-world things that you have or maintain data about. They are the lowest level at which it is useful to conceptually identify the data and define the relationships within it. Entities are the building blocks for creating a structure for your data. Data entities are also referred to as, for example, data elements, data concepts, or data objects.
Conceptual data modelling is the tool for identifying the important entities, and the data models describe how the entities relate to each other. If the conceptual modelling is done properly, the data entities are source-system agnostic and defined together with the business in an understandable and relevant way. Some examples of data entities are Customer, Sales order, Invoice, Contract, Manufacturing Plant, Equipment, Product, and Project.
Data entities should not be confused with attributes and artifacts. Attributes describe or give us more information about an entity, but they are not real-world things on their own. For example, an order number makes no sense if we do not have the order itself, and a truck described as yellow tells us nothing of value if we do not have the truck.
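To make the distinction concrete, here is a minimal sketch in Python. The entity names and their fields are illustrative assumptions, not a reference model.

```python
from dataclasses import dataclass

# Entities are real-world things we keep data about; they stand on their own.
@dataclass
class Customer:
    customer_id: str    # attribute: identifies the customer
    name: str           # attribute: describes the customer

@dataclass
class SalesOrder:
    order_number: str   # attribute: meaningless without the order itself
    customer: Customer  # relationship between two entities

@dataclass
class Truck:
    registration: str
    colour: str         # attribute: "yellow" tells us nothing without the truck
```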
Artifacts can briefly be described as the outcomes or results of data utilisation, for example the reports and analyses where we use the data. Artifacts are not the origin of the data, so we do not want them to be data sources.
The point of talking about data assets is to help us see data as something of value: the intangible asset that it is. Data holds and can produce value, but merely having it also carries risk. The pursuit of increasing data utilisation creates the need for managing data, which is one of the reasons we are talking about data journeys and governance in the first place.
From a general terminology perspective, a data asset can refer to pretty much any valuable data we have, from a single entity to all the data a company owns. But this is not very practical if we want to create a good structure for our data. So, for data cataloguing purposes, data assets can be defined as logical groups of data entities. Done well, defined data assets give us a structure that helps us understand the business purpose of our data.
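For example, a data catalogue could record assets as named groups of entities along these lines; a minimal sketch where the asset groupings are hypothetical examples.

```python
# Data assets defined as logical groups of data entities.
data_assets = {
    "Customer data": ["Customer", "Contract", "Invoice"],
    "Sales data": ["Sales order", "Product"],
    "Plant data": ["Manufacturing Plant", "Equipment"],
}

def asset_for_entity(entity: str) -> str | None:
    """Return the data asset an entity belongs to, if any."""
    for asset, entities in data_assets.items():
        if entity in entities:
            return asset
    return None

print(asset_for_entity("Invoice"))  # -> Customer data
```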
Defining data assets also helps us decide the data governance responsibility for each asset, drawing the lines between the sandboxes. Unfortunately, this can also encourage silo thinking and make data utilisation more difficult. Focusing too much on control and ownership instead of value creation is a problem, especially when the data governance practices are not yet up and running. Also remember that data used across the business likely needs more clearly defined governance than data used only within one specific area.
Another problem might present itself when we start identifying and documenting the data in the data catalogue to create an inventory of our assets. Not that this is not valuable or important, but you can end up in a conceptual documentation nightmare, especially if you forget that less than half of the data has added value outside its original scope of use.
So, the learning: when you start creating a structure for your data, focus on the data entities. Well-defined building blocks will bring you value in the long term, whether you decide to work with a data-as-an-asset or a data-as-a-product approach. Well-defined data entities will help you understand the data at a level that makes use case implementation easier.
The focus should be on value and data delivery, not on a perfect conceptual library of all your data. Defining data assets will help you structure and govern the data, yet spending too much effort on defining assets and identifying the data risks directing your attention and resources away from value creation.
You need a solid data foundation if you want to do large-scale data utilisation. But it is better to focus first on the data that is important to the business and build the foundation step by step, not all at once.
Learning 5 – Create accountability for data quality and stop the blame game
Most of us do not care about data quality until it becomes a problem for us. Then we start to look for who is to blame.
Data quality is conceptually simple. As we know, the difficulty comes from almost everyone having a role to play: every process, practice, system, person, and action affects the quality of the data. Currently, data quality is everyone’s and almost no one’s business.
We all share the same problem. The data quality is not as good as you would hope, and it does not really get any better. If it does get better, erosion will slowly but surely eat up the improvements achieved.
Another familiar difficulty is that the people responsible for data quality feel they cannot do much about the problems, since they are not doing the data entry. They are often in supporting roles and have limited ability to nudge the businesspeople in the right direction. This is the story you hear from most data owners, data stewards, and master data specialists. When you feel you cannot affect the problem you are responsible for, there will be no real accountability.
Typically, solving data quality is framed as either a data management or a data governance topic. In truth, you need to do the right things in the right way to reach your goal, so you need both.
If you look at data quality too much from a data management angle, you can do analysis, clean up data, and create transparency, but you will have trouble doing much about the sources of the quality issues. You cannot solve the root cause; you can only make the problem smaller and easier to live with. One symptom of too much data management focus is the feeling that the data is a mess no matter how much you work on improving it. You need data governance to create structure and define standards.
If you focus too much on data governance, you run the risk of staying too conceptual, especially if your data owners come from the business organisation and are unfamiliar with working with physical data. A symptom of too much focus on data governance is that the data capabilities improve, but data quality and value delivery do not.
One way to attempt to solve this is with a federated data governance model that builds in a culture, principles, and mechanisms for continuous data quality improvement. The key is to make the businesspeople in data governance roles accountable for data quality, so that they start working on the issues and root causes that data management cannot solve. Governance then becomes an organisational capability that enables value creation.
Another way is to implement a data-as-a-product approach, in which one data product team takes end-to-end responsibility, from the source through data delivery to data quality improvement. This creates an entirely different dynamic, as the data product team takes the data producer role and the data users become data consumers. The data product team then needs both the subject matter expertise and the organisational reach to fix root causes in order to fulfil its purpose.
To start fixing the problems, consider the following:
- Make problems transparent and measurable. Making data quality transparent will create the need and pressure to act. An easy first step is to add notes about reported data quality issues to your data catalogue.
- Define what you are talking about, since much of the data terminology is not universal or commonly understood in the same way. Luckily, when it comes to data quality, the dimensions for master data are quite standardised: missing data, duplicates, wrong formats, or data not being updated each tell us something specific about the data.
- Define what good quality data means for the users of the data. This is important if we want to serve the data consumers in a satisfactory manner. Since you can only do as much data quality improvement as you have resources for, you need to understand what is good enough in order to prioritise.
- Make data quality improvement a continuous process. Monitoring data quality and creating triggers for improvement is a way to implement the range of good enough data: if data quality drops below, say, 80% completeness, it automatically triggers an improvement action (see the sketch after this list).
- We need working data governance to ensure we are doing the right things and enabling impactful data management.
- Think of data quality as product quality and implement a data-as-a-product approach, which creates end-to-end responsibility for the data, from extracting it from the source to delivering it and continuously improving it.
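As an illustration of the monitoring idea above, here is a minimal sketch of a completeness check that triggers an improvement action below a threshold. The 80% threshold comes from the example in the list; the field names and the action taken are hypothetical.

```python
COMPLETENESS_THRESHOLD = 0.8  # the agreed "good enough" level

def completeness(records: list, field: str) -> float:
    """Share of records where the given field is filled in."""
    if not records:
        return 1.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)

def monitor(records: list, field: str) -> None:
    score = completeness(records, field)
    if score < COMPLETENESS_THRESHOLD:
        # In practice this could open a ticket for the responsible data team.
        print(f"Improvement action triggered: '{field}' completeness is {score:.0%}")

customers = [
    {"customer_id": "C1", "email": "a@example.com"},
    {"customer_id": "C2", "email": ""},    # missing
    {"customer_id": "C3", "email": None},  # missing
]
monitor(customers, "email")  # completeness 33% -> triggers an action
```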
Having written all this, I wonder how many master data management teams are struggling with the same data entry related problems day in and day out. How many data owners are unable to fix the root causes of bad data quality? Or maybe it is just that we need to fix our attitudes first?
One thing I can say for sure: shifting the blame for data quality issues onto the business users does not help. A bad system will always defeat the individual.