The problem: Disregarding data after it leaves the data warehouse
In reality, data governance and data quality processes for BI are often overlooked. The problem is that companies, in general, are good at data warehouse governance but have no control over what’s happening after their KPIs are delivered from the data warehouse. Or they believe that after the data leaves their data warehouse, nothing happens.
The belief that data is being prepared before it enters the BI solution is based on the assumption that end-users have limited control over the data used in their visualization tools, which is a very outdated way of thinking.
Modern BI tools aren’t just data visualization tools
First of all, there is a misconception that a visualization tool is only used to visualize data. Traditionally, all the transformations used to happen in the database or data warehouse (or anywhere else outside the BI solution) and then you used a BI tool to visualize that data. But this is not the case with modern BI tools. Today, there are so many things you can do within your business intelligence tool, including data transformation. Instead of going to the IT department, people responsible for visualizations make changes themselves.
Qlik was one of the first tools that came in and challenged the old centralized way of working. QlikView users got access to the full BI data flow and started transforming data and building applications directly within the BI solution. This led to exponential growth in the number of Qlik applications within organizations. That’s one of the reasons why data governance is an even bigger challenge in Qlik installations compared to other BI tools.
Not all the data travels through your data warehouse
It’s also worth noting that modern business intelligence tools (like Qlik, Power BI and Tableau) are extremely good at connecting to other data sources. One of our customers realized, just by looking at the automatically generated data landscape map in NodeGraph, that only 50% of their data was actually travelling through their data warehouse. This is not a unique case. What is the value of world-class data warehouse governance when it isn’t all-encompassing?
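To make the idea of warehouse coverage concrete, here is a minimal sketch of how that share could be estimated once you have a list of connections between data assets. The edge list, the asset names and the `warehouse_coverage` and `bypassing_sources` helpers are all hypothetical and invented for illustration; this is not how NodeGraph computes or exposes the figure.

```python
# Hypothetical lineage edges as (upstream, downstream) pairs.
# Every name below is invented for illustration.
EDGES = [
    ("erp_db", "warehouse"),
    ("crm_db", "warehouse"),
    ("warehouse", "sales_dashboard"),
    ("budget.xlsx", "sales_dashboard"),
    ("web_api", "marketing_report"),
    ("warehouse", "finance_report"),
]
BI_APPS = {"sales_dashboard", "marketing_report", "finance_report"}
WAREHOUSE = "warehouse"


def warehouse_coverage(edges, bi_apps, warehouse):
    """Share of direct inputs to BI applications that come from the warehouse."""
    inputs = [(up, down) for up, down in edges if down in bi_apps]
    if not inputs:
        return 0.0
    via_warehouse = sum(1 for up, _ in inputs if up == warehouse)
    return via_warehouse / len(inputs)


def bypassing_sources(edges, bi_apps, warehouse):
    """Sources that feed a BI application directly, without the warehouse."""
    return sorted({up for up, down in edges if down in bi_apps and up != warehouse})


print(warehouse_coverage(EDGES, BI_APPS, WAREHOUSE))
print(bypassing_sources(EDGES, BI_APPS, WAREHOUSE))
```

In this made-up landscape, two of the four direct inputs to BI applications bypass the warehouse, so any governance that stops at the warehouse would miss half of what end users actually see.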
Get full control of your data landscape with automated end-to-end data lineage
Until you’ve mapped the full flow of your data, you can’t properly evaluate the state of your solution. An ideal starting point is to visualize the end-to-end data lineage of your Power BI, Tableau or Qlik environment. With NodeGraph’s Dependency Explorer, you can automatically and instantaneously map your entire data solution, from data source to end-user application, at field level.
NodeGraph Dependency Explorer showing data lineage from Snowflake database to Tableau workbook
This can serve as your go-to data lineage solution or as a powerful extension to your existing lineage tools, allowing you to visualize how your data assets are connected to the all-too-often overlooked BI environment. Of course, you can always attempt to produce a manual or semi-manual solution in-house. However, we believe that if you’re relying on people to perform manual processes to achieve this data overview, you will never have 100% coverage.
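For readers who want a feel for what field-level lineage means in practice, the sketch below walks a small dependency map from a dashboard KPI back to its original source fields. The `DERIVED_FROM` mapping, the field names and both helper functions are assumptions made up for this example; they do not represent NodeGraph’s data model or API.

```python
from collections import deque

# Hypothetical field-level lineage: each field maps to the fields it is
# derived from. All names are invented for illustration.
DERIVED_FROM = {
    "dashboard.revenue_kpi": ["bi_model.net_revenue"],
    "bi_model.net_revenue": ["warehouse.orders.amount", "warehouse.orders.discount"],
    "warehouse.orders.amount": ["erp.order_lines.amount"],
    "warehouse.orders.discount": ["erp.order_lines.discount"],
}


def upstream_fields(field, derived_from):
    """Return every field the given field depends on, in breadth-first order."""
    found = []
    visited = {field}
    queue = deque([field])
    while queue:
        current = queue.popleft()
        for parent in derived_from.get(current, []):
            if parent not in visited:
                visited.add(parent)
                found.append(parent)
                queue.append(parent)
    return found


def original_sources(field, derived_from):
    """Upstream fields that are not derived from anything else in the map."""
    return sorted(f for f in upstream_fields(field, derived_from) if f not in derived_from)


print(upstream_fields("dashboard.revenue_kpi", DERIVED_FROM))
print(original_sources("dashboard.revenue_kpi", DERIVED_FROM))
```

Maintaining a map like this by hand is exactly the kind of manual process that tends to fall out of date, which is why automated collection matters for coverage.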
You need a data governance framework that can co-exist with agile and flexible BI environments
Data warehouse tools are usually good at data governance because they have been dealing with it for a long time. It is also easier to govern a more rigid and structured data setup than an agile BI environment where you often want to make quick changes and additions. This alters the way you need to work with BI governance: you can’t really have the same traditional, centrally owned policy. Even on the data warehousing side, there has recently been a shift in technology. Take Snowflake, for example. It works in a completely different way, no longer requiring you to aggregate or structure data beforehand. This, ultimately, renders old governance structures ineffective and again emphasizes the importance of thinking end-to-end and agile.
At NodeGraph, we chose an architecture that allows for that. We use a modern and flexible technology that’s built on a graph database, which allows us to support very large organizations and solutions. Graph databases also enable you to start without deciding on a certain model or structure from the get-go. They fulfill the needs anticipated today, but also the needs of tomorrow (which may not yet be clear to you).
Communication between IT and business is a platform for building your framework but also one of its key outcomes
Let’s compare how organizations work with data warehouses versus business intelligence solutions. First off, it’s important to understand that data warehouses have traditionally been run by the IT department, which ensures that all data is collected, secure and of high quality, while BI has typically been driven by the business side. More recently, the business side has also been adding cloud solutions to its portfolio. As a result, the data landscape is expanding and becoming more complex at a rate much higher than what a typical IT framework can keep up with. However, the business side doesn’t have the knowledge and years of expertise that IT has when it comes to data governance and data quality. This might not be a problem if you have one or two applications, but consider what havoc it could wreak when you are dealing with hundreds or thousands of BI applications.
Another huge problem is that data governance is not regarded as a business-critical area in the organization but is rather seen as an IT-related question, further explaining the lack of data governance practices in the BI sphere.
Furthermore, while business intelligence tools, such as Power BI or Tableau, have governance tools available to their users, these are not enough on their own. As we have mentioned, governance is more about a process and businesses need to find tools that support their decided-upon process. This might be a data lineage tool or a data quality tool, but it’s up to the BI owners to step up and start the data governance discussion internally.
That’s where the importance of end-to-end data governance and the need to include your visualization tools in your governance framework come into play and drive this discussion forward. If you really want to be successful, you need to see involvement from several different levels of the organization and approach the matter outside-in: from the point where the data is used for decision-making all the way to its original source. It’s a shared area of interest that also creates a good platform for collaboration and better dialogue between business and IT.
NodeGraph Data Catalog acting as a KPI catalog to facilitate collaboration
A step-by-step approach to implementing a data governance framework that includes your BI solution and boosts its business value
We believe in a practical “think big, start small and scale fast” approach to data governance. We also believe in the power of approaching it outside-in: start from the business perspective and ensure data quality and data trust in your BI solution, especially if you use self-service BI.
#1. Think big and explain the value
The challenge with implementing or changing a data governance framework is that it’s difficult to see the direct value. We asked business intelligence experts and our colleagues at NodeGraph to shed some light on this:
“The first thing that will happen once you create a new BI report is that someone, or even you yourself, will question whether the numbers that you see in front of you are correct. If you have a good governance framework — you won’t have to ask that question. If people start doubting their data, regardless of whether they’re working with Power BI, Tableau or Excel reports, it’s a problem.”
Oskar Gröndahl, co-founder and CEO at NodeGraph
“There is also the opposite challenge. At earlier maturity stages, companies sometimes intend to build visualizations and have things to talk about without prioritizing data accuracy. This ties into data literacy. How many business users are actually questioning their numbers? Some just get the numbers and believe they’re right or, even worse, just know they’re right. And it can be truly baffling, the misunderstanding from many people that their data can’t be wrong.
There are still a lot of organizations that are run on “good enough” because there are many other problems that start to arise when you begin to question your data. When you start enforcing governance, it’s going to be frustrating because you become aware of the problems in your data solutions. But, after that, it’s just going to get better. And it’s not a quick fix, either. It’s something that you have to work on continuously.”
Oskar Fahlvik, Managing Director at NodeGraph Inc
#2. Start small and focus on easy wins
One easy win can be to install NodeGraph and view what you have in your data landscape. It automatically goes in and shows you exactly how the data in your BI environment is interconnected, where it comes from and how it is used. This will be an accelerator for your next step.
This step might bring up questions such as “Is that Excel file still being used?” or “Why are we pulling data from there?”. This can be frustrating, but starting the conversation will be hugely beneficial. There aren’t a lot of organizations that are having this dialogue, and among those, few know how to transform the abstract conversation into something actionable.
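A question like “Is that Excel file still being used?” can be turned into a simple check once the landscape is available as a list of connections. The following is a rough sketch under that assumption; the edge list, asset names and helper functions are hypothetical and are not taken from NodeGraph.

```python
# Hypothetical lineage edges as (upstream, downstream) pairs.
# Every name below is invented for illustration.
EDGES = [
    ("erp_db", "warehouse"),
    ("warehouse", "sales_dashboard"),
    ("budget.xlsx", "sales_dashboard"),
    ("old_targets.xlsx", "staging_table"),
]
APPS = {"sales_dashboard"}


def reaches_app(node, edges, apps):
    """True if any end-user application is reachable downstream of node."""
    stack, visited = [node], set()
    while stack:
        current = stack.pop()
        if current in apps:
            return True
        if current in visited:
            continue
        visited.add(current)
        stack.extend(down for up, down in edges if up == current)
    return False


def unused_sources(edges, apps):
    """Root sources whose data never ends up in an end-user application."""
    downstream = {down for _, down in edges}
    roots = {up for up, _ in edges if up not in downstream}
    return sorted(s for s in roots if not reaches_app(s, edges, apps))


print(unused_sources(EDGES, APPS))
```

A source flagged this way is only a candidate for cleanup: it still takes a conversation with its owner to confirm that nothing outside the mapped landscape depends on it.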
The best way to go about implementing end-to-end data governance is to start with an overview of your entire data landscape, including the BI environment.
NodeGraph Field Explorer showing full data lineage script
Follow this up by acting on the errors or inconsistencies that you find and begin laying out the process in small steps. Once you’ve started, the organization will already be learning and seeing what the biggest values of data governance can be. On top of this, you will be able to start building a bigger picture of where you are today, where you want to be in the future and how your way of working needs to change in order to bridge that gap.
All in all, this needs to be aligned with the overarching objectives you have as an organization. Are you trying to:
- Make it easier for self-service BI?
- Consolidate definitions for your KPIs?
- Enable end-users to easily find the reports or KPIs they need?
- Solve a compliance issue that requires correct documentation?
There are so many different objectives that you can take into consideration and these are just some examples. The most important thing is that you initially focus your governance efforts on your main business objective. Other issues, gaps and targets will follow. The key is not to have too many, not to get too ambitious.
By using NodeGraph, you can extend the reach of your data governance framework to include your business intelligence environment. We encourage customers to see NodeGraph as a toolbox that provides all the metadata you need in an automated way and helps you see and analyze all the building blocks of your BI solution, so you can drive the most business value from business intelligence.