Expert guide & roadmap

Implementing a data governance framework as part of your BI success.

There are many different best practices and strategies available when it comes to implementing a data governance framework. The problem is that they cover everything within the data landscape except the business intelligence (BI) solution. Data is typically assumed to be final and fully governed before it enters the BI solution, which makes it extremely difficult to stay in full control of your data journey from end to end.

Including your BI solution in your data governance framework

In this article, we’ll explore why BI is often overlooked when it comes to data governance, which trends have led to this and how they presently affect the industry. We’ve also compiled real-life examples and best practices from our experience of working with 270+ enterprises and 50+ partners worldwide. Data governance looks different for everyone, and this article is based on the assumption that every organization must adapt its framework to suit its individual needs.

Our strongest suit is our unique data intelligence toolbox that extends the reach of your current data governance efforts, ensuring that they cover your data from end to end, i.e. from the point where the data is consumed all the way to the data source. We’ll focus on practical and actionable data governance initiatives that provide immediate benefits and, most importantly, boost the business value of your BI solution.

Just so that we speak the same language: the data governance framework represents the “what” of your data governance, and the implementation the “how”. Data governance frameworks are based on three pillars – processes, policies, and people. In this article, we’ll focus mostly on processes and partially on the roles and responsibilities needed to include your BI solution.

Your BI environment should be covered by your data governance framework

Data governance is about delivering data that your business users can trust and understand entirely. It is about having processes, policies and people in place that ensure your KPIs are up to date, uniformly defined and of high quality. All this so that you can make informed decisions and advance toward your business goals.

We had a case where, after a quick audit with NodeGraph, a customer realized that they had 25 different definitions of a central and important KPI. This is a very powerful example of why you need a governing model. You need to understand and be confident in the values in your BI reports, knowing exactly where they come from, in order to communicate them across your organization and make informed decisions.

Data governance should be seen as a framework that you put on top of your entire data environment. It is a way of working with your data to deliver high-quality, correct and consistent results to your business users.

Ultimately, data governance only provides value if it covers your data flow from end to end. If you manage to achieve 90% coverage, imagine what effect the remaining 10% of uncertainty will have on your business, looming over all your decisions. Until you examine your entire data landscape, including what’s happening in the BI environment, you can’t possibly know whether you have a problem, regardless of whether you’re working with Power BI, Tableau, Qlik or Excel dashboards. And if you can’t trust the data, nothing else really matters. That’s why you need to include your BI solution in your data governance framework.

Download our data governance roadmap for BI leaders

Get started with an outside-in approach to data governance with this roadmap.

The problem: Disregarding data after it leaves the data warehouse

In reality, data governance and data quality processes for BI are often overlooked. The problem is that companies, in general, are good at data warehouse governance but have no control over what’s happening after their KPIs are delivered from the data warehouse. Or they believe that after the data leaves their data warehouse, nothing happens.

The belief that data is fully prepared before it enters the BI solution rests on the assumption that end users have limited control over the data used in their visualization tools, which is a very outdated way of thinking.

Modern BI tools aren’t just data visualization tools

First of all, there is a misconception that a visualization tool is only used to visualize data. Traditionally, all the transformations used to happen in the database or data warehouse (or anywhere else outside the BI solution) and then you used a BI tool to visualize that data. But this is not the case with modern BI tools. Today, there are so many things you can do within your business intelligence tool, including data transformation. Instead of going to the IT department, people responsible for visualizations make changes themselves.

Qlik was one of the first tools to challenge the old centralized way of working. QlikView users got access to the full BI data flow and started transforming data and building applications directly within the BI solution. This led to the number of Qlik applications growing exponentially within organizations. That’s one of the reasons why data governance is an even bigger challenge in Qlik installations compared to other BI tools.

Not all the data travels through your data warehouse

It’s also worth noting that modern business intelligence tools (like Qlik, Power BI and Tableau) are extremely good at connecting to other data sources. We have a customer that, just by looking at the automatically generated data landscape map in NodeGraph, realized that only 50% of their data was actually travelling through their data warehouse. This is not a unique case. What is the value of having world-class data warehouse governance when it isn’t all-encompassing?

Get full control of your data landscape with automated end-to-end data lineage

Until you’ve mapped the full flow of your data, you can’t properly evaluate the state of your solution. An ideal starting point is to visualize the end-to-end data lineage of your Power BI, Tableau or Qlik tool. With NodeGraph’s Dependency Explorer, you can automatically and instantaneously map your entire data solution — from data source to end-user application, on a field-level.

NodeGraph Dependency Explorer showing data lineage from Snowflake database to Tableau workbook

This can serve as your go-to data lineage solution or as a powerful extension to your existing lineage tools, allowing you to visualize how your data assets are connected to the widely overlooked BI environment. Of course, you can always attempt to produce a manual or semi-manual solution in-house. However, we believe that if you’re relying on people to perform manual processes to achieve this data overview, you will never reach 100% coverage.
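To make the idea of field-level lineage a bit more concrete, here is a minimal sketch (not NodeGraph’s internal model) of how such a dependency graph could be represented and queried in Python with the networkx library. The field names, a hypothetical Snowflake column feeding a Tableau workbook via a warehouse table, are purely illustrative.

```python
import networkx as nx

# Directed graph: an edge A -> B means field B is derived from field A.
lineage = nx.DiGraph()
lineage.add_edge("snowflake.sales.order_amount", "dwh.fact_sales.net_revenue")
lineage.add_edge("dwh.fact_sales.net_revenue", "tableau.revenue_dashboard.total_revenue")
lineage.add_edge("excel.manual_adjustments.correction", "tableau.revenue_dashboard.total_revenue")

# Trace a report field back to every upstream source it depends on.
report_field = "tableau.revenue_dashboard.total_revenue"
print(sorted(nx.ancestors(lineage, report_field)))
# -> ['dwh.fact_sales.net_revenue', 'excel.manual_adjustments.correction',
#     'snowflake.sales.order_amount']
```

The same structure answers both questions that matter in practice: where a report figure actually comes from, and (as we return to in the impact analysis section below) what is affected if a source changes.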

You need a data governance framework that can co-exist with agile and flexible BI environments

When it comes to data warehouses, organizations are usually good at data governance, as they have been working with it for such a long time. It is also easier to govern a rigid and structured data setup than an agile BI environment where you often want to make quick changes and additions. This alters the way you need to work with BI governance – you can’t really apply the same traditional, centrally owned policy. Even on the data warehousing side, there has recently been a shift in technology. Take Snowflake, for example: it works in a completely different way, no longer requiring you to aggregate or structure data beforehand. This ultimately renders old governance structures ineffective and again emphasizes the importance of thinking end-to-end and agile.

At NodeGraph, we chose an architecture that allows for that. We use a modern and flexible technology built on a graph database, which allows us to support very large organizations and solutions. Graph databases also enable you to start without deciding on a certain model or structure from the get-go. They fulfill the needs you anticipate today, but also the needs of tomorrow (which may not yet be clear to you).

Communication between IT and business is a platform for building your framework but also one of its key outcomes

Let’s compare how organizations work with data warehouses versus business intelligence solutions. First off, it’s important to understand that data warehouses have traditionally been run by the IT department, ensuring that all data is collected, secure and high quality — while BI has typically been driven by the business-side. More recently, the business side has also been adding cloud solutions to its portfolio. As a result, the data landscape is expanding and becoming more complex at a rate much higher than what a typical IT framework can keep up with. However, the business-side doesn’t have the knowledge and years of expertise that IT has when it comes to data governance and data quality. This might not be a problem if you have one or two applications but consider what havoc it could wreak when we are dealing with hundreds or thousands of BI applications.

Another huge problem is that data governance is not regarded as a business-critical area in the organization but is rather seen as an IT-related question, further explaining the lack of data governance practices in the BI sphere.

Furthermore, while business intelligence tools, such as Power BI or Tableau, have governance tools available to their users, these are not enough on their own. As we have mentioned, governance is more about a process and businesses need to find tools that support their decided-upon process. This might be a data lineage tool or a data quality tool, but it’s up to the BI owners to step up and start the data governance discussion internally.

That’s where the importance of end-to-end data governance and the need to include your visualization tools in your governance framework come into play and drive this discussion forward. If you really want to be successful, you need to see involvement from several different levels of the organization and approach the matter outside-in: from the point where the data is used for decision-making all the way to its original source. It’s a shared area of interest that also creates a good platform for collaboration and better dialogue between business and IT.

NodeGraph Data Catalog acting as a KPI catalog to facilitate collaboration 

A step-by-step approach to implementing a data governance framework that includes your BI solution and boosts its business value

We believe in a practical “think big, start small and scale fast” approach to data governance, and in the power of approaching it from an outside-in perspective: starting from the business side and ensuring data quality and data trust in your BI solution, especially if you use self-service BI.

#1. Think big and explain the value

The challenge with implementing or changing a data governance framework is that it’s difficult to see the direct value. We asked business intelligence experts and our colleagues at NodeGraph to shed some light on this:

“The first thing that will happen once you create a new BI report is that someone, or even you yourself, will question whether the numbers that you see in front of you are correct. If you have a good governance framework — you won’t have to ask that question. If people start doubting their data, regardless of whether they’re working with Power BI, Tableau or Excel reports, it’s a problem.”

Oskar Gröndahl, co-founder and CEO at NodeGraph

“There is also the opposite challenge. At earlier maturity stages, companies sometimes intend to build visualizations and have things to talk about without prioritizing data accuracy. This ties into data literacy. How many business users are actually questioning their numbers? Some just get the numbers and believe they’re right or, even worse, just know they’re right. And it can be truly baffling, the misunderstanding from many people that their data can’t be wrong.

There are still a lot of organizations that are run on “good enough” because there are many other problems that start to arise when you begin to question your data. When you start enforcing governance, it’s going to be frustrating because you become aware of the problems in your data solutions. But, after that, it’s just going to get better. And it’s not a quick fix, either. It’s something that you have to work on continuously.”

Oskar Fahlvik, Managing Director at NodeGraph Inc

#2. Start small and focus on easy wins

Start small and focus on easy wins. One easy win can be to install NodeGraph and view what you have in your data landscape. It automatically maps your BI environment and shows you exactly how your data is interconnected, where it comes from and how it is used. This will be an accelerator for your next step.

This step might bring up questions such as “Is that Excel file still being used?” or “Why are we pulling data from there?”. This can be frustrating, but starting the conversation will be hugely beneficial. There aren’t a lot of organizations that are having this dialogue, and among those, few know how to transform the abstract conversation into something actionable.

The best way to go about implementing end-to-end data governance is to start with an overview of your entire data landscape, including the BI environment.

NodeGraph Field Explorer

NodeGraph Field Explorer showing full data lineage script

Follow this up by acting on the errors or inconsistencies that you find and begin laying out the process in small steps. Once you’ve started, the organization will already be learning and seeing what the biggest values of data governance can be. On top of this, you will be able to start building a bigger picture of where you are today, where you want to be in the future and how your way of working needs to change in order to bridge that gap.

All in all, this needs to be aligned with the overarching objectives you have as an organization. Are you trying to:

  • Make it easier for self-service BI?
  • Consolidate definitions for your KPIs?
  • Enable end users to easily find the reports or KPIs they need?
  • Solve a compliance issue that requires correct documentation?

There are so many different objectives that you can take into consideration, and these are just some examples. The most important thing is that you initially focus your governance efforts on your main business objective. Other issues, gaps and targets will follow. The key is not to take on too many at once and not to get too ambitious.

By using NodeGraph, you can extend the reach of your data governance framework to include your business intelligence environment. We encourage customers to see NodeGraph as a toolbox that provides all the metadata you need in an automated way and helps you see and analyze all the building blocks of your BI solution, so you can drive the most business value from business intelligence.

Extend data governance framework to include BI - A toolbox

#3. Scale fast and automate to make it easy to follow

Once you’ve consolidated your main objectives, the challenge is to make a data governance framework that is easy for everyone to follow and maintain. This means you should stay away from implementing processes that involve lots of manual work. Automation is key and that’s why the things NodeGraph can do for you are so unique.

“One of the main things that NodeGraph contributes with is an outside-in perspective, allowing you to properly understand how your BI environment is tied together. You really need to start with the business-user and go from there because, only then, can you understand if the data is being used in the intended way.

There are a lot of insights that you can produce with NodeGraph, but one of the real edges that we deliver is that you can follow the data all the way out to the endpoint where it’s being consumed. That way, you can determine whether the governance framework is being followed end-to-end, ensuring that you are in control of your data and can trust that it comes from the right data sources.”

Oskar Fahlvik, Director at NodeGraph Inc

Data governance processes to cover

Here are some ideas to help you get started with including BI in your data governance framework.

#1 Data quality management

Let’s start with an example. We have a use case where a company was loading data every Monday morning and needed an up-to-date report. Every now and then, the data warehouse wasn’t able to deliver the data in time. So they added NodeGraph’s Data Quality Manager on top of their BI solution and began automatically testing every Monday morning to see whether the data was up to date. If it wasn’t, they could be proactive in their response. Because, again, we want to tell the user that the data is not up to date before the end user discovers it themselves. So if we can proactively send out an email from NodeGraph saying, “Hey, sorry, but today the data is not up to date, but it will be at this given time”, you don’t have to spend your Monday morning reading an inaccurate report.

NodeGraph Data Quality Manager testing Qlik

NodeGraph Data Quality Manager allows users to set up tests at any chosen time to ensure that their data has been loaded correctly
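Outside of NodeGraph, the underlying idea of such a freshness test is straightforward. Below is a minimal sketch in Python that checks when a (hypothetical) weekly extract was last refreshed and raises a warning if it is older than the expected Monday-morning load; the file path, threshold and notification method are assumptions made for the example, not NodeGraph functionality.

```python
import os
from datetime import datetime, timedelta

# Hypothetical extract that the Monday-morning report is built from.
EXTRACT_PATH = "/data/extracts/weekly_sales.qvd"
MAX_AGE = timedelta(hours=12)  # data should have been reloaded this morning

def check_freshness(path: str, max_age: timedelta) -> bool:
    """Return True if the extract was refreshed within the allowed window."""
    last_refresh = datetime.fromtimestamp(os.path.getmtime(path))
    return datetime.now() - last_refresh <= max_age

if not check_freshness(EXTRACT_PATH, MAX_AGE):
    # In practice this would trigger an email or chat notification
    # before business users open their Monday report.
    print("Warning: the weekly sales extract has not been refreshed; "
          "report figures may be out of date.")
```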

Despite the fact that testing your code and its results is standard practice in software development, BI tools have yet to adopt this. It’s not common practice to test BI data. This is probably, again, rooted in the fact that BI is driven from the business side, which is not used to governing or testing its data processes.

We encourage BI users to test their entire solution so that they know that all their data is correct. We have another customer that has data from 150 stores. Every now and then, technical issues arise, e.g. one store doesn’t report its sales numbers correctly. This is hard to spot manually but very easy to test automatically with the baseline testing that NodeGraph’s Data Quality Manager provides.

NodeGraph Data Quality Manager baseline testing example

Use baseline testing to define what your data solution should look like. The Data Quality Manager will automatically identify and notify you if something does not look the way it should.
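To illustrate what a baseline test like this boils down to, here is a small Python sketch that compares each store’s reported sales against a baseline figure and flags stores that are missing or deviate beyond a tolerance. The store data and tolerance are invented for the example and are not tied to how the Data Quality Manager defines its tests.

```python
# Baseline: roughly what each store is expected to report on a normal day.
baseline = {"store_001": 120_000, "store_002": 95_000, "store_003": 87_000}

# Today's load; store_003 is missing and store_002 reports a suspicious zero.
reported = {"store_001": 118_500, "store_002": 0}

TOLERANCE = 0.5  # flag anything deviating more than 50% from its baseline

def find_anomalies(baseline: dict, reported: dict, tolerance: float) -> list[str]:
    """Return stores that are missing or deviate too far from their baseline."""
    anomalies = []
    for store, expected in baseline.items():
        actual = reported.get(store)
        if actual is None or abs(actual - expected) / expected > tolerance:
            anomalies.append(store)
    return anomalies

print(find_anomalies(baseline, reported, TOLERANCE))
# -> ['store_002', 'store_003']
```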

#2 Metadata management

We are also working with an asset management company in New York. A large part of their business concerns buying and selling loan portfolios. The investment industry is also highly competitive. So, if you can easily and efficiently ensure that your data is correct and relay this information to the rest of your organization, you have a huge competitive advantage.

Here it becomes essential that you can show exactly where the data that is presented to you comes from and how it is tied together. And for them, there was no other solution aside from NodeGraph that would help them ensure the trustworthiness of their data.

Read their case study to learn why they chose NodeGraph, how they used our end-to-end automated lineage solution, and what the benefits were.

Compliance is another use case. Typically, this concerns banks, financial institutions, pharmaceutical companies or any other organization with a lot of regulations to comply with. In these cases, you need to be in control of your entire data lifecycle – know where it is coming from, who has transformed the data and how, who has access to it, and who has actually viewed it and in which reports. By using NodeGraph, you can access all this information and schedule automatic reports so that they are always ready for you.

NodeGraph Field Tracker allows you to automatically generate compliance and user access reports

#3 Data change management and impact analysis

It’s hugely beneficial to run impact analysis before implementing changes in your data landscape. For example, if a developer changes a script in the database, what effect will be seen in the BI tool? Which fields and reports are going to be affected, and which stakeholders should we talk to before making any changes to ensure business continuity? There are a lot of steps between the initial input of data and the final output, but ordinarily you never get notified if anything has changed along the way. By performing impact analysis, you can remedy this. Furthermore, when testing your environment, if a test identifies an issue in your solution, you can perform detailed impact analysis with NodeGraph. It shows you end-to-end data lineage for this specific part of the data flow, making it easy to grasp how the error affects the rest of your solution and who is affected by it.

Basically, by employing impact analysis when you start developing, you already know how your solution is going to be affected. This way, you can avoid people calling to tell you that you messed up their data. It’s a simple, streamlined way of doing things.

Perform powerful impact analysis in NodeGraph’s Dependency Explorer
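To sketch the mechanics behind impact analysis: given a field-level lineage graph like the one earlier in this article, the downstream impact of a change is simply everything reachable from the changed field. The Python example below uses a plain breadth-first search over hypothetical field names; it illustrates the idea rather than NodeGraph’s implementation.

```python
from collections import deque

# Adjacency list: an edge A -> B means field B is derived from field A.
lineage = {
    "dwh.dim_customer.segment": ["qlik.sales_app.segment",
                                 "powerbi.churn_report.segment_filter"],
    "qlik.sales_app.segment": ["qlik.sales_app.revenue_per_segment"],
}

def downstream_impact(graph: dict, changed_field: str) -> set:
    """Breadth-first search for every field affected by a change."""
    impacted, queue = set(), deque([changed_field])
    while queue:
        for dependent in graph.get(queue.popleft(), []):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted

# A developer wants to change how 'segment' is calculated in the warehouse.
print(sorted(downstream_impact(lineage, "dwh.dim_customer.segment")))
# -> ['powerbi.churn_report.segment_filter',
#     'qlik.sales_app.revenue_per_segment', 'qlik.sales_app.segment']
```

Every field in that list, and therefore every report built on it, needs review before the change goes live.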

#4 Monitoring, measurement, and documentation

Documentation is also a very important use case. The bigger the organization, the more important it becomes. A lot of our customers use our documentation engine as a way to ensure that they regularly and automatically document their entire solution without having to put hours of manual work into it.

Another example is the GDPR, which we have to comply with in Europe. In this case, you need to be able to present information regarding who can actually access your data. It sounds like the simplest question to ask, but answering it without an automated tool to support you is near impossible. With NodeGraph, all it takes is a couple of clicks to see exactly in which reports a certain field in your BI environment is being used, who has access to these reports and who has actually used them.

Everything ties back to having control over data and understanding how this data is tied together — it’s all about trust, understanding and quality. For example, say a business focuses on five specific KPIs when it comes to driving growth and that these KPIs are not consistent. It’s a very simple and specific issue, but it could have a huge impact on the business as a whole.

Final thoughts

We hope that this article has successfully cemented the importance of including your BI environment in your data governance framework, as a way to ensure that the data and KPIs that define your business success have not been skewed by improper usage and are, in fact, an accurate representation of reality.

Moving forward, we will be referring to this holistic view on data governance as data and analytics governance. Data and analytics governance wholly encapsulates the entirety of your data environment, from data warehouse to final BI visualization.

Again, we urge you to think big, start small and scale fast. Select maybe three or four areas or objectives that are important to your organization, the ones you struggle most with today, and start building your governance efforts as a way to excel in these. Generate a comprehensive data overview to ensure that you properly understand the environment that you are dealing with, and leverage automation where you can. This way, you can ensure quality, scale faster and remain agile and ready for changes when the time comes.

Finally, for more insight into how to get started with your data and analytics governance efforts, we highly recommend you download our Data Governance Framework Roadmap for BI Leaders.

Download Roadmap