What is Data Lineage?

This one is for all you newbies in the business intelligence industry. Or for those of you who need a quick refresher. Or for those of you who just really love infographics. OK, so everyone is welcome, but why are we here? To explain data lineage and some of the terms that go hand in hand with it. And, considering the visual nature of data lineage, we thought there would be no better way of explaining it than through an infographic. Scroll away!

DISCLAIMER: If you want to read the Insight in its usual presentation, scroll on past the infographic to the bottom of the page.

[Infographic: What is data lineage? (NodeGraph)]


So, what is data lineage? Data lineage refers to the origin of data and the transformations it goes through over time. Basically, data lineage tells the story of a specific piece of data. This allows you to understand where the data comes from, as well as when and where it separates from and merges with other data. Although there are several ways of representing data lineage, visual representations are the most common because they give a simpler overview of the data solution in question.


One of the main benefits of having access to your data lineage is a better understanding of your data solution. To govern and understand the data within your organization, you need to know where it comes from and how it connects with other pieces of data. This is why data lineage and data governance (a term we define below) are so closely connected.

As a result, data lineage also makes it simpler to identify the source of any issues in your data solution, because you can trace a faulty value back through each step it passed through. Representing the lineage visually helps further, since you can see at a glance where the data flows.

Types of lineage

Backward data lineage means starting from the data’s end use and tracing it back to its source. Forward data lineage, on the other hand, begins at the source and follows the data through to its end use. End-to-end data lineage combines the two, covering the entire solution from the data’s source to its end use.
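
For the more hands-on readers, the three types of lineage can be sketched as walks over a small graph. This is only a minimal illustration in Python, not how any particular lineage tool works; the table and dashboard names (crm.customers, mart.sales and so on) are made up for the example.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges (upstream, downstream); all names are illustrative.
EDGES = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "mart.sales"),
    ("staging.orders", "mart.sales"),
    ("mart.sales", "dashboard.revenue"),
]


def build_index(edges):
    """Index the edges in both directions so we can walk either way."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return upstream, downstream


def trace(start, neighbors):
    """Breadth-first walk returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


UPSTREAM, DOWNSTREAM = build_index(EDGES)


def backward_lineage(node):
    """From an end use back to every source that feeds it."""
    return trace(node, UPSTREAM)


def forward_lineage(node):
    """From a source to everything that depends on it."""
    return trace(node, DOWNSTREAM)


def end_to_end_lineage(node):
    """Both directions combined, including the node itself."""
    return backward_lineage(node) | {node} | forward_lineage(node)
```

Calling backward_lineage("dashboard.revenue") answers "where did this number come from?", while forward_lineage("crm.customers") answers "what breaks if this source changes?".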

Data governance

Quite simply, data governance is the ability to govern your data. A little more specifically, it refers to the ability to create and maintain high-quality data, set up in a way that gives you full control over it. Data governance structures vary from organization to organization, but what they have in common is that they must comply with data protection laws.


Metadata

Metadata is data that describes other sets of data. It’s that simple.
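
To make "data describing data" a little more concrete, here is a small sketch in Python. The TableMetadata class, its fields and the sample customer rows are all assumptions made up for this example; real metadata catalogs track far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Describes a table without holding any of its rows (hypothetical fields)."""
    name: str
    source_system: str
    columns: dict = field(default_factory=dict)  # column name -> type name
    row_count: int = 0


# The data itself (illustrative sample rows).
customer_rows = [
    {"customer_id": 1, "country": "SE"},
    {"customer_id": 2, "country": "US"},
]


def describe(name, source_system, rows):
    """Derive metadata from the rows: column names, types and row count."""
    columns = {}
    if rows:
        columns = {key: type(value).__name__ for key, value in rows[0].items()}
    return TableMetadata(name, source_system, columns, len(rows))


# The metadata: facts about the data, not the data.
meta = describe("customers", "crm", customer_rows)
```

Note that meta contains no customer at all, only the shape, size and origin of the table.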

Data Transformations

Finally, data transformations are any changes made to the format or structure of your data.
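
As a rough illustration of the difference between a change in format and a change in structure, consider the Python sketch below. The order records and the function names are invented for the example and do not come from any specific tool.

```python
from datetime import datetime

# Hypothetical raw records as they might arrive from a source system.
raw_rows = [
    {"order_id": "1001", "order_date": "03/15/2021", "amount": "19.90"},
    {"order_id": "1002", "order_date": "03/16/2021", "amount": "5.10"},
]


def change_format(row):
    """Format change: same fields, but typed values and ISO dates."""
    parsed_date = datetime.strptime(row["order_date"], "%m/%d/%Y").date()
    return {
        "order_id": int(row["order_id"]),
        "order_date": parsed_date.isoformat(),
        "amount": float(row["amount"]),
    }


def change_structure(rows):
    """Structure change: collapse order rows into one total per date."""
    totals = {}
    for row in rows:
        date = row["order_date"]
        totals[date] = totals.get(date, 0.0) + row["amount"]
    return totals


formatted = [change_format(row) for row in raw_rows]
daily_totals = change_structure(formatted)
```

Each of these steps is exactly the kind of change that data lineage keeps track of: daily_totals came from formatted, which came from raw_rows.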

We hope we have answered the question “what is data lineage?” and that you have learned a thing or two! If this hasn’t quite quenched your thirst, you can read more about our data lineage tool on our Solutions page.

If you have any requests for future infographics, let us know in the comments!

If you are new to NodeGraph, first of all – welcome. We are a data quality platform for QlikView and Qlik Sense. To find out how we can help you take control of your data, get started today.