5 data quality lessons from QlikView Cookbook
As some of you might already know, we recently had the huge honor of hosting Rob Wunderlich, the Qlik genius behind the QlikView Cookbook, at our headquarters. During the visit, our very own Head of Sales, Carl, had a sit-down discussion with Rob about data quality and automated testing in Qlik. The following text is a summary of said chat in the form of 5 data quality lessons. You can check out the full discussion with Rob here.
1. Data quality is the measure of how accurately your data reflects the physical reality of your business
If you’ve read any of our Data Quality Guides, you’ll know that there are many different definitions and criteria when it comes to data quality. But we think that Rob summed it up perfectly:
“So it means that the data accurately reflects … the physical reality of the business – whether that’s bank balances, shop floor, number of people working, [etc.]. So it’s complete, it’s accurate, it’s current, [and] contemporary.”
2. “Data quality is the goal and testing is how you get there”
Some of you might wonder how automated testing and data quality are related. Again, this was wonderfully summed up by Mr. Wunderlich during our discussion:
“Data quality is the goal and testing is how you get there – testing is the tool that takes you to that goal.”
The sentiment here is that testing, i.e. quality-assuring your Qlik solution according to your specific guidelines, is a great way to achieve high-quality data.
3. Testing comes in many shapes and sizes
While other types of data testing exist, Rob walked through the ones most commonly used within Qlik:
“Testing is a very broad topic… So [when] we’re talking about testing your Qlik solution and the data delivered from that, [we’re] talking about some very specific kinds of tests. One [type] is regression testing – [answering questions like] ‘Is the data still correct?’, ‘Does it return the correct results?’. Once we’ve validated the initial installation of the dashboard or the solution, we want to know, on an ongoing basis, is it still correct?
Secondly, [we have] the currency of the data – ‘Is it being refreshed on a regular basis and is that refresh delivering results in terms of [completeness and accuracy]?’… Those are the two major kinds of testing. There’re also the Qlik scalability tools which are excellent tools for load testing. You can also do regression testing, [but] load testing is testing your environment to see how many users you can get on there. It’s not really a data quality type of test – it’s a capacity test for planning.”
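To make the two data quality tests Rob describes a little more concrete, here is a minimal Python sketch of a regression check and a currency (freshness) check. It is an illustration under stated assumptions, not Rob's method or any Qlik API: the function names, the metric names, and the idea that you have already exported the current figures and the last reload time from your Qlik app are all hypothetical.

```python
from datetime import datetime, timedelta


def regression_check(current, baseline, tolerance=0.0):
    """Compare current metric values against a previously validated baseline.

    Returns a dict of metric name -> (expected, actual) for every metric
    that is missing or differs from the baseline by more than `tolerance`.
    An empty dict means the data is "still correct" by this definition.
    """
    failures = {}
    for name, expected in baseline.items():
        actual = current.get(name)
        if actual is None or abs(actual - expected) > tolerance:
            failures[name] = (expected, actual)
    return failures


def is_fresh(last_reload, now, max_age):
    """Return True if the last data reload is no older than `max_age`."""
    return now - last_reload <= max_age


# Hypothetical figures: the baseline was validated when the dashboard went live.
baseline = {"total_sales": 1000.0, "order_count": 25}
current = {"total_sales": 1000.0, "order_count": 24}

failures = regression_check(current, baseline)
fresh = is_fresh(
    last_reload=datetime(2024, 1, 1, 6, 0),
    now=datetime(2024, 1, 1, 12, 0),
    max_age=timedelta(hours=24),
)
print(failures, fresh)
```

In this made-up example the regression check flags order_count, because it no longer matches the validated value, while the freshness check passes because the reload is six hours old against a 24-hour limit. Load testing is deliberately left out, since Rob treats it as capacity planning rather than a data quality test.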
4. Confidence and scalability are key benefits when it comes to automated testing
“I can think of two significant benefits of a test-driven approach. One is confidence – it closes that gap between the CEO and the person who produced and validated [the data, giving us] a common tool that everybody agrees on. The second one, [that] a lot of people don’t quite realize, is scalability. You can produce more work with the same amount of people … and you’re not spending time running around trying to correct things. You have confidence and you can scale up.”
5. Complexity is the main barrier to adoption
Our final data quality lesson touches on the low level of adoption of test-driven approaches within the Qlik community. When asked why they aren’t more common, Rob answered:
“[It’s] primarily due to the complexity of implementing a software-driven test approach in Qlik. It is very difficult in QlikView, much easier and possible in Qlik Sense, and there are tools like NodeGraph that are doing a great job at that. I also think that Qlik developers are generally reporting operations type people who don’t come from a testing background. People who write software for a living, for example, may have been exposed to various concepts and tools around testing, but the development audience that creates Qlik has not necessarily been exposed to that.”
To find out more about Rob Wunderlich, head over to his renowned blog, QlikView Cookbook.