QVD files are the backbone of any well designed QlikView application. You store your data in there and read it out again – simple. But, did you know that the QVD header holds some useful meta-data? Did you know that could be loaded into QlikView for analysis? No? Well, read on.
This article was written for QlikView, but the process of reading meta data from a QVD is identical in Qlik Sense. You just need to amend the path to be a lib:// connection. You may also want to look at the QVD Audit script in our free Instant Sense Application.
QVD Files Are XML Files
If you have ever been inquisitive enough to open a QVD file in a text editor you will have seen that the file starts with a chunk of XML. If you haven’t done this previously, why not do it now? As you look through the headers you will see some basic information, such as the time the QVD was created. Further on in the header is the number of rows in the file. You may well be aware that these values can be referred to using a number of functions, for example:
QvdNoOfRecords('c:\QlikView\Data\MyQVD.qvd');
QvdNoOfFields('c:\QlikView\Data\MyQVD.qvd');
Also in the header of the file you will find information on the field structure of the QVD and the lineage of the QVD (the initial SELECT or LOAD statements that went to build the QVD. All in all very interesting to look at in a text editor, but there are no functions to retrieve these things.
Beyond this, the QVD turns into binary for the storing of the data itself.
Retrieving The XML From A QVD
You may be aware that you can load data from an XML file into QlikView. So, it stands to follow that you can load the XML from your QVD into a table. And indeed you can.
When you click the Table Files button in QlikView and select a QVD, QlikView automatically changes to load the QVD data and shows you the content:
Just underneath the QVD radio button is a selection for XML. When you click this a different view of the QVD is shown:
You will see that there are tables you can select for header information, field information and lineage. Field information for a QVD is shown above, you can click on the other tables to see the content of these.
Clicking Finish will insert the code to load from the XML headers into your load script. The script for the field information looks like this:
QvdFields:
LOAD
FieldName,
BitOffset,
BitWidth,
Bias,
NoOfSymbols,
[NumberFormat/Type] as Type,
[NumberFormat/nDec] as nDec,
[NumberFormat/UseThou] as UseThou
FROM [.\Data\MyQVD.qvd] (XmlSimple, Table is [QvdTableHeader/Fields/QvdFieldHeader]);
When you Save and Reload you then have a table showing all the fields in your QVD.
Loading From Multiple Files
This gives us a simple way of getting some QVD information into a QlikView document. But what if we want to view information for a whole folder full of QVDs? Well, we can simply enumerate around them, in a similar fashion as we did with CSV’s in the post How To Convert Drop Folder Files to QVD. Enumerating is done with a for / next loop on a FileList statement. For each file, we want to grab the name and add it as a field. The code (after a bit of additional tidying) then looks like this:
for each vFile in FileList('.\Data\*.qvd')
let vFileName = mid(vFile, index(vFile, '\', -1) + 1, 99);
QvdFields:
LOAD
1 as FieldCount,
'$(vFileName)' as [File Name],
FieldName as [Field Name],
BitOffset as [Bit Offset],
BitWidth as [Bit Width],
Bias as [Bias],
NoOfSymbols as [No Of Symbols],
[NumberFormat/Type] as Type,
[NumberFormat/nDec] as Dec,
[NumberFormat/UseThou] as UseThou
FROM [$(vFile)] (XmlSimple, Table is [QvdTableHeader/Fields/QvdFieldHeader]);
next
If you also want to add in some header information, such as number of rows, you can add another statement within the loop to pull in those values:
QvdTableHeader:
LOAD
1 as QVDCount,
'$(vFileName)' as [File Name],
QvBuildNo as [QV Build No],
CreatorDoc as [QVD Creator],
CreateUtcTime as [Time Created],
SourceFileSize as [Source File Size],
TableName as [Table Name],
RecordByteSize as [Record Byte Size],
NoOfRecords as [Number Of Records]
FROM [$(vFile)] (XmlSimple, Table is [QvdTableHeader]);
For completion you could also pull in the lineage table for each file. You will note that the tables will all associate on the File Name field, as we keep this the same between tables.
With these statements in place you can then create a number of different charts and tables over the data. You can use the resulting app to answer a whole host of questions about your data layer, such as:
- Which QVDs have the most rows?
- Which QVDs have the most fields?
- Which field name is most prevalent across QVDs?
- What QVD creator app created the Customers QVD?
- What on earth did I call the QVD I created from the Balances spreadsheet?
Taking It Further
A recent use case I had for profiling all the QVDs in an implementation was for a server migration. After a period of parallel running we needed to ensure that the data in all of the QVDs was the same on the server being retired to the one being migrated to. By putting an additional loop around these statements to look in two locations, naming the locations in each table and using a composite key (of location and file name); I created an app that profiled QVDs from both servers. A few simple straight tables later and I was able to say with confidence that the QVDs had the same rows on both servers.
As well as being interesting for yourself, you may find that you can turn the application into a useful piece of living documentation for your users. They can find where fields reside in QVDs, particularly useful if you are going to allow them to self serve in Sense, for example.
I hope you find this post useful and it points you towards new ways of making the most of your QVDs.
It gets even more interesting when you start examine the XML part of the qvw.
Indeed Johan. I can recommend QViewer for a quick and simple way of viewing the headers in one QVD at a time. This approach however allows you to do that across many QVDs in seconds.
The best, simplest tutorial I’ve read on this topic. Thanks!
Thanks, Brian! Glad you like it.
Its quite easy and very helpful ………… thank you Steve for such nice tutorial…
Steve,
It was a great tutorial. i am a newbie to qlikview and your script is really easy to understand but i been struggling to understand the field counter you used. i have seen it in lots of places but i could’t find answer for it
Can you please help me with that please.
1 as FieldCounter
was it auto incremental or was it a flag.
Please throw some light on it and help me to understand that.
The counter field is simply a way of adding a value to each row that can then be summed. The reason for this is that the expression sum(ClientCount) is more efficient than count(ClientRef). It can also be more explicit when you have associated tables (where COUNT can give odd results). Hope that helps.
Thanks for this example. Have you seen a way to write custom metadata to a QVD using the STORE command? Basically the reverse of what you have here. I’d like to write custom metadata to existing tags or (better) to new custom tags.
Hi Eric. Thanks for your comment. As far as I am aware it is not possible to write your own custom tags. I would suggest storing a separate meta file alongside the QVD, a simple CSV with the kind of information you want to store. You would then have MyData.qvd and MyData.qvd.meta as two separate files. Apart from the risk of the files being separated I can see very little downside.
Ah, yes. That would work, and it would be pretty simple to handle the case of a missing .meta file anyway. Thanks again!
Hi Steve, is it possible to check datatype (num or text) of a table field inside the script editor?
Generally data types don’t matter too much, as QlikView flexes to deal with whatever it is given.
You could do checks using functions like KeepChar, PurgeChar, Num# or a trick I sometimes uses is multiplying values by 1.
So things like this in your load script can flag different types of data:
if(Alt(FieldName * 1, 0) <> 0, 1, 0) as IsANumber,
if(len(FieldName) <> purgechar(FieldName, ‘0123456789’), 1, 0) as ContainsNumbers,
if(IsNull(Num(FieldName)), 1, 0) as NotANumber,
Some of the functions available, such as Num# you need to test carefully to ensure they always do what you expect – as some results can be counterintuitive.
Sorry for bumping an old topic but have you ever used this metadata to compare against database table information such as number of rows and fields? A project I am working on has a database that is still in development (wonderful for QlikView development!) so I want to track any new fields that are added. Likewise to ensure data quality in terms of any incremental loads that may be in place. As you mentioned above it is a useful piece of automated documentation.
Hi Ralph. I have created an app that persists this information to QVD each day, with a date stamp so daily changed can be spotted. You could do the same against database metadata, without having to first create a QVD. Simply create an ODBC connection and use the SysTables and SysColumns load statements. Hope that points you in the right direction for what you need to create.
Hi Steve,
Really useful post, but i have one question here you mentioned in above script under “QVD Fields” there is a line i.e [NumberFormat/Type] as Type . I know that Type here will show the data type in our application but i am getting data type as Unknow those fields having Varchar as a data type.
Kindly help me regarding this, how to get Varchar as data type(now it is showing as Unknown) in qlikview application
Thanks & Regards,
Reddi Kishor
Hi Reddi,
It would appear that this field has changed in usage over time, looking at a few QVDs now I can see that some have text in here and others numbers. You just need to process that field in the load script, perhaps use an ApplyMap to deal with numbers, a capitalize to tidy it a bit, and an IF statement to fix the UNKNOWN. Perhaps something like:
Capitalize(ApplyMap(‘Map_Type’, [NumberFormat/Type])) as Type,
Map_Type can be an inline load, including UNKNOWN,Varchar.
Hope that helps.
Super useful Technique.
Hi Steve,
This code is very useful, however it did not work when trying to calculate the number of rows.
I am using the following approach:
sub DoDir (Root)
For Each Ext in ‘qvw’, ‘qvo’, ‘qvs’, ‘qvt’, ‘qvd’, ‘qvc’, ‘qvf’
For Each File in filelist (Root&’\*.’ &Ext)
LOAD
‘$(File)’ as Name,
FileSize( ‘$(File)’ ) as Size,
FileTime( ‘$(File)’ ) as FileTime
autogenerate 1;
Next File
Next Ext
For Each Dir in dirlist (Root&’\*’ )
Call DoDir (Dir)
Next Dir
End Sub
Call DoDir (‘lib://Apps’)
Any idea?
Thanks
Hi Ed,
The loading of meta data only works for QVD files, you will not be able to do it for other file types. The approach you have for file size and update time looks fine for that though.
If you only look at .qvd files (i.e. remove the extension loop) and use the code in the blog post above you should be able to load the meta data.
Good luck!
Hi, great stuff :-)
But please could you also show the script for pulling the lineage data?
I don´t see that option when I select XML, and I would really like to see the script generated.
Been googling a lot but can´t find any example script, just links to companies wanting to sell their lineage software…
thanks!
Hi, You will notice that this post was written eight years ago, and in the screen grab there is a lineage XML feed in the QVD. Looking at a number of QVDs now I can’t find a file with this information in, even when looking at QVDs from around that time. There was an option in QlikView to turn the storing of lineage data on or off, and switching it off for some QVDs was a good idea as it could cause a lot of bloat (particularly if you had loops in your generator script). If the information is there (and you can check this in a text editor) it should be easy to get it out, but what you need to be looking at is how to get it stored there – rather than how to get it out. Hope that helps!
Steve