In previous articles I have mentioned how critical it is to ensure your loads from QVD are optimised, but have not gone into the detail of how to do this. This post rectifies that. Here I explain what optimised loads are, why you should use them and how to perform them.
The headline-grabbing fact here is that optimised QVD loads are up to 100 times quicker than non-optimised ones. That makes a big difference if you are sitting watching a reload dialog, and it makes just as much difference to server performance when your reloads run on a schedule.
The reason for the vast difference is the much-publicised compression algorithm that QlikView uses when storing data for in-memory analysis. QVD files are stored in a format that mirrors the compression used in memory (which is why QVD files are so small on disk), and during an optimised load the data is sent straight from disk to memory in that same compressed format. When a non-optimised load is performed this is not the case.
So why not make all loads from QVD optimised? The simple fact is that some operations require the data to be unpacked, modified and then re-packed. This significantly slows the process. Just about any change to the data on the way out of the file and into memory will cause a load to be non-optimised.
Some examples of things that will cause a non-optimised load are:
- Adding new fields to the table
- Deriving new values from a field in the QVD
- Retrieving a field twice
- Most WHERE conditions
- Joining to an existing in memory table
- Loading data into a mapping table
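To make those triggers concrete, here is an illustrative snippet (the table, field and file names are hypothetical) where each commented line would, on its own, be enough to force a non-optimised load:

```
// Each of these changes breaks optimisation – illustrative only:
Sales:
LOAD
    OrderID,
    Amount,
    Amount * 1.2   as GrossAmount,   // deriving a new value from a field
    'Imported'     as SourceFlag     // adding a new field to the table
FROM Sales.qvd (qvd)
WHERE Amount > 100;                  // a general WHERE condition
```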
In contrast, the things you are allowed to do are:
- Rename fields
- Omit fields
- Do a simple one-field WHERE EXISTS on a field returned in the record set
This sounds hugely restrictive, but most things you would want to achieve can be coded around. For example, if you need to add fields, do this in the QVD generator routine rather than when reading the QVD. Similarly, if you need to derive a value, do that when you generate the QVD as well. Even complex WHERE statements can be handled by deriving flags or composite keys in the QVD generator and then doing a simple WHERE EXISTS against a temporary table (even if that temporary table is just a single row from an in-line table).
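As a sketch of that last technique (all names are illustrative), suppose the QVD generator has already derived an IsCurrentYear flag on each row. A document can then filter on the flag with a one-row in-line table and a simple Exists, and the load should stay optimised:

```
// In the QVD generator (run earlier) the flag was derived, e.g.:
//   If(Year(OrderDate) = Year(Today()), 1, 0) as IsCurrentYear

// In the document – a one-row in-line table supplies the value to match:
FlagFilter:
LOAD * INLINE [
    IsCurrentYear
    1
];

Sales:
LOAD
    OrderID,
    Amount   as SalesAmount,    // renaming is allowed in an optimised load
    IsCurrentYear
FROM Sales.qvd (qvd)
WHERE Exists(IsCurrentYear);    // simple one-field Exists – stays optimised

DROP TABLE FlagFilter;          // the helper table is no longer needed
```

The rows with IsCurrentYear = 0 are rejected because the value 0 has never been loaded, while 1 was supplied by the in-line table.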
In fact, using optimised QVD loads with a WHERE EXISTS clause on each subsequent load statement is a simple but effective way of quickly building documents which contain related subsets of data – but that is something for another post.
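A minimal sketch of that idea (table and field names are assumptions): once the first table defines the subset, each later load keeps only related rows, and remains optimised because the WHERE is a simple one-field Exists.

```
// First table defines the subset of interest:
Orders:
LOAD OrderID, CustomerID, OrderDate
FROM Orders.qvd (qvd);

// Subsequent loads keep only rows whose key already exists,
// so only related detail rows come into memory:
OrderLines:
LOAD OrderID, ProductID, Quantity
FROM OrderLines.qvd (qvd)
WHERE Exists(OrderID);
```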
So, how do you know if your load is optimised? Well, the first way is by noticing it is still running when you return to your desk with a fresh cup of coffee. The other is by checking the load progress dialog. Optimised loads show the text "qvd optimized" as the data is being pulled from the QVD – in contrast, no message is shown when the load is non-optimised.
Always look out for that text; if it is not there when loading from a QVD, it is worth reviewing your load script to see whether the load can be made optimised.
The incredible speed of an optimised load becomes truly essential when you build an incremental load strategy, where fresh data from source databases is combined with data previously stored in QVDs. If the retrieval of the old data is not quick and efficient then the whole point of the incremental load is eroded.
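A skeleton of that pattern is sketched below. The table, field and variable names (and the ModifiedDate filter) are assumptions for illustration; note too that whether the NOT Exists clause keeps the QVD portion optimised has varied between releases, so check the progress window for the "qvd optimized" text in your own environment.

```
// 1. Pull only new or changed rows from the source database:
Sales:
SQL SELECT OrderID, Amount, ModifiedDate
FROM dbo.Sales
WHERE ModifiedDate >= '$(vLastExecTime)';

// 2. Append historic rows from the QVD, skipping keys just reloaded:
Concatenate (Sales)
LOAD OrderID, Amount, ModifiedDate
FROM Sales.qvd (qvd)
WHERE NOT Exists(OrderID);

// 3. Write the combined table back, ready for the next run:
STORE Sales INTO Sales.qvd (qvd);
```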
It is worth noting here, however, that a non-optimised load from a local QVD file will still typically be much, much faster than a load from any other data source. Sometimes non-optimised loads cannot be avoided (or the development required to avoid them is not worth the time saved).
Hopefully this article has given you the information you need to make sure your loads are optimised, or to make an informed decision to allow non-optimised loads.
As I have said in previous articles the back end of your QlikView document is at least as important as the front end – and ensuring optimised loads is an important part of getting the back end correct.