Analysis in QlikView is lightning quick: millions of rows can be aggregated and manipulated with very little latency. Getting those millions of records off a server elsewhere on your network – now that can be a different matter. This is where a robust QVD strategy and QlikView Incremental Loads become crucial.
This post was written prior to the release of Qlik Sense but the way that QVDs can be used for incremental loads is exactly the same across both products.

QlikView Incremental Loads
An incremental load in QlikView is when fresh data is loaded from a source database, whilst previously retrieved data is loaded from a local store. Stored data is typically in QVD files or a QVW which is used for a binary load. The same technique can also be used in Qlik Sense, and the use of QVDs is best practice in both applications. There are various approaches to incremental loads, and careful consideration should be given to picking the right one.
I Want My QVD
It is always pushed as the ‘correct’ way of doing things to have a QVD layer in any QlikView implementation. In practice there are very few scenarios where a QVD layer does not make things easier. Even if you are not pre-processing the data once before using it in multiple QlikView apps, the ability to logically segregate data extraction and manipulation from data presentation can make development easier and less prone to errors. The incredible speed at which data can be loaded from QVD means that unit development and testing of just part of the load process, before refreshing presentation apps, can be done without lots of waiting around. I’ve blogged on getting data design right previously (Perfect Your Data Model) and on the advantages of Optimised QVD loads – so I won’t repeat myself here.
Suffice to say though, whilst QVDs are a ‘good idea’ generally, when it comes to incremental loads they are essential.
Why Do Incremental Loads?
As BI apps are expected to deal with larger and larger amounts of data, the amount of time that it takes to retrieve that data becomes a serious issue. This could be due to the sheer volume of data or the need for frequent refreshes. Either way, you do not want to be pulling all of the data all of the time. What you want to be able to do is pull just the data that has changed, append to that the data that you stored away previously, and then get back to the business of analysing. This will reduce load on the source database, the network infrastructure and your QlikView server.
There are many different ways of implementing incremental loads in QlikView, and the choice of approach should depend entirely on the data that you are extracting.
Here are just a few examples of the approaches you can use.
Date Stamped Append Only Table
The easiest type of data to build an incremental load over is a table where new rows can only be appended to the end and old rows will never change. One scenario with these tables is that a date stamp is written to each row. Examples of these types of table are system logs or records of transactions that can’t be amended – only reversed with subsequent transactions.
If the data volumes here are not massive, a kind of pseudo incremental load can be done, using the fact that QlikView can load multiple files with a wildcard (as I’ve shown in this Video posting). Each time a data extract is done, a WHERE clause can be added to the SQL SELECT to only bring back the current day (or month, depending on volumes). When the QVD is written the date can be encoded into the filename, eg. TransactionHistory_20131225.qvd.
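A minimal sketch of that extract step (the table, field and file names are illustrative, and the date literal in the SQL WHERE must match what your database expects):

let vToday = date(today(), 'YYYYMMDD');

// The WHERE below is passed through to the database, so the date
// literal format is database-specific
TransactionHistory:
LOAD *;
SQL SELECT *
FROM TransactionHistory
WHERE TransactionDate = '$(vToday)';

STORE TransactionHistory INTO TransactionHistory_$(vToday).qvd (qvd);
DROP TABLE TransactionHistory;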
When data is then extracted from the QVD it can be loaded with a wildcard, eg. TransactionHistory_*.qvd. This approach is very simple, and avoids having to concatenate QVDs. You do however need to be careful to ensure that data at the end of one period is not missed because you have moved on to the next period before it was pulled. The best way to ensure this is to loop through multiple periods, for example with each load refreshing both the previous day’s file and the current day’s (or perhaps only doing this until 2am if loading hourly). Data volumes and refresh intervals will naturally inform decisions as to what is the most sensible approach here.
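A sketch of that overlap, re-extracting the previous day as well as the current day on each run, then reading everything back with the wildcard (names as in the sketch above):

// Re-extract the last two days so late-arriving rows are not missed
for i = 1 to 0 step -1
    let vDate = date(today() - $(i), 'YYYYMMDD');

    DayData:
    LOAD *;
    SQL SELECT *
    FROM TransactionHistory
    WHERE TransactionDate = '$(vDate)';

    STORE DayData INTO TransactionHistory_$(vDate).qvd (qvd);
    DROP TABLE DayData;
next i

// Pull all of the daily files back in with a single wildcard load
Transactions:
LOAD *
FROM TransactionHistory_*.qvd (qvd);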
With anything involving dates make sure that you give proper consideration to timezones and daylight saving times – as these can cause problems if you are not careful.
Append Only Table With A Unique Sequential ID
If the table again can only have rows appended and there is a unique sequential ID then a different approach can be used. The sequential ID could be a system generated number added by the database engine, or it could again be a date – if dates are recorded to the millisecond and it is not possible to have more than one transaction in a split second.
The first approach was straightforward, but it potentially meant data was being pulled many times during the period each QVD was for, and it also meant that many QVD files were being created. The multiple files would take more space on disk and load slightly slower than if all data was included in a single file. Both of those negative aspects can be avoided by always pulling data from a specific point in the table.
To use this approach the SQL WHERE clause will only pull records where the ID is greater than the last record that was pulled the previous time. This value can be read from a config file created at the end of the previous run. When all data from the defined point forward has been read in, a Peek statement can be used to find the latest ID (ensure rows are read in ID order when using this approach); this can be put into a variable, then a table, and finally written to a CSV file with a STORE statement.
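A minimal sketch of that mechanism, assuming a numeric ID and illustrative file and field names (the first run, where no config file exists yet, is not handled here – see Final Considerations below):

// Read the high-water mark written at the end of the previous run
LastID:
LOAD MaxID
FROM LastID.csv (txt, utf8, embedded labels, delimiter is ',');

let vLastID = peek('MaxID', 0, 'LastID');
DROP TABLE LastID;

// Pull only rows beyond that point, in ID order so Peek(-1) is the latest
Transactions:
LOAD *;
SQL SELECT *
FROM Transactions
WHERE ID > $(vLastID)
ORDER BY ID;

// Persist the new high-water mark for the next run
let vMaxID = peek('ID', -1, 'Transactions');

MaxID:
LOAD $(vMaxID) as MaxID
AUTOGENERATE (1);

STORE MaxID INTO LastID.csv (txt);
DROP TABLE MaxID;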
A similar approach to the first example could be used when the new data is pulled, writing it into its own QVD with a suffix – perhaps the last ID in that batch. A better approach may however be to append the previously saved data to the new data – resulting in a single, growing QVD. As the previously created QVD will be in exactly the same structure, a simple load from QVD will concatenate to the new data, creating a single table in memory. This in-memory table can then be stored back over the top of the existing QVD. As we are being very precise and careful with the ID we are loading from and up to, we can be confident that the resulting QVD will have each row only once and not be missing any rows.
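In script terms, continuing the sketch above (and assuming Transactions.qvd already exists from an earlier run):

// Same structure, so this appends the history to the fresh rows
Concatenate (Transactions)
LOAD *
FROM Transactions.qvd (qvd);

// Write the combined table back over the old file
STORE Transactions INTO Transactions.qvd (qvd);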
Again, with this approach you need to ensure your assumptions about the data and the ID are correct, and take care with the appending. If the size of the dataset allows, it is always good to have a fallback routine that does pull all data from source – perhaps run over the weekend – to ensure data consistency.
Free Form Data With Modification of Existing Rows
Unfortunately we are often not living in a simple world where IDs are sequential and any row, once created, cannot be changed. Often changes to rows can be made at any time, and the data we have in our QVDs can become stale and need to be replaced. This can still be dealt with using a QlikView incremental load; it just needs a bit more work. Two things are required to allow us to do this: firstly a timestamp or audit table to say something has been changed, and secondly a unique key. If the database provides a unique key then this can be used; if not, one may need to be created by concatenating fields until uniqueness can be guaranteed (perhaps combining Operator Code and DateTime, for example – if you know an operator can only make one change per second). This key will be used later.
As with all the other incremental loads described, the SQL statement used to extract the data will have a WHERE clause on it. This time it will be to return rows where the timestamp is since a specific point in time. Unlike the previous load, where we had to be very precise with the point we were loading from, we can allow ourselves some slack here – perhaps on a daily load you may choose to always pull four days of data, so that if the load failed for a couple of days it would still backfill the data that had been missed when it next ran successfully.
When the data has been loaded we will have some brand new rows and some that exist in the QVD we have already built and been adding to previously. We still want to do an append from the QVD to concatenate to the new data – but this time we will explicitly exclude any rows that we have just pulled from the database with a WHERE statement in the QlikView load. The syntax is simply WHERE NOT EXISTS (IDField).
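In script form, with illustrative table and field names, that looks something like this:

Orders:
LOAD *;
SQL SELECT *
FROM Orders
WHERE LastModified >= '$(vFromDate)';

// Append history, skipping any ID that was just pulled from source
Concatenate (Orders)
LOAD *
FROM Orders.qvd (qvd)
WHERE NOT EXISTS (OrderID);

STORE Orders INTO Orders.qvd (qvd);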
Only rows where the ID is not already found will be loaded from the existing QVD. This is why uniqueness of ID is critical, as even if the ID exists twice in the source it will only be kept once in QlikView. Again, after the data has been combined it can be stored back over the existing QVD.
This approach is perhaps the most robust of the incremental load methods I have described here – and it is certainly the way I have implemented incremental loads most frequently out on client site.
Final Considerations
As with any system build, due consideration has to be given to incremental loads. A significant amount of bullet proofing may be required around the solution. For instance, you should code a check to see whether the existing QVD is there before attempting to append to it, so that the code runs first time, before a historical file exists. Make sure there is a plan in place for dealing with database or network failures. A flag at the start of the load that changes it from an incremental load to a full one is a good ‘get out of jail free’ trick: if there is any concern about the integrity of the data, it can be pulled in its entirety from source. If data volumes, or the existence of historical data beyond a certain point (eg. where logs get purged from the source system but are required in QlikView), mean that a full refresh is impossible, it is absolutely critical that QVD files are backed up. The creation of a backup can even be written into the QlikView load script – so there is no reliance on an external process.
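A sketch of that bullet proofing around the load above (file and field names are again illustrative):

let vFullLoad = 0;   // flip to 1 as the ‘get out of jail free’ full refresh

if $(vFullLoad) = 1 or Alt(FileSize('Orders.qvd'), 0) = 0 then
    // First ever run, or integrity concerns: pull everything from source
    Orders:
    LOAD *;
    SQL SELECT * FROM Orders;
else
    // Normal run: incremental pull, then append history from the QVD
    Orders:
    LOAD *;
    SQL SELECT * FROM Orders WHERE LastModified >= '$(vFromDate)';

    Concatenate (Orders)
    LOAD *
    FROM Orders.qvd (qvd)
    WHERE NOT EXISTS (OrderID);
end if

STORE Orders INTO Orders.qvd (qvd);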
Two final points that are important anywhere, but have specific relevance to incremental loads: ensure that loads from QVDs are Optimised wherever possible (Google this if you are not aware of Optimised vs. Non Optimised loads), and if you are loading from multiple sources into a single table, ensure that the columns from all sources are identical (if they are similar but not identical, a massive synthetic key can be created and things can fall over very quickly).
Hopefully this has given food for thought as you go on and design your own incremental data loads.
Hi Steve,
I am happy that I could clarify all my doubts from your posts. It’s really informative. I have a query here: how can we do the incremental setup without timestamps?
Regards,
Sivaraj
Hi Sivaraj, if you have a unique sequential ID then this can take the place of a timestamp. Any unique ID will be useful – but you need some way of knowing if a record is fresh – perhaps account postings in the last two months, for example.
Hi Steve,
I have one Oracle table. In that table there are a few columns, for example:
ID (Number)
Name (Text)
Description (Text)
RowVersion (Number)
Now I want to implement an incremental load on the above table. It means I want to get new records and modified records. There is no date column here.
I know one way using a modified date column. But here, once any record changes, the RowVersion column will be incremented by one.
Can you help me to solve this problem?
Thanks
Hi Dipak,
When the source tables don’t have a last updated date on them, it is usually best to create a database trigger to populate a separate table with ID and DateUpdated that can then be joined to when doing the select. If the majority of rows never get updated in your database you could simply take rows with IDs later than the last ID, plus any with a version greater than one. This would only work if updates of previous rows were rare (eg. general ledger), as you would have to pull all amended rows on every run.
Hope that helps.
Hi Steve,
Thank you for reply.
I will try this.
Thanks
Hi Steve,
I am having the same doubt. In my scenario the table update date is not present, but a creation date and primary key are there, so how do I identify whether my quantity has been updated or not?
For example, the fields are:
SOI_SYS_ID,
SOI_CR_DT
SOI_SOH_SYS_ID,
SOI_PII_QTY_BU,
SOI_PAI_QTY_BU,
SOI_DNI_QTY_BU,
SOI_INVI_DN_QTY_BU
Please let me know how to get newly inserted and updated rows.
Hi Sumita – if you have only a create date and no update date then you will need to create something in the database to identify modified rows. The simplest way may be to create an ON UPDATE trigger on the table and then use that to stamp an updated date on the record. If you are not able (or would rather not) to write back to the original table in a trigger then you could use the trigger to write to an audit trail table that you could then join to in order to identify changed rows since a date. Hope that helps!
Hi Steve,
Thanks for your reply.
Hi Steve,
Right now my incremental load is working for newly inserted and updated records.
Can you please explain me how to handle deleted records?
Thanks
Dipak
Hi Dipak, deletes can often require a bit more thought with regard to incremental loads. If pulling a complete list of all live ID’s is relatively quick you could pull this list and then load from QVD with a WHERE EXISTS on that field. A more efficient approach might be to pull a list of deleted ID’s and load from QVD with a WHERE NOT EXISTS. If you can not get a list of deleted ID’s (perhaps because it is a physical rather than a logical delete) then you could look at creating a ON DELETE trigger in the database to write the deleted IDs to a table for use in the load.
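A sketch of the first of those approaches, with illustrative names:

// Pull just the current list of live keys from source
LiveIDs:
LOAD OrderID;
SQL SELECT OrderID FROM Orders;

// Keep only historical rows whose key still exists in the source
Orders:
LOAD *
FROM Orders.qvd (qvd)
WHERE EXISTS (OrderID);

DROP TABLE LiveIDs;
STORE Orders INTO Orders.qvd (qvd);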
Hi Steve,
I’m using the “Append Only Table With A Unique Sequential ID” method; my unique sequential ID is a date. My question is related to the size of the QVD files: I’m using 5 fact files and one of those is now 1GB in size (after 2 months). In the final project I just load the last 5 days and the QVW is around 20 MB (which is faster still). The question is: can QlikView manage QVD files greater than 10 GB? Is that not a risk? I have been working this way for 2 months and for me it is great, but I’m worried about the QVD size.
Best Regards
Silvio
Presently you are able to load up to 2,147,483,647 unique values into a field in QlikView. This should give you plenty of scope! QlikView can manage files that are larger than 10GB – you just need to keep close tabs on how much memory you have in your server. If you load all of a QVD it will take up as much space in memory as it does on disk – and require a little more during the load process.
If you need to claw back some memory then not loading columns that are not absolutely required is the best start point (see Rob Wunderlich’s Document Analyzer for help with identifying these). There are also tricks for reducing size by reducing the number of unique values (eg. by splitting DateTime stamps into two columns or rounding numbers).
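As an example of the timestamp trick (field names are illustrative, and note that the derived fields stop this particular QVD load from being optimised):

Events:
LOAD
    EventID,
    Date(Floor(EventTime)) as EventDate,    // far fewer unique values
    Time(Frac(EventTime)) as EventTimeOnly  // values repeat across days
FROM Events.qvd (qvd);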
If performance starts to become an issue then look at ways of addressing that before it turns into a real problem – there is plenty of advice on-line relating to that.
Hope that helps!
Thanks Steve. Your feedback has been useful for me. The life cycle of those fact Tables is around 1 year, so my issue is solved!
Regards!
Hi Steve,
I am using QlikView Personal Edition and I am facing some problems:
1) I am not able to perform ORDER BY on a resident table (in the scripting area).
2) Can we perform concatenation when we are fetching data from a database in the SQL script?
Hi there,
If the order is important then you can sort the data as it comes in from the original SQL source. More often than not the order of the data in the load is not relevant (unless you are using PEEK) as you can sort in the front end.
There are many ways of performing concatenation – I talk about some on this blog. If you are using SQL then you can perform a UNION in the original SQL load or use the CONCATENATE statement in QlikView.
Hi,
You can use ORDER BY only in a resident load. If you want to use ORDER BY, load the QVD data into a temporary table, sort it with a resident load, and then drop the temporary table.
Resident loads can be very slow; I would recommend not using them unless you really need to. The order that data is loaded in is very rarely an issue.
Hi Steve.
I have QlikView Personal Edition and in that edition I am facing some problems.
Could you kindly tell me how to do QVD partitioning in QlikView?
Hi Dhananjay, I’m not sure what you mean by partitioning your QVD. Can you please elaborate?
Hi Steve,
Greetings to you….
I am very new to QlikView and need to do an incremental load for my work.
The data source is SQL Server 2012 and I need to load the data on a weekly or monthly basis.
There are no unique fields like you said in the second part of your post.
Could you please tell me a step by step process for creating this incremental load?
Thanks a lot in advance.
Bala
Hi Bala,
If the data is written on an append only basis then the lack of a unique key is not a problem – you just need a date field to chunk the files up by – eg. pull each month into files such as SalesData_201310.qvd and on each load only refresh the last month or two. If historical data can be changed then you will need a unique key in order to make things work. As there is not one in the data you will need to create one – if there is a created date/time stamp that will be a good start; if it is possible that two rows can be written at the same time then append to that as many other fields as you need to make a unique key. Make sure you validate your key over as much historical data as you can – count (or better still sum a counter field) over the concatenated key and there should be no values where the count is more than one.
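A sketch of building and validating such a composite key (field names are illustrative):

SalesData:
LOAD
    CreatedAt & '|' & OperatorCode & '|' & TillNumber as RowKey,
    *
FROM SalesData.qvd (qvd);

// Validation: any KeyCount above 1 means the key is not yet unique
KeyCheck:
LOAD
    RowKey,
    Count(RowKey) as KeyCount
RESIDENT SalesData
GROUP BY RowKey;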
Hope that helps.
When I use the code as shown to do the LOAD for the Temp_ID: table, it fails with a “missing FROM” error, and the subsequent STORE fails as it cannot find the Temp_ID table.
If I use a LOAD * INLINE I can get it to work but was wondering am I doing something wrong?
Should the CSV file have just a single entry containing the Max_ID value? If I use the INLINE option I get two lines in the CSV, with the second line being AUTOGENERATE(1).
My best guess here is that the variable vMaxId is empty at the point you try to load the AUTOGENERATE table. Check for this using Debug Mode in the script editor. You will need to put a trap in there to not do the increment on the first run – as there will be no ID to check for.
Hope that helps.
Not sure if this affects the outcome but I am using the desktop edition.
The variable does have a value set, it displays correctly in a text box after the script finishes.
So rather than setting it with the PEEK command, I just set it with a LET command:
let vMAX_ID = '000000000000AC0B';
and it fails.
But it runs if I set it as:
let vMAX_ID = '1234567890';
Although the CSV file that is created only has a single cell with the text vMAX_ID rather than the value '1234567890'.
I got this to work. What I did was this: I removed the following line, so I wasn’t storing the value in a variable:
let vLastIncidentID = PEEK('FieldName', -1, 'TableName');
Then added the PEEK command directly into the LOAD section:
TempIncidentID:
LOAD
PEEK('FieldName', -1, 'TableName') as LastIncidentID
AUTOGENERATE(1);
STORE TempIncidentID INTO [c:\shared\LastIncidentID.csv] (txt);
The CSV file is created as expected.
Hi Mark,
The reason this failed before you moved the Peek into the LOAD statement was that the load was expecting an integer value (hence no quotes) and the string value caused it to fail. Adding quotes around the variable would make it work.
Whilst what you have there should work okay, you need to make sure that the values that come out in the ID field are unique and sequential – if they do not sort correctly alphabetically it will all go horribly wrong. You may need to find another field (eg. a date stamp) to load from, and use the other ID to ensure uniqueness – with the WHERE EXISTS.
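For example, a minimal sketch of quoting the variable so that a string ID loads correctly (using the names from Mark’s script above):

TempIncidentID:
LOAD
    '$(vLastIncidentID)' as LastIncidentID   // quotes make the hex string load as text
AUTOGENERATE (1);

STORE TempIncidentID INTO [c:\shared\LastIncidentID.csv] (txt);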
Good call… I checked the field and it is a unique field, and I am loading the data in that order. I had another problem after I got the data stored in the CSV file.
I added a section to read the CSV file in at the start of the script, to get the last stored value to use in the SELECT statement, but when I did this the STORE failed – I’m guessing because QV keeps the file open after the data is loaded from it.
I changed from using CSV files to QVD files and that solved that problem.
Hi Steve,
I have a table and I need to find the latest date data where the value is continuously increasing.
I am planning to use a FOR…NEXT loop, but how can I store the data temporarily?
The best way to persist data for archive, distribution or even temporarily is in a QVD file. I have written a blog post on the creation and use of QVD files that may be helpful here: Introduction to QVDs. Resident loads may also be useful. If you are looping things ensure you drop all data you no longer need between each iteration.
Hi Steve,
I am facing a problem in charts I created for my company.
The problem is that the charts (any kind) take too much time to show the data. I tried IF syntax in the dimensions and in the expressions; it didn’t work. I also tried including variables; that did not work.
Do you have any suggestions regarding this matter?
Thanks
There are a number of ways you can look to optimise performance of charts – you will find reference to some on this blog. Having calculated dimensions is one thing that will certainly hit performance. Try and do as much of the calculation required as possible at load time – not in the front end. If you can use Set Analysis to do any calculations over a smaller set of data that can really help things also. Improvements in the data model can yield major performance benefits, aim for as few associations (joins) in your data model as you can. See posts on ApplyMap and Data Models for some ideas on this.
Steve-
I have a scenario where we work in layers, which yields QVDs whose data is merged using logical conditions. Some of the fact QVDs take more than 5 hours to reload. Kindly suggest whether I need to do an incremental load – as I understand that can only happen while pulling the data from the source – or whether you have any fundamental way to reduce the reload time.
Second point: the time charts take to populate is too long. I have tried everything but am not able to improve it. Whatever you have said in your blogs and documents I have read and already tried, thus I need a new suggestion from you.
Hi Vikas,
It definitely sounds like you need to get an incremental load in place. The other thing that is critical is that all your QVD loads are optimised – there is a blog post on this site on this.
If you have followed all the advice you can find on this site and others then it sounds like you may need to get a consultant in. In one scenario I took a load that was taking a few days to run at month end to under an hour. You should be able to find someone local to you that can help you with your load.
Hi
QV_Table:
SQL SELECT ID,
DATEUSAGE,
CODEOPERATEUR,
TMCODE,
TYPEUSAGE,
CODEOFFREUSAGE,
SENSTRANSFERT,
VOLUME,
MONTANTHT,
MONTANTTTC,
PAYS
FROM "IT-LEADERS".USAGEROAMING
WHERE not Exists(ID);
Concatenate LOAD
ID,
DATEUSAGE,
CODEOPERATEUR,
TMCODE,
TYPEUSAGE,
CODEOFFREUSAGE,
SENSTRANSFERT,
VOLUME,
MONTANTHT,
MONTANTTTC,
PAYS
FROM File.QVD;
STORE QV_Table INTO File.QVD;
when I try with this script it shows me errors (to optimize);
thanks
Hi. The WHERE NOT EXISTS syntax is QlikView syntax, but where you have it in your script it is being passed to SQL. It will be SQL that is complaining. At that point in your script you can only use valid Transact SQL statements (such as to get rows changed since a certain date). The WHERE NOT EXISTS should sit immediately after the FROM File.QVD line.
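Rearranged along those lines, the script would look something like this (the SQL WHERE is a placeholder for whatever change detection the source database supports, with vFromDate an assumed variable holding the last load date):

QV_Table:
SQL SELECT ID, DATEUSAGE, CODEOPERATEUR, TMCODE, TYPEUSAGE,
    CODEOFFREUSAGE, SENSTRANSFERT, VOLUME, MONTANTHT, MONTANTTTC, PAYS
FROM "IT-LEADERS".USAGEROAMING
WHERE DATEUSAGE >= '$(vFromDate)';

// QlikView-only syntax belongs here, on the load from QVD
Concatenate (QV_Table)
LOAD *
FROM File.QVD (qvd)
WHERE NOT EXISTS (ID);

STORE QV_Table INTO File.QVD (qvd);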
Hi Steve Dark, how about retrieving data from SAP. Does Qlikview support incremental data retrieval from SAP RFC/BAPI/BW
Hi Raju,
When you get the SAP connector it comes with some routines that do basic incremental loads. You can also write your own incrementals against the data, using the techniques above.
Regards,
Steve
Hi Steve
I’ve got the incremental load to work as set out in the manual (using dates, although I had to use Num() before all the dates).
But what I would prefer at work is to use a Unique Sequential ID
I could not get the example above to work. The TEMPID load gave this error:
Syntax error, missing/misplaced FROM:
TEMPID:
LOAD
AS MaxID
AutoGenerate (1)
I’m not a computer programmer. Is there a very simple set of instructions to explain how to load using a sequential number rather than dates?
Thanks Robert
SET ThousandSep=',';
SET DecimalSep='.';
SET MoneyThousandSep=',';
SET MoneyDecimalSep='.';
SET MoneyFormat='£#,##0.00;-£#,##0.00';
SET TimeFormat='hh:mm:ss';
SET DateFormat='DD/MM/YYYY';
SET TimestampFormat='DD/MM/YYYY hh:mm:ss[.fff]';
SET MonthNames='Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec';
SET DayNames='Mon;Tue;Wed;Thu;Fri;Sat;Sun';
let vMaxID = PEEK('ID', -1);
TEMPID:
LOAD
$(vMaxID) AS MaxID
AutoGenerate (1);
STORE TEMPID INTO D:\QlikView2\MaxID.csv (txt);
mapload:
Mapping
LOAD * INLINE [
ID, Type
100, AAA
101, BBB
102, CCC
103, DDD];
http://community.qlik.com/message/605085#605085
I got this to work using peek
It might not be the best option but it works well and is quick.
Hi Steve,
Is it possible to use a QlikView function inside a macro? For example, I want to use QvdCreateTime inside a macro – is it possible? Otherwise I have to use this in the load script; but before running the load script, is it possible to get this from a macro into a variable (like Document trigger OnOpen -> External Action -> Run Macro)?
Thanks you.
Sree
Hi Sree,
You can get the value of a variable in a macro – so anything you can put in a variable is fine. You will not be able to use QVDCreateTime in this way though. There may be a way of doing so in a macro, but I would recommend using the standard VBA FileSystemObject and the get file date time function within that. This should always be the same as the output of the QVDCreateTime anyway.
Steve
Hi, I need to set variables in a input box and I need to store them in a qvd file using a button. Is there anyway to do that? Please help me out
Hi Karthick. Usually QVD creation is done in a load script, so a reload would be required. With a reload you would simply set the variable in the InputBox and then have script like this in the load script:
LOAD
'$(vMyVariable)' as FieldName,
There are ways to write out without a reload, using Macros and Export, but I tend to avoid using these methods. You would have to have the variable in an expression field on a straight table and then Export it with the macro to do this.
hi Steve, very informative post – thanks!
do you have any suggestions on how to implement incremental load if you need to join data from many tables to get data for one table which will be in QVD?
in other words, I have a highly normalized database and to get something like Employee.qvd I need to do like 10-15 joins to the various different source tables.
Would you bring all of them in as separate QVDs and then create Employee.qvd using all these tables? Is there a sophisticated EXISTS clause to handle update time stamps in all the tables that make up the final Employee table?
hope it makes sense :)
Good question Boris! It depends on how many rows are likely to be modified by a row in a joined table being amended, and the likelihood of changes in the joined tables. I would probably create an incremental QVD for each of the joined tables as staging tables. You could then either have a normalized data model in QlikView (which normally I would try to avoid), see how long the joins take if you rebuild your main fact table from all the QVDs in their entirety, or re-join the tables by using WHERE EXISTS (load keys from changed QVDs, load from the fact table with WHERE EXISTS, then join changed values, and finally append non-changed rows).
Hope that gives you some ideas on how to approach.
good points, Steve! thanks as always!
Hi Steve, I am not sure is this the right place to ask this question, but still I am doing it.
Can you please let me know how I can get the last 3 or 4 reload times? I tried ReloadTime() in the script, but it gave me only the last reload time. Please help me with this.
Thanks
Rahul.
Hi Rahul. You need to write a bit more code to achieve this. You will need to set the start time at the start of your code, like this:
let vLastLoadStart = now();
Then at the end of your code persist this to a file and add old records:
let vLastLoadEnd = now();
Audit:
LOAD
'$(vLastLoadStart)' as LoadStart,
'$(vLastLoadEnd)' as LoadEnd
AUTOGENERATE(1);
if Alt(FileSize('Audit.qvd'), 0) > 0 then
Audit:
LOAD
LoadStart,
LoadEnd
FROM Audit.qvd (qvd);
end if
STORE Audit INTO Audit.qvd (qvd);
With a bit of luck that will give you a table that you can work with for showing start time, end time and from that you can calculate duration.
It may need a bit of fixing, as I have just typed it straight into the comments, but it should get you started.
I have a chunk of code I use sometimes that records further information, such as filename and number of rows – so all the audit tables can be pulled into a document that gives an overview of what all reloads are doing.
hello there. GREAT post (as I have had to code this before). Under your Final Considerations you mention “As with any system build, due consideration has to be given to incremental loads. A significant amount of bullet proofing may be required around the solution. For instance – you should code a check to see if the existing QVD is there before attempting to append it, so that the code runs first time before a historical file exists.”…
1. How would I check if the QVD exists?
2. I always run a once-off dummy script to create the initial data set. How would you check for this programmatically, and create it if it doesn’t exist? (Ties up with my question 1, I think.)
thanks
Hi Lorna, glad you like the post – there are plenty more like it on this site :)
You can check for a file with the following code:
if Alt(FileSize('..\data\myqvd.qvd'), 0) > 0 then
There is no actual FileExists function, so FileSize needs to be used; this will however return null if the file is not there, so Alt is needed to ensure a numeric outcome.
You are correct that your two questions are linked – by using the IF statement above you can avoid having to do the one off data load.
Where you do sometimes have to fudge things a little is if you add an extra field to your data set. You need to add this to the incremental code, adding the value to the historical QVD (if the field can be derived from other data in the QVD, or if null is acceptable in the field) as a one-off to create the field. The code can then be tidied up after the one-off run. The other way is to have a separate routine to add the new fields to the old QVD.
Hope that makes sense.
Hi Steve Dark, how does QlikView fetch data automatically? Suppose I have imported one month of data – how will the second month’s data be added automatically? Can you explain step by step?
Hi, that is exactly what this article describes! If the data is not too large then just pull all data every time – this is simplest but slowest. If the load takes too long like this then work through which of the methods above best fits your data and your requirements.
The table from which data is loaded into the data.qvd file is very large, so we are planning to create a separate QVD file for each day. That means every day, when the application is run by the scheduler, it should create a separate QVD file for the previous day (with a time stamp), so that the dashboard.qvw application takes data from multiple QVD files. How is this possible?
The whole task should be automated, by scheduling it.
Can anybody please help to solve this puzzle?
This is a very sensible approach. As you create each days file you will need to create a variable with the date and then use this in the store command, e.g.
let vSuffix = date(today(), 'YYYYMMDD');
STORE MyData INTO MyQVD_$(vSuffix).qvd (qvd);
The trick here is ensuring that you don’t get any rows that appear in more than one QVD and that you don’t have any rows that don’t appear in any QVD. Careful use of a WHERE statement in your SQL should allow you to enforce this.
You can then load from each of the QVDs that you have created in a loop. Take a look at this blog post that should point you in the right direction for how to load from a loop.
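A sketch of that loop (the wildcard load from earlier in the article is the simpler alternative; a loop gives you more control, for example to log each file):

for each vFile in FileList('MyQVD_*.qvd')
    // Identical structures auto-concatenate into a single MyData table
    MyData:
    LOAD *
    FROM [$(vFile)] (qvd);
next vFile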
Hi Steve, Great Stuff – you’re a quick solution to most of the issues. Thanks for that.
If I had to load a table based on two other tables (I mean using WHERE EXISTS), with one field belonging to the first table and one to the second, how would I do that?
Hi Ajay, you can have multiple WHERE EXISTS in a load statement, so therefore can reference more than one table (using an AND). Be aware though that as soon as you have more than one WHERE statement you break the Optimised load from QVD. This will often render any performance gains obtained from an incremental load useless.
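A minimal sketch of that, with illustrative names (it assumes KeyA and KeyB values have already been loaded from the two other tables):

FilteredData:
LOAD *
FROM Data.qvd (qvd)
WHERE EXISTS (KeyA) AND EXISTS (KeyB);  // two conditions, so this load is not optimised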
Hi Steve –
I have a separate question: when importing a text table with around 10 GB of data into QlikView, it takes a long time to complete – say 4 hours. Do you have any best practices or steps to follow to load quicker? Also, sometimes I get an out of virtual and/or logical memory error.
I am using the free version, working on a POC, and I started facing this issue.
Secondly, I would like to know about the pros and cons of QlikView. We are currently using Cognos and are thinking of using QlikView for some customers.
Also, could you help me with some administration, security and best practices, etc., and if possible some POCs of QlikView reports and dashboards? Thanks!
Karthik
Hi Karthik, If the data only arrives in one very large file then the best thing to do is to load it without any transformation whatsoever and then save it as a QVD. This is the most likely method to have it work. You can then load from the QVD to do your transformations. If it still fails, even when not doing any transformations, you may need to split the load – perhaps doing multiple passes in a loop with WHERE statements – storing to QVD and then dropping after each iteration. This will take longer overall, but it may avoid the load falling over. You may find that putting extra memory into your machine is the best answer, if that is a possibility.
If all else fails then you may find you need to bulk copy the text file into a database and work on it from there – but I’ve not yet found a text file large enough for this to be the case.
Obviously I am very biased towards QlikView and Qlik Sense. Both are “enterprise ready” platforms, and have all the requisite security and governance options in place. If you Google you should find plenty of whitepapers and case studies to back this up.
Hi all,
I would like to know: is there any automated process or mechanism/scheduler which can access SAP tables/data through the QlikView interface using the connector? Is there any scheduling mechanism where SAP data (maybe some customer/sales distribution data) can be extracted and QVD files created automatically? If you can throw some light on this – I am not sure about the automated process; I only know that SAP data can be accessed through QlikView using the connector.
There are various SAP connectors, to point to the different flavours of SAP. One of the ones I have used in the past comes with some example apps which deal with the incremental loading from source. Suggest finding the connector you need and then looking at the documentation that comes with it.
Hi Steve,
I am new to Qlik and have a couple of questions:
1. Is it possible to handle SCD2 data in Qlik?
2. Qlik looks like more of a visualization tool than a traditional reporting tool?
Please add your thoughts on the above questions
Hi Vimal, I’ve no idea what SCD2 data is, but QlikView can consume data from pretty much any source. It certainly isn’t a reporting tool. I would go beyond Visualisation tool, and say it is for Analysis and Data Discovery. What sets it apart is the ability to slice and dice data with zero wait.
Hi,
Any help please!
I want to add an incremental “ID”:
1
2
3
4
…
Thanks
Hi Asma – Not sure what your question is here. You can add an ID to a row by using RowNo() as ID, but this isn’t going to help for an incremental load, as it would start from 1 again on each part of the incremental load. You really need a unique ID in your underlying database.
Hi Steve
I get an error in this line:
LET vFechaInicio = DATE(Today() - 4, 'YYYY-MM-DD');
LET vTableName = 'GlobalTotal';
IF (FileSize('$(vFFile)F_DRed_GlobalTotal.qvd')) > 0 Then
$(vTableName):
TRACE **** Ejecutando carga incremental Fact GlobalTotal ****;
SQL SELECT *
FROM "content_dashboard"."ga"."ga_global_total"
Where Date(Date#("date",'YYYYMMDD'),'YYYY/MM/DD') >= '$(vFechaInicio)' // HERE IT DOES NOT LET ME FORMAT
;
Concatenate
LOAD *
FROM
$(vFFile)F_DRed_GlobalTotal.qvd
(qvd)
//Where Not Exists("id")
;
ELSE
$(vTableName):
TRACE **** Ejecutando carga full fact GlobalTotal ****;
SQL SELECT *
FROM "content_dashboard"."ga"."ga_global_total"
;
ENDIF
STORE $(vTableName) into $(vFFile)$(vF)$(vApp)$(vTableName).qvd;
DROP Table $(vTableName);
The date field has this format -> date='20190608'.
Please help me, because I cannot format that text field in my extraction layer.
Hi Carlos,
The issue that you have is that you are trying to use Qlik code in the SQL statement. All of the code between the SQL keyword and the closing ; needs to be executed by the database. The Date and Date# functions will not work here. The exact syntax depends on which SQL database you are using. It’s going to be something like:
WHERE "date" >= CONVERT(DATETIME, '$(vFechaInicio)')
Check with the appropriate SQL documentation regarding the exact syntax for the CONVERT or CAST function.
Good luck!
Hi Carlos,
Please also change your first two lines to initialize the variables vFechaInicio and vTableName to:
LET vFechaInicio = DATE(Today() - 4, 'YYYY-MM-DD');
SET vTableName = 'GlobalTotal';
Good Luck.
Something to watch here is that WordPress has replaced the quote marks used, ensure you replace with straight single quotes.
Thank you very much – that solved it!
Hi All,
I am not sure if I can post this here, but I have a question about incremental load. I have an updated date in the table and I am doing a concatenate load from the QVD; it always brings me fewer rows than the previous time.
For example, if the full load has 1000 rows (in the QVD) and the incremental load has 150 rows, I expect my concatenate load to give me 1150 rows, but it gives me some random number of rows, always less than expected (like 750). With each load this decreases, until finally I am getting only a single-digit number of rows.
Can anyone help me with this?
Hi Mary,
My best guess is that the concatenate isn’t working and the load from QVD is going into a separate table. Ensure you have an explicit CONCATENATE statement, and you give the table name to concatenate to. Be careful with WHERE EXISTS, as if you do this on a field with duplicates then only one row with each unique value will be there when you append the QVD. If you are using WHERE EXISTS in your incremental routine you must have a unique primary key.
You could use a number of extra STORE statements to write out the contents of the table at different steps in the load; simply name the QVDs Step1.qvd, Step2.qvd, etc. QViewer is a good way of checking the number of rows in each QVD.
Hopefully one of those suggestions will point you towards the answer. Qlik Community is a good place to look if not.
Thanks for the quick reply, Steve. I am using the same table name. Here is one of my tables:
[BRAND]:
LOAD
[BRAND_ID],
[BRAND_NAME],
[LAST_MODIFIED_DATE] AS BR_LAST_MODIFIED_DATE;
SQL SELECT
"BRAND_ID",
"BRAND_NAME",
"LAST_MODIFIED_DATE"
FROM "*"."Administrator"."BRAND"
where LAST_MODIFIED_DATE between to_date('$(RUN_STARTD)','YYYYMMDD:HH24:MI')
and to_date('$(RUN_END)','YYYYMMDD:HH24:MI');
Concatenate
LOAD
BRAND_ID,
BRAND_NAME,
BR_LAST_MODIFIED_DATE
FROM [lib://dev(tir_admin)/QVD/BRAND.qvd](qvd)
where not Exists(BRAND_ID);
Store BRAND
into [lib://dev(tir_admin)/QVD/BRAND.qvd](qvd);
The load is not giving an error, but it always loads only partial data.
My guess is that you have multiple rows with the same BRAND_ID. Do a load without the incremental and check the number of occurrences of each ID. QViewer can help with this: after loading, double click BRAND_ID to see each unique value and how many occurrences there are of each.
Hi Steve,
Brand ID is unique. Each brand gets one ID automatically from the ERP system during creation of the brand record. I am 100% sure about that.
I would try being explicit about which table you are concatenating to, with CONCATENATE (BRAND) LOAD. Storing out interim steps to QVD should also be helpful.
HI Steve,
I tried CONCATENATE (BRAND) LOAD. I did a full load before the concatenate load and then tried the incremental load. The first ‘Lines fetched: 0’ I can understand, as there is no updated data immediately after a full load. We have 269 brands, but the lines fetched after the “(QVD (row-based) optimized)” line give me only 241 rows. I can’t understand why it is giving me only 241, not 269. Please help me.
BRAND << BRAND
Lines fetched: 0
BRAND << BRAND
(QVD (row-based) optimized)
Lines fetched: 241
Try commenting out the line where not Exists(BRAND_ID); – note you will need to move the semi-colon onto the next line. My guess is that this will then bring in all rows. That would strongly point towards duplicate IDs (perhaps deleted rows in the table which have been flagged as deleted rather than permanently removed?). Identify where this is happening by creating a table with BRAND_ID as a dimension and COUNT(BRAND_ID) as the measure. This should show you where the problem lies.
Hi Steve,
I tried your suggestion. I can’t see anything different; rows are just missing randomly. First of all, brands only get created once and the chance of modification is very low, so most of the time the count for the incremental load will be 0.
Will that be a problem?
Also, we recently upgraded to April 2019 Patch 1 from April 2018 Patch 1. It was working in our previous version, April 2018.
Do I need a different approach for the April 2019 version?
Please help me.
If it was working fine before and now it has stopped working it could be a glitch in the later version. It may be a good idea to raise this with your Qlik Partner if you have one, or try emailing support@qlik.com.
Thanks Steve,
I will submit a ticket to Qlik.
Hi Steve,
I didn’t submit a ticket to Qlik. I found an alternative on one of the Qlik forums.
Here it is. It brings in all the new rows using the MaxID concept. One small problem is that it does not consider updates, but for my table only 0.001% of rows are updated, so this solution works for me.
Thanks again for your support.
MaxKeyLoad:
Load Max(BRAND_ID) as MaxID
FROM [lib://$(metafiledir)/BRAND.qvd] (qvd);
let MaxID = peek('MaxID', 0, 'MaxKeyLoad');
[BRAND]:
LOAD
[BRAND_ID],
[BRAND_NAME];
SQL SELECT
BRAND_ID,
BRAND_NAME
FROM "*"."Administrator"."BRAND"
WHERE BRAND_ID > '$(MaxID)';
Hi Steve,
What if we don’t have a date field, but we do have a composite key which gives us unique values?
Could you please show us a simple example? I have searched in many places but couldn’t find one.
Regards,
Sukant
Hi Sukant,
The challenge you will have there is not the incremental load, any unique key is fine for that following the approach above. The issue is finding out which fresh data you need to pull.
You could deal with this in your SQL database by creating a date stamp, even if one doesn’t already exist. If you can not amend the source table you could create this using database triggers, maintaining a separate table of Unique ID and Last Updated (Google for how to do that in SQL).
If you cannot write to the database at all, you will have no choice but to download all of the keys all of the time. If you put a primary key index on that column this should be quite quick. You can then do a WHERE NOT EXISTS against the keys already in your QVD to get a list of new IDs. You can then loop round each of those, either concatenating together a WHERE clause for the SQL or making calls against the database one row at a time (I would probably lean towards the first option).
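A sketch of that key comparison, with illustrative names (handling for the case where no new keys are found is omitted):

// Keys already held locally
ExistingKeys:
LOAD DISTINCT RowKey
FROM History.qvd (qvd);

// Source keys not yet seen are the new ones
NewKeys:
NoConcatenate
LOAD RowKey WHERE NOT EXISTS (RowKey);
SQL SELECT RowKey FROM SourceTable;

DROP TABLE ExistingKeys;

// Build a quoted, comma-separated list for a single SQL pull
KeyList:
LOAD Concat(chr(39) & RowKey & chr(39), ',') as KeyList
RESIDENT NewKeys;
let vKeyList = peek('KeyList', 0, 'KeyList');
DROP TABLES NewKeys, KeyList;

NewRows:
LOAD *;
SQL SELECT * FROM SourceTable WHERE RowKey IN ($(vKeyList));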
As with all of these things there are many approaches that might work. You will need to find the one that works best for you.
Good luck!
Hi, I want to implement an incremental load on the QVD, and the data is coming from SQL Server.
load
PO_HEADER_ID,
PO_CREATED_DATE,
PO_TYPE;
SQL
SELECT
PH.ATTRIBUTE2 AS PO_TYPE,
PH.CREATION_DATE AS PO_CREATED_DATE,
PH.PO_HEADER_ID AS PO_HEADER_ID
FROM
APPS.PO_HEADERS_ALL PH
WHERE PH.CREATION_DATE >= TO_DATE('01-1-2018 00:00:00', 'MM-DD-YYYY HH24:MI:SS')
;
So how can I implement the incremental load for this?
Hi Krishna,
Everything you need to know is in the article above. In this case, provided that PO_HEADER_ID is unique, you can store the resultant table to a QVD, then on the next load pick up from SQL and concatenate the contents of the QVD using WHERE NOT EXISTS (PO_HEADER_ID). If it is not unique you can load from the previous day, using equals rather than greater than or equals, and only load each day once. Again you will need to concatenate from the previously stored QVD. In both cases you will need a variable to inject the date into the SQL statement, so that after doing an initial load from 2018 you only load more recent data.
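As a sketch against the script above, taking the unique-ID route (vLastDate is an assumed variable carried over from the previous run):

PO_Data:
LOAD
    PO_HEADER_ID,
    PO_CREATED_DATE,
    PO_TYPE;
SQL SELECT
    PH.ATTRIBUTE2 AS PO_TYPE,
    PH.CREATION_DATE AS PO_CREATED_DATE,
    PH.PO_HEADER_ID AS PO_HEADER_ID
FROM APPS.PO_HEADERS_ALL PH
WHERE PH.CREATION_DATE >= TO_DATE('$(vLastDate)', 'MM-DD-YYYY HH24:MI:SS');

if Alt(FileSize('PO_Data.qvd'), 0) > 0 then
    Concatenate (PO_Data)
    LOAD *
    FROM PO_Data.qvd (qvd)
    WHERE NOT EXISTS (PO_HEADER_ID);
end if

STORE PO_Data INTO PO_Data.qvd (qvd);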
Hope that helps.