Loading Multiple QVD files

Hi, all of my Qlik applications have QVD data staged and prepared in advance. Here I will go through how I import them efficiently into QlikView.

For now I’ll assume you’re extracting your data each day and incrementally storing them to monthly QVD files.

  • Fact 2014-01.qvd
  • Fact 2014-02.qvd
  • Fact 2014-03.qvd
  • Fact 2014-04.qvd
  • Fact 2014-05.qvd
  • Fact 2014-06.qvd

 

Lets say the requirement for the dashboard is to have the last two months of complete data plus the current month to date. So if today were the 15th of June 2014 the Dashboard should contain data from 1st April to 15th June 2014.

You could load the data like this (replacing the dates to min / max variables and use dollar expansion):

Example

Example

 

Issues with this approach:

  • We’re using a wildcard on the QVD file so we’re attempting to load from files we know there’s no corresponding dates.
  • The where clause will also prevent the QVDs being an optimized load.

 

This is how I would tackle the requirement.

  1. First I’d load %KeyDates from my Calendar using the flags to select the date range required
  2. I’d then load a curser table containing the QVD names I need to load
  3. Looping through the curser table loading the data using a where exists on the %KeyDates loaded in the first step

 

1) Load %KeyDates

DatesToLoad

DatesToLoad

 

Here I am loading from my Calendar QVD (you can find this resource on my Blog and on the Qlik Community).

 

DatesToLoadDetail

DatesToLoadDetail

 

The resulting table has all the %KeyDate(s) for the time period in numeric format. You need to have an identical field in the FACT data as we’ll be using this field in the where exists statement.

We use the QVDLoadSuffix field in the next load statement

 

2) Load a Curser Table containing QVD Names

QVDToLoad

QVDToLoad

Here I create a new table with distinct values

QVDToLoadDetail

QVDToLoadDetail

This tell us the QVD files we need to look at when loading the data

 

3) Looping through the curser table

Loading Fact Data

Loading Fact Data

Here we perform a loop reading each row of ‘QVD_Files_To_Load’ (in this example it had three rows as shown earlier; 2014-04, 2014-05 and 2014-06)

The Field Value ‘QVDLoadSuffix’ is read into a variable and that is used within the filename of the QVD we’re loading from.

The where exists ensures only those %KeyDates already loaded (in the first step) are brought back. You can argue on this example the where clause is not required and you’d be correct. I’ve added it to show how we tackle both of the original challenges and it certainly would be required if you were loading the last 90 days (three months) from the current date.

 

And Finally….

It’s worth noting you can only load optimized with one where exits clause. You can’t use “where exists(field1) and exists(Field2)” with0ut loosing performance although you could combine the fields in your data.

I’ve assumed the Fact Data has the same fields and therefore it will be automatically concatenated. If your data isn’t consistent you could create an inline or autogenerate a table with the complete field list and one row of null values before entering the loop and then concatenate to that. That will also ensure your loads stay optimized (there’s a useful post which talks about optimization here)

You can also pass from the QMC a variable that tells the application how many dates to load in Step 1. For example you may want to create a light version with only three days worth of data. Putting Step 1 within a conditional IF statement you can very simply control the load behavior. (I’ll post more about this technique soon)

 

As always I’d love to hear your comments

Richard

 

2 responses to “Loading Multiple QVD files

  1. I often do multiple QVD loads by doing a loop using
    for each vFile in FIleList(‘..\Data\FileType_*.qvd’)
    This avoids the biggest problem with wildcard laods, that preceding loads fail. It also allows you to be selective on which files load (with an IF statement in the loop) and critically allows for optimised loads.
    Steve

    Like

    • Thanks Steve

      I use this technique as standard and usually include a statement to check the file exists. Performance is so much better loading small files selectively in the way rather than a large qvd with many months or a even years worth of data.

      Richard

      Like

Leave a comment