Let's start with a question (and I have to thank George Beeton for this). Answer honestly, as if this were a general Qlik question at the office and not part of some blog brain teaser…
You first load TableA, which has one field, [ID_A], with multiple rows but only three distinct values. Because you've loaded distinct, your resulting table has only three rows… simple.
Now load a second table. Again it has only one field, [ID_A], and as such it will automatically concatenate to the first table you've just loaded. The second table's contents are the same: multiple rows but only three distinct values, the same distinct values as in the first table. Again, you load this second table distinct.
How many rows has your final table got? 3 or 6?
TableA:
Load Distinct * Inline [
ID_A
1
1
2
2
3
3
];

TableB:
Concatenate ('TableA')
Load Distinct * Inline [
ID_A
1
1
2
2
3
3
];
My first guess was 6 rows (I was wrong!). The answer is 3. The Distinct qualifier in the second load takes the values from both the new table and the existing data into account.
At first glance I assumed the order in which Qlik processed the data was different: I presumed Qlik loaded the second table's data distinctly on its own and then appended it.
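If six rows were actually the result you wanted, one way to get there is to load the second table into its own temporary table first, so its Distinct cannot see the first table's data, and only concatenate afterwards. A sketch (the temporary table name TableB_Tmp is my own choice):

TableA:
Load Distinct * Inline [
ID_A
1
1
2
2
3
3
];

// NoConcatenate stops the automatic concatenation that identical
// field names would otherwise trigger
TableB_Tmp:
NoConcatenate
Load Distinct * Inline [
ID_A
1
1
2
2
3
3
];

// Append without Distinct, so nothing is de-duplicated at this step
Concatenate (TableA)
Load * Resident TableB_Tmp;
Drop Table TableB_Tmp;

TableA ends up with six rows: the three distinct values from each source.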
So what's the lesson here? This specific example isn't really taught in manuals or textbooks; maybe a clue to the behaviour is hidden somewhere in the documentation if you were to search deep enough. What we take from this is that we have to test new scenarios, especially when the end results have to be exact, regardless of how long we've worked with the product. Maybe we want 3 rows, maybe we want 6. Either way, we should always test the process to ensure we get the right results.
Put another way, from pure probability alone, 50% of people will have got the answer right. If you did, great!
Be honest though: did you know for sure, perhaps through past experience, or was it an educated guess? This is the point I'm trying to make. In this role we can't really be guessing; we have to know 100%.
Thanks Rich for sharing.
Interesting to also know that you probably don't need DISTINCT on both tables; the second one alone will still remove all the duplicate values.
Thanks Gabriel. Yes, you're right: as the second Distinct takes both the new and the existing data into account, you don't need the first Distinct.
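In script form, Gabriel's point looks like this: only the second load carries Distinct, yet the final table still ends up with three rows, because that Distinct is applied across the whole concatenated result.

// No Distinct on the first load: six rows go in
TableA:
Load * Inline [
ID_A
1
1
2
2
3
3
];

// This Distinct de-duplicates the existing rows as well,
// so the final table has 3 rows: 1, 2, 3
Concatenate (TableA)
Load Distinct * Inline [
ID_A
1
1
2
2
3
3
];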
The trick is to test these combinations and understand exactly what's happening in the background…
Every day I work with Qlik I assume less and less. Let's say you load that data from another source: does it work the same way? For example, when the data comes from a QVD it is not sortable; a resident or preceding load is necessary.
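A sketch of the QVD point (the file name Data.qvd and field ID_A are placeholders of mine): Order By is only allowed on a resident load, so sorting data from a QVD means loading it first and then re-loading it resident.

// Order By directly on a file load is not allowed
Tmp:
Load ID_A
From Data.qvd (qvd);

// Sorting has to happen in a second, resident pass
Sorted:
NoConcatenate
Load ID_A
Resident Tmp
Order By ID_A;

Drop Table Tmp;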
In Sense there is the handicap of variables. Sometimes a variable is referenced as vVariable, other times as $(vVariable), and so on. You never know what is going to work and where.
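A small sketch of the variable pitfall (vThreshold is a made-up name). Set stores the text as-is; it is the dollar-sign expansion $( ) that substitutes the value into the script before the statement runs:

Set vThreshold = 100;

// $(vThreshold) is replaced by 100 before execution,
// so the Where clause becomes: Where Value > 100
Data:
Load * Inline [
Value
50
150
]
Where Value > $(vThreshold);

Writing just vThreshold without $( ) would not expand the value, which is why the same variable behaves so differently depending on how it is referenced.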
As we dig into extensions we get into a jungle: different versions, different interfaces, desktop, server, and the Sense version itself, which is new every six months.
My recommendation is not to do anything more complicated than sum and count. As soon as we complicate things a bit, we run the risk of things malfunctioning or not functioning at all.
For the grand finale, do not take this comment as negative, because it is not. See the positive side of it: based on this comment, someone could start fixing things. For the same reason, don't take it personally.
Don't assume I don't like it; I love it, and that's why I'd like it to WORK, PROPERLY and STABLY. Every time it fails, it hurts my feelings so much…