Alteryx INDB Traps and Limitations Series: Filtering Nulls

I get to spend a lot of quality time with INDB tools, which represent 90% of my workflows. I love them, I can generate code almost at the speed of thoughts.Workflow INDB are also very powerful, especially when coupled with modern data bases such as Redshift or Snowflake, and they don’t hog any local resource.

Yet, I have been running over time into a series of issues and limitations of those tools, far from being obvious, and which all caused perplexity and time spent to address. I will share them through a blog series, and will start with a trap easy to describe, yet lethal in terms of its impact on data integrity: INDB Filter does not handle NULLs by default, unlike its cousin the In Memory Filter.

Let’s illustrate with a simple scenario, where I have this table in my DB:

Now if set a filter to exclude the Person in Region West:

In Memory version has a very similar logic to the INDB version

In Memory version did exclude Anna as expected, and so did the INDB version on the right, BUT Joe Blow, with a NULL Region, got lost along the way, and is nowhere to be found! Notice that there is no error message, nor any warning… INDB Filter just ate the NULL records, and that is a problem, because you are not always aware that there are some NULL values in the column…

The fact that INDB behaves with a different logic than In Memory makes me think that this is a bug or oversight, rather than a feature… And I would be in favor of addressing that bug… In the meantime, what can be done? Here is a pretty simple fix that you should apply preemptively to all your INDB Filters:

Change Filter to add a OR clause to include NULLs

And here is another way, a little bit more involved, but can be handy if the filter logic is too complex to include NULLs:

Add a Prior Filter to manage NULLs separately, and Union them afterwards

When you are missing entries at the end of a workflow, this should be the first issue to check for. Knowing that simple tip would have saved me hours of debugging… Hopefully, in the near future, I will be able to retire that post because the logic of the tool will be fixed…

Posted in Alteryx, Redshift, Snowflake, SQL | Leave a comment

TC18 Tableau Conference Highlights

I skipped the Tableau conference last year as it was taking place in Vegas, and that is usually not a good experience, even without crowd shooters. I was happy to partake again this year to TC18, which took place in New Orleans.

The conference was well attended with 17,000 attendees, yet it never felt overcrowded like it does when in Vegas. The trade-off is that the NOLA Convention Center requires a lot of walking: I clocked 9 to 10 miles a day of walking. It gets a bit frustrating to walk from one end  to the other end of the venue to attend a session, arrive on time and see it is already full, which happened to me a couple of times when the topic was APIs…

In my opinion, those APIs were elements of one of the two overarching themes of the Conference: Tableau is opening up for real and Tableau Prep is out of Beta. Both those themes reveal how pragmatic Tableau has become lately. For years, they were dismissing those customers needs and requests from the community: nah, Tableau does not need to open up, it is good  as it is, and i f you need something else, you don’t understand visual analytics. Nah, Tableau does not need data to be prepped, and Alteryx success  is such an epiphenomenon which has nothing to do with us. But eventually,  witnessing the rise of usage of the web connector, and the IPO of Alteryx, they started seeing the light and responded.  And I welcome that change! Continue reading

Posted in Snowflake, Tableau | Tagged | Leave a comment

Faster field name edits in Alteryx

From a distance, editing field names might not seem such a big issue, worth dedicating a blog post. Yet, I realize how much experience I gained over time, dealing with such minor issues, which end up affecting my productivity significantly over time. In other words, consider each time you draft a new workflow in Alteryx, and you will realize that, whether you deal with long SFDC tables with __c names or other naming conventions that leave to be desired,  you get to rename fields a lot. And you most likely have not had the best experience along the way. Read on as I will share some productivity tips and warn you about potential issues, especially in INDB.
Continue reading

Posted in Alteryx, Performance, Productivity, Redshift, Snowflake | Tagged , | 1 Comment

Weird query results on Snowflake

If, when you use Tableau or Alteryx to query a Snowflake Database, you are getting weird $ amounts, not matching what is the database, you are experiencing a known bug in the recent versions of the Snowflake ODBC driver:

https://support.snowflake.net/s/article/Regression-in-ODBC-v2-16-4-v-2

This is a big issue affecting floating number columns. Fortunately, it has been addressed with the release of ODBC Driver version 2.16.7 in August:

https://support.snowflake.net/s/article/client-release-history

Make sure you update your ODBC driver as soon as possible!

Here are the instructions:

https://docs.snowflake.net/manuals/user-guide/odbc-download.html

Posted in Alteryx, Snowflake, Tableau | Tagged , | Leave a comment

How to analyze easily Google Search Console data in Tableau

Whether you are deeply versed in the intricacies of SEO, or simply curious to optimize your web property’s organic search performance, you must likely are involved with the Google Search Console and its analytics. It is a very useful (and free!) tool to understand how users reach your website through a Google Search, and will help you ground your intuitions with facts: diagnose traffic and search position changes, find opportunities for optimizations and content ideas. Must Have for webmasters and digital marketers. Here is how it looks like for the very blog you are reading:
It even comes now with a fresher look and better filter ergonomics:

That new version lets you access up to the last 16 months worth of data, all the way until today-2 days. The legacy version was offering only 90 days. Each transaction is structured around 7 dimensions:

  1. Date
  2. Query
  3. Page
  4. Country
  5. Device
  6. Search Type
  7. Search Appearance, of little use

and 4 indicators:

  1. Impressions: How many times was the web page served into the displayed results of the Google Search
  2. Clicks
  3. CTR: Click Through Ratio calculated from Clicks/Impressions
  4. Position: How high on the Google Search results page did the page appear

Overall the Web UI is decent and responsive. Alas, it suffers from grouping and sampling issues, just like its Google Analytics peer. Let’s take an example: you can’t display a list of all the keywords in Search Analytics and their corresponding landing pages. You would have to filter query-by-query to see their associated landing pages, or page-by-page to see their queries. Here is an illustration of my top 5 Queries:
And now my Top 5 Pages:
But impossible to get the pages for my top 5 queries…

This sounds like a mission for Tableau:
I can then get all the features of Tableau to build whatever I need with the whole 16 months of data, that I can stockpile going forward:

Achieving that type of view requires to feed Tableau with good transaction data, out of the Google Search Analytics API (GSA API for short). That API will give us full control over filters and grouping, and we will be able to avoid sampling as well, since we won’t group by page : “When you group by page and/or query, our system may drop some data in order to be able to calculate results in a reasonable time using a reasonable amount of computing resources”. The strategy we will follow is to make one call by day, as detailed here. There are 2 ways to get the data to Tableau. we will cover the quick & dirty approach through Google Sheet, appropriate for a one shot analysis, and the thorough approach through API in Alteryx, appropriate for long term operations.
Continue reading

Posted in Alteryx, Marketing, Tableau | Tagged , , | 1 Comment