Have you noticed that ominous warning every time you connect to Google Analytics (abbreviated to GA for the remainder of this post)?
Yes, the Google Analytics you have used and loved (mostly because it’s free, admit it…) is about to get a major upgrade starting July 2023. Some sites have already switched to the new version, which is fully rebuilt, all the way to the API. This means that all existing connectors need to be rebuilt as well, and not everybody is ready… Alteryx offers a standard connector to the current GA, called UA for Universal Analytics. Yet, no Alteryx GA4 connector set to be available before spring 2023 at best. You are pretty much on your own if you need to import GA4 transactions earlier…
Well that’s a great opportunity to build a quick & dirty custom API connector, understand the process and access all the levers offered by the enhanced API… You can gain a lot of time by downloading the workflows from the bottom of this post, but be aware that the solution is way less polished than what Alteryx will subsequently release. It will do the job in the meantime.
You can observe that in GA4, the query commands (dimensions and metrics) in the API call are not found in the parameters anymore, but instead within a JSON array in the body payload. You will note, as well, that the naming of metrics and dimensions has evolved…
What do I need to query the GA4 API?
There are really 3 key elements you will need, to assemble to query the GA4 API in Alteryx and retrieve data sets. The good news is that 2 of them are really easy to get! Here they are, ranked by growing level of difficulty:
Over the years, I have been monitoring the analytics software market with a quite simple requirement: I am (still) looking for a tool to let me generate full data pipelines quickly from a browser with the least amount of code. As an analyst, I don’t have the luxury of time that developers and data scientists have, I need to crank out pipelines quickly and can’t waste my clients’ time futzing with a Command Line Interface. A nice UI makes the documentation of a complex process much easier. I also want the ability to look at the actual data at each stage while I develop a long process, without having to devise each time a fancy query just for testing. Last but not least, I need frictionless provisioning and deployment, à la Snowflake: just sign up on a webpage and start using without having to summon any AWS engineers nor other IT / Dev Ops rare resource. Believe it or not, that market is still up for grab, as I haven’t found a decent solution yet. All the products I tried have significant flaws. Here are a few, I have evaluated:
Alteryx Designer Cloud
Domo Magic ETL
Databricks Delta Lake
I would say the ones coming closer to that vision are Matillion, Dataiku and Domo in its recent iterations with Magic ETL. All the others fail, in their current state, by being too expensive, too limited or way too complex (looking at you Delta Lake). There is still room for new entrants as this market segment is far from being satisfied. Analysts are usually happy to pay within reason for solutions that boost their productivity.
I have been paying close attention to the solutions available, monitoring fairly easily the relevant ones going to market with Snowflake, through their Partner Connect awesome feature. Yet, I recently came across one of those new entrants, a solution that is Snowflake centric, but not yet present on Snowflake Partner Connect, called Datameer. It offers what I deem a compelling positioning statement:
Working from a browser without having to install any application, nor patches, that is refreshing. Now, is Datameer delivering on the promise? Read on for my test of an early release…
Suppose you have a set of transactions, such as Superstore orders for instance, each with one to many line items, with each line item belonging to a product category:
Note that this is a single order with actually 14 line items, but my data set has 10,000 records with 5,009 distinct orders. My goal is to obtain one row per order, a count of line items, and most important, a single field with all categories found in that order to enable filter by keyword:
What I mean by history table is what is commonly called a Slowly Changing Dimension, or SCD table of type 2, which retains efficiently all the previous values I decide to track, for accurate historical reports. Why do we care for historical values? In a SFDC CRM context, suppose you are reporting on the number of opportunities processed by each rep over a year, based on the opportunity creation date. If the opportunity owner changes over time before close, you will assign the opportunity to the bucket of the most recent owner, instead of the legitimate first owner, who won’t get the credit. A SCD type 2 setup will archive all the owner values each opportunity took over time, and will store those records only if there was a change. That represents much less storage and processing resource than a daily snapshot table. I need to decide early which columns I want to track if they are likely to change, here in blue for that example:
Now suppose you have a simple setup of type: Fivetran copies SFDC table Opportunity to Snowflake, and overwrites all changes originating from SFDC. How do you build that history table?
One of the many appeals of using Snowflake for me, is the user interface in the browser, which allows not only to perform many admin tasks, but also to poke around data. Yet, that legacy environment, while the envy of many competitors (Hi Redshift and BigQuery!), is still lacking many features necessary to perform analytical tasks, most notably the visualization aspects: no graph possible. Nevertheless, pursuing its vision towards becoming a one stop shop data cloud, Snowflake introduced in June 2020 a free preview for an exciting feature: Snowsight.
Snowsight originates from Snowflake’s first acquisition in March 2019, Numeracy. the goal is to enhance the experience by easing data exploration, SQL coding and provide visualization, still within the browser. I have to say, while this is still a preview, it is already very compelling… Let’s take a quick tour after selecting the preview from your regular Snowflake browser UI: