Unboxing Tableau Prep 2018.1

A year and a half after the first demo of Project Maestro, Tableau has finally released his latest product: Tableau Prep.

Tableau Prep is positioned as a standalone tool to “combine, shape, and clean data for analysis”. That puts Tableau Prep squarely in the same category as Alteryx, whose core functionality is data prep. The twist with the launch of Tableau Prep is the pricing strategy: it looks like Prep is bundled with existing Tableau Desktop & Server licenses and subscriptions.

Why is Tableau bundling a separate product for essentially free? Why not incorporating the Prep features into the core Tableau Desktop experience, if Prep is not intended to generate a new line of revenue? My pure speculation is that at that stage, Prep as a V1 is not ready to stand on his own, and is still too weak to prevail in competitive situation. It will eventually mature into a solution worth paying for, but today, it is still too limited. I have been kicking the tires and will share my first impressions and hopefully justify my assessment.

Continue reading

Posted in Alteryx, Tableau | Tagged , | Leave a comment

How to compare your customers by visual cohort chart

When trying to get a better grip on the performance of a large number of customers (or accounts, or pupils for the matter), and  progression over time of their count, their revenue  or their conversion from one status to another, it really helps to organize them in cohorts, to compare them fairly.  Namely, you would want to compare the 2017 revenue generated by customers who have been transecting since 2014, with the revenue of those who transact since 2016, as they are at different stage in their customer journey, hopefully. For illustration, the Tableau SuperStore data set has 793 customers and 5,009 orders over 4 years:

From that perspective, the growth of the number of customers looks steady and healthy, but is it really? Are we acquiring customers in 2017 as we did in 2015? Critical details, such as the number of orders per customers are buried within that mass of data.

A better angle would be to depict the progression of those customers by cohorts, taking for reference the date of their first order to group them, and plot how are they growing in subsequent years.

Continue reading

Posted in Marketing, Performance, Sales, SFDC, Tableau | Tagged | Leave a comment

How to filter transactions by a calculated category using SQL Window Functions

Have you ever found yourself in a situation where you need to filter a set of transactions based on categories which need to be calculated from that same set?

Let’s say you have a data set containing all the customer responses triggered by your Marketing activity and you need to rank Marketing campaigns performance based on the number of first touch responses they originated.
Another example more familiar to Tableau users: you have a full list of orders with line items from the Superstore data set and you need to filter customers based on the largest item’s product family they bought on their most recent order… Or how about showing on a map the revenue of customers who have been inactive for a year or more, based on the shipping address of their last order?

Sounds like an awful lot of SQL steps, doesn’t it?
All those scenarios revolve around the same meta problem: the need to switch between different levels of aggregated data within the same process. In other words, the filter is computed on a subset of the transactions, and we need to run that filter without destroying any information, nor storing the same information multiple times, which could work, but would be resource intensive and frankly inelegant.
The better approach requires a Data Base engine (sorry Excel cave folks!): by combining a series of INDB tricks and SQL functions, this can be done in a couple of easy clicks with great performance, at no additional storage cost. Continue reading

Posted in Alteryx, Marketing, Performance, Redshift, Snowflake, SQL | Tagged , | Leave a comment

How to Load Data in Bulk to Snowflake with Alteryx

Why a post about Snowflake? What does Alteryx and Tableau have to do with Snowflake?

Here is some context: as I was planning to launch my Data Lake consulting business, which is now launched, I was looking for a place to store and share prototypes, which need to be queried from Tableau with high performance. Cloud databases offer the flexibility in terms of capacity, performance, pricing and portability that I was looking for.

The leaders in that field, AWS and Azure, offer a lot of solutions, but I decided to take a closer look at Snowflake, not only because it is a hot startup, but also because it promises Uncompromising simplicity, read: no need for a DB Admin! I consider that feature a huge advantage over Redshift, which takes quite some technical skills to get even started…

My experience testing the platform proved that that specific promise of easy administration is met, even though that simplicity does not extend to the overall experience. To put it another way, Snowflake is not Citizen Analyst ready yet, but with a bit of tweaking, and the research I will share below, it comes pretty close!

To illustrate the step by step process, I will use the business scenario of a previous post: storing 27k rows of review data for a Word Cloud analysis.
Continue reading

Posted in Alteryx, Performance, Snowflake, SQL, Tableau | Tagged , , , , | Leave a comment

Access and Process Leads Activity History with Marketo Bulk Extract APIs

Marketo’s core expertise is Marketing Automation, a software category it pioneered along with Eloqua, Pardot, Hubspot, Adobe and now 208 additional vendors and counting. Marketo still holds a solid market share, especially on the West Coast, especially in Tech companies. For analysts and executives who love data, Marketing Automation platforms offer an attractive perspective besides operations: they capture large amounts of information on the individuals who comprise customer and prospect organizations, from the details of the websites and other assets they consume, to the forms they populate, the emails they receive, open, forward and many more indicators. Marketing Analytics are a fast growing field part of the sprawling Martech industry, necessary to allocate Marketing resource and improve performance. Alas, Analytics with Marketo have been mostly a tantalizing promise until June 2017. If all this data is indeed stored on the Marketo instance, users who stay within the Marketo interface can query the Lead database only, and NOT their Activities DB, that is the details of the transactions in time stamped sequences of identified individuals. The analyst willing to go beyond those restrictions was left with two options: either invest in the Marketo costly analytics upsell solution (no blending of external data allowed!), or leverage the APIs to get the data his organization owns, out to a serious analytics platform of choice.

Marketo has been offering access to the data generated in those databases through a SOAP API, then a REST API, better suited to extract volumes of transactions. The current version of the Marketo connectors supplied out of the box by Alteryx, as of V11, and I believe by Tableau as well, are still using those transactional REST APIs, which are really meant by Marketo to be used to sync with a CRM, and not for analytics. As a result, as I mentioned in the comments for my previous post on Marketo APIs, the traditional REST APIs  have been gradually stripped over 2016 of the capabilities to extract any decent volume of data at once. So far, most users of those APIs and connectors willing to extract more than a handful of records per session have been facing a dreaded series of errors:

1. Marketo REST API Error code 606 (Rate limit): Marketo limited throughput of records extracted within 20 seconds to 100, and kills the connection if higher
2. Marketo REST API Error code 615 (Concurrency limit): Marketo limits the number of concurrent requests to their API to 10, limiting parallel processing
3. Marketo REST API Error code 6XX (Daily Quota): 10,000 API calls maximum per day.

Furthermore, to definitely discourage the use of Analytics outside of the Marketo moat,  authentication tokens are only valid for 60 minutes, killing any extraction job running >60 minutes. As an indication of scale, in my own experience in my org, one hour is what it takes to extract 6 days of website visits, that is just one Activity type, out of the 18, I had planned to work on. And last but not least, Marketo does not offer a DATE TO filter for transactions in that API, effectively preventing from running updates on data sets. For instance, for a daily refresh of 2016 transactions, a full update for all of 2016 data was required… I wrote WAS because, since June 2017, Marketo has finally released Bulk Extract APIs for People and Activities, with surprisingly little fanfare for a Marketing company…

This is great news, but Marketo has not suddenly turned into a land of milk and honey.  The new bulk approach is not as easy as one could expect and is still riddled with hoops to jump through. However, if you have read that far, you must already get why it’s a much bigger deal than Marketo makes it sound, (Campaigns Golden Path analysis anyone?) and you will be able by the end of this post to take immediate advantage of the new API. You will finally get your hands on stacks of your data , instead of just trickles.

Continue reading

Posted in Alteryx, Marketing, Tableau | Tagged , | 2 Comments