As far as major Alteryx releases go, V11 was an appealing one, with many feature enhancements. In no particular order:
- A modern web-based job scheduler
- A modern formula editor, which makes Excel blush with its autocomplete and data preview window
- Addition of SAP Hana, Microsoft Azure SQL Database, Microsoft Analytics Platform System, and Netezza as In-DB connectors. Welcome to the In-DB club, you will love it!
- Global Search, which helps you find all relevant information from within Alteryx
- Data Profiling, which is an odd way to describe the new Browse Tool. Browse has learned new tricks! That page is supposed to describe those, but I suggest you watch the video, since the text won’t tell you much.
Having used that version of Alteryx for over 100 days now, my experience has turned out to be very different from what I expected after going over the description of those enhancements. I feel that one of them is a sleeper hit for data analysts.
If you are still manipulating data in Excel, why should you care about In-DB and Redshift? Whether you access your Redshift data through Alteryx or through Tableau, knowing how to prepare your data within Redshift, with the In-DB tools in Alteryx or with a prep query in the Tableau connector to Redshift, will give you access to another level of performance and flexibility. It might sound like science fiction to you, but it is not… These are just Redshift functions you can run through ODBC…
Having a database to read from and write to is really nice when using Alteryx, even though you can make do without one by using local .YXDB files. The key benefit of a dedicated database is the ability to use In-DB tools. Working In-DB saves the time it takes to move data in and out of the Alteryx engine’s memory, and leverages the power of the DB engine. Shifting an analyst’s data field of play to a Massively Parallel Processing (MPP) database such as Redshift brings additional performance improvements. I would deem those gains massive: I have seen 6-minute Alteryx workflows run in 15 seconds In-DB. Note that besides Redshift, the following databases are supported In-DB by Alteryx:
- Cloudera Impala
- Microsoft Azure SQL Data Warehouse
- Microsoft SQL Server
For all the wonders of the Alteryx data engine, which never seems to choke on anything sent its way, it cannot really compete with the parallel processing of Redshift when used In-DB. From an analyst’s perspective, there are some trade-offs for the additional performance, scale and efficiency. The main one is the steep initial learning curve, since it requires adopting a wholly different set of tools. Furthermore, you will quickly find that those tools have limitations inherent to the underlying SQL operations generated through ODBC. The 10 tricks of this post will hopefully alleviate some of those difficulties.
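To make the In-DB idea concrete, here is a minimal sketch of the difference between pulling rows out of the database and letting the database do the work. It is an illustration only: an in-memory `sqlite3` database stands in for Redshift so the example runs anywhere, and the `sales` table, its columns and both function names are made up for the occasion. It does not show what Alteryx generates under the hood.

```python
import sqlite3

# Stand-in database: in-memory sqlite3 plays the role of Redshift here,
# purely so the sketch is runnable. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 70.0), ("West", 30.0), ("West", 25.0)],
)


def totals_client_side(conn):
    """Pull every row out of the database, then aggregate locally."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_db(conn):
    """Let the database aggregate; only one row per region comes back."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query))


client_totals = totals_client_side(conn)
in_db_totals = totals_in_db(conn)
```

Both functions return the same totals. The second one simply moves far less data, which is where the time savings come from once the table holds millions of rows rather than five.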
This year’s conference took place in Austin, TX. Yet our Tableau friends must have felt right at home, as if in Seattle. The constant drizzle and low ceiling were certainly more reminiscent of Pacific Northwest scenery. It took until Thursday at noon, the tail end of the event, to catch a glimpse of the sun.
Like it or not (looking at you, Slack), email remains the preferred way to disseminate information in the workplace. Volume is still growing, with business email in particular growing at a steady annual rate of 7%. Is usage growing at the same pace? In other words, are recipients actually reading those emails? That’s another issue…
Among your users, chances are that some are still heavily reliant on, not to say tethered to, email as their primary vehicle for sharing data. Even if you send a nice link to a perfectly crafted Tableau viz, they won’t click it, complaining that it is too difficult to authenticate first and then wait for the browser to display it, especially on a mobile device.
Therefore, there is still a case to be made for broadcasting data by email, as admitted very recently by Tableau. Starting with Tableau V10, you can finally subscribe on behalf of other users, whereas previous versions only let users subscribe for themselves. Tableau’s implementation is elegant and simple, as usual, but still comes with limitations:
- Each recipient must be licensed on the Tableau Server, which can get quite costly in a situation where you only need to send sales updates to a large number of sales reps
- You are limited in terms of formatting options
- You cannot leverage existing Distribution Lists (DLs) maintained on your Exchange or other mail server
There is a better way, described in this post, that offers a much wider set of options, with a bit of R&D and very minimal coding. We will use a simple scenario: a large number of sales reps must receive a sales report by email at the start of their business day. This operation entails the following steps:
- Gather timely images of the reports out of your Tableau server, to be inserted in the email
- Put together a script to assemble the email and send it to a Distribution List
- Automate and schedule the task (optional)
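As a rough idea of what the second step can look like, here is a minimal sketch using only the Python standard library. It is not necessarily the script used in this post. It assumes the report image has already been exported from Tableau Server as PNG bytes (the export itself is not shown), and the addresses, the SMTP host and the function names `build_report_email` and `send_report` are all hypothetical.

```python
import smtplib
from email.message import EmailMessage
from email.utils import make_msgid


def build_report_email(sender, distribution_list, subject, png_bytes):
    """Assemble an HTML email with the report image embedded inline."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = distribution_list  # a DL address maintained on the mail server
    msg["Subject"] = subject
    # Plain-text fallback for clients that do not render HTML.
    msg.set_content("Today's sales report is embedded as an image in the HTML version.")

    # Content-ID linking the <img> tag to the embedded image part.
    cid = make_msgid(domain="example.com")
    html = (
        "<html><body><p>Good morning, here is today's sales report.</p>"
        f'<img src="cid:{cid[1:-1]}"></body></html>'
    )
    msg.add_alternative(html, subtype="html")
    # Attach the image to the HTML part so it displays inline.
    msg.get_payload()[1].add_related(png_bytes, maintype="image", subtype="png", cid=cid)
    return msg


def send_report(msg, smtp_host):
    """Hand the message to an SMTP relay (host name is an assumption)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)


# Placeholder bytes standing in for a real exported report image.
fake_png = b"\x89PNG placeholder, not a real image"
message = build_report_email(
    "reports@example.com", "sales-reps@example.com", "Daily sales report", fake_png
)
```

Because the message goes to a single Distribution List address, the recipients do not need to be known to the script, and the HTML body can be formatted however you like. Scheduling is then a matter of calling `send_report` from whatever scheduler you already use.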
Many business dashboards track headcount figures, since headcount is a vital indicator for most organizations. Conceptually, headcount is straightforward. Technically, it is actually very complex to render for analysis, as it should not be stored, but calculated for a set of circumstances: time, employee role, location, gender, …
Headcount is indeed very close, technically and conceptually, to inventory management, with flows of employees coming in and out of the organization, and the required snapshot of where it stands. Headcount computation can also be compared to the approach needed for ROI calculation, as addressed in one of my previous posts. At a higher level of abstraction, a very good read by Keith Helfrich dives into the similarity of those data sets. However, whereas Keith uses Custom SQL and Table Calcs in Tableau to reshape the data, I will take the Alteryx route, which I find cleaner and easier to explain and document.
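To illustrate why headcount is calculated rather than stored, here is a minimal sketch in Python (not the Alteryx workflow itself). The employee records are invented, and it assumes that a termination date is the last day worked, so that an employee still counts on that day. It computes the same snapshot two ways: by checking who is active on a given date, and inventory-style, as cumulative arrivals minus cumulative departures.

```python
from datetime import date

# Hypothetical records: (employee_id, hire_date, termination_date or None).
EMPLOYEES = [
    ("E1", date(2015, 1, 5), None),
    ("E2", date(2015, 3, 1), date(2016, 6, 30)),
    ("E3", date(2016, 2, 15), None),
    ("E4", date(2016, 7, 1), date(2016, 12, 31)),
]


def headcount_on(employees, as_of):
    """Snapshot: employees hired on or before as_of and not yet gone."""
    return sum(
        1
        for _, hired, left in employees
        if hired <= as_of and (left is None or left >= as_of)
    )


def headcount_from_flows(employees, as_of):
    """Inventory view: cumulative hires minus cumulative departures."""
    hires = sum(1 for _, hired, _ in employees if hired <= as_of)
    exits = sum(1 for _, _, left in employees if left is not None and left < as_of)
    return hires - exits


mid_2016 = headcount_on(EMPLOYEES, date(2016, 6, 30))
start_2017 = headcount_on(EMPLOYEES, date(2017, 1, 1))
```

Adding a dimension such as role or location amounts to filtering the records before counting, which is why the figure has to be recomputed for each set of circumstances.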