Create a delta extractor for MongoDB with Alteryx

Did you get excited by the MongoDB + Tableau announcement about the availability of a BI Connector to extract from MongoDB and convert BSON documents on the fly to the relational format Tableau requires? Then you might have been a bit disappointed, like me, after I found out that the BI Connector requires MongoDB version 3.2, and worse, requires the MongoDB Enteprise Advanced version to operate… This is stated in small font at the bottom of page 8 in that document.

So an easier way to  parse the BSON data stored in MongoDB to a relational format to feed Tableau or most reporting tools, is to use Alteryx, as already explained in that earlier post you can refer to, to get started. If you still feel fuzzy about the reason why MongoDB data format cannot be used by Tableau straight out of the box, MongoDB provides here a great write up that will remind you of the differences between relational databases and MongoDB BSON document format.

One significant enhancement to that earlier post I add, with this post, is delta capability. The goal is to enable automatic extraction of only incremental transactions since the last extract ran. No need to re-extract and re-process the whole year’s worth of data to update only a day!

The approach taken is to create a relational repository accessible to Alteryx, where most transactions will be stored in relational format, with a clear date. This repository is then used to scan when it was last updated using a Last Run Date column, that date is converted to a MongoDB date format the Alteryx MongoDB connector understands, to populate the date from field, extract the incremental transactions and union them to the repository:
Mongo Delta WorkflowsThe first workflow in green is used to initialize the repository, here in a local .yxdb file, but can be stored in any RDBMS really. The second workflow is reading from that repository to identify the date to feed for incremental transactions, then it passes that date to a macro under the red arrow, which will input that date to the MongoDB connector and return the resulting incremental transactions. Ultimately, the workflow adds the increments to the history transactions already extracted and processed to tabular, through the Union tool. Here is the how that simple macro under the red arrow looks like:
Macro Since Last DateEnsure that you highlight the Filter Value row in the Action configuration before you save the macro, otherwise it will fail to pass the proper parameter. Notice also the format of the date, which can be converted on the fly within the workflow using this formula:
Date Conversion FormulaI posted a demo workflow containing the MongoDB Date format to Date and Date to MongoDB Date format formulas here.

This entry was posted in Alteryx, Tableau and tagged , , , , , . Bookmark the permalink.