When playing with Alteryx on a Amazon Redshift database with INDB tools, have you ever come up with the impenetrable error:
Error: Browse In-DB (36): Error SQLPrepare: [Amazon][RedShift ODBC] (30) Error occurred while trying to execute a query: ERROR: UNION types character varying and boolean cannot be matched
Redshift is kindly trying to tell you that some fields of the same name are formatted as Boolean on one side, and as VARCHAR / V_WSTRING on the other side. Now you are left with your own devices to identify which fields, which can be fun when dealing with tables of 100s of fields, the specialty of Redshift… Add to that challenge that the INDB Select tool cannot present the Data Type of the fields. On top of it, a weird bug, most likely in the ODBC layer, is presenting Boolean fields in Redshift as V_String in Alteryx… Are you about to toss the computer out the window yet? Wait, there is a way out of that misery…
Diagnose: Identity the troublemaker fields
Which fields can I blame for the trouble? If you have only a few fields, you can try to eyeball it, but looking at the INDB Select will not help:
This is due to the nature of the INDB process: Alteryx does not have access to the Field types until the query is kicked off in Redshift. Therefore, a better place to find that info is into a Browse INDB tool, under the Metadata button, after a first run of the workflow:
Now if you are dealing with several 100s fields, such as this attached example:
How do you find those mismatched types? There is a tool for that…
The Field Info tool will generate a cool list of those fields with their type, just like the one you saw in the Browse INDB Metadata view:
Let’s deploy that tool in the workflow, preceded by the Data Stream Out tool to import the data from Redshift into Alteryx memory:
And the let’s join first on Name:
Then for the Joined Field, let’s find those with the same Name but different Field Types, by joining on both Name and Type:
Upon running the workflow, the second Join will isolate the problematic fields in the Left and Right Outputs, within Alteryx:
The 4 Unjoined are the Fields needing a fix, in order for Redshift to perform the Union properly.
Repair: Change the Field Type INDB
In order to change Field Types within an INDB workflow, we can stay within Alteryx, while processing in Redshift, using its SQL, instead of the Alteryx code. We will use the CAST function of Redshift SQL, which abbreviates conveniently following this syntax: expression :: type. We therefore insert a formula before the Union, on the side that needs adjustment:
This will convert the field improperly type as Text String into a Boolean, and confirm its type, and upon restarting the workflow:
Voila! No more error…