In a previous post, I explained how to build word clouds easily with a combination of Alteryx and Tableau. But what happens without Tableau? It turns out that Alteryx can leverage the power of existing R libraries to easily generate pretty nice word clouds too. I realized that potential after completing my previous post, thanks to the great work of Dan Magnus, who posted an Alteryx app you can use for free here, even without Alteryx.
Nevertheless, if you want more control, you should learn how to build your own workflow, and benefit from extensive controls on your output.
I will start the walkthrough from the workflow built in the previous post for Tableau and modify it slightly.
The only difference, besides removing the creation of Tableau TDE output files, is that more processing needs to take place in Alteryx, such as the filtering of Stop Words:
I paste the list of Stop Words (682 in the example) into a Text Input tool, and perform a Left Join to take those words out, keeping only the words that have no match in the list. That filter reduces the number of words to about 40% of the initial set, from 3.3M to 1.3M. The next step is to aggregate the words into counts:
This brings the number of rows down from 1.3M to 38,361. Finally, the output goes to an R tool, which I configure with the wordcloud package from CRAN, using these instructions:
Here is the code ready to copy/paste:
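# Read the aggregated word counts from input #1
data <- read.Alteryx("#1", mode="data.frame")
d <- data
library(wordcloud)
# Pass the data through unchanged to output 1
write.Alteryx(d, 1)
# Rename the columns (assumes the word comes first and its count second)
colnames(d)[1:2] <- c("Words", "Count")
# First word cloud on output 2: up to 200 words seen at least 3 times
AlteryxGraph(2)
wordcloud(words = d$Words, freq = d$Count, min.freq = 3, max.words = 200, random.order = FALSE)
invisible(dev.off())
# Second, sparser word cloud on output 3: up to 50 words seen at least 10 times
AlteryxGraph(3)
wordcloud(words = d$Words, freq = d$Count, min.freq = 10, max.words = 50, random.order = FALSE)
invisible(dev.off())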
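The two preparation steps that the Alteryx tools perform before the R tool (the stop-word filter, then the aggregation into counts) can also be sketched in plain R, which may help if you want to test the logic outside a workflow. This is only a minimal sketch on made-up data: the column name `Word`, the sample words and the three-entry stop-word list are assumptions for illustration, not what the workflow actually uses.

```r
# Hypothetical sample data: one row per word occurrence, standing in for
# the parsed review text (the column name `Word` is an assumption)
words <- data.frame(
  Word = c("the", "schnitzel", "was", "great", "the", "beer", "was",
           "great", "schnitzel", "and", "beer", "schnitzel"),
  stringsAsFactors = FALSE
)

# Hypothetical stop-word list; the list used in the post has 682 entries
stop_words <- c("the", "was", "and")

# Stop-word filter: keep only the rows whose word is NOT in the list
kept <- words[!(tolower(words$Word) %in% stop_words), , drop = FALSE]

# Aggregation: one row per distinct word with its number of occurrences
counts <- as.data.frame(table(Words = kept$Word),
                        responseName = "Count",
                        stringsAsFactors = FALSE)

# Most frequent words first
counts <- counts[order(-counts$Count), ]
print(counts)
```

The resulting `counts` data frame has the same two-column shape (`Words`, `Count`) that the wordcloud calls above expect.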
And that’s about it: the workflow completes a run in 17 seconds without filters. You now have a wide variety of options offered by the library, as demonstrated here. You will need to insert additional filters in your workflow to select specific businesses or reviewers, but it works quite well. Here is the word cloud that the workflow produced for the same German restaurant as in the previous post:
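To give a flavour of the options mentioned above, here is a minimal sketch that adds colours, rotation and text scaling to the same kind of call. The sample words and counts are made up for illustration, and the particular values chosen for `rot.per`, `scale` and `colors` are just one possible starting point, not the settings used for the image in this post. The call is guarded so that the script still runs where the wordcloud package is not installed.

```r
# Hypothetical sample counts standing in for the aggregated Alteryx output
d <- data.frame(
  Words = c("schnitzel", "beer", "pretzel", "service", "sausage"),
  Count = c(42, 35, 18, 12, 9),
  stringsAsFactors = FALSE
)

# Arguments kept in a list so each option is easy to tweak on its own
cloud_args <- list(
  words = d$Words,
  freq = d$Count,
  min.freq = 3,                 # drop words seen fewer than 3 times
  max.words = 200,              # cap on the number of words plotted
  random.order = FALSE,         # plot words in decreasing frequency
  rot.per = 0.2,                # proportion of words rotated by 90 degrees
  scale = c(4, 0.5),            # range of text sizes, largest first
  colors = c("grey50", "steelblue", "darkred")  # least to most frequent
)

# Only draw when the wordcloud package is available
if (requireNamespace("wordcloud", quietly = TRUE)) {
  set.seed(1)  # word placement is randomised; a seed makes it repeatable
  do.call(wordcloud::wordcloud, cloud_args)
}
```

Inside the Alteryx R tool, the same call would sit between an `AlteryxGraph()` call and `invisible(dev.off())`, with `d` coming from `read.Alteryx()` instead of the sample data frame.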