I’ve been pretty quiet lately for a good reason. I’ve been totally immersed in a new and exciting project of migrating many of our systems over to a new in-memory database system called Exasol. They claim to be the world’s fastest in-memory database system and so far they haven’t let me down. Complex data shaping tasks that used to take 1.5 hrs in SQL now complete in 7 minutes or less! In this post I’ll walk you through some of the reasons we chose Exasol and what my Continue reading
At my company we often work with panel data and need to be able to easily group individual panelists into buckets or profiles. On a recent project I needed to identify which panelists were Light, Medium, Heavy or Extreme buyers within specific product categories based on how many times they purchased that category each year.
The criteria for what constitutes a certain buyer group was different for each category so I couldn’t just use a simple mathematical formula to figure out which group each buyer falls into. Instead I needed to compare my panelist data with a criteria file to identify which group a user falls into. Since this kind of task is commonly encountered and cannot be solved via a simple join, I thought I’d write a quick post explaining how I solved this issue with Alteryx.
Below is a screenshot of my panelist data showing the transaction count for Continue reading
I’ve been busy on a bunch of big projects lately as you can probably tell by my neglected blog. Today I tried a new tool and thought I’d share how it works and how it can be improved on using a batch macro. My goal was to compute a bunch of correlation factors using Alteryx. Below is a sample of my data:
As you can see, each product has five weeks of data. I wanted to correlate each Continue reading
In my work at TABS Analytics, I often find myself building re-usable alteryx workflows for processing sales data coming from syndicated sources such as Nielsen or IRI. The goal is to build tools that can be applied to data coming from multiple different clients. Though the basic content of the data is the same, the field names and format are often different depending on the client. A basic measure such as sales dollars, could come with many different titles such as (Dollars, Sales Dollars, DOLLARS, Value Sales, etc.) In this example I’m going to walk though how I use the dynamic rename tool to “iron out” these variations up front to avoid errors downstream.
Imagine that we have two different data sources with slightly different field names as shown below.
The goal is to build a workflow that can load both data Continue reading
Anyone working with Excel data inevitably encounters the situation where data needs to be “filled down” to blank rows below values.
Suppose, for example, that you wanted to convert this data:
I just upgraded to Alteryx 10.0 today and wanted to share my findings. Overall the upgrade was quite seamless though they do require an uninstall of the prior version effectively preventing true side-by-side comparisons on the same box. Below are some key new features that jumped out at me.
Probably the most popular feature everyone was looking forward to was the new Continue reading
One challenge most data integrators eventually face is the need to spread data down to different time periods. This could mean spreading weekly data down to days, or multi-week events down to individual weeks. This can be challenging with traditional Continue reading
Ok, ok, maybe this isn’t technically hacking, but it sure feels like it! While at the Alteryx Inspire conference this year I learned that you can actually access the XML behind any workflow tool effectively bypassing the GUI entirely. Granted, you won’t need to do this very often, since they do have a good GUI, but there are cases where it sure comes in handy. In one of my recent posts, I showed how this can be a big help with the select tool, and today I’ll show how I just used it on Continue reading
I just got back from a great week in Boston at the Inspire conference. It was exciting being around 800+ other data enthusiasts, though I have to say it was humbling too, seeing all the great work everyone else is doing.
Here’s a snapshot of me and my coworkers at the conference:
I especially enjoyed a talk by Chris Love and Team from the Information Lab, showcasing Continue reading
In today’s post and video I’m going to share a quick way to rapidly populate the select tool in Alteryx if you need to change the data type or rename a bunch of fields in your data stream. While Alteryx’s visual interface does result in extremely simple and intuitive workflows, one downside for hard-core developers is the loss in productivity experienced by using a GUI vs. generating code directly. Fortunately there are some workarounds…
Imagine that you have a dataset with over 100 generic column names. You already know the final set of column names that you’d like to use, the question is how can you quickly overlay the new names over the old column names in you data. Below is a screenshot showing the normal manual method for renaming each field or change the data type using the select tool.
Of course that’s all well and good, except you don’t want to have to manually Continue reading