At my company we often work with panel data and need an easy way to group individual panelists into buckets or profiles. On a recent project I needed to identify which panelists were Light, Medium, Heavy, or Extreme buyers within specific product categories, based on how many times they purchased in that category each year.
The criteria for each buyer group were different for each category, so I couldn’t use a single mathematical formula to work out which group each buyer falls into. Instead, I needed to compare my panelist data against a criteria file to identify each panelist’s group. Since this kind of task comes up often and cannot be solved with a simple equality join, I thought I’d write a quick post explaining how I solved it with Alteryx.
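To make the problem concrete outside of Alteryx, here is a minimal Python sketch of the same idea: match each panelist row to the criteria row for its category whose count range contains the transaction count. This is not the Alteryx workflow itself. The category names, the thresholds, the user IDs, and the `assign_group` and `bucket_panelists` helpers are all made up for illustration.

```python
# Hypothetical criteria file: each category has its own purchase-count ranges.
# Columns: (category, min_count, max_count, group); max_count of None means "no upper bound".
CRITERIA = [
    ("Coffee", 1, 5, "Light"),
    ("Coffee", 6, 15, "Medium"),
    ("Coffee", 16, 30, "Heavy"),
    ("Coffee", 31, None, "Extreme"),
    ("Shampoo", 1, 2, "Light"),
    ("Shampoo", 3, 4, "Medium"),
    ("Shampoo", 5, 8, "Heavy"),
    ("Shampoo", 9, None, "Extreme"),
]

# Hypothetical panelist data: (user_id, category, yearly transaction count).
PANELISTS = [
    (101, "Coffee", 12),
    (101, "Shampoo", 12),
    (102, "Coffee", 3),
    (103, "Shampoo", 4),
]


def assign_group(category, count, criteria):
    """Return the group whose range for this category contains count, or None."""
    for crit_category, low, high, group in criteria:
        if crit_category != category:
            continue  # equality part of the match: same category only
        if count >= low and (high is None or count <= high):
            return group  # range part of the match: low <= count <= high
    return None  # no criteria row covers this category/count


def bucket_panelists(panelists, criteria):
    """Attach a buyer group to every (user_id, category, count) row."""
    return [
        (user_id, category, count, assign_group(category, count, criteria))
        for user_id, category, count in panelists
    ]


if __name__ == "__main__":
    for row in bucket_panelists(PANELISTS, CRITERIA):
        print(row)
```

Note that the same count of 12 lands in different groups depending on the category, which is why a single formula or a plain equality join on category is not enough.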
Below is a screenshot of my panelist data showing the transaction count for each user ID/Category combination.