One challenge most data integrators eventually face is the need to spread data down to different time periods. This could mean spreading weekly data down to days, or multi-week events down to individual weeks. This can be challenging with traditional databases since spreading typically involves generating an unspecified number of new rows of data which SQL is not really designed to do.
The scenario I’ll show today is one we often encounter while processing promotional data as part of our Promo Advantage(TM) service. We often need to spread multi-week promotional spending data down to a weekly level in order to align it with weekly sales data. Thus if a client spent $10,000 on a four week event ending 8/22/15, we need to split that transaction into four separate rows of weekly data assigning a quarter of the total spending to each week.
Below is a snapshot of some of the spending data we receive.
The challenge is that each event consists of a different number of weeks. Thus Event1 needs to be split into six rows of data, whereas Event5 only needs to be split into two. This would be quite cumbersome to do in SQL, but is actually very easy to do in Alteryx using their handy “Generate Rows” tool. Below is the completed workflow.
The key to the whole process lies with the second tool show, the Generate Rows tool. Below is the configuration screen for this tool. Essentially the generate rows tool is like a Loop While condition for those familiar with programming. It will keep generating new rows of data until a condition is met. It requires three inputs. The initialization expression which in this case is zero, the Loop Expression (at the bottom) which tells it how to increment throughout each iteration, and the Condition which tells it when to stop.
In my case, I’m using this tool to create additional rows with a week offset value that is based on the number of weeks in each spending event. Thus, if I have a four week spending event, I want to end up with four rows of data with offset values of 0, -1, -2, and -3. These offset values will later be used to create actual dates for each week.
Below you can see that the loop expression is set to decrement by 1 or each loop, and the condition tells it to stop when the value reaches 1 greater than the negative number of weeks. (I know this sounds confusing, but the next screenshot should make sense).
Below is the output of this tool. Notice how Event1 has now been duplicated into six rows, and the WeekOffset field has been added with values from 0 to -5. Each event has likewise been split into multiple rows based on the number of weeks in the event.
The final step is now accomplished using a formula tool with two formulas. The first calculates a WeeklySpend amount which is simply a fraction of the total spend based on the number of weeks. Thus with a four week event, each week gets 1/4 of the total spend.
The second formula converts the WeekOffset values into actual dates using the DateTimeAdd function. Essentially I count back from the EventWeekEnd value based on the number of days specified by the week offset field. Since the DateTimeAdd function doesn’t support incrementing by weeks, I have to multiply my week offset by 7 and increment by days.
Below is the final result. You can see the new WeekEnding column which goes back five weeks from the original EventEndDate, as well as the WeeklySpend column which contains each week’s share of total spending. This dataset is now ready to be joined onto any other weekly data stream, or rolled up into calendar months, etc.
This data spreading approach can be used in all types of scenarios. If you need to convert four-week periods into calendar months, for example, you can first spread the data down to days and then roll it back up into calendar months or whatever other type of time aggregate you need. Happy spreading!