Anyone working with Excel data inevitably encounters the situation where data needs to be “filled down” to blank rows below values.
Suppose, for example, that you wanted to convert this data:
If I were working in Excel, I’d use the handy “Fill Down” tool in ASAP Utilities (A must-have, free Excel add-in). In alteryx, the typical approach would be to use the Multi-Row tool (discussed more in this post) with a formula such as IIF(ISNULL([Field1]),[Row-1:Field1],[Field1]) which essentially returns the value of the row above whenever the current row is null. This works great, however if I wanted to do this for 20 columns, I’d have to use 20 different multi-row tools, adjusting the formula for each field. That becomes annoyingly time-consuming and cumbersome.
This sent me on a quest to develop a macro to simplify the process. I originally intended to accomplish this via a double-macro approach in which an outer macro calls a batch macro which sequentially applies the function to each field. After posting the question on Altery’s forum, however, Chris Love and others pointed me in the direction of transposing the data so that all the fields could be processed as if they were a single column.
This actually proved quite simple to build. Below is a basic overview of the workflow.
The process involves using a List Box macro input to ask the user what fields they would like to fill down. The macro then uses this list box to select a subset of the data and get a list of the field names. These are joined onto the transposed data so that only the relevant fields are passed through the multi-row function which handles the actual fill step. I was originally concerned that the multi-row function wouldn’t work on transposed data since I thought the order would be messed up, but it worked fine as long as I used the Group-By feature of the multi-row function.
Once all the nulls are filled, the data is cross-tabbed back into its original layout. The only remaining problem I encountered, was that the field order got all messed up due to the transpose and join step. To solve this, I had to hack the system a bit using the union tool.
One interesting feature of the union tool, is that the order of the fields is determined by the first stream of data connected to the union tool. This holds true, even if that first stream doesn’t actually return any rows of data. Just the metadata alone is used to configure the field order. My method, then was to union my processed data back to a stream from the original input, making sure that the original data stream was connected to my union tool first. In order to avoid duplication of the data, however, I had to ensure that the original data stream didn’t actually return anything. I accomplished this through a logically impossible filter (Where RecordID = 1 AND RecordID = 0). This filter guaranteed that no data actually came through the first stream.
The final result, was a macro that can be easily dropped into any data stream and used to fill down over nulls for an indefinite number of columns. When you first drop the tool on the canvas and connect it with the stream, you’ll be presented with a list of the fields in the stream and asked which ones you’d like to fill down. Just check off the ones you want, and you’re done! The macro seamlessly handles the transpose and fill operation.
Unfortunately you do notice some performance hit if you have a large number of records due to the cost of transposing and cross-tabbing back again but for most tasks it’s thankfully not very significant.
You can download a copy of the finished macro here. Any feedback or ideas for enhancements are welcome!