Contoso seems to be the standard Power BI sample data-set. However, I find it helpful to use a smaller data-set when learning a concept.
Requirement
I want a sales transaction table for a set of products. Each product is sold several times for the same amount. I’ll start with 1 row for each product:
Factors I’ll explore in this post:
- Repeat each row the same or different amount of times?
- Should I Use Power Query or DAX to repeat rows?
Files
A: PQ, Constant Row Repeat
create new table & enter data
I typed in single row product & sale amount data manually in a new table:
load into power query
Load into power query and create steps to repeat each row:
add custom column
On ribbon select ‘Add Column’, ‘Custom Column’, add M code =List.Random(5) and select ‘OK’.
This menu driven step creates this M code:
= Table.AddColumn(#"Changed Type", "RandRepeat", each List.Random(5))
expand to new rows
In top right corner click the double arrows, select ‘Expand to New Rows’. Each single row is now five rows! You can remove column ‘RandRepeat’.
B: PQ, Variable Row Repeat
create new table & enter data
Same as previous method but with extra manually entered column ‘RowsToRepeat’. Some rows will be repeated more than others.
add custom column
On ribbon select ‘Add Column’, ‘Custom Column’, add M code =List.Random(5) and select ‘OK’.
This menu driven step creates this M code:
= Table.AddColumn(#"Changed Type", "RowsToRepeatDelimiter", each Text.Repeat("^,",[RowsToRepeat]-1))
The result looks like this:
split column by delimiter
With column ‘RowsToRepeatDelimiter’ selected follow these steps:
- On the ribbon select Transform, Split Column, By Delimiter
- Select ‘Custom’ from drop down, enter ^
- Select ‘Each occurrence of the delimiter’
- Select ‘Advanced Options’, select ‘Rows’, select ‘OK’
If step ‘Changed Type1’ is automatically added delete it (it converts ‘Product Code’ into a number. We don’t want to lose the leading zeros).
Right click columns ‘RowsToRepeatDelimiter’ and ‘RowsToRepeat’, select remove (they are no longer required).
These menu driven steps have created this M code:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TZDLEcQwCEN78TkH84daMum/jQWctXNhhhdZErnvMQmdTcY1hGuM50ooyQhyB6ec3BSCTLVUMGXT6QgqtWPwocyGSk39+JqzYr+NHLpgaKZpSUuJK8s87ayyjLYUyJF7ry/090SXTuINIVRMsYS4PaeDevf3Pm1BdTJZ6XpMKYi8igKem0ATlujTM48Mln64czjIuzt+WhrmT31Z6p4f", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Product Code" = _t, #"Sales Amount" = _t, RowsToRepeat = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Sales Amount", Int64.Type}, {"RowsToRepeat", Int64.Type}}), #"Added Custom" = Table.AddColumn(#"Changed Type", "RowsToRepeatDelimiter", each Text.Repeat("^,",[RowsToRepeat]-1)), #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Added Custom", {{"RowsToRepeatDelimiter", Splitter.SplitTextByDelimiter("^", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "RowsToRepeatDelimiter"), #"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"RowsToRepeatDelimiter", "RowsToRepeat"}) in #"Removed Columns"
‘Source’ has a long code. It means that I typed in the original 3 columns. It’s helpful to study the M code. At some point it’ll be necessary to write M directly as not everything is possible via the menus.
C: DAX, Constant Row Repeat
This DAX is long/ but easy to copy/paste/modify to add more rows. On the ribbon click ‘Table tools’, ‘New table’ and copy/paste this DAX:
C: DAX, Constant Row Repeat = SELECTCOLUMNS( UNION( ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0328475", "Sales Amount", 54), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0584731", "Sales Amount", 183), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "1937664", "Sales Amount", 105), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0821655", "Sales Amount", 294), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0447263", "Sales Amount", 298), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0784625", "Sales Amount", 49), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0968476", "Sales Amount", 28), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "1782947", "Sales Amount", 173), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "1382473", "Sales Amount", 7), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0782853", "Sales Amount", 24), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "1965762", "Sales Amount", 72), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0816864", "Sales Amount", 81), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0683756", "Sales Amount", 286), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0393385", "Sales Amount", 12), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0163386", "Sales Amount", 8), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "1263945", "Sales Amount", 3), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0493847", "Sales Amount", 2), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0772584", "Sales Amount", 2) ), "Product Code",[Product Code], "Sales Amount",[Sales Amount])
D: DAX, Variable Row Repeat
Similar to DAX above but GENERATESERIES 2nd parameter varies. Example: ‘Product Code’ 0328475 has 5 rows, 0584731 has 4 rows.
D: DAX, Variable Row Repeat = SELECTCOLUMNS( UNION( ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0328475", "Sales Amount", 54), ADDCOLUMNS(GENERATESERIES(1,4,1), "Product Code", "0584731", "Sales Amount", 183), ADDCOLUMNS(GENERATESERIES(1,4,1), "Product Code", "1937664", "Sales Amount", 105), ADDCOLUMNS(GENERATESERIES(1,4,1), "Product Code", "0821655", "Sales Amount", 294), ADDCOLUMNS(GENERATESERIES(1,5,1), "Product Code", "0447263", "Sales Amount", 298), ADDCOLUMNS(GENERATESERIES(1,6,1), "Product Code", "0784625", "Sales Amount", 49), ADDCOLUMNS(GENERATESERIES(1,2,1), "Product Code", "0968476", "Sales Amount", 28), ADDCOLUMNS(GENERATESERIES(1,6,1), "Product Code", "1782947", "Sales Amount", 173), ADDCOLUMNS(GENERATESERIES(1,3,1), "Product Code", "1382473", "Sales Amount", 7), ADDCOLUMNS(GENERATESERIES(1,3,1), "Product Code", "0782853", "Sales Amount", 24), ADDCOLUMNS(GENERATESERIES(1,2,1), "Product Code", "1965762", "Sales Amount", 72), ADDCOLUMNS(GENERATESERIES(1,3,1), "Product Code", "0816864", "Sales Amount", 81), ADDCOLUMNS(GENERATESERIES(1,3,1), "Product Code", "0683756", "Sales Amount", 286), ADDCOLUMNS(GENERATESERIES(1,2,1), "Product Code", "0393385", "Sales Amount", 12), ADDCOLUMNS(GENERATESERIES(1,1,1), "Product Code", "0163386", "Sales Amount", 8), ADDCOLUMNS(GENERATESERIES(1,3,1), "Product Code", "1263945", "Sales Amount", 3), ADDCOLUMNS(GENERATESERIES(1,3,1), "Product Code", "0493847", "Sales Amount", 2), ADDCOLUMNS(GENERATESERIES(1,2,1), "Product Code", "0772584", "Sales Amount", 2) ), "Product Code",[Product Code], "Sales Amount",[Sales Amount])
Summary
Repeating rows using Power Query and DAX was great practice! I learned a lot of syntax in both M and DAX.
Why do this? When I’m learning a new concept I like to use small data-sets so I can study the results. My next post is about ABC Analysis (aka 80/20). I’ll use the table that I created here.