Repeating Rows to create sample data

Share The Knowledge

Contoso seems to be the standard Power BI sample data-set. However, I find it helpful to use a smaller data-set when learning a concept.

Requirement

I want a sales transaction table for a set of products. Each product is sold several times for the same amount. I’ll start with 1 row for each product:

I want to repeat these rows

Factors I’ll explore in this post:

  • Repeat each row the same or different amount of times?
  • Should I Use Power Query or DAX to repeat rows?

Files

A: PQ, Constant Row Repeat

create new table & enter data

I typed in single row product & sale amount data manually in a new table:

load into power query

Load into power query and create steps to repeat each row:

add custom column

On ribbon select ‘Add Column’, ‘Custom Column’, add M code =List.Random(5) and select ‘OK’.

This menu driven step creates this M code:

= Table.AddColumn(#"Changed Type", "RandRepeat", each List.Random(5)) 

expand to new rows

Each row has a list of 5 random values

In top right corner click the double arrows, select ‘Expand to New Rows’. Each single row is now five rows! You can remove column ‘RandRepeat’.

B: PQ, Variable Row Repeat

create new table & enter data

Same as previous method but with extra manually entered column ‘RowsToRepeat’. Some rows will be repeated more than others.

Column ‘RowsToRepeat’ has the row repeat value

add custom column

On ribbon select ‘Add Column’, ‘Custom Column’, add M code =List.Random(5) and select ‘OK’.

This menu driven step creates this M code:

= Table.AddColumn(#"Changed Type", "RowsToRepeatDelimiter", each Text.Repeat("^,",[RowsToRepeat]-1))

The result looks like this:

“^” character repeated based on ‘RowsToRepeat’ value

split column by delimiter

With column ‘RowsToRepeatDelimiter’ selected follow these steps:

  • On the ribbon select Transform, Split Column, By Delimiter
  • Select ‘Custom’ from drop down, enter ^
  • Select ‘Each occurrence of the delimiter’
  • Select ‘Advanced Options’, select ‘Rows’, select ‘OK’

If step ‘Changed Type1’ is automatically added delete it (it converts ‘Product Code’ into a number. We don’t want to lose the leading zeros).

Right click columns ‘RowsToRepeatDelimiter’ and ‘RowsToRepeat’, select remove (they are no longer required).

These menu driven steps have created this M code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TZDLEcQwCEN78TkH84daMum/jQWctXNhhhdZErnvMQmdTcY1hGuM50ooyQhyB6ec3BSCTLVUMGXT6QgqtWPwocyGSk39+JqzYr+NHLpgaKZpSUuJK8s87ayyjLYUyJF7ry/090SXTuINIVRMsYS4PaeDevf3Pm1BdTJZ6XpMKYi8igKem0ATlujTM48Mln64czjIuzt+WhrmT31Z6p4f", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Product Code" = _t, #"Sales Amount" = _t, RowsToRepeat = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Sales Amount", Int64.Type}, {"RowsToRepeat", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "RowsToRepeatDelimiter", each Text.Repeat("^,",[RowsToRepeat]-1)),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Added Custom", {{"RowsToRepeatDelimiter", Splitter.SplitTextByDelimiter("^", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "RowsToRepeatDelimiter"),
    #"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter",{"RowsToRepeatDelimiter", "RowsToRepeat"})
in
    #"Removed Columns"

‘Source’ has a long code. It means that I typed in the original 3 columns. It’s helpful to study the M code. At some point it’ll be necessary to write M directly as not everything is possible via the menus.

C: DAX, Constant Row Repeat

This DAX is long/ but easy to copy/paste/modify to add more rows. On the ribbon click ‘Table tools’, ‘New table’ and copy/paste this DAX:

C: DAX, Constant Row Repeat = 
SELECTCOLUMNS(
UNION( 
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0328475",
               "Sales Amount", 54),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0584731",
               "Sales Amount", 183),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "1937664",
               "Sales Amount", 105),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0821655",
               "Sales Amount", 294),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0447263",
               "Sales Amount", 298),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0784625",
               "Sales Amount", 49),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0968476",
               "Sales Amount", 28),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "1782947",
               "Sales Amount", 173),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "1382473",
               "Sales Amount", 7),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0782853",
               "Sales Amount", 24),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "1965762",
               "Sales Amount", 72),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0816864",
               "Sales Amount", 81),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0683756",
               "Sales Amount", 286),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0393385",
               "Sales Amount", 12),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0163386",
               "Sales Amount", 8),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "1263945",
               "Sales Amount", 3),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0493847",
               "Sales Amount", 2),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0772584",
               "Sales Amount", 2)         
),
"Product Code",[Product Code],
"Sales Amount",[Sales Amount])

D: DAX, Variable Row Repeat

Similar to DAX above but GENERATESERIES 2nd parameter varies. Example: ‘Product Code’ 0328475 has 5 rows, 0584731 has 4 rows.

D: DAX, Variable Row Repeat = 
SELECTCOLUMNS(
UNION( 
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0328475",
               "Sales Amount", 54),
    ADDCOLUMNS(GENERATESERIES(1,4,1),
               "Product Code", "0584731",
               "Sales Amount", 183),
    ADDCOLUMNS(GENERATESERIES(1,4,1),
               "Product Code", "1937664",
               "Sales Amount", 105),
    ADDCOLUMNS(GENERATESERIES(1,4,1),
               "Product Code", "0821655",
               "Sales Amount", 294),
    ADDCOLUMNS(GENERATESERIES(1,5,1),
               "Product Code", "0447263",
               "Sales Amount", 298),
    ADDCOLUMNS(GENERATESERIES(1,6,1),
               "Product Code", "0784625",
               "Sales Amount", 49),
    ADDCOLUMNS(GENERATESERIES(1,2,1),
               "Product Code", "0968476",
               "Sales Amount", 28),
    ADDCOLUMNS(GENERATESERIES(1,6,1),
               "Product Code", "1782947",
               "Sales Amount", 173),
    ADDCOLUMNS(GENERATESERIES(1,3,1),
               "Product Code", "1382473",
               "Sales Amount", 7),
    ADDCOLUMNS(GENERATESERIES(1,3,1),
               "Product Code", "0782853",
               "Sales Amount", 24),
    ADDCOLUMNS(GENERATESERIES(1,2,1),
               "Product Code", "1965762",
               "Sales Amount", 72),
    ADDCOLUMNS(GENERATESERIES(1,3,1),
               "Product Code", "0816864",
               "Sales Amount", 81),
    ADDCOLUMNS(GENERATESERIES(1,3,1),
               "Product Code", "0683756",
               "Sales Amount", 286),
    ADDCOLUMNS(GENERATESERIES(1,2,1),
               "Product Code", "0393385",
               "Sales Amount", 12),
    ADDCOLUMNS(GENERATESERIES(1,1,1),
               "Product Code", "0163386",
               "Sales Amount", 8),
    ADDCOLUMNS(GENERATESERIES(1,3,1),
               "Product Code", "1263945",
               "Sales Amount", 3),
    ADDCOLUMNS(GENERATESERIES(1,3,1),
               "Product Code", "0493847",
               "Sales Amount", 2),
    ADDCOLUMNS(GENERATESERIES(1,2,1),
               "Product Code", "0772584",
               "Sales Amount", 2)         
),
"Product Code",[Product Code],
"Sales Amount",[Sales Amount])

Summary

Repeating rows using Power Query and DAX was great practice! I learned a lot of syntax in both M and DAX.

Why do this? When I’m learning a new concept I like to use small data-sets so I can study the results. My next post is about ABC Analysis (aka 80/20). I’ll use the table that I created here.


Share The Knowledge