top of page
Jihwan Kim

ADDCOLUMNS + SUMMARIZE VS. SUMMARIZE and Calculated Column VS. Calculated Table in Power BI

Updated: Sep 20, 2022

In this blog post, I want to share what I have incorrectly understood in Power BI. I think there are more than 100 things I still do not 100% understand, but at this moment of time, I found two. 😢



My previous thinking about ADDCOLUMNS+SUMMARIZE and SUMMARIZE.


Option 1.

ADDCOLUMNS ( SUMMARIZE ( Sales, 'Calendar'[Month-Year], 'Calendar'[End of Month], Category[Category] ), "@QTY", CALCULATE ( SUM ( Sales[Quantity] ) ) )


and


Option 2.

SUMMARIZE ( Sales, 'Calendar'[Month-Year], 'Calendar'[End of Month], Category[Category], "@QTY", SUM ( Sales[Quantity] ) )


I always used ADDCOLUMN function together with SUMMARIZE. I cannot remember the reason, but I read from a book that only using SUMMARIZE sometimes creates unexpected result, and recommendation was always try using with ADDCOLUMNS and create context transition.


Until today, when I used ADDCOLUMNS together with SUMMARIZE, I did not have any performance issue. Perhaps it was because I did not use extremely large dataset. My sample dataset that I used today is not that big but it showed the significant size difference that I did not expect.



My previous thinking about Calculated Table VS. Calculated column

I sometimes compare Calculated column VS. Measure, and then I select an efficient way from the performance point of view. This time, I tried to compare the below two when creating a calculated table.


Option 1.

SUMMARIZE ( Sales, 'Calendar'[Month-Year], 'Calendar'[End of Month], Category[Category], "@QTY", SUM ( Sales[Quantity] ) )


Option 2.

SUMMARIZE ( Sales, 'Calendar'[Month-Year], 'Calendar'[End of Month], Category[Category] )


and then, add a calculated column separately.


QTY CC = VAR _currentmonthyear = 'Add onebyone NewTable'[Month-Year] VAR _currentcategory = 'Add onebyone NewTable'[Category] RETURN CALCULATE ( SUM ( Sales[Quantity] ), 'Calendar'[Month-Year] = _currentmonthyear, Category[Category] = _currentcategory ) If I needed to create a calculated table, I always tried to create it in one-go. I thought this was having faster performance than creating new columns one by one separately. In this blog post, I will share what is the result of each table's size by using my simple dataset.



In this writing, I used the below sample, that contains quantity information by category and by date. The data has five categories (A, B, C, D, E), and it has a date range from 1910/01/01 to 2021/12/31.

A fact table (Sales) looks like below. (total 204,540 rows)


Based on the fact table (Sales), I created a simple data model like below.


And then, I created two calculated tables like below.



OneGo SummaryTable = ADDCOLUMNS ( SUMMARIZE ( Sales, 'Calendar'[Month-Year], 'Calendar'[End of Month], Category[Category] ), "@QTY", CALCULATE ( SUM ( Sales[Quantity] ) ) )



OneGo SummaryTable V2 = SUMMARIZE ( Sales, 'Calendar'[Month-Year], 'Calendar'[End of Month], Category[Category], "@QTY", SUM ( Sales[Quantity] ) )


In DAX Studio, the size of each table looks like below.

Size wise, without using ADDCOLUMNS wins.



I also wanted to compare between, creating calculated table by one-go and creating summarized calculated table + adding a new column separately, like below.


QTY CC (Calculated Column) = VAR _currentmonthyear = 'Add onebyone NewTable'[Month-Year] VAR _currentcategory = 'Add onebyone NewTable'[Category] RETURN CALCULATE ( SUM ( Sales[Quantity] ), 'Calendar'[Month-Year] = _currentmonthyear, Category[Category] = _currentcategory )

Adding one by one separately VS. OneGoSummaryTable V2 -> It shows the same size.




To summarize, at least in a dataset that looks like the sample,

firstly, try not to use ADDCOLUMNS with SUMMARIZE. And then, must check the result whether it shows correct answers. Secondly, when creating a calculated table, adding new columns separately does not affect badly to the size of a table.


I hope this short writing helps motivating and having fun in testing whatever PowerBI concepts you already know.



Addings: after a few more testing, I could see the below information from DAX Studio.


When I keep adding a new table with using the exactly same DAX formula, everything becomes to have the same size, except the last one that I create. I am not sure whether the size of the calculated table changes or not, but it seems like ADDCOLUMNS + SUMMARIZE VS. Only SUMMARIZE shows no winner from the perspective of size.


178 views0 comments

Recent Posts

See All

Comments


bottom of page