I have the following PowerBI setup and experience terrible loading times (usually run into the request too large error) and look for ways to improve the user experience
- Direct query report across three tabs (first one is an intro page, tabs two to four actually have reporting visuals). Direct query is needed, since the data changes daily across various deployment tiers
- Tab 2 has a waterfall chart with various slicers
- Tab 3 has a matrix with various slicers (data on a product level)
- Tab 4 has a matrix with various slicers (but is on a product-store level)
- In general, the data consists of store-product-metrics for ~1,000 stores and ~40,000 products. There are multiple time periods as well, so the data contains metrics for each month. Given this granularity, my main table is >100m rows.
- The data model is set up as follows
- One table contains the data on the product level (for tab 3), including various calculated metrics
- One table contains the data on the product-store level (for tab 4), including various calculated metrics
- Various other mapping tables are joined to those tables with 1:many connections and used for the slicers (for instance a mapping that contains region/district/store, another one that contains category/subcategory, etc. I have the store-product-level table linked to those as well as the product-level table
- Some of these joins are done on string columns
Let me know if you need to know anything else to help point me in the right direction.
I tried the following:
- Assumed that the 1:many:1 mapping might lead to duplicates, but that is not the case
- Removed existing slicers and added slicers that are not based on these lookup tables, but on the actual column in the data tables > nothing changed
- Remove unnecessary columns (there were a few string columns that weren’t used) > nothing changed
- Remove rows and only show one time period > this makes the report usable, but isn’t really helping since the report itself becomes useless
- Remove cross-tab filters (i.e. before a filter on tab 2 would also change tab 4) > nothing changed
Possible next steps:
- Change data model to only join on integers (e.g., storeID rather than storeDesc)
- Metrics optimization. Ran the performance analyzer and pulling the data had a terrible score. Threw the queries into ChatGPT and it didn’t find any material issues, so not sure how much there is here
Any other thoughts before I refactor the data model? Confidential data, so can’t share files, but happy to answer questions if helpful.