I have daily data over a period of time for the same set of assets, stored in 100,000 CSV files. I want to aggregate the data for any asset the user selects and chart various outputs. I chose Power BI for this task.
These 100,000 files are combined into a single table in Power BI (call it ‘Table 1’).
There is another CSV file with asset-level attributes, stored as a different table (call it ‘Table 2’). Thus, my Power BI project has two tables – Table 1 and Table 2.
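
For context, Table 1 is built along the lines of the standard folder-combine pattern in Power Query. This is a simplified sketch, not my exact query – the folder path is a placeholder:

```
// Simplified sketch of combining a folder of daily CSVs into Table 1.
// The folder path is a placeholder.
let
    Source = Folder.Files("C:\data\daily"),
    // Keep only the CSV files
    CsvOnly = Table.SelectRows(Source, each Text.EndsWith([Extension], ".csv")),
    // Parse each file and promote its header row
    Parsed = Table.AddColumn(CsvOnly, "Data",
        each Table.PromoteHeaders(Csv.Document([Content]), [PromoteAllScalars = true])),
    // Stack all 100,000 parsed tables into one tall table
    Combined = Table.Combine(Parsed[Data])
in
    Combined
```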
I have created a custom query and charts based on these tables.
What I’m finding is that if I update anything in the single CSV file linked to Table 2, the project re-reads all of the CSVs linked to Table 1 and recalculates all of the table transformations.
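
I suspect the dependency comes from the custom query referencing both tables. As a minimal sketch of the kind of merge I mean (the AssetID key and the expanded columns here are hypothetical names, not my actual schema):

```
// Hypothetical example: a merge like this makes the combined output depend
// on both Table 1 and Table 2, so a refresh re-evaluates the whole chain.
// "AssetID", "Sector", and "Region" are placeholder names.
let
    Source = Table.NestedJoin(#"Table 1", {"AssetID"},
                              #"Table 2", {"AssetID"},
                              "Attributes", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Source, "Attributes", {"Sector", "Region"})
in
    Expanded
```

If the dependency looks like this, Power Query has to re-evaluate the upstream folder combine whenever the downstream query is refreshed.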
My question is – am I correct in assuming that ‘Table 1’ doesn’t actually exist as a persisted schema in Power BI the way it would in Access or SQL? That it lives only in memory, and when I close Power BI and open it again, all of the files get loaded and transformed anew? If so, Power BI would be a piss-poor solution for anything involving a lot of data, so I’m wondering where I’m going wrong.