My dataset has roughly 3,000 observations. It is made-up data that I am using for practice, because I want to become a data analyst. Let’s say the dataset looks like this:
code           country         avg(value)  years of data
DE             Germany         0.11        14
CA             Canada          0.2         15
NO             Norway          2.0         16
FR             France          0.11        15
TR             Turkey          0.11        15
UA             Ukraine         0.11        3
World          World           1.00        16
ID             Indonesia       0.5         9
IN             India           0.5         15
GB             United Kingdom  0.8         15
Non-EU Europe  Non-EU Europe   1.0         15
CH             Schweiz         0.8         9
EU             European Union  1.0         15
See? Germany and France are in the EU, but there is also a total row named “EU” that already includes them, so those countries are counted twice. Doesn’t that bias the results? How do I solve that?
And Ukraine, for example, only has data for 3 years. How do I handle that?
And what about the row “World”? All of the countries above are in the world, so they are counted again there. How do I solve that?
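To make the question concrete, here is a rough sketch of the kind of thing I have in mind. I am not sure it is the right approach. It uses plain Python on the rows from the table above, keeps the aggregate rows (World, EU, Non-EU Europe) apart from the single countries, and flags countries with few years of data. The names AGGREGATE_CODES and MIN_YEARS are my own, and the cutoff of 10 years is an arbitrary assumption.

```python
# Rows copied from the table above: (code, name, avg_value, years_of_data).
rows = [
    ("DE", "Germany", 0.11, 14),
    ("CA", "Canada", 0.2, 15),
    ("NO", "Norway", 2.0, 16),
    ("FR", "France", 0.11, 15),
    ("TR", "Turkey", 0.11, 15),
    ("UA", "Ukraine", 0.11, 3),
    ("World", "World", 1.00, 16),
    ("ID", "Indonesia", 0.5, 9),
    ("IN", "India", 0.5, 15),
    ("GB", "United Kingdom", 0.8, 15),
    ("Non-EU Europe", "Non-EU Europe", 1.0, 15),
    ("CH", "Schweiz", 0.8, 9),
    ("EU", "European Union", 1.0, 15),
]

# Assumption: these codes are aggregates of other rows, not single countries.
AGGREGATE_CODES = {"World", "EU", "Non-EU Europe"}

# Assumption: an arbitrary cutoff for "enough" years of data.
MIN_YEARS = 10

# Keep aggregates and countries apart so no country is counted twice.
countries = [r for r in rows if r[0] not in AGGREGATE_CODES]
aggregates = [r for r in rows if r[0] in AGGREGATE_CODES]

# Flag countries with short coverage instead of silently mixing them in.
well_covered = [r for r in countries if r[3] >= MIN_YEARS]
sparse = [r for r in countries if r[3] < MIN_YEARS]

print("countries:   ", [r[0] for r in countries])
print("aggregates:  ", [r[0] for r in aggregates])
print("well covered:", [r[0] for r in well_covered])
print("sparse:      ", [r[0] for r in sparse])
```

With this split I could analyse the single countries on their own and use the aggregate rows only as reference values, but I do not know whether dropping or flagging the sparse countries is the better choice.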