Why does machine learning try to fit real-world data to mathematical equations?
In supervised learning, for example, we train a model on real-world data and try to find the best-fitting mathematical representation. My question is: why would real-life data (apparently random) fit a mathematical equation such as $y = mx + c$, $y = c_1 x_1 + c_2 x_2$, or any more complex form? Does it come from the fact that most things in the real world can be generalized mathematically?
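To make the question concrete, here is a minimal sketch of what I mean by "fitting". It assumes the data really is a line plus Gaussian noise, which is exactly the assumption I am asking about. The values `true_m`, `true_c`, the noise level, and the helper `fit_line` are made up for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + c, using the closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of co-deviations of x and y / sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# Hypothetical "real world": a linear trend hidden under random noise.
rng = random.Random(0)
true_m, true_c = 2.0, 1.0  # assumed values, chosen only for this example
xs = [i / 10 for i in range(200)]
ys = [true_m * x + true_c + rng.gauss(0, 0.5) for x in xs]

m_hat, c_hat = fit_line(xs, ys)
print(f"estimated slope={m_hat:.3f}, intercept={c_hat:.3f}")
```

Each individual point looks random, but the fitted slope and intercept land close to the values used to generate the data. The fit only recovers the structured part; it does not claim the noise follows an equation.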
Minimizing KL divergence is the same as minimizing NLL, via the Dirac delta
I am reading Probabilistic Machine Learning: An Introduction.
In Chapter 4 (Statistics), page 109: