Ah! Ok. Now it all makes sense.
In the previous blog post I wasn’t quite sure about throwing out a specific independent variable just because it was slightly above the Significance Level. Intuitively, I would have kept it.
Quick recap: We were predicting a startup’s profit based on admin expenses, R&D expenses, marketing expenses and location. Admin expenses and location were dismissed unquestionably, whereas marketing was a bit of a struggle.
But, following procedure, marketing had to go.
A bit later, the instructors explained R-Squared and Adjusted R-Squared: the math and what they mean for the model. R-Squared expresses “goodness of fit” as a number. More precisely: the closer R-Squared is to 1, the better your model fits the data.
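I won’t go into detail here, but Adjusted R-Squared is actually the more suitable measure when you compare models with different numbers of independent variables, because plain R-Squared never goes down when you add a variable, useful or not.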
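For reference, these are the standard textbook definitions, written in my own notation (not necessarily the symbols the instructors used): n is the number of observations, p the number of independent variables (not counting the intercept), SS_res the residual sum of squares and SS_tot the total sum of squares.

```latex
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}
\qquad
R^2_{adj} = 1 - \left(1 - R^2\right)\,\frac{n - 1}{n - p - 1}
```

The factor (n - 1)/(n - p - 1) grows with p, so an extra variable only raises Adjusted R-Squared if it improves R-Squared by enough to offset that penalty.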
Looking at the Adjusted R-Squared numbers of our MLR models, the best model (with Adjusted R-Squared closest to 1) seemed to be the one including both R&D and marketing, even though marketing’s p-value was slightly above the SL (that we set ourselves, btw).
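To make that comparison concrete, here is a minimal sketch in Python with NumPy. It is not the course code: the data is randomly generated, the coefficients are invented for illustration, and `adjusted_r_squared` is a hypothetical helper name of mine. It fits an ordinary least squares model for a few sets of variables and collects R-Squared and Adjusted R-Squared for each.

```python
import numpy as np

def adjusted_r_squared(y, X):
    """Fit OLS with an intercept; return (R-Squared, Adjusted R-Squared)."""
    n, p = X.shape  # n observations, p independent variables
    A = np.column_stack([np.ones(n), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

# Made-up data: profit depends strongly on R&D, weakly on marketing,
# and not at all on admin. These numbers are assumptions, not the course data.
rng = np.random.default_rng(0)
n = 50
rd = rng.uniform(0, 100, n)
marketing = rng.uniform(0, 100, n)
admin = rng.uniform(0, 100, n)
profit = 50 + 0.8 * rd + 0.05 * marketing + rng.normal(0, 5, n)

candidates = {
    "R&D": np.column_stack([rd]),
    "R&D + marketing": np.column_stack([rd, marketing]),
    "R&D + marketing + admin": np.column_stack([rd, marketing, admin]),
}

results = {name: adjusted_r_squared(profit, X) for name, X in candidates.items()}
for name, (r2, adj) in results.items():
    print(f"{name:<26} R2={r2:.4f}  adjusted R2={adj:.4f}")
```

R-Squared can only stay the same or go up as variables are added, so the column to watch is the adjusted one: if it drops after adding a variable, that variable is not pulling its weight.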
Sign (has to be positive)
Magnitude (comparing magnitude in units of the independent variables)
Note: Of course the example data and the numbers are made up, but it’s still good to do some thinking and not just copy-paste.
Very happy about this. Let’s move on.