I know we are supposed to set the baseline to the level with the most observations instead of the first level alphanumerically. I am having trouble with the why; here is what I have found so far:
Question: Why do we relevel?
Answer: To produce more stable coefficient estimates and more accurate significance tests for the coefficients.
Question: But why does releveling make it more stable and more accurate?
Answer: A baseline with few observations could make a level’s effect on the response appear insignificant solely because of the baseline’s small sample size and potentially high variability.
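My rough attempt to put numbers on that answer, as a sketch only. It assumes a model with a single categorical predictor and equal variance in every group, in which case each dummy coefficient is (level mean minus baseline mean) and its standard error is sigma * sqrt(1/n_baseline + 1/n_level). The level names, group sizes, and sigma below are made up for illustration.

```python
import math

def coef_standard_error(sigma, n_baseline, n_level):
    """SE of a dummy-coded coefficient, i.e. of (level mean - baseline mean)."""
    return sigma * math.sqrt(1.0 / n_baseline + 1.0 / n_level)

def standard_errors(counts, baseline, sigma):
    """SE of every non-baseline level's coefficient for a chosen baseline."""
    return {
        level: coef_standard_error(sigma, counts[baseline], n)
        for level, n in counts.items()
        if level != baseline
    }

# Hypothetical three-level factor; "A" comes first alphabetically but is tiny.
counts = {"A": 10, "B": 1000, "C": 200}
sigma = 1.0  # assumed residual standard deviation

se_small_baseline = standard_errors(counts, "A", sigma)  # alphabetical default
se_large_baseline = standard_errors(counts, "B", sigma)  # most observations

for label, ses in [("baseline A (n=10)", se_small_baseline),
                   ("baseline B (n=1000)", se_large_baseline)]:
    print(label, {level: round(se, 3) for level, se in ses.items()})
```

If this is right, then with A as the baseline the 1/10 term sits in every coefficient's standard error, so even the well-populated level C gets a noisy comparison. With B as the baseline, only A's own coefficient stays noisy. As far as I can tell the fitted values are the same either way; only which comparisons the coefficients report changes.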
Question: What does the sample size of the baseline have to do with the level’s effect? Why does the order of the factor’s levels matter?
Can someone explain this in layman’s terms please?