Can someone explain why I can't code dummy variables this way?
I understand that if you have 4 groups, you need only 3 dummy variables; otherwise you will have multicollinearity. You would normally code the dummies like this:
E.g.:
3 groups: A, B, C
D(1) = 1 for A and 0 for B and C
D(2) = 1 for B and 0 for A and C
D(3) = 1 for C and 0 for A and B
But what if I could code it this way?
D(1) = 0 AND D(2) = 0 for A
D(1) = 0 AND D(2) = 1 for B
D(1) = 1 AND D(2) = 0 for C
This way I use only 2 variables to represent 3 groups, so I gain degrees of freedom, right?
But I haven't come across any good explanation for why I should not code the variables this way.
This is fine - if you don't really want to include A.
The problem with this is that you can't distinguish between A and the complement of (B or C): A has no coefficient of its own. You could tweak the numbers a little to get around this, but fundamentally, if you only have m dummy variables for n > m groups, you can only recover some set of m linear combinations of the n group effects; you can't resolve these uniquely into n separate effects. So if you have n groups of interest, you really do need n dummy variables (or, equivalently, n - 1 dummies plus an intercept, with the omitted group acting as the baseline).
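For anyone who wants to check the counting argument numerically, here is a rough sketch in plain Python. It is not from either answer above. It assumes the model includes an intercept, and the `rank` helper and the six-row toy sample are made up purely for illustration. It builds the design matrix for the two-dummy coding from the question and for one dummy per group, then compares the rank of each with its number of columns.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions (illustrative helper)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Hypothetical toy sample: two observations per group.
groups = ["A", "B", "C", "A", "B", "C"]

# Coding from the question: (D1, D2) with A as the all-zeros group.
two_dummies = {"A": (0, 0), "B": (0, 1), "C": (1, 0)}
# One dummy per group.
three_dummies = {"A": (1, 0, 0), "B": (0, 1, 0), "C": (0, 0, 1)}

# Design matrices with a leading intercept column of ones (an assumption).
X_two = [(1, *two_dummies[g]) for g in groups]
X_three = [(1, *three_dummies[g]) for g in groups]

rank_two = rank(X_two)      # compare with the 3 columns of X_two
rank_three = rank(X_three)  # compare with the 4 columns of X_three

print(rank_two, len(X_two[0]))
print(rank_three, len(X_three[0]))
```

If the intercept is there, the two-dummy matrix has as many independent columns as it has columns, so the three group means can be read off as b0 for A, b0 + b2 for B and b0 + b1 for C. The three-dummy matrix has one more column than its rank, because the dummies sum to the intercept column. Drop the intercept and the picture flips: three dummies are then needed, which is the situation the second answer describes.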
Chapter 14: Mult. Reg: Dummy variables and Interaction