I am planning to estimate an OLS regression model to gauge the relationship between various sociodemographic (Census) features and political data at the census tract level. As an example, the model will regress voter turnout on education level, income, age composition, and racial composition. Both the dependent and predictor variables are continuous. The model will include data from several cities, and I would like to estimate city-level differences to see whether the relationships between variables differ across cities. I gather that the best approach is to estimate a single regression model and include dummies for the cities.
The problem is that the sample size for each city varies very widely (n = 200 for the largest city, but only n = 20 for the smallest).
I have 2 questions:
- Would estimating city-level differences be impossible with the disparity in subsample sizes?
- If so, I could use block groups instead of census tracts. This would increase the sample sizes (n = 800 for the largest city, n = 100 for the smallest). Would this still be problematic due to the disparity between the two?
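For concreteness, here is a minimal sketch of the single-model setup described above, using two cities with the tract counts from my question (200 and 20). Everything else is an assumption made for illustration: the data are simulated, the city labels "A" and "B", the single `education` predictor, and all coefficient values are invented. As I understand it, city dummies on their own only shift the intercept, so the sketch also includes a dummy-by-predictor interaction to let the slope differ by city.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: tract counts mirror the question (200 and 20);
# city labels, the single predictor, and all coefficients are invented.
sizes = {"A": 200, "B": 20}
city = np.concatenate([np.full(n, name) for name, n in sizes.items()])
education = rng.normal(size=city.size)
is_b = (city == "B").astype(float)  # dummy for the small city

# Simulated outcome: slope 0.5 in city A, 0.8 in city B, plus noise.
turnout = (1.0 + 0.5 * education + 0.2 * is_b
           + 0.3 * is_b * education
           + rng.normal(scale=0.5, size=city.size))

# Design matrix: intercept, predictor, city dummy, dummy x predictor.
X = np.column_stack([np.ones(city.size), education, is_b, is_b * education])
beta, *_ = np.linalg.lstsq(X, turnout, rcond=None)

# Classical OLS standard errors: sqrt(diag(sigma^2 (X'X)^-1)).
resid = turnout - X @ beta
dof = X.shape[0] - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

names = ["intercept", "education", "city_B", "city_B:education"]
for name, b, s in zip(names, beta, se):
    print(f"{name:>18}: {b: .3f} (SE {s:.3f})")
```

In this setup the `city_B:education` coefficient is the between-city difference in slopes. Its standard error depends on both cities' subsamples, so it is always larger than the standard error of the `education` slope for the large reference city, which is the imprecision I am worried about.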