  • Regression on whether change in a variable is affected by another variable over time

    I have data sets for 6 years that contain two categorical variables: race and homeownership. I want to estimate whether changes in homeownership over time are affected by race.
    My dependent variable is homeownership: you either own a home or you don't.
    My independent variable is race: White, Black, Asian.
    I want to estimate the effect race has on the change in homeownership over time. (What I'm trying to see is whether minorities became renters at a faster rate than the majority during the 2008 recession.)
    Any help would be very much appreciated.


  • #2
    Hi Richard,

    What is the question? Are you unsure which kind of regression to use? If you estimate a linear model by OLS, you are estimating a linear probability model. Given that the dependent variable is binary, you might prefer a model that accounts for this by bounding the predicted probability between 0 and 1, such as a logit or probit model. These can be estimated in Stata using logit or probit, but if you have never used either then I suggest doing some reading on them first.
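
    As a rough sketch of the three options, assuming a pooled data set with the hypothetical variables owner (1 = owns, 0 = rents) and minority (1 = minority, 0 = majority); neither name comes from the thread:

    ```stata
    * Hypothetical variable names: owner (0/1), minority (0/1)

    * Linear probability model by OLS; robust SEs because a binary
    * outcome makes the errors heteroskedastic
    regress owner i.minority, vce(robust)

    * Logit and probit keep predicted probabilities inside (0, 1)
    logit  owner i.minority
    probit owner i.minority

    * Average marginal effect (here after probit), on the same
    * probability scale as the OLS coefficient
    margins, dydx(minority)
    ```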

    Sorry if your question was something else, it might be useful to clarify further.

    Best,
    Rhys

    • #3
      Yeah, that is basically what I was asking. I have already used logistic regression for the same research, but then I had only one independent variable, and it was binary as well (you're either a minority or not). I wanted to know whether you can do the same when you have two independent variables that can each take on multiple values/categories, not just two.

      • #4
        Yes, I would suggest a logit (or probit) model again here. Your dependent variable seems to be binary (you either own a house or not), so that is fine. As for the independent variables, it doesn't matter if they take on multiple values/categories; just make sure you enter each one as a factor variable, "i.var", so that Stata knows it is discrete and not continuous.
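
        A minimal sketch of that, assuming numeric categorical variables with the hypothetical names ethnicity and region and a 0/1 outcome called owner:

        ```stata
        * Hypothetical names; ethnicity and region must be numeric
        * (use encode first if they are stored as strings)
        logit owner i.ethnicity i.region

        * Same model, reported as odds ratios
        logit owner i.ethnicity i.region, or

        * Joint test that the odds of owning are equal across ethnicities
        testparm i.ethnicity
        ```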

        Best,
        Rhys

        • #5
          Also, it is time series data, as I have data for 7 years. How should I go about running a regression on all of them? Should I create a new variable in each data set that gives the year of the observation, and include that variable in the regression as well?

          • #6
            Are these individual observations on people and their purchase within a given country? Do you know which state/region/county they're purchasing in?

            If you're doing a pooled estimation then you can either include time as a simple trend, by adding "year" as a continuous control variable, or include time fixed effects (a dummy for each year) if you think that your mechanism might differ across years.
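
            The two options could look roughly like this, assuming hypothetical variables owner (0/1), ethnicity and year in a single pooled data set:

            ```stata
            * Hypothetical names: owner (0/1), ethnicity, year (2006, 2007, ...)

            * Linear trend: one coefficient, the same shift in log-odds every year
            logit owner i.ethnicity c.year
            estimates store trend

            * Year fixed effects: a separate intercept shift for each year
            logit owner i.ethnicity i.year
            estimates store yearfe

            * The trend model is nested in the fixed-effects model,
            * so a likelihood-ratio test can compare the two
            lrtest trend yearfe
            ```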

            Best,
            Rhys

            • #7
              Yes, I have the ethnicity and location of each individual and whether or not they own their home. So should I just add a new variable, call it YEAR, with values 1-6, so that, say, all the data from 2006 get YEAR = 1 and all observations from 2011 get YEAR = 6, and then use it as a control variable, with homeownership as the binary dependent variable?

              • #8
                Yep, that sounds fine. You could even just let Year = 2006, 2007, 2008, etc., without needing to recode to 1-6.
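
                One way to build the pooled file, sketched under the assumption that each year sits in its own file named survey2006.dta through survey2011.dta (hypothetical file names) with identical variable names:

                ```stata
                * Hypothetical file names survey2006.dta ... survey2011.dta
                clear
                tempfile pooled
                save `pooled', emptyok          // start from an empty data set

                forvalues y = 2006/2011 {
                    use "survey`y'.dta", clear
                    generate int year = `y'     // tag each observation with its survey year
                    append using `pooled'
                    save `pooled', replace
                }

                use `pooled', clear
                tabulate year                   // check the observation count per year
                ```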

                • #9
                  Also, would it be more meaningful to create a new dummy variable for each ethnicity, so that, for example, a white person would have WHITE = 1, BLACK = 0, ASIAN = 0, OTHER = 0? And the same for their region?

                  • #10
                    Not at all. If you use Stata's factor variable notation "i.ethnicity", Stata effectively creates the dummy variables for you.
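
                    To illustrate the equivalence, here is a small sketch that assumes a hypothetical variable ethnicity coded 1 to 4 and a 0/1 outcome owner; the first two logit commands should give the same fit:

                    ```stata
                    * Hypothetical names: owner (0/1), ethnicity coded 1, 2, 3, 4

                    * Factor-variable version: Stata omits the lowest level as the base
                    logit owner i.ethnicity

                    * Manual version: tabulate creates eth_1 ... eth_4; leave one out as the base
                    tabulate ethnicity, generate(eth_)
                    logit owner eth_2 eth_3 eth_4

                    * Factor notation also lets you pick the base category, here level 2
                    logit owner ib2.ethnicity
                    ```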

                    • #11
                      Also, suppose I make dummy variables that act as interaction variables, so let's say the interaction variable POSTWHITE equals 1 only if the observed individual is white and the observation happened after 2008. Do the resulting coefficients mean anything? I would like to assume that if I did the same for all ethnicities, I could look at the size of each coefficient and determine whether some ethnicities were affected more by 2008. But that just doesn't sound logical... or is it?

                      • #12
                        I think you need to think more carefully about what econometric methodology you want to apply that is best suited to your question of interest.

                        Your suggestion of POSTWHITE sounds like you want some kind of difference-in-differences model with differences across ethnicities? If you set the model up appropriately then yes, this would be possible.
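
                        One possible setup, as a sketch only: it assumes hypothetical variables owner (0/1), ethnicity, region and year, and treats "after 2008" as year > 2008. Whether this is the right identification strategy depends on the research question.

                        ```stata
                        * Hypothetical names: owner (0/1), ethnicity, region, year
                        generate byte post08 = year > 2008 if !missing(year)

                        * ## adds both main effects and the ethnicity-by-period interaction
                        logit owner i.ethnicity##i.post08 i.region, or

                        * Joint test: did the post-2008 change differ across ethnicities?
                        testparm i.ethnicity#i.post08

                        * Change in the predicted probability of owning after 2008, by ethnicity
                        margins ethnicity, dydx(post08)
                        ```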

                        • #13
                          Logistic regression                             Number of obs     =  1,422,167
                                                                          LR chi2(32)       =   36110.05
                                                                          Prob > chi2       =     0.0000
                          Log likelihood = -940775.25                     Pseudo R2         =     0.0188

                          ------------------------------------------------------------------------------
                                  TEN1 | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                               1.WHITE |   2.453621   .0366737    60.05   0.000     2.382784    2.526563
                               1.POS08 |   1.118408   .0268547     4.66   0.000     1.066993    1.172301
                                       |
                           WHITE#POS08 |
                                  1 1  |   .8999048   .0193903    -4.89   0.000     .8626918     .938723
                                       |
                               1.BLACK |    .881083   .0192605    -5.79   0.000     .8441304    .9196531
                                       |
                           BLACK#POS08 |
                                  1 1  |   .8385081   .0264014    -5.59   0.000     .7883266     .891884
                                       |
                               1.ASIAN |    2.30365   .0418028    45.99   0.000     2.223158    2.387057
                                       |
                           ASIAN#POS08 |
                                  1 1  |   .9322237   .0240025    -2.73   0.006     .8863469    .9804751
                                       |
                                  Year |   .9486771   .0020399   -24.50   0.000     .9446874    .9526836
                                 POS08 |          1  (omitted)
                                1.NORE |   .7671352   .0148891   -13.66   0.000     .7385011    .7968795
                                       |
                            NORE#POS08 |
                                  1 1  |   .9722931   .0158421    -1.72   0.085     .9417336    1.003844
                                       |
                                1.NORW |   .9388538   .0175828    -3.37   0.001      .905017    .9739557
                                       |
                            NORW#POS08 |
                                  1 1  |   .9436468   .0136144    -4.02   0.000     .9173368    .9707115
                                       |
                                1.MERS |   .7622664   .0168705   -12.27   0.000     .7299077    .7960596
                                       |
                            MERS#POS08 |
                                  1 1  |   1.054149   .0236808     2.35   0.019     1.008742      1.1016
                                       |
                                1.YOHU |   .8699577   .0165543    -7.32   0.000     .8381094    .9030162
                                       |
                            YOHU#POS08 |
                                  1 1  |   .9824224   .0150215    -1.16   0.246     .9534176    1.012309
                                       |
                                1.EASM |   .9351591   .0187676    -3.34   0.001     .8990893    .9726759
                                       |
                            EASM#POS08 |
                                  1 1  |   .9272235   .0166278    -4.21   0.000     .8951998    .9603928
                                       |
                                1.WESM |   .9032103   .0173062    -5.31   0.000     .8699198    .9377748
                                       |
                            WESM#POS08 |
                                  1 1  |   .8954941   .0139077    -7.11   0.000     .8686462    .9231717
                                       |
                                1.EAST |   .9574156   .0186856    -2.23   0.026     .9214841    .9947481
                                       |
                            EAST#POS08 |
                                  1 1  |     1.0158   .0168379     0.95   0.344     .9833286    1.049343
                                       |
                                1.LOND |   .5142871   .0097574   -35.05   0.000     .4955142    .5337712
                                       |
                            LOND#POS08 |
                                  1 1  |    .943466   .0142507    -3.85   0.000     .9159446    .9718144
                                       |
                                1.SOUE |   1.021406   .0189112     1.14   0.253     .9850051    1.059152
                                       |
                            SOUE#POS08 |
                                  1 1  |   .9518513   .0133686    -3.51   0.000     .9260066    .9784173
                                       |
                                1.SOUW |   .9391315   .0181032    -3.26   0.001     .9043117    .9752919
                                       |
                            SOUW#POS08 |
                                  1 1  |    .977764   .0155584    -1.41   0.158     .9477407    1.008738
                                       |
                                1.WALE |   .8722455   .0161878    -7.36   0.000     .8410882     .904557
                                       |
                            WALE#POS08 |
                                  1 1  |   1.002142   .0141358     0.15   0.879     .9748155    1.030234
                                       |
                                  SCOT |   .8737091   .0158135    -7.46   0.000     .8432585    .9052594
                                1.NORI |          1  (omitted)
                                       |
                            NORI#POS08 |
                                  1 1  |   1.011829   .0267934     0.44   0.657     .9606546     1.06573
                                       |
                                 _cons |   .8986184   .0206509    -4.65   0.000     .8590414    .9400188
                          ------------------------------------------------------------------------------
                          Note: _cons estimates baseline odds.

                          • #14
                            I don't know how clear the output is, but can you explain what the odds ratio of .8999048 means on the interaction term between an individual being white and the observation occurring after 2008, when the dependent variable is home owner = 1 or renting = 0?

                            • #15
                              What was the regression command used here? The odds ratio on this interaction is below 1, so it says that the post-2008 change in the odds of being a home-owner is smaller for a white individual than for the counterfactual group. I can't see clearly from the output presented what that counterfactual is.
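
                              For reference, a worked reading of that interaction term. It assumes the model contains WHITE, POS08 and their product (as the posted output suggests), with every other regressor, including Year, held fixed at its base level. The numbers are the reported estimates; the product is plain arithmetic.

                              ```latex
                              \begin{align*}
                              \text{odds}(\text{own}) &= \exp\!\bigl(\beta_0 + \beta_W\,\mathrm{WHITE} + \beta_P\,\mathrm{POS08}
                                  + \beta_{WP}\,\mathrm{WHITE}\times\mathrm{POS08} + \dots\bigr) \\
                              \frac{\text{odds}(\mathrm{WHITE}=0,\ \mathrm{POS08}=1)}{\text{odds}(\mathrm{WHITE}=0,\ \mathrm{POS08}=0)}
                                  &= e^{\beta_P} = 1.118408 \\
                              \frac{\text{odds}(\mathrm{WHITE}=1,\ \mathrm{POS08}=1)}{\text{odds}(\mathrm{WHITE}=1,\ \mathrm{POS08}=0)}
                                  &= e^{\beta_P}\,e^{\beta_{WP}} = 1.118408 \times 0.8999048 \approx 1.006
                              \end{align*}
                              ```

                              So .8999048 is a ratio of odds ratios: the post-2008 change in the odds of owning is about 10% smaller for white respondents than for the omitted group. It is not itself the odds of a white respondent owning a home after 2008.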
