  • Need help with Panel data and time series.

    Hi.
    I need help with the dataset I have for my thesis. I'm trying to see how broadband affects employment, and I have data for 5 years in 290 areas in a country. I have several x variables in percentage form, for example 5.5 (5.5%). I want to see whether there has been an effect on employment between 2009 and 2013. How do I do this? Do I need to make dummy variables for each year, or should I put in lags? Note that I am quite new to Stata.

    Below you can find part of my dataset: t is time, id is the area within the country, bredband is the share of the population (as a decimal fraction) with access to a connection of 100 Mbit/s or more, unemployed is the unemployment rate in the area (in percent), and forkom is the percentage of the area's population working within the municipality of the area.

    t id Bredband Educ unemployed forkom
    2009 1 0,493913 2287 5,5 23,5
    2010 1 0,429549 2368 6,8 23,4
    2011 1 0,416813 2483 5,3 22,8
    2012 1 0,426394 2635 5 23,2
    2013 1 0,47025 2732 5,5 23,5
    2009 2 0,51968 4394 5,7 30,4
    2010 2 0,184592 4554 7,4 29
    2011 2 0,178897 4699 5,6 28,3
    2012 2 0,224836 4864 5,5 28,3
    2013 2 0,410844 5022 6,3 28,4
    2009 3 0,242747 1405 4,7 25,8
    2010 3 0,271303 1432 7,9 25,4
    2011 3 0,286793 1477 7 24,9
    2012 3 0,348375 1553 7,8 25,3
    2013 3 0,401239 1578 8,9 25,5
    Thank you for your help! /teo

  • #2
    First of all, in Stata you'll have to declare your data to be panel data:
    Code:
    xtset id t
    Then you will have to use panel models, i.e. the estimation commands starting with -xt- in Stata (e.g. -xtreg- for linear regression).
    This way you won't need to add individual dummies, as you can specify the fixed-effects option (, fe).
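
As a minimal sketch of the two steps above, the 15 rows posted in #1 can be typed in with decimal points instead of decimal commas and run through -xtset- and -xtreg, fe-. Using unemployed as the outcome and bredband, educ and forkom as regressors is only an assumption for illustration; the thread does not fix the final model.

```stata
* Sketch only: the 15 rows from post #1, with decimal points instead of commas
clear
input t id bredband educ unemployed forkom
2009 1 0.493913 2287 5.5 23.5
2010 1 0.429549 2368 6.8 23.4
2011 1 0.416813 2483 5.3 22.8
2012 1 0.426394 2635 5.0 23.2
2013 1 0.470250 2732 5.5 23.5
2009 2 0.519680 4394 5.7 30.4
2010 2 0.184592 4554 7.4 29.0
2011 2 0.178897 4699 5.6 28.3
2012 2 0.224836 4864 5.5 28.3
2013 2 0.410844 5022 6.3 28.4
2009 3 0.242747 1405 4.7 25.8
2010 3 0.271303 1432 7.9 25.4
2011 3 0.286793 1477 7.0 24.9
2012 3 0.348375 1553 7.8 25.3
2013 3 0.401239 1578 8.9 25.5
end

* Declare the panel structure: id is the area, t is the year
xtset id t

* Fixed-effects (within) regression; the choice of variables is an assumption
xtreg unemployed bredband educ forkom, fe
```

With only three areas the estimates mean little; the point is the order of the commands.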

    There are a lot of resources on panel data on the web; look around for those to get a better idea of what model you could build with your data.

    Best
    Charlie

    • #3
      Thank you for the quick answer.
      I have run the command xtset id t. So when I run a regression, will the time variable be included? Will the regression show the effect over time or just for a certain year?

      Is there any way to show how much it affects employment each year?

      • #4
        Should I include the time variable t in my xtreg? Does that show how time affects my Y?
        Also, what is the main difference between using fixed effects (fe) and not using them? I guess fixed effects are the correct way in this regression?

        • #5
          Teodor:
          Charlie has already pointed you to the right track (please see -help xt- and the related entries in the Stata 13.1 .pdf manual; I would also recommend that you search the web for information about a decent panel data econometrics textbook).
          As far as your questions are concerned:
          - after you -xtset- your data, both the cross-sectional and the time-series dimensions are declared to Stata and used automatically by the -xt- commands. However, please note that you should run a panel data regression (a regression command with the xt prefix) and not a plain -regress-;
          - a panel data regression will take into account all the years that make up the time-series dimension;
          - provided that they are not omitted due to collinearity, you may add year dummies among the predictors (please see -help fvvarlist-; factor-variable notation will create the dummies for you) to investigate whether different years show any effect on unemployment.
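
The year-dummy suggestion in the last point could look roughly like the sketch below. The data are simulated placeholders, not the thesis data: 290 areas over 2009-2013, with variable names borrowed from post #1 and all values and effect sizes made up.

```stata
* Simulated placeholder panel (NOT the thesis data): 290 areas, 2009-2013
clear
set seed 2015
set obs 290
generate id = _n
* Area-specific effect, constant over time
generate u = rnormal()
expand 5
bysort id: generate t = 2008 + _n
generate bredband = runiform()
generate unemployed = 6 + u + rnormal()

xtset id t

* Area fixed effects plus year dummies via factor-variable notation (i.t)
xtreg unemployed bredband i.t, fe

* Joint test that all year dummies are zero
testparm i.t
```

Here 2009 is the base year, so each year coefficient is read as a shift relative to 2009.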
          Kind regards,
          Carlo
          (Stata 19.0)

          • #6
            Thank you very much. I really appreciate the answers, I think I'm on the right track now
            Cheers!

            • #7
              Well, I'll try to answer your questions, maybe not in the order you asked them, but I'll try to be clear.

              Using fixed effects will purge your estimates (e.g. the coefficient on broadband) of area-specific effects that are constant over time, so you get the "proper" within-area effect of x on the variation of y. The regression therefore shows the overall effect of x on y, not the effect for a specific year.
              There is no single correct way to run this regression, since a priori there is more than one way to look at it. It all depends on what you want to look at and on your assumptions, which depend very much on your subject, and I am not a labour market specialist.

              So fixed effects are great for addressing some issues, but on their own they don't report how time influences the dependent variable. Note that -xtreg, fe- absorbs only the panel-level (id) effects, so a variable that is constant within an area is perfectly collinear with the fixed effects and is omitted by Stata, whereas year dummies are not.

              If you want to see the influence of time on your Y, add -i.t- among the explanatory variables; you can do this without the , fe option (random effects) or with it. This incorporates time dummies into your model without requiring you to generate them all manually.

              However, I didn't exactly get what you are looking for with
              Is there any way to show how much it affects employment each year?
              If you want to see how, for each year, the variable x affects y, you'll have to either run separate regressions or use interaction terms; adding time dummies as you said won't help with that.
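
The interaction-term route mentioned above might be sketched as follows. This is a hedged illustration on simulated placeholder data (not the thesis data), with variable names borrowed from post #1 and made-up values. It lets the slope of bredband differ by year within a single panel model.

```stata
* Simulated placeholder panel (NOT the thesis data): 290 areas, 2009-2013
clear
set seed 2015
set obs 290
generate id = _n
* Area-specific effect, constant over time
generate u = rnormal()
expand 5
bysort id: generate t = 2008 + _n
generate bredband = runiform()
generate unemployed = 6 + u + rnormal()

xtset id t

* Main effect of bredband (the 2009 slope), year dummies,
* and year-specific deviations from the 2009 slope
xtreg unemployed c.bredband##i.t, fe

* Do the slopes differ across years? Joint test of the interaction terms
testparm i.t#c.bredband

* Slope of bredband in 2013 = base slope + 2013 deviation
lincom bredband + 2013.t#c.bredband
```

The same -lincom- line with 2010, 2011 or 2012 in place of 2013 gives the slope for the other years.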

              I hope this is clearer now.
              For the precise differences between random-effects and fixed-effects models, please take a look (among other resources) at Richard Williams's notes: https://www3.nd.edu/~rwilliam/stats3/Panel04-FixedVsRandom.pdf

              • #8
                Originally posted by Charlie Joyez

                If you want to see the time influence on your Y, don't put the , fe option, but add -i.t- among the explanatory variables. This will incorporate time dummies into your model, without requiring you to generate them all manually.
                What do you mean by adding -i.t-? Do I need to generate the variable? I can't find any help in Stata. Note that I'm using Stata 11.2.
                Last edited by Teodor Bostrom; 28 Apr 2015, 07:34. Reason: Edit. Found out how to do it. I was not aware that you could add i. before a variable without generating it. Thank you for all the advice :)

                • #9
                  Teodor:
                  if you run separate regressions to investigate the effect of broadband on employment for each year, you act as if the observations were independent. This is not true, because you have a panel dataset and observations within each panel are not independent.
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  • #10
                    xtreg forprivp bredband educ tatort i.t

                    Random-effects GLS regression                   Number of obs      =      1450
                    Group variable: stad                            Number of groups   =       290

                    R-sq:  within  = 0.5764                         Obs per group: min =         5
                           between = 0.0521                                        avg =       5.0
                           overall = 0.0651                                        max =         5

                                                                    Wald chi2(7)       =   1578.19
                    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                    ------------------------------------------------------------------------------
                        forprivp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                        bredband |  -.0001529   .0021227    -0.07   0.943    -.0043132    .0040075
                            educ |  -1.17e-07   1.56e-07    -0.75   0.455    -4.23e-07    1.90e-07
                          tatort |   .0009178   .0001964     4.67   0.000     .0005328    .0013028
                                 |
                               t |
                           2010  |   .0072583   .0006078    11.94   0.000      .006067    .0084496
                           2011  |   .0172831     .00057    30.32   0.000      .016166    .0184002
                           2012  |   .0157713   .0005512    28.61   0.000     .0146909    .0168518
                           2013  |   .0140587   .0005419    25.94   0.000     .0129965    .0151209
                                 |
                           _cons |   .5908645   .0147106    40.17   0.000     .5620323    .6196967
                    -------------+----------------------------------------------------------------
                         sigma_u |  .05247162
                         sigma_e |  .00632932
                             rho |  .98565857   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------

                    This is what I did. I assume that bredband is not significant. Bummer...

                    • #11
                      Teodor:
                      via -xtsum- and -xttab- you can get a clearer picture of what happened with your -xtreg, re- specification.
                      I would also run -xtreg, re- with cluster-robust standard errors (SEs), i.e. -vce(cluster clustvar)- with your panel identifier as clustvar, to investigate potential differences from the default SEs.
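
A rough sketch of these checks is below. It runs on simulated placeholder data (not the thesis data); id stands in for the panel identifier (stad in post #10), and the variable names and values are assumptions for illustration only.

```stata
* Simulated placeholder panel (NOT the thesis data): 290 areas, 2009-2013
clear
set seed 2015
set obs 290
generate id = _n
* Area-specific effect, constant over time
generate u = rnormal()
expand 5
bysort id: generate t = 2008 + _n
generate bredband = runiform()
generate unemployed = 6 + u + rnormal()

xtset id t

* Split the variation of each variable into between-area and within-area parts
xtsum unemployed bredband

* Random effects with default standard errors
xtreg unemployed bredband i.t, re
estimates store re_default

* Same model with standard errors clustered on the panel identifier
xtreg unemployed bredband i.t, re vce(cluster id)
estimates store re_cluster

* Coefficients are identical; compare the standard errors side by side
estimates table re_default re_cluster, se
```

A small within-area standard deviation for bredband in the -xtsum- output would be one possible reason for an imprecise coefficient.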
                      Kind regards,
                      Carlo
                      (Stata 19.0)

                      • #12
                        Hello, good evening all. I am sorry for entering the discussion in this manner. I am a Master's student working on my dissertation. I am badly stuck on panel data analysis, and my mentor is not able to guide me well on this. So it would be really helpful if someone could help me with the following:

                        I have a strongly balanced panel, which I simply run in Stata using random effects. The literature behind my model makes a strong case for the random-effects model. What will happen if I add dummies (both on their own and interacted with the regressors) to the basic model, as we do in LSDV, but estimate it with the random-effects approach?

                        1. For example, my basic model's command is
                        xtreg y x1 x2 x3, re
                        2. With dummies
                        xtreg y x1 x2 x3 d1 d2 x1d1 x1d2 x2d1 x2d2 x3d1 x3d2, re

                        I am new to this, so my question may be very naive, but I am in need of help.
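
For reference, the second command can also be written with factor-variable notation, so that the products do not have to be generated by hand. The sketch below assumes that d1 and d2 are 0/1 indicators and that x1-x3 are continuous; it runs on simulated placeholder data, since the original dataset is not shown.

```stata
* Simulated placeholder panel: 100 units, 5 periods (all values made up)
clear
set seed 2015
set obs 100
generate id = _n
* Unit-specific effect, constant over time
generate u = rnormal()
expand 5
bysort id: generate t = _n
foreach v in x1 x2 x3 {
    generate `v' = rnormal()
}
* Assumed 0/1 indicators
generate d1 = runiform() < 0.5
generate d2 = runiform() < 0.5
generate y = 1 + x1 + u + rnormal()

xtset id t

* Main effects plus every x-by-dummy interaction, estimated by random effects
xtreg y c.x1##i.d1 c.x2##i.d1 c.x3##i.d1 c.x1##i.d2 c.x2##i.d2 c.x3##i.d2, re
```

The ## operator adds the main effects together with the interaction, and terms that appear more than once are kept only once.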

                        Thanks
                        Regards
                        Geetanjali
