  • Regression Analysis with year dummy

    Hello. I am very new to Stata and I would like to ask a very simple question. I have annual data from 1930 to 1960, and I want to know how my variables X1, X2, X3, X4, X5 influence my Y variable in 3 different periods (1930-1940, 1940-1950, 1950-1960) by using year dummies. I don't want to analyze the periods separately (that is, first 1930-1940, then 1940-1950, and so on). I want to use year dummies in a single regression and see how the results differ across the 3 periods. What is the Stata command for doing this? Thanks very much. I attached the data for better understanding.

    1. How do my X1, X2, X3, X4, X5 variables influence Y during 1930-1940?
    2. How do my X1, X2, X3, X4, X5 variables influence Y during 1940-1950?
    3. How do my X1, X2, X3, X4, X5 variables influence Y during 1950-1960?
    Last edited by Zol Jargal; 28 Mar 2021, 23:41.

  • #2
    Zol:
    unfortunately, the data excerpt you posted is not helpful in its current format.
    Please use -dataex-. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      input int year float(ecogrowth population employment trade capital stock)
      1930 4.6 3.4 5.6 4.6 6.3 5.6
      1931 4 4.6 5.5 4 6.7 5.5
      1932 4 5 5.5 4 6.7 5.5
      1933 4 5 5.8 4 6.5 5.8
      1934 4 5 6.1 4 6.9 6.1
      1935 4 5 6.4 4 6.5 6.4
      1936 4.7 5 5.2 4.7 6.5 5.2
      1937 4.7 5.4 5.5 4.7 6.8 5.5
      1938 4.8 5.6 5.7 4.8 5.7 5.7
      1939 5.2 5.5 6.1 5.2 6.1 6.1
      1940 5.5 5.5 6.5 5.5 6.5 6.5
      1941 5.7 5.8 7.5 5.8 6.3 7.5
      1942 6.1 6.1 7.1 6.1 6.7 7.1
      1943 6.5 6.4 6.8 6.4 6.7 6.8
      1944 7.5 6.3 7.4 6.3 6.5 7.4
      1945 7.1 6.7 6.7 6.7 6.9 6.7
      1946 6.8 6.7 7.2 6.7 6.5 7.2
      1947 7.4 6.5 7.5 6.5 6.5 7.5
      1948 6.7 6.9 6.5 6.8 6.8 6.5
      1949 7.2 6.5 6.5 5.7 5.7 6.5
      1950 7.5 6.5 6.8 6.1 6.1 6.8
      1951 7.6 6.8 5.7 6.5 6.5 5.7
      1952 7.6 5.7 6.1 7.5 6.3 6.1
      1953 7.6 6.1 6.5 7.1 6.7 6.5
      1954 7.6 6.5 7.5 6.8 6.7 7.5
      1955 7.6 7.5 7.1 7.6 6.5 7.1
      1956 7.6 7.1 6.8 7.6 6.9 6.8
      1957 7.6 6.8 7.4 7.6 6.5 7.4
      1958 7.6 7.4 7.4 7.6 6.5 7.4
      1959 7.6 7.4 6.8 7.6 6.8 7.4
      1960 7.6 7.6 7.4 7.6 5.7 7.4
      end



      • #4
        Zol:
        setting aside any comment about your model specification, you may be interested in something along the following lines:
        Code:
        . gen period_dummy=0 if year<=1940
        (20 missing values generated)
        
        . replace period_dummy=1 if year >1940 & year<=1950
        (10 real changes made)
        
        . replace period_dummy=2 if year >1950 & year<=1960
        (10 real changes made)
        
        . reg ecogrowth population employment trade capital stock i.period_dummy
        
              Source |       SS           df       MS      Number of obs   =        31
        -------------+----------------------------------   F(7, 23)        =     52.18
               Model |  58.1350772         7  8.30501102   Prob > F        =    0.0000
            Residual |  3.66040548        23  .159148064   R-squared       =    0.9408
        -------------+----------------------------------   Adj R-squared   =    0.9227
               Total |  61.7954826        30  2.05984942   Root MSE        =    .39893
        
        ------------------------------------------------------------------------------
           ecogrowth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
          population |   .0738482   .1576754     0.47   0.644    -.2523281    .4000246
          employment |   .0169794   .7342494     0.02   0.982    -1.501931     1.53589
               trade |   .5855483   .1857799     3.15   0.004     .2012333    .9698632
             capital |  -.3751873   .2232571    -1.68   0.106    -.8370298    .0866552
               stock |  -.1575674   .7290106    -0.22   0.831    -1.665641    1.350506
                     |
        period_dummy |
                  1  |    1.35022   .3763539     3.59   0.002      .571673    2.128768
                  2  |   1.464204   .5148225     2.84   0.009      .399212    2.529195
                     |
               _cons |   4.740965   1.905256     2.49   0.021     .7996435    8.682286
        ------------------------------------------------------------------------------
        Kind regards,
        Carlo
        (Stata 19.0)
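
        The three steps that build the period variable can also be collapsed into a single call. What follows is a minimal sketch, assuming the same cut points as in the code above (1940 and 1950) and the data from #3 in memory. -irecode()- and -testparm- are standard Stata; the variable name -period- and the value label -periodlbl- are hypothetical and not part of the original reply.

        ```stata
        * irecode(year, 1940, 1950) returns 0 if year<=1940,
        * 1 if 1940<year<=1950, and 2 if year>1950
        gen byte period = irecode(year, 1940, 1950)

        * hypothetical value labels, so the output shows year ranges instead of 0/1/2
        label define periodlbl 0 "1930-1940" 1 "1941-1950" 2 "1951-1960"
        label values period periodlbl

        * check the group sizes (11, 10 and 10 observations for the data in #3)
        tabulate period

        * same model as above, followed by a joint test of the two period coefficients
        regress ecogrowth population employment trade capital stock i.period
        testparm i.period
        ```

        A year equal to a cut point falls into the earlier period, so 1940 and 1950 are each counted once even though the periods in the original question overlap.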



        • #5
          Thanks for your help, Carlo. I really appreciate it. I have one question. Taking the employment variable as an example, how can I tell whether it has a positive or negative effect in each of the 3 periods? The result above shows a p-value of 0.982. Does that cover all years or a specific period? Thank you again.



          • #6
            Zol:
            the p-value of 0.982 (even though, like many others on this forum, I prefer taking a look at the 95% CIs) is telling you that, when adjusted for the remaining predictors (including -time-, treated as a categorical variable), there is no evidence that the coefficient of -employment- differs from 0. That coefficient refers to all years, not to any specific period.
            What I get from the OLS run on your data excerpt is the following:
            - you likely have a near-extreme multicollinearity problem (a very high R-sq together with many non-significant coefficients; see -estat vce, corr-);
            - variations in the regressand seem to be explained by time (other things being equal).
            Last edited by Carlo Lazzaro; 29 Mar 2021, 08:15.
            Kind regards,
            Carlo
            (Stata 19.0)
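
            The multicollinearity concern can be checked directly after the regression from #4. This is a minimal sketch, assuming the data from #3 are in memory and -period_dummy- has been created as in #4. -estat vif- and -correlate- are standard Stata commands, but which diagnostic to rely on is a judgment call.

            ```stata
            regress ecogrowth population employment trade capital stock i.period_dummy

            * correlation matrix of the coefficient estimates (the command mentioned above)
            estat vce, corr

            * variance inflation factors; large values flag predictors that are
            * nearly linear combinations of the other predictors
            estat vif

            * pairwise correlation between two predictors that look almost identical in #3
            correlate employment stock
            ```

            In the data listed in #3, -employment- and -stock- take the same value in every year except 1959, which is consistent with the large standard errors on both coefficients.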



            • #7
              Carlo, thanks for the useful information, and my apologies for asking again. If possible, could you tell me how to get an independent variable's p-value in each of the 3 periods? For example, for employment, what is the p-value in period_dummy 1 and period_dummy 2? Thanks again.



              • #8
                Zol:
                if you're looking for that information, you should interact -employment- (or whichever predictor interests you) with -time- (here, -period_dummy-).
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Thanks for your reply. I really appreciate it. Could you tell me the Stata command for interacting employment with time (period_dummy), as described above? For example, if I run the regression with GDP and employment over the 3 periods, what would the code be? Is the following correct? reg ecogrowth employment##period_dummy1 employment##dummy2 employment##dummy3



                  • #10
                    Zol:
                    the code should be:
                    Code:
                    reg ecogrowth population employment##i.period_dummy trade capital stock
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Thanks for your kind reply. I really appreciate it~



                      • #12
                        Carlo, I ran the command and it does not work. I have attached a screenshot. Could you please advise how to solve this problem? Thanks in advance.


                        • #13
                          Zol:
                          now it should work:
                          Code:
                          . reg ecogrowth population c.employment##i.period_dummy trade capital stock
                          
                                Source |       SS           df       MS      Number of obs   =        31
                          -------------+----------------------------------   F(9, 21)        =     38.41
                                 Model |  58.2561959         9  6.47291066   Prob > F        =    0.0000
                              Residual |  3.53928669        21  .168537462   R-squared       =    0.9427
                          -------------+----------------------------------   Adj R-squared   =    0.9182
                                 Total |  61.7954826        30  2.05984942   Root MSE        =    .41053

                          -------------------------------------------------------------------------------------------
                                          ecogrowth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          --------------------------+----------------------------------------------------------------
                                         population |    .079021   .1721732     0.46   0.651    -.2790327    .4370747
                                         employment |   .2497761     .80499     0.31   0.759    -1.424292    1.923844
                                                    |
                                       period_dummy |
                                                 1  |   2.583021   3.312847     0.78   0.444    -4.306421    9.472463
                                                 2  |   3.476106    2.43173     1.43   0.168    -1.580954    8.533166
                                                    |
                          period_dummy#c.employment |
                                                 1  |  -.2133548   .4944666    -0.43   0.671    -1.241654    .8149448
                                                 2  |  -.3266391   .3853113    -0.85   0.406    -1.127938    .4746597
                                                    |
                                              trade |   .5838241   .1915101     3.05   0.006     .1855571    .9820911
                                            capital |  -.3770207   .2299296    -1.64   0.116    -.8551855    .1011441
                                              stock |  -.1751086    .753634    -0.23   0.819    -1.742376    1.392159
                                              _cons |   3.484288   2.483919     1.40   0.175    -1.681305    8.649881
                          -------------------------------------------------------------------------------------------
                          
                          .
                          Kind regards,
                          Carlo
                          (Stata 19.0)
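
                          On why the -c.- prefix makes the difference: inside an interaction, Stata treats a variable without a prefix as categorical, and -employment- takes non-integer values, so it cannot serve as a factor variable. The sketch below shows two equivalent ways of writing the model. It reflects my reading of Stata's factor-variable notation and is not part of the original reply.

                          ```stata
                          * ## expands to both main effects plus their interaction
                          regress ecogrowth population c.employment##i.period_dummy trade capital stock

                          * the same model written out term by term
                          * (run from a do-file, where /// continues the line)
                          regress ecogrowth population c.employment i.period_dummy ///
                              i.period_dummy#c.employment trade capital stock
                          ```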



                          • #14
                            Thank you very much ~



                            • #15
                              Carlo, I have one more question. From the result, what is the p-value of 'employment'? As I read it, the p-value of employment during period 1 (1940-1950) is 0.671 and during period 2 (1950-1960) is 0.406. Is that right? If so, what is the p-value of employment during period 0 (1930-1940)? My apologies for asking again. Your answer will be highly appreciated. Thank you. I attached the result.
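
                              One way to approach this question: in the output of #13, the -employment- row (p = 0.759) is the slope of employment in the base period 0. The two interaction rows (p = 0.671 and p = 0.406) test whether the slope in period 1 or period 2 differs from the period-0 slope, so they are not the p-values of the period-specific slopes themselves. Here is a hedged sketch of how those slopes and their p-values could be obtained, assuming the data from #3 and -period_dummy- from #4 are in memory:

                              ```stata
                              regress ecogrowth population c.employment##i.period_dummy trade capital stock

                              * slope of employment within each period, with standard errors and p-values
                              margins period_dummy, dydx(employment)

                              * the same slopes by hand: base slope plus the period-specific difference
                              lincom employment + 1.period_dummy#c.employment
                              lincom employment + 2.period_dummy#c.employment
                              ```

                              With only 31 observations and strongly correlated predictors, these period-specific estimates are likely to be imprecise, so the confidence intervals deserve more attention than the p-values.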