  • "time variable must contain only integer values" with weekly

    Dear all,

    I have the following weekly panel dataset:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str10 date byte id float(mean_return_weekly case_key_market_return_weekly date2)
    "2015-01-23"  0  .0041432767  -.064748704 20111
    "2017-04-14"  0   -.19679497  -.018455634 20923
    "2014-03-10"  1    .07127273    .02030232 19792
    "2018-06-25"  1    .05313396   -.04045292 21360
    "2014-03-29"  2    .09463324    .05680729 19811
    "2015-01-17"  2    .07829325   -.05228545 20105
    "2018-03-17"  2  -.014469092  -.014861934 21260
    "2019-03-23"  3   .018260526  -.026457224 21631
    "2017-10-28"  4    .08790786  -.016547952 21120
    "2018-02-17"  4  -.028749533    .01312601 21232
    "2015-07-05"  5   .008704542 -.0007317966 20274
    "2014-08-31"  6   .069394946     .0502498 19966
    "2018-05-06"  6   .005442227  .0015735645 21310
    "2020-02-09"  6 -.0023563064    .02675025 21954
    "2016-01-07"  9   .032150116    .02361343 20460
    "2016-05-12"  9  .0017465983   -.03200009 20586
    "2017-03-16"  9   -.03948626   -.02414604 20894
    "2021-08-19"  9   .011072878   -.07161644 22511
    "2018-10-04" 10   -.02794765  -.006522959 21461
    "2018-06-24" 12   -.09037378   -.04516252 21359
    "2019-08-18" 15  -.011371915   .020680206 21779
    "2017-12-15" 16   .019289013 -.0007621952 21168
    "2017-09-30" 18    .03859815   -.04091068 21092
    "2020-06-06" 18   .008079143    .16526787 22072
    "2019-03-10" 20   -.03274173    .00966191 21618
    "2019-09-22" 20   .015375464 -.0088528795 21814
    "2021-05-16" 20   -.02298662     .0746131 22416
    "2020-10-22" 21   -.07719935    .04579298 22210
    "2021-07-11" 25   -.04076784 -.0038614746 22472
    "2021-06-25" 28   -.09694575   -.02367404 22456
    end

    There are 7 days between consecutive observations within the same panel.
    I try to declare the data as a panel using:

    Code:
    generate date2 = date(date,"YMD")
    xtset id date2, delta(7)
    As a result, date2 is stored as a float variable.
    I can run regressions without any problem, for example:

    Code:
    regress mean_return_weekly case_key_market_return_weekly if id == 1
    
    xtreg case_key_market_return_weekly mean_return_weekly, re
    However, when I run a time-series regression such as
    Code:
    arima mean_return_weekly if id== 0, ar(1)
    I get the following error: "time variable must contain only integer values", r(451).
    So I created a new variable:

    Code:
    gen int date3 = date2
    But the result is the same, and I am really not sure what the problem is.
    Furthermore, the "count if !missing" command shows that all my observations are missing, which is strange given that I can run non-time-series regressions.

    Any help would be really appreciated!
    Many thanks in advance!
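
    Because the error message concerns the values of the time variable, one cheap first step is to confirm that the date variable is non-missing and integer-valued. The following is a minimal sketch, not a diagnosis: the four rows are copied from the listing above, and date_num is a hypothetical name chosen so that it does not clash with the existing date2.

    ```stata
    clear
    input str10 date byte id
    "2015-01-23" 0
    "2017-04-14" 0
    "2014-03-10" 1
    "2018-06-25" 1
    end

    * Store the daily date as long rather than the default float
    generate long date_num = date(date, "YMD")
    format date_num %td

    * Both counts should be zero for a usable time variable
    count if missing(date_num)
    count if date_num != floor(date_num)
    ```

    Note that missing() needs an argument, as in missing(date_num), so a count of missing values is only informative when the variable is named explicitly.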


  • #2
    The error message is puzzling to me too, but the implication of your data example is that

    1. While your data are weekly there are lots of gaps. Try plotting each value against the previous week’s value as a check.

    2. You have only two observations for id 0 so that arima is unlikely to work well.

    If the data example gives the wrong picture we need to pursue other explanations.


    • #3
      Thanks very much for the reply Nick!

      I have 430 observations for id 0; what I showed was a randomized version of my dataset.
      These are the first 20 observations for id 0:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str10 date byte id float(mean_return_weekly case_key_market_return_weekly date2)
      "2013-08-23" 0  -.25441635    -.3824093 19593
      "2013-08-30" 0 -.014348716  -.004550938 19600
      "2013-09-06" 0 -.018980326    -.0906524 19607
      "2013-09-13" 0   .00559811  -.016289417 19614
      "2013-09-20" 0  -.16963047   -.05429949 19621
      "2013-09-27" 0  -.09034977   -.06984894 19628
      "2013-10-04" 0   .02069215  -.013272272 19635
      "2013-10-11" 0  -.04099157   -.02602958 19642
      "2013-10-18" 0 -.028318597  -.004581506 19649
      "2013-10-25" 0  .032609142 -.0014139274 19656
      "2013-11-01" 0 -.011180526  -.017484834 19663
      "2013-11-08" 0  -.08764891     .1231283 19670
      "2013-11-15" 0  -.05039433   -.13071641 19677
      "2013-11-22" 0   .13170518   .012256823 19684
      "2013-11-29" 0   -.0194418  .0039334935 19691
      "2013-12-06" 0   .09750947   -.02274056 19698
      "2013-12-13" 0   .20713177   .015577204 19705
      "2013-12-20" 0   -.3593221  -.012660686 19712
      "2013-12-27" 0    .1523801   .027290743 19719
      "2014-01-03" 0 .0016981976  -.018591883 19726
      end
      1. I have checked the acf and pacf of the variable within panels and they seemed fine to me.

      Although there could be some gaps in the data, I have checked with Python and there do not seem to be any.

      Also, when I use xtset, it does not explicitly report any gaps in the data.

      Additionally, when I use tsreport I get the following results:

      Code:
      tsreport
      
      Panel variable: id
      Time variable:  date3
      -----------------------
      Starting period = 19593
      Ending period   = 22605
      Number of obs   = 9,646
      Number of gaps  =    36 (includes panel changes)
      and I have 36 panels in my dataset.
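
      Since the tsreport count includes panel changes, a direct look at the spacing within each panel may be easier to read. The sketch below is only an illustration: the id 0 rows are taken from the listing above, the id 1 rows are made up with a deliberate two-week jump, and step is a hypothetical variable name.

      ```stata
      clear
      input byte id int date2
      0 19593
      0 19600
      0 19607
      1 19593
      1 19607
      end

      * Spacing between consecutive observations within each panel
      bysort id (date2): generate int step = date2 - date2[_n-1]

      * Every non-missing step should equal 7 if there are no gaps
      tabulate step
      list id date2 step if step != 7 & !missing(step)
      ```

      The first observation of each panel has a missing step, so it is excluded from the check rather than counted as a gap.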


      • #4
        All sounds good, so the explanation is something else and I don’t have a suggestion.


        • #5
          I can't reproduce the problem in Stata 17.

          Code:
          . xtset id date2, delta(7)
          
          Panel variable: id (strongly balanced)
           Time variable: date2, 19593 to 19726
                   Delta: 7 units
          
          . 
          . arima mean_return_weekly if id == 0, ar(1)
          
          (setting optimization to BHHH)
          Iteration 0:   log likelihood =  13.921272  
          Iteration 1:   log likelihood =  14.013077  
          Iteration 2:   log likelihood =  14.023664  
          Iteration 3:   log likelihood =  14.032815  
          Iteration 4:   log likelihood =  14.033945  
          (switching optimization to BFGS)
          Iteration 5:   log likelihood =  14.034088  
          Iteration 6:   log likelihood =  14.034089  
          
          ARIMA regression
          
          Sample: 19593 thru 19726                        Number of obs     =         20
                                                          Wald chi2(1)      =       2.45
          Log likelihood = 14.03409                       Prob > chi2       =     0.1174
          
          ------------------------------------------------------------------------------------
                             |                 OPG
          mean_return_weekly | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
          -------------------+----------------------------------------------------------------
          mean_return_weekly |
                       _cons |  -.0219802   .0252929    -0.87   0.385    -.0715534     .027593
          -------------------+----------------------------------------------------------------
          ARMA               |
                          ar |
                         L1. |  -.3674208   .2346377    -1.57   0.117    -.8273023    .0924607
          -------------------+----------------------------------------------------------------
                      /sigma |   .1195218   .0183996     6.50   0.000     .0834593    .1555844
          ------------------------------------------------------------------------------------
          Note: The test of the variance against zero is one sided, and the two-sided
                confidence interval is truncated at zero.


          • #6
            Many thanks for the replies once again!

            It might be because there is only one id value in the latest data sample I provided.
            When using tsset for a single panel, the command works perfectly well; I only have issues when there are multiple panels.

            Furthermore, I tried a rather inelegant workaround and defined a new variable in which the date identifier increases by one unit between consecutive observations, instead of the 7 used above.
            This means that I would not be able to distinguish a Wednesday from a Friday in the same week across separate panels, but xtset worked in this case.
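
            A workaround of this kind might look like the sketch below. It is an assumption about what was done, not the actual code: t and week are hypothetical names, the id 0 rows come from the listing above, and the id 1 rows are made up so that they fall on a different weekday.

            ```stata
            clear
            input byte id int date2
            0 19593
            0 19600
            0 19607
            1 19596
            1 19603
            1 19610
            end

            * Option 1: a within-panel counter (ignores calendar alignment and any gaps)
            bysort id (date2): generate int t = _n

            * Option 2: a calendar week index, so that dates within the same
            * 7-day block share one value across panels
            generate int week = floor(date2 / 7)

            xtset id week
            ```

            The second option keeps weeks comparable across panels while still stepping by one unit, at the cost of discarding the day of the week.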

            I would therefore assume that it is the delta(7) part that is causing the issue.
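
            If delta(7) is the suspect, one thing that could be checked is whether the observations fall on the same weekday within and across panels. In the first listing, for instance, the id 0 dates are Fridays while the id 1 dates are Mondays. Whether that has any bearing on the arima error is not established in this thread; the sketch below, using four rows from that listing and a hypothetical variable weekday, only shows how to look.

            ```stata
            clear
            input byte id int date2
            0 20111
            0 20923
            1 19792
            1 21360
            end
            format date2 %td

            * dow() returns 0 for Sunday through 6 for Saturday
            generate byte weekday = dow(date2)

            * One row per panel: a panel spread over several columns
            * does not keep to a single weekday
            tabulate id weekday
            ```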


            • #7
              I tried with two or more panels in place.

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input str10 date byte id float(mean_return_weekly case_key_market_return_weekly date2)
              "2013-08-23" 0  -.25441635    -.3824093 19593
              "2013-08-30" 0 -.014348716  -.004550938 19600
              "2013-09-06" 0 -.018980326    -.0906524 19607
              "2013-09-13" 0   .00559811  -.016289417 19614
              "2013-09-20" 0  -.16963047   -.05429949 19621
              "2013-09-27" 0  -.09034977   -.06984894 19628
              "2013-10-04" 0   .02069215  -.013272272 19635
              "2013-10-11" 0  -.04099157   -.02602958 19642
              "2013-10-18" 0 -.028318597  -.004581506 19649
              "2013-10-25" 0  .032609142 -.0014139274 19656
              "2013-11-01" 0 -.011180526  -.017484834 19663
              "2013-11-08" 0  -.08764891     .1231283 19670
              "2013-11-15" 0  -.05039433   -.13071641 19677
              "2013-11-22" 0   .13170518   .012256823 19684
              "2013-11-29" 0   -.0194418  .0039334935 19691
              "2013-12-06" 0   .09750947   -.02274056 19698
              "2013-12-13" 0   .20713177   .015577204 19705
              "2013-12-20" 0   -.3593221  -.012660686 19712
              "2013-12-27" 0    .1523801   .027290743 19719
              "2014-01-03" 0 .0016981976  -.018591883 19726
              end
              
              expand 2 
              replace id = 1 in 21/L 
              
              xtset id date2, delta(7)
              
              arima mean_return_weekly if id == 0, ar(1)
              No problem noticed.

              You could try

              Code:
              scatter id date2


              • #8
                Sorry for the late response.

                I am not exactly sure what the problem is, then. I tried scatter id date2, but not much can be seen because there are many observations for each id.
                I would guess it is something further down in my dataset that Stata does not like.
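
                One way to narrow down where "further down" might be is to run the same model panel by panel and record which panels, if any, return an error. The sketch below runs on simulated data so that it is self-contained: the two panels, the seed, and the local macro failed are all assumptions for illustration, and with the real dataset in memory only the loop would be needed.

                ```stata
                * Simulated stand-in for the real data: 2 panels, 30 weekly observations each
                clear
                set obs 60
                set seed 12345
                generate byte id = ceil(_n / 30) - 1
                bysort id: generate int date2 = 19593 + 7 * (_n - 1)
                generate mean_return_weekly = rnormal(0, 0.1)
                xtset id date2, delta(7)

                * Fit the model one panel at a time and collect the ids that fail
                levelsof id, local(ids)
                local failed
                foreach i of local ids {
                    capture arima mean_return_weekly if id == `i', ar(1)
                    if _rc != 0 {
                        display "id `i': arima exited with return code " _rc
                        local failed `failed' `i'
                    }
                }
                display "Panels with errors: `failed'"
                ```

                If every panel fails with the same return code, the problem is more likely in the time settings than in any single panel.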


                • #9
                  You need to be able to identify where the error is coming from. It's not obviously coming from arima itself but I am no expert.


                  If you go

                  Code:
                  log using problem.log, replace
                  set trace on
                  set tracedepth 1
                  arima mean_return_weekly if id == 0, ar(1)
                  set trace off
                  log close
                  you may get a sense of where the problem arises.

                  Warning: Such log files can get very long.

                  Warning: A trace depth of 1 may not be enough, but start small and bump up the number if there is not enough detail.

                  The worst scenario I can imagine is that you may need to send your dataset to StataCorp technical support for a diagnosis. The best scenario I can imagine is that arima is reacting to some small quirk in your dataset, but at this point I am not optimistic on that.
