  • Threshold regression for balanced/unbalanced panel data

    Hello all,

    My current project tests whether the GDP growth rate differs across regimes of foreign direct investment (FDI). For this reason, I am running a simple threshold model.

    $$
    gdpgr = \alpha + \beta_1 \, I(fdi < \gamma) + \beta_2 \, I(fdi > \gamma) + \epsilon
    $$

    My options for estimating this model are XTHREG, THRESHOLD, and Hansen's THRESHOLDREG, depending on how my data are organized.

    My sample comprises 32 countries with different start and end years, and no gaps. See the description below.

    Code:
    . // describe the final data 
    . xtset ccode year 
           panel variable:  ccode (unbalanced)
            time variable:  year, 1996 to 2017
                    delta:  1 unit
    
    . tsset ccode year 
           panel variable:  ccode (unbalanced)
            time variable:  year, 1996 to 2017
                    delta:  1 unit
    Code:
    . xtdescribe // list available time-series, by country 
    
       ccode:  1, 2, ..., 52                                     n =         32
        year:  1996, 1997, ..., 2017                             T =         22
               Delta(year) = 1 unit
               Span(year)  = 22 periods
               (ccode*year uniquely identifies each observation)
    
    Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                            10      15      19        21        22      22      22
    
         Freq.  Percent    Cum. |  Pattern
     ---------------------------+------------------------
           15     46.88   46.88 |  1111111111111111111111
            6     18.75   65.63 |  111111111111111111111.
            2      6.25   71.88 |  .......111111111111111
            2      6.25   78.13 |  11111111111111111111..
            1      3.13   81.25 |  .......1111111111.....
            1      3.13   84.38 |  .....11111111111111111
            1      3.13   87.50 |  ....11111111111111111.
            1      3.13   90.63 |  ....111111111111111111
            1      3.13   93.75 |  11111111111111111.....
            2      6.25  100.00 | (other patterns)
     ---------------------------+------------------------
           32    100.00         |  XXXXXXXXXXXXXXXXXXXXXX
    
    . 
    . xtpatternvar, gen(pattern) // list available time-series, count 
    
    . tab pattern 
    
                   pattern |      Freq.     Percent        Cum.
    -----------------------+-----------------------------------
    .......1111111111..... |         10        1.56        1.56
    .......111111111111111 |         30        4.67        6.23
    .....11111111111111111 |         17        2.65        8.88
    ....11111111111111111. |         17        2.65       11.53
    ....111111111111111111 |         18        2.80       14.33
    11111111111111111..... |         17        2.65       16.98
    111111111111111111.... |         18        2.80       19.78
    1111111111111111111... |         19        2.96       22.74
    11111111111111111111.. |         40        6.23       28.97
    111111111111111111111. |        126       19.63       48.60
    1111111111111111111111 |        330       51.40      100.00
    -----------------------+-----------------------------------
                     Total |        642      100.00
    
    .
    It is my understanding that XTHREG definitely cannot be used here, since the data is NOT balanced.
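
    For reference, a minimal sketch of how a balanced subsample could be carved out with built-in commands only. It assumes the model variables are pcig and fdi2gdp (the ones used in the threshold call in this post) and that a country counts as complete only if both are non-missing in every year; the variable n_complete is a hypothetical helper name.

    ```stata
    * Sketch only: keep countries observed in every year with non-missing model variables
    xtset ccode year
    quietly summarize year
    * number of periods in the full span (22 for 1996 to 2017)
    local T = r(max) - r(min) + 1
    * per-country count of rows where both variables are present
    bysort ccode: egen n_complete = total(!missing(pcig, fdi2gdp))
    keep if n_complete == `T'
    drop n_complete
    * re-declare the panel; xtset should now report "strongly balanced"
    xtset ccode year
    ```

    This only addresses the balance requirement; it is not meant as a fix for the error discussed next.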

    However, I am NOT sure whether the THRESHOLD command would serve my purpose. When I try it, I get the following "gaps not allowed" error.

    Code:
    . // THRESHOLD regression 
    . threshold pcig, threshvar(fdi2gdp)
    gaps not allowed
    r(198);
    Does THRESHOLD also require perfectly balanced data?


    As a check, I ran THRESHOLD on a subset of my sample that is PERFECTLY BALANCED. However, I get the same error!

    Code:
    . // describe the final data 
    . xtset ccode year 
           panel variable:  ccode (strongly balanced)
            time variable:  year, 1996 to 2017
                    delta:  1 unit
    
    . tsset ccode year 
           panel variable:  ccode (strongly balanced)
            time variable:  year, 1996 to 2017
                    delta:  1 unit
    . 
    . xtdescribe // list available time-series, by country 
    
       ccode:  1, 4, ..., 52                                     n =         15
        year:  1996, 1997, ..., 2017                             T =         22
               Delta(year) = 1 unit
               Span(year)  = 22 periods
               (ccode*year uniquely identifies each observation)
    
    Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                            22      22      22        22        22      22      22
    
         Freq.  Percent    Cum. |  Pattern
     ---------------------------+------------------------
           15    100.00  100.00 |  1111111111111111111111
     ---------------------------+------------------------
           15    100.00         |  XXXXXXXXXXXXXXXXXXXXXX
    
    . 
    . xtpatternvar, gen(pattern) // list available time-series, count 
    
    . tab pattern 
    
                   pattern |      Freq.     Percent        Cum.
    -----------------------+-----------------------------------
    1111111111111111111111 |        330      100.00      100.00
    -----------------------+-----------------------------------
                     Total |        330      100.00
    
    . 
    . // THRESHOLD regression 
    . threshold pcig, threshvar(fdi2gdp)
    gaps not allowed
    r(198);
    What am I missing?

    P.S.: The analysis runs flawlessly with Hansen's THRESHOLDREG program. However, it does not support any postestimation options.

  • #2
    You didn't get a quick answer. You've provided lots of code and output. However, providing the code in code delimiters and sample data using dataex would help; without this we cannot replicate your problem.

    The error is "gaps not allowed" - this may mean you've got missing data for some observations within a time series. I guess it could be any of the variables threshold needs. I'd start by double checking the data. Then, I'd try running precisely the same model provided in the documentation using the code and data from the documentation. Then I'd try exactly the same model using your data. I might see if adding a regressor changes something.
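
    A minimal sketch of that first data check, assuming the variables are pcig and fdi2gdp as in the original post (gap is a hypothetical helper variable). It looks for missing values and for within-country gaps in year once incomplete rows are set aside.

    ```stata
    * Sketch: look for missing values and time gaps before rerunning threshold
    xtset ccode year
    * counts of missing values per variable
    misstable summarize pcig fdi2gdp
    * report gaps in year within panels, ignoring changes of panel
    tsreport, panel
    * count within-country gaps that appear once incomplete rows are dropped
    preserve
    drop if missing(pcig, fdi2gdp)
    bysort ccode (year): gen byte gap = (year - year[_n-1]) > 1 if _n > 1
    count if gap == 1
    restore
    ```

    A nonzero count would point to missing values inside a series rather than to the panel layout itself.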

    I wonder if there are not other tools that would serve the same purpose. Your model has no regressors and no region variables so you're really just estimating the optimal division of pcig by fdi2gdp. There might be a tool in the multivariate documentation or in the base reference that does this (discrim or cluster for example).



    • #3
      Phil Bromiley, thanks for the timely response. I am still new here, and I apologize for NOT including the data using dataex. I will remember to do so in future.

      Here are a few thoughts that I want to add:
      1. I wanted to run a base specification (without regressors) first and then try the models with regressors. I did not expect the lack of regressors to cause problems. I do understand that a setup with no regressors might warrant other procedures like "discrim" or "cluster."
      2. This point may be basic, but after reading several notes and articles, I have concluded that THRESHOLD estimates thresholds for time-series data only, not for panel data.
      3. For my project, I have decided to use the balanced panel of 15 countries (330 observations) and estimate the threshold using XTHREG.
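
      To illustrate point 2, a minimal sketch of running THRESHOLD on one country's series as a plain time series. The choice of ccode == 1 is arbitrary, and whether the command then runs depends on that country's series being free of missing values, which is an assumption here.

      ```stata
      * Sketch: threshold on a single country's series (ccode == 1 chosen arbitrarily)
      preserve
      keep if ccode == 1
      * declare a plain time series, with no panel variable
      tsset year
      threshold pcig, threshvar(fdi2gdp)
      restore
      ```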



      • #4
        Hello,
        I am new to Stata. I am currently working on a problem where I need to use the xthreg command. I need a sample balanced panel data file and the syntax to run on that file. Can someone please help urgently?



        • #5
          Rob Tom, I won't claim to be an expert in XTHREG, but these are the three steps I would suggest: (1) set your panel variables, (2) find the number of thresholds, and (3) estimate the region-specific coefficients.
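
          The code in these steps relies on global macros that are not defined anywhere in the thread. A minimal sketch of how they might be set, assuming the variable names from post #1 and the sample period 1996 to 2017; ctrl1 and ctrl2 are placeholder names, not variables from the original data.

          ```stata
          * Hypothetical settings for the global macros used in the steps of this post
          * dependent and threshold variables, as named in post #1
          global depvar    pcig
          global threshvar fdi2gdp
          * placeholder control variables; replace with your own
          global ctrlvars  ctrl1 ctrl2
          * sample period used for balancing
          global startyear 1996
          global endyear   2017
          * bootstrap replications, matching the 1000 reported in the example output
          global bootstrp  1000
          ```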



          First, set and describe your data.
          Code:
          //==============================================================================
          // set panel variables
          //==============================================================================
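              // declare the panel and time variables (a separate tsset call is redundant)
              xtset ccode year
          
              // xtbalance is community-contributed: balance the panel over the chosen range
              xtbalance, range($startyear $endyear)
              // same, but also accounting for missing values in all variables
              xtbalance, range($startyear $endyear) miss(_all)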
          
          //==============================================================================
          //    DESCRIBE DATA
          //==============================================================================
          // list available time-series, by id
              xtdescribe
          
          // list available time-series, count    
              xtpatternvar, gen(pattern)
                  tab pattern
          
              nmissing // list missing obs, by variable
          
              npresent // list non-missing obs, by variable


          Second, check for the number of thresholds. I use an arbitrary initial setting of 3.
          Code:
          //==============================================================================
          //    NUMBER OF THRESHOLDS - SET TO 3
          //==============================================================================
              xthreg $depvar $ctrlvars , rx($threshvar) ///
                          qx($threshvar) thnum(3)  ///
                          bs($bootstrp $bootstrp $bootstrp) ///
                         grid(400) trim(0.05 0.05 0.05) ///
                         nobslog vce(robust)
          The output below, from my own sample, is an example; it indicates how many thresholds are worth investigating further.

          HTML Code:
          Threshold effect test (bootstrap = 1000 1000 1000):
          -------------------------------------------------------------------------------
           Threshold |       RSS        MSE      Fstat    Prob   Crit10    Crit5    Crit1
          -----------+-------------------------------------------------------------------
              Single |    0.1537     0.0005      19.42  0.0230  13.4707  16.0417  22.3746
              Double |    0.1510     0.0005       5.50  0.6800  12.1896  14.4904  19.2961
              Triple |    0.1479     0.0005       6.47  0.4930  12.3300  14.7827  20.3306
          -------------------------------------------------------------------------------
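          For my sample, the test supports a single threshold: the single-threshold F statistic is significant at the 5% level (p = 0.023), while the double- and triple-threshold tests are not (p = 0.680 and p = 0.493).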



          Finally, you run XTHREG with the number of thresholds suggested in the previous step.
          Code:
          //==============================================================================
          //    NUMBER OF THRESHOLDS - SET TO 1
          //==============================================================================
              xthreg $depvar $ctrlvars , rx($threshvar) /// 
                          qx($threshvar) thnum(1) ///
                          bs($bootstrp) grid(400) trim(0.05) ///
                          nobslog vce(robust)
          Hope this helps!
          Last edited by Suresh Paul; 25 Mar 2019, 00:42.



          • #6
            Dear Suresh Paul,

            I am trying to follow the steps you proposed, but in the first step I cannot get balanced data.

            I get the following message:

            [Attached image: Capture.PNG]

            I hope you can help me with that.
            Best regards

