
  • unbalanced panel error when using sdid (the package by Daniel-Pailanir)

    Hi, I am trying to run a staggered synthetic diff-in-diff framework, using the sdid package created by Daniel-Pailanir available at https://github.com/Daniel-Pailanir/sdid.

    When I define my panel it is strongly balanced:
    Code:
    . xtset id year
    panel variable: id (strongly balanced)
    time variable: year, 1968 to 2003
    delta: 1 unit
    I then get the error:

    Code:
    . sdid total_hours_per_year id year treatment if total_hours_per_year != ., vce(placebo) method(sdid)
    Panel is unbalanced.
    r(451);
    This contradicts Stata's earlier assessment. Any ideas what might be causing the issue? Many thanks and much appreciated.

  • #2
    Hello Laurence Jones! Yes, you have a balanced panel by construction, but your dependent variable has missing values. This means the panel is not balanced for that particular variable.

    The methodology requires that the dependent variable be observed for every unit in every period.
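
    To see why the two messages can disagree: xtset looks only at id and year, whereas an if qualifier on the outcome removes the rows where it is missing, so the affected units end up with fewer periods than the rest. Below is a minimal sketch on made-up data for locating such units. The variables y, n_obs and incomplete and the toy values are hypothetical and not taken from the original dataset.

    ```stata
    * Hypothetical toy data: 3 units, 4 years, one missing outcome
    clear
    set obs 12
    generate id   = ceil(_n/4)
    generate year = 2000 + mod(_n - 1, 4)
    generate y    = _n
    replace  y    = . if id == 2 & year == 2002

    * xtset checks only id and year, not the outcome
    xtset id year

    * Count non-missing outcomes per unit and flag units with gaps
    bysort id: egen n_obs = count(y)
    quietly summarize n_obs
    generate byte incomplete = n_obs < r(max)
    tabulate id if incomplete
    ```

    Units listed by the final tabulate are the ones that lose periods once rows with a missing outcome are excluded.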



    • #3
      Hello Daniel PV! Many thanks for the response. I did suspect that at first, but hoped I could get around it by excluding the missing values with the qualifier "if total_hours_per_year != .". I take it this is not sufficient?

      Will I need to redefine my panel after dropping the observations with missing values? What should my next step be?

      Many thanks and much appreciated!



      • #4
        One way to handle this is to estimate SDID only over the units whose outcome is observed in every period of the panel: count the non-missing observations per id and keep only those ids with none missing. Another option, if only a few values are missing, is to impute them (this should be considered carefully).
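
        A rough sketch of the first option, on made-up data: drop every unit that has at least one missing outcome, so that all remaining units are observed in every period. The toy values and the helper variable n_miss are hypothetical.

        ```stata
        * Hypothetical toy data: 3 units, 4 years, one missing outcome
        clear
        set obs 12
        generate id   = ceil(_n/4)
        generate year = 2000 + mod(_n - 1, 4)
        generate y    = _n
        replace  y    = . if id == 2 & year == 2002

        * Count missing outcomes per unit, then drop the whole unit if any
        bysort id: egen n_miss = total(missing(y))
        drop if n_miss > 0
        drop n_miss

        xtset id year
        * The call from post #1 could then be run without the if qualifier
        * (commented out: it needs sdid installed and a treatment variable):
        * sdid total_hours_per_year id year treatment, vce(placebo) method(sdid)
        ```

        Dropping whole units changes the estimation sample, so it is worth checking how many treated and control units remain afterwards.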



        • #5
          Hi,

          I am running into a similar issue.
          I balanced my panel in a "simple" way:

          Code:
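          * Drop observations with missing values in $yit or $tit
          reg $yit $tit
          capture generate sample2 = e(sample)
          drop if sample2 != 1

          * Treatment indicator
          capture drop streated
          generate streated = 0
          // replace streated = 1 if oldght == 2 & i_year > $date
          // replace streated = 1 if oldght == 1 & i_year > $date
          replace streated = 1 if oldght == 3 & i_year > $date

          * Balance the panel: keep the ids with the maximum number of rows
          * (this equalizes the row count per id, not the set of years observed)
          count
          bysort id (i_year): generate nb = _N
          summarize nb
          keep if nb == `r(max)'
          drop nb
          count
          xtset id i_year

          * Check for missing values in id, i_year, $yit and $tit
          * (summarize skips missing values, so summarizing a variable where it
          *  equals . always shows 0 observations; count if missing() counts them)
          foreach v in id i_year $yit $tit {
              count if missing(`v')
          }
          xtreg $yit $tit, fe

          sdid $yit id i_year streated if oldght!=1 & oldght!=2, method(sdid) vce(bootstrap) reps(50) covariates($tit)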
          A first dataset passed all of these balancing steps and checks, and sdid runs normally.
          But a second dataset, at a different level of data aggregation, does not let me use sdid. It fails with the infamous
          Panel is unbalanced.
          even though
          Code:
          xtreg $yit $tit, fe
          outputs:

          Number of obs = 9,740
          Number of groups = 974

          Obs per group:
          min = 10
          avg = 10.0
          max = 10
          This suggests a balanced panel (10 observations in each of the 974 groups),

          and

          Code:
                
          * summarize skips missing values, so count if missing() is used instead
          foreach v in id i_year $yit $tit {
              count if missing(`v')
          }
          suggests that there are no missing values in id, i_year, $yit or $tit, as expected for a balanced panel.

          Do you see any possible improvement to my code? Or is there any sdid functionality that I am not aware of?
          Last edited by Loic Guerin; 24 Jan 2024, 04:21.



          • #6
            If you can simulate a dataset that reproduces this, I'll look at it. So far I can't tell what the issue is.
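
            As a possible starting point for such a simulation, here is a minimal sketch on made-up data. Whether this pattern is what happens in #5 is only a guess. Every unit has the same number of rows, so a check based on _N passes and xtreg would report min = avg = max, yet the units are not all observed in the same years. The variables y, unit_tag and year_tag and the toy values are hypothetical.

            ```stata
            * Hypothetical data: 3 units with 3 rows each, but unit 3 covers different years
            clear
            set obs 9
            generate id     = ceil(_n/3)
            generate i_year = 2000 + mod(_n - 1, 3)
            replace  i_year = i_year + 1 if id == 3     // unit 3: 2001-2003, others: 2000-2002
            generate y      = runiform()

            * The row-count check passes: every unit has 3 rows
            bysort id (i_year): generate nb = _N
            summarize nb

            * xtset should report this panel as weakly, not strongly, balanced
            xtset id i_year

            * Stricter check: a fully balanced panel has (units) x (periods) rows
            egen unit_tag = tag(id)
            egen year_tag = tag(i_year)
            quietly count if unit_tag
            local n_units = r(N)
            quietly count if year_tag
            local n_years = r(N)
            display "rows = " _N ", units x periods = " `n_units' * `n_years'
            ```

            If the two numbers differ, some id-year cells are absent. The same comparison can be repeated after applying the if condition used in the sdid call, since a condition on a variable that changes within id would remove rows from some units only.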

