
  • Treatment Group

    Hello all,
    I want to estimate a difference-in-differences (DiD) model using individual-level panel data that combines three waves.
    I'm not sure how to set up the treatment group properly:
    · The control group will contain individuals who have never worked from home.
    · The treatment group will include individuals who transitioned from in-person work pre-COVID to remote work post-COVID.
    COVID-19 starts in April 2020 (nd_monthly == 723).

    I tried the following:
    Code:
    gen covid_start = 723  // April 2020, i.e. tm(2020m4)
    gen post_covid = (nd_monthly >= covid_start)  
    
    bysort pidp: gen num_waves = _N  
    
    bysort pidp wave: gen wfh_status = wfh[1] 
    
    bysort pidp: egen max_wfh = max(wfh_status)
    bysort pidp: egen min_wfh = min(wfh_status)
    
    gen treatment_group = 0
    replace treatment_group = 1 if min_wfh == 0 & max_wfh == 1

    Data example:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long pidp float(nd_monthly wfh) str1 wave
       22445 699 1 "j"
       22445 723 . "l"
       22445 747 0 "n"
       29925 702 0 "j"
       29925 728 0 "l"
       29925 751 1 "n"
       76165 698 0 "j"
       76165 723 0 "l"
       76165 746 1 "n"
      280165 707 1 "j"
      280165 731 . "l"
      280165 755 . "n"
      333205 700 0 "j"
      469205 701 0 "j"
      469205 724 0 "l"
      469205 748 0 "n"
      599765 696 0 "j"
      599765 720 0 "l"
      599765 744 1 "n"
      665045 700 0 "j"
      665045 723 0 "l"
      732365 703 . "j"
      732365 727 . "l"
      732365 751 . "n"
     1587125 704 . "j"
     1587125 727 . "l"
     1587125 751 0 "n"
     1697285 699 0 "j"
     2888645 698 0 "j"
     2888645 723 . "l"
     2888645 749 0 "n"
     3663845 748 . "n"
     4849085 699 1 "j"
     4849085 723 1 "l"
     4849085 746 1 "n"
     4853165 708 0 "j"
     4853165 733 0 "l"
     4853165 757 0 "n"
    68006127 698 . "j"
    68006127 748 . "n"
    68008847 701 0 "j"
    68009527 698 0 "j"
    68009527 722 0 "l"
    68009527 747 0 "n"
    68010887 698 0 "j"
    68010887 722 0 "l"
    68010887 746 1 "n"
    68011567 696 1 "j"
    68014287 700 . "j"
    68014291 701 0 "j"
    68020564 698 . "j"
    68021765 700 0 "j"
    68028575 698 . "j"
    68028575 721 . "l"
    68028575 746 . "n"
    68029927 698 . "j"
    68029927 722 . "l"
    68029927 746 . "n"
    68029931 698 0 "j"
    68029931 749 . "n"
    68035367 699 0 "j"
    68035367 722 0 "l"
    68035367 744 1 "n"
    68037407 697 . "j"
    68037407 724 0 "l"
    68041487 699 0 "j"
    68041487 721 1 "l"
    68041487 745 1 "n"
    68041491 699 0 "j"
    68041491 720 . "l"
    68041491 745 . "n"
    68041495 744 . "n"
    68042167 697 0 "j"
    68042167 722 0 "l"
    68042171 697 0 "j"
    68044207 697 0 "j"
    68044207 721 1 "l"
    68044211 697 . "j"
    68044211 722 . "l"
    68045567 696 0 "j"
    68045567 720 1 "l"
    68045567 748 1 "n"
    68045571 697 . "j"
    68045571 722 . "l"
    68046927 699 0 "j"
    68049647 696 0 "j"
    68049651 696 0 "j"
    68049651 720 0 "l"
    68051007 696 0 "j"
    68051007 720 . "l"
    68051011 696 0 "j"
    68051011 720 0 "l"
    68051011 744 0 "n"
    68056447 697 0 "j"
    68056447 720 0 "l"
    68056447 744 0 "n"
    68056451 697 . "j"
    68056451 720 0 "l"
    68056451 744 0 "n"
    68056455 696 0 "j"
    end
    format %tm nd_monthly

    Any insights or suggestions on how to approach this would be appreciated!

  • #2
    It depends: you can use never-treated or not-yet-treated individuals as the control group. This is a staggered DiD setting. You may want to look at Wooldridge (2021, 2023) or Callaway and Sant'Anna (2021).
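
To make the never-treated / not-yet-treated distinction concrete, here is a minimal sketch using three individuals taken from the data example in #1. The variable names first_wfh, never_wfh and notyet_wfh are made up for illustration, and the sketch assumes that "treated" simply means being observed with wfh == 1; waves with missing wfh are ignored, which may or may not be appropriate for your data.

```stata
clear
input long pidp float(nd_monthly wfh)
   76165 698 0
   76165 723 0
   76165 746 1
  469205 701 0
  469205 724 0
  469205 748 0
68041487 699 0
68041487 721 1
68041487 745 1
end
format %tm nd_monthly

* First month in which the person is observed working from home
* (missing if the person is never observed working from home)
bysort pidp: egen first_wfh = min(cond(wfh == 1, nd_monthly, .))
format %tm first_wfh

* Never treated: wfh is 0 in every wave where it is observed
bysort pidp: egen max_wfh = max(wfh)
gen byte never_wfh = max_wfh == 0

* Not yet treated at this interview: switches later, but has not switched yet
gen byte notyet_wfh = !missing(first_wfh) & nd_monthly < first_wfh

sort pidp nd_monthly
list pidp nd_monthly wfh first_wfh never_wfh notyet_wfh, sepby(pidp)
```

Note that 68041487 is already working from home in January 2020, before the COVID-19 cutoff, which is the kind of case that makes the timing staggered rather than a single common treatment date.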



    • #3
      If you have a sequence (pre: 0 1 0 1) and (post: 0 1 0 1), then simply checking whether someone was ever in person and ever working from home will flag a transition, even though the same pattern was maintained pre and post. Do you want to tag individuals with the pattern (pre: 0 0 0 0) and (post: 1 1 1 1)? If so:

      Code:
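      * pre = 1 for interviews before April 2020
      gen pre = nd_monthly < tm(2020m4)
      * tag1: in the pre period, the largest wfh value is 0 (a missing wfh sorts last and disqualifies)
      bys pidp pre (wfh): gen tag1 = pre & !wfh[_N]
      by pidp: egen inperson_pre = max(tag1)
      * tag2: in the post period, the smallest and largest wfh values are both 1
      bys pidp pre (wfh): gen tag2 = !pre & wfh[1]==1 & wfh[_N]==1
      by pidp: egen home_post = max(tag2)
      * wanted = 1 for individuals who meet both conditions
      bys pidp nd_monthly: gen wanted = inperson_pre & home_post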
      Otherwise, you need to think more about how you want to define a transition if not in this strict sense or consider a staggered DID as Maxence suggests.
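
To see why the timing matters, here is a small comparison on three individuals from the data example in #1. It is only a sketch: naive is a name I made up for the min/max rule from #1, and wanted is the strict definition from the code above. Person 22445 works from home before COVID-19 and in person afterwards, so the min/max rule flags a "transition" in the wrong direction.

```stata
clear
input long pidp float(nd_monthly wfh)
 22445 699 1
 22445 723 .
 22445 747 0
469205 701 0
469205 724 0
469205 748 0
599765 696 0
599765 720 0
599765 744 1
end
format %tm nd_monthly

* Rule from #1: any 0 and any 1, regardless of when they occur
bysort pidp: egen min_wfh = min(wfh)
bysort pidp: egen max_wfh = max(wfh)
gen byte naive = min_wfh == 0 & max_wfh == 1

* Strict rule: always in person pre, always at home post
gen pre = nd_monthly < tm(2020m4)
bysort pidp pre (wfh): gen tag1 = pre & !wfh[_N]
bysort pidp: egen inperson_pre = max(tag1)
bysort pidp pre (wfh): gen tag2 = !pre & wfh[1]==1 & wfh[_N]==1
bysort pidp: egen home_post = max(tag2)
gen byte wanted = inperson_pre & home_post

sort pidp nd_monthly
list pidp nd_monthly wfh naive wanted, sepby(pidp)
```

Here naive is 1 for both 22445 and 599765, while wanted is 1 only for 599765, who is in person in both pre-COVID waves and at home in the post-COVID wave.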
      Last edited by Andrew Musau; 03 Dec 2024, 06:39.



      • #4
        Originally posted by Maxence Morlet:
        It depends, you can use never treated or not yet treated individuals as control group. This is a staggered DiD setting. You may want to look at Wooldridge (2021, 2023), or Callaway and Sant'Anna (2021)
        I really appreciate your reply. However, my question was not clear enough. In my case there is only one treatment time (COVID-19), and the control group is fairly clear to me: individuals who do not work from home at all (these are the never treated, right?). What I am struggling with is how to identify the treatment group in Stata. How do I define the treatment group as "worked in person pre-COVID-19 and started working from home after COVID-19"? If it is necessary for setting up the treatment group, I can track the individuals who shift to working from home between waves using other variables (e.g., staying with the same employer or job).

        Please let me know if you meant something else and I misunderstood.
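
For a single treatment date, the two groups described here could be coded roughly as below. This is a sketch under assumptions, not a definitive setup: did_group, post, pre_max, post_min and post_max are names invented for illustration, the rows are four individuals from the data example in #1, and waves with missing wfh are simply ignored (stricter than this is possible, for example by dropping anyone with a missing wave). Anyone who fits neither pattern gets did_group == . and drops out of the estimation sample.

```stata
clear
input long pidp float(nd_monthly wfh)
  22445 699 1
  22445 723 .
  22445 747 0
 469205 701 0
 469205 724 0
 469205 748 0
 599765 696 0
 599765 720 0
 599765 744 1
4849085 699 1
4849085 723 1
4849085 746 1
end
format %tm nd_monthly

* Post-period indicator: April 2020 onwards
gen byte post = nd_monthly >= tm(2020m4)

* Largest wfh pre-COVID, smallest and largest wfh post-COVID (missings ignored)
bysort pidp: egen pre_max  = max(cond(!post, wfh, .))
bysort pidp: egen post_min = min(cond(post, wfh, .))
bysort pidp: egen post_max = max(cond(post, wfh, .))

* 1 = treated (in person pre, at home post), 0 = control (never at home)
gen byte did_group = .
replace did_group = 1 if pre_max == 0 & post_min == 1
replace did_group = 0 if pre_max == 0 & post_max == 0

sort pidp nd_monthly
list pidp nd_monthly wfh post did_group, sepby(pidp)

* With an outcome variable (y is hypothetical, not in the data example),
* the two-group, two-period DiD could then be estimated along these lines:
* regress y i.did_group##i.post, vce(cluster pidp)
```

In this subset, 599765 is treated, 469205 is a control, and 22445 and 4849085 are excluded because they already work from home before COVID-19.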

