  • Problem with calculating employment (unemployment) duration until first unemployment (employment)

    Hello everyone,

    I have trouble working with duration data.

    I am working with spell-type data. The data are a panel that records employment status in each month. I concatenated the 12 monthly status variables into a single variable, Empl_status, which shows each individual's trajectory within a year. In each month an individual is either employed (=1), unemployed (=2) or nonparticipating (=3). For example, the individual with personid==1 is employed from January to August 2010 (8 months), then exits employment and is unemployed from September 2010 to May 2011 (9 months). For each individual, I just want the first two durations.
    Code:
    clear all
    input double personid year begdat str12 Empl_status spellE_1 spellE_2 spellU_1 spellU_2 
          1 2010  "111111112222"  8 0 9 0
          1 2011 "222221111111" . . . .
          1 2012 "111111111111" . . . .
          1 2013 "111122222222" . . . .
          2 2010 "222222222222" 0 17 0 14
          2 2011 "221111111111" . . . . 
          2 2012 "111111122222" . . . .
          2 2013 "111111333333" . . . .
    end
    Here Empl_status gives the employment status in each month.

    I want to calculate spellE_1 spellE_2 spellU_1 spellU_2.

    spellE_1 = total employment duration if the first spell is an employment spell. The beginning corresponds to the first time the individual is observed (i.e. 2010). This variable measures the duration of employment until the first unemployment spell.

    spellE_2 = total duration of employment following the initial unemployment spell (i.e. the individual begins in unemployment).

    spellU_1 = total duration of unemployment following the initial employment spell (i.e. the individual begins in employment).

    spellU_2 = total duration of unemployment until the first employment spell (i.e. the individual begins in unemployment).

    Could you please help me write code to compute these durations?

    Thanks for your time.

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double personid float year str12 Empl_status
    1 2010 "111111112222"
    1 2011 "222221111111"
    1 2012 "111111111111"
    1 2013 "111122222222"
    2 2010 "222222222222"
    2 2011 "221111111111"
    2 2012 "111111122222"
    2 2013 "111111333333"
    end
    
    //  UNPACK THE EMPL_STATUS VARIABLE
    assert length(Empl_status) == 12
    forvalues i = 1/12 {
        gen status`i' = real(substr(Empl_status, `i', 1))
    }
    
    //  GO LONG
    reshape long status, i(personid year) j(month)
    by personid (year month), sort: gen spell = sum(status != status[_n-1])
    by personid spell (year month), sort: gen duration = _N
    
    //  CREATE REQUESTED VARIABLES
    by personid (spell year month), sort:gen d1 = duration[1] //DURATION FIRST SPELL
    by personid (spell year month): egen d2 = max(cond(spell == 2, duration, .)) // DURATION 2ND SPELL
    
    by personid (spell year month): gen spellE1 = d1 if status[1] == 1
    by personid (spell year month): gen spellE2 = d2 if status[1] == 2
    by personid (spell year month): gen spellU1 = d2 if status[1] == 1
    by personid (spell year month): gen spellU2 = d1 if status[1] == 2
    
    
    //  RETURN TO ORIGINAL DATA LAYOUT
    by personid year, sort: keep if _n == 1
    drop spell status duration d1 d2 month
    //  ELIMINATE RESULTS FOR ALL BUT FIRST OBS PER PERSON
    foreach v of varlist spell* {
        by personid (year): replace `v' = . if _n > 1
    }
    Your mistake was in concatenating the employment status variables into a single string: the first step to solving the problem is undoing that.

    I don't understand your description of what is wanted in spellE_1 through spellU_2, so I just followed the results you showed in your example data--but I don't think they are the same thing.

    Something was wrong with your -dataex- example. The -input- line refers to a variable begdat for which the actual data is not present. Either you didn't really use -dataex- to create this and just wrote what you thought the -dataex- output would look like, or you edited the -dataex- output after you actually ran it. Either way, what you showed could not be used as is to create a replica of your example in Stata. Please, when posting examples, always use -dataex- (don't try to fake it) and show what it gives you without any additional editing.



    • #3
      This is interesting to me because I am currently writing a short piece on how strings can sometimes be useful for showing a history. But if you do that, you have to go all the way. Holding years in observations and the pattern for months in strings mixes two techniques and cannot fail to be awkward. That was the first point made by Clyde Schechter -- and as usual all of his advice is excellent, so I will stop there.
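
      As a rough illustration of what "going all the way" with strings might look like, here is a minimal sketch, not a tested recommendation. It assumes every person has one observation per year with no gaps and exactly 12 status characters per year. The names history, s1, rest, s2, d1 and d2 are made up for this example.

      ```stata
      * Hypothetical sketch: hold each person's full monthly history in ONE string
      clear
      input double personid float year str12 Empl_status
      1 2010 "111111112222"
      1 2011 "222221111111"
      1 2012 "111111111111"
      1 2013 "111122222222"
      2 2010 "222222222222"
      2 2011 "221111111111"
      2 2012 "111111122222"
      2 2013 "111111333333"
      end

      * Accumulate the yearly strings within person, in year order
      bysort personid (year): gen history = Empl_status
      by personid: replace history = history[_n-1] + history if _n > 1
      by personid: keep if _n == _N          // last obs holds the full history
      keep personid history

      * First spell: the leading run of one repeated status code
      gen s1 = regexs(1) if regexm(history, "^(1+|2+|3+)")
      * Second spell: the leading run of whatever follows the first spell
      gen rest = substr(history, strlen(s1) + 1, .)
      gen s2 = regexs(1) if regexm(rest, "^(1+|2+|3+)")

      * Spell durations in months are just the string lengths
      gen d1 = strlen(s1)
      gen d2 = strlen(s2)
      list personid d1 d2
      ```

      With one string per person, a spell is simply a run of identical characters, so durations reduce to string lengths. Whether this is more convenient than the long layout depends on what else has to be done with the data.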



      • #4
        Thanks Clyde Schechter, that is exactly what I wanted.



        Indeed, I did not use dataex for the output. The data are confidential and unfortunately I cannot share an extract. That is why I simulated my problem using this data. I would have preferred not to simulate, but I had no choice.

        Including the begdat variable was an error, and concatenating was a mistake.

        Thanks again for your time



        • #5
          We address the issue of confidentiality in the FAQ Advice. Clearly it is a compelling reason not to show real data and the answer is just to show fake but realistic data.
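
          One possible way to produce such fake data is sketched below. Everything in it is an assumption made for illustration: the seed, the number of persons, the 0.9 probability of staying in the same status, and the variable t. It is written in the long layout (one observation per person-month) and ends by passing the result through dataex, which on older Stata versions may first need ssc install dataex.

          ```stata
          * Hypothetical fake-data generator; all names and parameters are illustrative
          clear
          set seed 2718
          set obs 5                               // 5 fake persons
          gen long personid = _n
          expand 48                               // 4 years x 12 months per person
          bysort personid: gen t = _n             // running month counter
          gen year  = 2010 + floor((t - 1) / 12)
          gen month = mod(t - 1, 12) + 1

          * Random starting status (1, 2 or 3), then keep the previous status
          * with probability 0.9 so that spells persist for several months
          gen byte status = 1 + floor(3 * runiform()) if t == 1
          bysort personid (t): replace status = ///
              cond(runiform() < 0.9, status[_n-1], 1 + floor(3 * runiform())) if t > 1

          drop t
          dataex personid year month status in 1/24
          ```

          Making the status persistent from month to month keeps the fake spells similar in character to real employment histories, without revealing anything about the confidential data.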
