Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Plugging values of a variable back into Stata

    Hi users.
    I am using a variable called activity status which comes in codes such 11, 12, 21, 31, 41, 42, 51 and 81. The code 81 gives number of unemployed people, and the code 11 to 81 gives number of labour force. I want to calculate unemployment rate which is percentage of unemployed out of labour force. I want to calculate the unemployment rate and plug the values back into Stata in a variable say Unemployment rate.
    How ca i write the function? Is there way to make it easy?

  • #2
    Data example please https://www.statalist.org/forums/help#stata

    Comment


    • #3
      Hi professor, here is a random sample of the dataset.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str2 upss float state_tab
      "51"  1
      "31"  7
      "31" 18
      "21" 22
      "11" 15
      "11"  3
      "11" 15
      "11"  8
      "11" 14
      "31" 13
      "11" 19
      "51" 21
      "11" 19
      "21" 22
      "11" 17
      "31" 18
      "81" 23
      "11" 13
      "31"  9
      "31"  1
      "81"  9
      "31" 14
      "11" 13
      "11" 19
      "51"  9
      "31" 18
      "21"  6
      "11" 20
      "21" 10
      "51"  3
      "51" 15
      "21"  8
      "11" 11
      "51" 16
      "51" 19
      end
      label values state_tab tab_sta
      label def tab_sta 1 "Andhra Pradesh", modify
      label def tab_sta 3 "Bihar", modify
      label def tab_sta 6 "Gujarat", modify
      label def tab_sta 7 "Haryana", modify
      label def tab_sta 8 "Himachal Pradesh", modify
      label def tab_sta 9 "Jammu & Kashmir", modify
      label def tab_sta 10 "Jharkhand", modify
      label def tab_sta 11 "Karnataka", modify
      label def tab_sta 13 "Madhya Pradesh", modify
      label def tab_sta 14 "Maharashtra", modify
      label def tab_sta 15 "Orissa", modify
      label def tab_sta 16 "Punjab", modify
      label def tab_sta 17 "Rajasthan", modify
      label def tab_sta 18 "Tamilnadu", modify
      label def tab_sta 19 "Uttar Pradesh", modify
      label def tab_sta 20 "Uttarakhand", modify
      label def tab_sta 21 "West Bengal", modify
      label def tab_sta 22 "N.E.States", modify
      label def tab_sta 23 "Others", modify

      I want to calculate unemployment rate by state_tab and plug the values back into Stata, so that I can use it for further analysis. As for usual purpose, I use the following command to generate unemployment rate:

      gen unemployment_rate=1 if upss=="11"| upss=="12"| upss=="21"| upss=="31"| upss=="41"| upss=="42"| upss=="51"| upss=="71"| u
      > pss=="72"| upss=="81"

      . replace unemployment_rate=2 if upss=="81"
      (5,579 real changes made)

      . ta unemployment_rate

      unemploymen |
      t_rate | Freq. Percent Cum.
      ------------+-----------------------------------
      1 | 172,281 96.86 96.86
      2 | 5,579 3.14 100.00
      ------------+-----------------------------------
      Total | 177,860 100.00



      Comment


      • #4
        This may help:

        Code:
        . egen wanted = mean(100 * (upss == "81")), by(state_tab)
        
        . egen tag = tag(state_tab)
        
        . 
        . list state_tab wanted if tag 
        
             +-----------------------------+
             |        state_tab     wanted |
             |-----------------------------|
          1. |   Andhra Pradesh          0 |
          2. |          Haryana          0 |
          3. |        Tamilnadu          0 |
          4. |       N.E.States          0 |
          5. |           Orissa          0 |
             |-----------------------------|
          6. |            Bihar          0 |
          8. | Himachal Pradesh          0 |
          9. |      Maharashtra          0 |
         10. |   Madhya Pradesh          0 |
         11. |    Uttar Pradesh          0 |
             |-----------------------------|
         12. |      West Bengal          0 |
         15. |        Rajasthan          0 |
         17. |           Others        100 |
         19. |  Jammu & Kashmir   33.33333 |
         27. |          Gujarat          0 |
             |-----------------------------|
         28. |      Uttarakhand          0 |
         29. |        Jharkhand          0 |
         33. |        Karnataka          0 |
         34. |           Punjab          0 |
             +-----------------------------+

        Comment


        • #5
          Thank you Nick Cox . But this is not giving me correct values. I confirmed by running the usual command I follow, which I wrote above. To calculate unemployment rate I simply write :
          gen unemployment_rate=1 if upss=="11"| upss=="12"| upss=="21"| upss=="31"| upss=="41"| upss=="42"| upss=="51"| upss=="71"| upss=="72"| upss=="81"

          replace unemployment_rate=2 if upss=="81"

          and then I use the tabulate command.
          In the code you wrote, there are no other codes of upss, except upss=="81".
          To rewrite my inquiry, we need to create a variable wanted equal to the percentage of unemployed (upss==81) out of total labour force (upss<="81").

          Comment


          • #6
            This is the logic behind my calculation. To get the percentage unemployed, get the mean of the expression upss == "81" which is 1 if true and 0 if false, and multiply by 100. This is just standard logic with indicators, as at say https://www.stata.com/support/faqs/d...rue-and-false/

            If your code gives you different results it is because the codes you give are not the complete story. There are codes you didn't mention or you have missing values that you want to exclude. If so, you need to spell that out. We can't see your data, so you have to give us all the information we need otherwise.

            This example shows that -- if the codes you mentioned are the only possibilities -- my calculation checks out;,

            Code:
             clear 
            
            . set obs 100 
            number of observations (_N) was 0, now 100
            
            . set seed 2803 
            
            . gen str2 test = word("11 12 21 31 41 42 51 71 72 81", runiformint(1, 10))
            
            . tab test 
            
                   test |      Freq.     Percent        Cum.
            ------------+-----------------------------------
                     11 |          7        7.00        7.00
                     12 |         13       13.00       20.00
                     21 |          9        9.00       29.00
                     31 |         14       14.00       43.00
                     41 |         11       11.00       54.00
                     42 |          8        8.00       62.00
                     51 |          6        6.00       68.00
                     71 |         14       14.00       82.00
                     72 |          5        5.00       87.00
                     81 |         13       13.00      100.00
            ------------+-----------------------------------
                  Total |        100      100.00
            
            . 
            . egen wanted = mean(100 * (test == "81")) 
            
            . 
            . su wanted 
            
                Variable |        Obs        Mean    Std. Dev.       Min        Max
            -------------+---------------------------------------------------------
                  wanted |        100          13           0         13         13
            
            .
            Code:
            
            

            Comment


            • #7
              Professor, it was my fault. UPSS has codes "11 12 21 31 41 42 51 71 72 81 91 92 93 94 95 97 99 " . I did not list codes beyond "81". Please consider the following dataset which has all relevant codes as in my dataset.
              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input str2 test
              "91"
              "31"
              "91"
              "11"
              "42"
              "93"
              "93"
              "21"
              "12"
              "31"
              "72"
              "21"
              "31"
              "71"
              "42"
              "95"
              "93"
              "91"
              "93"
              "97"
              "42"
              "72"
              "31"
              "95"
              "31"
              "72"
              "12"
              "42"
              "42"
              "51"
              "92"
              "71"
              "71"
              "71"
              "72"
              "72"
              "97"
              "95"
              "11"
              "95"
              "93"
              "97"
              "12"
              "91"
              "51"
              "72"
              "72"
              "71"
              "51"
              "42"
              "99"
              "42"
              "92"
              "71"
              "71"
              "12"
              "42"
              "95"
              "51"
              "99"
              "97"
              "81"
              "72"
              "91"
              "51"
              "72"
              "12"
              "99"
              "95"
              "31"
              "95"
              "99"
              "71"
              "42"
              "31"
              "95"
              "31"
              "71"
              "92"
              "31"
              "93"
              "81"
              "95"
              "31"
              "21"
              "95"
              "51"
              "71"
              "51"
              "31"
              "92"
              "99"
              "93"
              "97"
              "21"
              "99"
              "93"
              "92"
              "71"
              "91"
              end
              Thanks a lot.

              Comment


              • #8
                OK. Consider this


                .
                Code:
                 egen wanted = mean(cond(test <= "81", 100 * (test == "81"), .))
                
                . su wanted
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                      wanted |        100    3.333333           0   3.333333   3.333333
                
                . count if test == "81"
                  2
                
                . count if test <= "81"
                  60
                
                . di 200/60
                3.3333333
                Note that you can extend the calculation e.g. to


                Code:
                egen wanted2 = mean(cond(test <= "81", 100 * (test == "81"), .))  , by(state_tab) 
                
                egen wanted3 = mean(cond(test <= "81", 100 * (test == "81"), .))  , by(state_tab year)

                See also https://www.stata-journal.com/articl...article=dm0055 especially Section 9.

                Comment


                • #9
                  Thank you so much. It has worked. helped a lot.

                  Just in case I want to add weights analytic weights, where should i insert it ?

                  Comment


                  • #10
                    You need to calculate numerator and denominator separately, I imagine. Say

                    Code:
                      
                     egen double numer2 = total(cond(test <= "81", weight * (test == "81"), .))  , by(state_tab)   
                     egen double denom2 = total(cond(test <= "81", weight , .))  , by(state_tab)  gen double wanted2 = 100 * numer2 / denom2

                    Comment


                    • #11
                      Hi. Sorry, I think my query was misplaced. I already have a weight variable in my dataset, which is adjusted to its sampling design. It's say 'weight combined' , so given this, where should I insert the weight variable in the egen command.?
                      hope it clears .

                      Comment


                      • #12
                        Put the name of your weight variable where weight is in my code. Note that egen doesn't support [weight=exp] syntax, as is explicit in its syntax diagram.
                        Last edited by Nick Cox; 12 Nov 2020, 10:46.

                        Comment


                        • #13
                          It is so perfect now. I have gained a lot from your replies. Thank you so much.

                          Comment


                          • #14
                            Continuing on this Nick.. request for your help.. I want to calculate the share of wage workers (upss1==2|upss1==3) in each industry (nic_short) across all states (state_tab). I can do this for each state separately, and then feed the values in Stata. But I need a command similar to the above (egen function) to plug back the values of shares of workers in each industry in all states.
                            Example data set is produced below

                            Code:
                            * Example generated by -dataex-. To install: ssc install dataex
                            clear
                            input byte(upss1 nic_short state_tab)
                            3 5 1
                            3 6 1
                            2 9 1
                            3 5 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 5 1
                            3 1 1
                            3 5 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 1 1
                            2 6 1
                            3 1 1
                            2 6 1
                            3 1 1
                            3 5 1
                            3 5 1
                            3 1 1
                            3 1 1
                            2 2 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 5 1
                            3 1 1
                            3 5 1
                            3 5 1
                            3 9 1
                            3 5 1
                            3 1 1
                            3 5 1
                            3 1 1
                            3 5 1
                            3 1 1
                            2 9 1
                            3 5 1
                            3 1 1
                            3 3 1
                            2 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 1 1
                            3 5 1
                            3 5 1
                            3 5 1
                            2 9 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 2 1
                            3 3 1
                            3 1 1
                            3 3 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 3 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 1 1
                            3 5 1
                            3 1 1
                            end
                            label values upss1 act
                            label def act 2 "Regular worker", modify
                            label def act 3 "casual worker", modify
                            label values nic_short industry11
                            label def industry11 1 "Agriculture & allied", modify
                            label def industry11 2 "Mining & quarrying", modify
                            label def industry11 3 "Manufacturing", modify
                            label def industry11 5 "Construction", modify
                            label def industry11 6 "Trade, Rest & Hotels", modify
                            label def industry11 9 "Community Social & Personal Services", modify
                            label values state_tab tab_sta
                            label def tab_sta 1 "Andhra Pradesh", modify

                            Comment


                            • #15
                              Hi. I have sorted it. Your egen command only helped me with little bit of redefining. Thanks

                              Comment

                              Working...
                              X