  • creating a variable based on status and industry membership

    Dear Statalist,

    I have a categorical variable, status, which takes the value of 1, 2 or 3 for mandatory, voluntary and non-disclosure respectively. I also have another variable, industry, indicating which industry the firm belongs to (from 1 to 13). I want to create a dummy variable which = 1 if status==2 and the firm is in the same industry as other firms with status==3, and 0 otherwise. Or alternatively, I could use a categorical variable which = 1 if status==2 and the firm is in the same industry as other firms with status==3, and = 2 if status==3 and the firm is in the same industry as other firms with status==2. I don't know if this is possible, and if so, how to write the code for them.

    Thanks a lot in advance for any kind help!

  • #2
    Hello Flora. Here is my understanding of your question.

    status:
    1=mandatory
    2=voluntary
    3=non-disclosure

    industry: values 1-13

    You want to do this:
    Code:
    generate byte flag = status==2 if {some condition is true}
    You said that the condition = "same industry as other firms with status==3".

    If I follow, you want to generate a list of industry codes for industries that have at least one firm with status==3. Is that right?

    If it is right, then perhaps you want something like this?

    Code:
    generate byte flag = status==2 if inlist(industry,a,b,c,...)
    ...replacing a,b,c, etc with the relevant industry codes.
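
    One way to build that list without typing the codes by hand might be -levelsof-. What follows is only a sketch on a made-up five-row dataset: the local macro name inds3 and the toy values are assumptions for illustration. It ignores year, and it would fail if no firm had status==3, because the macro would then be empty.

    ```stata
    * Hypothetical toy data: industry 2 has no status==3 firm
    clear
    input byte(industry status)
    1 2
    1 3
    2 2
    2 2
    3 3
    end

    * Collect, comma-separated, the industry codes with at least one status==3 firm
    levelsof industry if status == 3, local(inds3) separate(,)

    * flag is defined only inside those industries and is missing elsewhere
    generate byte flag = status == 2 if inlist(industry, `inds3')
    list
    ```

    Because of the -if- qualifier, flag is missing (not 0) for firms in industries outside the list.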

    In addition to indicating whether I understood correctly or not, please use -dataex- to provide a small sample dataset. (See the FAQ for details about how to do that.)

    HTH.


    --
    Bruce Weaver
    Version: Stata/MP 18.5 (Windows)


    • #3
      Flora:
      as an aside to Bruce's helpful reply, check whether at least part of your query can be handled by the -group()- function of -egen-.
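
      A minimal sketch of that idea, on made-up data: -egen, group()- assigns one identifier to each combination of the listed variables. The variable name indyear and the four toy rows are assumptions for illustration.

      ```stata
      * Hypothetical toy data with two years and two industries
      clear
      input float Year byte(industry status)
      2009 1 2
      2009 1 3
      2010 1 2
      2009 2 2
      end

      * One integer code per industry-year cell, numbered in sorted order
      egen long indyear = group(industry Year)
      list
      ```

      The identifier only labels the cells; it does not by itself say whether a cell contains a firm with status==3.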
      Kind regards,
      Carlo
      (StataNow 18.5)


      • #4
        Hi Bruce and Carlo,

        Thanks a lot for your reply. Here is an example of my data. For simplicity, I include only industries 1-3, status 2 (voluntary) and 3 (non-disclosure), and year 2009. Symbol identifies each individual firm. In my original data, there are 11 years (2009-2019) and 13 industries (1-13).

        I want to create a dummy variable which = 1 if the firm has status==2 and at least one of its industry peers (other firms in the same industry) in the same year has status==3, and 0 otherwise. I realize I do need to specify "at least one" industry peer; otherwise it would be misleading.

        Code:
        clear
        input float Year long(Symbol industry status)
        2009   2286  3 3
        2009   2013  2 3
        2009   2287  1 2
        2009   2020  1 3
        2009   2170  3 3
        2009   2321  1 3
        2009   2172  2 2
        2009   2045  2 3
        2009   2097  3 2
        2009   2114  3 3
        2009   2053  1 2
        2009   2019  1 2
        2009   2067  2 2
        2009   2006  3 2
        end
        Thanks!


        • #5
          Flora:
          you may want to try:
          Code:
          . bysort industry Year: gen wanted=1 if status==2 | status==3
          
          . replace wanted=0 if status!=2
          
          . list
          
               +--------------------------------------------+
               | Year   Symbol   industry   status   wanted |
               |--------------------------------------------|
            1. | 2009     2020          1        3        0 |
            2. | 2009     2287          1        2        1 |
            3. | 2009     2321          1        3        0 |
            4. | 2009     2019          1        2        1 |
            5. | 2009     2053          1        2        1 |
               |--------------------------------------------|
            6. | 2009     2013          2        3        0 |
            7. | 2009     2045          2        3        0 |
            8. | 2009     2172          2        2        1 |
            9. | 2009     2067          2        2        1 |
           10. | 2009     2170          3        3        0 |
               |--------------------------------------------|
           11. | 2009     2006          3        2        1 |
           12. | 2009     2286          3        3        0 |
           13. | 2009     2097          3        2        1 |
           14. | 2009     2114          3        3        0 |
               +--------------------------------------------+
          
          .
          Kind regards,
          Carlo
          (StataNow 18.5)


          • #6
            Thanks Carlo that works perfectly!


            • #7
              Flora:
              glad to read that it works for you.
              Due to a copy and paste mishap, the first line of the code was reported twice (now amended). Sorry for that.
              Kind regards,
              Carlo
              (StataNow 18.5)


              • #8
                @Carlo Lazzaro's code can be simplified to


                Code:
                gen wanted = status==2 
                and that says nothing about the status of any other firms.

                This may get closer to what you want.

                Code:
                bysort industry Year : egen any3 = max(status == 3)
                
                gen wanted = status == 2 & any3
                given that (as stated in #4 but not #1) you want comparisons to be for the same industry in the same year. If you wanted to look across years, just don't mention year in the code.

                See also https://www.stata.com/support/faqs/d...ble-recording/ for the "any" technique here.

                This would also work

                Code:
                gen any3 = status == 3
                bysort industry Year (any3) : replace any3 = any3[_N]
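
                The categorical alternative described in #1 could be built with the same "any" technique applied in both directions. This is a sketch only, on made-up data; the names any2 and wanted2 are assumptions, and it assumes comparisons within industry and year as in #4.

                ```stata
                * Hypothetical toy data: only industry 1 mixes status 2 and status 3
                clear
                input float Year byte(industry status)
                2009 1 2
                2009 1 3
                2009 2 2
                2009 2 2
                2009 3 3
                end

                * Indicators for "any firm in this industry-year has status 2 (or 3)"
                bysort industry Year : egen byte any2 = max(status == 2)
                bysort industry Year : egen byte any3 = max(status == 3)

                * 1 = status 2 with a status-3 peer; 2 = status 3 with a status-2 peer; 0 otherwise
                generate byte wanted2 = 0
                replace wanted2 = 1 if status == 2 & any3
                replace wanted2 = 2 if status == 3 & any2
                list, sepby(industry)
                ```

                A firm's own status cannot be both 2 and 3, so any3 for a status-2 firm necessarily refers to some other firm in the group.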


                • #9
                  Thanks Nick they both work!


                  • #10
                    If they both work, that is because empirically there is always a firm of status 3 whenever there is one of status 2 in the same industry and year. As said, the code in #5 pays no attention to any other firm.
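
                    To see the difference, here is a sketch on made-up data in which industry 2 has no firm of status 3. The names wanted5 and wanted8 are assumptions used only to keep the two approaches apart; wanted5 uses the simplified form of #5 given in #8.

                    ```stata
                    * Hypothetical toy data: industry 2 contains status 2 firms only
                    clear
                    input float Year byte(industry status)
                    2009 1 2
                    2009 1 3
                    2009 2 2
                    2009 2 2
                    end

                    * Equivalent of #5: depends on the firm's own status only
                    generate byte wanted5 = status == 2

                    * Approach of #8: also requires a status-3 firm in the same industry and year
                    bysort industry Year : egen byte any3 = max(status == 3)
                    generate byte wanted8 = status == 2 & any3

                    list, sepby(industry)
                    ```

                    The two variables agree in industry 1 but disagree for both firms in industry 2, where wanted5 is 1 and wanted8 is 0.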


                    • #11
                      Yes exactly... There are always both status 2 and 3 in the same industry in a year.
                      Last edited by Flora Yin; 12 Feb 2022, 08:54.
