Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Find changes within firms' structure in longitudinal data (ownership type & industry)

    Dear all,

    Greetings!

    I am struggling to find a way to count unique firms which could change their ownership type and industry throughout the time period in my dataset (2012-2018). I have created a unique id for each firm, but I am stuck at how I can identify whether there were some changes throughout the years.

    Any advice will be greatly appreciated!

    (okfs=ownership type; ind=industry)

    Code:
    clear
    input float(id year) byte okfs float ind
    1 2014 34 60
    1 2015 34 51
    1 2016 34 46
    1 2017 34 46
    1 2018 34 46
    2 2012 23 70
    2 2013 23 70
    2 2018 23 68
    3 2016 24 41
    3 2017 24 41
    3 2018 24 41
    4 2017 24 41
    5 2016 34 45
    5 2017 34 45
    6 2016 34 68
    6 2017 34 46
    7 2017 24 56
    7 2018 24 56
    8 2017 24 41
    9 2016 24 47
    9 2017 24 47
    10 2016 34 46
    10 2017 34 46
    10 2018 34 46
    11 2016 24 47
    12 2013 21 85
    13 2016 24 41
    14 2016 24 47
    14 2017 24 25
    14 2018 24 25
    15 2016 24 1
    15 2017 24 1
    16 2016 23 1
    16 2017 27 1
    16 2018 27 1
    17 2016 23 1
    17 2017 27 1
    17 2018 27 1
    18 2016 24 47
    18 2017 24 47
    18 2018 24 47
    19 2016 24 41
    20 2016 34 45
    20 2017 34 45
    20 2018 34 45
    21 2016 24 81
    21 2017 24 81
    21 2018 24 81
    22 2016 34 70
    23 2016 24 81
    24 2016 24 46
    25 2016 24 46
    25 2017 24 46
    26 2016 23 46
    27 2016 24 46
    27 2017 24 46
    28 2014 21 85
    28 2015 21 85
    28 2016 21 88
    28 2017 21 88
    28 2018 21 88
    29 2016 24 41
    29 2017 24 41
    30 2016 24 56
    30 2017 24 56
    30 2018 24 56
    31 2016 24 46
    31 2017 24 46
    31 2018 24 46
    32 2016 34 3
    32 2017 34 3
    33 2016 34 71
    33 2017 34 71
    33 2018 34 71
    34 2016 23 1
    34 2017 27 1
    34 2018 27 1
    35 2016 34 79
    35 2017 34 79
    35 2018 34 79
    36 2016 24 8
    37 2016 24 46
    38 2015 18 91
    38 2017 18 94
    38 2018 18 94
    39 2016 24 45
    39 2017 24 45
    39 2018 24 45
    40 2016 34 28
    40 2017 34 28
    40 2018 34 28
    41 2016 24 41
    42 2017 23 46
    43 2016 24 22
    43 2017 24 22
    43 2018 24 46
    44 2016 34 74
    44 2017 34 71
    45 2016 34 41
    45 2018 34 41
    end
    Last edited by Iuliia Svetetskaia; 13 Mar 2022, 07:16.

  • #2
    Thanks for the data example.

    See the FAQ https://www.stata.com/support/faqs/d...ions-in-group/

    There are at least two interpretations of change, (1) whether last and first values differ for each firm and (2) whether there has been any change within the period, which could mean a firm changing one way or another but ending at the same value as at first. The second is perhaps unlikely here, but here is code regardless.

    Code:
    clear
    input float(id year) byte okfs float ind
    1 2014 34 60
    1 2015 34 51
    1 2016 34 46
    1 2017 34 46
    1 2018 34 46
    2 2012 23 70
    2 2013 23 70
    2 2018 23 68
    3 2016 24 41
    3 2017 24 41
    3 2018 24 41
    4 2017 24 41
    5 2016 34 45
    5 2017 34 45
    6 2016 34 68
    6 2017 34 46
    7 2017 24 56
    7 2018 24 56
    8 2017 24 41
    9 2016 24 47
    9 2017 24 47
    10 2016 34 46
    10 2017 34 46
    10 2018 34 46
    11 2016 24 47
    12 2013 21 85
    13 2016 24 41
    14 2016 24 47
    14 2017 24 25
    14 2018 24 25
    15 2016 24 1
    15 2017 24 1
    16 2016 23 1
    16 2017 27 1
    16 2018 27 1
    17 2016 23 1
    17 2017 27 1
    17 2018 27 1
    18 2016 24 47
    18 2017 24 47
    18 2018 24 47
    19 2016 24 41
    20 2016 34 45
    20 2017 34 45
    20 2018 34 45
    21 2016 24 81
    21 2017 24 81
    21 2018 24 81
    22 2016 34 70
    23 2016 24 81
    24 2016 24 46
    25 2016 24 46
    25 2017 24 46
    26 2016 23 46
    27 2016 24 46
    27 2017 24 46
    28 2014 21 85
    28 2015 21 85
    28 2016 21 88
    28 2017 21 88
    28 2018 21 88
    29 2016 24 41
    29 2017 24 41
    30 2016 24 56
    30 2017 24 56
    30 2018 24 56
    31 2016 24 46
    31 2017 24 46
    31 2018 24 46
    32 2016 34 3
    32 2017 34 3
    33 2016 34 71
    33 2017 34 71
    33 2018 34 71
    34 2016 23 1
    34 2017 27 1
    34 2018 27 1
    35 2016 34 79
    35 2017 34 79
    35 2018 34 79
    36 2016 24 8
    37 2016 24 46
    38 2015 18 91
    38 2017 18 94
    38 2018 18 94
    39 2016 24 45
    39 2017 24 45
    39 2018 24 45
    40 2016 34 28
    40 2017 34 28
    40 2018 34 28
    41 2016 24 41
    42 2017 23 46
    43 2016 24 22
    43 2017 24 22
    43 2018 24 46
    44 2016 34 74
    44 2017 34 71
    45 2016 34 41
    45 2018 34 41
    end
    
    bysort id (year) : gen diff_o = okfs[1] != okfs[_N]
    
    by id : gen diff_i = ind[1] != ind[_N] 
    
    bysort id (okfs) : gen diff_o_2 = okfs[1] != okfs[_N] 
    
    bysort id (ind) : gen diff_i_2 = ind[1] != ind[_N] 
    
    tabdisp id, c(diff*)
    
    
    ----------------------------------------------------------
           id |     diff_o      diff_i    diff_o_2    diff_i_2
    ----------+-----------------------------------------------
            1 |          0           1           0           1
            2 |          0           1           0           1
            3 |          0           0           0           0
            4 |          0           0           0           0
            5 |          0           0           0           0
            6 |          0           1           0           1
            7 |          0           0           0           0
            8 |          0           0           0           0
            9 |          0           0           0           0
           10 |          0           0           0           0
           11 |          0           0           0           0
           12 |          0           0           0           0
           13 |          0           0           0           0
           14 |          0           1           0           1
           15 |          0           0           0           0
           16 |          1           0           1           0
           17 |          1           0           1           0
           18 |          0           0           0           0
           19 |          0           0           0           0
           20 |          0           0           0           0
           21 |          0           0           0           0
           22 |          0           0           0           0
           23 |          0           0           0           0
           24 |          0           0           0           0
           25 |          0           0           0           0
           26 |          0           0           0           0
           27 |          0           0           0           0
           28 |          0           1           0           1
           29 |          0           0           0           0
           30 |          0           0           0           0
           31 |          0           0           0           0
           32 |          0           0           0           0
           33 |          0           0           0           0
           34 |          1           0           1           0
           35 |          0           0           0           0
           36 |          0           0           0           0
           37 |          0           0           0           0
           38 |          0           1           0           1
           39 |          0           0           0           0
           40 |          0           0           0           0
           41 |          0           0           0           0
           42 |          0           0           0           0
           43 |          0           1           0           1
           44 |          0           1           0           1
           45 |          0           0           0           0
    ----------------------------------------------------------
    
    assert diff_o == diff_o_2 
    
    assert diff_i == diff_i_2
    In the data example, the answer is the same for each original variable.

    Comment


    • #3
      Thank you so much, Nick!

      May I ask another question?

      Can I make this more conditional? So, for example, can I make it so the difference variable will be equal to 1 only if there was a change of ownership type (okfs) from "okfs==31 | okfs==32 | okfs==33 | okfs==34" to "okfs==21 | okfs==22 | okfs==23 | okfs==24 | okfs==27"?

      Thank you in advance!


      Code:
      clear
      input float(id year) byte okfs float ind
       1 2014 34 59
       1 2015 34 51
       1 2016 34 46
       1 2017 34 46
       1 2018 34 46
       2 2012 23 69
       2 2013 23 69
       2 2018 23 68
       3 2016 24 41
       3 2017 24 41
       3 2018 24 41
       4 2017 24 41
       5 2016 34 45
       5 2017 34 45
       6 2016 34 68
       6 2017 34 46
       7 2017 24 55
       7 2018 24 55
       8 2017 24 41
       9 2016 24 47
       9 2017 24 47
      10 2016 34 46
      10 2017 34 46
      10 2018 34 46
      11 2016 24 47
      12 2013 21 85
      13 2016 24 41
      14 2016 24 47
      14 2017 24 25
      14 2018 24 25
      15 2016 24  1
      15 2017 24  1
      16 2016 23  1
      16 2017 27  1
      16 2018 27  1
      17 2016 23  1
      17 2017 27  1
      17 2018 27  1
      18 2016 24 47
      18 2017 24 47
      18 2018 24 47
      19 2016 24 41
      20 2016 34 45
      20 2017 34 45
      20 2018 34 45
      21 2016 24 80
      21 2017 24 80
      21 2018 24 80
      22 2016 34 69
      23 2016 24 80
      24 2016 24 46
      25 2016 24 46
      25 2017 24 46
      26 2016 23 46
      27 2016 24 46
      27 2017 24 46
      28 2014 21 85
      28 2015 21 85
      28 2016 21 87
      28 2017 21 87
      28 2018 21 87
      29 2016 24 41
      29 2017 24 41
      30 2016 24 55
      30 2017 24 55
      30 2018 24 55
      31 2016 24 46
      31 2017 24 46
      31 2018 24 46
      32 2016 34  3
      32 2017 34  3
      33 2016 34 71
      33 2017 34 71
      33 2018 34 71
      34 2016 23  1
      34 2017 27  1
      34 2018 27  1
      35 2016 34 79
      35 2017 34 79
      35 2018 34 79
      36 2016 24  5
      37 2016 24 46
      38 2015 18 90
      38 2017 18 94
      38 2018 18 94
      39 2016 24 45
      39 2017 24 45
      39 2018 24 45
      40 2016 34 28
      40 2017 34 28
      40 2018 34 28
      41 2016 24 41
      42 2017 23 46
      43 2016 24 22
      43 2017 24 22
      43 2018 24 46
      44 2016 34 74
      44 2017 34 71
      45 2016 34 41
      45 2018 34 41
      46 2016 24 41
      46 2017 24 41
      46 2018 24 41
      47 2016 24 46
      47 2017 24 46
      47 2018 24 46
      48 2016 34 46
      48 2017 34 46
      48 2018 34 46
      49 2017 23  1
      49 2018 23  1
      50 2016 24 45
      50 2017 24 45
      50 2018 24 45
      51 2016 24 47
      51 2017 24 47
      51 2018 24 47
      52 2018 24 33
      53 2016 24 96
      54 2016 24  1
      54 2017 24  1
      54 2018 24  1
      55 2016 23 46
      56 2016 24 62
      57 2016 24 41
      58 2016 24 52
      58 2017 24 52
      58 2018 24 52
      59 2016 24 47
      60 2016 24 47
      60 2017 24 47
      60 2018 24 47
      61 2016 24 80
      62 2016 24  1
      62 2017 24  1
      62 2018 24  1
      63 2016 24 55
      64 2016 24 25
      64 2017 24 25
      64 2018 24 25
      65 2016 23 68
      65 2017 23 68
      65 2018 23 68
      66 2017 34 37
      67 2016 24 46
      67 2017 24 46
      67 2018 24 46
      68 2016 24 46
      69 2016 34 46
      69 2017 34 46
      69 2018 34 46
      70 2016 24  1
      71 2016 34 62
      71 2017 34 62
      72 2016 24 55
      73 2016 34 41
      73 2017 34 41
      73 2018 34 41
      74 2017 24 41
      75 2016 34 94
      75 2017 34 94
      75 2018 34 94
      76 2016 34 46
      77 2017 24 45
      77 2018 24 45
      78 2016 24 80
      79 2016 23 46
      79 2017 23 46
      79 2018 23 46
      80 2016 34 45
      81 2017 24 41
      81 2018 24 41
      82 2016 24 45
      82 2017 24 45
      82 2018 24 45
      83 2017 34 49
      84 2014 24 45
      85 2015 23 52
      86 2018 24 80
      87 2016 24 64
      87 2018 24 64
      88 2014 23 52
      88 2015 18 52
      88 2016 24 47
      88 2017 24 47
      89 2014 24 51
      89 2015 24 51
      90 2014 34 51
      90 2015 34 51
      91 2015 24 45
      91 2016 24 45
      91 2017 24 41
      91 2018 24 45
      92 2016 24 52
      92 2017 24 52
      92 2018 24 52
      93 2015 24 51
      93 2017 24 46
      93 2018 24 46
      94 2018 24  1
      end

      Comment


      • #4
        Surely. I would construct a new variable first.

        Code:
        . gen which = cond(inrange(okfs, 31, 34), 1, cond(inlist(okfs, 21, 22, 23, 24, 27), 2, 3))
        
        . tab okfs which
        
                   |              which
              okfs |         1          2          3 |     Total
        -----------+---------------------------------+----------
                18 |         0          0          3 |         3 
                21 |         0          6          0 |         6 
                23 |         0          8          0 |         8 
                24 |         0         46          0 |        46 
                27 |         0          6          0 |         6 
                34 |        31          0          0 |        31 
        -----------+---------------------------------+----------
             Total |        31         66          3 |       100

        Comment


        • #5
          Originally posted by Nick Cox View Post
          Surely. I would construct a new variable first.

          Code:
          . gen which = cond(inrange(okfs, 31, 34), 1, cond(inlist(okfs, 21, 22, 23, 24, 27), 2, 3))
          
          . tab okfs which
          
          | which
          okfs | 1 2 3 | Total
          -----------+---------------------------------+----------
          18 | 0 0 3 | 3
          21 | 0 6 0 | 6
          23 | 0 8 0 | 8
          24 | 0 46 0 | 46
          27 | 0 6 0 | 6
          34 | 31 0 0 | 31
          -----------+---------------------------------+----------
          Total | 31 66 3 | 100

          Thank you again, Nick!

          The thing is I have tried out these codes a couple of times, but I can't achieve what I want to see specifically.

          I basically need to identify unique firms which have changed their type of ownership from "okfs==31 | okfs==32 | okfs==33 | okfs==34" into "okfs==18 | okfs==21 | okfs==22 | okfs==23 | okfs==24 | okfs==27" throughout 2012-2018 period. I have generated the "which" variable in attempt to be more specific, but the difference variable gets displayed as 1 in both cases, so whether 34 has been changed to 21, or 21 has been changed to 34 - there is still the value of 1. Therefore, I cannot differentiate these observations and count the firms which had this specific change from "okfs==31 | okfs==32 | okfs==33 | okfs==34" into "okfs==18 | okfs==21 | okfs==22 | okfs==23 | okfs==24 | okfs==27".

          Maybe there is no solution to this, but I was hoping that maybe there is something that could have been figured out. If you have any suggestions, please share them~

          Thank you in advance!

          Comment


          • #6
            I can't comment easily on what code you tried as you don't show any. But in principle once you've got your coarsened classification you can look at which firms changed. The data example is boosted by singleton panels with just one year of data and thus no scope for change. But the other firms in the data example don't change at all.


            Code:
            . gen which = cond(inrange(okfs, 31, 34), 1, cond(inlist(okfs, 21, 22, 23, 24, 27), 2, 3))
            
            * first year  
            . bysort id (year) : gen start = which[1]
            
            * last year 
            . by id: gen stop = which[_N]
            
            * tag singleton panels 
            . by id: gen singleton = _N == 1 
            
            . * we want to count firms not observations 
            . egen tag = tag(id)
            
            . * did any firms change? 
            . tab start stop if tag 
            
                       |               stop
                 start |         1          2          3 |     Total
            -----------+---------------------------------+----------
                     1 |        22          0          0 |        22 
                     2 |         0         71          0 |        71 
                     3 |         0          0          1 |         1 
            -----------+---------------------------------+----------
                 Total |        22         71          1 |        94 
            
            .* ignore singletons 
            . tab start stop if tag & !singleton 
            
                       |               stop
                 start |         1          2          3 |     Total
            -----------+---------------------------------+----------
                     1 |        17          0          0 |        17 
                     2 |         0         41          0 |        41 
                     3 |         0          0          1 |         1 
            -----------+---------------------------------+----------
                 Total |        17         41          1 |        59

            Comment


            • #7
              Originally posted by Nick Cox View Post
              I can't comment easily on what code you tried as you don't show any. But in principle once you've got your coarsened classification you can look at which firms changed. The data example is boosted by singleton panels with just one year of data and thus no scope for change. But the other firms in the data example don't change at all.


              Code:
              . gen which = cond(inrange(okfs, 31, 34), 1, cond(inlist(okfs, 21, 22, 23, 24, 27), 2, 3))
              
              * first year
              . bysort id (year) : gen start = which[1]
              
              * last year
              . by id: gen stop = which[_N]
              
              * tag singleton panels
              . by id: gen singleton = _N == 1
              
              . * we want to count firms not observations
              . egen tag = tag(id)
              
              . * did any firms change?
              . tab start stop if tag
              
              | stop
              start | 1 2 3 | Total
              -----------+---------------------------------+----------
              1 | 22 0 0 | 22
              2 | 0 71 0 | 71
              3 | 0 0 1 | 1
              -----------+---------------------------------+----------
              Total | 22 71 1 | 94
              
              .* ignore singletons
              . tab start stop if tag & !singleton
              
              | stop
              start | 1 2 3 | Total
              -----------+---------------------------------+----------
              1 | 17 0 0 | 17
              2 | 0 41 0 | 41
              3 | 0 0 1 | 1
              -----------+---------------------------------+----------
              Total | 17 41 1 | 59

              This works just right!

              Thank you so much, Nick!

              Comment

              Working...
              X