Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • calculate the number of share

    my data looks like this and I want to calculate how many times did each bank lead in the year

    for example chase bank in 2000 lead 3 times and participants one time so the end table supposed to look like the second one

    Code:
    clear
    input byte(Loan_ID Bank_ID) str15 Lender str4 Lender_Role int year
     1 60 "chase"           "Lead" 2000
     2 60 "chase"           "Lead" 2000
     3 60 "chase"           "Lead" 2001
     4 60 "chase"           "Part" 2001
     4 61 "bank of america" "Lead" 2001
     5 61 "bank of america" "Lead" 2002
     5 62 "amc"             "Part" 2002
     1 62 "amc"             "Lead" 2000
     3 67 "5fth"            "Part" 2001
     6 67 "5fth"            "Lead" 2010
     6 63 "ag"              "Part" 2010
     7 64 "rev"             "Lead" 2000
     7 69 "op"              "part" 2000
     9 64 "rev"             "part" 2009
    10 62 "amc"             "lead" 2009
    11 60 "chase"           "part" 2001
    12 60 "chase"           "Lead" 2000
    13 60 "chase"           "part" 2000
    end


    the table should look like this

    Code:
    clear
    input byte(Loan_ID Bank_ID) str15 Lender str4 Lender_Role int year byte(numberofsharelead numberofsharepart)
     1 60 "chase"           "Lead" 2000 3 1
     2 60 "chase"           "Lead" 2000 3 1
     3 60 "chase"           "Lead" 2001 1 0
     4 60 "chase"           "Part" 2001 0 2
     4 61 "bank of america" "Lead" 2001 1 0
     5 61 "bank of america" "Lead" 2002 1 0
     5 62 "amc"             "Part" 2002 0 1
     1 62 "amc"             "Lead" 2000 1 0
     3 67 "5fth"            "Part" 2001 0 1
     6 67 "5fth"            "Lead" 2010 1 0
     6 63 "ag"              "Part" 2010 0 1
     7 64 "rev"             "Lead" 2000 1 0
     7 69 "op"              "part" 2000 0 1
     9 64 "rev"             "part" 2009 0 1
    10 62 "amc"             "lead" 2009 1 0
    11 60 "chase"           "part" 2001 0 2
    12 60 "chase"           "Lead" 2000 3 1
    13 60 "chase"           "part" 2000 3 1
    end

  • #2
    Here is one of those variables. The other is just a variation. Note that the inconsistency in your data example obliges a clean-up on the fly. In your real data, check on lower and upper case, leading and trailing spaces, and so forth.

    Code:
    clear
    input byte(Loan_ID Bank_ID) str15 Lender str4 Lender_Role int year
     1 60 "chase"           "Lead" 2000
     2 60 "chase"           "Lead" 2000
     3 60 "chase"           "Lead" 2001
     4 60 "chase"           "Part" 2001
     4 61 "bank of america" "Lead" 2001
     5 61 "bank of america" "Lead" 2002
     5 62 "amc"             "Part" 2002
     1 62 "amc"             "Lead" 2000
     3 67 "5fth"            "Part" 2001
     6 67 "5fth"            "Lead" 2010
     6 63 "ag"              "Part" 2010
     7 64 "rev"             "Lead" 2000
     7 69 "op"              "part" 2000
     9 64 "rev"             "part" 2009
    10 62 "amc"             "lead" 2009
    11 60 "chase"           "part" 2001
    12 60 "chase"           "Lead" 2000
    13 60 "chase"           "part" 2000
    end
    
    bysort Bank_ID year : egen wanted1 = total(lower(Lender_Role) == "lead") 
    
    list, sepby(Bank_ID year)
    
    
         +-----------------------------------------------------------------+
         | Loan_ID   Bank_ID            Lender   Lender~e   year   wanted1 |
         |-----------------------------------------------------------------|
      1. |      13        60             chase       part   2000         3 |
      2. |       2        60             chase       Lead   2000         3 |
      3. |      12        60             chase       Lead   2000         3 |
      4. |       1        60             chase       Lead   2000         3 |
         |-----------------------------------------------------------------|
      5. |      11        60             chase       part   2001         1 |
      6. |       4        60             chase       Part   2001         1 |
      7. |       3        60             chase       Lead   2001         1 |
         |-----------------------------------------------------------------|
      8. |       4        61   bank of america       Lead   2001         1 |
         |-----------------------------------------------------------------|
      9. |       5        61   bank of america       Lead   2002         1 |
         |-----------------------------------------------------------------|
     10. |       1        62               amc       Lead   2000         1 |
         |-----------------------------------------------------------------|
     11. |       5        62               amc       Part   2002         0 |
         |-----------------------------------------------------------------|
     12. |      10        62               amc       lead   2009         1 |
         |-----------------------------------------------------------------|
     13. |       6        63                ag       Part   2010         0 |
         |-----------------------------------------------------------------|
     14. |       7        64               rev       Lead   2000         1 |
         |-----------------------------------------------------------------|
     15. |       9        64               rev       part   2009         0 |
         |-----------------------------------------------------------------|
     16. |       3        67              5fth       Part   2001         0 |
         |-----------------------------------------------------------------|
     17. |       6        67              5fth       Lead   2010         1 |
         |-----------------------------------------------------------------|
     18. |       7        69                op       part   2000         0 |
         +-----------------------------------------------------------------+

    Comment


    • #3
      Moha:
      you should first amend all the inconsistencies in the way -Lead- and -Part- are reported:
      Code:
      . *(1 variable, 1 observation pasted into data editor)
      
      . *(1 variable, 1 observation pasted into data editor)
      
      . *(1 variable, 1 observation pasted into data editor)
      
      . *(1 variable, 1 observation pasted into data editor)
      
      . *(1 variable, 1 observation pasted into data editor)
      
      encode Lender_Role, g(num_Lender_Role)
      
      .. bysort Bank_ID year num_Lender_Role: gen wanted=_n if num_Lender_Role==1
      (8 missing values generated)
      
      . bysort Bank_ID year num_Lender_Role: replace wanted=_n if num_Lender_Role==2
      (8 real changes made)
      
      . list
      
           +---------------------------------------------------------------------------+
           | Loan_ID   Bank_ID            Lender   Lender~e   year   num_Le~e   wanted |
           |---------------------------------------------------------------------------|
        1. |       2        60             chase       Lead   2000       Lead        1 |
        2. |       1        60             chase       Lead   2000       Lead        2 |
        3. |      12        60             chase       Lead   2000       Lead        3 |
        4. |      13        60             chase       part   2000       Part        1 |
        5. |       3        60             chase       Lead   2001       Lead        1 |
           |---------------------------------------------------------------------------|
        6. |       4        60             chase       Part   2001       Part        1 |
        7. |      11        60             chase       part   2001       Part        2 |
        8. |       4        61   bank of america       Lead   2001       Lead        1 |
        9. |       5        61   bank of america       Lead   2002       Lead        1 |
       10. |       1        62               amc       Lead   2000       Lead        1 |
           |---------------------------------------------------------------------------|
       11. |       5        62               amc       Part   2002       Part        1 |
       12. |      10        62               amc       lead   2009       Lead        1 |
       13. |       6        63                ag       Part   2010       Part        1 |
       14. |       7        64               rev       Lead   2000       Lead        1 |
       15. |       9        64               rev       part   2009       Part        1 |
           |---------------------------------------------------------------------------|
       16. |       3        67              5fth       Part   2001       Part        1 |
       17. |       6        67              5fth       Lead   2010       Lead        1 |
       18. |       7        69                op       part   2000       Part        1 |
           +---------------------------------------------------------------------------+
      
      .
      If you're interested in the total, Nick's helpful example is the way to go.
      Last edited by Carlo Lazzaro; 16 Jan 2022, 06:51.
      Kind regards,
      Carlo
      (StataNow 18.5)

      Comment

      Working...
      X