Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generate a weight Matrix to equalize total values across codes with many to many relationship


    Hello, I have data with total values for a given code. For example, for code1="0129", the total value of transaction is (value1) 7560000. Similarly, for code2="0168", the total value of transaction is 897000.
    Code1 and Code2 have many to many relationships, at times more complex than the posted data show.
    Given the total values, is there a way to generate a weight matrix to help identify what share of 0342 (code1) is going into 0342 (code2)? In other words, is there a way to generate weights for each unique combination of code1 and code2?

    Many thanks.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float year str4(code1 code2) double(value1 value2)
    1988 "0129" "0168"   7560000   8957000
    1988 "0149" "0168" 397668016   8957000
    1988 "0149" "0173" 397668016  22163000
    1988 "0149" "0174" 397668016 114397000
    1988 "0149" "0175" 397668016 169517000
    1988 "0149" "0176" 397668016  86599000
    1988 "0149" "0179" 397668016   3329000
    1988 "0342" "0342" 142146000 127989000
    1988 "0344" "0342" 299409000 127989000
    1988 "0343" "0345"  37468000  52547000
    1988 "0342" "0345" 142146000  52547000
    1988 "0149" "0981" 397668016  10590000
    end
    Last edited by Md Rahman; 29 Mar 2024, 16:11.

  • #2
    No Mata necessary:

    Code:
    . * Example generated by -dataex-. For more info, type help dataex
    . clear
    
    . input float year str4(code1 code2) double(value1 value2)
    
              year      code1      code2      value1      value2
      1. 1988 "0129" "0168"   7560000   8957000
      2. 1988 "0149" "0168" 397668016   8957000
      3. 1988 "0149" "0173" 397668016  22163000
      4. 1988 "0149" "0174" 397668016 114397000
      5. 1988 "0149" "0175" 397668016 169517000
      6. 1988 "0149" "0176" 397668016  86599000
      7. 1988 "0149" "0179" 397668016   3329000
      8. 1988 "0342" "0342" 142146000 127989000
      9. 1988 "0344" "0342" 299409000 127989000
     10. 1988 "0343" "0345"  37468000  52547000
     11. 1988 "0342" "0345" 142146000  52547000
     12. 1988 "0149" "0981" 397668016  10590000
     13. end
    
    .
    . bys year code1 : egen w1 = total(value1)
    
    . replace w1 = value1/w1
    (12 real changes made)
    
    .
    . bys year code2 : egen w2 = total(value2)
    
    . replace w2 = value2/w2
    (12 real changes made)
    
    .
    . sort code1 code2
    
    . list , sepby(code1)
    
         +--------------------------------------------------------------+
         | year   code1   code2      value1      value2         w1   w2 |
         |--------------------------------------------------------------|
      1. | 1988    0129    0168     7560000     8957000          1   .5 |
         |--------------------------------------------------------------|
      2. | 1988    0149    0168   3.977e+08     8957000   .1428571   .5 |
      3. | 1988    0149    0173   3.977e+08    22163000   .1428571    1 |
      4. | 1988    0149    0174   3.977e+08   1.144e+08   .1428571    1 |
      5. | 1988    0149    0175   3.977e+08   1.695e+08   .1428571    1 |
      6. | 1988    0149    0176   3.977e+08    86599000   .1428571    1 |
      7. | 1988    0149    0179   3.977e+08     3329000   .1428571    1 |
      8. | 1988    0149    0981   3.977e+08    10590000   .1428571    1 |
         |--------------------------------------------------------------|
      9. | 1988    0342    0342   1.421e+08   1.280e+08         .5   .5 |
     10. | 1988    0342    0345   1.421e+08    52547000         .5   .5 |
         |--------------------------------------------------------------|
     11. | 1988    0343    0345    37468000    52547000          1   .5 |
         |--------------------------------------------------------------|
     12. | 1988    0344    0342   2.994e+08   1.280e+08          1   .5 |
         +--------------------------------------------------------------+
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    Comment


    • #3
      Hi, Maarten. Thanks. I may have failed to explain the problem properly.

      value1 represent total value for a given code1. Value2 is defined analogously. Thus, the first step you have shown is not necessary.

      To be precise, I need one column of weights. That is weights for each unique combination of code1 and code2. For example, code1="0129" is fully absorbed in code2="0168"; therefore, weight for this combination is 1. On the other hand, value for code2="0173" is surely a fraction of the value for code1="0149", thus, weight = value2/value1. These simple methods cannot populate all the weights. Thus, I need an algorithm that will.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float year str4 code1 double value1 str4 code2 double value2 float weight
      1988 "0129"   7560000 "0168"   8957000          1
      1988 "0149" 397668016 "0168"   8957000          .
      1988 "0149" 397668016 "0173"  22163000  .05573242
      1988 "0149" 397668016 "0174" 114397000   .2876696
      1988 "0149" 397668016 "0175" 169517000   .4262777
      1988 "0149" 397668016 "0176"  86599000  .21776707
      1988 "0149" 397668016 "0179"   3329000 .008371305
      1988 "0980" 989020992 "0223" 161852000          .
      1988 "0344" 299409000 "0342" 127989000          .
      1988 "0342" 142146000 "0342" 127989000          .
      1988 "0344" 299409000 "0344" 298487008   .9969206
      1988 "0342" 142146000 "0345"  52547000          .
      1988 "0343"  37468000 "0345"  52547000          1
      1988 "0483" 164278000 "0483" 101767000   .6194804
      1988 "0488" 278159000 "0485"  63296000   .2275533
      1988 "0980" 989020992 "0546" 213323000          .
      1988 "0980" 989020992 "0567" 950438016          .
      1988 "0583"  95841000 "0581"  93251000   .9729761
      1988 "0980" 989020992 "0743"   4548000          .
      1988 "0980" 989020992 "0981"  10590000          .
      1988 "0583"  95841000 "0981"  10590000          .
      1988 "0149" 397668016 "0981"  10590000          .
      1988 "0980" 989020992 "0984" 163091000  .16490145
      1988 "0980" 989020992 "0985"  39681000   .0401215
      1988 "0980" 989020992 "0986"  30539000  .03087801
      1988 "0980" 989020992 "0989" 935064992          .
      1988 "0483" 164278000 "0989" 935064992          .
      1988 "0488" 278159000 "0989" 935064992          .
      end
      Last edited by Md Rahman; 30 Mar 2024, 08:42.

      Comment

      Working...
      X