  • Percent difference between many observations and one specific observation

    Dear all,

    I have learned how to do calculations between two observations of the same variable using the [_n] and [_N] subscripts.
    However, I cannot figure out the code for a calculation (for instance, a percent difference) between several observations and one specific observation, all belonging to the same variable.

    For instance, in the dataset below:
    Code:
     
    cty Scenario Year Value
    Bangladesh GFDL379 2030 16,567.12
    Bangladesh IPSL379 2030 16,596.36
    Bangladesh MPI379 2030 16,578.25
    Bangladesh MRI379 2030 16,570.33
    Bangladesh REFar6_NoCC 2030 16,560.81
    Cambodia GFDL379 2030 4,387.49
    Cambodia IPSL379 2030 4,391.17
    Cambodia MPI379 2030 4,387.29
    Cambodia MRI379 2030 4,389.80
    Cambodia REFar6_NoCC 2030 4,383.41
    Bangladesh GFDL379 2050 17,152.99
    Bangladesh IPSL379 2050 17,239.88
    Bangladesh MPI379 2050 17,183.69
    Bangladesh MRI379 2050 17,139.80
    Bangladesh REFar6_NoCC 2050 17,144.19
    Cambodia GFDL379 2050 4,543.16
    Cambodia IPSL379 2050 4,549.64
    Cambodia MPI379 2050 4,542.84
    Cambodia MRI379 2050 4,551.33
    Cambodia REFar6_NoCC 2050 4,535.17
    For each country and each year (2030 and 2050), I need to calculate the percent difference between each of the other scenarios and REFar6_NoCC.
    For instance, for Bangladesh in 2030 I need the % difference between GFDL and REF, IPSL and REF, MPI and REF, and MRI and REF.
    The same applies to Bangladesh in 2050, and then to Cambodia in 2030 and 2050.

    Any advice or suggestion is very welcome.

    thanks
    Nicola

  • #2
    EDITED: Below, "Scenario" is assumed to be a string variable. If it is a numerical variable with value labels, you need to modify the code. The technique for generating the target variable is described in https://journals.sagepub.com/doi/pdf...867X1101100210.
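
    If "Scenario" is instead a numeric variable with value labels, the modification might look like the sketch below. The label name scenlbl and the new variable names are assumptions made for illustration (check the actual label name with -describe-); either the "text":labelname syntax or -decode- should let the comparison work.

    ```stata
    * Sketch only: assumes Scenario is numeric and carries a value label
    * named scenlbl (hypothetical name, not taken from the original data).
    bysort cty Year: egen target = max(cond(Scenario == "REFar6_NoCC":scenlbl, Value, .))

    * Alternative: decode to a string copy (Scenario_str is a made-up name),
    * then compare strings exactly as in the string-variable case.
    decode Scenario, generate(Scenario_str)
    bysort cty Year: egen target2 = max(cond(Scenario_str == "REFar6_NoCC", Value, .))
    ```

    Both routes should produce the same reference value within each country-year group; the -decode- route is simpler to read if the string copy is useful elsewhere.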

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str10 cty str11 Scenario int Year float Value
    "Bangladesh" "GFDL379"     2030 16567.12
    "Bangladesh" "MRI379"      2030 16570.33
    "Bangladesh" "REFar6_NoCC" 2030 16560.81
    "Bangladesh" "IPSL379"     2030 16596.36
    "Bangladesh" "MPI379"      2030 16578.25
    "Bangladesh" "GFDL379"     2050 17152.99
    "Bangladesh" "MPI379"      2050 17183.69
    "Bangladesh" "MRI379"      2050  17139.8
    "Bangladesh" "IPSL379"     2050 17239.88
    "Bangladesh" "REFar6_NoCC" 2050 17144.19
    "Cambodia"   "MRI379"      2030   4389.8
    "Cambodia"   "IPSL379"     2030  4391.17
    "Cambodia"   "GFDL379"     2030  4387.49
    "Cambodia"   "REFar6_NoCC" 2030  4383.41
    "Cambodia"   "MPI379"      2030  4387.29
    "Cambodia"   "IPSL379"     2050  4549.64
    "Cambodia"   "REFar6_NoCC" 2050  4535.17
    "Cambodia"   "GFDL379"     2050  4543.16
    "Cambodia"   "MRI379"      2050  4551.33
    "Cambodia"   "MPI379"      2050  4542.84
    end
    
    
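    * Copy the REFar6_NoCC value to every observation in the same country-year group
    bys cty Year: egen target= max(cond(Scenario=="REFar6_NoCC", Value, .))
    * Percent difference from the reference, expressed relative to each scenario's own Value
    gen wanted= ((Value-target)/Value)*100 if Scenario!="REFar6_NoCC"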

    Result:

    Code:
    .  l, sepby(cty Year)
    
         +------------------------------------------------------------------+
         |        cty      Scenario   Year     Value     target      wanted |
         |------------------------------------------------------------------|
      1. | Bangladesh       GFDL379   2030   16567.1   16560.81     .038079 |
      2. | Bangladesh        MRI379   2030   16570.3   16560.81    .0574493 |
      3. | Bangladesh   REFar6_NoCC   2030   16560.8   16560.81           . |
      4. | Bangladesh       IPSL379   2030   16596.4   16560.81    .2141965 |
      5. | Bangladesh        MPI379   2030   16578.3   16560.81    .1051948 |
         |------------------------------------------------------------------|
      6. | Bangladesh       GFDL379   2050     17153   17144.19    .0513076 |
      7. | Bangladesh        MPI379   2050   17183.7   17144.19    .2298691 |
      8. | Bangladesh        MRI379   2050   17139.8   17144.19   -.0256052 |
      9. | Bangladesh       IPSL379   2050   17239.9   17144.19    .5550584 |
     10. | Bangladesh   REFar6_NoCC   2050   17144.2   17144.19           . |
         |------------------------------------------------------------------|
     11. |   Cambodia        MRI379   2030    4389.8    4383.41    .1455567 |
     12. |   Cambodia       IPSL379   2030   4391.17    4383.41    .1767129 |
     13. |   Cambodia       GFDL379   2030   4387.49    4383.41    .0929934 |
     14. |   Cambodia   REFar6_NoCC   2030   4383.41    4383.41           . |
     15. |   Cambodia        MPI379   2030   4387.29    4383.41    .0884346 |
         |------------------------------------------------------------------|
     16. |   Cambodia       IPSL379   2050   4549.64    4535.17    .3180518 |
     17. |   Cambodia   REFar6_NoCC   2050   4535.17    4535.17           . |
     18. |   Cambodia       GFDL379   2050   4543.16    4535.17    .1758739 |
     19. |   Cambodia        MRI379   2050   4551.33    4535.17    .3550645 |
     20. |   Cambodia        MPI379   2050   4542.84    4535.17    .1688354 |
         +------------------------------------------------------------------+
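
    To see what each piece of the one-line -egen- call does, it can be split into steps, as in the sketch below. The variable names ref_only, ref_value and pct_vs_ref are made up for illustration. The last step is a variant, not the calculation shown above: it assumes the percent difference is wanted relative to the reference scenario, so it divides by the reference value instead of by Value. The sketch also assumes exactly one REFar6_NoCC row per country and year.

    ```stata
    * Step 1: cond() returns Value on the reference row and missing elsewhere.
    generate double ref_only = cond(Scenario == "REFar6_NoCC", Value, .)

    * Step 2: egen's max() ignores missing values, so it spreads the single
    * nonmissing value to every row of the same country-year group.
    bysort cty Year: egen double ref_value = max(ref_only)

    * Step 3 (assumption): percent difference relative to the reference,
    * i.e. dividing by ref_value rather than by Value.
    generate double pct_vs_ref = 100 * (Value - ref_value) / ref_value if Scenario != "REFar6_NoCC"

    list cty Scenario Year Value ref_value pct_vs_ref, sepby(cty Year)
    ```

    With values this close to each other, the two denominators give nearly identical percentages, but the choice matters when scenarios differ more from the reference.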
    Last edited by Andrew Musau; 01 Apr 2024, 15:19.


    • #3
      Thank you very much, Andrew.
      If I understand correctly, the key element of the code you sent is the cond() function, which creates a new variable whose value is taken from the REFar6 observation.
      I have no experience with cond(), so I will study it.

      many thanks again
      Nicola
