Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to drop only one of two values for one observation

    Hi all,

    I am new here, using Stata 15.1 and have the following problem.
    I merged the mv and mv1 from different sources with gvkey for several years (that's why there are duplicates). As you can see, for some gvkey's I get only the value from mv or mv1, in other cases both. I want to keep only one real mv. That is mv if mv1 is missing, mv1 if mv is missing or mv1 if both values for one gvkey pop up. I am struggling with the codes.

    Thanks in advance!

    Kind regards

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long gvkey int fyear float(mv mv1)
    1082 2003           .     11910910
    1082 2004           .     16136240
    1082 2005           .     15218610
    1082 2006           .     15483530
    1082 2007           .     21948000
    1082 2008           .      7384000
    1082 2009           .     12388120
    1082 2010           .     10256040
    1177 2004           .  17505622016
    1177 2004           .  17505622016
    1177 2005           .    178080000
    1177 2005           .    178080000
    1177 2006           .  21579479040
    1177 2006           .  21579479040
    1177 2007           .  28297619456
    1177 2007           .  28297619456
    1177 2008           .   9111335936
    1177 2008           .   9111335936
    1177 2009           .  12926969856
    1177 2009           .  12926969856
    1177 2010           .  11978993664
    1177 2010           .  11978993664
    1177 2011           .  15100663808
    1177 2011           .  15100663808
    1177 2012           .  14521162752
    1177 2012           .  14521162752
    1177 2013           .  25159049216
    1177 2013           .  25159049216
    1177 2014           .  30787817472
    1177 2014           .  30787817472
    1177 2015           .  37485248512
    1177 2015           .  37485248512
    1177 2016           .  47336411136
    1177 2016           .  47336411136
    1177 2017           .  59125190656
    1177 2017           .  59125190656
    1382 2004           .    210023584
    1382 2005           .   2798242048
    1382 2006           .   4982297088
    1382 2007           .   1824930304
    1382 2008           .    584207360
    1382 2009           .    594470080
    1382 2010           .   1818861568
    1382 2011           .    800833280
    1382 2012           .   2056528384
    1682 2004           .    249724800
    1682 2005           .    377568000
    1682 2006           .    644220800
    1682 2007           .    744832000
    1682 2008           .    713060992
    1682 2009           .    624149184
    1682 2010           .   1145254912
    1682 2011           .    713833984
    1682 2012           .    102733600
    1682 2013           .    142584000
    1906 2004           .    386010880
    1906 2005           .    458849984
    1906 2006           .    574563264
    1906 2007           .    351077824
    1906 2008           .    208278000
    1906 2009           .    179005264
    1906 2010           .    243151520
    1906 2011           .     75903752
    1906 2012           .    166135200
    1906 2013           .    426838784
    1906 2014           .    325709792
    1906 2015           .    315050400
    1906 2016           .    359991488
    1906 2017           .    405330816
    2290 2004           .    147360000
    2290 2005           .   2058126976
    2290 2006           .   3487710976
    2290 2007           .   1876631296
    2290 2008           .    341811008
    2290 2009           .    872791936
    2290 2010           .   1521983360
    2290 2011           .    392720864
    2290 2012           .    862245312
    2589 2004           .    243773936
    2589 2005           .    261925296
    2589 2006           .    224631248
    2589 2007           .    153036000
    2589 2008           .     42302880
    2589 2009           .    100380408
    2597 2004  9772832768  7.57455e+10
    2597 2005 11404477440  81678057472
    2597 2006 11276872704  8.64371e+10
    2597 2007 13231353856 1.098444e+11
    2597 2008  8117046272  41812406272
    2597 2009 11541479424  73082281984
    2856 2004           .    737031104
    2856 2005           .    747118784
    2856 2006           .    834913152
    2856 2007           .    681391232
    2856 2008           .    644104256
    2856 2009           .    637889792
    2856 2010           .    745468672
    2856 2011           .    830154880
    2856 2012           .    973862208
    2859 2004           .    981734016
    end

  • #2
    Code:
    gen wanted = mv1 if !missing(mv1)
    replace wanted = mv if missing(wanted)

    Comment


    • #3
      I wonder if the best way to handle this would to merge your two files with an mv variable in each file. Then, you will use the update and replace options for the merge command and it should do the job (you just have to be careful with the choice of the master and using files before merging)

      Comment


      • #4
        Thank you very much Clyde, it was indeed that simple!

        @Alexis I agree with you. That was the plan initially if the output from both sources were the same. That was not the case, unfortunately. Therefore, I decided to do it this way, and it worked out pretty good.

        Comment

        Working...
        X