  • Sorting Panel Data to clean errors

    I am using NLSY1979 panel data. I want to sort my weight variable in ascending order while still seeing every weight observation for the same ID. That way I can judge whether an extreme weight value is a mistake or is reasonable compared with that ID's weights in the other years. I am trying to get something like the listing below, but I cannot figure out which commands to use.
    'sort weightchange ID' and vice versa are not working.
    "sort ID year
    sort weightchange, stable" also doesn't work. Any ideas?
    ID  year  Weightchange  weight
    1   2002  -700          200
    1   2001   100          900
    1   2003     0          200
    2   2001  -650
    2   2002
    2   2003
    3   2001  -645
    3   2002
    3   2003
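
    One reading of the goal above can be sketched as follows. This is a minimal sketch, not a confirmed answer: it assumes the variables are numeric and named ID, year, weightchange, and weight, and it introduces a hypothetical helper variable, maxabs, that holds each ID's largest absolute weight change so that whole panels can be ordered by it.

    ```stata
    * Assumes numeric variables ID, year, weightchange, and weight.
    * maxabs is a hypothetical helper: each ID's largest absolute weight change.
    bysort ID: egen maxabs = max(abs(weightchange))

    * Order whole panels by the helper (largest first), keeping each ID's years together.
    gsort -maxabs ID year

    * sepby() draws a separator between IDs so each person's history reads as a block.
    list ID year weightchange weight, sepby(ID)
    ```

    Sorting on a per-ID summary rather than on weightchange itself is what keeps all observations for an ID adjacent; a plain sort on weightchange scatters them.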

  • #2
    What exactly do you want your commands to do?
    What do you mean by "does not work"?

    Maybe you can give us an example dataset (see the FAQ in the black bar at the top of this page for how to do that) and run the commands you tried on that example dataset. Show us what Stata showed you and explain why that is not what you want.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Let me agree with the advice from Maarten Buis about providing an example of your data using dataex.

      If you are running version 17, 16 or a fully updated version 15.1 or 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      One thing dataex will do is make it clear if your Weightchange variable is stored as a numeric value or as a string. If it is stored as a string, that is probably why your sorts "are not working" - they are in fact working, but strings containing numbers do not sort in the same order as numbers. Here is an example.
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float number str4 string
       100 "100" 
       200 "200" 
        25 "25"  
        10 "20"  
      -100 "-100"
      -200 "-200"
       -25 "-25" 
       -10 "-10" 
         0 "0"   
      end
      
      sort number
      list, clean
      sort string
      list, clean
      Code:
      . sort number
      
      . list, clean
      
             number   string  
        1.     -200     -200  
        2.     -100     -100  
        3.      -25      -25  
        4.      -10      -10  
        5.        0        0  
        6.       10       20  
        7.       25       25  
        8.      100      100  
        9.      200      200  
      
      . sort string
      
      . list, clean
      
             number   string  
        1.      -10      -10  
        2.     -100     -100  
        3.     -200     -200  
        4.      -25      -25  
        5.        0        0  
        6.      100      100  
        7.       10       20  
        8.      200      200  
        9.       25       25
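
      If the variable does turn out to be a string, one possible fix is to convert it with destring before sorting. The lines below are a sketch that assumes the example data above are in memory; string_num is a hypothetical name for the converted variable.

      ```stata
      * Assumes the example data above are in memory (variables number and string).
      * describe reports the storage type of each variable (here float and str4).
      describe number string

      * destring converts numeric text to a number; generate() leaves the
      * original string variable untouched. string_num is a hypothetical name.
      destring string, generate(string_num)

      * Sorting on the converted variable now follows numeric order.
      sort string_num
      list, clean
      ```

      If any value contains nonnumeric characters, destring will report that and leave the variable unconverted, which is itself a useful check on the data.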



      • #4
        Having done the work above, I now see that instead of posting the example as a reply to this topic, a new topic was started and responded to at

        https://www.statalist.org/forums/for...o-clean-errors



        • #5
          Hello, I have a similar question about sorting and cleaning my data. I am trying to see whether there is a relationship between Foreign Direct Investment and Foreign Aid flows from donor to recipient countries. I would like to count the number of times a donor country sends both Foreign Aid and Foreign Direct Investment to a recipient country in specific years. Does anyone know of commands that would help?

          Here is my dataset:

          Code:
          * dataex IsoCode1 IsoCode2 FDIOutflowsDonor NetODA
          clear
          input str4 IsoCode1 str3 IsoCode2 double(FDIOutflowsDonor NetODA)
          "DEU" "AFG" 2 20.29
          "DEU" "AFG" -1.7 24.78
          "DEU" "AFG" 0 15.66
          "DEU" "AFG" 0 23.09
          "DEU" "AFG" 0 17.32
          "DEU" "AFG" . 73.06
          "DEU" "AFG" 0 143.61
          "DEU" "AFG" 0 104.86
          "DEU" "AFG" 0 86.3
          "DEU" "AFG" 0 113.48
          "DEU" "AFG" 0 133.08
          "DEU" "AFG" 0 220.67
          "DEU" "AFG" 0 281.01
          "DEU" "AFG" 0 327.91
          "DEU" "AFG" 0 477.04
          "DEU" "AFG" 0 516.11
          "DEU" "AFG" 0 525.87
          "DEU" "AFG" 1.33 531.41
          "DEU" "AFG" -7.97 503.64
          "DEU" "AFG" 0 404.49
          "DEU" "AFG" 0 556.11
          "DEU" "AFG" 1.13 511.34
          "DEU" "AFG" -1.18 438.53
          "DEU" "AFG" 0 423.2
          "DEU" "AGO" 12 29.43
          "DEU" "AGO" 15.6 21.04
          "DEU" "AGO" 6.3 15.96
          "DEU" "AGO" 2.1 25.27
          "DEU" "AGO" -1.8 18.4
          "DEU" "AGO" 2.68453973566232 16.45
          "DEU" "AGO" -2.823396546044892 25.57
          "DEU" "AGO" 0 17.19
          "DEU" "AGO" 2.483346060481893 14.88
          "DEU" "AGO" 2 14
          "DEU" "AGO" 0 12.8
          "DEU" "AGO" 31.47933723677115 12.46
          "DEU" "AGO" 19 11.13
          "DEU" "AGO" 11.1 8.17
          "DEU" "AGO" 30.5 7.16
          "DEU" "AGO" . 5.44
          "DEU" "AGO" . 4.92
          "DEU" "AGO" 13.28 4.03
          "DEU" "AGO" -13.29 4.03
          "DEU" "AGO" -8.88 2.49
          "DEU" "AGO" -1.11 2.91
          "DEU" "AGO" 0 2.84
          "DEU" "AGO" -2.36 4.2
          "DEU" "AGO" -3.36 4.4
          "DEU" "BDI" -2.7 16.1
          "DEU" "BDI" 0 8.21
          "DEU" "BDI" 1.7 6.73
          "DEU" "BDI" -1.1 2.37
          "DEU" "BDI" 0 4.92
          "DEU" "BDI" 0 5.38
          "DEU" "BDI" 0 4.2
          "DEU" "BDI" 0 6.1
          "DEU" "BDI" 0 11.91
          "DEU" "BDI" 0 12.99
          "DEU" "BDI" 0 16.4
          "DEU" "BDI" 0 23.39
          "DEU" "BDI" 0 22.11
          "DEU" "BDI" 0 27.12
          "DEU" "BDI" 0 29.92
          "DEU" "BDI" 0 31.49
          "DEU" "BDI" 0 24.54
          "DEU" "BDI" 0 24.03
          "DEU" "BDI" 0 24.59
          "DEU" "BDI" -1.11 19.46
          "DEU" "BDI" 0 50.97
          "DEU" "BDI" 0 35.13
          "DEU" "BDI" 0 33.38
          "DEU" "BDI" 0 38.67
          "DEU" "BEN" -.7 26.08
          "DEU" "BEN" 0 27.12
          "DEU" "BEN" 0 43.64
          "DEU" "BEN" 0 38.36
          "DEU" "BEN" 0 35.39
          "DEU" "BEN" 0 36.2
          "DEU" "BEN" 0 37.25
          "DEU" "BEN" 0 39.92
          "DEU" "BEN" . 28.09
          "DEU" "BEN" . 31.56
          "DEU" "BEN" . 29.85
          "DEU" "BEN" . 30.06
          "DEU" "BEN" . 44.55
          "DEU" "BEN" . 41.92
          "DEU" "BEN" 0 35.21
          "DEU" "BEN" 0 47.37
          "DEU" "BEN" 0 48.58
          "DEU" "BEN" 0 49.37
          "DEU" "BEN" 0 73.93
          "DEU" "BEN" 1.11 42.88
          "DEU" "BEN" 4.43 44.06
          "DEU" "BEN" 2.26 39.83
          "DEU" "BEN" 0 42.15
          "DEU" "BEN" 0 46.68
          "DEU" "BFA" 0 47.76
          "DEU" "BFA" -.6 42.08
          "DEU" "BFA" 0 56.8
          "DEU" "BFA" 0 51.25
          end



          • #6
            #5 That would depend on whatever variable(s) you have that represent time or date, which aren't evident in your data example.
            Last edited by Nick Cox; 20 Nov 2022, 07:10.
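
            Setting the time variable aside, the count per donor-recipient pair might be sketched as below. This assumes the data example from #5 are in memory and that "sending" means a positive, nonmissing value of both FDIOutflowsDonor and NetODA; that definition is an assumption, as are the names both and n_both. The variable year and the range 1990 to 1999 in the last line are purely hypothetical, since no time variable appears in the example.

            ```stata
            * Assumes the #5 example data are in memory.
            * Stata treats missing as larger than any number, so missing values
            * must be excluded explicitly before testing for positive flows.
            generate byte both = !missing(FDIOutflowsDonor, NetODA) & FDIOutflowsDonor > 0 & NetODA > 0

            * Count the flagged observations within each donor-recipient pair.
            bysort IsoCode1 IsoCode2: egen n_both = total(both)

            * With a time variable (hypothetically named year), the count could be
            * restricted to chosen years, e.g.:
            * bysort IsoCode1 IsoCode2: egen n_both_sel = total(both & inrange(year, 1990, 1999))
            ```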
