Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Summation using If condition to avoid negative values

    Hello,

    I want to write a code that creates a value which is the summation of several rows. I have negative values in my data and I want to ignore those values in the summation. I have run the following code but it is giving me missing values.

    local x owe1 owe2 owe3 owe4
    foreach x in `x'{
    gen loan_1998= owe1+ owe2+ owe3+ owe4 if `x'>0

    }

    Here is how my data is looking like: it is a survey data and the the value -4 corresponds to a valid skipped.
    PUBID SEX BDATE_M BDATE_Y CV_SAMPLE_TYPE RACE_ETHNICITY owe1 owe2 owe3 owe4 owe_1998
    96 Male 2982616 1982 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    97 Male 2982616 1980 Cross-sectional Hispanic -4 -4 -4 -4 .
    98 Male 2982616 1980 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    99 Male 2982616 1982 Cross-sectional Hispanic -4 -4 -4 -4 .
    100 Female 2982616 1981 Cross-sectional Black -4 -4 -4 -4 .
    101 Female 2982616 1981 Cross-sectional Black -4 -4 -4 -4 .
    102 Female 2982616 1982 Cross-sectional Hispanic -4 -4 -4 -4 .
    103 Female 2982617 1983 Cross-sectional Hispanic -4 -4 -4 -4
    104 Female 2982617 1980 Cross-sectional Non-Black / Non-Hispanic -4 1500 -4 -4 .
    105 Male 2982617 1983 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    106 Male 2982617 1981 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    107 Female 2982617 1982 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    108 Male 2982617 1984 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    109 Female 2982616 1982 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    110 Female 2982616 1984 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    111 Female 2982616 1981 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    112 Female 2982616 1984 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    113 Male 2982616 1984 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    114 Female 2982616 1982 Cross-sectional Non-Black / Non-Hispanic -4 -4 -4 -4 .
    115 Female 2982617 1984 Cross-sectional Hispanic -4 -4 -4 -4 .
    116 Male 2982617 1984 Cross-sectional Black -4 -4 -4 -4 .
    I am wondering if someone can please help me figure out what is wrong?

    Thank you for your assistance!

  • #2
    That is not going to work as second time around the loop the variable you are trying to generate already exists. So the code is immediately illegal .

    Also, your code implies using each positive value repeatedly even if it what you wished.

    You can (and should) generate the new variable just once outside the loop and then if you are minded to use a loop you can (and must) use replace inside the loop.


    Code:
    gen wanted  = 0 
    
    forval j = 1/4 { 
         replace wanted = wanted + max(owe`j', 0) 
    }
    Or it is just one line:without any loop being needed:

    Code:
    gen wanted = max(owe1, 0) + max(owe2, 0) + max(owe3, 0) + max(owe4, 0)
    where you will see that the intent and effect of
    Code:
    max()
    is to ignore negative arguments.

    As a point of style, note that you can dispense with the indirectness of putting names into a local macro and then taking them out again.

    As another point of style, you can get horrible bugs, or at least confuse the reader, by using the same name collectively for a bunch of variable names and for each individual variable name.


    Comment


    • #3
      first, please use -dataex- for data examples in the future (read the FAQ please)

      second, why not simplify things by first using -mvdecode-? see
      Code:
      help mvdecode

      Comment


      • #4
        Rich Goldstein Almost always we agree, but here the problem is ignoring negative values, not missings.

        Comment


        • #5
          Nick's advice is helpful and makes clear the logic of how to go about the sum. Rich's approach is also useful by first recoding -4 to a missing value, and then the row total becomes a little easier. With little extra code ,you could even clone the -owe1-owe4- vars to maintain the valid skip value if that's important.

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input int pubid str6 sex long bdate_m int bdate_y str15 cv_sample_type str24 race_ethnicity byte owe1 int owe2 byte(owe3 owe4)
           96 "Male"   2982616 1982 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
           97 "Male"   2982616 1980 "Cross-sectional" "Hispanic"                 -4   -4 -4 -4
           98 "Male"   2982616 1980 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
           99 "Male"   2982616 1982 "Cross-sectional" "Hispanic"                 -4   -4 -4 -4
          100 "Female" 2982616 1981 "Cross-sectional" "Black"                    -4   -4 -4 -4
          101 "Female" 2982616 1981 "Cross-sectional" "Black"                    -4   -4 -4 -4
          102 "Female" 2982616 1982 "Cross-sectional" "Hispanic"                 -4   -4 -4 -4
          104 "Female" 2982617 1980 "Cross-sectional" "Non-Black / Non-Hispanic" -4 1500 -4 -4
          105 "Male"   2982617 1983 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          106 "Male"   2982617 1981 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          107 "Female" 2982617 1982 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          108 "Male"   2982617 1984 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          109 "Female" 2982616 1982 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          110 "Female" 2982616 1984 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          111 "Female" 2982616 1981 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          112 "Female" 2982616 1984 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          113 "Male"   2982616 1984 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          114 "Female" 2982616 1982 "Cross-sectional" "Non-Black / Non-Hispanic" -4   -4 -4 -4
          115 "Female" 2982617 1984 "Cross-sectional" "Hispanic"                 -4   -4 -4 -4
          116 "Male"   2982617 1984 "Cross-sectional" "Black"                    -4   -4 -4 -4
          end
          Nick's and Rich's approach:

          Code:
          gen wanted = max(owe1, 0) + max(owe2, 0) + max(owe3, 0) + max(owe4, 0)
          
          mvdecode owe1-owe4, mv(-4=.)
          egen wanted2 = rowtotal(owe1-owe4)
          Result

          Code:
               +--------------------------+
               | pubid   wanted   wanted2 |
               |--------------------------|
            1. |    96        0         0 |
            2. |    97        0         0 |
            3. |    98        0         0 |
            4. |    99        0         0 |
            5. |   100        0         0 |
            6. |   101        0         0 |
            7. |   102        0         0 |
            8. |   104     1500      1500 |
            9. |   105        0         0 |
           10. |   106        0         0 |
           11. |   107        0         0 |
           12. |   108        0         0 |
           13. |   109        0         0 |
           14. |   110        0         0 |
           15. |   111        0         0 |
           16. |   112        0         0 |
           17. |   113        0         0 |
           18. |   114        0         0 |
           19. |   115        0         0 |
           20. |   116        0         0 |
               +--------------------------+

          Comment


          • #6
            Sorry; belay #4: in the example all negative values are -4 and so mvdecode offers a good solution.

            Comment


            • #7
              Thank you so much Nick, Rich, and Leonardo for you suggestions. I appreciate it. I will review my code.

              Comment

              Working...
              X