  • Encode + If

    I am trying to generate a new variable that sorts my income variable into deciles. I generated a new variable, and when I tried to use if, Stata always said "type mismatch". My original variable contains income values such as 2.000 and so on, so it is a numeric variable. I used encode, but the generate and replace commands afterwards are not working: there are always zero changes made in var10_decile.

    encode var10,generate (var10_num)

    gen var10_decile=.

    replace var10_decile = 1 if var10_num <=2000

    replace var10_decile = 2 if var10_num > 2000 & var10_num <= 2999

    replace var10_decile = 3 if var10_num > 2999 & var10_num <= 3999

  • #2
    I thought xtile does the binning automatically. Did you try this? Without seeing a sample of your data, I cannot comment further.

    Code:
    sysuse auto, clear
    xtile decimpg = mpg, nq(10)

    • #3
      You cannot encode a numeric variable. If your original variable is a string variable and contains strings such as "2.000" then encode is still the wrong tool; you want destring then.
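
      To illustrate the difference, here is a minimal sketch with made-up values (the variable names s, s_enc and s_num are hypothetical). encode assigns integer codes 1, 2, 3, ... to the distinct strings in alphabetical order and attaches the strings as value labels, while destring converts the text into the numbers it spells out.

      ```stata
      clear
      input str4 s
      "2000"
      "350"
      "1200"
      end

      * encode: integer codes by alphabetical order of the strings, with value labels
      encode s, generate(s_enc)

      * destring: the numeric values themselves
      destring s, generate(s_num)

      * nolabel shows the underlying codes of s_enc instead of its labels
      list, nolabel
      ```

      In this toy data s_enc holds 1 for "1200", 2 for "2000" and 3 for "350", so a condition such as s_enc <= 2000 would not be testing income at all.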

      • #4
        egen's cut() function might also be useful if you're setting the boundaries yourself.
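
        A rough sketch of how that could look, assuming var10_num is already a numeric income variable. The break points and the name var10_bin are hypothetical. Note that cut() uses intervals that are closed on the left, so 2000 itself falls into the bin starting at 2000, and values below the first break or at or above the last break become missing.

        ```stata
        * Hypothetical break points; icodes numbers the bins 0, 1, 2, ...
        egen var10_bin = cut(var10_num), at(0, 2000, 3000, 4000, 1000000) icodes

        * Check the range of incomes that landed in each bin
        tabstat var10_num, by(var10_bin) statistics(min max n)
        ```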

        • #5
          If you want binning with interval 1000, you can do that directly with


          Code:
          gen wanted = floor(numvar / 1000)
          which maps [0, 1000) to 0, [1000, 2000) to 1, and so forth.

          These intervals are only exceptionally decile bins. See also https://journals.sagepub.com/doi/pdf...867X1801800311
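
          As a quick check of that mapping, a few hypothetical income values can be run through the same expression with display:

          ```stata
          * Spot-check floor(x / 1000) at and around the bin boundaries
          display floor(999 / 1000)    // 0
          display floor(1000 / 1000)   // 1
          display floor(1234 / 1000)   // 1
          display floor(2000 / 1000)   // 2
          ```

          The lower limit of each interval belongs to the bin; the upper limit belongs to the next one.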

          • #6
            daniel klein, what exactly would the code look like?

            • #7
              Nick Cox, but that wouldn't work for different numbers, right? My deciles would be household income numbers, which won't be 1.000 but rather, e.g., 1234.

              • #8
                See #2 if you want deciles.

                • #9
                  sarah tews It's hard for us to know with complete confidence what you want here.

                  Your thread title says deciles; however, your code in #1 seems to show an attempt, or an intention, to bin a variable with bin width 1000.

                  (As already pointed out, encode is illegal if the variable supplied is numeric, and the wrong command otherwise if you had a string variable with values like "1234".)

                  So, that seems confused, as they are quite different problems, but perhaps you are clear on that now. Or always were.

                  If you are still confused, please give a data example with the income variable you have already -- see FAQ Advice #12.
                  Last edited by Nick Cox; 04 Jun 2024, 09:20.

                  • #10
                    I am a beginner, so I don't really have a clue about what I am doing. Let me start again.
                    I have a variable var10 - Monthly net | household income - which is for some reason a string variable. I have now learned that I need to use the command destring to fix it. But that does not work: "var10: contains nonnumeric characters; no replace". Attached you can find a picture of my variable.
                    With that variable I want to do the following, but that only works if Stata sees my variable as a numeric variable:

                    gen var10_decile=.

                    replace var10_decile = 1 if var10 <= 1130
                    replace var10_decile = 2 if var10 >= 1131 & var10 <= 1610
                    replace var10_decile = 3 if var10 >= 1611 & var10 <= 2030
                    replace var10_decile = 4 if var10 >= 2031 & var10 <= 2440
                    replace var10_decile = 5 if var10 >= 2441 & var10 <= 2890
                    replace var10_decile = 6 if var10 >= 2891 & var10 <= 3390
                    replace var10_decile = 7 if var10 >= 3391 & var10 <= 3980
                    replace var10_decile = 8 if var10 >= 3981 & var10 <= 4750
                    replace var10_decile = 9 if var10 >= 4751 & var10 <= 6020
                    replace var10_decile = 10 if var10 >= 6021 & !missing(var10)

                    [Attached image: var10.PNG]

                    • #11
                      Please take the time to read the FAQ, especially regarding dataex and how to provide example data.

                      From the screenshots (deprecated on Statalist), you probably want something like

                      Code:
                      destring var10 , generate(var10_numeric) ignore(",€") 
                      xtile var10_decile = var10_numeric , nquantiles(10)
                      The first line of code should transform your string variable into a numeric one. The second line will split the numeric variable into 10 quantiles, assuming that is what you want.
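
                      If the boundaries listed in #10 are meant to be fixed cut points rather than deciles computed from your own sample, xtile will not reproduce them. Here is a sketch under that assumption, using irecode(); the name var10_decile_fixed is hypothetical, and the cut points are simply the upper limits from #10. Unlike a final condition of the form var10 >= 6021, irecode() leaves missing incomes missing. The first line is an optional diagnostic that shows which values of var10 are not plain numbers.

                      ```stata
                      * Optional diagnostic before destring: values of var10 that real() cannot convert
                      tabulate var10 if missing(real(var10))

                      * Assumes var10_numeric was created by destring as above.
                      * irecode() returns 0 if x <= 1130, 1 if 1130 < x <= 1610, ..., 9 if x > 6020,
                      * and missing if x is missing; adding 1 gives groups 1 to 10
                      generate var10_decile_fixed = 1 + irecode(var10_numeric, 1130, 1610, 2030, 2440, 2890, 3390, 3980, 4750, 6020)

                      * Compare the fixed-boundary groups with the sample deciles from xtile
                      tabulate var10_decile_fixed var10_decile, missing
                      ```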
