Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Dataset: converting blank cells to numerical zero

    Here's my data set in a .csv file: ("ins3.csv"): https://www.dropbox.com/s/000rcyo4oo7o4n8/ins3.csv?dl=0 (you may close the splash screen without signing up for anything, and download the file by clicking the button with the "downwards pointing arrow" in the top-middle-right of the screen).

    In plain English, I want
    1) all blank cells (in variable "ins3") to be converted to numerical zero (0)
    2) The value 2 to be recoded to numerical zero (0)
    3) The value 9 to be dropped from the dataset


    Here's my .do file:

    Code:
    import delimited ins3.csv, varnames(1)
    tab ins3
    replace ins3="0" if trim(ins3)==""             /*     initially, ins3 is encoded as string    */
    tab ins3
    encode ins3, gen(ins3_r)        /*   convert from string to numerical (long)   */
    tab ins3_r
    recast byte ins3_r         /*    changing long to byte, may not be necessary    */
    drop if ins3_r == 9           /*    doesn't work    */
    tab ins3_r
    recode ins3_r 1=1 2=0 0=0, gen(ins3_r2)        /*  doesn't work  */
    tab ins3_r2
    ----------------------------------------------------------------------------------------


    At the start, my `tab ins3` command produces:

    Code:
           ins3 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
                |         93       88.57       88.57
              1 |          8        7.62       96.19
              2 |          3        2.86       99.05
              9 |          1        0.95      100.00
    ------------+-----------------------------------
          Total |        105      100.00


    At the end my `tab ins3` command produces:

    Code:
      RECODE of |
         ins3_r |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |          8        7.62        7.62
              1 |         93       88.57       96.19
              3 |          3        2.86       99.05
              4 |          1        0.95      100.00
    ------------+-----------------------------------
          Total |        105      100.00


    What I want my `tab ins3` command to produce at the end is:
    Code:
           ins3 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         96       91.43       91.43
              1 |          8        8.57       100.00
    ------------+-----------------------------------
          Total |        105      100.00
    ----------------------------------------------------------------------------------------


    My Question: Why doesn't my code work? And if you can share a working alternative please do.

    Thanks.
    Last edited by Lear Jett; 05 Nov 2019, 19:58.

  • #2
    Your link does not work for me. If you don't get a better answer, copy and paste of the file to here may be indicated.
    Last edited by Nick Cox; 06 Nov 2019, 03:05.

    Comment


    • #3
      Based on you 'plain english' description the following should work. I am not sure I follow your tabulations completely though.

      Code:
      import delimited ins3.csv, varnames(1)
      
      drop if ins3 == "9"
      
      gen new_ins3 = ins3 == "1"

      Comment


      • #4
        Originally posted by Nick Cox View Post
        Your link does not work for me. If you don't get a better answer, copy and paste of the file to here may be indicated.

        Am attaching the instructions on how to download (after you click on the link: https://www.dropbox.com/s/000rcyo4oo7o4n8/ins3.csv?dl=0 ):







        Attached Files
        Last edited by Lear Jett; 07 Nov 2019, 03:06.

        Comment


        • #5
          Personal rule: I never want to download from a site where I have to register first. That's a personal reason why I suggested copy and paste to here.

          As no one else has, it seems, wanted to do what you ask, that rule may be more than personal.

          Comment


          • #6
            Originally posted by Nick Cox View Post
            Personal rule: I never want to download from a site where I have to register first. That's a personal reason why I suggested copy and paste to here.

            As no one else has, it seems, wanted to do what you ask, that rule may be more than personal.


            The reason I hadn't pasted the data here is that I'm concerned there's something in the "blanks" that's not a simple blank (e.g. could be spaces or something else), and that fidelity might get lost in the pasting.

            Here's the paste of the data (not sure if the issue can be faithfully replicated with the pasted data though, as it can be with the .csv file):

            (Aside: Dropbox.com doesn't require users to register first. It suggests it, and the suggestion can be ignored by closing the "splash screen" -- noted in the screenshot above.)

            There's a single variable in the data, and a single column of data. "ins3" is the variable name (column label).

            Code:
            ins3
             
             
             
             
             
             
             
            2
             
             
             
             
             
             
             
            1
             
             
             
             
            1
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
            2
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
            2
             
             
             
            1
            1
            1
             
             
            1
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
             
            1
             
             
            1
             
             
             
            9
            Last edited by Lear Jett; 08 Nov 2019, 01:39.

            Comment


            • #7
              Sorry, but the link you give does nothing for me. I can think of reasons my end why that might be so, but beyond confirming that I see 12 values, variously 1,2, 9, in what you show above I can't help further. The question remains wide open for anyone else.

              Comment

              Working...
              X