Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Shifting values over one variable in a row

    Hi Everyone,

    We are cleaning data that was entered manually into a csv and have some issues where in several observations, all entries were somehow shifted over one. For example, a subset of our data might look like this:
    ID Gender Language Income v25
    1 F 3 5000
    2 M 3 300
    3 F 2 100
    4 F 1 3000
    where the values for ID 2 need to be shifted one to the left. I am wondering if there is a way to automate this shifting of values without having to replace values in variables one by one. I read in another post that you can reshape long and then use by and subscripting, but I am not sure how to reshape this dataset as there are no common variable stub names.

    Any advice would be appreciated!

    Thanks!

  • #2
    If I were confronted with this problem, my likely approach would be to open the original CSV file in Excel and deal with the problems in Excel, deleting cells so that Excel can then shift the rest of the row left. The problem with using a Stata program is that your data really requires careful review for other problems hidden by the effects of the problem you have detected. Using a program may well keep you from that careful review.

    Having imported the flawed file, you now have what should be a numeric variable (Language) treated as a string variable by Stata because of the "M" for ID 2. So along with getting things to line up correctly, you then have to destring variables suffering from that problem.
    Last edited by William Lisowski; 26 Jun 2018, 05:41.

    Comment


    • #3


      Code:
      clear
      input ID       Gender str5(Language)    Income    v25
      . 1    "F"    3    5000    
      . 2    "M"    3    300
      . 3    "F"    2    100    
      . 4    "F"    1    3000
      end
      l
      
      
      tostring *, replace 
          
      
      ds  //select everything; edit this to select subset of data
      token `"`r(varlist)'"'
      
      while `"`1'"' != "" & `"`2'"' != "" {
          tr: replace `1' = `2' if mi(`1') | `1'=="." 
          mac shift
          }
          
          destring *, replace
          drop v25
          l
      Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX

      Comment


      • #4
        1. I do agree with William that it would be "safer" if Jane could be dealing with the problem within original excel file. Any suggested code in Stata, even it might be helpful, still requires very carefully rechecking and some destring works.

        2. The code of Eric is very insightful, but it seems to me that the issue of Jane is a little different than the example within the code. Particularly, if I might be correct, the original data of Jane would looks like:
        Code:
        clear
        input byte id str1(gender language) int(income v25)
        1 "F" "3" 5000   .
        2 ""  "M"    3 300
        3 "F" "2"  100   .
        4 "F" "1" 3000   .
        end
        list, noobs
          +--------------------------------------+
          | id  gender   language   income   v25 |
          |--------------------------------------|
          | 1        F          3     5000     . |
          | 2                   M        3   300 |
          | 3        F          2      100     . |
          | 4        F          1     3000     . |
          +--------------------------------------+
        wherein, v25 is unexpected variables and Jane only want to "move 1 left step" for id==2 (where gender is missing, making all the info be pushed wrongly in 1 "right step").

        Other relevant issue is that: There still might be some missing values with in the original variables (language, income), which are still expected to be moved left when gender is "wrongly" missing.

        3. Then Jane, please notice FAQ, particularly the question 12.2, and use dataex to give out here a small sample of your data. Upon your sample, the specific suggestion for your issue could be provided effectively. And quick, I believe.

        Cheers.

        Comment

        Working...
        X