  • Beginner's advice - restructuring data sets

    Hello friends,

    I am working with data from Databank. Each year arrives as its own variable, and the rows within it are different Databank series. For example, for Ecuador I get yr1970, where the first entry is total natural resources rents and the second is GDP per capita. I want to restructure the data so that I have three variables: Year, GDP-per-capita, and total natural resources rent, where the first entry for Year is (say) 1970 and the first entries of the other two variables hold the values for that year.

    Thank you!

  • #2
    Please provide a data example by typing -dataex- in Stata and copying the output from the Results window.



    • #3
      This is, for example, for Ecuador 1970-1975.

      [Image: Screenshot 2022-01-28 132253.png]



      • #4
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float(yr1970 yr1971 yr1972 yr1973 yr1974 yr1975)
         .6125481  .7346113 1.6069278  4.118206 8.688386  6.702191
        2020.1147 2086.0383 2128.7097 2357.6694 2549.309 2751.7446
                .         .         .         .        .         .
                .         .         .         .        .         .
                .         .         .         .        .         .
                .         .         .         .        .         .
                .         .         .         .        .         .
        end
        
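        * Drop the all-missing rows. mi() takes expressions, not a varlist, so
        * mi(yr1970-yr1975) would test a subtraction; count missings explicitly.
        egen nmiss = rowmiss(yr1970-yr1975)
        drop if nmiss == 6
        drop nmiss
        
        * Label each remaining row with its country and series.
        gen country = "Ecuador"
        gen entry = ""
        replace entry = "Total natural resources rents" in 1
        replace entry = "GDP per capita" in 2
        
        * One row per country, series, and year; the values end up in yr.
        reshape long yr, i(country entry) j(year)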
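
The reshape above leaves the data long, with one row per series and year. To reach the layout asked for in #1 (Year, GDP per capita, and rents as three variables), a second reshape back to wide over the series is one option. The sketch below is a minimal illustration, not the poster's code: the variable `series`, its short values `rents` and `gdppc`, and the final variable names are assumptions chosen so that the values are legal variable-name suffixes.

```stata
* Sketch: long by year, then wide by series.
clear
input float(yr1970 yr1971 yr1972 yr1973 yr1974 yr1975)
 .6125481  .7346113 1.6069278  4.118206 8.688386  6.702191
2020.1147 2086.0383 2128.7097 2357.6694 2549.309 2751.7446
end

gen country = "Ecuador"
* Assumed short labels (no spaces) so they can become variable-name suffixes.
gen series = ""
replace series = "rents" in 1
replace series = "gdppc" in 2

* Step 1: one row per country, series, and year.
reshape long yr, i(country series) j(year)
rename yr value

* Step 2: one row per country and year, one column per series.
reshape wide value, i(country year) j(series) string
rename (valuegdppc valuerents) (gdp_per_capita total_nat_res_rents)

list, clean noobs
```

The result has six rows, one per year from 1970 to 1975, with `year`, `gdp_per_capita`, and `total_nat_res_rents` side by side.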



        • #5
          A quick note: please copy the text that dataex tells you to copy and paste it into your reply, rather than posting a picture of it. That saves us from having to type in the example by hand.

          Another tip I'd give is to use greshape. It does exactly what regular reshape does, only much faster. You can install it with ssc inst gtools, replace. It especially helps with giant datasets.
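
As a rough illustration of how greshape can stand in for reshape, here is a small sketch on a cut-down version of the Ecuador example. It assumes gtools is already installed, and it uses the `by()` and `keys()` option names as I understand them from the gtools documentation; the variable `series` and its short values are made up for the example.

```stata
* One-time install, as given in the post: ssc install gtools, replace
clear
input str7 country str5 series float(yr1970 yr1971 yr1972)
"Ecuador" "rents" .6125481 .7346113 1.6069278
"Ecuador" "gdppc" 2020.1147 2086.0383 2128.7097
end

* Built-in equivalent: reshape long yr, i(country series) j(year)
* by() plays the role of i(), and keys() names the new j variable.
greshape long yr, by(country series) keys(year)

list, clean noobs
```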



          • #6
            Thank you Øyvind Snilsberg!
            And Jared, thanks as well; I'll copy the text next time. Regarding greshape, I did not understand what the difference is; is it simply faster and more efficient?



            • #7
              I'll put it like this: for one paper I was reshaping COVID-19 data from the Johns Hopkins GitHub page. At this point it must be in the tens of millions of observations.

              Standard reshape (I'm almost not kidding) took AT LEAST 5 minutes, maybe 20 if I remember correctly.

              greshape does it in less than 5 seconds.



              • #8
                In fact, try the code yourself. Run it first with regular reshape, and then replace the last line with greshape.
                Code:
                import delim "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv", clear
                
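                * Find the name of the last variable. The code assumes the date
                * columns were imported as v12 onward.
                qui ds
                loc v : word `c(k)' of `r(varlist)'
                
                * Collect the date columns and count them.
                qui ds v12-`v'
                loc vars `r(varlist)'
                loc nwords : word count `vars'
                disp `nwords'
                
                * Rename the date columns to day_1, day_2, ... so they share a stub.
                forv i = 1/`nwords' {
                    loc b : word `i' of `vars'
                    qui rename `b' day_`i'
                }
                
                * Drop rows with no FIPS code before using fips as the identifier.
                drop if fips == .
                
                * num is the numeric day index, so the string option is not needed.
                qui reshape long day_, i(fips) j(num)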
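
For a comparison that does not depend on the download, the two commands can be timed on synthetic data with Stata's timer. This is only a sketch: the dataset size, the variable names `id` and `num`, and the seed are arbitrary assumptions, gtools must already be installed, and the timings will vary by machine.

```stata
* Hypothetical synthetic data: 20,000 ids and 200 wide columns day_1 ... day_200.
clear
set seed 12345
set obs 20000
gen long id = _n
forvalues i = 1/200 {
    gen float day_`i' = runiform()
}

timer clear

* Built-in reshape, run on a copy so the wide data can be restored afterwards.
preserve
timer on 1
quietly reshape long day_, i(id) j(num)
timer off 1
restore

* greshape from gtools on the same wide data.
timer on 2
quietly greshape long day_, by(id) keys(num)
timer off 2

* Timer 1 is reshape, timer 2 is greshape (seconds).
timer list
```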
