  • I'm trying to keep my variable labels after a reshape but my code won't work

    Dear Statalist,

    I tried following the solutions proposed to other users who faced the same problem as mine, but the code just won't work. To sum things up, I am trying to use the content of each variable's first observation as its variable label. I used a classic loop over the variables that stores each variable's label in a local named after the variable, but when I run my code with set trace on, Stata doesn't detect any local.

    Code:
            foreach va of varlist var* {
                local l`va' : variable label `va'
            }
    And then

    Code:
        foreach va of varlist var* {
                label variable `va' "`l`va''"
            }
    I get this from the tracer :

    - label variable `va' "`l`va''"
    = label variable var493 ""

    Which means the local is empty, right? There is a very long series of lines between the two blocks of code, including two reshapes (wide>long then long>wide) and a use of frames. I recreated a simple dataex example summarizing my problem:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str50(var1 var2 var3)
    "this should be considered a label" "this too" "this too"
    end
    A very important detail: some of the soon-to-be labels contain double quotes that I would like to keep. If possible, I would like a solution that is compatible with labels containing double quotes. Thank you for your help!


    EDIT: After running macro list right after the local command, it seems that there is no local named l`var', so the problem must arise from here. What did I do wrong?
    Last edited by Thomas Brot; 13 Feb 2023, 14:10.

  • #2
    Perhaps the problem is that you have written your code in the do-file editor window, and then rather than running everything at once, you are running it by selecting a few lines and running them, then selecting the next few lines and running them, and so on.

    Consider the following example. In the do-file editor window, I have a two-line program that I run in its entirety.
    Code:
    . do "/Users/lisowskiw/Downloads/example.do"
    
    . local message Hello, world.
    
    . display "The message is `message'"
    The message is Hello, world.
    
    . 
    end of do-file
    
    .
    Now I run the same two lines by selecting the first line and running it, then selecting the second line and running it.
    Code:
    . do "/var/folders/xr/lm5ccr996k7dspxs35yqzyt80000gp/T//SD04017.000000"
    
    . local message Hello, world.
    
    . 
    end of do-file
    
    . do "/var/folders/xr/lm5ccr996k7dspxs35yqzyt80000gp/T//SD04017.000000"
    
    . display "The message is `message'"
    The message is 
    
    . 
    end of do-file
    
    .
    The important thing to keep in mind is that local macros vanish when the do-file within which they were created ends. If you look carefully at the results above, you'll see that when I selected a single line to run, it was copied into a temporary do-file and run. So even though both lines are in the same window in the do-file editor, they are run as separate do-files: the local macro defined in the first line vanishes at the end of that do-file and is undefined when the second line is run.

    So to tie this to your problem, I believe you ran your code in several pieces, with the definitions of the locals in one piece and the attempt to use them in another, by which time they had been forgotten.



    • #3
      William : Thank you for taking the time to write such an important message. I hope it will benefit those who will come across this thread.

      Actually, the problem was completely different: I asked Stata to store the labels of my variables in locals, but I had never defined those labels in the first place. I am ashamed I didn't spot such an obvious thing and went directly to this forum, but at least your message made me check the do-file one more time before going to bed, so thank you anyway.

      Problem solved !
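
      For anyone landing here with the same goal, here is a minimal sketch of the whole sequence, not the code from my actual do-file. It assumes toy data like the dataex example in #1 (with one value containing double quotes), and it blanks the labels as a stand-in for the reshapes and frames. The local name lbl is made up for the illustration. Compound double quotes are what let a label containing double quotes pass through.

      ```stata
      * Sketch only: run the whole block in one go so the locals survive (see #2).
      clear
      set obs 1
      generate str50 var1 = `"this should be a "quoted" label"'
      generate str50 var2 = "this too"
      generate str50 var3 = "this too"

      * Label each variable with the text held in its first observation.
      foreach v of varlist var* {
          local lbl = `v'[1]
          label variable `v' `"`lbl'"'
      }

      * Store the labels in locals; this only works once labels exist.
      foreach v of varlist var* {
          local l`v' : variable label `v'
      }

      * Stand-in for the reshapes and frame work: blank the labels.
      foreach v of varlist var* {
          label variable `v' ""
      }

      * Reapply the stored labels, again with compound double quotes.
      foreach v of varlist var* {
          label variable `v' `"`l`v''"'
      }
      describe
      ```

      Note that local macro names are limited to 31 characters, so prefixing the variable name with l assumes reasonably short variable names.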



      • #4
        You shouldn't feel ashamed at your mistake. It's my experience that I never stop making mistakes with Stata, although one can get a little faster in spotting what they are.

        On the contrary, it is good for us to get closure on a thread. Quite a few threads are never closed so that we never hear what the resolution was.



        • #5
          You might consider giving greshape a try.

          Code:
          ssc install gtools



          • #6
            Todd Jones has made four posts within the span of 1 or 2 minutes, all of them tacking on to old threads and recommending the use of the -gtools- package. In the other three posts, at least -gtools- is relevant because those threads all involve code using -reshape-, and perhaps -greshape- would work better there. But I fail to see how -gtools- has any relevance to this thread, and I wonder why he is so aggressively promoting -gtools-. Look, I like -gtools-, and I use it myself in most of my work with large files. But why the "campaign?"



            • #7
              Clyde Schechter thanks for the post.

              I had the variable label/reshape question myself and realized that several other people had posted about it, so after I found what I think is a good solution, I thought it would be nice to share in case it would be helpful to people in the future. I only included -gtools- because it is required to use -greshape-.

              With that said, if you would like me to delete all these posts, please let me know and I will be more than happy to do so!



              • #8
                No, I don't want you to delete the posts. In fact, I am, in principle, opposed to anybody ever deleting his/her post.

                With regard to this thread, my question is still what the relevance of -gtools- is. I know that -gtools- is the parent package for -greshape- and several other nice programs. But as far as I can see there is nothing in -gtools- that is applicable to the problem set out in this thread. Am I missing something?

                Even in the other threads, I'm not clear what the relevance is to the problem posed. While -greshape- has some advantages over -reshape-, especially speed, it doesn't handle variable labels any differently from -reshape-:
                Code:
                . clear*
                
                . webuse reshape1
                
                .
                . forvalues i = 0/2 {
                  2.         label var inc8`i' "Incomoe 198`i'"
                  3.         label var ue8`i' "Unemployment 198`i'"
                  4. }
                
                . des
                
                Contains data from https://www.stata-press.com/data/r18/reshape1.dta
                 Observations:             3                  
                    Variables:             8                  12 Mar 2022 15:03
                -------------------------------------------------------------------------------------------------------------------------------------
                Variable      Storage   Display    Value
                    name         type    format    label      Variable label
                -------------------------------------------------------------------------------------------------------------------------------------
                id              float   %9.0g                 
                sex             float   %9.0g                 
                inc80           float   %9.0g                 Incomoe 1980
                inc81           float   %9.0g                 Incomoe 1981
                inc82           float   %9.0g                 Incomoe 1982
                ue80            float   %9.0g                 Unemployment 1980
                ue81            float   %9.0g                 Unemployment 1981
                ue82            float   %9.0g                 Unemployment 1982
                -------------------------------------------------------------------------------------------------------------------------------------
                Sorted by:
                     Note: Dataset has changed since last saved.
                
                .
                . preserve
                
                . reshape long inc ue, i(id) j(year)
                (j = 80 81 82)
                
                Data                               Wide   ->   Long
                -----------------------------------------------------------------------------
                Number of observations                3   ->   9           
                Number of variables                   8   ->   5           
                j variable (3 values)                     ->   year
                xij variables:
                                      inc80 inc81 inc82   ->   inc
                                         ue80 ue81 ue82   ->   ue
                -----------------------------------------------------------------------------
                
                . des
                
                Contains data
                 Observations:             9                  
                    Variables:             5                  
                -------------------------------------------------------------------------------------------------------------------------------------
                Variable      Storage   Display    Value
                    name         type    format    label      Variable label
                -------------------------------------------------------------------------------------------------------------------------------------
                id              float   %9.0g                 
                year            byte    %10.0g                
                sex             float   %9.0g                 
                inc             float   %9.0g                 
                ue              float   %9.0g                 
                -------------------------------------------------------------------------------------------------------------------------------------
                Sorted by: id  year
                     Note: Dataset has changed since last saved.
                
                .
                . restore
                
                . greshape long inc ue, i(id) j(year)
                (note: j = 80 81 82)
                (note: cannot preserve labels when reshaping long)
                
                Data                               wide   ->   long
                -----------------------------------------------------------------------------
                Number of obs.                        3   ->   9                    
                Number of variables                    8  ->   5                    
                j (3 values)                              ->   year
                xij variables:
                                      inc80 inc81 inc82   ->   inc
                                         ue80 ue81 ue82   ->   ue
                -----------------------------------------------------------------------------
                
                . des
                
                Contains data from https://www.stata-press.com/data/r18/reshape1.dta
                 Observations:             9                  
                    Variables:             5                  12 Mar 2022 15:03
                -------------------------------------------------------------------------------------------------------------------------------------
                Variable      Storage   Display    Value
                    name         type    format    label      Variable label
                -------------------------------------------------------------------------------------------------------------------------------------
                id              float   %9.0g                 
                year            long    %12.0g                
                inc             float   %9.0g                 
                ue              float   %9.0g                 
                sex             float   %9.0g                 
                -------------------------------------------------------------------------------------------------------------------------------------
                Sorted by:
                     Note: Dataset has changed since last saved.
                Note that the variable labels are lost either way.
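
                If the labels matter, one workaround that does not depend on what either command does with them is to stash them in local macros before reshaping and reapply them once the data are wide again. The following is only a sketch on made-up toy data (not the webuse file above), and it has to be run as a single block so that the locals are still defined at the end:

                ```stata
                * Sketch only: toy wide data with two labeled stub variables.
                clear
                set obs 2
                generate id = _n
                generate inc80 = 1000 * id
                generate inc81 = 1100 * id
                label variable inc80 "Income 1980"
                label variable inc81 "Income 1981"

                * Stash each label in a local named after its variable.
                foreach v of varlist inc80 inc81 {
                    local l`v' : variable label `v'
                }

                reshape long inc, i(id) j(year)
                * ... work on the long data here ...
                reshape wide inc, i(id) j(year)

                * Put the stashed labels back on the wide variables.
                foreach v of varlist inc80 inc81 {
                    label variable `v' `"`l`v''"'
                }
                describe
                ```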

                So I remain puzzled by the exuberance of your posting today and your choice of where to post.






                • #9
                  You make some good points. I replied to this post based on what I thought the OP was trying to do based on the post's title, but I now see that I probably shouldn't have.

                  I looked into it more, and it looks like greshape does retain variable labels (with the "j" before the label) with "greshape wide" (see code below). I did not previously think to test "greshape long". It looks like whether or not it retains variable labels depends on the situation. In the example you provided—where the variable labels are not constant across the variables—it does not. But if the labels are the same (or are those generated after a greshape wide), then they are preserved (in the latter case with a space in the front). This has been a useful exercise to understand these things - thanks for bringing it up!

                  Code:
                  sysuse auto2, clear
                  keep mpg foreign
                  bys mpg: gen j = _n
                  tempfile cars
                  save `cars'
                  
                  *greshape wide
                  greshape wide foreign, i(mpg) j(j)
                  describe
                  
                  *greshape long
                  greshape long foreign, i(mpg) j(j)
                  describe
                  
                  *redo with variable labels the same
                  use `cars', clear
                  greshape wide foreign, i(mpg) j(j)
                  keep mpg foreign1 foreign2
                  label var foreign1 "Car Origin"
                  label var foreign2 "Car Origin"
                  greshape long foreign, i(mpg) j(j)
                  describe
                  
                  *and when variable labels are different
                  use `cars', clear
                  greshape wide foreign, i(mpg) j(j)
                  keep mpg foreign1 foreign2
                  label var foreign1 "Car Origin A"
                  label var foreign2 "Car Origin B"
                  greshape long foreign, i(mpg) j(j)
                  describe
                  Here is what it produces:

                  Code:
                  . sysuse auto2, clear
                  (1978 automobile data)
                  
                  . keep mpg foreign
                  
                  . bys mpg: gen j = _n
                  
                  . tempfile cars
                  
                  . save `cars'
                  file /var/folders/7j/h1nrksr57dd0rdmgtfydjdt80000gn/T//S_11641.0009mu saved as .dta format
                  
                  .
                  . *greshape wide
                  . greshape wide foreign, i(mpg) j(j)
                  (note: j = 1 2 3 4 5 6 7 8 9)
                  
                  Data                               long   ->   wide
                  -----------------------------------------------------------------------------
                  Number of obs.                       74   ->   21                  
                  Number of variables                    3  ->   10                  
                  j (9 values)                          j   ->   (dropped)
                  xij variables:
                                                  foreign   ->   foreign1 foreign2 ... foreign9
                  -----------------------------------------------------------------------------
                  
                  . describe
                  
                  Contains data
                   Observations:            21                  1978 automobile data
                      Variables:            10                  
                                                                (_dta has notes)
                  ----------------------------------------------------------------------------------------------------------
                  Variable      Storage   Display    Value
                      name         type    format    label      Variable label
                  ----------------------------------------------------------------------------------------------------------
                  mpg             int     %8.0g                 Mileage (mpg)
                  foreign1        byte    %8.0g      origin     1 Car origin
                  foreign2        byte    %8.0g      origin     2 Car origin
                  foreign3        byte    %8.0g      origin     3 Car origin
                  foreign4        byte    %8.0g      origin     4 Car origin
                  foreign5        byte    %8.0g      origin     5 Car origin
                  foreign6        byte    %8.0g      origin     6 Car origin
                  foreign7        byte    %8.0g      origin     7 Car origin
                  foreign8        byte    %8.0g      origin     8 Car origin
                  foreign9        byte    %8.0g      origin     9 Car origin
                  ----------------------------------------------------------------------------------------------------------
                  Sorted by: mpg
                       Note: Dataset has changed since last saved.
                  
                  .
                  . *greshape long
                  . greshape long foreign, i(mpg) j(j)
                  (note: j = 1 2 3 4 5 6 7 8 9)
                  
                  Data                               wide   ->   long
                  -----------------------------------------------------------------------------
                  Number of obs.                       21   ->   189                  
                  Number of variables                   10  ->   3                    
                  j (9 values)                              ->   j
                  xij variables:
                           foreign1 foreign2 ... foreign9   ->   foreign
                  -----------------------------------------------------------------------------
                  
                  . describe
                  
                  Contains data
                   Observations:           189                  1978 automobile data
                      Variables:             3                  
                                                                (_dta has notes)
                  ----------------------------------------------------------------------------------------------------------
                  Variable      Storage   Display    Value
                      name         type    format    label      Variable label
                  ----------------------------------------------------------------------------------------------------------
                  mpg             int     %8.0g                 Mileage (mpg)
                  j               long    %12.0g                
                  foreign         byte    %8.0g      origin      Car origin
                  ----------------------------------------------------------------------------------------------------------
                  Sorted by: mpg
                       Note: Dataset has changed since last saved.
                  
                  .
                  . *redo with variable labels the same
                  . use `cars', clear
                  (1978 automobile data)
                  
                  . greshape wide foreign, i(mpg) j(j)
                  (note: j = 1 2 3 4 5 6 7 8 9)
                  
                  Data                               long   ->   wide
                  -----------------------------------------------------------------------------
                  Number of obs.                       74   ->   21                  
                  Number of variables                    3  ->   10                  
                  j (9 values)                          j   ->   (dropped)
                  xij variables:
                                                  foreign   ->   foreign1 foreign2 ... foreign9
                  -----------------------------------------------------------------------------
                  
                  . keep mpg foreign1 foreign2
                  
                  . label var foreign1 "Car Origin"
                  
                  . label var foreign2 "Car Origin"
                  
                  . greshape long foreign, i(mpg) j(j)
                  (note: j = 1 2)
                  
                  Data                               wide   ->   long
                  -----------------------------------------------------------------------------
                  Number of obs.                       21   ->   42                  
                  Number of variables                    3  ->   3                    
                  j (2 values)                              ->   j
                  xij variables:
                                        foreign1 foreign2   ->   foreign
                  -----------------------------------------------------------------------------
                  
                  . describe
                  
                  Contains data
                   Observations:            42                  1978 automobile data
                      Variables:             3                  
                                                                (_dta has notes)
                  ----------------------------------------------------------------------------------------------------------
                  Variable      Storage   Display    Value
                      name         type    format    label      Variable label
                  ----------------------------------------------------------------------------------------------------------
                  mpg             int     %8.0g                 Mileage (mpg)
                  j               long    %12.0g                
                  foreign         byte    %8.0g      origin     Car Origin
                  ----------------------------------------------------------------------------------------------------------
                  Sorted by: mpg
                       Note: Dataset has changed since last saved.
                  
                  .
                  . *and when variable labels are different
                  . use `cars', clear
                  (1978 automobile data)
                  
                  . greshape wide foreign, i(mpg) j(j)
                  (note: j = 1 2 3 4 5 6 7 8 9)
                  
                  Data                               long   ->   wide
                  -----------------------------------------------------------------------------
                  Number of obs.                       74   ->   21                  
                  Number of variables                    3  ->   10                  
                  j (9 values)                          j   ->   (dropped)
                  xij variables:
                                                  foreign   ->   foreign1 foreign2 ... foreign9
                  -----------------------------------------------------------------------------
                  
                  . keep mpg foreign1 foreign2
                  
                  . label var foreign1 "Car Origin A"
                  
                  . label var foreign2 "Car Origin B"
                  
                  . greshape long foreign, i(mpg) j(j)
                  (note: j = 1 2)
                  (note: cannot preserve labels when reshaping long)
                  
                  Data                               wide   ->   long
                  -----------------------------------------------------------------------------
                  Number of obs.                       21   ->   42                  
                  Number of variables                    3  ->   3                    
                  j (2 values)                              ->   j
                  xij variables:
                                        foreign1 foreign2   ->   foreign
                  -----------------------------------------------------------------------------
                  
                  . describe
                  
                  Contains data
                   Observations:            42                  1978 automobile data
                      Variables:             3                  
                                                                (_dta has notes)
                  ----------------------------------------------------------------------------------------------------------
                  Variable      Storage   Display    Value
                      name         type    format    label      Variable label
                  ----------------------------------------------------------------------------------------------------------
                  mpg             int     %8.0g                 Mileage (mpg)
                  j               long    %12.0g                
                  foreign         byte    %8.0g      origin    
                  ----------------------------------------------------------------------------------------------------------
                  Sorted by: mpg
                       Note: Dataset has changed since last saved.



                  • #10
                    Well #9 is nice and all, but Stata's reshape still does (almost) the same thing. Stata even retains both variable labels in the last example; so you don't need greshape here to preserve variable labels.



                    • #11
                      daniel klein , would you mind sharing a MWE? I tried each case with "reshape" in my #9 code and in no case did it retain variable labels (in the first case, it creates variable labels based on the variable name, not the variable value), but it is very possible that I am doing something incorrectly!

                      Code:
                      sysuse auto2, clear
                      keep mpg foreign
                      bys mpg: gen j = _n
                      tempfile cars
                      save `cars'
                      
                      *1. wide
                      *it creates new variable label based on variable name, not original variable labels
                      reshape wide foreign, i(mpg) j(j)
                      describe
                      
                      *2. long
                      use `cars', clear
                      greshape wide foreign, i(mpg) j(j)
                      reshape long foreign, i(mpg) j(j)
                      describe
                      
                      *3. long, when variable labels are the same 
                      use `cars', clear
                      greshape wide foreign, i(mpg) j(j)
                      keep mpg foreign1 foreign2 
                      label var foreign1 "Car Origin"
                      label var foreign2 "Car Origin"
                      reshape long foreign, i(mpg) j(j)
                      describe



                      • #12
                        I copied your code from #9 verbatim, then changed all occurrences of greshape to reshape. This is what I get (Stata/SE 18.0 for Windows (64-bit x86-64); Revision 04 Apr 2024):

                        Code:
                        . sysuse auto2, clear
                        (1978 automobile data)
                        
                        . keep mpg foreign
                        
                        . bys mpg: gen j = _n
                        
                        . tempfile cars
                        
                        . save `cars'
                        file C:\Users\klein\AppData\Local\Temp\ST_4430_000001.tmp saved as .dta format
                        
                        . 
                        . *(g)reshape wide
                        . reshape wide foreign, i(mpg) j(j)
                        (j = 1 2 3 4 5 6 7 8 9)
                        
                        Data                               Long   ->   Wide
                        -----------------------------------------------------------------------------
                        Number of observations               74   ->   21          
                        Number of variables                   3   ->   10          
                        j variable (9 values)                 j   ->   (dropped)
                        xij variables:
                                                        foreign   ->   foreign1 foreign2 ... foreign9
                        -----------------------------------------------------------------------------
                        
                        . describe
                        
                        Contains data
                         Observations:            21                  1978 automobile data
                            Variables:            10                  
                                                                      (_dta has notes)
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Variable      Storage   Display    Value
                            name         type    format    label      Variable label
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        mpg             int     %8.0g                 Mileage (mpg)
                        foreign1        byte    %8.0g      origin     1 foreign
                        foreign2        byte    %8.0g      origin     2 foreign
                        foreign3        byte    %8.0g      origin     3 foreign
                        foreign4        byte    %8.0g      origin     4 foreign
                        foreign5        byte    %8.0g      origin     5 foreign
                        foreign6        byte    %8.0g      origin     6 foreign
                        foreign7        byte    %8.0g      origin     7 foreign
                        foreign8        byte    %8.0g      origin     8 foreign
                        foreign9        byte    %8.0g      origin     9 foreign
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Sorted by: mpg
                        
                        . 
                        . *(g)reshape long
                        . reshape long foreign, i(mpg) j(j)
                        (j = 1 2 3 4 5 6 7 8 9)
                        
                        Data                               Wide   ->   Long
                        -----------------------------------------------------------------------------
                        Number of observations               21   ->   189         
                        Number of variables                  10   ->   3           
                        j variable (9 values)                     ->   j
                        xij variables:
                                 foreign1 foreign2 ... foreign9   ->   foreign
                        -----------------------------------------------------------------------------
                        
                        . describe
                        
                        Contains data
                         Observations:           189                  1978 automobile data
                            Variables:             3                  
                                                                      (_dta has notes)
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Variable      Storage   Display    Value
                            name         type    format    label      Variable label
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        mpg             int     %8.0g                 Mileage (mpg)
                        j               byte    %10.0g                
                        foreign         byte    %8.0g      origin     Car origin
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Sorted by: mpg  j
                             Note: Dataset has changed since last saved.
                        
                        . 
                        . *redo with variable labels the same
                        . use `cars', clear
                        (1978 automobile data)
                        
                        . reshape wide foreign, i(mpg) j(j)
                        (j = 1 2 3 4 5 6 7 8 9)
                        
                        Data                               Long   ->   Wide
                        -----------------------------------------------------------------------------
                        Number of observations               74   ->   21          
                        Number of variables                   3   ->   10          
                        j variable (9 values)                 j   ->   (dropped)
                        xij variables:
                                                        foreign   ->   foreign1 foreign2 ... foreign9
                        -----------------------------------------------------------------------------
                        
                        . keep mpg foreign1 foreign2
                        
                        . label var foreign1 "Car Origin"
                        
                        . label var foreign2 "Car Origin"
                        
                        . reshape long foreign, i(mpg) j(j)
                        (j = 1 2)
                        
                        Data                               Wide   ->   Long
                        -----------------------------------------------------------------------------
                        Number of observations               21   ->   42          
                        Number of variables                   3   ->   3           
                        j variable (2 values)                     ->   j
                        xij variables:
                                              foreign1 foreign2   ->   foreign
                        -----------------------------------------------------------------------------
                        
                        . describe
                        
                        Contains data
                         Observations:            42                  1978 automobile data
                            Variables:             3                  
                                                                      (_dta has notes)
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Variable      Storage   Display    Value
                            name         type    format    label      Variable label
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        mpg             int     %8.0g                 Mileage (mpg)
                        j               byte    %10.0g                
                        foreign         byte    %8.0g      origin     Car origin
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Sorted by: mpg  j
                             Note: Dataset has changed since last saved.
                        
                        . 
                        . *and when variable labels are different
                        . use `cars', clear
                        (1978 automobile data)
                        
                        . reshape wide foreign, i(mpg) j(j)
                        (j = 1 2 3 4 5 6 7 8 9)
                        
                        Data                               Long   ->   Wide
                        -----------------------------------------------------------------------------
                        Number of observations               74   ->   21          
                        Number of variables                   3   ->   10          
                        j variable (9 values)                 j   ->   (dropped)
                        xij variables:
                                                        foreign   ->   foreign1 foreign2 ... foreign9
                        -----------------------------------------------------------------------------
                        
                        . keep mpg foreign1 foreign2
                        
                        . label var foreign1 "Car Origin A"
                        
                        . label var foreign2 "Car Origin B"
                        
                        . reshape long foreign, i(mpg) j(j)
                        (j = 1 2)
                        
                        Data                               Wide   ->   Long
                        -----------------------------------------------------------------------------
                        Number of observations               21   ->   42          
                        Number of variables                   3   ->   3           
                        j variable (2 values)                     ->   j
                        xij variables:
                                              foreign1 foreign2   ->   foreign
                        -----------------------------------------------------------------------------
                        
                        . describe
                        
                        Contains data
                         Observations:            42                  1978 automobile data
                            Variables:             3                  
                                                                      (_dta has notes)
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Variable      Storage   Display    Value
                            name         type    format    label      Variable label
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        mpg             int     %8.0g                 Mileage (mpg)
                        j               byte    %10.0g                
                        foreign         byte    %8.0g      origin     Car origin
                        ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                        Sorted by: mpg  j
                             Note: Dataset has changed since last saved.
                        
                        . 
                        end of do-file



                        • #13
                          daniel klein, interesting, thanks. I was able to replicate what you did (in my #11, I kept the initial greshapes in #2 and #3). So it looks like (1) reshape wide does not retain variable labels; (2) reshape long can retain them, at least in this example.
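If the label only needs to survive reshape wide, one workaround is to save it in a local before reshaping and reapply it afterwards. This is a minimal sketch on the same auto2 example, not something built into reshape; the local name lab is arbitrary, and compound double quotes are used so that a label containing double quotes would also be handled. It has to be run in one go, because locals do not persist between separately executed selections of a do-file.

```stata
sysuse auto2, clear
keep mpg foreign
bysort mpg: generate j = _n

* Save the variable label before reshaping (lab is an arbitrary local name)
local lab : variable label foreign

reshape wide foreign, i(mpg) j(j)

* Reapply the saved label to every wide variable;
* compound quotes protect labels that contain double quotes
foreach v of varlist foreign* {
    label variable `v' `"`lab'"'
}
describe
```

Every foreign# variable then carries the label that foreign had before the reshape, instead of the "# foreign" labels that reshape wide assigns.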

                          I am encountering strange behavior here:

                          Code:
                          sysuse auto2, clear
                          keep mpg foreign
                          bys mpg: gen j = _n
                          reshape wide foreign, i(mpg) j(j)
                          keep mpg foreign1 foreign2
                          label var foreign1 "ABC"
                          label var foreign2 "DEF"
                          describe
                          reshape long foreign, i(mpg) j(j)
                          describe
                          Result:

                          Code:
                          . sysuse auto2, clear
                          (1978 automobile data)
                          
                          . keep mpg foreign
                          
                          . bys mpg: gen j = _n
                          
                          . reshape wide foreign, i(mpg) j(j)
                          (j = 1 2 3 4 5 6 7 8 9)
                          
                          Data                               Long   ->   Wide
                          -----------------------------------------------------------------------------
                          Number of observations               74   ->   21          
                          Number of variables                   3   ->   10          
                          j variable (9 values)                 j   ->   (dropped)
                          xij variables:
                                                          foreign   ->   foreign1 foreign2 ... foreign9
                          -----------------------------------------------------------------------------
                          
                          . keep mpg foreign1 foreign2
                          
                          . label var foreign1 "ABC"
                          
                          . label var foreign2 "DEF"
                          
                          . describe
                          
                          Contains data
                           Observations:            21                  1978 automobile data
                              Variables:             3                  
                                                                        (_dta has notes)
                          ----------------------------------------------------------------------------------------------------------
                          Variable      Storage   Display    Value
                              name         type    format    label      Variable label
                          ----------------------------------------------------------------------------------------------------------
                          mpg             int     %8.0g                 Mileage (mpg)
                          foreign1        byte    %8.0g      origin     ABC
                          foreign2        byte    %8.0g      origin     DEF
                          ----------------------------------------------------------------------------------------------------------
                          Sorted by: mpg
                               Note: Dataset has changed since last saved.
                          
                          . reshape long foreign, i(mpg) j(j)
                          (j = 1 2)
                          
                          Data                               Wide   ->   Long
                          -----------------------------------------------------------------------------
                          Number of observations               21   ->   42          
                          Number of variables                   3   ->   3          
                          j variable (2 values)                     ->   j
                          xij variables:
                                                foreign1 foreign2   ->   foreign
                          -----------------------------------------------------------------------------
                          
                          . describe
                          
                          Contains data
                           Observations:            42                  1978 automobile data
                              Variables:             3                  
                                                                        (_dta has notes)
                          ----------------------------------------------------------------------------------------------------------
                          Variable      Storage   Display    Value
                              name         type    format    label      Variable label
                          ----------------------------------------------------------------------------------------------------------
                          mpg             int     %8.0g                 Mileage (mpg)
                          j               byte    %10.0g                
                          foreign         byte    %8.0g      origin     Car origin
                          ----------------------------------------------------------------------------------------------------------
                          Sorted by: mpg  j
                               Note: Dataset has changed since last saved.

                          The weird thing is that after the final reshape, foreign has the "Car origin" variable label, even though foreign1 had the "ABC" label and foreign2 had the "DEF" label just before it. What would explain this behavior? It looks like Stata somehow remembers the initial variable label and restores it.
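One way to investigate, offered as an assumption rather than something confirmed in this thread: reshape may keep its bookkeeping, including the label of the original long variable, in dataset characteristics, and reshape long may read the label back from there. Listing the characteristics right after reshape wide should show whether the text "Car origin" is stored anywhere:

```stata
sysuse auto2, clear
keep mpg foreign
bysort mpg: generate j = _n
reshape wide foreign, i(mpg) j(j)

* List all dataset-level characteristics and look for an entry
* whose contents are the original label text ("Car origin")
char list _dta[]
```

If such an entry exists, that would explain why relabeling foreign1 and foreign2 has no effect on the label of foreign after reshape long.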

                          It also seems to remember the original variable label with the following code, where the variable labels after the initial reshape are "1 foreign", "2 foreign", and so on, but after the last reshape the label is "Car origin" again:

                          Code:
                          sysuse auto2, clear
                          keep mpg foreign
                          bys mpg: gen j = _n
                          reshape wide foreign, i(mpg) j(j)
                          describe
                          reshape long foreign, i(mpg) j(j)
                          describe
                          Result:

                          Code:
                          . sysuse auto2, clear
                          (1978 automobile data)
                          
                          . keep mpg foreign
                          
                          . bys mpg: gen j = _n
                          
                          . reshape wide foreign, i(mpg) j(j)
                          (j = 1 2 3 4 5 6 7 8 9)
                          
                          Data                               Long   ->   Wide
                          -----------------------------------------------------------------------------
                          Number of observations               74   ->   21          
                          Number of variables                   3   ->   10          
                          j variable (9 values)                 j   ->   (dropped)
                          xij variables:
                                                          foreign   ->   foreign1 foreign2 ... foreign9
                          -----------------------------------------------------------------------------
                          
                          . describe
                          
                          Contains data
                           Observations:            21                  1978 automobile data
                              Variables:            10                  
                                                                        (_dta has notes)
                          ----------------------------------------------------------------------------------------------------------
                          Variable      Storage   Display    Value
                              name         type    format    label      Variable label
                          ----------------------------------------------------------------------------------------------------------
                          mpg             int     %8.0g                 Mileage (mpg)
                          foreign1        byte    %8.0g      origin     1 foreign
                          foreign2        byte    %8.0g      origin     2 foreign
                          foreign3        byte    %8.0g      origin     3 foreign
                          foreign4        byte    %8.0g      origin     4 foreign
                          foreign5        byte    %8.0g      origin     5 foreign
                          foreign6        byte    %8.0g      origin     6 foreign
                          foreign7        byte    %8.0g      origin     7 foreign
                          foreign8        byte    %8.0g      origin     8 foreign
                          foreign9        byte    %8.0g      origin     9 foreign
                          ----------------------------------------------------------------------------------------------------------
                          Sorted by: mpg
                          
                          . reshape long foreign, i(mpg) j(j)
                          (j = 1 2 3 4 5 6 7 8 9)
                          
                          Data                               Wide   ->   Long
                          -----------------------------------------------------------------------------
                          Number of observations               21   ->   189        
                          Number of variables                  10   ->   3          
                          j variable (9 values)                     ->   j
                          xij variables:
                                   foreign1 foreign2 ... foreign9   ->   foreign
                          -----------------------------------------------------------------------------
                          
                          . describe
                          
                          Contains data
                           Observations:           189                  1978 automobile data
                              Variables:             3                  
                                                                        (_dta has notes)
                          ----------------------------------------------------------------------------------------------------------
                          Variable      Storage   Display    Value
                              name         type    format    label      Variable label
                          ----------------------------------------------------------------------------------------------------------
                          mpg             int     %8.0g                 Mileage (mpg)
                          j               byte    %10.0g                
                          foreign         byte    %8.0g      origin     Car origin
                          ----------------------------------------------------------------------------------------------------------
                          Sorted by: mpg  j
                               Note: Dataset has changed since last saved.



                          I also tried it with two different variables whose labels I then set to the same text. In this case, reshape long does not retain the variable label, but greshape long does:

                          Code:
                          sysuse auto2, clear
                          bys mpg: keep if _n==1
                          keep mpg headroom weight
                          rename headroom var1
                          rename weight var2
                          label var var1 "ABC"
                          label var var2 "ABC"
                          describe
                          preserve
                          reshape long var, i(mpg) j(j)
                          describe
                          restore
                          greshape long var, i(mpg) j(j)
                          describe
                          Result:

                          Code:
                          . sysuse auto2, clear
                          (1978 automobile data)
                          
                          . bys mpg: keep if _n==1
                          (53 observations deleted)
                          
                          . keep mpg headroom weight
                          
                          . rename headroom var1
                          
                          . rename weight var2
                          
                          . label var var1 "ABC"
                          
                          . label var var2 "ABC"
                          
                          . describe
                          
                          Contains data from /Applications/Stata/ado/base/a/auto2.dta
                           Observations:            21                  1978 automobile data
                              Variables:             3                  3 Jan 2022 17:18
                                                                        (_dta has notes)
                          ----------------------------------------------------------------------------------------------------------
                          Variable      Storage   Display    Value
                              name         type    format    label      Variable label
                          ----------------------------------------------------------------------------------------------------------
                          mpg             int     %8.0g                 Mileage (mpg)
                          var1            float   %6.1f                 ABC
                          var2            int     %8.0gc                ABC
                          ----------------------------------------------------------------------------------------------------------
                          Sorted by: mpg
                               Note: Dataset has changed since last saved.
                          
                          . preserve
                          
                          . reshape long var, i(mpg) j(j)
                          (j = 1 2)
                          
                          Data                               Wide   ->   Long
                          -----------------------------------------------------------------------------
                          Number of observations               21   ->   42          
                          Number of variables                   3   ->   3          
                          j variable (2 values)                     ->   j
                          xij variables:
                                                        var1 var2   ->   var
                          -----------------------------------------------------------------------------
                          
                          . describe
                          
                          Contains data
                           Observations:            42                  1978 automobile data
                              Variables:             3                  
                                                                        (_dta has notes)
                          ----------------------------------------------------------------------------------------------------------
                          Variable      Storage   Display    Value
                              name         type    format    label      Variable label
                          ----------------------------------------------------------------------------------------------------------
                          mpg             int     %8.0g                 Mileage (mpg)
                          j               byte    %10.0g                
                          var             float   %8.0gc                
                          ----------------------------------------------------------------------------------------------------------
                          Sorted by: mpg  j
                               Note: Dataset has changed since last saved.
                          
                          . restore
                          
                          . greshape long var, i(mpg) j(j)
                          (note: j = 1 2)
                          (note: cannot preserve variable formats when reshaping long)
                          
                          Data                               wide   ->   long
                          -----------------------------------------------------------------------------
                          Number of obs.                       21   ->   42                  
                          Number of variables                    3  ->   3                    
                          j (2 values)                              ->   j
                          xij variables:
                                                        var1 var2   ->   var
                          -----------------------------------------------------------------------------
                          
                          . describe
                          
                          Contains data from /Applications/Stata/ado/base/a/auto2.dta
                           Observations:            42                  1978 automobile data
                              Variables:             3                  3 Jan 2022 17:18
                                                                        (_dta has notes)
                          ----------------------------------------------------------------------------------------------------------
                          Variable      Storage   Display    Value
                              name         type    format    label      Variable label
                          ----------------------------------------------------------------------------------------------------------
                          mpg             int     %8.0g                 Mileage (mpg)
                          j               long    %12.0g                
                          var             float   %9.0g                 ABC
                          ----------------------------------------------------------------------------------------------------------
                          Sorted by: mpg
                               Note: Dataset has changed since last saved.
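Since a long variable can hold only one variable label, distinct wide labels such as "ABC" and "DEF" cannot all survive reshape long as variable labels. A possible workaround, sketched here under the assumption that keeping them as value labels of j is acceptable, is to copy each wide variable's label into a value label before reshaping. The label name jlbl is hypothetical, and the block should be run in one go so that the local is still defined inside the loop.

```stata
sysuse auto2, clear
keep mpg foreign
bysort mpg: generate j = _n
reshape wide foreign, i(mpg) j(j)
keep mpg foreign1 foreign2
label variable foreign1 "ABC"
label variable foreign2 "DEF"

* Copy each wide variable's label into value label jlbl (hypothetical name);
* compound quotes protect labels that contain double quotes
forvalues k = 1/2 {
    local lab : variable label foreign`k'
    label define jlbl `k' `"`lab'"', add
}

reshape long foreign, i(mpg) j(j)

* Attach the saved texts to j, so j = 1 shows as "ABC" and j = 2 as "DEF"
label values j jlbl
label list jlbl
```

The texts then travel with j rather than with foreign, which keeps them available for tables and graphs after the reshape.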

