  • Concatenate two "string" observations into one

    Hi everyone, I am currently analysing a dataset that is laid out as follows:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str4 var1 str6 var2 str5 var3 str4 var4 str6 var5 str5(var6 var7) str6 var8 str5 var9
    "AUS"  "AUS"    "AUS"   "CHN"  "CHN"    "CHN"   "POL"   "POL"    "POL"  
    "coal" "plants" "glass" "coal" "plants" "glass" "coal " "plants" "glass"
    "1"    "2"      "3"     "4"    "5"      "6"     "4"     "7"      "7"    
    end
    The first observation holds the countries, the second the sectors, and the third the associated values.
    Although I show just 3 countries and 3 sectors here, my real data span up to 50 countries and 50 sectors.

    My goal is to concatenate the first two observations into a new observation placed right underneath them.

    Such as:

    Obs 3: AUScoal AUSplants AUSglass CHNcoal CHNplants ... and so on for all 50 countries and sectors.

    I did find some solutions, but they only work for a limited number of variables.

    Thank you very much,
    Riccardo

  • #2
    https://xyproblem.info/

    I think this is an XY problem. The problem is that metadata are included in the data. The solution is not to rearrange the metadata within the data. The solution is to get the metadata into variable names and in some instances to become other data.

    This works for your data example, but I am fearful on your behalf that your real data are much more complicated.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str4 var1 str6 var2 str5 var3 str4 var4 str6 var5 str5(var6 var7) str6 var8 str5 var9
    "AUS"  "AUS"    "AUS"   "CHN"  "CHN"    "CHN"   "POL"   "POL"    "POL"  
    "coal" "plants" "glass" "coal" "plants" "glass" "coal " "plants" "glass"
    "1"    "2"      "3"     "4"    "5"      "6"     "4"     "7"      "7"    
    end
    
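    * Rename each variable to sector (row 2) + country (row 1), e.g. coalAUS,
    * and collect the sector names along the way
    local prefixes
    forval j = 1/9 {
        local prefix = trim(var`j'[2])
        local prefixes `prefixes' `prefix'
        local suffix = trim(var`j'[1])
        rename var`j' `prefix'`suffix'
    }

    * The metadata now live in the variable names, so drop the two header rows
    drop in 1/2
    destring *, replace
    gen long obs = _n

    * Keep each sector name once; these are the stubs for reshape
    local prefixes : list uniq prefixes

    reshape long `prefixes', i(obs) j(country) string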
    
    
    list 
    
         +---------------------------------------+
         | obs   country   coal   plants   glass |
         |---------------------------------------|
      1. |   1       AUS      1        2       3 |
      2. |   1       CHN      4        5       6 |
      3. |   1       POL      4        7       7 |
         +---------------------------------------+
    



    • #3
      Yes. You are completely right, I was circumnavigating the problem.
      Still, when I apply your code to the full dataset, changing the loop to

      forval j = 1/2500 {

      so that it covers all the data I have, Stata tells me "var1 not found".
      Do you have any idea why?
      Riccardo




      • #4
        Sorry, no, not without a real(istic) data example. #1 implies that you have a var1 and #3 denies it.
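
        In the meantime, here is a way to check the variable names and to run the renaming loop without relying on them. It is a minimal sketch under assumptions that may not hold for the real data: every variable is a string variable, observation 1 holds the country, observation 2 holds the sector, and each combined name is unique and no longer than 32 characters. Using strtoname() to clean the names is my own addition, not something taken from the posts above.

        ```stata
        * Sketch only: assumes all variables are string, with the country in
        * observation 1 and the sector in observation 2.

        * How many variables are there, and does var1 exist at all?
        describe, short
        capture confirm variable var1
        if _rc display as error "no variable named var1; list the names with -ds-"

        * Loop over whatever variables exist instead of assuming var1, var2, ...
        local prefixes
        foreach v of varlist _all {
            * strtoname() turns each label into a legal Stata name
            local prefix = strtoname(trim(`v'[2]))
            local suffix = strtoname(trim(`v'[1]))
            local prefixes `prefixes' `prefix'
            rename `v' `prefix'`suffix'
        }
        local prefixes : list uniq prefixes
        ```

        If this runs cleanly, the remaining steps in #2 (drop in 1/2, destring, gen obs, reshape long) should apply unchanged. If rename stops with an error, the offending name is a good candidate for a realistic data example.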

