Concatenate two "string" observations into one

Riccardo Lucarno

Join Date: Sep 2021

Posts: 10
#1

Concatenate two "string" observations into one

21 Mar 2022, 07:30

Hi everyone, I am currently analysing a dataset that provides the data as follows:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str4 var1 str6 var2 str5 var3 str4 var4 str6 var5 str5(var6 var7) str6 var8 str5 var9
"AUS"  "AUS"    "AUS"   "CHN"  "CHN"    "CHN"   "POL"   "POL"    "POL"
"coal" "plants" "glass" "coal" "plants" "glass" "coal " "plants" "glass"
"1"    "2"      "3"     "4"    "5"      "6"     "4"     "7"      "7"
end

The first observation holds the countries, the second holds the sectors, and the third holds the values associated with each country-sector pair.
Although I have shown just 3 countries and 3 sectors, my data span up to 50 countries and 50 sectors.

My goal is to concatenate the first two observations, generating a new observation right underneath those two.

Such as:

Obs 3: AUScoal AUSplants AUSglass CHNcoal CHNplants ... and so on for all 50 countries and sectors.

I did find some solutions, but only for a limited number of variables.

Thank you very much,
Riccardo
Tags: concatenate, loops, observation, string
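
For reference, the literal concatenation described in #1 could be sketched as follows. This is only an illustration, assuming the variables are all strings named var1, var2, ... as in the dataex example, that observation 1 holds the country and observation 2 the sector; the helper variable order is hypothetical and exists only to place the new observation third.

```stata
* Remember the current order so the new observation can be placed third
gen double order = _n

* Add one empty observation at the end and slot it between observations 2 and 3
set obs `=_N + 1'
replace order = 2.5 in L

* Fill the new observation with country + sector, trimming stray spaces
foreach v of varlist var* {
    replace `v' = trim(`v'[1]) + trim(`v'[2]) in L
}

* Move the new observation into third position and tidy up
sort order
drop order
```

With the example data, observation 3 would then hold AUScoal, AUSplants, AUSglass and so on, with the values moved down to observation 4.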

Nick Cox

Join Date: Mar 2014
Posts: 35219

21 Mar 2022, 08:01

https://xyproblem.info/

I think this is an XY problem. The problem is that metadata are included in the data. The solution is not to rearrange the metadata within the data. The solution is to get the metadata into variable names and in some instances to become other data.

This works for your data example, but I am fearful on your behalf that your real data are much more complicated.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str4 var1 str6 var2 str5 var3 str4 var4 str6 var5 str5(var6 var7) str6 var8 str5 var9
"AUS"  "AUS"    "AUS"   "CHN"  "CHN"    "CHN"   "POL"   "POL"    "POL"  
"coal" "plants" "glass" "coal" "plants" "glass" "coal " "plants" "glass"
"1"    "2"      "3"     "4"    "5"      "6"     "4"     "7"      "7"    
end

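* Build new variable names from the metadata held in observations 1 and 2
local prefixes 
forval j = 1/9 {
    * sector (observation 2) becomes the stub; trim() removes stray spaces
    local prefix = trim(var`j'[2])  
    local prefixes `prefixes' `prefix'
    * country (observation 1) becomes the suffix
    local suffix = trim(var`j'[1])
    rename var`j' `prefix'`suffix'
}

* Drop the metadata rows, convert the values to numeric, add an identifier
drop in 1/2 
destring *, replace 
gen long obs = _n 

* Keep each sector stub only once
local prefixes : list uniq prefixes 

* One observation per country, one variable per sector
reshape long `prefixes', i(obs) j(country) string 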


list 

     +---------------------------------------+
     | obs   country   coal   plants   glass |
     |---------------------------------------|
  1. |   1       AUS      1        2       3 |
  2. |   1       CHN      4        5       6 |
  3. |   1       POL      4        7       7 |
     +---------------------------------------+



Riccardo Lucarno

Join Date: Sep 2021

Posts: 10
#3

21 Mar 2022, 11:00

Yes, you are completely right; I was working around the real problem.
However, when I apply your code to the bigger dataset, with the loop changed to

forval j = 1/2500 {

so that it covers all the data I have, it tells me "var1 not found".
Do you have any idea why?
Riccardo
Nick Cox

Join Date: Mar 2014

Posts: 35219
#4

21 Mar 2022, 13:11

Sorry, no, not without a real(istic) data example. #1 implies that you have a var1 and #3 denies it.
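
In the meantime, a first check might look like the sketch below. It rests on assumptions not confirmed in this thread: that the variables in the real dataset are simply not named var1, var2, ..., that every variable is a string, and that observations 1 and 2 still hold country and sector. It does not explain the error; only a data example can do that.

```stata
* See which variable names actually exist before looping
describe, short
ds

* Hypothetical variant of the loop in #2 that does not assume var1, var2, ...
local prefixes
foreach v of varlist _all {
    local prefix = trim(`v'[2])
    local suffix = trim(`v'[1])
    local prefixes `prefixes' `prefix'
    rename `v' `prefix'`suffix'
}
local prefixes : list uniq prefixes
```

Looping over the existing variables avoids hard-coding both the names and the count of 2500. The rename will still fail if a sector and country combination is not a legal Stata name, for example if it contains spaces or is longer than 32 characters.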