  • Reshapes wide to long without wave identifier

    Hi

    I have a tricky follow-up reshape that I would like some help with, please. I would like to reshape this wide file into long. I want to reshape *state *hgage *esdtl *mrcurr etc., but I do not want to reshape activity1, activity2, etc., because those variables come from a sequence analysis and do not need to be changed.

    Is this possible?

    Any help, appreciated as always
    Brendan


    Code:
    Example generated by -dataex-. For more info, type help dataex 
     clear input str7 xwaveid byte(ghhstate ghgage gesdtl gmrcurr) int gwscei byte ghhsos int gancob byte(gedhigh1 gtchad) float(activity1 activity2 activity3) byte(hhhstate hesdtl htchad hmrcurr) int hwscei byte hhhsos int hancob byte hedhigh1 "0100003" 4 54 1 3  580 0 1101 9 3 3 3 3 4 5 3 4    0 1 1101 9 "0100005" 4 22 6 2    0 1 1101 8 1 5 5 5 4 5 1 6    0 1 1101 8 "0100010" 2 42 3 6    0 0 6101 2 0 1 1 1 2 5 0 6    0 0 6101 2 "0100015" 1 50 1 1  719 1 1101 8 3 3 3 3 1 1 3 1  851 1 1101 5 "0100016" 1 25 1 1  748 0 1101 3 0 3 3 3 1 1 0 1 1055 0 1101 3 "0100018" 5 46 2 1  431 0 1101 4 2 1 1 1 5 2 2 1  235 0 1101 4 "0100019" 5 50 1 1 1400 0 1101 1 2 3 3 3 5 1 2 1 2300 0 1101 1 "0100020" 5 19 2 6  333 0 1101 5 0 1 2 2 5 1 0 6  582 0 1101 5 "0100021" 5 15 4 6    0 0 1101 9 0 1 1 1 5 5 0 6    0 0 1101 9 "0100025" 1 19 2 6  100 0 1101 8 0 3 3 3 1 2 0 6   14 0 1101 8 "0100026" 3 16 1 6  115 3 1101 8 0 2 2 2 3 1 0 6  287 3 1101 8 "0100028" 1 38 1 6  939 1 1101 8 0 3 3 3 1 1 0 6 1001 1 1101 8 "0100029" 1 47 5 1    0 0 1101 9 4 5 5 5 1 5 4 1    0 0 1101 9 "0100030" 1 49 1 1 2129 0 2100 4 4 3 3 3 1 1 4 1 2200 0 2100 4 "0100031" 1 20 1 6  603 0 1101 5 0 3 3 3 1 1 0 6  673 0 1101 5 "0100032" 1 18 1 6  650 0 1101 8 0 2 2 2 1 2 0 6  300 0 1101 8 "0100033" 1 16 2 6   96 0 1101 9 0 2 2 2 1 6 0 6    0 0 1101 9 "0100037" 1 49 1 6  941 2 1101 9 0 3 3 3 1 1 0 6  975 2 1101 9 "0100038" 1 35 1 1  620 3 1101 8 1 3 3 3 1 1 1 1  620 3 1101 8 "0100039" 1 33 1 1  800 3 1101 5 1 3 3 3 1 1 1 1  810 3 1101 5 "0100042" 3 45 6 1    0 1 1101 5 2 5 5 5 3 6 2 1    0 1 1101 5 "0100043" 3 51 1 1  715 1 1101 9 0 3 3 3 3 1 0 1  720 1 1101 9 "0100055" 8 31 1 1 1352 0 1101 3 1 3 3 3 8 1 1 1 1669 0 1101 3 "0100056" 8 29 1 2 1900 0 1101 5 0 3 3 3 8 1 0 1 1300 0 1101 5 "0100057" 2 51 2 1  975 0 1101 1 3 1 2 1 2 2 3 1  968 0 1101 1 "0100058" 2 51 1 1 6479 0 1101 3 3 3 3 3 2 1 3 1 7855 0 1101 3 "0100059" 2 26 1 6 1036 0 1101 3 0 3 3 3 2 1 0 2 1381 0 1101 3 "0100069" 5 55 2 4  225 1 1101 9 4 5 5 3 5 6 4 4    0 
1 1101 9 "0100071" 1 47 1 4 1465 0 1101 2 2 3 3 3 1 1 2 2 1444 0 1101 2 "0100082" 1 66 6 1    0 1 7103 8 3 5 5 5 1 5 3 1    0 1 7103 8 "0100083" 1 67 6 1    0 1 2100 9 3 5 5 5 1 6 3 1    0 1 2100 9 "0100084" 1 64 6 6    0 0 1101 9 0 5 5 5 1 6 0 6    0 0 1101 9 "0100097" 2 37 1 1  620 0 1101 9 1 3 3 3 2 1 1 1    0 0 1101 9 "0100098" 2 39 5 1    0 0 2100 2 1 5 5 5 2 6 1 1    0 0 2100 2 "0100100" 3 39 1 3 1269 3 1101 2 2 2 2 2 3 1 2 3 1016 3 1101 2 "0100103" 2 33 1 1 3808 0 1101 3 0 3 3 3 2 1 0 1 4987 0 1101 3 "0100105" 2 67 6 5    0 0 1101 9 5 5 5 5 2 6 5 5    0 0 1101 9 "0100106" 2 23 1 6  815 0 1101 8 0 3 4 4 2 1 0 6  807 0 1101 8 "0100107" 5 52 1 6  900 0 1101 9 0 3 3 3 5 1 0 6  920 0 1101 9 "0100113" 6 56 1 1 1275 1 1101 9 3 3 3 3 6 6 3 1    0 1 1101 9 "0100114" 6 56 2 1  375 1 1101 9 3 3 3 3 6 2 3 1  400 1 1101 9 "0100137" 2 51 1 6  750 1 1101 5 0 3 3 3 2 1 0 6 1093 1 1101 5 "0100138" 3 34 1 6  750 0 1101 8 1 3 3 3 3 5 2 6    0 0 1101 8 "0100140" 5 46 1 6 1000 0 1101 3 0 2 2 2 5 2 0 6  600 0 1101 3 "0100147" 6 47 1 3  750 1 1101 5 1 3 3 3 6 1 3 3  710 1 1101 5 "0100148" 6 44 2 3  500 2 1101 9 3 3 3 3 6 2 3 3  500 1 1101 9 "0100149" 6 18 3 6    0 1 1101 9 0 3 3 3 6 2 0 6  470 1 1101 9 "0100150" 1 60 5 1    0 0 2100 9 3 3 3 3 1 6 3 1    0 0 2100 9 "0100151" 1 64 6 1    0 0 2100 8 3 3 3 3 1 6 3 1    0 0 2100 8 "0100152" 1 43 1 1 1389 0 1101 2 3 3 3 3 1 1 3 1  913 0 1101 2 "0100158" 3 42 1 2  830 3 1101 9 1 3 3 3 3 1 1 1    0 3 1101 9 "0100164" 3 51 1 1  750 0 1101 9 2 3 3 3 3 1 2 1 1154 0 1101 9 "0100165" 3 47 1 1  800 0 1101 5 2 3 3 3 3 1 2 1 1050 0 1101 5 "0100166" 3 17 2 6  298 0 1101 8 0 2 2 2 3 2 0 6  110 0 1101 8 "0100167" 3 16 5 6    0 0 1101 9 0 1 1 1 3 5 0 6    0 0 1101 9 "0100177" 2 28 1 1 1174 3 1101 3 0 3 3 3 6 1 0 3 2050 0 1101 3 "0100178" 6 27 1 2  980 0 1101 2 0 3 3 3 6 1 0 2 1200 3 1101 2 "0100185" 3 58 1 4  972 1 1101 9 3 3 3 3 3 1 3 4  735 1 1101 9 "0100186" 3 39 1 6  683 0 1101 4 1 3 3 3 3 1 1 2  730 1 1101 4 "0100187" 3 37 1 1    0 0 1101 4 2 3 
3 3 3 1 2 1 1200 0 1101 4 "0100188" 6 34 1 1  750 1 1101 5 2 3 3 3 6 1 2 1 1131 1 1101 5 "0100195" 3 52 1 4 1178 0 1101 9 1 3 3 3 3 1 1 4 1226 0 1101 9 "0100196" 3 23 1 6 1300 0 1101 8 0 3 3 3 3 1 0 6 1400 0 1101 8 "0100197" 1 33 6 1    0 0 1101 3 1 3 3 3 1 6 1 1    0 0 1101 3 "0100198" 5 53 1 1  555 3 1101 8 3 3 3 3 5 1 2 1    0 3 1101 8 "0100199" 5 47 2 1    0 3 1101 3 3 3 3 3 5 2 3 1    0 3 1101 3 "0100201" 5 16 6 6    0 3 1101 9 0 5 5 5 5 2 0 6  150 3 1101 8 "0100206" 2 34 1 1  990 0 1101 9 1 3 3 3 2 1 1 1 1100 0 1101 9 "0100207" 2 16 2 6  253 0 1101 9 0 1 1 1 2 1 0 6  400 0 1101 9 "0100208" 2 31 1 1  464 0 1101 4 0 3 3 3 2 1 0 1  646 0 1101 4 "0100209" 2 33 3 1    0 0 5203 5 0 3 3 3 2 1 0 1  810 0 5203 5 "0100217" 1 38 1 6 1726 0 1201 2 0 3 3 3 1 1 0 6 1414 0 1201 2 "0100218" 3 55 6 4    0 0 1101 3 2 5 5 5 3 6 2 4    0 0 1101 3 "0100246" 1 30 6 1  200 0 1101 3 3 3 3 3 1 2 3 1  150 0 1101 3 "0100249" 1 57 6 1    0 0 7203 1 2 5 5 5 1 6 2 1    0 0 7203 1 "0100250" 1 22 1 6  865 0 7203 3 0 3 3 4 1 1 0 6 1055 0 7203 3 "0100257" 2 38 1 6 1239 0 1101 3 0 3 3 3 2 1 0 6 1534 0 1101 3 "0100259" 5 38 1 2 1425 0 2100 5 1 3 3 3 5 1 1 2 1500 0 2100 5 "0100260" 1 53 4 4    0 1 1101 9 3 5 5 5 1 6 3 4    0 3 1101 9 "0100261" 1 24 5 6    0 1 1101 8 0 5 5 5 1 6 0 6    0 1 1101 8 "0100268" 1 49 6 4    0 1 1101 9 4 5 5 5 1 6 4 4    0 1 1101 9 "0100270" 2 47 2 1  550 3 1101 9 2 3 3 3 2 2 2 1   50 3 1101 9 "0100271" 2 48 1 1 2100 3 1101 5 2 3 3 3 2 1 2 1 2500 3 1101 5 "0100272" 2 16 2 6  115 3 1101 9 0 2 2 2 2 2 0 6  120 3 1101 9 "0100274" 3 38 5 1    0 0 1101 2 1 3 5 5 3 6 1 1    0 0 1101 2 "0100279" 3 53 1 4  500 0 2303 5 2 3 3 3 3 1 2 4    0 0 2303 5 "0100283" 2 29 1 6  856 0 1101 5 0 3 3 3 2 1 0 6  991 0 1101 5 "0100284" 5 54 1 1    0 3 1101 2 3 3 3 3 5 1 3 1  349 3 1101 2 "0100285" 5 62 1 1    0 3 1101 4 3 3 3 3 5 2 3 1    0 3 1101 4 "0100287" 2 52 1 1 1266 0 3302 8 2 3 3 3 2 1 2 1 3490 0 3302 8 "0100288" 2 61 6 1    0 0 3302 8 2 5 5 5 2 6 2 1    0 0 3302 8 "0100289" 2 45 1 1 
1200 0 1101 5 4 3 3 3 2 1 4 1 1100 0 1101 5 "0100290" 2 44 1 1 1003 0 1101 1 4 3 3 3 2 1 4 1 1000 0 1101 1 "0100299" 4 42 1 1    0 0 1101 5 2 3 3 3 4 1 2 1    0 0 1101 5 "0100300" 4 45 2 1  230 0 1101 2 2 3 3 3 4 2 2 1  300 0 1101 2 "0100305" 3 26 1 6 1138 2 1101 3 0 3 3 3 3 1 0 6 1025 2 1101 3 "0100306" 3 24 2 6  529 1 1101 3 0 3 3 3 3 1 0 6  998 1 1101 3 "0100309" 4 60 5 1    0 3 1101 8 2 3 3 3 4 1 2 1  670 3 1101 8 "0100310" 1 82 6 5    0 1 1101 9 4 5 5 5 1 6 4 5    0 1 1101 9 "0100315" 2 33 1 1 2895 0 1101 1 1 3 3 3 2 1 2 1 3500 0 1101 1 end label values ghhstate GHHSTATE label def GHHSTATE 1 "[1] NSW", modify label def GHHSTATE 2 "[2] VIC", modify label def GHHSTATE 3 "[3] QLD", modify label def GHHSTATE 4 "[4] SA", modify label def GHHSTATE 5 "[5] WA", modify label def GHHSTATE 6 "[6] TAS", modify label def GHHSTATE 8 "[8] ACT", modify label values ghgage GHGAGE label values gesdtl GESDTL label def GESDTL 1 "[1] Employed FT", modify label def GESDTL 2 "[2] Employed PT", modify label def GESDTL 3 "[3] Unemployed, looking for FT work", modify label def GESDTL 4 "[4] Unemployed, looking for PT work", modify label def GESDTL 5 "[5] Not in the labour force, marginally attached", modify label def GESDTL 6 "[6] Not in the labour force, not marginally attached", modify label values gmrcurr GMRCURR label def GMRCURR 1 "[1] Legally married", modify label def GMRCURR 2 "[2] De facto", modify label def GMRCURR 3 "[3] Separated", modify label def GMRCURR 4 "[4] Divorced", modify label def GMRCURR 5 "[5] Widowed", modify label def GMRCURR 6 "[6] Never married and not de facto", modify label values gwscei GNUM label values ghhsos GHHSOS label def GHHSOS 0 "[0] Major Urban", modify label def GHHSOS 1 "[1] Other Urban", modify label def GHHSOS 2 "[2] Bounded Locality", modify label def GHHSOS 3 "[3] Rural Balance", modify label values gancob GCOUNTRY label def GCOUNTRY 1101 "[1101] Australia", modify label def GCOUNTRY 1201 "[1201] New Zealand", modify label def GCOUNTRY 2100 
"[2100] United Kingdom", modify label def GCOUNTRY 2303 "[2303] France", modify label def GCOUNTRY 3302 "[3302] Czech Republic", modify label def GCOUNTRY 5203 "[5203] Malaysia", modify label def GCOUNTRY 6101 "[6101] China (excludes SARs and Taiwan)", modify label def GCOUNTRY 7103 "[7103] India", modify label def GCOUNTRY 7203 "[7203] Azerbaijan", modify label values gedhigh1 GEDHIGHB label def GEDHIGHB 1 "[1] Postgrad - masters or doctorate", modify label def GEDHIGHB 2 "[2] Grad diploma, grad certificate", modify label def GEDHIGHB 3 "[3] Bachelor or honours", modify label def GEDHIGHB 4 "[4] Adv diploma, diploma", modify label def GEDHIGHB 5 "[5] Cert III or IV", modify label def GEDHIGHB 8 "[8] Year 12", modify label def GEDHIGHB 9 "[9] Year 11 and below", modify label values gtchad GTCHAD label def GTCHAD 0 "[0] No children ever", modify label values activity1 activity label values activity2 activity label values activity3 activity label def activity 1 "Study only", modify label def activity 2 "Work & study", modify label def activity 3 "Work only", modify label def activity 5 "NILF", modify label def activity 4 "Unemployed", modify label values hhhstate HHHSTATE label def HHHSTATE 1 "[1] NSW", modify label def HHHSTATE 2 "[2] VIC", modify label def HHHSTATE 3 "[3] QLD", modify label def HHHSTATE 4 "[4] SA", modify label def HHHSTATE 5 "[5] WA", modify label def HHHSTATE 6 "[6] TAS", modify label def HHHSTATE 8 "[8] ACT", modify label values hesdtl HESDTL label def HESDTL 1 "[1] Employed FT", modify label def HESDTL 2 "[2] Employed PT", modify label def HESDTL 5 "[5] Not in the labour force, marginally attached", modify label def HESDTL 6 "[6] Not in the labour force, not marginally attached", modify label values htchad HTCHAD label def HTCHAD 0 "[0] No children ever", modify label values hmrcurr HMRCURR label def HMRCURR 1 "[1] Legally married", modify label def HMRCURR 2 "[2] De facto", modify label def HMRCURR 3 "[3] Separated", modify label def HMRCURR 4 
"[4] Divorced", modify label def HMRCURR 5 "[5] Widowed", modify label def HMRCURR 6 "[6] Never married and not de facto", modify label values hwscei HNUM label values hhhsos HHHSOS label def HHHSOS 0 "[0] Major Urban", modify label def HHHSOS 1 "[1] Other Urban", modify label def HHHSOS 2 "[2] Bounded Locality", modify label def HHHSOS 3 "[3] Rural Balance", modify label values hancob HCOUNTRY label def HCOUNTRY 1101 "[1101] Australia", modify label def HCOUNTRY 1201 "[1201] New Zealand", modify label def HCOUNTRY 2100 "[2100] United Kingdom", modify label def HCOUNTRY 2303 "[2303] France", modify label def HCOUNTRY 3302 "[3302] Czech Republic", modify label def HCOUNTRY 5203 "[5203] Malaysia", modify label def HCOUNTRY 6101 "[6101] China (excludes SARs and Taiwan)", modify label def HCOUNTRY 7103 "[7103] India", modify label def HCOUNTRY 7203 "[7203] Azerbaijan", modify label values hedhigh1 HEDHIGHB label def HEDHIGHB 1 "[1] Postgrad - masters or doctorate", modify label def HEDHIGHB 2 "[2] Grad diploma, grad certificate", modify label def HEDHIGHB 3 "[3] Bachelor or honours", modify label def HEDHIGHB 4 "[4] Adv diploma, diploma", modify label def HEDHIGHB 5 "[5] Cert III or IV", modify label def HEDHIGHB 8 "[8] Year 12", modify label def HEDHIGHB 9 "[9] Year 11 and below", modify
    *

  • #2
    Actually, the only tricky part of this was unravelling your -dataex- output, which was somehow stretched out into a single, extremely long line. Please examine your posts in Preview before posting, and then give them at least a glance after you post to make sure what you tried to show is actually there and usable.

    Code:
     gen long obs_no = _n
     reshape long @state @hgage @esdtl @mrcurr @wscei @hhsos @ancob @edhigh1 @tchad, ///
        i(obs_no) j(whatever) string
    I should also add that in your example data, -xwaveid- uniquely identifies observations. If that is true of your data as a whole (check by running -isid xwaveid-), then you don't need to create the obs_no variable, and you should just use -i(xwaveid)- instead of -i(obs_no)-.
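
The check described above can be sketched as follows. This is only an illustration on a two-person toy data set with made-up values: the variable names follow the g/h prefix pattern of the example data, and `wave` is a hypothetical name for the j() variable.

```stata
* Toy data (hypothetical values) with the same g/h wave-prefix layout
clear
input str7 xwaveid byte(gesdtl hesdtl) int(gwscei hwscei)
"0100003" 1 5 580 0
"0100005" 6 5 0 0
end

* -isid- exits with an error if xwaveid has duplicates; -capture- traps it
capture isid xwaveid
if _rc == 0 {
    * xwaveid is unique, so it can serve directly as the i() variable
    reshape long @esdtl @wscei, i(xwaveid) j(wave) string
}
else {
    * Otherwise fall back on an observation counter as the identifier
    gen long obs_no = _n
    reshape long @esdtl @wscei, i(obs_no) j(wave) string
}
list, sepby(xwaveid)
```

Using -capture- keeps a do-file running whichever branch applies. If you expect xwaveid to be unique, a bare -isid xwaveid- that stops on failure may be the safer choice.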

    Added: In case this thread goes on to deal with other issues, or in case somebody else wants to experiment with other approaches for this data set, here is a usable -dataex- for it:

    Code:
    *Example generated by -dataex-. For more info, type help dataex
     clear
     input str7 xwaveid byte(ghhstate ghgage gesdtl gmrcurr) int gwscei byte ghhsos int gancob byte(gedhigh1 gtchad) float(activity1 activity2 activity3) byte(hhhstate hesdtl htchad hmrcurr) int hwscei byte hhhsos int hancob byte hedhigh1
     "0100003" 4 54 1 3  580 0 1101 9 3 3 3 3 4 5 3 4    0 1 1101 9
     "0100005" 4 22 6 2    0 1 1101 8 1 5 5 5 4 5 1 6    0 1 1101 8
     "0100010" 2 42 3 6    0 0 6101 2 0 1 1 1 2 5 0 6    0 0 6101 2
     "0100015" 1 50 1 1  719 1 1101 8 3 3 3 3 1 1 3 1  851 1 1101 5
     "0100016" 1 25 1 1  748 0 1101 3 0 3 3 3 1 1 0 1 1055 0 1101 3
     "0100018" 5 46 2 1  431 0 1101 4 2 1 1 1 5 2 2 1  235 0 1101 4
     "0100019" 5 50 1 1 1400 0 1101 1 2 3 3 3 5 1 2 1 2300 0 1101 1
     "0100020" 5 19 2 6  333 0 1101 5 0 1 2 2 5 1 0 6  582 0 1101 5
     "0100021" 5 15 4 6    0 0 1101 9 0 1 1 1 5 5 0 6    0 0 1101 9
     "0100025" 1 19 2 6  100 0 1101 8 0 3 3 3 1 2 0 6   14 0 1101 8
     "0100026" 3 16 1 6  115 3 1101 8 0 2 2 2 3 1 0 6  287 3 1101 8
     "0100028" 1 38 1 6  939 1 1101 8 0 3 3 3 1 1 0 6 1001 1 1101 8
     "0100029" 1 47 5 1    0 0 1101 9 4 5 5 5 1 5 4 1    0 0 1101 9
     "0100030" 1 49 1 1 2129 0 2100 4 4 3 3 3 1 1 4 1 2200 0 2100 4
     "0100031" 1 20 1 6  603 0 1101 5 0 3 3 3 1 1 0 6  673 0 1101 5
     "0100032" 1 18 1 6  650 0 1101 8 0 2 2 2 1 2 0 6  300 0 1101 8
     "0100033" 1 16 2 6   96 0 1101 9 0 2 2 2 1 6 0 6    0 0 1101 9
     "0100037" 1 49 1 6  941 2 1101 9 0 3 3 3 1 1 0 6  975 2 1101 9
     "0100038" 1 35 1 1  620 3 1101 8 1 3 3 3 1 1 1 1  620 3 1101 8
     "0100039" 1 33 1 1  800 3 1101 5 1 3 3 3 1 1 1 1  810 3 1101 5
     "0100042" 3 45 6 1    0 1 1101 5 2 5 5 5 3 6 2 1    0 1 1101 5
     "0100043" 3 51 1 1  715 1 1101 9 0 3 3 3 3 1 0 1  720 1 1101 9
     "0100055" 8 31 1 1 1352 0 1101 3 1 3 3 3 8 1 1 1 1669 0 1101 3
     "0100056" 8 29 1 2 1900 0 1101 5 0 3 3 3 8 1 0 1 1300 0 1101 5
     "0100057" 2 51 2 1  975 0 1101 1 3 1 2 1 2 2 3 1  968 0 1101 1
     "0100058" 2 51 1 1 6479 0 1101 3 3 3 3 3 2 1 3 1 7855 0 1101 3
     "0100059" 2 26 1 6 1036 0 1101 3 0 3 3 3 2 1 0 2 1381 0 1101 3
     "0100069" 5 55 2 4  225 1 1101 9 4 5 5 3 5 6 4 4    0 1 1101 9
     "0100071" 1 47 1 4 1465 0 1101 2 2 3 3 3 1 1 2 2 1444 0 1101 2
     "0100082" 1 66 6 1    0 1 7103 8 3 5 5 5 1 5 3 1    0 1 7103 8
     "0100083" 1 67 6 1    0 1 2100 9 3 5 5 5 1 6 3 1    0 1 2100 9
     "0100084" 1 64 6 6    0 0 1101 9 0 5 5 5 1 6 0 6    0 0 1101 9
     "0100097" 2 37 1 1  620 0 1101 9 1 3 3 3 2 1 1 1    0 0 1101 9
     "0100098" 2 39 5 1    0 0 2100 2 1 5 5 5 2 6 1 1    0 0 2100 2
     "0100100" 3 39 1 3 1269 3 1101 2 2 2 2 2 3 1 2 3 1016 3 1101 2
     "0100103" 2 33 1 1 3808 0 1101 3 0 3 3 3 2 1 0 1 4987 0 1101 3
     "0100105" 2 67 6 5    0 0 1101 9 5 5 5 5 2 6 5 5    0 0 1101 9
     "0100106" 2 23 1 6  815 0 1101 8 0 3 4 4 2 1 0 6  807 0 1101 8
     "0100107" 5 52 1 6  900 0 1101 9 0 3 3 3 5 1 0 6  920 0 1101 9
     "0100113" 6 56 1 1 1275 1 1101 9 3 3 3 3 6 6 3 1    0 1 1101 9
     "0100114" 6 56 2 1  375 1 1101 9 3 3 3 3 6 2 3 1  400 1 1101 9
     "0100137" 2 51 1 6  750 1 1101 5 0 3 3 3 2 1 0 6 1093 1 1101 5
     "0100138" 3 34 1 6  750 0 1101 8 1 3 3 3 3 5 2 6    0 0 1101 8
     "0100140" 5 46 1 6 1000 0 1101 3 0 2 2 2 5 2 0 6  600 0 1101 3
     "0100147" 6 47 1 3  750 1 1101 5 1 3 3 3 6 1 3 3  710 1 1101 5
     "0100148" 6 44 2 3  500 2 1101 9 3 3 3 3 6 2 3 3  500 1 1101 9
     "0100149" 6 18 3 6    0 1 1101 9 0 3 3 3 6 2 0 6  470 1 1101 9
     "0100150" 1 60 5 1    0 0 2100 9 3 3 3 3 1 6 3 1    0 0 2100 9
     "0100151" 1 64 6 1    0 0 2100 8 3 3 3 3 1 6 3 1    0 0 2100 8
     "0100152" 1 43 1 1 1389 0 1101 2 3 3 3 3 1 1 3 1  913 0 1101 2
     "0100158" 3 42 1 2  830 3 1101 9 1 3 3 3 3 1 1 1    0 3 1101 9
     "0100164" 3 51 1 1  750 0 1101 9 2 3 3 3 3 1 2 1 1154 0 1101 9
     "0100165" 3 47 1 1  800 0 1101 5 2 3 3 3 3 1 2 1 1050 0 1101 5
     "0100166" 3 17 2 6  298 0 1101 8 0 2 2 2 3 2 0 6  110 0 1101 8
     "0100167" 3 16 5 6    0 0 1101 9 0 1 1 1 3 5 0 6    0 0 1101 9
     "0100177" 2 28 1 1 1174 3 1101 3 0 3 3 3 6 1 0 3 2050 0 1101 3
     "0100178" 6 27 1 2  980 0 1101 2 0 3 3 3 6 1 0 2 1200 3 1101 2
     "0100185" 3 58 1 4  972 1 1101 9 3 3 3 3 3 1 3 4  735 1 1101 9
     "0100186" 3 39 1 6  683 0 1101 4 1 3 3 3 3 1 1 2  730 1 1101 4
     "0100187" 3 37 1 1    0 0 1101 4 2 3 3 3 3 1 2 1 1200 0 1101 4
     "0100188" 6 34 1 1  750 1 1101 5 2 3 3 3 6 1 2 1 1131 1 1101 5
     "0100195" 3 52 1 4 1178 0 1101 9 1 3 3 3 3 1 1 4 1226 0 1101 9
     "0100196" 3 23 1 6 1300 0 1101 8 0 3 3 3 3 1 0 6 1400 0 1101 8
     "0100197" 1 33 6 1    0 0 1101 3 1 3 3 3 1 6 1 1    0 0 1101 3
     "0100198" 5 53 1 1  555 3 1101 8 3 3 3 3 5 1 2 1    0 3 1101 8
     "0100199" 5 47 2 1    0 3 1101 3 3 3 3 3 5 2 3 1    0 3 1101 3
     "0100201" 5 16 6 6    0 3 1101 9 0 5 5 5 5 2 0 6  150 3 1101 8
     "0100206" 2 34 1 1  990 0 1101 9 1 3 3 3 2 1 1 1 1100 0 1101 9
     "0100207" 2 16 2 6  253 0 1101 9 0 1 1 1 2 1 0 6  400 0 1101 9
     "0100208" 2 31 1 1  464 0 1101 4 0 3 3 3 2 1 0 1  646 0 1101 4
     "0100209" 2 33 3 1    0 0 5203 5 0 3 3 3 2 1 0 1  810 0 5203 5
     "0100217" 1 38 1 6 1726 0 1201 2 0 3 3 3 1 1 0 6 1414 0 1201 2
     "0100218" 3 55 6 4    0 0 1101 3 2 5 5 5 3 6 2 4    0 0 1101 3
     "0100246" 1 30 6 1  200 0 1101 3 3 3 3 3 1 2 3 1  150 0 1101 3
     "0100249" 1 57 6 1    0 0 7203 1 2 5 5 5 1 6 2 1    0 0 7203 1
     "0100250" 1 22 1 6  865 0 7203 3 0 3 3 4 1 1 0 6 1055 0 7203 3
     "0100257" 2 38 1 6 1239 0 1101 3 0 3 3 3 2 1 0 6 1534 0 1101 3
     "0100259" 5 38 1 2 1425 0 2100 5 1 3 3 3 5 1 1 2 1500 0 2100 5
     "0100260" 1 53 4 4    0 1 1101 9 3 5 5 5 1 6 3 4    0 3 1101 9
     "0100261" 1 24 5 6    0 1 1101 8 0 5 5 5 1 6 0 6    0 1 1101 8
     "0100268" 1 49 6 4    0 1 1101 9 4 5 5 5 1 6 4 4    0 1 1101 9
     "0100270" 2 47 2 1  550 3 1101 9 2 3 3 3 2 2 2 1   50 3 1101 9
     "0100271" 2 48 1 1 2100 3 1101 5 2 3 3 3 2 1 2 1 2500 3 1101 5
     "0100272" 2 16 2 6  115 3 1101 9 0 2 2 2 2 2 0 6  120 3 1101 9
     "0100274" 3 38 5 1    0 0 1101 2 1 3 5 5 3 6 1 1    0 0 1101 2
     "0100279" 3 53 1 4  500 0 2303 5 2 3 3 3 3 1 2 4    0 0 2303 5
     "0100283" 2 29 1 6  856 0 1101 5 0 3 3 3 2 1 0 6  991 0 1101 5
     "0100284" 5 54 1 1    0 3 1101 2 3 3 3 3 5 1 3 1  349 3 1101 2
     "0100285" 5 62 1 1    0 3 1101 4 3 3 3 3 5 2 3 1    0 3 1101 4
     "0100287" 2 52 1 1 1266 0 3302 8 2 3 3 3 2 1 2 1 3490 0 3302 8
     "0100288" 2 61 6 1    0 0 3302 8 2 5 5 5 2 6 2 1    0 0 3302 8
     "0100289" 2 45 1 1 1200 0 1101 5 4 3 3 3 2 1 4 1 1100 0 1101 5
     "0100299" 4 42 1 1    0 0 1101 5 2 3 3 3 4 1 2 1    0 0 1101 5
     "0100300" 4 45 2 1  230 0 1101 2 2 3 3 3 4 2 2 1  300 0 1101 2
     "0100305" 3 26 1 6 1138 2 1101 3 0 3 3 3 3 1 0 6 1025 2 1101 3
     "0100306" 3 24 2 6  529 1 1101 3 0 3 3 3 3 1 0 6  998 1 1101 3
     "0100309" 4 60 5 1    0 3 1101 8 2 3 3 3 4 1 2 1  670 3 1101 8
     "0100310" 1 82 6 5    0 1 1101 9 4 5 5 5 1 6 4 5    0 1 1101 9
     "0100315" 2 33 1 1 2895 0 1101 1 1 3 3 3 2 1 2 1 3500 0 1101 1
     end
     label values ghhstate GHHSTATE
     label def GHHSTATE 1 "[1] NSW", modify
     label def GHHSTATE 2 "[2] VIC", modify
     label def GHHSTATE 3 "[3] QLD", modify
     label def GHHSTATE 4 "[4] SA", modify
     label def GHHSTATE 5 "[5] WA", modify
     label def GHHSTATE 6 "[6] TAS", modify
     label def GHHSTATE 8 "[8] ACT", modify
     label values ghgage GHGAGE
     label values gesdtl GESDTL
     label def GESDTL 1 "[1] Employed FT", modify
     label def GESDTL 2 "[2] Employed PT", modify
     label def GESDTL 3 "[3] Unemployed, looking for FT work", modify
     label def GESDTL 4 "[4] Unemployed, looking for PT work", modify
     label def GESDTL 5 "[5] Not in the labour force, marginally attached", modify
     label def GESDTL 6 "[6] Not in the labour force, not marginally attached", modify
     label values gmrcurr GMRCURR
     label def GMRCURR 1 "[1] Legally married", modify
     label def GMRCURR 2 "[2] De facto", modify
     label def GMRCURR 3 "[3] Separated", modify
     label def GMRCURR 4 "[4] Divorced", modify
     label def GMRCURR 5 "[5] Widowed", modify
     label def GMRCURR 6 "[6] Never married and not de facto", modify
     label values gwscei GNUM
     label values ghhsos GHHSOS
     label def GHHSOS 0 "[0] Major Urban", modify
     label def GHHSOS 1 "[1] Other Urban", modify
     label def GHHSOS 2 "[2] Bounded Locality", modify
     label def GHHSOS 3 "[3] Rural Balance", modify
     label values gancob GCOUNTRY
     label def GCOUNTRY 1101 "[1101] Australia", modify
     label def GCOUNTRY 1201 "[1201] New Zealand", modify
     label def GCOUNTRY 2100 "[2100] United Kingdom", modify
     label def GCOUNTRY 2303 "[2303] France", modify
     label def GCOUNTRY 3302 "[3302] Czech Republic", modify
     label def GCOUNTRY 5203 "[5203] Malaysia", modify
     label def GCOUNTRY 6101 "[6101] China (excludes SARs and Taiwan)", modify
     label def GCOUNTRY 7103 "[7103] India", modify
     label def GCOUNTRY 7203 "[7203] Azerbaijan", modify
     label values gedhigh1 GEDHIGHB
     label def GEDHIGHB 1 "[1] Postgrad - masters or doctorate", modify
     label def GEDHIGHB 2 "[2] Grad diploma, grad certificate", modify
     label def GEDHIGHB 3 "[3] Bachelor or honours", modify
     label def GEDHIGHB 4 "[4] Adv diploma, diploma", modify
     label def GEDHIGHB 5 "[5] Cert III or IV", modify
     label def GEDHIGHB 8 "[8] Year 12", modify
     label def GEDHIGHB 9 "[9] Year 11 and below", modify
     label values gtchad GTCHAD
     label def GTCHAD 0 "[0] No children ever", modify
     label values activity1 activity
     label values activity2 activity
     label values activity3 activity
     label def activity 1 "Study only", modify
     label def activity 2 "Work & study", modify
     label def activity 3 "Work only", modify
     label def activity 5 "NILF", modify
     label def activity 4 "Unemployed", modify
     label values hhhstate HHHSTATE
     label def HHHSTATE 1 "[1] NSW", modify
     label def HHHSTATE 2 "[2] VIC", modify
     label def HHHSTATE 3 "[3] QLD", modify
     label def HHHSTATE 4 "[4] SA", modify
     label def HHHSTATE 5 "[5] WA", modify
     label def HHHSTATE 6 "[6] TAS", modify
     label def HHHSTATE 8 "[8] ACT", modify
     label values hesdtl HESDTL
     label def HESDTL 1 "[1] Employed FT", modify
     label def HESDTL 2 "[2] Employed PT", modify
     label def HESDTL 5 "[5] Not in the labour force, marginally attached", modify
     label def HESDTL 6 "[6] Not in the labour force, not marginally attached", modify
     label values htchad HTCHAD
     label def HTCHAD 0 "[0] No children ever", modify
     label values hmrcurr HMRCURR
     label def HMRCURR 1 "[1] Legally married", modify
     label def HMRCURR 2 "[2] De facto", modify
     label def HMRCURR 3 "[3] Separated", modify
     label def HMRCURR 4 "[4] Divorced", modify
     label def HMRCURR 5 "[5] Widowed", modify
     label def HMRCURR 6 "[6] Never married and not de facto", modify
     label values hwscei HNUM
     label values hhhsos HHHSOS
     label def HHHSOS 0 "[0] Major Urban", modify
     label def HHHSOS 1 "[1] Other Urban", modify
     label def HHHSOS 2 "[2] Bounded Locality", modify
     label def HHHSOS 3 "[3] Rural Balance", modify
     label values hancob HCOUNTRY
     label def HCOUNTRY 1101 "[1101] Australia", modify
     label def HCOUNTRY 1201 "[1201] New Zealand", modify
     label def HCOUNTRY 2100 "[2100] United Kingdom", modify
     label def HCOUNTRY 2303 "[2303] France", modify
     label def HCOUNTRY 3302 "[3302] Czech Republic", modify
     label def HCOUNTRY 5203 "[5203] Malaysia", modify
     label def HCOUNTRY 6101 "[6101] China (excludes SARs and Taiwan)", modify
     label def HCOUNTRY 7103 "[7103] India", modify
     label def HCOUNTRY 7203 "[7203] Azerbaijan", modify
     label values hedhigh1 HEDHIGHB
     label def HEDHIGHB 1 "[1] Postgrad - masters or doctorate", modify
     label def HEDHIGHB 2 "[2] Grad diploma, grad certificate", modify
     label def HEDHIGHB 3 "[3] Bachelor or honours", modify
     label def HEDHIGHB 4 "[4] Adv diploma, diploma", modify
     label def HEDHIGHB 5 "[5] Cert III or IV", modify
     label def HEDHIGHB 8 "[8] Year 12", modify
     label def HEDHIGHB 9 "[9] Year 11 and below", modify
    Added: Are you sure you have correctly identified the variable stubs for reshaping? Both *state (yours) and *hsos (mine, modeled on *state) seem problematic, because we end up with a bunch of observations for which the variable whatever takes on the values ghh and hhh, and all of the other variables that were converted to long are missing. Notably, there is no gstate or ghsos variable in the original data; the variables are ghhstate, hhhstate, ghhsos, and hhhsos. Perhaps you really meant *hhstate and *hhsos for those? If so, replace @state and @hsos in the -reshape- command with @hhstate and @hhsos, respectively.
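
To see the stub problem in isolation, here is a small sketch on toy data with made-up values. It assumes, judging from the variable names in the -dataex- header (ghhstate, hhhstate), that the stub which leaves only the wave letters g and h is hhstate.

```stata
* Toy data (hypothetical values): two wave-prefixed variables per wave
clear
input byte(ghhstate hhhstate gesdtl hesdtl)
4 4 1 5
2 2 3 5
end
gen long obs_no = _n

* With the stub @state, the part matched by @ is "ghh" or "hhh",
* while @esdtl yields "g" or "h", so the two sets of j values do not line up
preserve
reshape long @state @esdtl, i(obs_no) j(whatever) string
tabulate whatever
restore

* With the stub @hhstate, both stubs yield the same j values
reshape long @hhstate @esdtl, i(obs_no) j(whatever) string
tabulate whatever
```

Running -tabulate- on the j() variable right after a -reshape- is a quick way to spot a stub that does not line up with the others.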

    Also, I have verified that in your example data, the value labels for corresponding variables are, if not identical in all cases, at least completely consistent. You should also verify that in your full data set, because if they are not, the reshape to long will mix apples and oranges in the same variable, but the result will not be tasty fruit salad. It will be data salad. If the G* and H* value labels are not compatible, you need to -decode- all of those variables first, -reshape- the data set with the string variables instead, and then you can -encode- them all at the end.
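
A minimal sketch of that -decode-, -reshape-, -encode- route, on toy data whose two value labels are deliberately incompatible. The label text, the values, and the names `wave` and `esdtl_n` are made up for illustration.

```stata
* Toy data (hypothetical): the same codes mean different things in g and h
clear
input byte(gesdtl hesdtl)
1 2
3 1
end
label define GESDTL 1 "Employed FT" 2 "Employed PT" 3 "Unemployed"
label define HESDTL 1 "Employed PT" 2 "Employed FT" 3 "Unemployed"
label values gesdtl GESDTL
label values hesdtl HESDTL
gen long obs_no = _n

* Replace each labeled numeric variable by its decoded string version
foreach v of varlist gesdtl hesdtl {
    decode `v', gen(s_`v')
    drop `v'
    rename s_`v' `v'
}

* Reshape the string variables, then build one consistent numeric coding
reshape long @esdtl, i(obs_no) j(wave) string
encode esdtl, gen(esdtl_n)
list, sepby(obs_no)
```

Note that -encode- assigns codes in alphabetical order of the strings, so the new numeric codes need not match either of the original codings.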
    Last edited by Clyde Schechter; 21 Feb 2022, 16:19.


    • #3
      Note that this question somewhat continues an earlier topic, to which it was added earlier today.

      https://www.statalist.org/forums/for...ong-conversion

      The post here demonstrates that copying from a code block in a previous post and pasting it into a new post often does not work.


      • #4
        Thanks Clyde for your assistance and apologies for the issue with the code.

        I should've clarified a few things.

        This data comes from a sequence (OM) analysis of monthly employment activity data, which is captured by the activity variable. So the data has already been reshaped from wide to long at some point, and the activity variable is now a kind of aggregate of the monthly employment calendar data. The groupWARD_5 variable identifies the pathways, but I want to do further multivariate analysis, so I need to make the other wide variables, like @edhigh1 here, long. I tried your code, Clyde, but then I get an error saying that xwaveid does not uniquely identify the observations. Is this because I'm going from the sequence and OM analysis to this, or something else?

        Or is there a better way to do this - perform the sequence analysis and OM and then....

        Thanks for any insights
        Brendan

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str7 xwaveid int order byte(gedhigh1 hedhigh1 iedhigh1) float(activity w1age om1 _SQid) int(distWARD_id distWARD_ord distWARD_pht) double distWARD_hgt byte groupWARD_5
        "0100003"   1 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   2 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   3 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   4 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   5 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   6 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   7 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   8 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"   9 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  10 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  11 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  12 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  13 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  14 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  15 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  16 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  17 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  18 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  19 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  20 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  21 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  22 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  23 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  24 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  25 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  26 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  27 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  28 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  29 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  30 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  31 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  32 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  33 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  34 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  35 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  36 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  37 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  38 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  39 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  40 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  41 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  42 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  43 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  44 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  45 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  46 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  47 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  48 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  49 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  50 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  51 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  52 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  53 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  54 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  55 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  56 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  57 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  58 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  59 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  60 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  61 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  62 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  63 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  64 9 9 9 3 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  65 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  66 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  67 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  68 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  69 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  70 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  71 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  72 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  73 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  74 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  75 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  76 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  77 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  78 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  79 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  80 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  81 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  82 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  83 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  84 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  85 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  86 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  87 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  88 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  89 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  90 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  91 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  92 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  93 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  94 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  95 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  96 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  97 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  98 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003"  99 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        "0100003" 100 9 9 9 5 3 1.3095238 5016 5016 2874 706 0 4
        end
        label values gedhigh1 GEDHIGHB
        label def GEDHIGHB 9 "[9] Year 11 and below", modify
        label values hedhigh1 HEDHIGHB
        label def HEDHIGHB 9 "[9] Year 11 and below", modify
        label values iedhigh1 IEDHIGHB
        label def IEDHIGHB 9 "[9] Year 11 and below", modify
        label values activity activity
        label def activity 3 "Work only", modify
        label def activity 5 "NILF", modify
        label values w1age wave1age
        label def wave1age 3 "mature adults", modify
        label values groupWARD_5 pathno
        Last edited by Brendan Churchill; 22 Feb 2022, 00:46.



        • #5
          I have no idea what a sequence OM analysis is, nor what OM is an abbreviation of. So I can't shed much light on the genesis of the situation.

          You also did not answer my question about whether *state and *hsos should be *hstate and *hhsos. So not much basis for making progress here.

          Your new example data is rather different from what you showed in #1. Clearly xwaveid does not uniquely identify observations in this version (whereas it did in the earlier version). But it appears that the combination of xwaveid and order does. So, on the assumption that this pattern is also true of the full data set, you can take this already long data set to double-long as follows:

          Code:
          isid xwaveid order
          
          reshape long @edhigh1, i(xwaveid order) j(whatever) string
          The -isid- command will verify that xwaveid and order do, in fact, uniquely identify observations throughout your data set. If it halts and says they don't, then you can do this instead:

          Code:
          gen long obs_no = _n
          reshape long @edhigh1, i(obs_no) j(whatever) string
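
For anyone following along, here is a minimal sketch that combines the two branches above into one runnable piece. The toy values are made up for illustration, and the use of -capture- to choose between the branches is my own assumption about how one might automate the choice, not something from the original advice.

```stata
* Toy data (hypothetical values): two people, three monthly rows each,
* with wave-prefixed education variables gedhigh1, hedhigh1, iedhigh1
clear
input str7 xwaveid byte(order gedhigh1 hedhigh1 iedhigh1)
"0100003" 1 9 9 9
"0100003" 2 9 9 9
"0100003" 3 9 9 9
"0100005" 1 8 8 5
"0100005" 2 8 8 5
"0100005" 3 8 8 5
end

* -capture- suppresses the error so that _rc can be inspected afterwards
capture isid xwaveid order

if _rc == 0 {
    * xwaveid and order jointly identify observations.
    * The @ marks where the wave letter sits in the variable name;
    * -string- is needed because g, h, i are not numeric.
    reshape long @edhigh1, i(xwaveid order) j(whatever) string
}
else {
    * Fall back to an artificial row identifier
    gen long obs_no = _n
    reshape long @edhigh1, i(obs_no) j(whatever) string
}

list xwaveid order whatever edhigh1 in 1/6, sepby(order)
```

In this toy data the -isid- check passes, so the first branch runs and each original row appears once per wave letter.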



          • #6
            Hi Clyde

            Thanks again.

            I'm using sequence analysis and optimal matching techniques on monthly employment and education calendar data.

            Code:
            foreach w in g h i j k l m n o p q r s t {
            
                use "Rperson_`w'200c.dta", clear
            
                //STEP 1: generate indicator activity variables. (at this stage, not mutually exclusive)
            
                // unemployment and NILF indicator variables are already in the HILDA calendar 
                // (acaune01 ... acaune36; acanlf01 ... acanlf36)
            
                //create an indicator variable (aedupart01...aedupart36) for participating in any education
                // (school or post-school; PT or FT) 
                // Note that at this stage, those enrolled in education will also include some who have a job.
                // Will separate students with and without a job in step 2.
                foreach y in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30  ///
                             31 32 33 34 35 36 {
                    generate `w'edupart`y' =0         // this will include both 'no' and 'not asked'                                                         
                    replace  `w'edupart`y' = 1 if `w'caeft`y' ==1 | `w'caept`y'==1
                    }                                                                                            
            
                forvalues y =1/9 {
                    rename `w'edupart0`y' `w'edupart`y'
                    }
                
                //create an indicator variable (aemployed01...aemployed36) for employed in at least one job 
                foreach y in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 ///
                             21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 {
                    generate `w'employed`y' =0        
                    replace  `w'employed`y' =1  if `w'caj01`y' ==1 | `w'caj02`y' ==1 | ///
                            `w'caj03`y' ==1 | `w'caj04`y' ==1 |`w'caj05`y' ==1 | ///
                            `w'caj06`y' ==1 | `w'caj07`y' ==1 | `w'caj08`y' ==1 | ///
                            `w'caj09`y' ==1 | `w'caj10`y' ==1 |`w'caj11`y' ==1 |`w'caj12`y' ==1 
                    }
            
                // rename emp variables so that they have suffixes that can be treated as numbers
                forvalues y =1/9 {
                    rename `w'employed0`y' `w'employed`y'
                    }
            
                // rename unemp variables so that they have suffixes that can be treated as numbers    
                forvalues y =1/9 {
                    rename `w'caune0`y' `w'caune`y'
                    }
            
                // rename nilf variables so that they have suffixes that can be treated as numbers
                forvalues y = 1/9 {
                    rename `w'canlf0`y' `w'canlf`y'
                    }
            
            
                //STEP 2: create mutually exclusive and exhaustive, activity indicator variables
                // for each month third
            
                // 1st activity category: in education only (not working, but can be either 
                // unemployed or NILF while studying)
                // need to exclude those also in employment
                forvalues y =1/36 {
                    generate `w'eduonly`y' = `w'edupart`y'        
                    replace  `w'eduonly`y' =0 if  `w'employed`y' ==1 //those who are studying and also working 
                                                                // are removed from this indicator
                }
            
                // 2nd activity category: in education and also employed 
                forvalues y =1/36  {
                    generate `w'eduemp`y' = 0        
                    replace  `w'eduemp`y' = 1 if  `w'employed`y' ==1 & `w'edupart`y'==1 
                    }
            
                // 3rd activity category: employed only
                // need to exclude those also in education
                forvalues y =1/36 {
                    generate `w'emponly`y' = `w'employed`y'        
                    replace  `w'emponly`y' =0 if  `w'edupart`y' ==1 // those who are employed and also studying 
                                                                // are removed from this indicator
                    }
                
                // 4th activity category: unemployed (and not studying)
                //remove students and employed from unemployed category
                forvalues y =1/36 {
                    recode `w'caune`y'  -1 =0  
                    replace `w'caune`y'=0 if `w'edupart`y'==1 | `w'employed`y'==1                                                     
                    }
            
                // 5th activity category: not in labour force (and not studying)
                //create not in labour force (and not participating in education or training) variable
                forvalues y =1/36 {
                    generate `w'nilf`y'=0
                    replace `w'nilf`y'=1 if `w'canlf`y'==1 & `w'edupart`y' !=1 & ///
                    `w'employed`y'!=1 & `w'caune`y' !=1
                    }
            
                
                //STEP 3: Generate monthly activity variables 
                //generate activity variables with values to be filled in below:
                forvalues y=1/12 {
                    generate `w'activity`y' =.
                    }
                
                //define label for activity variables
                label define activity 1 "Study only" 2 "Work & study" 3 "Work only" ///
                4 "Unemployed" 5 "NILF"
            
                
                local i=1
                local j=2        // these three local macros represent the month thirds in each month
                local k=3
            
                    forvalues l=1/12 {
                        replace `w'activity`l' = 1 if  `w'eduonly`i' ==1 | `w'eduonly`j' ==1 | `w'eduonly`k' ==1 //study only
            
                        replace `w'activity`l' = 2 if `w'eduemp`i' ==1 | `w'eduemp`j' ==1 | `w'eduemp`k' ==1 // work & study                            
                        replace `w'activity`l' = 2 if `w'activity`l' ==1 & `w'emponly`i' ==1
                        replace `w'activity`l' = 2 if `w'activity`l' ==1 & `w'emponly`j' ==1
                        replace `w'activity`l' = 2 if `w'activity`l' ==1 & `w'emponly`k' ==1 
            
                        replace `w'activity`l' = 3 if `w'activity`l' == . & `w'emponly`i'==1  // work only
                        replace `w'activity`l' = 3 if `w'activity`l' == . & `w'emponly`j'==1 
                        replace `w'activity`l' = 3 if `w'activity`l' == . & `w'emponly`k'==1 
            
                        replace `w'activity`l' = 4 if `w'activity`l' == . & `w'caune`i'==1  // unemp
                        replace `w'activity`l' = 4 if `w'activity`l' == . & `w'caune`j'==1 
                        replace `w'activity`l' = 4 if `w'activity`l' == . & `w'caune`k'==1 
            
                        replace `w'activity`l' = 5 if `w'activity`l' ==. // NILF                            
                        
                        label values `w'activity`l' activity
                        
                    
                    local i=`i'+3
                    local j=`j'+3        // move to the next set of month thirds.
                    local k=`k'+3
                    }
            
                //STEP 4: Create wide dataset for each wave, save to folder.
                keep xwaveid `w'hgage `w'activity* `w'edhigh1
            
                local wnum = strpos("ghijklmnopqrst","`w'")
                local j = (12*`wnum')-11
                forvalues i = 1/12 {
                    rename `w'activity`i' activity`j'
                    local j = (12*`wnum')-11 + `i'
                }
                // Activities are now contained in variables activity1 - activity168
                // (14 waves x 12 months, ie. an activity variable for each month; split into one file per year) 
                // with values 1 - 5 indicating what the activity was.
                save "activity_wave`w'", replace
            }
            
            //STEP 5: include other variable(s) from wave 1 e.g. gender (ahgsex). Add 
            // other variables as appropriate. 
            use "Rperson_g200c.dta", clear
                keep xwaveid ghgsex 
                sort xwaveid
            merge 1:1 xwaveid using "activity_waveg.dta"
                keep if _merge==3
                drop _merge
            save "activity_waveg.dta", replace
            
            //STEP 6: merge the 14 waves (g to t) together to form a wide dataset of activities.
            use "activity_waveg.dta", clear
            foreach w in h i j k l m n o p q r s t {
                sort xwaveid
                merge 1:1 xwaveid using "activity_wave`w'.dta"
                keep if _merge==3 
                drop _merge `w'hgage // drop unnecessary variables
            }
            
            // generate age group segments 
            generate w1age=.
            replace w1age=1 if ghgage >=15 & ghgage <25 // youths
            replace w1age=2 if ghgage >=25 & ghgage <40 // young adults
            replace w1age=3 if ghgage >=40 & ghgage <55 // mature adults
            replace w1age=4 if ghgage >=55 & ghgage <65 // seniors
            
            label define wave1age 1 "youths" 2 "young adults" 3 "mature adults" 4 "seniors" 
            
            label values w1age wave1age
            
            save "activity_month_wide.dta", replace
            
            keep if w1age==1
            
            reshape long activity, i(xwaveid) j(order)
            
            sqset activity xwaveid order     
            
            sqom, name(om1)
            
            sqom, subcost(meanprobdistance) full k(2)
            
            sqclusterdat 
            
            cluster tree distWARD, cutnumber(20) 
            
            graph save "youth dendrogram", replace
            
            cluster generate groupWARD_5=groups(5), name(distWARD) 
            sqclusterdat, return 
            
            label define pathno 1 "pathway 1" 2 "pathway 2" 3 "pathway 3" ///
            4 "pathway 4" 5 "pathway 5" 
            
            label values groupWARD_5 pathno
            And when I use your suggested code above Clyde, it works but I noticed that the observations are huge and I assume that's because in the reshaping it's turned the aggregated monthly calendar data into many observations, but is this a problem for when I do my analysis? Let's say I use a fixed effects regression?

            Is there a more simplified way of reshaping the data that doesn't do this?

            Best
            Brendan



            • #7
              I noticed that the observations are huge and I assume that's because in the reshaping it's turned the aggregated monthly calendar data into many observations, but is this a problem for when I do my analysis?
              Yes, the -reshaping- expands the number of observations and decreases the number of variables. Overall the total size of the data set increases because the variables that are not -reshape-d now get repeated throughout the series of observations that arise from each original observation. It's not likely to be a problem. And, in fact, using any of Stata's estimation commands you have no alternative. There is no way you can have the various series of *edhigh1 values treated as a single edhigh1 variable without doing this. And if the variable, which I called -whatever-, which is created by -reshape- is itself a variable in your regression, then, again, there is no alternative to -reshape long- here.
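
To make the expansion concrete, here is a small sketch on made-up data. The values are hypothetical, and w1age stands in for any variable that is not reshaped. The -encode- step at the end is only one possible way to get a numeric wave variable for a regression.

```stata
* Toy data (hypothetical values), already long in order
clear
input str7 xwaveid byte(order w1age gedhigh1 hedhigh1 iedhigh1)
"0100003" 1 3 9 9 9
"0100003" 2 3 9 9 9
"0100005" 1 1 8 8 5
"0100005" 2 1 8 8 5
end

count                      // 4 observations before -reshape-
reshape long @edhigh1, i(xwaveid order) j(whatever) string
count                      // 12 observations: 4 rows x 3 wave letters

* w1age was not reshaped, so its value is copied onto every new row
list xwaveid order whatever w1age edhigh1, sepby(xwaveid order)

* If the wave letter is needed as a regressor, one option is a numeric code
encode whatever, gen(wave)
```

The number of variables falls from six to five, while the number of observations triples, which is the trade-off described above.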

              There are a few circumstances where a data set can have too many observations and lead to failure. One is if the total number of observations exceeds the limit for a data set. In Stata version 17, you can have up to about 2.147 billion observations in a data set, and roughly 500 times that many if you are running the MP flavor. Given that your data appears to be about human beings, it is hard for me to imagine you will exceed even the smaller of those limits. The other problem that can be encountered is with maximum likelihood estimations such as logistic, probit, or Poisson models. Sometimes as Stata adds up the contributions to the log-likelihood from each observation, a very large data set can lead to numerical overflow: the total exceeds the largest number that Stata can represent. In that case, the regression will fail, and the solution would be to restrict the analysis to some smaller, more manageable subset. But if you are doing a regression that does not rely on maximum likelihood (like -xtreg-) this will not be an issue.
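
As a rough illustration of the -xtreg- case, the sketch below runs a fixed effects regression on simulated long data. Everything here is an assumption for illustration: the outcome y, the predictor x, the simulated values, and the numeric identifier pid (created because -xtset- needs a numeric panel variable, whereas xwaveid is a string).

```stata
* Simulated long data (hypothetical): 5 people with 12 rows each
clear
set seed 2022
set obs 60
gen str7 xwaveid = "01000" + string(ceil(_n/12), "%02.0f")
bysort xwaveid: gen order = _n

* Hypothetical predictor and outcome
gen double x = rnormal()
gen double y = 0.5*x + rnormal()

* -xtset- requires a numeric panel identifier
egen long pid = group(xwaveid)
xtset pid order

* Fixed effects (within) regression
xtreg y x, fe

* Number of observations currently in memory
display "observations in memory: " _N
```

With real data, the choice of panel and time variables depends on the analysis and is not settled by this sketch.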



              • #8
                Thank you very much Clyde for your very helpful response - it is most appreciated.
                Best
                Brendan

