Generating a wave (time) variable in a panel data set

Chris Boulis

Join Date: Feb 2019

Posts: 363
#1

26 Sep 2019, 00:42

I want to generate a variable 'wave' to identify each wave in the dataset. I have the responding person identifier, but no year variable and so I need to create a wave variable to xtset (tsset) my panel (e.g. id, wave). I've searched throughout Statalist and elsewhere with no luck to date.

Can anyone kindly steer me in the right direction with the code please?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

26 Sep 2019, 01:31

Chris:
maybe something along the following lines can be useful:

Code:

. set obs 6
number of observations (_N) was 0, now 6

. g id=_n in 1/3
(3 missing values generated)

. replace id=_n-3 if id==.
(3 real changes made)

. bysort id: g wave=_n

. list

     +-----------+
     | id   wave |
     |-----------|
  1. |  1      1 |
  2. |  1      2 |
  3. |  2      1 |
  4. |  2      2 |
  5. |  3      1 |
     |-----------|
  6. |  3      2 |
     +-----------+

However, please note that a -wave- created this way simply numbers the observations within each panel, so it risks ignoring gaps in the panel.
As an aside, if you do not plan to use time-series operators, such as lags and leads, you can simply -xtset- your data with -panelid- only:

Code:

xtset panelid
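
To illustrate the caveat about gaps, here is a minimal sketch on made-up data. The variables -truewave- and -y- are hypothetical and stand in for information you may not have in your dataset; panel 1 is assumed to have skipped wave 3.

```stata
* Toy panel: id 1 was not interviewed in wave 3 (hypothetical data)
clear
input id truewave y
1 1 10
1 2 11
1 4 13
2 1 20
2 2 21
2 3 22
end

* Sequence number within panel: 1, 2, 3 for both ids, so the gap disappears
bysort id (truewave): generate wave = _n

* Lag based on the sequence number
xtset id wave
generate lag_seq = L.y

* Lag based on the true wave
xtset id truewave
generate lag_true = L.y

list, sepby(id)
```

For id 1 in true wave 4, -lag_seq- picks up the value from wave 2 as if it were the previous wave, whereas -lag_true- is missing because wave 3 was not observed.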

Kind regards,
Carlo
(Stata 19.0)
Chris Boulis

Join Date: Feb 2019

Posts: 363
#3

29 Sep 2019, 19:41

Thank you very much, Carlo. This code did the job:
bysort id: g wave=_n
Chris Boulis

Join Date: Feb 2019

Posts: 363
#4

16 Oct 2019, 20:02

As I intend to use lags in my panel data analysis, could I kindly ask for more guidance on creating a wave variable, given Carlo Lazzaro's advice:

note that creating -wave- yourself is at risk of ignoring gaps in the panel. As an aside, if you do not plan to use time-series commands, such as lags and leads, you can simply -xtset- your data with -panelid- only

Note that in my dataset I merge respondent and partner data within each wave and then append all waves. All the variables in the original dataset carry a wave prefix (a-q, representing waves 1-17); however, I remove this prefix in the merging/appending process, so I need to create a wave variable to identify changes in variables over time. Would it be better to do this before I remove the wave prefix from the variables?

Thank you in advance.

Code:

* partner data
local wave a b c d e f g h i j k l m n o p q
foreach x of local wave {
    use "c:/data/Combined_`x'170c.dta", clear
    bysort hhid: g wave=_n
    rename `x'* p_* // to denote partner data
    rename waveid hhid
    sort hhid
    tempfile mergedata
    save "`mergedata'", replace emptyok
    * respondent data
    use "c:/data/Combined_`x'170c.dta", clear
    bysort hhid: g wave=_n
    rename `x'* *
    drop if hhpxid==""
    sort hhpxid
    merge 1:1 hhid using "`mergedata'", replace

Last edited by Chris Boulis; 16 Oct 2019, 20:25. Reason: added in draft code
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#5

17 Oct 2019, 01:03

Chris:
I'm not really clear about what you're after.
However, my gut feeling is that it is safer to create a new wave indicator before you remove the (original) wave prefix from the variables.
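
A minimal sketch of that idea follows, on toy data rather than your files. The three prefixes a-c, the variable names -id- and -income-, and the counter -w- are all assumptions made for illustration; the wave number is taken from each prefix's position in the list and stored before the prefix is stripped.

```stata
* Toy illustration: build one small dataset per wave, tag it, then append
clear all
tempfile panel
local waves a b c                      // hypothetical: three waves, not a-q
local w = 0
foreach x of local waves {
    local ++w                          // position of the prefix = wave number
    clear
    set obs 2
    generate id = _n
    generate `x'income = 100*`w' + _n  // toy wave-prefixed variable
    generate byte wave = `w'           // record the wave BEFORE stripping the prefix
    rename `x'* *                      // aincome, bincome, cincome become income
    if `w' > 1 append using "`panel'"
    save "`panel'", replace
}
sort id wave
xtset id wave
list, sepby(id)
```

Because -wave- comes from the wave each file belongs to rather than from counting observations, a respondent who is absent from one wave keeps the correct wave numbers in the others. Note also that -rename `x'* *- touches every variable starting with that letter, so identifiers such as -id- would need protecting when the prefix is i.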

Kind regards,
Carlo
(Stata 19.0)
Chris Boulis

Join Date: Feb 2019

Posts: 363
#6

17 Oct 2019, 02:20

Yes, thank you Carlo. I posted my code to see whether anyone had a better idea of how I might create the wave variable. However, I notice there are a couple of errors in my code: for example, I include "bysort hhid : gen wave=_n" twice. The first is probably the more accurate one (so ignore the second), and it lines up with your suggestion to add this before removing the wave prefix from the variables.