Pushing back a series of observations

Ashani Abayasekara

Join Date: May 2023
Posts: 106

Pushing back a series of observations

10 Jan 2024, 03:36

Hi all,

I have a dataset with 12 observations per individual. Each individual however answers only eight of 12 randomly assigned questions, and therefore I need to delete four observations for each individual which are not relevant. My first variable of interest, jobsec, records missing values straightforwardly. However, the next variable, q_order, records the eight values one after the other in the first eight rows for each individual. What I need to do is to adjust the q_order variable values to follow the same pattern as the jobsec variable; i.e., for each individual id, to make q_order equal to missing whenever jobsec is missing, and then push back other recorded values of q_order accordingly to match the jobsec values.

Below is a sample of my data. For individual id 1, I want to make q_order=missing for the 3rd and 4th observations, and push back QID545 to the 5th observation, and QID 530 to the 7th observation, QID400 to the 8th observation, and so on. I would greatly appreciate any code that can do this for all observations.

Thank you very much.

individual id	jobsec	q_order
1	7	QID500
1	8	QID300
1	.	QID545
1	.	QID530
1	6	QID400
1	.	QID350
1	5	QID310
1	9	QID560
1	6	.
1	7	.
1	8	.
1	.	.
2	.	QID530
2	4	QID500
2	.	QID310
2	6	QID 440
2	9	QID350
2	7	QID310
2	7	QID400
2	3	QID545
2	.	.
2	.	.
2	5	.
2	6	.

Tags: None

Andrew Musau

Join Date: Oct 2014
Posts: 9958

10 Jan 2024, 04:27

You are 33 posts in, so familiarize yourself with the dataex command for presenting data examples (see FAQ Advice #12 for details)

Code:

clear
input byte(individualid jobsec) str7 q_order
1 7 "QID500"
1 8 "QID300"
1 . "QID545"
1 . "QID530"
1 6 "QID400"
1 . "QID350"
1 5 "QID310"
1 9 "QID560"
1 6 "."      
1 7 "."      
1 8 "."      
1 . "."      
2 . "QID530"
2 4 "QID500"
2 . "QID310"
2 6 "QID 440"
2 9 "QID350"
2 7 "QID310"
2 7 "QID400"
2 3 "QID545"
2 . "."      
2 . "."      
2 5 "."      
2 6 "."      
end

frame put ind jobsec, into(dropobs)
gen obsno=_n
bys ind (obsno): replace obsno=_n
frame dropobs{
    drop if missing(jobsec)
    gen obsno=_n
    bys ind (obsno): replace obsno=_n
}
frlink 1:1 ind obsno, frame(dropobs)
frget wanted= jobsec, from(dropobs)
frame drop dropobs
drop if missing(wanted)
drop jobsec obsno dropobs

Res.:

Code:

. l, sepby(ind)

     +-----------------------------+
     | indivi~d   q_order   wanted |
     |-----------------------------|
  1. |        1    QID500        7 |
  2. |        1    QID300        8 |
  3. |        1    QID545        6 |
  4. |        1    QID530        5 |
  5. |        1    QID400        9 |
  6. |        1    QID350        6 |
  7. |        1    QID310        7 |
  8. |        1    QID560        8 |
     |-----------------------------|
  9. |        2    QID530        4 |
 10. |        2    QID500        6 |
 11. |        2    QID310        9 |
 12. |        2   QID 440        7 |
 13. |        2    QID350        7 |
 14. |        2    QID310        3 |
 15. |        2    QID400        5 |
 16. |        2    QID545        6 |
     +-----------------------------+

.

Last edited by Andrew Musau; 10 Jan 2024, 04:45.

Comment

Ashani Abayasekara

Join Date: May 2023

Posts: 106
#3

10 Jan 2024, 07:44

Thanks very much for this, Andrew.

If I understand the code correctly, it matches the two variables by rearranging the jobsec variable, right? This will be a bit messy, since there are many other variables in the actual dataset which are aligned with the jobsec variable. Therefore I would like to rearrange the q_order variable instead, in line with the jobsec variable (inserting missing values to q_oprder where there are missing values in jobsec and changing other values accordingly). Is this doable?
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 9958

10 Jan 2024, 10:16

Code:

clear
input byte(individualid jobsec) str7 q_order
1 7 "QID500"
1 8 "QID300"
1 . "QID545"
1 . "QID530"
1 6 "QID400"
1 . "QID350"
1 5 "QID310"
1 9 "QID560"
1 6 ""      
1 7 ""      
1 8 ""      
1 . ""      
2 . "QID530"
2 4 "QID500"
2 . "QID310"
2 6 "QID 440"
2 9 "QID350"
2 7 "QID310"
2 7 "QID400"
2 3 "QID545"
2 . ""      
2 . ""      
2 5 ""      
2 6 ""      
end


gen long obsno=_n
bys ind (obsno): gen seq=sum(!missing(jobsec)) if !missing(jobsec)
bys ind (obsno): gen seq2=sum(!missing(q_order)) if !missing(q_order)
frame put ind seq2 q_order if !missing(q_order), into(Seq2)
frlink m:1 ind seq, frame(Seq2 ind seq2)
frget wanted= q_order, from(Seq2)
frame drop Seq2
drop obsno seq seq2 Seq2 q_order

Res.:

Code:

. l, sepby(ind)

     +-----------------------------+
     | indivi~d   jobsec    wanted |
     |-----------------------------|
  1. |        1        7    QID500 |
  2. |        1        8    QID300 |
  3. |        1        .           |
  4. |        1        .           |
  5. |        1        6    QID545 |
  6. |        1        .           |
  7. |        1        5    QID530 |
  8. |        1        9    QID400 |
  9. |        1        6    QID350 |
 10. |        1        7    QID310 |
 11. |        1        8    QID560 |
 12. |        1        .           |
     |-----------------------------|
 13. |        2        .           |
 14. |        2        4    QID530 |
 15. |        2        .           |
 16. |        2        6    QID500 |
 17. |        2        9    QID310 |
 18. |        2        7   QID 440 |
 19. |        2        7    QID350 |
 20. |        2        3    QID310 |
 21. |        2        .           |
 22. |        2        .           |
 23. |        2        5    QID400 |
 24. |        2        6    QID545 |
     +-----------------------------+

.

Comment

Ashani Abayasekara

Join Date: May 2023

Posts: 106
#5

11 Jan 2024, 01:09

This worked, thank you so much!
Comment

Announcement

Pushing back a series of observations

Comment

Comment

Comment

Comment