replacing value based on group identifier

Christoff Galvao

Join Date: Feb 2022

Posts: 4
#1


25 Feb 2022, 03:19

Dear all,

I consider myself a complete beginner in Stata and on Statalist, so I may have missed a previous post where this question was already asked. If so, please let me know!

To my problem:
My data set has two identifiers: one unique to each participant, let's call it "S", and one shared by all members of a family, let's call it "F".
In each family "F", one member "S" has a value in the variable "X", while all other members have a missing value.
I would like to replace all the missing values in X with the value from that family member.

My approach so far:

Code:

replace X = value1 if F == "F1"
replace X = value2 if F == "F2"

This is not only time-consuming but also increases the risk of human error, as it has to be repeated for each family.
I am sure there is a more elegant way to do this, and I would like to know how a more experienced Stata user would tackle it.

Thanks for any input!

I am using Stata 16

Last edited by Christoff Galvao; 25 Feb 2022, 03:55.
Nick Cox

Join Date: Mar 2014

Posts: 35212
#2

25 Feb 2022, 03:22

Data example please https://www.statalist.org/forums/help#stata

Christoff Galvao

Join Date: Feb 2022
Posts: 4
#3

25 Feb 2022, 03:53

This is what my data look like:

Code:

* Example generated by -dataex-. To install: ssc install dataex
*dataex SUJETO FAMILIA cs_2_2, count(30)
clear
input str7(SUJETO FAMILIA) byte cs_2_2
"S00004" "F0002" .
"F0006B" "F0006" .
"F0006A" "F0006" .
"F0008"  "F0008" .
"S00043" "F0008" 2
"S00045" "F0009" .
"S00073" "F0015" .
"S00211" "F0046" 4
"S00213" "F0046" .
"F0083C" "F0083" .
"S00365" "F0083" 3
"F0083"  "F0083" .
"F0083B" "F0083" .
"F0083D" "F0083" .
"S00465" "F0103" .
"S00528" "F0115" .
"S00535" "F0118" .
"F118A"  "F0118" .
"S00536" "F0118" .
"F0138C" "F0138" .
"S0138A" "F0138" .
"F0138D" "F0138" .
"F0138B" "F0138" .
"S00617" "F0138" 2
"S00668" "F0149" .
"S00677" "F0152" .
"S00678" "F0152" .
"F0155A" "F0155" .
"S00687" "F0155" 4
"S03697" "F0171" 3
end


Nick Cox

Join Date: Mar 2014

Posts: 35212
#4

25 Feb 2022, 04:49

Perhaps you want something like

Code:

bysort FAMILIA (cs_2_2) : replace cs_2_2 = cs_2_2[_n-1] if missing(cs_2_2)

See https://www.stata.com/support/faqs/d...issing-values/
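
As a minimal sketch of why this works, here is the same command run on a few rows taken from the data example in #3 (variable name cs_2_2 as given there). Numeric missing sorts after any number in Stata, so sorting on cs_2_2 within FAMILIA puts the single non-missing value first, and replace then copies it down one observation at a time.

```stata
clear
input str7(SUJETO FAMILIA) byte cs_2_2
"F0083C" "F0083" .
"S00365" "F0083" 3
"F0083"  "F0083" .
"S00211" "F0046" 4
"S00213" "F0046" .
"S00004" "F0002" .
end

* Within each family, sort so that the non-missing value comes first,
* then fill each missing value from the observation just above it
bysort FAMILIA (cs_2_2) : replace cs_2_2 = cs_2_2[_n-1] if missing(cs_2_2)

list, sepby(FAMILIA)
```

Family F0002 stays missing because no member has a value to copy. Note also that bysort re-sorts the data, so the original order of observations is not preserved.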

If there are two or more distinct answers within a family, this is unlikely to be the best solution.
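
One way to check for that situation before filling anything in is sketched below. It assumes the variable names from the data example in #3; min_cs and max_cs are hypothetical helper names introduced only for this illustration.

```stata
* Smallest and largest non-missing value in each family
* (egen's min() and max() ignore missing values)
egen min_cs = min(cs_2_2), by(FAMILIA)
egen max_cs = max(cs_2_2), by(FAMILIA)

* Families where these differ hold two or more distinct values
list FAMILIA SUJETO cs_2_2 if min_cs != max_cs, sepby(FAMILIA)

drop min_cs max_cs
```

If the list is empty, each family has at most one distinct non-missing value and copying it to the other members is unambiguous.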
Christoff Galvao

Join Date: Feb 2022

Posts: 4
#5

25 Feb 2022, 09:05

This is exactly what I was looking for.
Thanks Nick Cox for the input and also for the link!