Dear All,
I am using cross-sectional data and am trying to generate a household size variable for 1,843 observed households. The identifier variables are reg, dist and hh.
I have generated a unique household ID, uniq_hid.
The issue is that each household has many members, identified by a member number (mem) and year of birth. The members are listed one after another down the column, one row per member.
I want to generate a household size variable that gives the number of people in each household. Because household members are listed this way, this is the command I used:
egen HHsize=count( mem), by(hh dist region )
bys hh dist region : gen dupl=[_n]
drop if dupl>1
but the outcome shows a large household size for each household.
I have also tried the following command:
bysort uniq_hid: gen size=_N // but the result only repeats the total number of observations
I have therefore browsed the data and show some observations below for your valued support.
Can anyone please help me find the appropriate code to generate the household size? Thank you.
[CODE]
* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 reg str4 dist1 str3 hh1 str9 uniq_hid byte mem int DA13
reg dist hh uniq_hid mem birth. In order: reg = region ID, dist = district ID, hh = household ID, mem = household member number, DA14/birth = year of birth of the member
"06" "0626" "001" "060626001" 1 1976
"06" "0626" "001" "060626001" 2 1953
"06" "0626" "001" "060626001" 3 1987
"06" "0626" "001" "060626001" 4 1994
"06" "0626" "001" "060626001" 5 2000
"06" "0626" "001" "060626001" 6 2003
"06" "0626" "001" "060626001" 7 2006
"06" "0626" "001" "060626001" 8 2011
"07" "0715" "001" "070715001" 1 1974
"07" "0715" "001" "070715001" 2 1976
"07" "0715" "001" "070715001" 3 2002
"07" "0715" "001" "070715001" 4 2007
"07" "0715" "001" "070715001" 5 2009
"07" "0715" "001" "070715001" 6 2013
"08" "0810" "001" "080810001" 1 1965
"08" "0810" "001" "080810001" 2 1970
"08" "0810" "001" "080810001" 3 1974
"08" "0810" "001" "080810001" 4 2008
"08" "0810" "001" "080810001" 5 2011
"08" "0810" "001" "080810001" 6 2012
"08" "0815" "001" "080815001" 1 1979
"08" "0815" "001" "080815001" 2 1963
"08" "0815" "001" "080815001" 3 1991
"08" "0815" "001" "080815001" 4 1973
"08" "0815" "001" "080815001" 5 1981
"08" "0815" "001" "080815001" 6 1988
"08" "0815" "001" "080815001" 7 1994
"08" "0815" "001" "080815001" 8 2001
"08" "0815" "001" "080815001" 9 2006
"08" "0815" "001" "080815001" 10 2008
"08" "0815" "001" "080815001" 11 2011
"08" "0815" "001" "080815001" 12 1928
"08" "0815" "001" "080815001" 13 1938
"09" "0903" "001" "090903001" 1 1988
"09" "0903" "001" "090903001" 2 1990
"09" "0903" "001" "090903001" 3 1992
"09" "0903" "001" "090903001" 4 1993
"09" "0903" "001" "090903001" 5 1994
"09" "0903" "001" "090903001" 6 1995
"09" "0903" "001" "090903001" 7 1996
"09" "0903" "001" "090903001" 8 1997
"09" "0903" "001" "090903001" 9 1999
"09" "0903" "001" "090903001" 10 2000
"09" "0909" "001" "090909001" 1 1977
"09" "0909" "001" "090909001" 2 1987
"09" "0909" "001" "090909001" 3 1991
"09" "0909" "001" "090909001" 4 1938
"09" "0909" "001" "090909001" 5 1942
"09" "0909" "001" "090909001" 6 1982
"09" "0909" "001" "090909001" 7 1985
"09" "0909" "001" "090909001" 8 1981
"09" "0909" "001" "090909001" 9 1953
"09" "0909" "001" "090909001" 10 1998
"09" "0909" "001" "090909001" 11 2001
"09" "0909" "001" "090909001" 12 2005
You can see from the table that each household (uniq_hid) has many members; the first household, for example, has 8 people.
Could anyone help me generate the household size in this situation?
Or is the command I used fine? If it is, then I can go ahead with it.
Thank you.
Regards,
Michael Asare
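For reference, here is a minimal sketch of the kind of result being asked for. It assumes that the data have one row per household member and that uniq_hid uniquely identifies a household. The names hhsize, hhsize2 and tag_hh are hypothetical and are not in the original data. It is only an illustration under those assumptions, not a confirmed solution.

```stata
* Sketch only: assumes one row per member and that uniq_hid identifies a household
bysort uniq_hid: gen hhsize = _N            // number of rows (members) in each household
egen hhsize2 = count(mem), by(uniq_hid)     // alternative: counts nonmissing mem per household
egen byte tag_hh = tag(uniq_hid)            // marks exactly one row per household
list uniq_hid hhsize hhsize2 if tag_hh, noobs
```

With the example listing above, the first household (060626001) would show a size of 8. Note that _N within a by-group is repeated on every member row of that household, so listing only the tagged rows gives one line per household without dropping any observations.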