Dear Statalist members,
I am working with data about nutritional intake where participants describe what they eat during two consecutive days. Each food is entered as a new observation in the database and is categorised by a food group and subgroup, which appear as the variables "group" and "subgroup1". There are 18 groups, coded from 1 to 18. A quantity ("cons_qty") is also associated with every food consumed.
I successfully used fillin as suggested by Clyde Schechter to create new observations for groups when these observations did not exist in the database (see:https://www.statalist.org/forums/for...-in-a-database). I would have another question related to this.
Each group is further divided into subgroups. The different groups don't have the same number of subgroups, for example group 4 has 5 subgroups whereas group 7 has 8 subgroups. I want to create new observations for subgroups that were not consumed, in the same idea that for the groups.
Here is an example of the dataset:
I tried using:
but the problem is that it creates 8 subgroups for each group (8 is the highest number of subgroups inside a group in my dataset), independently of the group concerned.
I saw that the option by() is not allowed for fillin and I also tried using if to distinguish between the different groups, it didn't work either. I checked mipolate but as I understood, it doesn't do what I am looking for.
My solution would be to use fillin on subsets of my data where I keep only one group and append all the subsets afterwards. Yet, it doesn't seem like the most efficient way to do it. Would you have another approach to suggest?
I am working with data about nutritional intake where participants describe what they eat during two consecutive days. Each food is entered as a new observation in the database and is categorised by a food group and subgroup, which appear as the variables "group" and "subgroup1". There are 18 groups, coded from 1 to 18. A quantity ("cons_qty") is also associated with every food consumed.
I successfully used fillin as suggested by Clyde Schechter to create new observations for groups when these observations did not exist in the database (see:https://www.statalist.org/forums/for...-in-a-database). I would have another question related to this.
Each group is further divided into subgroups. The different groups don't have the same number of subgroups, for example group 4 has 5 subgroups whereas group 7 has 8 subgroups. I want to create new observations for subgroups that were not consumed, in the same idea that for the groups.
Here is an example of the dataset:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float id byte(int_num group subgroup1) float cons_qty 1 21 1 . 0 1 21 2 0 261 1 21 2 1 32 1 21 2 2 132.1875 1 21 2 2 76.8 1 21 2 2 111 1 21 2 3 40.5 1 21 3 . 0 1 21 4 1 207 1 21 4 1 138 1 21 5 1 64.375 1 21 5 1 64.375 1 21 5 5 21.8 1 21 5 5 125 1 21 6 . 0 1 21 7 . 0 1 21 8 . 0 1 21 9 . 0 1 21 10 1 25.2 1 21 11 2 10 1 21 12 2 10 1 21 13 3 187.5 1 21 13 3 187.5 1 21 13 4 500 1 21 13 4 500 1 21 13 4 300 1 21 13 4 300 1 21 14 . 0 1 21 15 4 28 1 21 15 4 .6 1 21 16 . 0 1 21 17 2 1 1 21 17 2 1 1 21 18 . 0 1 22 1 . 0 1 22 2 1 15.208208 1 22 2 2 30.416416 1 22 2 2 30.416416 1 22 3 . 0 1 22 4 1 54.6 1 22 4 1 124.2 1 22 4 1 57.6 1 22 4 1 138 1 22 4 1 36.399998 1 22 5 1 64.375 1 22 5 1 64.375 1 22 5 1 57.68 1 22 5 4 100 1 22 6 3 28 1 22 6 3 27.95 1 22 6 3 40 1 22 6 3 55.9 1 22 7 4 90 1 22 8 . 0 1 22 9 . 0 1 22 10 1 26.614365 1 22 10 2 2.5 1 22 11 1 5.25 1 22 11 1 7 1 22 12 . 0 1 22 13 3 187.5 1 22 13 3 187.5 1 22 13 4 500 1 22 13 4 300 1 22 13 4 500 1 22 14 . 0 1 22 15 3 .1520821 1 22 15 4 19.01026 1 22 15 4 .3802052 1 22 15 4 3.802052 1 22 16 1 123.2 1 22 17 2 1 1 22 17 2 1 1 22 18 . 0 1 21 1 . 0 2 21 2 1 24.14001 2 21 2 2 48.28003 2 21 2 6 48.28003 2 21 3 . 0 2 21 4 . 0 2 21 5 1 37.5 2 21 6 3 47.5 2 21 6 3 47.5 2 21 7 4 100 2 21 7 4 90 2 21 8 . 0 2 21 9 . 0 2 21 10 0 42.24502 2 21 10 1 1.8 2 21 11 1 8 2 21 11 1 8 2 21 11 1 8 2 21 12 . 0 2 21 13 3 150 2 21 13 3 112.5 2 21 13 3 150 2 21 13 4 500 2 21 13 4 1000 2 21 13 4 500 2 21 13 4 500 end
I tried using:
Code:
fillin id int_num group subgroup1 replace cons_qty = 0 if _fillin
I saw that the option by() is not allowed for fillin and I also tried using if to distinguish between the different groups, it didn't work either. I checked mipolate but as I understood, it doesn't do what I am looking for.
My solution would be to use fillin on subsets of my data where I keep only one group and append all the subsets afterwards. Yet, it doesn't seem like the most efficient way to do it. Would you have another approach to suggest?
Comment