My dataset has 120,000 observations of air pollution data (variable name PM25) recorded by 21 people (variable name IDnum) before and after an intervention (variable name pre_post), with location recorded as well (variable name timeactivity). We've also flagged when PM25 data were missing using the variable named missing. FixedPM is the outdoor PM25 amount.
Here is my question: I am collapsing the data two different ways for two tables. Unfortunately, my mean PM25 values for home (timeactivity==4) do not match across the two tables, and I think this is due to a problem with my collapse command or possibly with missing data.
PM25=6.3 at home before any collapse commands are used.
PM25 at home is 6.3 with this very brief collapse command (it works).
Code:
collapse (mean) PM25 FixedPM (sum) obs_min, by(timeactivity)
But PM25=6.9 when I use this collapse command:
Code:
collapse (mean) PM25 FixedPM (sum) obs_min, by(ID_Sess timeactivity)
And PM25=6.9 when I run the collapse command I planned for the analysis:
Code:
collapse (mean) PM25 FixedPM (sum) obs_min (sum) miss_min, by(IDnum ID_Sess pre_post timeactivity Mit_Cat missing)
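One check that might help narrow this down is to compare the home mean from the uncollapsed data with the unweighted and weighted means of the session-level means. This is only a sketch: PM25_mean and n_obs are hypothetical names introduced here for illustration, and it assumes timeactivity is numeric with 4 meaning home.

```stata
* Sketch only: PM25_mean and n_obs are hypothetical names, not variables in the dataset
preserve

* Home mean straight from the uncollapsed data
summarize PM25 if timeactivity == 4

* Collapse by session and location, keeping the number of nonmissing PM25 rows per cell
collapse (mean) PM25_mean = PM25 (count) n_obs = PM25, by(ID_Sess timeactivity)

* Unweighted mean of the session-level means: every session counts equally
summarize PM25_mean if timeactivity == 4

* Mean weighted by the rows behind each session mean: should reproduce the raw mean
summarize PM25_mean if timeactivity == 4 [aweight = n_obs]

restore
```

If the weighted figure comes back at 6.3 while the unweighted one is closer to 6.9, the gap would be coming from averaging group means that rest on different numbers of rows, rather than from the collapse command itself.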
Thank you for your advice on this confusing issue! Any input on what may be going wrong or what to check would be appreciated. Happy to upload data here if that would be helpful as well.