Hi everyone,
I have a dataset summarising the population of each district in a country, with 5 different sub-populations, and 1 total.
That is, for district "Paris", I have 6 observations: Total, Refugees, IDPs, Other Affected Populations, Unknown, Disabled. (see bottom of my message)
For each district, I want to check that Total = Refugees + IDPs + OAP + Unknown.
(Notice one difficulty: Disabled is not included in the total!! i.e. Total ≠ Refugees + IDPs + OAP + Unknown + Disabled)
My idea would be to create a new row for each district, checking that the numbers match. If the total is correct, then Number = 0. Otherwise, Number = 1.
In the case of the district of Paris, the row would look like this:
Paris | checktotal | 0 |
I feel like the solution probably includes a "bysort", but I can't quite get to it... My first "draft" is:
bysort(District): egen checktotal = 0 if sum(Number)==TOTAL
Could you please help me?
Many thanks!!
District | Population | Number |
Paris | Refugees | 3 |
Paris | IDPs | 3 |
Paris | OtherAffected | 3 |
Paris | Unknown | 3 |
Paris | TOTAL | 12 |
Paris | Disabled | 4 |
Lyon | Refugees | 5 |
Lyon | IDPs | 5 |
Lyon | OtherAffected | 5 |
Lyon | unknown | 5 |
Lyon | Total | 20 |
Lyon | Disabled | 11 |
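
For reference, the check described above could be sketched in Stata roughly as follows. This is only a sketch under assumptions, not a tested solution: it assumes Population is a string variable and Number is numeric, it compares labels case-insensitively because the listing above mixes "TOTAL"/"Total" and "Unknown"/"unknown", and it stores the flag in a new variable rather than a new row. The helper names is_total, is_part, parts_sum and stated_total are made up for illustration.

```stata
* Flag the total row and the rows that should add up to it
* (everything except the total itself and Disabled)
gen byte is_total = lower(trim(Population)) == "total"
gen byte is_part  = !is_total & lower(trim(Population)) != "disabled"

* Per district: sum of the components, and the stated total
bysort District: egen double parts_sum    = total(Number * is_part)
bysort District: egen double stated_total = total(Number * is_total)

* 0 if the components match the stated total, 1 otherwise
gen byte checktotal = parts_sum != stated_total
```

With the example data above, both Paris (3+3+3+3 = 12) and Lyon (5+5+5+5 = 20) would get checktotal = 0. Something like "list District if checktotal == 1 & is_total" would then show the districts whose totals do not add up.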