Hi everyone,
I am using DHS data for all countries for some exploratory analysis. The variable religion in the DHS dataset has missing values for some countries (either entirely or partly).
a) For countries where religion is missing completely:
From external sources, I can get the proportion of people that belong to a certain religion in a specific country. Say, for instance, India (country code=10) is 94% Hindu (religion_code=1), 3% Muslim (religion_code=2), 2% Christian (religion_code=3), and 1% other (religion_code=4). Is there a way to replace the missing values for religion in India at random, but in line with these shares? That is, randomly assign 1 (Hindu) to 94% of the missing values, 2 (Muslim) to 3%, and so on.
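To make case (a) concrete, here is a minimal sketch in Python using only the standard library. It assumes missing values are stored as None; the function name impute_from_shares, the SHARES table, and the seed are hypothetical and only illustrate the idea. Each missing value gets an independent weighted draw, so the imputed shares come out close to 94/3/2/1 rather than matching them exactly.

```python
import random

# Assumed external shares for country code 10 (the figures from the example above).
SHARES = {1: 0.94, 2: 0.03, 3: 0.02, 4: 0.01}

def impute_from_shares(values, shares, seed=12345):
    """Return a copy of values where each None is replaced by a code
    drawn at random with the given probabilities."""
    rng = random.Random(seed)  # fixed seed so the imputation is reproducible
    codes = list(shares)
    weights = [shares[c] for c in codes]
    return [
        rng.choices(codes, weights=weights)[0] if v is None else v
        for v in values
    ]

# Hypothetical data: 10,000 respondents, religion missing for all of them.
religion = [None] * 10_000
imputed = impute_from_shares(religion, SHARES)
share_hindu = imputed.count(1) / len(imputed)
```

If the shares must hold exactly rather than approximately, one alternative is to build a list with the exact number of each code and shuffle it before assigning.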
b) For countries where religion is partly missing:
How can I replace the missing values randomly, based on the existing distribution of religion data for that country? So if the non-missing data for a particular country are 50% Hindu and 50% Christian, then 50% of that country's missing religion values are replaced with Hindu, and so on.
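For case (b), a minimal sketch in Python (standard library only) could look like the following. It assumes missing values are stored as None and that country and religion come as two parallel lists; the function name impute_by_country and the toy data are hypothetical. Each missing value is drawn from the observed distribution within the same country, and countries with no observed values are left untouched, since they belong to case (a).

```python
import random
from collections import Counter, defaultdict

def impute_by_country(countries, religions, seed=12345):
    """Replace each None in religions with a code drawn at random from the
    observed (non-missing) distribution of the same country."""
    rng = random.Random(seed)  # fixed seed so the imputation is reproducible

    # Count the observed religion codes within each country.
    observed = defaultdict(Counter)
    for c, r in zip(countries, religions):
        if r is not None:
            observed[c][r] += 1

    result = []
    for c, r in zip(countries, religions):
        if r is None and observed[c]:
            codes = list(observed[c])
            weights = [observed[c][k] for k in codes]
            r = rng.choices(codes, weights=weights)[0]
        result.append(r)  # stays None if the country has no observed values
    return result

# Hypothetical data: country 20 is half code 1, half code 3 among observed
# values; country 30 has no observed values at all.
countries = [20] * 8 + [30] * 3
religions = [1, 1, 3, 3, None, None, None, None] + [None] * 3
filled = impute_by_country(countries, religions)
```

With many missing values per country the imputed shares approach the observed ones; with only a handful they can differ noticeably, because each value is an independent draw.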
Any help on this will be appreciated.
Danish
P.S. I understand that I'll have to calculate religion shares for each country and each year separately. For now, however, let's assume the religion shares remain constant across all years.