*STATA17*
For my thesis, I am trying to estimate the effect of distance to a park on several dependent variables, one of which is stress in the past 4 weeks. The data are neighbourhood-level percentages (for example, in neighbourhood 1, 24% of respondents reported feeling stress in the past 4 weeks).
I have only 48 neighbourhoods and a wide range of control variables, which are also percentages.
Running pwcorr on stress in the past 4 weeks and average distance to a park gives a significant negative association. I then used a scatterplot to check whether the relationship between the two variables is linear, which in my opinion it is not:
pwcorr Having_stress_last_4_weeks Distance_to_park, star(0.05) obs
twoway (scatter Having_stress_last_4_weeks Distance_to_park) (lfit Having_stress_last_4_weeks Distance_to_park)
Because the scatterplot also shows an outlier, I checked the correlation with both spearman and ktau as well:
spearman Having_stress_last_4_weeks Distance_to_park, stats(rho p)
ktau Having_stress_last_4_weeks Distance_to_park, stats(taua taub p)
Neither of these was significant.
My question is what to do from here. Given that the dependent variable is a percentage, is there a non-parametric regression that suits these data well?
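To make the question concrete, here is a minimal sketch of two candidates that exist in Stata 17: a fractional response model (fracreg), which is built for outcomes bounded between 0 and 1, and a kernel regression (npregress kernel), which is non-parametric. It assumes the stress variable is stored on a 0-100 scale; stress_prop is a hypothetical name for the rescaled variable, and neither model is claimed to be the right one for these data.

```stata
* Assumption: Having_stress_last_4_weeks is stored as 0-100.
* fracreg needs the outcome in [0,1], so rescale first.
* stress_prop is a hypothetical variable name.
generate double stress_prop = Having_stress_last_4_weeks / 100

* Fractional logit with robust standard errors
fracreg logit stress_prop Distance_to_park, vce(robust)

* Average marginal effect of distance on the proportion scale
margins, dydx(Distance_to_park)

* Non-parametric alternative: local-linear kernel regression.
* Standard errors are only reported if vce(bootstrap) is requested.
npregress kernel Having_stress_last_4_weeks Distance_to_park
```

With only 48 neighbourhoods, the kernel estimate is likely to be imprecise, and every control variable added to either model reduces the degrees of freedom further.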
An additional problem is that most of the other variables are also percentages. For example, "Male_gender" is almost perfectly correlated with "Female_Gender", and the same holds for "Education_Level", which is recorded as 3 separate percentages per neighbourhood (low, average and high).
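A common way to handle shares that belong to the same group is to leave one category out as the reference, so that the remaining shares are no longer (almost) a linear combination of each other. The sketch below shows that idea with a plain linear regression; Education_Average and Education_High are hypothetical names for two of the three education percentages, since the actual variable names are not given above.

```stata
* Check how strongly the two gender shares are related
correlate Male_gender Female_Gender

* Keep one share per group and omit the other as the reference category.
* Education_Average and Education_High are hypothetical names;
* the low-education share is the omitted reference here.
regress Having_stress_last_4_weeks Distance_to_park ///
    Female_Gender Education_Average Education_High

* Variance inflation factors for the remaining regressors
estat vif
```

Each coefficient on a share is then read relative to the omitted category, for example a higher percentage of highly educated residents compared with low-educated residents.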
I hope I have worded my question clearly; the examples below are attached as png images.