  • Test of Independence: Continuous IV; Categorical DV

    Hi,
    I have spent hours trawling through the deepest depths of Google trying to find a test that will do this for me. Eventually I found the test of significance contained within an OLOGIT output would work for the ordinal categorical DVs (it's measured in Z scores, but as I understand it, they aren't standard Z scores), but it's no good for the categorical DVs that are not ordinal.

    My data is based on a survey of behaviours. The IV is Age, and the dependent variables include:

    Ordinal: "How Frequently Do you Take Care of the Kids While your Partner is Sick?" (Categories: Never, Sometimes, etc.)
    Non-ordinal: "Who has the Final Say Regarding Spending Time with Relations?" (You, Partner, Both, etc.)

    I've been stumped on this for weeks. I've never used a forum like this before (preferring to be a parasite searching older posts!), and I really hope you can help!

    Cheers,
    Seán

  • #2
    If X is independent of Y, then Y is independent of X, and vice versa. So it is easy enough to turn this around and think of Age as the dependent variable. With a nominal variable such as who has final say regarding spending time with relations, you can just look at means of age within each category. -regress- followed by -margins-, perhaps with the -contrast- option will tell you what you need. For an ordinal variable such as frequency of taking care of the kids, you might be interested only in monotone associations. In that case think about -nptrend-.


    • #3
      Everything Clyde says makes sense, but I'd also like to throw out there the idea of using -mlogit-, multinomial logit, for the categorical responses. If age really is your only IV, then turning DV and IV on their heads makes sense. However, if you did end up wanting to take into account income, gender, race, or whatever else in addition to age, then you would need an appropriate model, and multinomial logit is the main one for categorical outcomes.


      • #4
        Originally posted by Clyde Schechter
        If X is independent of Y, then Y is independent of X, and vice versa. So it is easy enough to turn this around and think of Age as the dependent variable. With a nominal variable such as who has final say regarding spending time with relations, you can just look at means of age within each category. -regress- followed by -margins-, perhaps with the -contrast- option will tell you what you need. For an ordinal variable such as frequency of taking care of the kids, you might be interested only in monotone associations. In that case think about -nptrend-.
        Hi Clyde,
        First, thank you for your reply. I've never been part of a forum like this, and to get a reply so fast was a great surprise.
        Could you explain a little more about -regress- and -margins-? It seems to me that -margins- helps with understanding interactions among variables on the right-hand side of a regression equation. This is where I'm confused: if you suggest forgetting the distinction between DV and IV, how can you do a regression? Surely that distinction is fundamental?

        I should probably also be a bit clearer in what I am trying to do, as it might help you understand where I'm coming from.

        I was asked to run bivariate analyses on about 30 "behavioural" (dependent) variables with 8 background (explanatory) variables. The reason I'm only focusing on age is that the other explanatory variables are categorical, which makes the analysis much easier. Age is my only continuous variable.
        For the initial analysis I was asked to run simple bivariate tests of independence of all DVs against all IVs.

        Unfortunately I haven't yet been given the ultimate goal of the analysis, other than to construct some sort of logit regression, but my next step will be to figure out the most useful way to work with the data once I can see which variables have significant relationships.

        I understand this may not be (and probably is not) the best way to do this kind of work. But this is how I was asked to do it, and it is my first task, so I would like to try to stick to my instructions until I get a bit more confident in my own abilities.

        Cheers,
        Seán.


        • #5
          So, you have a different task from what you originally said (and what I responded to). Independence between two variables has no direction. But if you are talking explanatory and outcome, then there is a direction implied and you do not necessarily have latitude to switch things around. In a regression model, there is a clear distinction between DV and IV. For just determining independence, there is no such distinction at all.

          So with an ordinal DV and age as IV, ordinal logistic regression (-ologit-) is one place to start. Another possibility that is a bit simpler would be the Spearman rank correlation (-spearman-). For an unordered multi-category outcome with age as predictor, you probably need to go to multinomial logistic regression (-mlogit-).
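
To make the rank-correlation suggestion concrete, here is a minimal sketch in Python rather than Stata. It is only an illustration of what Spearman's rho measures, not what -spearman- runs internally. The data and the 0 to 3 coding of the frequency item are made up, and `average_ranks` and `spearman_rho` are hypothetical helper names.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up illustration data: age, and a frequency item coded 0=Never ... 3=Always.
age = [23, 31, 35, 42, 47, 52, 58, 64]
care_freq = [0, 0, 1, 1, 2, 2, 3, 3]
rho = spearman_rho(age, care_freq)
print(round(rho, 3))
```

Because an ordinal item has only a few categories, ties are unavoidable, which is why the sketch uses average ranks instead of the shortcut formula that assumes no ties.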

          Unfortunately I haven't yet been given the ultimate goal of the analysis...
          To me, that's a big red flag. I think you have not only a right to know what the goal of the analysis you are being asked to work on is, you absolutely need to know it so that you can make appropriate professional judgments about how to proceed. I think you need to approach the people you are working for and get this information.


          • #6
            Originally posted by Clyde Schechter
            To me, that's a big red flag. I think you have not only a right to know what the goal of the analysis you are being asked to work on is, you absolutely need to know it so that you can make appropriate professional judgments about how to proceed. I think you need to approach the people you are working for and get this information.

            Hi Clyde,
            I think you have a fair point, but the original plan was to get this stage of the analysis done and then sit down and discuss the next stage with the person I am now volunteering for. At the time I was interning in the office; I no longer am, so I am expected to be a little more independent in my work. I will, however, ask for better direction once I have this stage complete.

            In terms of -mlogit-: I just want to confirm that I will only be taking the z-scores from the output of the regression and ignoring all the rest. Is there no other, simpler way to test for significance with the type of variables that I have?

            Thanks again for your help,
            Seán
            Last edited by Seán McKiernan; 28 Jan 2015, 13:00.


            • #7
              In terms of -mlogit-: I just want to confirm that I will only be taking the z-scores from the output of the regression and ignoring all the rest. Is there no other, simpler way to test for significance with the type of variables that I have?
              Multinomial logistic regression is basically a series of ordinary logistic regressions, with a single base category for the outcome variable (DV in your terms) and a separate single non-base category in each of the "slices" of the analysis. If you look at the z-scores you will be looking only at the statistical significance of the odds ratio associating your predictor variable (IV) with that particular level of the outcome variable vs the base level. While I don't support the whole approach of using significance testing of bivariate associations to decide what to include in a multivariate model (a long discussion I won't go into), assuming this is what you want to do, looking at these individual z-scores will not be helpful. What will you do if you have a significant association between age and outcome 2 vs base outcome, but not outcome 1 vs base outcome? Is age in the model or not? It makes more sense to look at the omnibus association between the categorical outcome as a whole (across all its levels) and your predictor variable. From a significance testing perspective, that is given by the overall model chi-square. So if I were in your shoes, the overall model chi-square is what I would attend to, not the separate z-tests.
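
For reference, the overall model chi-square described above is, under default (non-robust) estimation, a likelihood-ratio statistic comparing the fitted model with a constant-only model. A sketch in my own notation, which is an assumption and not from the thread: $K$ is the number of outcome categories, $p$ the number of predictors, $\ell_0$ the log likelihood of the constant-only model and $\ell_1$ that of the model including age.

```latex
\[
  \chi^2_{\mathrm{LR}} \;=\; 2\,(\ell_1 - \ell_0),
  \qquad
  \chi^2_{\mathrm{LR}} \;\overset{H_0}{\sim}\; \chi^2_{(K-1)\,p}
\]
% With age as the only predictor, p = 1, so the test has K - 1 degrees of freedom:
% one age coefficient for each non-base outcome category, all tested jointly.
```

This is why the omnibus test gives one answer for age across all outcome levels, whereas each z-score only speaks to one category against the base.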

              As for a simpler way, I'll reiterate what I said in the first post. If you are not looking at directionality of associations here and just want to know if the two variables are independent or not, to me the simplest way is to just look at the mean value of age in each response category, and the significance test would be the result of an ANOVA. Whether that seems simpler is in the eye of the beholder, of course. But if you are thinking causally, then this approach would be, in a sense, backwards. I don't know of any simpler approach that preserves the directionality.
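
As an illustration of the "mean age in each response category" idea, here is a minimal sketch in Python rather than Stata. It computes the one-way ANOVA F statistic and, as a stand-in for the F-distribution p-value that a statistics package would report, a permutation p-value. The data are made up, and `anova_f` and `permutation_p` are hypothetical helper names.

```python
import random
from statistics import mean


def anova_f(values, groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    grand_mean = mean(values)
    n, k = len(values), len(by_group)
    ss_between = 0.0
    ss_within = 0.0
    for members in by_group.values():
        group_mean = mean(members)
        ss_between += len(members) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in members)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p(values, groups, n_perm=2000, seed=1):
    """Share of random label shuffles whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = anova_f(values, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if anova_f(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p above zero


# Made-up illustration data: age by answer to the "final say" question.
age = [24, 27, 29, 31, 38, 41, 44, 46, 55, 58, 61, 63]
final_say = ["You"] * 4 + ["Partner"] * 4 + ["Both"] * 4

f_stat = anova_f(age, final_say)
p_value = permutation_p(age, final_say)
print(round(f_stat, 2), p_value)
```

The test is symmetric in the sense discussed above: it asks only whether the age distribution differs across response categories, and says nothing about which variable drives which.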


              • #8
                This has been very helpful. I really appreciate your thorough advice, Clyde. I am definitely not married to the -mlogit- strategy; it feels a bit like square peg, round hole. I'll have a think about ANOVA again, as I understand my options much better now. I think I'd probably be happy to sacrifice directionality.

                Appreciate everything!
                Seán
