  • Problems with zero in matrix


    Hi
    I have been working with a transition matrix in Stata recently; it is a 4 x 4 matrix. The element in the 4th row, 1st column is zero, and that disturbs my methodology. What can I do to replace the zero with a non-zero element while keeping the properties of the matrix unchanged?
    Thank you.

  • #2
    I doubt anybody can help you with this until you clarify:

    1) What is the methodology involved and why and how does the zero "disturb" it?

    2) What "properties" of the matrix do you want to keep unchanged? Obviously if you preserve all possible properties of the matrix you cannot change it at all. So what are the critical requirements?

    And probably you should show the entire matrix as well.



    • #3
      Thank you for replying
      I am comparing two transition matrices using a statistic called the Altham statistic. The Altham statistic d(P, J) is defined as the square root of the sum of the squared deviations of two-way log odds ratios from a hypothetical “full mobility” setting where said log odds ratios are zero. It is a distance measure. A zero element gives me zero as the value of the statistic, and some odds ratios become undefined.
      So I need a matrix with only non-zero elements. My matrix is based on a survey questionnaire.

      The matrix shows the association between rows and columns. If any element is zero, the odds ratios become incomparable.

      By properties I mean that the underlying row-column association of the matrix should remain unchanged.
      Matrix 1 (rows by columns)
      Row            P     G     B   Row sum
      P             39     4    42        85
      G              0     1     3         4
      B              5     4   440       449
      Column sum    44     9   485       538

      Matrix 2 (rows by columns)
      Row            P     G     B   Row sum
      P             16     3    12        31
      G              0     1     0         1
      B              1     0    51        52
      Column sum    17     4    63        84
      I hope this helps.
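
To make the zero-cell problem concrete, here is a minimal Python sketch of the Altham distance, assuming the form in which it is commonly written: the square root of the sum over all i, l, j, m of the squared log of p_ij p_lm q_im q_lj / (p_im p_lj q_ij q_lm). The function name `altham` and the choice to raise an error on non-positive cells are illustrative assumptions, not part of any Stata command or of the original post.

```python
import math

def altham(p, q):
    """Altham distance between two r x s tables with strictly positive cells (sketch)."""
    r, s = len(p), len(p[0])
    # A zero cell makes some log odds ratios undefined, so refuse to compute.
    if any(x <= 0 for row in p + q for x in row):
        raise ValueError("every cell must be positive for the log odds ratios to exist")
    total = 0.0
    for i in range(r):
        for l in range(r):
            for j in range(s):
                for m in range(s):
                    num = p[i][j] * p[l][m] * q[i][m] * q[l][j]
                    den = p[i][m] * p[l][j] * q[i][j] * q[l][m]
                    total += math.log(num / den) ** 2
    return math.sqrt(total)

# Matrix 1 from the post, compared with the all-ones "full mobility" table J.
matrix1 = [[39, 4, 42], [0, 1, 3], [5, 4, 440]]
ones = [[1] * 3 for _ in range(3)]
try:
    altham(matrix1, ones)
    zero_cell_ok = True
except ValueError:
    zero_cell_ok = False  # the zero in row G, column P blocks the computation

# A small, strictly positive toy table (hypothetical numbers) works fine.
toy = [[1, 2], [3, 4]]
toy_distance = altham(toy, [[1, 1], [1, 1]])
```

With the zero in row G, column P, the sketch refuses to return a value, which is the difficulty described above; the toy table only shows that the same function runs when every cell is positive.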



      • #4
        I am not previously familiar with this statistic. I've just done a little bit of reading about it.

        It is clear to me that taken literally, what you want is impossible. If you change the zeroes, you will necessarily change the odds ratios that measure the various associations, no matter how small a change you make.

        That said, there may be some convention about how to handle situations like this that people who use this statistic regularly follow. I don't know what that might be. I hope somebody more familiar with the Altham statistic will respond.



        • #5
          Is it possible to add 1 to the zero and other elements of the matrix? Will that shift the origin and keep the association unchanged?



          • #6
            Someone suggested, as an ad hoc measure, you replace that 0 frequency by a very
            small number, eg (1/2) if all the other frequencies are rather large.
            Or, combine rows/columns so that the problem of a zero entry no longer
            arises.

            I have a question about this suggestion, but I never got a reply. Can you help me with it? If I replace the zero with 1/2, should I also increase all the other elements? Also, I don’t understand the second suggestion.

            thank you



            • #7
              Re #5: No, this will change all of the associations. If you had an original odds ratio like 2*3/(4*5) = 0.3 and you add 1 to everything, you now have 3*4/(5*6) = 0.4. So adding 1 to everything changes everything.
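
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper `odds_ratio` is a hypothetical name, and the second check applies the same idea to the P and B rows and columns of Matrix 1 from #3, which contain no zero cell.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio a*d / (b*c) of the 2 x 2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# The numbers used in the reply: 2*3/(4*5) before, 3*4/(5*6) after adding 1.
before = odds_ratio(2, 4, 5, 3)
after = odds_ratio(2 + 1, 4 + 1, 5 + 1, 3 + 1)

# Same check on rows P, B and columns P, B of Matrix 1 from #3.
sub_before = odds_ratio(39, 42, 5, 440)
sub_after = odds_ratio(39 + 1, 42 + 1, 5 + 1, 440 + 1)

print(before, after)
print(sub_before, sub_after)
```

In both cases the odds ratio moves once 1 is added to every cell, so shifting the whole table does not preserve the association.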

              Re #6:
              Someone suggested, as an ad hoc measure, you replace that 0 frequency by a very
              small number, eg (1/2) if all the other frequencies are rather large.
              Well, this is often done to compute approximate chi square tests in contingency tables. It's a questionable practice even there. At least, since you won't be changing any numbers other than the zero cells, associations not involving the row or column of the zero cell will be left unchanged. But 0.5 is a completely arbitrary number, and the results would be sensitive to the choice of small number. For example, if you were to use 0.25 instead of 0.5, the associations that do involve the rows or columns of the zero cell would be different by a factor of 2 from what you would get using 0.5.
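
A short Python sketch of that sensitivity, using Matrix 1 from #3, where the zero cell sits in row G, column P. The helper `odds_ratio` and the variable names are illustrative assumptions; the replacement values 0.5 and 0.25 are the ones discussed above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio a*d / (b*c) of the 2 x 2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Rows P, G and columns P, G of Matrix 1: [[39, 4], [0, 1]].
# The zero in (G, P) is replaced by a small constant before computing.
or_half = odds_ratio(39, 4, 0.5, 1)
or_quarter = odds_ratio(39, 4, 0.25, 1)

# Rows P, B and columns G, B: [[4, 42], [4, 440]].
# This sub-table does not touch row G or column P, so the replacement
# constant never enters the calculation.
or_untouched = odds_ratio(4, 42, 4, 440)

print(or_half, or_quarter, or_quarter / or_half)
print(or_untouched)
```

Halving the replacement constant doubles the odds ratio that involves the zero cell, while the sub-table away from row G and column P is unaffected.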

              Or, combine rows/columns so that the problem of a zero entry no longer
              arises.
              Well, it depends. If I have a contingency table that I want to do a chi square test on and I have some small cells, my preferred approach is usually to combine some rows or columns to eliminate the small cells. Now, I have no idea what the variables you are dealing with are here. But let's just say for the sake of discussion that they are religions. If the zero cell involves, let's say, Catholicism, and there is a row or column for Protestantism, you might do this by combining Catholicism + Protestantism and calling it Christianity or something like that. This would be reasonably meaningful: yes, the most devout might object, but for many purposes, treating Christianity as a unified whole would be sensible. So if there are some similar categories in your rows and columns that could be combined in that way, that seems very sensible to me. The problem is if there are no similar categories to the one(s) with zero cell involvement. Say the zero cells involved Judaism. Well, it would make no sense (for most purposes, at least) to combine Judaism with, say, Buddhism, or Shinto. They have nothing in common. Maybe you could combine Judaism with Islam: it's the same deity and many of the laws and prophets are common to both (due to ancient Semitic customs underlying both). But if the substance of the study had any overtones of modern geopolitics, this would be utterly crazy given the intense hostility that prevails between many Muslims and Jews in the Middle East.

              My point is that in many other statistical situations, problems of small cells are often dealt with by combining categories--but you have to use some judgment about which categories can be combined with which other ones based on your knowledge and understanding of the subject matter. Sometimes there is an obvious combination to use; sometimes no reasonable combination exists. And sometimes there are combinations that are possible but don't look very desirable.
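
Mechanically, combining categories just means summing the corresponding rows and columns. Here is a minimal Python sketch on Matrix 1 from #3. Merging G with B is purely an assumption for illustration, since whether those categories can sensibly be combined depends on what they mean; the function name `collapse` is hypothetical.

```python
def collapse(table, groups):
    """Sum a square table over groups of indices, applied to rows and columns alike."""
    return [
        [sum(table[i][j] for i in row_group for j in col_group) for col_group in groups]
        for row_group in groups
    ]

# Matrix 1 from #3, with rows and columns ordered P, G, B.
matrix1 = [[39, 4, 42], [0, 1, 3], [5, 4, 440]]

# Hypothetical grouping: keep P on its own, merge G and B.
merged = collapse(matrix1, [[0], [1, 2]])

print(merged)
```

After this particular merge no cell is zero and the grand total is unchanged, but the table is now 2 x 2, so any association that distinguished G from B is lost.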

              All of that said, I think you need to review the literature on the use of Altham's index in your discipline. It is likely that others have encountered this situation before, and there may be a conventional approach that is used. If there is, you should probably follow the convention in your field, even if it's one that I have said doesn't seem very sound (though the best thing to do would also be to point out in your publication or presentation that the approach you are using is conventional but unsound statistically). If you can't find anything in the literature in your discipline, check out the literature in other disciplines that use Altham's index and see if there is a conventional approach: borrow that one. If there is nothing in the literature at all, I think you need to identify someone experienced in the use of this index (your literature search will surely turn up some suitable names) and contact one of them. (Or, you might get lucky: there could be a Forum member here who has this expertise and sees this thread and responds in the next day or two.)




              • #8
                Thank you so much for such a detailed explanation. I shall go through the literature that uses this index in other fields of study, as I didn’t find anything in my own field's literature.
