  • Surprisingly negative coefficients in a logistic regression

    Hello,

    First of all, I just started using Stata, so I hope my questions are not too stupid.
    I am writing a paper on the effects of innovation cooperation on the success of a product (1) / process (2) innovation (indicator: market launch).
    With a simple logistic regression I got positive effects (as expected) for both, though not very strong ones.
    Then I added some control variables and weird things happened.
    In the first case, the coefficient of innovation cooperation turned to -2.
    In the second case, the coefficient of the control variable Research & Development Effort was -1.3, which would mean that R&D effort affects process innovation success negatively.
    Those numbers are really weird, and I don't see the problem: the coding is correct, meaning that all the numbers go up the way anyone would expect them to. Also, I use high-quality panel data from a well-known institute.

    Can somebody help me with this?

    I attached the screenshots of the outputs

    Thanks very much and sorry for my English.

    Thomas


    • #3
      Screen shots are very hard to read. Instead, click on the underlined A next to the smiley on the upper right hand side when writing messages. Then click on the # sign, and then copy your code and output between the two code tags.

      In both of your screen shots, N = 30 and nothing is statistically significant. It is hard for me to believe that this is high quality panel data from a well known institute. (What kind of high quality panel data set has only 30 records???) But if it is, then I suspect you have somehow zapped the data on some earlier step, costing you huge numbers of cases. Or, maybe there is a huge amount of missing data on one or more of your variables. Double check your data, and figure out why you have so few cases.

      EDIT: If N really does equal 30, then the data set is probably much too small for you to do anything useful with it.
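
      One way to check where the cases go is sketched below. It is only an illustration on toy data: the variable names success, coop, rd_effort, and ctrl_index are hypothetical stand-ins for the outcome, the cooperation measure, and the controls, not names from the original data set.

      ```stata
      * Toy data with hypothetical variable names; replace with your own variables.
      clear
      input success coop rd_effort ctrl_index
      1 1 2 5
      0 0 1 .
      1 1 . 3
      0 1 0 .
      1 0 2 4
      end

      count                                                    // observations in memory
      misstable summarize success coop rd_effort ctrl_index    // missing counts per variable
      egen nmiss = rowmiss(success coop rd_effort ctrl_index)  // missing values per row
      tabulate nmiss                                           // only rows with nmiss == 0 can enter the model
      count if nmiss == 0                                      // complete cases available to logit
      ```

      Estimation commands such as logit drop any observation with a missing value on the outcome or a predictor, so the last count should be close to the N reported in the output.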
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam



      • #4
        Thank you very much for the quick answer. I totally overlooked the number of observations. Of course that is far too few; the dataset has more than 4000 observations.
        I have also identified the problem, but I haven't been able to solve it yet.
        The last variable is the sum of 11 variables (each coded 0-1-2). The problem is that Stata shows me a missing value in the new variable where the sum should be 0.
        For example, if the 11 variables are all 0, Stata shows me a missing value for the new one. But if there are nonzero values like 0 + 2 + 1 + 2..., it adds them correctly and shows me 5.

        How can I fix this?



        • #5
          Please show exactly what you typed when computing your variable, and list the values for a few of the cases that ended up being coded as missing. My guess is those cases didn't have 11 0s. Rather, they had at least one value that was missing, so when you added the values together you got missing for the new variable.

          Type -help egen-. My guess is that what you really want to do is use the rowmean function. Or, if you do want missing values treated as 0s, use the rowtotal function. But be clear on what a missing data code means, and think about how best to handle it.

          If egen doesn't sound right, then that just makes it all the more important to show exactly what you did and what some of the data look like; otherwise we can't tell what the problem is or how to solve it.
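
          To illustrate the guess above, here is a small sketch on made-up data. It uses three hypothetical items v1-v3 instead of the 11 real ones, and it assumes the new variable was built with the + operator, which the thread does not confirm.

          ```stata
          * Toy data: three hypothetical items coded 0/1/2, some with missing values.
          clear
          input v1 v2 v3
          0 0 0
          0 2 1
          0 . 0
          . . .
          end

          generate sum_plus = v1 + v2 + v3        // missing as soon as any item is missing
          egen sum_rowtotal = rowtotal(v1 v2 v3)  // missing items are treated as 0
          egen mean_row = rowmean(v1 v2 v3)       // mean of the non-missing items only
          list
          ```

          In the third row, sum_plus is missing even though every observed item is 0, while rowtotal() returns 0. A row of true zeros, like the first one, sums to 0 under either approach.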


          • #6
            Well, I can't think of any Stata commands that would add up 11 variables and then show a missing value if the total were zero--so somebody must have set up that variable specifically that way, and, who knows, maybe for good reason. But I think you would be best off disregarding that last variable and doing the summation yourself. The easy way is
            Code:
            egen new_last_var = rowtotal(varlist)
            where you replace varlist by a list of the 11 variables. If they are sequentially ordered variables in your data set, something like v1-v11 would be the ticket. If they are all the variables starting with a certain prefix, then stub* could substitute for varlist. Or if they are just 11 variables with no particular pattern to their names, you will have to just list them all out there.
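
            The varlist forms mentioned above can be sketched as follows. The data and the names v1-v11 and idx_* are invented for illustration. The missing option shown at the end is optional: it keeps the total missing when every component is missing, whereas the default returns 0 for such rows.

            ```stata
            * Toy data: 11 hypothetical items named v1-v11, coded 0/1/2.
            clear
            set obs 3
            forvalues i = 1/11 {
                generate v`i' = 0              // start with all items at 0
            }
            replace v2 = 2 in 2                // second row: a few nonzero items
            replace v5 = 1 in 2
            foreach v of varlist v1-v11 {
                replace `v' = . in 3           // third row: every item missing
            }

            egen idx_range  = rowtotal(v1-v11)           // sequentially ordered variables
            egen idx_stub   = rowtotal(v*)               // all variables sharing a prefix
            egen idx_strict = rowtotal(v1-v11), missing  // all-missing rows stay missing
            list idx_range idx_stub idx_strict
            ```

            Whether an all-missing row should count as 0 or as missing depends on what the missing codes mean in the survey, so the choice between the default and the missing option is a substantive one.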



            • #7
              Thank you so much for your help guys. It works fine now

