  • Anova

    I am starting to use Stata for the first time. I managed to import data from Excel and run a few commands. I want to do a two-way ANOVA and I used this command: anova trapsite april## june## september## december

    and I got this result:

    'trapsite' found where numeric variable expected
    r(7);
    How can I best perform this two-way ANOVA in Stata 11? It seems my variable trapsite is not being recognized as numeric. How do I make it numeric?

  • #2
    Code:
    destring trapsite, gen(trapsite2)
    might be all you need. However, if the variable's values contain non-numeric characters, which is what caused the import to treat it as a string, you may need to do some data cleanup. If you look things over and know that it's just a few nuisance characters you want set to missing, then
    Code:
    destring trapsite, gen(trapsite2) force
    will force it to set the non-numeric values to missing. But if the strings are meaningful and you can recover usable data from them, post an example of your data (look for dataex in the FAQ) and we can advise on how to transform them.
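
    Before reaching for force, it can help to see which values are blocking the conversion. The following is only a sketch: it enters two of the trapsite values that appear later in this thread so that it runs on its own, and it relies on real() returning missing for any string that cannot be read as a number.

    ```stata
    * Two trapsite values from this thread, entered as strings
    clear
    input str11 trapsite
    "location 1"
    "location 2"
    end

    * real() gives missing when a string is not a valid number,
    * so this lists every non-empty value that destring would refuse
    list trapsite if missing(real(trapsite)) & trapsite != ""
    ```

    If the listed values are only nuisance text, force or ignore() is reasonable; if they carry information, they need cleaning first.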



    • #3
      trial stat.dta
      Attached is my data set. I am failing to destring it.



      • #4
        stephen musasa: just to clarify an important issue, I fear there is a contradiction in what you said:

        I want to do a twoway ANOVA and i used these commands: anova trapsite april## june## september## december
        Actually, this command does not specify a two-way ANOVA, but a four-way ANOVA with full interactions.

        We don't have a clue what the months really represent in the model. Are they dummies? Are they mutually exclusive? If so, you could just use factor notation. Are they continuous variables, or repeated measures? If so, the model should ideally account for that.

        Last but not least, considering the panoply of interaction terms, you may find it difficult to interpret the results, let alone "explain" them to a broad audience.

        I kindly suggest that you consider, at least as a first try, simplifying the model.

        Hopefully that helps!

        Best,

        Marcos





        • #5
          Ah. Every observation has the word "location" in it. The "ignore" option took care of that. Then, for whatever reason, it imported your other variables as strings. Normally I don't recommend the "replace" option, but since they're all clearly supposed to be numeric, I used that. Then comes your final complication: your factor variables are non-integers, so anova won't run. This last complication is up to you to figure out, since you know your data and what kind of model you want to run.

          BTW -- see how I used dataex to make all your data easily accessible? We prefer people to use that instead of posting Stata data files, for various reasons including the fact that I have Stata 13 and you have Stata 11, so you couldn't read data I might generate.

          Code:
          clear
          input str11(trapsite april june september december)
          "location 1" "17.76666667" "26.26666667" "19.53333333" "31.2" 
          "location 2" "18.63333333" "26" "19.83333333" "30.66666667" 
          "location 3" "18.96666667" "23.3" "19.23333333" "28.06666667" 
          "location 4" "17.13333333" "23.63333333" "19.23333333" "28.5" 
          "location 5" "17.5" "25.3" "18.3" "30.23333333" 
          "location 6" "18.13333333" "25.33333333" "19.4" "30.1" 
          "location 7" "18.33333333" "25.9" "21.76666667" "29.6" 
          "location 8" "18.3" "25.43333333" "19.6" "29.46666667" 
          "location 9" "17.43333333" "27.5" "20.43333333" "30.56666667" 
          "location 10" "17.23333333" "26.96666667" "19.46666667" "31.46666667" 
          "location 11" "21.1" "26.7" "22.96666667" "30.26666667" 
          "location 12" "19.26666667" "25.33333333" "22.53333333" "28.9" 
          "location 13" "20" "26.53333333" "22.76666667" "30.1" 
          "location 14" "19.1" "25.6" "22.03333333" "31.43333333" 
          "location 15" "19.96666667" "27.23333333" "22.33333333" "30.93333333" 
          "location 16" "16.43333333" "23.4" "18.63333333" "29.1" 
          "location 17" "12.9" "23.7" "19.66666667" "25.36666667" 
          "location 18" "11.7" "23.56666667" "19.2" "27.83333333" 
          "location 19" "13.26666667" "23.06666667" "19.96666667" "25.7" 
          "location 20" "9.266666667" "23.66666667" "19.56666667" "26.06666667" 
          end
          
          destring trapsite, gen(trapsite2) ignore("location ")
          destring april, replace
          destring june, replace
          destring september, replace
          destring december, replace



          • #6
            But now that the data management issue is worked out, it is vital to realise that the analysis of variance you are asking for is meaningless anyway, or so I guess. You have 20 locations, which cannot meaningfully be the response (outcome, dependent) variable: presumably they are a covariate. Also, I guess that you have different observations for different months, so these are repeated measures, are they not? Further, I can't see any obvious rationale for interactions there.

            Repeated measures ANOVA is what you should read up on.
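
            A minimal sketch of what a repeated measures set-up could look like, assuming the four month variables hold the same measurement taken at each trap site in each month (the thread does not confirm this). The names y and month are hypothetical, and only the first three sites from #5 are entered to keep it short. Variables are renamed one at a time so that the code should also run in Stata 11.

            ```stata
            * First three sites from #5, already numeric (trapsite2 = site number)
            clear
            input trapsite2 april june september december
            1 17.76666667 26.26666667 19.53333333 31.2
            2 18.63333333 26 19.83333333 30.66666667
            3 18.96666667 23.3 19.23333333 28.06666667
            end

            * Give the month variables a common stub (y) plus the month number
            rename april     y4
            rename june      y6
            rename september y9
            rename december  y12

            * Wide to long: one row per site and month
            reshape long y, i(trapsite2) j(month)

            * Site is the repeatedly measured unit, month the within-site factor
            anova y trapsite2 month, repeated(month)
            ```

            With one value per site and month there is no room for a site by month interaction term: that variation is what serves as the residual. Whether this is the right model still depends on how the data were collected.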



            • #7
              Thank you, Nick. Let me try to work on it. Yes, I got your point: I have replicated data, so I need to read up on ANOVA with replication.



              • #8
                Thank you, Nick Cox. My problem seems solved: I did as you suggested and managed a one-way ANOVA without replication (for one month only), but I would like to try it with 4 replications.



                • #9
                  Originally posted by ben earnhart (#5)
                  Thank you so much, this really helped me.

