  • PLS regression in Stata

    Hi all,

    I'm trying to do a pls regression in Stata for my thesis (to find the effects of various factors on the discard rate).
    I have never done it in Stata before, only in R (however, I don't like R that much).
    I looked at the "help" file, but still got constant error messages when trying to do it with my data.
    I particularly struggle with the 'adjacent' option, and I'm also not sure which scheme to use.
    Could anyone explain this to me? There is basically no tutorial to be found online...

    Thanks a lot!

    Guest
    Last edited by sladmin; 31 Jul 2018, 13:37. Reason: anonymize poster

  • #2
    Guest:
    welcome to the list.
    The following thread might be useful: http://www.statalist.org/forums/foru...l-least-square
    Last edited by sladmin; 31 Jul 2018, 13:38. Reason: anonymize poster
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      You are probably talking about pls from SSC; you are asked to state where user-written commands come from. Also, your description of the problem you are facing is far too vague for anyone to give useful advice.

      As an aside, are you aware of the author's comment about his own program?

      This program is provided for educational purposes. It is difficult to recommend
      the PLS composites for any serious empirical work (see Rönkkö, McIntosh, and
      Antonakis (2015))
      The article is quite convincing, and you might want to think twice before applying a method that seems to be questionable in many ways. Carlo provided a link pointing to sem, which might be a better way to go.

      Best
      Daniel



      • #4
        pls calculates composite variables using the partial least squares path modeling (PLS) algorithm. The composites are calculated as weighted combinations of existing variables using the weight algorithm introduced by Wold (see Wold (1982)). The composites produced by pls are identical to the composites produced by commercial PLS software as well as the open source matrixpls R package except for small numerical differences due to different convergence criterion.

        https://ideas.repec.org/c/boc/bocode/s458107.html
        Emad A. Shehata
        Professor (PhD Economics)
        Agricultural Research Center - Agricultural Economics Research Institute - Egypt
        IDEAS: http://ideas.repec.org/f/psh494.html
        EconPapers: http://econpapers.repec.org/RAS/psh494.htm
        Google Scholar: http://scholar.google.com/citations?...r=cOXvc94AAAAJ



        • #5
          Thanks Daniel for the advice. After talking again with my supervisor about these issues, we agreed to use sem.
          However, I'm now not sure what exactly my model should look like. I made a first quick attempt.
          'numberpercent' (or 'weightpercent') is my dependent variable, while the biological and economic factors that might affect it are 'tac', 'ssb', 'price', 'landings', etc.
          I have the number of fish thrown back into the sea by year, species and vessel type. Do I then have to add 'species', 'year' and 'vesseltype' as right-hand-side variables?

          My other issue is that I have high collinearity between some independent variables (as seen in the correlation matrix). How do I deal with this?

          Any advice would be highly appreciated!

          Best regards
          Guest
          Attached Files
          Last edited by sladmin; 31 Jul 2018, 13:38. Reason: anonymize poster



          • #6
            I am afraid I can give little further advice here.

            From the attachment you show (cf. FAQ 12.4 and 12.5), it seems you do not have any latent (unobserved) variables. Without a measurement model, I do not even see the reason to use pls or sem where regress might suffice.

            You ask about which variables to include in your model, but I have no clue about the underlying theory or the research question(s) you are trying to answer. Also, I have no idea what numberpercent or weightpercent might mean. Thus, I cannot comment on which variables should be in the model. What I can say is this: neither sem nor regress will be happy to accept string variables, like species.
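
            To illustrate the point about string variables, here is a minimal sketch on simulated data. The species names, the numeric values and the new variable name species_id are placeholders invented for this example; only numberpercent, tac, species and year come from the thread. It assumes species is stored as a string, converts it to a labelled numeric variable with encode, and then uses factor-variable notation (i.) in regress.

            ```stata
            * Simulated stand-in data: all values below are made up for illustration.
            clear
            set seed 2016
            set obs 90
            generate str10 species = cond(mod(_n, 3) == 0, "cod", cond(mod(_n, 3) == 1, "plaice", "sole"))
            generate year = 2010 + mod(_n, 5)
            generate tac = 100 + 10*rnormal()
            generate numberpercent = 20 + 0.05*tac + 2*rnormal()

            * species is a string here, so regress would refuse it as typed.
            * encode creates a labelled numeric copy (species_id is a hypothetical name).
            encode species, generate(species_id)

            * The i. prefix enters each categorical variable as a set of indicators.
            regress numberpercent tac i.species_id i.year
            ```

            Whether species and year belong in the model at all is still a substantive question; the sketch only shows the mechanics.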

            Your question regarding collinearity is not easily answered. The usual "problem" manifests itself in high(er) standard errors, which accurately reflect the uncertainty associated with the estimated coefficients given the information available in the data. The only way to solve this "problem" is to collect more data. You need to work out whether collinearity is actually a problem for you, and why.
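
            One way to see what collinearity does to standard errors is a small simulation. The sketch below uses made-up data in which ssb and tac are built to be highly correlated while price is independent of both; the variable names are borrowed from the thread, but the data-generating process is purely an assumption for illustration. After regress, estat vif reports variance inflation factors.

            ```stata
            * Simulated data: ssb and tac are constructed to correlate at roughly 0.95.
            clear
            set seed 2016
            set obs 200
            generate ssb = rnormal()
            generate tac = 0.95*ssb + sqrt(1 - 0.95^2)*rnormal()
            generate price = rnormal()
            generate numberpercent = 10 + ssb + tac + price + rnormal()

            * Inspect the pairwise correlations first.
            correlate ssb tac price

            * All three true coefficients equal 1; compare the standard errors.
            regress numberpercent ssb tac price

            * Variance inflation factors for the model just fitted.
            estat vif
            ```

            In a run like this, the standard errors for ssb and tac should come out noticeably larger than the one for price, even though all three predictors matter equally by construction. That is the extra uncertainty described above, not a defect of the estimator.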

            Best
            Daniel
            Last edited by daniel klein; 26 May 2016, 05:07.



            • #7
              Thanks for the quick reply.

              numberpercent and weightpercent are the percentage discards (fish not retained but thrown back into sea) in numbers and in weight, respectively.
              My hypotheses concern the effect of biological, economic and regulatory factors on the amount of discards, e.g., a higher biomass ('referencebiomass') results in higher discards.
              Someone suggested grouping the effects (see attachment), but I have too limited knowledge to assess whether that makes sense...
              Attached Files



              • #8
                The acronym PLS can mean two different things. PLS regression (PLSR), like principal-component regression, aggregates a large number of independent variables into a smaller number of composite variables that are used to predict one observed dependent variable. PLS path modeling (PLS-PM) refers to an approach where both independent and dependent variables are composites. My pls module implements PLS-PM, not PLSR.

                PLS path modeling is a very problematic technique. We explain (some of) the problems in this article, which is also cited in the pls module documentation.

                Rönkkö, M., McIntosh, C. N., & Antonakis, J. (2015). On the adoption of partial least squares in psychological research: Caveat emptor. Personality and Individual Differences, 87, 76–84. http://doi.org/10.1016/j.paid.2015.07.019

                PLS regression may be a genuinely useful tool if you are interested in prediction, but I am not aware of any Stata implementations. You can, however, do principal components regression using pca and regress. The procedure is explained, e.g., on Wikipedia.
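
                A minimal sketch of principal components regression with pca and regress follows. It runs on simulated data; the variable names are taken from earlier in the thread, while the data-generating process, the choice of two components and the score names pc1 and pc2 are assumptions made only for this example.

                ```stata
                * Simulated stand-in data with correlated predictors (values are made up).
                clear
                set seed 2016
                set obs 200
                generate ssb = rnormal()
                generate tac = 0.9*ssb + sqrt(1 - 0.9^2)*rnormal()
                generate landings = 0.8*tac + 0.6*rnormal()
                generate price = rnormal()
                generate numberpercent = 10 + ssb + tac + 0.5*price + rnormal()

                * Step 1: extract principal components of the predictors
                * (pca analyzes the correlation matrix by default).
                pca ssb tac landings price, components(2)

                * Step 2: save the component scores as new variables.
                predict pc1 pc2, score

                * Step 3: regress the outcome on the retained components.
                regress numberpercent pc1 pc2
                ```

                Keeping two components is arbitrary here; running screeplot after pca is one way to judge how many to retain. The coefficients refer to the components, not to the original variables, so interpretation is less direct than in an ordinary regression.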



                • #9
                  Which Stata command runs PLS?

