  • Continuous or discrete variable?

    Hello.

    I have a question regarding a variable I have constructed. For a number of countries, I have multiple observations for each country.
    The thing is that I have a dichotomous variable (it can only take value 0 or 1) for each such observation. From this, for each country, I have created a variable that shows the percentage of observations (within the country), in which the dichotomous variable takes value 1.
    Example: country A has 24 observations. In the dichotomous variable X, 13 of those observations take the value 1 and the rest take 0. Therefore my constructed variable, Y, would take the value 54.17% for country A (13/24).
    My question is, is that variable Y discrete or continuous? Could I estimate by OLS if the dependent variable is Y?
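
    For concreteness, a sketch of how such a variable could be built, assuming hypothetical variable names country (the country identifier) and X (the 0/1 indicator):
    Code:
    * share of observations with X == 1 within each country
    egen Y = mean(X), by(country)
    * express the share as a percentage
    replace Y = 100 * Y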

    Thanks for your attention!

  • #2
    That variable is not discrete, but that does not mean -regress- is necessarily the best option (you might get predicted values below 0 or above 1); see
    Code:
    help fracreg
    for a different option
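
    As a minimal sketch of that option, assuming Y is the percentage described in #1 and x1 and x2 are hypothetical regressors (fracreg expects a dependent variable between 0 and 1, so the percentage is rescaled first):
    Code:
    * rescale the percentage to a proportion in [0, 1]
    generate double y_prop = Y / 100
    * fractional logit with robust standard errors
    fracreg logit y_prop x1 x2, vce(robust)
    * average marginal effects on the proportion scale
    margins, dydx(*)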

    Added in edit: I missed that this is panel data, which you should account for in some way (possibly just by using "vce(cluster clustervar)", depending on what your research question is).
    Last edited by Rich Goldstein; 25 May 2021, 06:58.


    • #3
      It is not panel data; my variable measures, for each country, "the percentage of establishments with a loan or a line of credit" in one year.

      Thank you so much for your quick answer.


      • #4
        Originally posted by Ibai Ostolozaga Falcon
        I would like to add one more question. That variable, "the percentage of establishments with a loan or a line of credit", will be my dependent variable, which is a percentage between 0 and 100. Could I use OLS, or will I have to use another kind of regression?

        I hope you can answer my question. Thank you.

        Ibai


        • #5
          OLS is an estimation method; the question is rather whether Y = Xb is a suitable functional form. My answer is that it depends.

          Your dependent variable is discrete in principle in that -- if I understand correctly -- the possible values are 0/24, 1/24, ..., 23/24, 24/24, expressed as percents. But here, as almost always, treating variables as continuous seems natural, or at least convenient, whenever there are many possible values and finer resolution can be imagined. Even with variables that really are continuous in principle -- people's ages, heights or weights, or rainfalls or temperatures -- there is usually a convention on what resolution is reported.

          Y = Xb is a bad idea if predicted values ever fall outside the possible range of 0 to 100%.

          It would also be a bad idea if relationships should show curvature because the limits are inhibiting how the dependent variable behaves.

          I don't think the issue can be settled by oracular pronouncements either way. You need to try out a linear regression and see if it works well enough for your purposes. Added-variable plots are, so far as I can judge, an underestimated tool, particularly in fields trained to focus on tabular output.
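
          A minimal sketch of those checks, assuming Y is the percentage and x1 and x2 are hypothetical regressors:
          Code:
          regress Y x1 x2
          * do any fitted values fall outside 0 to 100?
          predict double yhat, xb
          count if (yhat < 0 | yhat > 100) & !missing(yhat)
          * residual-versus-fitted plot to look for curvature
          rvfplot
          * added-variable plots, one per regressor
          avplots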


          • #6
            Originally posted by Nick Cox
            Thank you so much, Nick; that information was very helpful.
