  • 'if' commands

    Hello everyone,

    I am doing a panel data analysis on elections. Basically, my data set has a variable suffrage ("sufr") that measures the proportion of population that has suffrage for a country in a single year.

    I created a dummy variable, "suffrage_dummy" that I want to code 0 or 1 with the following criteria:

    - if the difference between the proportion of suffrage in year x & x+1 is 1, dummy = 1; otherwise = 0

    In other words, if suffrage this year is 1 and last year it was 0, then dummy value = 1. For all others, dummy = 0. I tried some if commands on my own, but they aren't working. Any help will be much appreciated. Thanks.

    Ashvinder

  • #2
    You write that the variable suffrage ("sufr") "measures the proportion of population that has suffrage for a country in a single year", but then you want to condition on the difference being 1 (which would imply a change from no suffrage at all to 100% suffrage). I find this confusing; however, the following will give you some technique:
    Code:
    sort country year
    by country: gen byte sufrdummy=sufr>sufr[_n-1]

    • #3
      Welcome to the Stata Forum / Statalist.

      I kindly suggest that you provide data (full, abridged or mock), as recommended in the FAQ.

      For this, you may use the CODE delimiters or install dataex from SSC.

      Sharing data to work with is surely the best way to get a reply that really fits your needs.
      Last edited by Marcos Almeida; 20 Jul 2017, 09:50. Reason: Crossed with Rich's reply.
      Best regards,

      Marcos

      • #4
        If I follow, and assuming everything is already sorted properly, something like this ought to do it:

        Code:
        generate byte suffrage_dummy = sufr==1 & sufr[_n-1]==0 if ID==ID[_n-1]
        You'll have to replace ID with your ID variable.
        --
        Bruce Weaver
        Version: Stata/MP 18.5 (Windows)
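
        One detail of the command in #4: the if qualifier excludes the first observation of each ID, so the new variable is missing there rather than 0. The sketch below uses made-up data (ID 2 is hypothetical) and shows one way to recode those rows, assuming a 0 is what is wanted when the previous year is unknown:

        ```stata
        clear
        input int year byte ID float sufr
        1922 1 0
        1923 1 0
        1924 1 1
        1925 1 1
        1922 2 0
        1923 2 1
        end
        sort ID year

        * As in #4: rows where ID differs from the previous row's ID are skipped and stay missing
        generate byte suffrage_dummy = sufr==1 & sufr[_n-1]==0 if ID==ID[_n-1]

        * Optional (an assumption about the desired coding): set those first rows to 0
        replace suffrage_dummy = 0 if missing(suffrage_dummy)

        list, sepby(ID)
        ```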

        • #5
          You don't give a precise example of your data. I'll guess that it is xtset or tsset with panel identifier country and time variable year.

          That being so,

          Code:
          gen suffrage_dummy = sufr == 1 & L.sufr == 0 
          picks up observations for which suffrage is new this year.

          I wonder what you want to do about

          Code:
          suffrage == 0 & L.suffrage == 1
          as for that the difference between values for successive years is also 1.

          I also wonder in what sense sufr measures a proportion as you seem to be implying that possible values are just 0 and 1.

          EDIT: Similar comments from Rich, Marcos, Bruce. Four people are now unclear. Please do visit FAQ Advice and give us a better picture of your data.
          Last edited by Nick Cox; 20 Jul 2017, 09:54.

          • #6
            You should read up on _n and _N notation. It is invaluable for this sort of task.
            Code:
            bys country (year) : gen suffrage_dummy =(suffrage[_n+1]-suffrage[_n])==1
            Edit: solution identical to #2 and #4; leaving it anyway.
            Last edited by Apoorva Lal; 20 Jul 2017, 10:04.

            • #7
              Dear Rich/Marcos/Bruce,

              Thank you all for your prompt replies.

              I attach here a portion of my data, which I hope illustrates what I want to do:

              year  country      v2x_suffr  suffrage_dummy
              1922  Afghanistan  0          0
              1923  Afghanistan  0          0
              1924  Afghanistan  1          0
              1925  Afghanistan  1          0
              1926  Afghanistan  1          0
              1927  Afghanistan  1          0
              1928  Afghanistan  1          0
              1929  Afghanistan  1          0
              1930  Afghanistan  1          0
              1931  Afghanistan  1          0
              1932  Afghanistan  .5         0
              1933  Afghanistan  .5         0


              At the moment, all my suffrage_dummy values are coded 0. Using the data above, I want suffrage_dummy for Afghanistan in 1924 to equal 1, because the difference between that year's suffrage and the previous year's (1923, suffrage = 0) is 1. For 1925 it would be 0, because the difference from the year before is 0. For 1932 it would also be 0, because the difference from the year before is -0.5.

              In other words, I want suffrage_dummy to take the value 1 when the difference between the proportion of suffrage in year x and in year x-1 equals 1, and 0 otherwise.

              Ashvinder

              • #8
                Apoorva's solution isn't guaranteed identical to any based on a lag operator unless there are no gaps in the data.
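
                A minimal sketch of why the two approaches can differ, using made-up data in which 1924 is absent. The variable names cid, by_subscript and by_lag are chosen here for illustration only. Subscripts such as [_n-1] refer to the previous observation, whatever its year, whereas L. refers to exactly one time unit earlier:

                ```stata
                clear
                input int year str11 country float sufr
                1922 "Afghanistan" 0
                1923 "Afghanistan" 0
                1925 "Afghanistan" 1
                1926 "Afghanistan" 1
                end
                encode country, generate(cid)
                xtset cid year

                * Previous observation: 1925 is compared with 1923, so 1925 is flagged
                bysort cid (year): generate byte by_subscript = sufr == 1 & sufr[_n-1] == 0

                * Previous year: 1924 is absent, so L.sufr is missing in 1925 and nothing is flagged
                generate byte by_lag = sufr == 1 & L.sufr == 0

                list year sufr by_subscript by_lag, sepby(cid)
                ```

                Which behaviour is preferable depends on whether a jump across a gap should count as a change between consecutive years.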

                • #9
                  Dear Rich/Marcos/Bruce/Nick/Apoorva,

                  I apologise for being unclear. I am still getting the hang of Statalist and Stata (in general). This won't happen again in the future.

                  I have tried Apoorva's syntax, but with a slight modification:

                  Code:
                  bys country (year) : gen suffrage_dummy =(suffrage[_n+1]-suffrage[_n])==1
                  This works for all observations with data. However, as you mentioned, Nick, there are some gaps in my data and this code doesn't account for them. Is there a way to also account for gaps?

                  Ashvinder

                  • #10
                    #9 is already partially covered in #5. If you tsset or xtset your data, gaps imply missings. Alternatively, you may be able to interpolate within your data.

                    I still don't understand "proportion". I see a value of 0.5.
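
                    A rough sketch of the interpolation route, on made-up data with 1924 absent. The names cid and sufr_ip are illustrative assumptions, and whether linear interpolation is sensible for a suffrage measure is a substantive question this sketch does not settle:

                    ```stata
                    clear
                    input int year byte cid float sufr
                    1922 1 0
                    1923 1 0
                    1925 1 1
                    1926 1 1
                    end
                    xtset cid year

                    * tsfill adds an observation for the absent year 1924, with sufr missing
                    tsfill

                    * Linear interpolation of sufr on year within each panel
                    bysort cid: ipolate sufr year, generate(sufr_ip)

                    * Lag-based dummy computed on the interpolated series
                    sort cid year
                    generate byte suffrage_dummy = sufr_ip == 1 & L.sufr_ip == 0

                    list, sepby(cid)
                    ```

                    Here the interpolated 1924 value is 0.5, so no year shows a jump of exactly 1. Interpolating can therefore change which years are flagged.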

                    • #11
                      Dear Nick,

                      Yes, my data are xtset. The gaps are indeed missing data.

                      By proportion I mean that if, say, 50% of the population is granted suffrage, the value in the cell is 0.5. Likewise, two-thirds is recorded as 0.67.

                      Ashvinder
