  • Generating new variable

    Hi, the dataset I have been looking at covers 33 countries. I am trying to create dummy variables for UK natives and UK migrants, and I'm wondering how to do this. I have two variables that may help: one is the country name itself, and the other is the country of birth.

    I thought using both variables I could create a dummy variable for UK migrants and UK natives.

    For UK natives I thought the command is:
    gen UKnative = Country==UK | Country of birth==UK
    So I want to create a UK native dummy variable which = 1 if the individual is a native in the UK

    But I am struggling to do this for a UK migrant since there are many other countries. Would it be:
    gen UKmigrant = Country==UK | Country of birth not = UK.
    So I want to create a UK migrant dummy variable which = 1 if the individual is a migrant in the UK.
    I am not sure how to do this.
    Last edited by Taiba Chau; 08 Apr 2022, 15:19.

  • #2
    I think what you want is:
    Code:
    gen UKnative = (country == "UK" & country_of_birth == "UK")
    gen UKmigrant = (country == "UK" & country_of_birth != "UK")
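
    One caveat, sketched under the same assumption that country and country_of_birth are string variables with exactly these names: an empty country_of_birth is not equal to "UK", so the second command would count UK residents with an unknown birthplace as migrants. A possible guard is to set both dummies to missing in that case:
    Code:
    * blank out the dummies where either input is empty
    replace UKnative = . if missing(country, country_of_birth)
    replace UKmigrant = . if missing(country, country_of_birth)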



    • #3
      Couldn't we make this a one liner? Maybe,
      Code:
      g UKnative = cond(1,country == "UK" & country_of_birth == "UK",0)



      • #4
        The code in #3 is just a longer version of the first command given in #2. The first argument of -cond()- is a logical expression. There it is specified as 1, so it is always true. So this is equivalent to -g UKnative = country == "UK" & country_of_birth == "UK"-, which is exactly the first command in #2. (The second command in #2 is not covered by #3.)

        And, though I'm not sure this is what is intended in #3, I don't think there is any way to create two different new variables in a single command.
        Last edited by Clyde Schechter; 08 Apr 2022, 16:58.



        • #5
          Oh yes, you're right, I got it mixed up! So instead, it would be
          Code:
          g UKnative = cond((country == "UK" & country_of_birth == "UK"),1,0)
          I think that's about right?



          • #6
            It's correct, but it's unnecessarily complicated because -g UKnative = (country == "UK" & country_of_birth == "UK")- does exactly the same thing and is shorter and more transparent. The real value of the -cond()- function is when the values to be returned are something more complicated than the 1 and 0 that "come for free" with just a bare logical expression.
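
            As a rough illustration of that last point, here is a sketch where -cond()- returns three values instead of two. It assumes, as in #2, that country and country_of_birth are string variables with those names; UKstatus and its coding are made up for the example.
            Code:
            * hypothetical coding: 0 = not living in the UK, 1 = UK native, 2 = UK migrant
            gen UKstatus = cond(country != "UK", 0, cond(country_of_birth == "UK", 1, 2))
            Note that a UK resident with an empty country_of_birth would land in category 2 here, so missing values would need separate handling.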



            • #7
              I am slightly confused about what the right code is, because I tried both and got a mismatch error message.
              Last edited by Taiba Chau; 09 Apr 2022, 14:50.



              • #8
                The type mismatch error suggests you're trying to take the average of "Apple pie", "Santos Reyes Tepejillo", and "corn flakes" or trying to find the number of times the word "another" appears in the number 654. In short, a numerical operation on a string or a string operation on a number. This is why we ask for example data at Statalist.

                So please give example data using the dataex command. For us to provide meaningful feedback, you must provide either example data from dataex, the real data from an easily importable source (e.g., GitHub), or an equivalent toy example.
                Otherwise, anything we say is simply a waste of time. Note that I'm not trying to be mean in saying this; if we can't see your dataset as you do, through a minimal worked example, anything we suggest is just guesswork. I'm emphasizing this because questions like this one likely have a relatively simple fix, but even simple fixes can become wildly overcomplicated without a minimal worked example of the dataset and the code you've tried.

                So please provide example data that captures the problem, and I'm more than happy to help you solve it.
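
                In the meantime, one guess, and only a guess without seeing the data: a type mismatch would arise if country or country_of_birth is stored as a numeric variable with value labels rather than as a string, since the code in #2 compares against the string "UK". A sketch of how one might check and work around that, assuming those variable names (country_str and birth_str are hypothetical new names):
                Code:
                * show the storage type and any value label attached to each variable
                describe country country_of_birth
                * if they are numeric with value labels, make string copies from the labels
                decode country, generate(country_str)
                decode country_of_birth, generate(birth_str)
                gen UKnative = (country_str == "UK" & birth_str == "UK")
                -decode- only works on variables that have a value label attached, and the label text may not be exactly "UK" (it could be "United Kingdom", for instance), so it is worth checking with -tabulate- first.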
