
  • Creating a loop

    Hello,

    I'm using the code below to create a district variable.

    Code:
    gen dist = ""
    
    replace dist = "Nicobars" if statecode==35 & district==1
    replace dist = "North & Middle Andaman" if statecode==35 & district==2
    replace dist = "South Andaman" if statecode==35 & district==3
    
    replace dist = "Srikakulam" if statecode==28 & district==1
    replace dist = "Vizianagaram" if statecode==28 & district==2
    replace dist = "Visakhapatnam" if statecode==28 & district==3
    replace dist = "East Godavari" if statecode==28 & district==4
    replace dist = "West Godavari" if statecode==28 & district==5
    replace dist = "Krishna" if statecode==28 & district==6
    replace dist = "Guntur" if statecode==28 & district==7
    replace dist = "Prakasam" if statecode==28 & district==8
    replace dist = "Sri Potti Sriramulu Nellore" if statecode==28 & district==9
    replace dist = "Y.S.R. (Cuddapah)" if statecode==28 & district==10
    replace dist = "Kurnool" if statecode==28 & district==11
    replace dist = "Anantapur" if statecode==28 & district==12
    replace dist = "Chittoor" if statecode==28 & district==13
    I'm trying to create a loop for each of the values of the variable 'statecode' as follows:
    Code:
    foreach x in Nicobars  North & Middle Andaman South Andaman {
        replace dist = `x' if statecode==28
    }
    How can I incorporate the value of the variable 'district' in the above code? It's different for each `x', so I'm confused about it. Also, is it possible to handle all the values of the variable 'statecode' in one loop?

    Thanks


  • #2
    Literal strings must be enclosed in double quotation marks.

    This is more likely to work, i.e. be legal, but I can't test it.
    Code:
    foreach x in "Nicobars"  "North & Middle Andaman" "South Andaman" {    
        replace dist = "`x'" if statecode==28
    }
    More to the point, the code is still wrong: you would need to loop over district 1 2 3 at the same time. Note also that, in your first block, these three districts belong to statecode 35, not 28.

    Loops can be great. I have seen code where about 500 statements could be replaced by a loop with a total of 3 command lines.

    Here, the example is one where you're replacing 3 lines by more than 3 lines -- as said, you need to loop over 1 2 3 at the same time.

    The last thing I want to do is discourage people from extending their Stata skills, but the example is one where I feel obliged to say that a loop for this example just makes your code longer and trickier to write correctly, harder to follow, and harder to debug or maintain.
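
    For reference, here is a minimal sketch of what looping over the names and over district 1 2 3 at the same time could look like. The counter macro i and the toy dataset are assumptions added for illustration, and statecode 35 is taken from the first code block in #1. It has not been run against the real data.

    ```stata
    * Toy data standing in for the real file (assumption for illustration)
    clear
    input byte statecode byte district
    35 1
    35 2
    35 3
    28 1
    end
    gen dist = ""

    * Step a counter alongside the names so each name gets its district code
    local i = 0
    foreach x in "Nicobars" "North & Middle Andaman" "South Andaman" {
        local ++i
        replace dist = "`x'" if statecode == 35 & district == `i'
    }
    list
    ```

    This is already longer than the three replace statements it stands in for, which is the point made above.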



    • #3
      I used to work with state and district-level data for India regularly!

      Rather than doing this programmatically, my advice here would be to open an Excel document, enter the district name, statecode, and district code each in its own column, and then merge the files by statecode and district. Hopefully, you already have something that gives you that information in a handy format. Otherwise, matching around 650 districts by hand will take a long time (and a loop does not let you avoid that work).

      The correct way to do this with a loop is to first create two arrays, one containing the district names and the other containing an ordered pair representing the state and district code for the district at the corresponding index. Then loop through both lists, performing the replace operation on each iteration. I think the merge approach described above is a much simpler option that is better supported in Stata.
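
      A rough sketch of the two-list loop described above, using local macros in place of arrays. The macro names (names, codes, name, pair), the comma-separated encoding of each pair, and the toy data are all assumptions made for illustration.

      ```stata
      * Toy data standing in for the real file (assumption for illustration)
      clear
      input byte statecode byte district
      35 1
      35 2
      35 3
      28 1
      end
      gen dist = ""

      * Two parallel lists: names, and state,district pairs in the same order
      local names `" "Nicobars" "North & Middle Andaman" "South Andaman" "Srikakulam" "'
      local codes "35,1 35,2 35,3 28,1"

      local n : word count `codes'
      forvalues i = 1/`n' {
          local name : word `i' of `names'
          local pair : word `i' of `codes'
          * tokenize splits the pair into three tokens: state, comma, district
          tokenize "`pair'", parse(",")
          replace dist = "`name'" if statecode == `1' & district == `3'
      }
      list
      ```

      Both lists still have to be typed out in full and kept in the same order, so this saves no data entry compared with the original replace statements.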



      • #4
        Daniel Schaefer gives excellent advice. A problem of this kind can often be solved conveniently using merge.

        See e.g. https://www.stata.com/support/faqs/d...s-for-subsets/ for some flavour.

        As a detail, the use of MS Excel somewhere in the process is a matter of taste and convenience, and in no sense essential to the process. What is essential is another Stata dataset with the extra information, as otherwise no merge is possible.
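
        As an illustration of the merge route, here is a small sketch in which the lookup dataset is typed directly into Stata with input. The tempfile name lookup and the toy master data are assumptions for illustration; in practice the lookup would hold every statecode and district combination and could equally come from a spreadsheet.

        ```stata
        * Lookup dataset: one row per statecode-district pair with its name
        clear
        input byte statecode byte district str30 dist
        35 1 "Nicobars"
        35 2 "North & Middle Andaman"
        35 3 "South Andaman"
        28 1 "Srikakulam"
        end
        tempfile lookup
        save `lookup'

        * Toy master data standing in for the real file (assumption)
        clear
        input byte statecode byte district
        35 1
        35 1
        28 1
        35 3
        end

        * Many observations per district in the master, one row each in the lookup
        merge m:1 statecode district using `lookup', keep(master match)
        tabulate _merge
        ```

        Any observation left with _merge equal to 1 has a statecode and district combination that is missing from the lookup, which makes gaps easy to spot.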



        • #5
          As a detail, the use of MS Excel somewhere in the process is a matter of taste and convenience
          Indeed, and I only recommend Excel as a matter of convenience. In my mind, Excel is just shorthand for any software that can perform the necessary data entry or data management task, including Stata. When entering data by hand I tend to prefer Excel, but that is only a mild preference.



          • #6
            Thank you, Nick and Daniel, for your advice.

