
  • stuck with "foreach" command

    I am working on a large dataset and trying to create binary variables (columns) flagging patients who have received any of 4 analgesics, drawn from an exhaustive list of medications.
    I thought a loop might help, so I wrote these commands:


    local list "pregabalin Co-codamol Carbamazepine paracetamol"
    foreach drug of local list {
    clear
    use "R:\Working project files\Danah\Danah_folder\medications.dta"
    keep if analgesics=="`drug'"
    by patid, sort: keep if _n==1
    gen `drug'=1
    keep patid `drug'
    merge m:m patid using "R:\Working project files\final_dataset.dta"
    replace `drug' =0 if _merge==2
    drop _merge
    save "R:\Working project files\Danah\final_dataset.dta", replace
    }
    but I get
    invalid syntax

  • #2
    You give no example data and don't say why you think you need a loop, so I can't help you much here. However, I'll briefly quote the Stata manual:
    Because m:m merges are such a bad idea, we are not going to show you an example. If you think that you need an m:m merge, then you probably need to work with your data so that you can use a 1:m or m:1 merge.
    Anyway, where does Stata report invalid syntax? Which part of the loop is causing the trouble?
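
    One way to answer that question is Stata's trace facility, which echoes each command as it executes so the failing line becomes visible. A minimal sketch, assuming a toy two-item list (the names alpha and beta-gamma are hypothetical and not taken from the poster's data):

    ```stata
    * Toy setup so the example runs on its own (hypothetical data)
    clear
    set obs 3

    * Echo each command as it runs; depth 1 keeps the output readable
    set trace on
    set tracedepth 1

    local list "alpha beta-gamma"
    foreach drug of local list {
        * capture noisily shows any error message but lets the loop continue
        capture noisily gen `drug' = 1
    }

    set trace off
    ```

    With trace on, the last command echoed before the error message is the one that failed, which narrows the problem down to a single line and a single value of the loop macro.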



    • #3
      Again, I'm only guessing here, but I don't know why you'd need a loop. Why not just do
      Code:
      keep if inlist(analgesics,"pregabalin","Co-codamol","Carbamazepine","paracetamol")



      • #4
        Jared Greathouse Thank you Jared for your reply.
        Indeed, with the commands above I succeeded in getting a dataset with the binary variable "pregabalin" (the first drug in the local list). However, the loop does not go on to the rest of the list (Co-codamol Carbamazepine paracetamol). I am afraid I cannot share data at the moment for confidentiality reasons. I thought of using a loop because it would save me time compared with going through each of the four drugs separately.

        If I do it without a loop, it looks as follows for two of the drugs:


        clear
        use "R:\Working project files\Danah\Danah_folder\medications.dta"

        preserve
        keep if analgesics=="paracetamol"
        by patid, sort: keep if _n==1
        gen paracetamol=1
        keep patid paracetamol
        merge m:m patid using "R:\Working project files\final_dataset.dta"
        replace paracetamol=0 if _merge==2
        drop _merge
        save "R:\Working project files\Danah\final_dataset.dta", replace
        restore

        preserve
        keep if analgesics=="Carbamazepine"
        by patid, sort: keep if _n==1
        gen carbamazepine=1
        keep patid carbamazepine
        merge m:m patid using "R:\Working project files\final_dataset.dta"
        replace carbamazepine=0 if _merge==2
        drop _merge
        save "R:\Working project files\Danah\final_dataset.dta", replace
        restore
        and so on...

        NB: there is no problem with the merge option.

        I am also wondering how I can include a two-word drug such as "oxycodone hydrochloride" in the local list. It is one drug whose name is made of two words.

        Thank you
        Danah



        • #5
          You don't need to share the full dataset, you can share a de-identified example of your data.

          Either way, why does the syntax I mentioned not do what you want? When you use the inlist() function, you don't need to keep making dummy variables. As I mentioned,

          Code:
           keep if inlist(analgesics,"pregabalin","Co-codamol","Carbamazepine","paracetamol")
          should keep any observations whose analgesics value is one of the listed words. The loop seems unnecessary here.



          To include the two-word drug, your local macro would now look like

          Code:
          local list `"pregabalin Co-codamol "oxycodone hydrochloride" Carbamazepine paracetamol"'
          Why is there no issue with the merge option? StataCorp is telling you that what you want is likely a terrible idea. It isn't my data to analyze, but I'm almost 100% convinced that a many-to-many merge is never something one would need.
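
          The two-word name only survives as a single list element if its inner quotes are kept, and compound double quotes are Stata's documented way to hold a string that itself contains double quotes. A minimal sketch with a shortened list, which only displays each element rather than touching any data:

          ```stata
          * Compound double quotes (`" ... "') wrap a string containing double quotes
          local list `"pregabalin "oxycodone hydrochloride" paracetamol"'

          * A quoted phrase counts as one element of the list
          local n : word count `list'
          display "number of elements: `n'"

          foreach drug of local list {
              * Brackets make it easy to see where each element starts and ends
              display `"[`drug']"'
          }
          ```

          Note that a name containing a space cannot be used directly as a variable name, so a command such as gen `drug'=1 would still need a different name for that element.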
          Last edited by Jared Greathouse; 29 Dec 2021, 15:35.



          • #6
            Jared Greathouse
            Thank you for your reply.

            I cannot use the proposed command

            keep if inlist(analgesics,"pregabalin","Co-codamol","Carbamazepine","paracetamol")
            because of the way the dataset is built, which does not allow these drugs to be sorted into variables.

            I would change the merge option, but the problem is with the loop itself: it stops after the first drug, pregabalin!




            • #7
              How is the sorting not possible? You sort on patid in your first code block, not on the drugs. Unless you use dataex to give me an example of your data, I can't tell why my original solution wouldn't keep the patients who were given the drugs you're interested in.

              Either way, the loop and the merge procedure don't appear to be related. To quote Clyde Schechter:

              The use of m:m merges almost invariably produces useless garbage. There are very rare situations (I have encountered only one in 24 [now 27] years of using Stata) where the results of a m:m merge could be appropriate, but I am quite confident this isn't one of them.
              As somebody who regularly works with panel data, trust me: you're better off creating unique IDs for each patient and then merging on those IDs, because unless this is a very specific circumstance, you'll likely learn much later in your project that the many-to-many merge was a bad idea.
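
              To make the alternative concrete, here is a minimal sketch of building the flags in one pass and attaching them with a 1:1 merge. It assumes the final dataset has exactly one row per patid (with several rows per patient, merge m:1 would be the analogue). The toy values, the age variable, and the tempfile name flags are hypothetical stand-ins, not the poster's data:

              ```stata
              * Toy stand-in for medications.dta: several rows per patient
              clear
              input patid str20 analgesics
              1 "pregabalin"
              1 "paracetamol"
              2 "paracetamol"
              2 "paracetamol"
              3 "ibuprofen"
              end

              * One 0/1 indicator per drug, created without any merge
              gen byte pregabalin  = analgesics == "pregabalin"
              gen byte paracetamol = analgesics == "paracetamol"

              * One row per patient: 1 if the patient ever received the drug
              collapse (max) pregabalin paracetamol, by(patid)
              tempfile flags
              save `flags'

              * Toy stand-in for final_dataset.dta: one row per patient
              clear
              input patid age
              1 54
              2 61
              3 47
              4 70
              end

              * patid is unique on both sides, so a 1:1 merge is well defined
              merge 1:1 patid using `flags', nogenerate

              * Patients with no medication rows get 0 instead of missing
              foreach v of varlist pregabalin paracetamol {
                  replace `v' = 0 if missing(`v')
              }
              list
              ```

              Because the medication data are collapsed to one row per patient before merging, the key is unique and no m:m merge is needed.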


              Either way, for me to try and give any solution here, I'm going to need a subset of your data. Nobody can work on something they can't see.



              • #8
                Jared Greathouse
                I figured out what was wrong with the code.
                The problem was with "Co-codamol".
                This command is not possible:

                gen `drug'=1
                because of the hyphen in Co-codamol, which is not allowed in a variable name.

                So a preliminary step was needed before running the loop:

                replace analgesics= "Cocodamol" if analgesics=="Co-codamol"

                and then the local list will be:

                local list "pregabalin Cocodamol Carbamazepine paracetamol"
                Thank you for all your replies

                Danah
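
                An alternative that leaves the stored drug names untouched is to derive a legal variable name inside the loop with Stata's strtoname() function. A minimal sketch on toy data (the values and the macro name vname are hypothetical):

                ```stata
                * Toy stand-in for medications.dta (hypothetical values)
                clear
                input patid str20 analgesics
                1 "pregabalin"
                2 "Co-codamol"
                3 "paracetamol"
                end

                local list "pregabalin Co-codamol paracetamol"
                foreach drug of local list {
                    * strtoname() replaces characters that are illegal in names with _
                    local vname = strtoname("`drug'")
                    * Compare against the original spelling, name the variable legally
                    gen byte `vname' = analgesics == "`drug'"
                }
                describe
                ```

                This way the comparison still uses the drug name exactly as it appears in the data, and only the generated variable name changes.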

