stuck with "foreach" command

Danah Abdul

Join Date: Dec 2020

Posts: 74
#1

29 Dec 2021, 14:23

I am working on a large dataset and trying to create binary variables (columns) flagging patients who have received any of four analgesics, drawn from an exhaustive list of medications.
I thought a loop might help, so I wrote these commands:

local list "pregabalin Co-codamol Carbamazepine paracetamol"
foreach drug of local list {
clear
use "R:\Working project files\Danah\Danah_folder\medications.dta"
keep if analgesics=="`drug'"
by patid, sort: keep if _n==1
gen `drug'=1
keep patid `drug'
merge m:m patid using "R:\Working project files\final_dataset.dta"
replace `drug' =0 if _merge==2
drop _merge
save "R:\Working project files\Danah\final_dataset.dta", replace
}

but I get

invalid syntax
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#2

29 Dec 2021, 15:00

You give no example data and don't say why you think you need a loop, so I can't help you here. However, I'll just quote the Stata manual real quick:

Because m:m merges are such a bad idea, we are not going to show you an example. If you think that you need an m:m merge, then you probably need to work with your data so that you can use a 1:m or m:1 merge.

Anyways, where does Stata give you invalid syntax? Which part of this loop is giving the trouble?
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#3

29 Dec 2021, 15:07

Again, I'm only guessing here, but I don't know why you'd need a loop. Why not just do

Code:

keep if inlist(analgesics,"pregabalin","Co-codamol","Carbamazepine","paracetamol")
Danah Abdul

Join Date: Dec 2020

Posts: 74
#4

29 Dec 2021, 15:23

Jared Greathouse Thank you Jared for your reply.
Indeed, with the code above I was able to get a dataset with the binary variable "pregabalin" (the first drug in the local list). However, the loop does not continue to the rest of the list (Co-codamol Carbamazepine paracetamol). I am afraid I cannot share data at the moment for confidentiality reasons. I thought of using a loop because it would save me time compared with going through each of the four drugs separately.

If I were to do it without a loop, it would look as follows for two of the drugs:

clear
use "R:\Working project files\Danah\Danah_folder\medications.dta"

preserve
keep if analgesics=="paracetamol"
by patid, sort: keep if _n==1
gen paracetamol=1
keep patid paracetamol
merge m:m patid using "R:\Working project files\final_dataset.dta"
replace paracetamol=0 if _merge==2
drop _merge
save "R:\Working project files\Danah\final_dataset.dta", replace
restore

preserve
keep if analgesics=="Carbamazepine"
by patid, sort: keep if _n==1
gen carbamazepine=1
keep patid carbamazepine
merge m:m patid using "R:\Working project files\final_dataset.dta"
replace carbamazepine=0 if _merge==2
drop _merge
save "R:\Working project files\Danah\final_dataset.dta", replace
restore

and so on...

NB: there is no problem with the merge option.

I am also wondering how I can include a two-word drug name, "oxycodone hydrochloride", in the local list. It is one drug whose name is made of two words.

Thank you
Danah
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#5

29 Dec 2021, 15:31

You don't need to share the full dataset, you can share a de-identified example of your data.

Either way, why does the syntax I mentioned not do the thing you want? When you use the inlist function, you don't need to keep making dummy variables. As I mentioned,

Code:

keep if inlist(analgesics,"pregabalin","Co-codamol","Carbamazepine","paracetamol")

This should keep any observations whose analgesics values are in that list of words. The loop seems unnecessary here.
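To make the difference between filtering and flagging concrete, here is a minimal sketch on made-up data (the three patients and the variable name `target` are assumptions for illustration, not taken from the poster's files). It uses inlist() to build a 0/1 indicator first and only then drops the non-matching rows.

```stata
* Toy data: illustrative values only
clear
input patid str30 analgesics
1 "pregabalin"
2 "ibuprofen"
3 "Co-codamol"
end

* inlist() returns 1 when analgesics matches any listed string, 0 otherwise
gen byte target = inlist(analgesics, "pregabalin", "Co-codamol", "Carbamazepine", "paracetamol")
list, clean

* Equivalent to the one-line keep: drop rows that match none of the drugs
keep if target
```

Note that this gives a single flag for "any of the four drugs"; it does not by itself produce one indicator per drug.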

Your local macro would now look like

Code:

local list "pregabalin Co-codamol "oxycodone hydrochloride" Carbamazepine paracetamol"
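As a hedged sketch of how such a list could be looped over: compound double quotes make the nesting of the quoted two-word element explicit, and strtoname() converts each drug label into a legal Stata name (characters that are not allowed in a name, such as a hyphen or a space, become underscores). The local name `vname` is an assumption introduced here for illustration.

```stata
* Compound double quotes `" ... "' let the list contain a quoted, multi-word element
local list `"pregabalin Co-codamol "oxycodone hydrochloride" Carbamazepine paracetamol"'

foreach drug of local list {
    * strtoname() maps the label to a legal variable name,
    * e.g. Co-codamol becomes Co_codamol
    local vname = strtoname("`drug'")
    display `"`drug' -> `vname'"'
}
```

Inside the loop, `drug' can still be used for string comparisons against the data, while `vname' is safe to use wherever a variable name is required.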

Why is there no issue with the merge option? StataCorp is telling you what you want is likely a terrible idea. It isn't my data to analyze, but I'm almost 100% convinced that a many to many merge is never something one would need.

Last edited by Jared Greathouse; 29 Dec 2021, 15:35.
Danah Abdul

Join Date: Dec 2020

Posts: 74
#6

29 Dec 2021, 15:42

Jared Greathouse
Thank you for your reply.

I cannot use the proposed command

keep if inlist(analgesics,"pregabalin","Co-codamol","Carbamazepine","paracetamol")

because of the way the dataset is structured, which makes it impossible to sort these drugs into variable lists.

I would change the merge option, but the problem is with the loop syntax. It does not keep going and stops after the first drug, pregabalin!
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#7

29 Dec 2021, 16:02

How is the sorting not possible? You sort on patid in your first code, not on the drugs. Unless you use dataex to give me an example of your data, I don't know why my original solution wouldn't keep the patients who were given the drugs you're interested in.

Either way, the loop and the merge procedure don't appear to be related. To quote Clyde Schechter:

The use of m:m merges almost invariably produces useless garbage. There are very rare situations (I have encountered only one in 24 [now 27] years of using Stata) where the results of a m:m merge could be appropriate, but I am quite confident this isn't one of them.

As somebody who regularly works with panel data, trust me: you're better off creating unique IDs for each patient and then merging on those IDs, because unless this is a very specific circumstance, you'll likely learn much later in your project that the many-to-many merge was a bad idea.
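A minimal sketch of that advice on made-up data (the values, the `age` variable, and the tempfile name `flags` are assumptions for illustration): isid confirms that patid uniquely identifies rows, and a 1:1 merge then replaces the m:m merge.

```stata
* File 1: one row per patient who received the drug (toy values)
clear
input patid byte paracetamol
1 1
3 1
end
isid patid              // errors out if patid is not unique
tempfile flags
save `flags'

* File 2: one row per patient in the main dataset (toy values)
clear
input patid age
1 50
2 61
3 47
end
isid patid

* Both sides are unique on patid, so 1:1 is the appropriate merge
merge 1:1 patid using `flags'
* Patients found only in the main file never received the drug
replace paracetamol = 0 if _merge == 1
drop _merge
list, clean
```

Here the main file is the master dataset, so unmatched patients carry _merge == 1; with the files the other way round, as in the original loop, the same patients would carry _merge == 2.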

Either way, for me to try and give any solution here, I'm going to need a subset of your data. Nobody can work on something they can't see.
Danah Abdul

Join Date: Dec 2020

Posts: 74
#8

29 Dec 2021, 16:50

Jared Greathouse
I figured out what was wrong with the code.
The problem was with "Co-codamol".
This command is not valid:

gen `drug'=1

because of the hyphen in Co-codamol, which is not allowed in a Stata variable name.

So a preliminary step was needed before running the loop:

replace analgesics= "Cocodamol" if analgesics=="Co-codamol"

and then the local list will be:

local list "pregabalin Cocodamol Carbamazepine paracetamol"
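An alternative that leaves the data values untouched, sketched on made-up data (the patients and drug records below are illustrative assumptions, not the real files): derive a legal variable name with strtoname() and build each indicator within patient, so no per-drug merge is needed.

```stata
* Toy prescription records: several rows per patient
clear
input patid str30 analgesics
1 "pregabalin"
1 "Co-codamol"
2 "paracetamol"
2 "paracetamol"
3 "ibuprofen"
end

local list `"pregabalin Co-codamol Carbamazepine paracetamol"'
foreach drug of local list {
    * Legal variable name for this drug (hyphen becomes underscore)
    local vname = strtoname("`drug'")
    * 1 if the patient has at least one record for this drug, else 0
    bysort patid: egen byte `vname' = max(analgesics == "`drug'")
}

* Collapse to one row per patient
bysort patid: keep if _n == 1
drop analgesics
list, clean
```

The result has one row per patient, so, assuming the main dataset is also keyed on patid, it could be combined with it through a 1:1 or m:1 merge on patid.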

Thank you for all your replies

Danah