Hello Statalist users,
I have found a lot of information on the duplicates command; however, most of it concerns dropping duplicate observations or renaming variables. Instead, I want to rename my duplicate observations. The data I received have categories (e.g., crops, livestock) with multiple subcategories. Each category and subcategory is imported as an observation, and I will eventually be transposing these data. However, some subcategories have the same label under different categories. I need to relabel these subcategory observations in order to transpose.
Below is a simplified version of my data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str23(statename B)
"Crop"        "1"
"consumption" "2"
"adjustment"  "3"
"Animal"      "4"
"consumption" "5"
"adjustment"  "6"
end
I have found a quick fix; however, it is not dynamic.
Code:
* Flag repeated names: 0 if the name is unique, otherwise its running count
sort statename
quietly by statename: gen dup = cond(_N==1, 0, _n)
tabulate dup
* Every repeat (dup > 1) gets the same hard-coded name
replace statename = "consumption_crop" if dup>1
* Re-flag: the rows just renamed now duplicate each other
sort statename
quietly by statename: replace dup = cond(_N==1, 0, _n)
tabulate dup
* The remaining repeat gets a second hard-coded name
replace statename = "adjustment_crop" if dup>1
* Return to the original row order
sort B
Again, this works for this situation, but I want the code to be dynamic and to handle changes in the duplicated names, not just the current names.
I was trying to adapt the code provided in the discussion about renaming variables, as it seemed relevant. However, I could not work out how to edit Daniel Klein's suggested code, posted below. I recognize that permname would help in this process, but I am not sure how to use it to rename a duplicated observation instead of a variable.
Code:
// Create data set
clear
input str23 A str23 B // note str23
"This is my desired name" "This is my desired name"
"9098" "8676878"
end
// rename
foreach var of var A-B {
    loc original_text : di `var'[1]
    loc newname = strtoname(`"`original_text'"')
    loc newname : permname `newname'
    ren `var' `newname'
    char `newname'[original_text] `"`original_text'"'
}
d ,f l
char l
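To make the goal concrete, here is a minimal sketch of the kind of dynamic renaming I am after, run on the example data only. It does for observations roughly what permname does for variable names: instead of hard-coding the new names, it appends an occurrence number to every repeat of a name. The variables order and occurrence are helper names introduced purely for illustration, and the sketch assumes that no suffixed name such as "consumption_2" already exists in the data.

```stata
* Rebuild the example data
clear
input str23(statename B)
"Crop"        "1"
"consumption" "2"
"adjustment"  "3"
"Animal"      "4"
"consumption" "5"
"adjustment"  "6"
end

* Remember the original row order so it can be restored afterwards
* (order is a hypothetical helper variable)
gen long order = _n

* Number the occurrences of each name, in original row order
* (occurrence is a hypothetical helper variable)
bysort statename (order): gen long occurrence = _n

* Leave the first occurrence alone; append the number to later ones
replace statename = statename + "_" + string(occurrence) if occurrence > 1

* ASSUMPTION: the suffixed names do not collide with existing names.
* -isid- exits with an error if statename is still not unique.
isid statename

* Restore the original row order and inspect the result
sort order
list statename B, noobs
```

Because nothing in this depends on the specific strings "consumption" or "adjustment", it should keep working if the duplicated names change. It does not attach the category name (crop, animal) to the subcategory, which would require some rule for telling category rows apart from subcategory rows.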
I appreciate any and all information regarding this topic.
Regards,
Amie