Batch Import SAS Value Labels

Sarah Arron

Join Date: Dec 2014

Posts: 5
#1

Batch Import SAS Value Labels

15 Dec 2014, 17:36

I am working with a Stata formatted dataset from an organization that sent me the value labels in SAS format. I have Stata data in one file and then a separate Excel spreadsheet that has four columns: value label, value name, data type (not useful for Stata) and current data field value. The variable names that use those value labels are not included in that spreadsheet.

I am trying to create a .do file that will generate the value labels I want from the Excel sheet, so that I can label the variables with the appropriate values.
For each label in column 2, label define based on columns 4 and 1. Thus, for the table example below:
label define ABO 1 "O" 2 "A" 3 "B" 4 "AB"
label define ABOMAT 1 "Identical" 2 "Compatible" 3 "Incompatible"
And then assign them to the .dta file?

I appreciate any advice!
Thanks,
Sarah

Data Field Formatted Value SAS ANALYSIS FORMAT DATA TYPE Data Field Value

O ABO C 1

A ABO C 2

B ABO C 3

AB ABO C 4

Identical ABOMAT C 1

Compatible ABOMAT C 2

Incompatible ABOMAT C 3
Tags: None

ben earnhart

Join Date: May 2014
Posts: 1027

15 Dec 2014, 18:12

This should get you well on your way. Good luck sorting out which variables go with which formats; you didn't give any info on that aspect.

Code:

clear

*========I'M ASSUMING RAW TEXT WITH NO HEADER ROW.   ADJUST ACCORDINGLY TO IMPORT EXCEL
*========AND NAMES OF VARIABLES IF IT HAS A HEADER ROW
import delimited C:\data\trash1\sasformats.txt, delimiter(space)

*========PUT SECOND AND FOURTH COLUMNS (VALUE LABEL AND NUMERIC VALUE) FIRST
order v2 v4
*========DATA TYPE IRRELEVANT, DROP IT
drop v3

*========WRAP TEXT FOR LABEL IN QUOTATION MARKS; CHAR(34) IS A DOUBLE QUOTE
replace v1=char(34)+v1+char(34)

*========GET RID OF REPEATS IN VALUE LABEL NAME COLUMN, AND TACK ON "label define "
bysort v2: replace v2="" if _n!=1
replace v2="label define " + v2 if length(trim(v2))>0

*=========ADD CONTINUATION MARKER WHERE NECESSARY
replace v1=v1+" ///" if length(v2[_n+1])==0 & _n<_N

*=========OUTPUT RESULTS!
outfile using "C:\data\trash1\mylabels.do", noquote replace

Last edited by ben earnhart; 15 Dec 2014, 18:40.

Comment

David Radwin

Join Date: Mar 2014

Posts: 368
#3

17 Dec 2014, 11:34

Also, see http://www.stata.com/statalist/archi.../msg00349.html .

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
Comment
Sarah Arron

Join Date: Dec 2014

Posts: 5
#4

18 Dec 2014, 16:01

Brilliant, thanks so much! I tweaked it a bit to make it work.
For anyone finding this thread in the future, for UNOS STAR File SAS data conversion to Stata, this should get you started. Note that C type variables (with string values) have to be dropped, but these usually don't need value labels anyway, they are self-explanatory.

order SASANALYSISFORMAT DataFieldValue DataFieldFormattedValue
replace DataFieldFormattedValue = char(34)+DataFieldFormattedValue+char(34)
bysort SASANALYSISFORMAT DataFieldValue: replace DataFieldValue="" if _n!=1
replace DataFieldValue ="." if DataFieldValue ==""
replace DataFieldValue ="7777777" if DataFieldValue =="**OTHER**"
replace DataFieldValue ="9999999" if DataFieldValue =="Null or Missing"
drop if DATATYPE=="C"
destring DataFieldValue, replace
drop DATATYPE
replace SASANALYSISFORMAT ="label define "+SASANALYSISFORMAT if length(trim(SASANALYSISFORMAT))>0
gen str add = ", add"
outfile using "Outfile.do", noquote replace wide
Comment

Data Field Formatted Value	SAS ANALYSIS FORMAT	DATA TYPE	Data Field Value
O	ABO	C	1
A	ABO	C	2
B	ABO	C	3
AB	ABO	C	4

Identical	ABOMAT	C	1
Compatible	ABOMAT	C	2
Incompatible	ABOMAT	C	3

Announcement

Batch Import SAS Value Labels

Comment

Comment

Comment