Apologies if this has been asked before; I couldn't find a similar situation:
I'm developing a do-file to take CSV files from our information management system, merge them into one large database, and create reports based on the descriptive statistics. I've run into a problem importing one of the databases into Stata, and it seems to be related to how the system-generated CSV files handle commas within a value.
The code is just your standard import delim:
import delim "$repdir\registros.csv"
The database is the initial registry when someone requests our organization's services for the first time, and includes a short questionnaire with several open fields. This is how some observations are imported (database is in Spanish):
clave_registro
14731
14732
PLANTAS EXÓTICAS Y CAPACITACIÓN DE ORQUÍDEAS",Tengo una idea de negocio que me gustaría desarrollar,,0,0,0,0,0,,,,,Ninguna de las anteriores,,No,,,,,,,,,,0,,,,,Si,No,Sin IDE,,Activo
14737
14740
Looking at the file's structure with Excel's text import function, it appears that the system normally separates values with a bare comma, but when a value itself contains a comma, it wraps that value in quotation marks. The problem, it seems, is that Stata treats every comma as a delimiter: because the quotation marks are not present around all values, it appears to ignore them and so fails to split the values correctly.
I've tried using the delimiters("chars") option with several different specifications, but with no luck.
I use Stata/MP 13.1, 64-bit version. Also, I've never had this issue with Excel 2013.
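One documented option of import delimited that targets quoted fields is bindquotes(). What follows is only a minimal sketch, not a confirmed fix. It assumes that the quoted value also contains a line break (which the stray closing quote in the imported row above suggests), that the first row of the file holds the variable names, and that the option is available in this Stata version:

```stata
* Sketch under the assumptions stated above; not verified against registros.csv.
* bindquotes(strict): once an opening double quote is found, keep reading
* (across line breaks if needed) until the matching closing quote, so commas
* and line breaks inside the quotes stay part of a single value.
* varnames(1): treat the first row as variable names (an assumption here).
import delimited using "$repdir\registros.csv", ///
    delimiters(",") varnames(1) bindquotes(strict) clear

* Quick check: the key variable should no longer contain fragments of text.
list clave_registro in 1/10
```

If clave_registro is still read as a string after this, some rows are probably still being split, and counting the rows against the source file would help locate them.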