My colleagues and I are having trouble importing a number of pipe (“|”) delimited .txt files, each 5-10 GB in size. The data are stored on a remote server, and we use a remote desktop connection to access a network copy of Stata/IC 15.1 (64-bit) installed on a different server in the same virtual network. The OS is Windows Server 2016 Standard, configured as an RDS VM in Azure.
The issue is that when we import the datasets, Stata sometimes does not read in all observations. This happens sporadically: on some occasions the import returns the correct number of observations, and on others it returns fewer. We have not been able to identify any common factor behind the short imports.
As an example, the code below shows two attempts to import the same file, which we know contains 101,809,217 observations:
Code:
. import delimited using "`filename'", clear delimiters("|")
(10 vars, 101,809,217 obs)

. import delimited using "`filename'", clear delimiters("|")
(10 vars, 39,894,187 obs)

Does anyone know why this is occurring?
Thanks