Hi,
Does anyone use the -odbc- command extensively to work with remote databases?
I understand how to use it to send queries to the remote db to, for example, join tables that are already on the db server and then bring the result to a local Stata dataset (or frame).
However, I would like to be able to easily use the Stata dataset in memory as part of the query-- for example, suppose I have a list of study participant IDs in a local dta file (maybe 100,000 IDs) and want to send that list to the remote database to query for records from some very large tables (billions of rows).
Pulling down the entire remote db table using odbc load and merging locally is not an option.
I can do this kind of thing in R (or SAS) but would like to be able to do it better in Stata (because I prefer Stata to R and SAS for most other things).
The only way I can figure out to do this entirely from Stata is that I must explicitly create a table on the remote db, then insert the local data, then run the query?
This method could get a little tedious if I needed to send more than just one column to the remote temporary table, because I would have to declare the whole table in the SQL to create it. (For example if I might need to join on ID, and use a date in the local dataset in a where clause against a column in the remote db, etc.).
Also it would be nice if I could use temporary (volatile) tables on the remote server (e.g., in MS SQL #mytemptable) but I have not been successful with this.
So although I can do what I want, it is a bit clumsy and I am wondering if anyone has a better method.
Thanks,
Scott
Does anyone use the -odbc- command extensively to work with remote databases?
I understand how to use it to send queries to the remote db to, for example, join tables that are already on the db server and then bring the result to a local Stata dataset (or frame).
However, I would like to be able to easily use the Stata dataset in memory as part of the query-- for example, suppose I have a list of study participant IDs in a local dta file (maybe 100,000 IDs) and want to send that list to the remote database to query for records from some very large tables (billions of rows).
Pulling down the entire remote db table using odbc load and merging locally is not an option.
I can do this kind of thing in R (or SAS) but would like to be able to do it better in Stata (because I prefer Stata to R and SAS for most other things).
The only way I can figure out to do this entirely from Stata is that I must explicitly create a table on the remote db, then insert the local data, then run the query?
Code:
odbc exec("CREATE TABLE myschema.IDs (ID bigint);") dsn("MyDatabase") odbc insert ID, table("myschema.IDs") dsn("MyDatabase") odbc load, exec("SELECT i.ID, a.whatever FROM myschema.ID as i INNER JOIN otherschema.VeryLargeTable as a on i.ID = a.ID; ") dsn("MyDatabase") odbc exec("DROP TABLE myschema.IDs;") dsn("MyDatabase")
Also it would be nice if I could use temporary (volatile) tables on the remote server (e.g., in MS SQL #mytemptable) but I have not been successful with this.
So although I can do what I want, it is a bit clumsy and I am wondering if anyone has a better method.
Thanks,
Scott
Comment