Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • SQL commands in Stata

    Hi all,

    Suppose I load a dataset in Stata and would like to explore the data by writing SQL queries, Could anyone point me to the procedure?

    Thanks,
    Ramesh

  • #2
    Stata does not "speak" SQL. The only role of SQL in Stata that I am aware of is that a SQL command can be passed to an SQL database through the -odbc- command. But then you are not using Stata to process data, just to load it or write it out.

    Comment


    • #3
      I would also add that for data exploration purposes - aside from generating a cube - it is far easier and more efficient to use Stata commands.

      Comment


      • #4
        Originally posted by Clyde Schechter View Post
        Stata does not "speak" SQL. The only role of SQL in Stata that I am aware of is that a SQL command can be passed to an SQL database through the -odbc- command. But then you are not using Stata to process data, just to load it or write it out.
        Hi Clyde

        May I ask another question regarding SQL and Stata. If I use Stata to run SQL queries on a SQL Server database, how can I load the results into a Stata variable?

        Comment


        • #5
          I haven't used the -odbc- command in many years--it just doesn't come up in my work. So all I can tell you is that this is what -odbc- does. But I no longer remember any of the details of how it is used. Read the -help odbc- helpfile first and the linked sections of the PDF manuals second. If that doesn't clear it up for you, post back, and perhaps somebody who is current on the use of -odbc- will respond.

          Comment


          • #6
            Originally posted by Xiaoke Ye View Post
            If I use Stata to run SQL queries on a SQL Server database, how can I load the results into a Stata variable?
            It would be something along the lines of
            Code:
            odbc load, exec("SELECT * FROM myschema.MyTable WHERE whatever = 'criterion';") dsn(MyDatasource)

            Comment


            • #7
              odbc load, exec("SELECT * FROM myschema.MyTable WHERE whatever = 'criterion';") dsn(MyDatasource)
              I am using Stata 15.1 and this works for me.
              Code:
              odbc load, exec("SELECT * FROM dsw_db.surveyor")
              . But when I use the table("MyTable") option
              Code:
              odbc load, table("surveyor") dsn(pgm_db)
              Iget this error "The ODBC driver reported the following diagnostics
              [MySQL][ODBC 8.0(w) Driver][mysqld-5.6.44-log]You have an error in your SQL syntax; check the manual that corresponds to your
              MySQL server version for the right syntax to use near '"surveyor"' at line 1
              SQLSTATE=42000
              r(682);"



              Last edited by felix opero; 26 Jun 2020, 03:04.

              Comment


              • #8
                Dear all,

                I would like to JOIN two tables from SQL using the below executable query, the query is running well in SQL, however, when I run the query in Stata, I get the following error message "variable address_id already defined"

                odbc load, exec("SELECT * FROM customer JOIN address ON customer.address_id=address.address_id") dsn("company") clear

                I would appreciate your kind help if there is any way to solve this error.

                Thank you,
                Last edited by Rabih El Habta; 05 Aug 2023, 07:22.

                Comment


                • #9
                  Your select return both address_id cols so some revision is needed. The following code is not tested, but should return only one address_id
                  Code:
                  SELECT * FROM customer JOIN address using(address_id);

                  Comment


                  • #10
                    Dear Bjarte,

                    Thank you very much for the suggestion, nevertheless it did not work, now I am getting the following error:

                    "address_id" is not a recognized table hints option


                    P.S: same error message in SQL and Stata.

                    Comment


                    • #11
                      #9 does not work in MS SQL server. You can specify the cols to keep and use alias (not tested):
                      Code:
                      SELECT c.*, a.address_id AS a_address_id FROM customer c join address a ON c.address_id=a.address_id

                      Comment


                      • #12
                        Dear Bjarte,


                        Thank you so much for your valuable suggestions, the code provided (tested in both MySQL and SQL Server) does not join the two tables but rather it creates an alias a_address_id -picture below from MySQL.

                        I am using the Sakila database, if you would to test any other codes that work on both SQL types (MySQL and SQL Server).

                        Since the code in #9 worked on MySQL, I am considering using MySQL instead of SQL Server, which SQL type would you recommend where all SQL queries can be executed from Stata?

                        Thank you again,

                        Rabih.

                        Click image for larger version

Name:	Screenshot 2023-08-06 100440.png
Views:	1
Size:	166.8 KB
ID:	1723017

                        Comment


                        • #13
                          The minimal example in #11 should be modified by explicitly defining all cols from address to select in addition to address_id. ( using the column names in the SQL queries is best practice).

                          Comment

                          Working...
                          X