  • Flat vs. Relational Database for use with Stata

    Hello,

    I'm an entry-level optimization analyst working at a company which utilizes a flat database (Excel) in conjunction with Stata 12.

    I am sure that a relational database (Access, SQL, etc.) would be far more efficient, just based off of how a relational database generally works, but I can't find anything online stating that relational databases are far superior for use with Stata specifically, let alone anything quantifying this stance.

    Can anyone provide any kind of data or information to show that a relational database will speed up our process significantly? At present, one of our production runs takes 9 hours to complete. Any time I mention using a relational database, my boss writes it off and tells me not to bother with it--that we need to optimize our code for use with Excel before bothering about a relational database.

    I'm glad to learn it on my own and then prove it to my boss, but I want to *know* that moving to a relational database would be worth the time before doing so.

  • #2
    It depends on what exactly you are doing with Excel and how the data gets there. In general, yes, SQL is way faster, but YMMV depending on the size and structure of the DB.


    • #3
      I expect you are correct that a database would speed up your process. With that said, though, unless your programs do nothing but read and write Excel files, it's hard to see how a significant proportion of the 9 hours of statistics could be due to the use of Excel. I assume you've used timer to assess how much time is spent in import, export, and putexcel commands. I also assume that you avoid writing data to an Excel file and then reading that same data back in from Excel in the same program. (That is, that you instead use Stata-format datasets or other techniques for storage and retrieval of intermediate results.)
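
      To make the timing suggestion concrete, here is a minimal sketch of how timer could be wrapped around the Excel steps to see what share of the run they account for. The file names (input.xlsx, results.xlsx) and the placeholder for the analysis step are assumptions for illustration only, not taken from the original poster's code.

      ```stata
      * Sketch only: file names below are hypothetical placeholders.
      timer clear

      * Timer 1: reading the Excel input
      timer on 1
      import excel using "input.xlsx", firstrow clear
      timer off 1

      * Keep intermediate results in Stata format rather than
      * writing them to Excel and reading them back in.
      tempfile stage1
      save "`stage1'"

      * Timer 2: the actual analysis (placeholder)
      timer on 2
      use "`stage1'", clear
      * ... estimation / optimization commands would go here ...
      timer off 2

      * Timer 3: writing the Excel output
      timer on 3
      export excel using "results.xlsx", firstrow(variables) replace
      timer off 3

      * Report elapsed seconds for each timer
      timer list
      ```

      If timers 1 and 3 turn out to be a small fraction of the total, replacing Excel with a database is unlikely to shorten the run by much, and the time is better spent on the code measured by timer 2.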

      From your boss's point of view, if there is currently no database expertise in your organization, and if it is running so lean that you continue to use Stata 12 rather than pay the expense to upgrade to Stata 13 or now Stata 14, then I understand why your boss would be reluctant to move from a simple process to one that requires adding expertise, software expense, and perhaps hardware expense to the organization's budget.
