Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Java heap size

    Can anyone provide any advice on the Java heap size required to load a large table from a PostgreSQL database? I have stored a table in PostgreSQL for which the Stata dta file is roughly 860 MB. I have used the command

    Code:
    java set heapmax 16g
    to set the java_heapmax parameter to 16384m (confirmed by java query). Still I am getting a message "java.lang.OutOfMemoryError: Java heap space" even though the heap space is more than 16 times the Stata dataset size. This ratio seems extreme. Or is there some other parameter that I need to change?

    I have noticed that Stata tends to store strings extracted from SQL databases as strL even though the PostgreSQL storage format is varchar(15) - i.e. a varying number of characters with a maximum of 15. That applies to two variables. The rest are either integers or floating point numbers.

    Incidentally, in case anyone queries it my system has 32 GB of RAM with over 31 GB available.

  • #2
    I'm sorry to hear about your troubles with the jdbc command. If you have time, I need a little more information. First, what jdbc driver are you using? If I know the exact jdbc driver number, I can download the driver and try to reproduce the problem.

    Second, what happens if you exclude the String/Varchar variables from your jdbc load command? Create a temporary table that is a copy of the original table. Next, drop the String variables from the temporary table. Last, load the temporary table into Stata.

    Last, try using the rows(#) option of jdbc load to limit the fetch results set of the table in question.

    Comment


    • #3
      @KevinCrow - I am sorry that I didn't respond to your original reply. I found a workaround and got distracted by other priorities. Also there don't seem to be many people on Statalist with an interest in or familiarity with jdbc. Unfortunately, I keep on running into difficulties related to jdbc. There is an earlier post today with a related query.

      To answer your queries: I am using postgresql-42.7.3.jar driver for PostgreSQL which, according to my file records, was downloaded on 23May2024. As my separate post implies, I am unable to get the jdbc load, exec() variant of load to work properly, but I have more success when relying on a sequence of jdbc exec ... commands.

      Even then I have still run into a problem. One of the tables in my database has 2 variable of type enum. Stata's jdbc load command tries to load them as type long and falls over when it encounters a string value. So instead I have done the following:

      Code:
      jdbc exec "SELECT * INTO TABLE temptab FROM devices;"
      jdbc describe "temptab"
      jdbc exec "ALTER TABLE temptab ALTER COLUMN type TYPE varchar;"
      jdbc exec "ALTER TABLE temptab ALTER COLUMN floor TYPE varchar;"
      jdbc load, exec("SELECT * FROM temptab;")
      The ALTER commands successfully convert the enum variables to type varchar (as confirmed by jdbc describe ...) but the load statement still fails in the same way. It appears that jdbc doesn't recognize the conversion to a string type.

      Comment

      Working...
      X