  • Stata: I/O error writing .dta file

    Hi everyone!
    I've got a problem when trying to merge two big datasets, borrow.dta and return.dta.
    I got the following message (the master is borrow.dta):

    Code:
    merge 1:1 ID_time using return.dta
    I/O error writing .dta file
        Usually such I/O errors are caused by the disk or file system being full.
    Could you please tell me how to deal with this?
    Or do I just need a PC with more memory or a bigger disk?

    Thanks a lot!!


  • #2
    You wouldn't think the -merge- command would need to write to disk, but it does run a -save- behind the scenes if the data are not sorted. You could -set trace on- and rerun your program to find out where and what Stata was trying to write.
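
    A minimal sketch of that, assuming the same -merge- call as in the original post (-set tracedepth- is optional and just keeps the trace output readable):

    Code:
    set tracedepth 1      // optional: show only the top level of the trace
    set trace on          // echo each command Stata executes internally
    merge 1:1 ID_time using return.dta
    set trace off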



    • #3
      Originally posted by [email protected] View Post
      Thanks for your help. I tried -save- and it could not save either.
      What should I do to cope with this?



      • #4
        I hope this material helps you:
        https://back.nber.org/stata/efficient/bigio.html



        • #5
          I wasn't clear: I expect that if you made sure both datasets were sorted (in Stata, so that -merge- knew they were sorted) before the -merge- command, that might avoid the -save- that -merge- executes behind the scenes.
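
          A rough sketch of that, assuming ID_time is the merge key; note that the intermediate -save- itself needs disk space:

          Code:
          use return, clear
          sort ID_time
          save return, replace      // stores the sort order in the file
          use borrow, clear
          sort ID_time
          merge 1:1 ID_time using return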

          Have you thought about exactly how big the merged file will be? Will it fit in your core memory? Will you want to save it to disk? You should be able to estimate the size. -des- will return the size of one row of the dataset in core:

          Code:
          des
          di r(width)
          You can use that information, together with the number of rows in each dataset, to estimate the size of the merged file and compare it to the free space on your disk. If n1 and n2 are the numbers of rows and w1 and w2 the widths, then the worst case for the merged file size is (n1+n2)*(w1+w2) bytes, and each matched row reduces that by w1+w2 bytes.
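
          For example, a sketch of that estimate, using the dataset names from the original post (r(N) and r(width) are the row count and row width that -des- leaves behind):

          Code:
          use borrow, clear
          des
          local n1 = r(N)
          local w1 = r(width)
          use return, clear
          des
          local n2 = r(N)
          local w2 = r(width)
          * worst case: no rows match, so every row carries both sets of variables
          di "worst-case merged size: " (`n1' + `n2') * (`w1' + `w2') " bytes"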

          There is a posting somewhere saying that Stata needs core memory equal to the worst case, even if all the records match. I don't know if this is correct, or ever was, but if so you might divide the using dataset into pieces, merge the pieces, and append the results into one dataset.
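
          A sketch of that piecewise idea; the filenames are made up, and this version keeps only the matched rows (unmatched rows would need separate handling):

          Code:
          * split the using dataset into two halves by row number
          use return, clear
          keep if _n <= _N/2
          save return_part1, replace
          use return, clear
          keep if _n > _N/2
          save return_part2, replace
          * merge the master against each piece, keeping only matches
          forvalues i = 1/2 {
              use borrow, clear
              merge 1:1 ID_time using return_part`i', keep(match) nogenerate
              save merged_part`i', replace
          }
          * stack the two result files back into one dataset
          use merged_part1, clear
          append using merged_part2
          save merged, replace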

          When you have really big files, you have to be conscious of the size, and plan for it.



          • #6
            Originally posted by Chen Samulsion View Post
            Thanks so much, very helpful!



            • #7
              Originally posted by [email protected] View Post
              Thanks for your answer; I may need a little more time to digest it ^_^

