  • Stumped by problem loading files w/ long paths - any ideas?

    Hello everyone,

    I've gotten stumped by a repeated issue loading files through long paths.

    To put it briefly, I am trying to load a file in another directory through a long relative path, from a pwd with a long path itself. I can confirm that the target directory exists, and that the file in it exists. I can cd to the target directory via the relative path, and after doing so can load the file by just its name. But I can't load the file through its relative path, from the original pwd.

    I've tried experimenting, and this issue appears to arise when the pwd and the relative path, concatenated, exceed 260 characters (the traditional value of the MAX_PATH limit in Windows).

    The thing is, I'm running Stata 17.0 MP4 on Windows 10 release 21H1, with long paths enabled. I don't think this should be a binding system limit. Has anybody else run into this issue?

    Some example code illustrating the issue is below:

    Code:
    . pwd
    C:\Users\dgross\Dropbox (Personal)\Research\xxxxxx Project\Paper 1 (Main)\Analysis\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    
    . global usptoDir="../../../../Data Collection/USPTO Datasets"
    
    . local dirname="xxx Firms with Focal (Top 1000) Words"
    
    . local fname="xxx_token_matches_terms"
    
    . use "$usptoDir/Derivative Datasets/`dirname'/`fname'.dta", clear
    file ../../../../Data Collection/USPTO Datasets/Derivative Datasets/xxx Firms with Focal (Top 1000) Words/xxx_token_matches_terms.dta
        not found
    r(601);
    
    . confirmdir "../../../../Data Collection/USPTO Datasets/Derivative Datasets/xxx Firms with Focal (Top 1000) Words/"
    
    . disp _rc
    0
    
    . capture confirm file "../../../../Data Collection/USPTO Datasets/Derivative Datasets/xxx Firms with Focal (Top 1000) Words/xxx_token_matches_terms.dta"
    
    . disp _rc
    601
    
    . cd "../../../../Data Collection/USPTO Datasets/Derivative Datasets/xxx Firms with Focal (Top 1000) Words/"
    C:\Users\dgross\Dropbox (Personal)\Research\xxxxxx Project\Data Collection\USPTO Datasets\Derivative Datasets\xxx Firms with Focal (Top 1000) Words
    
    . capture confirm file "xxx_token_matches_terms.dta"
    
    . disp _rc
    0
    Note that in this snippet the files/directories with the x's are ones where I've masked the name for this post, replacing every character with 'x' but keeping it the same length.
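
    The transcript above also suggests a stopgap while the underlying cause is unclear: since cd accepts the relative path and the file then loads by name, the load can be wrapped in a change of directory. A minimal sketch, assuming $usptoDir, `dirname', and `fname' are defined as in the example and that everything runs in the same do-file or session (the local names here and rc are illustrative, not part of the original code):

    ```stata
    * Remember where we started so we can come back afterwards
    local here "`c(pwd)'"

    * cd works through the long relative path, as shown in the transcript
    cd "$usptoDir/Derivative Datasets/`dirname'"

    * Load by bare file name; capture the return code so the
    * original directory is restored even if the load fails
    capture noisily use "`fname'.dta", clear
    local rc = _rc

    cd "`here'"
    if (`rc') exit `rc'
    ```

    This only sidesteps the problem; it does not explain why the same file is not found through the full relative path.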

    The only closely related thread I've found is this one, but it didn't help me crack this.

    Thanks in advance!
    Dan


  • #2
    Note that in this snippet the files/directories with the x's are ones where I've masked the name for this post, replacing every character with 'x' but keeping it the same length.
    This seems to contradict that.
    Code:
    . cd "../../../../Data Collection/USPTO Datasets/Derivative Datasets/xxx Firms with Focal (Top 1000) Words/"
    C:\Users\dgross\Dropbox (Personal)\Research\xxxxxx Project\Data Collection\USPTO Datasets\Derivative Datasets\xxx Firms with Focal (Top 1000) Words
    and suggests that "../../../../" is in fact "C:/Users/dgross/Dropbox (Personal)/Research/xxxxxx Project/".

    Not saying that's the problem, just saying that the statement about x's seems inaccurate.


    • #3
      Thanks for engaging. I'm sorry if that was unclear -- it's true that the relative path "../../../../", from the pwd when the cd command was run, points to the directory you noted.

      Here's another view of the issue:

      When my file is named "ter.dta" or anything shorter, this command executes successfully:
      Code:
      use "C:/Users/dgross/Dropbox (Personal)/Research/xxxxxx Project/Paper 1 (Main)/Analysis/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/../../../../Data Collection/USPTO Datasets/Derivative Datasets/xxx Firms with Focal (Top 1000) Words/ter.dta", clear
      When the file is named "term.dta" or anything longer, the command fails with a r(601) file not found error.

      That the long-path issue becomes binding at 260 characters (it actually seems to be 259, by this re-testing) is curious given the Windows 260-character MAX_PATH limit. I have tried reproducing this behavior in Python using os.path.exists, and can't reproduce it: Python finds the file at the above location with the longer file name, when Stata doesn't.
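
      For anyone who wants to repeat the length check, the string handed to use can be measured directly in Stata. This is only a sketch: it assumes the relevant quantity is the length of the path string exactly as typed (with the "../" segments left in), and the local names p and len are illustrative.

      ```stata
      * Path copied from the command above, with the longer file name
      local p "C:/Users/dgross/Dropbox (Personal)/Research/xxxxxx Project/Paper 1 (Main)/Analysis/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/../../../../Data Collection/USPTO Datasets/Derivative Datasets/xxx Firms with Focal (Top 1000) Words/term.dta"

      * Length of the path string as Stata sees it
      local len : strlen local p
      display "path length: `len'"

      * 0 means the file was found; 601 means file not found
      capture confirm file "`p'"
      display "confirm file rc: " _rc
      ```

      Swapping "term.dta" for "ter.dta" and rerunning shows the two lengths on either side of the point where the behavior changes.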

      From my broader reading, it seems that even when long paths are enabled in Windows (thru the Group Policy editor or the registry), individual applications have to be written to engage with it (I don't understand the issue well enough yet to be more specific than that). I do wonder if it is a bug or functionality issue. Or maybe I'm crazy!

      Dan


      • #4
        Like you, I cannot believe that the MAX_PATH limit being the same as your empirical limit is a coincidence. I had thought that Dropbox was perhaps getting in the way - there are occasional reports of issues with network drives, including Dropbox synchronization - but what you write seems much more likely.

        I regret that as a macOS user I cannot reproduce your empirical testing.

        I'd recommend you engage with Stata Technical Services at

        https://www.stata.com/support/tech-support/

        and see what they have to say.

        You get extra points for providing a reproducible example that doesn't depend on your system or on the pre-existence of a directory hierarchy. Here's a starting point.
        Code:
        local d `c(pwd)'
        local l : strlen local d
        
        local s 100000000
        
        while `l'<250 {
            local d `d'/`++s'
            mkdir "`d'"
            local l : strlen local d
        }
        
        macro list _l _d
        Code:
        . local d `c(pwd)'
        
        . local l : strlen local d
        
        .
        . local s 100000000
        
        .
        . while `l'<250 {
          2.     local d `d'/`++s'
          3.     mkdir "`d'"
          4.     local l : strlen local d
          5. }
        
        .
        . macro list _l _d
        _l:             256
        
                        _d:             /Users/lisowskiw/Downloads/100000001/100000002/100000003/1000000
                        > 04/100000005/100000006/100000007/100000008/100000009/100000010/100000011/10000
                        > 0012/100000013/100000014/100000015/100000016/100000017/100000018/100000019/100
                        > 000020/100000021/100000022/100000023
        
        .
        Last edited by William Lisowski; 20 Jan 2022, 08:35.
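
        One possible way to extend that starting point into a full test is sketched below. It assumes it runs in the same do-file as the loop above, so that `d' still holds the deepest directory, and that the while cutoff has been lowered from 250 to 230 so the file paths tried here fall on both sides of 260 characters. The variable x and the locals stem, p, len, rc_save, and rc_use are illustrative names.

        ```stata
        * One-observation dataset to write into the deep directory
        clear
        set obs 1
        generate x = 1

        * Try file names of 1 to 30 characters and record what happens
        local stem
        forvalues n = 1/30 {
            local stem `stem'a
            local p "`d'/`stem'.dta"
            local len : strlen local p

            * Return codes: 0 means success, nonzero means failure
            capture save "`p'", replace
            local rc_save = _rc
            capture use "`p'", clear
            local rc_use = _rc

            display "length `len': save rc = `rc_save', use rc = `rc_use'"
        }
        ```

        If the return codes switch from 0 to nonzero at a particular length, that length is the one to report to Technical Services along with the Stata and Windows versions.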


        • #5
          Thank you! Grateful for your help, and for the pointer.

          Dan
