  • Looping over files in folders

    Hello,

    I need help with a task and was wondering whether the community could help me figure it out.

    I have two folders. Folder 1 contains document 1 (as PDF and doc/docx), while Folder 2 contains document 2 (as PDF).

    I need to convert the Word files to PDF and then use LaTeX to append the two documents, saving one consolidated document in any folder (preferably a new one). I have written some pseudocode for now:

    Code:
    ssc install fs, replace
    
        * This code converts all text files to pdf inside a folder in the current directory
        cd "C:/Users/Fahad Mirza/Desktop/Test"
        
        local folder = "./cover/"
        
        local files : dir "." files "*.doc*"
        
        foreach file of local files {
        docx2pdf `"`file'"'
        local name = substr(`"`files'"', 2, strpos(`"`files'"',".")-2)
        
        file open fh using   "`name'.tex", replace write
        file write fh            \documentclass{article}
        file write fh _n        \usepackage{pdfpages}
        file write fh _n        \begin{document}
        file write fh _n        \includepdf[fitpaper=true, pages=-]{"`name'.pdf"}
    
        fs "`folder'*`name'*"
        file write fh _n        \newpage
        file write fh _n        \includepdf[fitpaper=true, pages=-]{"`name'.pdf"}
        file write fh _n        \end{document}
        
        * Ending Tex File 
        file close fh
        type "`name'.tex"
        pause
        
        cd "C:/Users/Fahad Mirza/Desktop/Test"
        
        // Texifying -- This invokes tex studio
        !texify -p -c -b --run-viewer "`name'.tex"
        pause
    
    
        }

    For this task I am using TeXstudio, MiKTeX, and Stata. Essentially, the code has to convert the Word files to PDF, open a blank .tex file, write LaTeX code that uses the pdfpages package, and append the two documents. At the end, I invoke TeXstudio to convert the .tex file to PDF.
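
The LaTeX step of the workflow described above can be prototyped outside Stata. The following is a minimal sketch in Python, not the poster's code: `build_wrapper_tex` is a hypothetical helper name, and the file names in the usage line are placeholders. It only builds the text of a pdfpages wrapper for one pair of PDFs; compiling the result (for example with texify) is left out.

```python
def build_wrapper_tex(main_pdf: str, cover_pdf: str) -> str:
    """Return LaTeX source that places cover_pdf after main_pdf using pdfpages."""
    # LaTeX expects forward slashes in paths, even on Windows.
    main = main_pdf.replace("\\", "/")
    cover = cover_pdf.replace("\\", "/")
    lines = [
        r"\documentclass{article}",
        r"\usepackage{pdfpages}",
        r"\begin{document}",
        # pages=- includes every page of the PDF.
        rf"\includepdf[fitpaper=true, pages=-]{{{main}}}",
        rf"\includepdf[fitpaper=true, pages=-]{{{cover}}}",
        r"\end{document}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `print(build_wrapper_tex("report.pdf", "cover/report.pdf"))` shows the six-line wrapper for one pair. Paths that contain spaces may need extra handling, depending on the TeX distribution.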

    I would really appreciate it if someone could help me troubleshoot and review this code.

    This is a time-sensitive task for me, so if you need further details, feel free to ask.

  • #2
    It's not clear to me from the post whether the code you wrote worked. My suggestion would be to combine filelist (SSC) to find the files, !mv (see help shell) to move them into one directory (or look into the community-contributed mvfiles (SSC)), docx2pdf to convert the files, and Python to append all of the PDFs into one document. Al Sweigart's Automate the Boring Stuff with Python is available online and includes a worked example that does just that.

    See https://automatetheboringstuff.com/2e/chapter15/ and, in particular, the "Step 1: Find All PDF Files" section for an explanation of the code below.

    Code:
    import PyPDF2, os
      
    # Get all the PDF filenames.
    pdfFiles = []
    for filename in os.listdir('.'):
        if filename.endswith('.pdf'):
            pdfFiles.append(filename)
    pdfFiles.sort(key = str.lower)
    
    pdfWriter = PyPDF2.PdfFileWriter()
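
The snippet above stops once the writer object has been created. One possible continuation is sketched below, under some assumptions: it uses `PdfWriter` from the pypdf package (the successor to PyPDF2) rather than the `PdfFileWriter` class shown above, `list_pdfs` and `merge_pdfs` are hypothetical helper names, and the extension check is case-insensitive, which differs from the original.

```python
import os


def list_pdfs(folder="."):
    """Return the PDF filenames in folder, sorted case-insensitively."""
    names = [n for n in os.listdir(folder) if n.lower().endswith(".pdf")]
    return sorted(names, key=str.lower)


def merge_pdfs(folder, output_path):
    """Append every page of every PDF in folder to one output file."""
    # Imported here so that list_pdfs still works without pypdf installed.
    from pypdf import PdfWriter  # third-party package: pip install pypdf

    writer = PdfWriter()
    for name in list_pdfs(folder):
        writer.append(os.path.join(folder, name))
    # Write the output outside `folder`, or a rerun would pick it up as input.
    writer.write(output_path)
    writer.close()
```

A call such as `merge_pdfs("pdfs", "combined.pdf")` (both names are placeholders) would then combine the files in alphabetical order.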
    Hope this helps.


    • #3
      I was not able to execute the loop. I kept getting an "invalid file specification" error from within the loop. However, when I open a .tex file with a single command at the command prompt, I can open it and write to it without problems.

      I modified the code slightly but cannot figure out how to address this.

      Code:
          clear all
          
          * This code converts all text files to pdf inside a folder in the current directory
          cd "C:/Users/Fahad Mirza/Desktop/Test"
              
          local files : dir "." files "*.doc*"
          
          foreach file of local files {
          docx2pdf `"`file'"', replace
          local name = substr(`"`file'"', 1, strpos(`"`file'"',".")-1)
          display "`name'"
          
          file open pdf using   "`name'.tex", replace write
          file write pdf             \documentclass{article}
          file write pdf _n        \usepackage{pdfpages}
          file write pdf _n        \begin{document}
          file write pdf _n        \includepdf[fitpaper=true, pages=-]{"`name'.pdf"}
      
          file write pdf _n        \newpage
          file write pdf _n        \includepdf[fitpaper=true, pages=-]{"./cover letter/`name'.pdf"}
          file write pdf _n        \end{document}
          
          * Ending Tex File 
          file close pdf
          type test.tex
          pause
          
          cd "C:/Users/Fahad Mirza/Desktop/Test"
          
          * Using Tex Studio to convert
          !texify -p -c -b --run-viewer test.tex
          pause
      
      
          }

      Even if the loop only ends up creating the .tex file, that will be sufficient, because I can add the last lines later. The loop breaks at this line of code:

      Code:
       file open pdf using   "`name'.tex", replace write

      Regarding the solution you provided, I think it looks fantastic. However, I am not well versed in Python, so I would have to work through it from scratch to get it running.
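
As a small starting point for the Python route, the sketch below lists the base names that the loop would have to process, which is a useful check before any .tex file is opened. It rests on assumptions: `matched_stems` is a hypothetical helper name, and the folder layout (Word files in one folder, cover PDFs with the same base name in another) is inferred from the posts above. Note that `Path.stem` strips only the final extension, whereas `strpos()` in the Stata code finds the first dot in the file name, so the two can disagree for names such as `report.v2.docx`. Whether that is related to the error above is not established.

```python
from pathlib import Path


def matched_stems(main_dir, cover_dir):
    """Return base names with a Word file in main_dir and a PDF in cover_dir."""
    main = Path(main_dir)
    cover = Path(cover_dir)
    # As a precaution, skip Word's temporary lock files, which start with "~$".
    word_stems = {
        p.stem
        for p in main.iterdir()
        if p.suffix.lower() in {".doc", ".docx"} and not p.name.startswith("~$")
    }
    cover_stems = {p.stem for p in cover.iterdir() if p.suffix.lower() == ".pdf"}
    # Only names present in both folders can be combined.
    return sorted(word_stems & cover_stems)
```

Calling `matched_stems(".", "cover letter")` from the working directory would return the names for which both inputs exist; anything missing from the result points to a naming mismatch between the two folders.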

      In any case, help will be appreciated.
