Get folder name of all files in the subdirectories of a specific directory and write them into the files

Power User

    Mar 21, 2018#1

    I'm trying to write a script that can handle folder names. I'll explain.

    I have a lot of saved HTML files, each stored together with its associated files in its own folder.
    I need to process them so that their style sheet references include the folder name, like this:

    Current call:

    <link media="all" href="index.css" type="text/css" rel="stylesheet">

    Desired call:

    <link media="all" href="SubFolderBelow/index.css" type="text/css" rel="stylesheet">

    Current HTML file of the above example is in SubFolderBelow and it will be moved to an upper folder later, after processing.

    So CopyFilePath from macro commands is: C:\CurrentFolder\SubFolderBelow\CurrentFile.html

    That is the same folder of index.css: C:\CurrentFolder\SubFolderBelow\index.css
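
    As a minimal sketch of the first step, assuming the full path is available as a string with backslash separators, the folder name can be taken as the path component directly before the file name. The function name getContainingFolderName is hypothetical and not part of UltraEdit:

```javascript
// Hypothetical helper (not an UltraEdit function): return the name of the
// folder that directly contains the file given by its full Windows path.
function getContainingFolderName(fullPath) {
  var parts = fullPath.split("\\");
  // The last element is the file name, the one before it is the containing folder.
  return parts.length >= 2 ? parts[parts.length - 2] : "";
}

console.log(getContainingFolderName("C:\\CurrentFolder\\SubFolderBelow\\CurrentFile.html"));
// prints "SubFolderBelow"
```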

    I did a search here at the forum and found some information about this issue. But so far I can't find a way to achieve my goal.

    I found a very detailed and extensive script written by Mofi and other related topics:
    So far, it looks to me like this job is better done with a script than with a macro.

    Unfortunately, I spent too much time on trial and error without success.
    Using the regular expressions from this topic, I could not truncate the full file path as desired.
    The solution surely lies within the scripting capabilities.
    But it seems to be beyond my skills...
    🙁
    So, my choice is to ask for help here.

    This is the main question, but it would be very nice if I could run a script that replaces href=" with href="SubFolderBelow/
    AND
    if that script could run this Find/Replace on all HTML files in the subfolders below C:\CurrentFolder.

    Grand Master

      Mar 23, 2018#2

      It is definitely no problem to write a script for this task. But some requirements for coding the script are unclear.

      What is the directory structure exactly?
      • Root directory of offline website
        • SubdirectoryA
          • file_a1.html
          • file_a2.html
        • SubdirectoryB
          • file_b1.html
          • file_b2.html
        • file_root1.html
        • file_root2.html
      Or are there even more levels of directories like in this second example:
      • Root directory of offline website
        • SubdirectoryA
          • SubdirectoryX
            • file_a_x1.html
            • file_a_x2.html
          • SubdirectoryY
            • file_a_y1.html
            • file_a_y2.html
          • file_a1.html
          • file_a2.html
        • SubdirectoryB
          • SubdirectoryZ
            • file_b_z1.html
          • file_b1.html
          • file_b2.html
          • file_b3.html
        • file_root1.html
        • file_root2.html
        • file_root3.html
      What should all those HTML files contain? The path of each file without path of root directory? Please post examples for one file in each directory level.

      Is it possible that some directory names contain characters not being matched by character class [0-9A-Za-z_-] which would require URL encoding those characters in the paths?

      What is the character encoding of the HTML files? ANSI or UTF-8 or UTF-16?

      Is it necessary that the script should have also the capability to update/fix CSS file references instead of just adding the path (different regular expression replace)?

      Where in the above directory structures are the CSS files stored that are referenced from the HTML files?

      It would be best if you could compress into a ZIP or RAR or 7z file some files with the right directory structure as they are before script execution, plus one more directory structure with the same files in which you manually edited the CSS file references to show us how the files should look after script execution. Then attach the archive file to your next post. This would save us a lot of time as we don't need to create the directory structure and files for testing by ourselves, and we could verify that the result after script execution is really exactly what you want. Before and after examples are the best way to specify a coding task.
      Best regards from an UC/UE/UES for Windows user from Austria

      Power User

        Mar 23, 2018#3

        The task is much simpler than you imagine.

        Currently the file tree is like this:
        • ParentFolder
          • SubdirectoryA
            • index.html
            • index.css
            • ...
            • other files
          • SubdirectoryB
            • index.html
            • index.css
            • ...
            • other files
          • SubdirectoryC
            • index.html
            • index.css
            • ...
            • other files
        After processing the file tree will be like this:
        • ParentFolder
          • SubdirectoryA
            • index.css
            • ...
            • other files
          • SubdirectoryB
            • index.css
            • ...
            • other files
          • SubdirectoryC
            • index.css
            • ...
            • other files
          • index1.html
          • index2.html
          • index3.html
        The HTML files must contain only the path of each file relative to the root directory, without the path of the root directory itself.
        Please analyze the example file tree that I got from this forum (deleted later).

        Each folder has index.html which is the main file and which will be processed.
        I'm not sure if UltraEdit scripts can rename files, but if they can, this file should be renamed to its folder name without the '_files' suffix.
        If this is not possible, it is no problem if it keeps its original name. Later, after processing, I'd rename it and move it to its parent folder.

        Each index.html has only one line pointing to its style sheet file.

        Using example of "UE-UES configuration to edit batch files (.bat, .cmd) with OEM character set_files" folder, index.html has:

        <link media="all" href="index.css" type="text/css" rel="stylesheet">

        And it would be changed to following for the right path when I move index.html to its parent folder:

        <link media="all" href="UE-UES configuration to edit batch files (.bat, .cmd) with OEM character set_files/index.css" type="text/css" rel="stylesheet">

        Each index.html file has some calls to image files.

        Using example of "UE-UES configuration to edit batch files (.bat, .cmd) with OEM character set_files" folder, index.html has:

        <img class="vertical-centered logo-pic" src="logo_t.png">

        And it would be changed to:

        <img class="vertical-centered logo-pic" src="UE-UES configuration to edit batch files (.bat, .cmd) with OEM character set_files/logo_t.png">

        So, the script has to replace all occurrences of src=" with src="UE-UES configuration to edit batch files (.bat, .cmd) with OEM character set_files/.
        And in the same manner, replace all occurrences of url(' with url('UE-UES configuration to edit batch files (.bat, .cmd) with OEM character set_files/.

        After moving all index.html to the ParentFolder, the HTML file would be able to find all style sheets and images associated.

        Summary:
        • Find/Replace 1 occurrence of href=" with href="Folder/
        • Find/Replace all occurrences of src=" with src="Folder/
        • Find/Replace all occurrences of url(' with url('Folder/
        Bonus: If possible, rename each index.html to its folder name.
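
        The three replacements listed above can be sketched in plain JavaScript as follows. This is only an illustration of the requested behavior on a string, not UltraEdit script code. The function name prefixReferences and the sample markup are assumptions made for the example:

```javascript
// Hypothetical helper: insert "Folder/" after the first href=" and after
// every src=" and url(' in the given HTML text.
function prefixReferences(html, folder) {
  var prefix = folder + "/";
  // A string pattern replaces only the first occurrence. A replacer function
  // is used so that a "$" in the folder name is not treated specially.
  var result = html.replace('href="', function () { return 'href="' + prefix; });
  // split/join replaces all occurrences without any regular expression.
  result = result.split('src="').join('src="' + prefix);
  result = result.split("url('").join("url('" + prefix);
  return result;
}

var sample = '<link media="all" href="index.css" type="text/css" rel="stylesheet">' +
             '<img class="vertical-centered logo-pic" src="logo_t.png">';
console.log(prefixReferences(sample, "SubdirectoryA"));
// prints <link media="all" href="SubdirectoryA/index.css" type="text/css" rel="stylesheet"><img class="vertical-centered logo-pic" src="SubdirectoryA/logo_t.png">
```

        Such a plain replacement would also prefix references that already contain a path or that start with #, so it is only suitable when every reference is a bare file name in the same folder.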

        Addendum:
        This forum is a gold mine.
        After finding it in 2013, I didn't realize what a wonderful place it is to learn a lot of things.
        But having revisited it this year, I can see that it is perfect for acquiring a lot of information and lessons about regular expressions, macro commands, tweaks/tricks and even programming skills.
        You, Mofi, do a wonderful job here, guiding a crowd of novices and newbies who are thirsty for knowledge.
        Congrats.

          Mar 23, 2018#4

          I almost forgot to send a file tree after script execution. Here you are: (deleted later)

          As I said before, there is no problem if the script can't rename or move files. I can do it manually. The focus is the Find/Replace actions with correct folder names.

          I noted that a folder name with a single quote, like "GetString-getValue won't open the dialog box_files", may cause trouble.

          This case may be worked around by using an underscore instead of the single quote: GetString-getValue won_t open the dialog box_files

          I used this workaround in the attached (and later deleted) ZIP file. But it is not the focus of the question. Handling the single quote in the script is welcome if it is easy to code, but not needed if it requires complex actions.

          Grand Master

            Mar 25, 2018#5

            Thanks for the input and output examples. They were a big help for understanding the requirements for the coding task.

            Here is the UltraEdit script for this task (updated later for additional requirements). To be complete, the functions GetListOfFiles (with adjustment for language of UltraEdit) and GetFilePath must be added to it.

            var sParentFolderPath = "";   // A parent folder path can be defined here.
                                          // Note: Each backslash must be escaped with an additional backslash.
            
            // Is no parent folder path defined above?
            if (!sParentFolderPath.length)
            {
               // If there is any file opened, get path of active file.
               if (UltraEdit.document.length > 0)  // Is any file opened?
               {
                  sParentFolderPath = GetFilePath();
               }
               // Let script user enter the folder path if no file is opened
               // in UltraEdit or the active file is a new, unsaved file.
               while (!sParentFolderPath.length)
               {
                  sParentFolderPath = UltraEdit.getString("Enter path of parent folder:",1);
               }
            }
            
            // Append a backslash if parent folder path does not end already with a backslash.
            if (sParentFolderPath[sParentFolderPath.length-1] != '\\')
            {
               sParentFolderPath += '\\';
            }
            
            // Get all index.html files with full path in specified directory tree.
            if (GetListOfFiles(0,sParentFolderPath,"index.html",true))
            {
               UltraEdit.activeDocument.top();
               UltraEdit.activeDocument.findReplace.mode=0;
               UltraEdit.activeDocument.findReplace.matchCase=false;
               UltraEdit.activeDocument.findReplace.matchWord=false;
               UltraEdit.activeDocument.findReplace.regExp=false;
               UltraEdit.activeDocument.findReplace.searchDown=true;
               UltraEdit.activeDocument.findReplace.searchInColumn=false;
               UltraEdit.activeDocument.findReplace.preserveCase=false;
               UltraEdit.activeDocument.findReplace.replaceAll=true;
               UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
               UltraEdit.activeDocument.findReplace.replace(sParentFolderPath+"index.html\r\n","");
            
               UltraEdit.activeDocument.selectAll();
               if (UltraEdit.activeDocument.isSel())
               {
                  var asFileNames = UltraEdit.activeDocument.selection.split("\r\n");
                  asFileNames.pop();   // Remove the empty string from end of array.
            
            
                  // Convert the find in files results file into a batch file for
                  // moving the modified index.html files up to parent folder.
                  // The new file name is inserted later on each line.
                  UltraEdit.activeDocument.top();
                  UltraEdit.activeDocument.findReplace.mode=0;
                  UltraEdit.activeDocument.findReplace.matchCase=false;
                  UltraEdit.activeDocument.findReplace.matchWord=false;
                  UltraEdit.activeDocument.findReplace.regExp=true;
                  UltraEdit.activeDocument.findReplace.searchDown=true;
                  UltraEdit.activeDocument.findReplace.searchInColumn=false;
                  UltraEdit.activeDocument.findReplace.preserveCase=false;
                  UltraEdit.activeDocument.findReplace.replaceAll=true;
                  UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
                  UltraEdit.activeDocument.findReplace.replace("^(.+)$",'move /Y "\\1" "');
                  UltraEdit.activeDocument.top();
            
                  // Add a line to run later the batch file with less output to console.
                  UltraEdit.activeDocument.write("@echo off\r\n");
            
                  // Define the parameters for the replace in files executed later.
                  UltraEdit.perlReOn();
                  UltraEdit.frInFiles.regExp=true;
                  UltraEdit.frInFiles.filesToSearch=0;
                  UltraEdit.frInFiles.logChanges=true;
                  UltraEdit.frInFiles.matchWord=false;
                  UltraEdit.frInFiles.matchCase=false;
                  UltraEdit.frInFiles.searchSubs=false;
                  UltraEdit.frInFiles.directoryStart="";
                  UltraEdit.frInFiles.useEncoding=false;
                  UltraEdit.frInFiles.preserveCase=false;
                  UltraEdit.frInFiles.openMatchingFiles=false;
            
                  // Process each file name in the array.
                  for (var nFile = 0; nFile < asFileNames.length; nFile++)
                  {
                     // Get path of file without parent folder path and without \index.html at end.
                     var sFilePath = asFileNames[nFile].substring(sParentFolderPath.length,asFileNames[nFile].length-11);
                     // URL encode the file path with additionally encode also single straight quotes.
                     var sEncodedFilePath = encodeURI(sFilePath).replace(/\'/g,"%27");
                     sEncodedFilePath = sEncodedFilePath.replace(/&/g,"&amp;");
            
                     // Run the Perl regular expression replace on this index.html file
                     // which inserts the file path on href="..." or href='...' or src="..."
                     // or src='...' or url("...") or url('...') or url(&quot;...&quot;)
                     // file references if those file references do not start with "#",
                     // or "javascript:" or contain already a reference with / in string.
                     UltraEdit.frInFiles.searchInFilesTypes=asFileNames[nFile];
                     UltraEdit.frInFiles.replace("(href=[\"']|src=[\"']|url\\([\"']|url\\(&quot;)(?!#|javascript:)((?:(?![\"']|&quot;)[^/])+(?=[\"']|&quot;))","\\1"+sEncodedFilePath+"/\\2");
            
                     // Remove the string "_files" at end of file path for new file name
                     // if this string is at end of file path at all. Then insert new file
                     // name with path into already reformatted  find in files results file.
                     var sNewFileName = sFilePath.replace(/_files$/,"") + ".html";
                     UltraEdit.activeDocument.key("END");
                     UltraEdit.activeDocument.write(sParentFolderPath+sNewFileName+'"');
                     UltraEdit.activeDocument.key("DOWN ARROW");
                  }
            
                  // Append a line so that the batch file deletes itself on execution.
                  UltraEdit.activeDocument.write('del "%~f0" & exit\r\n');
            
                  // Convert the file from ANSI to OEM to work also for folder paths
                  // not consisting of only ASCII characters.
                  UltraEdit.activeDocument.ansiToOem();
            
                  // Save the reformatted find in files results file as batch file in parent folder.
                  UltraEdit.saveAs(sParentFolderPath+"MoveHtmlFiles.bat");
            
                  // Run the batch file via a user tool.
                  UltraEdit.runTool("Run Active File");
            
                  // Close the batch file which has deleted already itself.
                  UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
               }
               else
               {
                  // Close the empty results file because of nothing to do.
                  UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
                  UltraEdit.messageBox("No index.html file found in a subdirectory of "+sParentFolderPath);
               }
            }
            
            Please read the comments at top of the script which explain how the parent folder path is determined by this script.

            This script converts the Find in Files results file with the name of found index.html files with full path into a batch file during script execution. This batch file moves the updated index.html files into parent folder with their new file names and deletes itself. This batch file is executed by the script using a user tool with title Run Active File. So a user tool must be additionally configured with following settings:
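
            How one line of that batch file is derived from a found file name can be sketched outside of UltraEdit as below. The substring and _files handling follow the script above; the function name buildMoveCommand and the sample paths are assumptions made for this illustration:

```javascript
// Hypothetical helper: build the "move" command for one found index.html.
// parentFolderPath must end with a backslash, as ensured by the script.
function buildMoveCommand(parentFolderPath, indexFilePath) {
  // Strip the parent folder path from the front and "\index.html"
  // (11 characters) from the end to get the relative folder path.
  var relativeFolder = indexFilePath.substring(parentFolderPath.length,
                                               indexFilePath.length - 11);
  // Drop a trailing "_files" if present and append the .html extension.
  var newFileName = relativeFolder.replace(/_files$/, "") + ".html";
  return 'move /Y "' + indexFilePath + '" "' + parentFolderPath + newFileName + '"';
}

console.log(buildMoveCommand("C:\\ParentFolder\\",
                             "C:\\ParentFolder\\SubdirectoryA_files\\index.html"));
// prints move /Y "C:\ParentFolder\SubdirectoryA_files\index.html" "C:\ParentFolder\SubdirectoryA.html"
```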

            Tab Command:

            Menu item name: Run Active File
            Command line: "%f"
            Working directory: %p
            Toolbar bitmap/icon (file path): empty

            Tab Options:

            Program type: DOS program selected
            Save active file: checked
            Save all files first: not checked

            Tab Output:

            Command output: Append to existing selected
            Show DOS box: not checked
            Capture output: not checked
            Replace selected text with: No replace selected
            Handle output as: ANSI selected
            Handle output as Unicode: not checked

            Please note that output option Handle output as is only available in UltraEdit for Windows v25.00 or UEStudio v18.00 or any later version. This output option replaces Handle output as Unicode of UltraEdit for Windows v24.xx and UEStudio v17.xx. Even older versions of UE for Windows and UES don't have an option to specify how to interpret captured output by UE/UES.

            Some additional notes:
            1. The HTML file for third subdirectory in FileTreeAfterProcessing.zip has added the path on some URLs on which no path should have been added in my estimation. But this HTML file misses the path on favicon.ico reference.
            2. All three manually modified HTML files in FileTreeAfterProcessing.zip contain invalid URLs. For correct URL encoding see: The script uses the JavaScript core function encodeURI() and additionally replaces ' by %27 and & by &amp;. Then a directory with ' in name is no problem anymore.
            3. The script can correctly process directory names containing ANSI characters in system code page according to Windows region settings (Windows-1252 in my case) as long as those ANSI characters are also defined in OEM code page according to Windows region settings (OEM 850 in my case).
              The script is not written to work with directory names containing Unicode characters. This would be possible with UltraEdit v24.00 or UEStudio v17.00 or any later version, in which UltraEdit scripts also support reading Unicode strings correctly from UTF-16 and UTF-8 encoded files. But that would require removing three lines from function GetListOfFiles() and of course special coding for the batch file, for which the version of Windows and the console settings also matter for correctly moving the index.html files with their new names containing Unicode characters.
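
            The encoding described in note 2 can be tried on its own in any JavaScript environment. The following is a small sketch of that step, with the function name encodeFolderForHtml chosen only for this illustration:

```javascript
// Hypothetical helper: URL encode a folder path for use inside an HTML
// attribute, as described in note 2 above.
function encodeFolderForHtml(folderPath) {
  // encodeURI() leaves ' and & unchanged, so both are handled afterwards:
  // ' becomes %27 and & becomes the HTML entity &amp;.
  return encodeURI(folderPath).replace(/'/g, "%27").replace(/&/g, "&amp;");
}

console.log(encodeFolderForHtml("GetString-getValue won't open the dialog box_files"));
// prints GetString-getValue%20won%27t%20open%20the%20dialog%20box_files
```

            With the single quote encoded as %27, the folder name can no longer end a file reference enclosed in single quotes such as url('...').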

            Power User

              Mar 25, 2018#6

              I have a question about the use of other external functions:
              To make the script run, I had to add GetListOfFiles and GetFilePath to the script file itself, ending up with a big file (1,936 lines).
              Is there a way to use those external functions without inserting them into the current script?
              Some additional notes: ...
              You are right in all points. And I have to say thank you once more, learning non stop with your lessons.

              If there is a way to use external functions with calls, please let me know.

              Finally, here we are reaching one more topic successfully solved.
              👏

              Grand Master

                Mar 26, 2018#7

                It is possible to include a script file containing one or more functions. Open help of UltraEdit, switch to tab Index, type scripting and double click on list item Scripting commands. This help page explains how to include scripts in scripts.

                I tested the posted script with UE v22.20.0.49 on Windows XP, v24.20.0.62 on Windows 7 and v25.00.0.58 on Windows 7 and it worked well. You have done something wrong if the script file contains 1,936 lines. My script file with the added functions GetListOfFiles() and GetFilePath() has only 438 lines with the comments and empty lines inside the functions.

                Please make sure to have only the lines of those two functions copied to the script file without the explaining comments at top and without the demonstration code at bottom and of course without the other file name functions not needed for this script.

                Power User

                  Mar 26, 2018#8

                  At first try, I put all comments and the whole "FileNameFunctions.js" file inside the script. For this reason, my file reached 1,936 lines. That was my fault.

                  After cleaning comments, removing empty lines and inserting just GetFilePath() from "FileNameFunctions.js" and GetListOfFiles(), my file decreased to 286 lines.

                  The script is doing its job correctly.

                    Mar 08, 2019#9

                    Mofi, after some time without working on this kind of subject, I needed to revisit this topic to adjust some pages I had saved.
                    Your script works very well for "href=" and "src=", but it does not work as well for "url(".

                    I'll explain.

                    In my pages, I have lines like this:

                    url("d63f77e2-a23d.woff2")

                    and
                    url(&quot;abcd.jpeg&quot;)

                    It's not my fault. Webscrapbook, an extension for Firefox, saves like this.

                    Your script only adjusts the first sample, I mean:

                    url("d63f77e2-a23d.woff2")
                    becomes
                    url("ConstantPath/d63f77e2-a23d.woff2")

                    But url(&quot;abcd.jpeg&quot;) remains the same.

                    I tried to understand and make a new regular expression to include the second sample and add it to the script. But I failed.

                    I could write this: ((?:href=|src=|url\()&quot;)((?!#|javascript:).+&quot;\))
                    That can find just the second sample.

                    And I tried this to get both: ((?:href=|src=|url\()["'&quot;])((?!#|javascript:)[^/"'&quot;]+\))
                    But it won't find a match.

                    All expressions above are in Perl format, not including an extra "\" to meet scripting requirements.

                    Could you please point out what I'm doing wrong?
                    And explain the new expression you would write?

                    Thank you.

                    Grand Master

                      Mar 09, 2019#10

                      I updated the regular expression for replace in files command and the comment block above in the script in my post above.

                      Note: The expression works now also for a file reference like url(&quot;Hello&amp;Welcome.html&quot;).

                      Replacing ((?:href=|src=|url\()["']), which searches in a non-capturing group for href= or src= or url( followed by " or ' and captures the found string, with ((?:href=|src=|url\()&quot;) can't work because the new expression searches only for href=&quot; or src=&quot; or url(&quot; in all index.html files and so no longer matches references enclosed in " or '.

                      ["'&quot;] is a character class definition to find one character which is one of the characters inside the square brackets, i.e. one of " ' & q u o t ; and not the string &quot;.
                      [^/"'&quot;] is a negative character class definition to find one character which is not one of the characters inside the square brackets.
                      So this attempt can't work either.
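
                      The difference between a character class and an OR expression can be checked with a few lines of plain JavaScript, whose regular expression syntax is close enough to Perl for this purpose. This is just a small demonstration and the variable names are made up for it:

```javascript
// Character class: matches ONE character out of " ' & q u o t ;
var characterClass = /["'&quot;]/;
// OR expression: matches " or ' or the complete string &quot;
var alternation = /(?:["']|&quot;)/;

console.log("&quot;".match(characterClass)[0]); // prints "&"
console.log("&quot;".match(alternation)[0]);    // prints "&quot;"
console.log(characterClass.test("q"));          // prints true
console.log(alternation.test("q"));             // prints false

// The negative character class stops at the first q, u, o or t in a file
// name, which is why the file name of the first sample is not matched fully.
console.log("d63f77e2-a23d.woff2".match(/[^\/"'&quot;]+/)[0]);
// prints "d63f77e2-a23d.w"
```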

                      It was necessary to adapt the OR expression in the first capturing group to find href=" or href=' or src=" or src=' or url(" or url(' or url(&quot; and the expression in the second capturing group to work also for file names with an ampersand in the URL.
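
                      To see the updated expression at work outside of UltraEdit, it can be applied to a few sample strings in plain JavaScript, which supports the same lookahead syntax. This is only a sketch: the function name addFolderToReferences is hypothetical, and the flags g and i stand in for the Replace All and not case-sensitive settings of the script:

```javascript
// Same expression as in the script, written as a JavaScript regex literal.
var referenceRegex = /(href=["']|src=["']|url\(["']|url\(&quot;)(?!#|javascript:)((?:(?!["']|&quot;)[^\/])+(?=["']|&quot;))/gi;

// Hypothetical helper: insert the encoded folder path in front of each
// matched file reference.
function addFolderToReferences(text, encodedFolder) {
  return text.replace(referenceRegex, function (match, opening, fileName) {
    return opening + encodedFolder + "/" + fileName;
  });
}

console.log(addFolderToReferences('url("d63f77e2-a23d.woff2")', "ConstantPath"));
// prints url("ConstantPath/d63f77e2-a23d.woff2")
console.log(addFolderToReferences("url(&quot;abcd.jpeg&quot;)", "ConstantPath"));
// prints url(&quot;ConstantPath/abcd.jpeg&quot;)
console.log(addFolderToReferences('<a href="#top">', "ConstantPath"));
// prints <a href="#top">
console.log(addFolderToReferences('<img src="images/logo.png">', "ConstantPath"));
// prints <img src="images/logo.png">
```

                      The last two lines show the references the expression deliberately leaves alone: those starting with # and those that already contain a / in the string.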

                      The updated script now also ignores an index.html found in the folder whose path is assigned to the script variable sParentFolderPath by removing such a file from the list of file names before processing this list.

                      Power User

                        Mar 16, 2019#11

                        I'd like to say thank you very much.
                        Coming to this forum is a good way to increase my knowledge with each visit.
                        And you are the best teacher ever. No doubt.

                        Thank you.