Script to replace start and end tag of all DIV elements in all XHTML files of a folder with a specific class attribute

    Jun 11, 2019#1

    Hello everyone,

    I have a task: I'd like to change elements based on a specific class attribute in multiple HTML files in UltraEdit.

    Like:

    <div class="secA">
    <div class="secB">
    <div class="secC">
    <div class="secD">
    </div>
    <div class="secD">
    <div class="secE">
    </div>
    </div>
    </div>
    </div>
    </div>
    After change:

    <section class="secA">
    <section class="secB">
    <section class="secC">
    <section class="secD">
    </section>
    <section class="secD">
    <section class="secE">
    </section>
    </section>
    </section>
    </section>
    </section>
    Is this possible with a script?

    I have attached input and output files for reference.
    div_to_section_test_set.zip (9.4 KiB)
    Example set of files for testing the script and seeing what to do.

      Jun 12, 2019#2

      The script below worked with UltraEdit v22.20 on Windows XP and v26.10 on Windows 7.

      The function GetListOfFiles (only the function) must be copied into the script file together with the code below. Don't forget to adjust the variables sSummaryInfo, sResultsDocTitle and bNoUnicode according to your configuration and version of UltraEdit, as described in the comments of the function.

      You must adapt the directory path C:\\Temp\\ in the GetListOfFiles call near the top of the script.

      Many thanks go to Fleggy for the Perl regular expression search string that finds the matching end tag.

      UltraEdit.insertMode();
      if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
      else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
      
      var nValidCount = 0;    // Counts the files successfully processed.
      var nTotalCount = 0;    // Counts the total number of files opened.
      
      // Replace the initial directory path in the function parameter list with
      // real path of the directory containing the files to modify by the script.
      if (GetListOfFiles(0,"C:\\Temp\\","*.xhtml",false))
      {
         // If there was no error and files are found in the directory, the
         // function made the search results output with the file names active.
         // Select all lines in the file and load the file names into an array.
         UltraEdit.activeDocument.selectAll();
         var sLineTerm = "\r\n";   // Default line terminator type is DOS.
         var nLineTermPos = UltraEdit.activeDocument.selection.search(/\r\n|\n|\r/);
         if (nLineTermPos >= 0)    // Any line terminator found?
         {
            // The list file is a Unix file if first character found is a line-feed.
            if (UltraEdit.activeDocument.selection[nLineTermPos] == '\n') sLineTerm = "\n";
            // The list file is a Mac file if first character found is a carriage
            // return and the next character is not a line-feed as in a DOS file.
            else if (UltraEdit.activeDocument.selection[nLineTermPos+1] != '\n') sLineTerm = "\r";
         }
         var asFileNames = UltraEdit.activeDocument.selection.split(sLineTerm);
      
         // The list is not needed anymore and therefore the results window is closed.
         UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
         asFileNames.pop();  // Remove empty string at end of the list.
      
         UltraEdit.perlReOn();   // Select Perl regular expression engine.
      
         // Now open one file after the other, process it, save and close the file.
         // No file in the list should be opened already on script start! Why?
         // http://forums.ultraedit.com/viewtopic.php?f=52&t=4596#p26710
         // contains the answer on point 7 and the solution to enhance
         // this script further if this is necessary in some cases.
         for (var nFileIndex = 0; nFileIndex < asFileNames.length; nFileIndex++)
         {
            UltraEdit.open(asFileNames[nFileIndex]);
            UltraEdit.activeDocument.top();
      
            // It is assumed that opened XHTML file is a valid XHTML file.
            var bValidXHTML = true;
      
            // Define the find and replace parameters for the opened file.
            UltraEdit.activeDocument.findReplace.mode=0;
            UltraEdit.activeDocument.findReplace.matchCase=true;
            UltraEdit.activeDocument.findReplace.matchWord=false;
            UltraEdit.activeDocument.findReplace.regExp=false;
            UltraEdit.activeDocument.findReplace.searchDown=true;
            UltraEdit.activeDocument.findReplace.searchInColumn=false;
            UltraEdit.activeDocument.findReplace.preserveCase=false;
            UltraEdit.activeDocument.findReplace.replaceAll=false;
            UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
      
            // Search case-sensitive and literal for the string below.
            while (UltraEdit.activeDocument.findReplace.find('<div class="sec'))
            {
               // Get current line number and column number of begin
               // of found string and move caret to begin of found string.
               var nLine = UltraEdit.activeDocument.currentLineNum;
               var nColumn = UltraEdit.activeDocument.currentColumnNum - 15;
               UltraEdit.activeDocument.gotoLine(nLine,nColumn);
      
               // Use a Perl regular expression search to find the
               // matching end tag as posted by Fleggy in forum post
               // http://forums.ultraedit.com/viewtopic.php?f=7&t=13703#p62244
               UltraEdit.activeDocument.findReplace.regExp=true;
               if (UltraEdit.activeDocument.findReplace.find("(?s)(?<MAIN><(?<TAG>div)\\b[^>]*+(?:(?<=/)>|(?<!/)>(?>(?:(?!<\\k<TAG>\\b)(?!</\\k<TAG>\\b).)++|(?&MAIN))*+</\\k<TAG>>))(?<!/>)"))
               {
                  // Move caret to the left which cancels the selection
                  // and sets caret at end of word "div" which is next
                  // selected and replaced by "section".
                  UltraEdit.activeDocument.key("LEFT ARROW");
                  UltraEdit.activeDocument.selectWord();
                  UltraEdit.activeDocument.write("section");
      
                  // Move caret back to start tag and replace "div" by "section".
                  UltraEdit.activeDocument.gotoLine(nLine,nColumn);
                  UltraEdit.activeDocument.findReplace.replace("div","section");
               }
               else
               {
                  // Output into output window the file name with full path and
                  // append in round brackets the line number for this invalid
                  // XHTML file on which matching end tag could not be found. It
                  // is possible to double click on this line in output window to
                  // open this file and get caret positioned on line with start tag.
                  UltraEdit.outputWindow.write(UltraEdit.activeDocument.path + "(" + nLine + "): No matching end tag found!");
                  bValidXHTML = false;
                  break;   // Do not further process this invalid XHTML file.
               }
            }
      
            nTotalCount++;
            if (bValidXHTML)
            {  // Save and close the processed valid XHTML file.
               UltraEdit.closeFile(UltraEdit.activeDocument.path,1);
               nValidCount++;
            }
            else
            {  // Close the invalid XHTML file without saving it.
               UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
            }
         }
      }
      
      // Output a summary information into output window and show output window.
      UltraEdit.outputWindow.write("Processed successfully " + nValidCount + " file" +
                                   ((nValidCount==1) ? "" : "s") + " of in total " +
                                   nTotalCount + " file" + ((nTotalCount==1) ? "." : "s."));
      UltraEdit.outputWindow.showWindow(true);
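
The approach of the script above (find a qualifying start tag, locate its matching end tag, rename both) can also be sketched outside UltraEdit in plain JavaScript. The function name convertSecDivs and the stack-based matching are assumptions made for illustration; they are not part of the UltraEdit scripting API or of the script above. The sketch presumes well-formed markup with no div-like text inside comments or CDATA sections.

```javascript
// Hypothetical helper, not an UltraEdit API function.
// Renames every <div class="sec...> start tag and its matching end tag
// to "section" in a single pass, leaving all other div elements untouched.
function convertSecDivs(text) {
  var tagPattern = /<div\b[^>]*>|<\/div>/g;  // Any div start tag or end tag.
  var secPattern = /^<div class="sec/;       // Same literal prefix the script searches for.
  var stack = [];       // One boolean per open div: true if its start tag was renamed.
  var result = "";
  var lastIndex = 0;
  var match;

  while ((match = tagPattern.exec(text)) !== null) {
    var tag = match[0];
    result += text.slice(lastIndex, match.index);  // Copy text between tags unchanged.
    lastIndex = match.index + tag.length;

    if (tag.charAt(1) === "/") {                   // End tag </div>
      if (stack.length === 0) {
        throw new Error("Unmatched </div> at offset " + match.index);
      }
      result += stack.pop() ? "</section>" : tag;
    } else if (tag.slice(-2) === "/>") {           // Self-closing <div ... />
      result += tag;
    } else {                                       // Start tag
      var bConvert = secPattern.test(tag);
      stack.push(bConvert);
      result += bConvert ? "<section" + tag.slice(4) : tag;
    }
  }
  if (stack.length !== 0) throw new Error("No matching end tag found!");
  return result + text.slice(lastIndex);
}

// Usage: a nested secA/secB pair with an unrelated div in between.
var sExample = '<div class="secA"><div class="note">x</div><div class="secB">y</div></div>';
console.log(convertSecDivs(sExample));
```

Because the stack remembers for each open div whether its start tag was renamed, one pass is enough here, whereas a search-and-replace driven solution has to locate the matching end tag separately for each start tag.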
      
      Best regards from an UC/UE/UES for Windows user from Austria

        Jun 14, 2019#3

        Thanks again Mofi,

        Unfortunately, I'm not using UltraEdit v22.20 or v26.10; I'm using UEStudio v12.20.

        By the way, I want to try a Replace in Files in a loop using the following find and replace patterns:

        Find:

        (?s)(<(div)( class="(?(4)[^"]*?|(?|(sec[A-Z])))")>((?>(?:(?!<\2\b)(?!</\2\b).)++|(?1))*+)</\2>)
        Replace:

        <section\3>\5</section>
        and I tried to create a script for UEStudio v12.20 like this:

        var sDirectory = "";
        var sFileExtension = "";
           sDirectory = UltraEdit.getString("Enter directory path of files to modify:",1);
           //sDirectory = sDirectory.replace(/(\/)/g,"\1\1");
           sFileExtension = UltraEdit.getString("Enter file name pattern (extension) like *.htm:",1);
           if (sFileExtension[0] != '*') sFileExtension = '*' + sFileExtension;
           if (sFileExtension[1] != '.') sFileExtension = "*." + sFileExtension.substr(1);
            var file_names = get_array_of_files(sDirectory, sFileExtension);
            var no_of_files = file_names.length;
        
        // ***** for backup *****
            UltraEdit.outputWindow.clear();
            UltraEdit.frInFiles.useOutputWindow=true;
            UltraEdit.frInFiles.find('');
            UltraEdit.outputWindow.copy();
            UltraEdit.newFile();
            UltraEdit.activeDocument.paste();
            UltraEdit.clearClipboard();
            UltraEdit.selectClipboard(0);
            UltraEdit.activeDocument.top();
        
            UltraEdit.perlReOn();
            UltraEdit.activeDocument.findReplace.mode=0;
            UltraEdit.activeDocument.findReplace.matchCase=false;
            UltraEdit.activeDocument.findReplace.matchWord=false;
            UltraEdit.activeDocument.findReplace.regExp=true;
            UltraEdit.activeDocument.findReplace.searchDown=true;
            UltraEdit.activeDocument.findReplace.replaceAll=true;
            UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
        
            UltraEdit.activeDocument.top();
            UltraEdit.activeDocument.findReplace.replace("^Search complete, found.+\\r\\n","");
            UltraEdit.activeDocument.findReplace.replace("^(.+.)\\.(\\w+)$","copy \\1\.\\2 \\1\.bak");
            var batFile = "c:\\temp\\bak.bat";
            var toolName = "backup";
            var nPath = UltraEdit.activeDocument.selection;
            nPath = nPath.replace(/\\/g,"\\\\");
        
            UltraEdit.saveAs(batFile);
            UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
        
            UltraEdit.runTool(toolName);
        
            UltraEdit.outputWindow.clear();
        
        // ***** main update *****
           UltraEdit.perlReOn();
           UltraEdit.frInFiles.regExp = true;
           UltraEdit.frInFiles.matchCase = false;
           UltraEdit.frInFiles.filesToSearch = 0;
           UltraEdit.frInFiles.searchSubs = false;
           UltraEdit.frInFiles.openMatchingFiles = false;
           UltraEdit.frInFiles.ignoreHiddenSubs = true;
           UltraEdit.frInFiles.logChanges = true;
           UltraEdit.frInFiles.matchWord = false;
           UltraEdit.frInFiles.preserveCase = false;
           UltraEdit.frInFiles.useEncoding = true;
           UltraEdit.frInFiles.encoding = 65001;
           for (var i = 0; i < no_of_files; i++) {
           //UltraEdit.messageBox(file_names[i]);
           UltraEdit.frInFiles.replace('(?s)(<(div)( class="(?(4)[^"]*?|(?|(sec[A-Z])))")>((?>(?:(?!<\\2\\b)(?!</\\2\\b).)++|(?1))*+)</\\2>)',"<section\\3>\\5</section>");
        }
        
        function get_array_of_files(dir_path, ext) {
            GetListOfFiles(0, dir_path, ext, true);
            UltraEdit.activeDocument.top();
            UltraEdit.activeDocument.selectAll();
            var file_names = UltraEdit.activeDocument.selection.split("\r\n");
            UltraEdit.closeFile(UltraEdit.activeDocument.path, 2);
            return file_names;
        }
        
        function GetListOfFiles (nFileList, sDirectory, sFileType, bSubDirs)
        {
           /* The summary info line at end of a Search - Find in Files result is
              language dependent, as is the name of the results document. Adapt
              the following 2 strings to your version of UE/UES. Take a look
              in Configuration/Preferences on "Search - Set Find Output Format"
              on "Find Summary" definition, or better execute once the command
              "Find in Files" from menu "Search" and adapt the two strings below
              accordingly. You can also just open GetListOfFiles.js and execute
              it with command "Run Active Script" from menu "Scripting" to see
              what the find results look like in your version of UE/UES when you
              enter the parameters correctly during script execution.
              Please note that if you disable the "Find Summary" completely, you
              have to initialize variable sSummaryInfo with an empty string. */
           var sSummaryInfo = "Search complete, found";
           var sResultsDocTitle = "** Find Results ** ";  // Note the space at end!
           // For German UltraEdit the default strings are:
           // var sSummaryInfo = "Suche abgeschlossen, ";
           // var sResultsDocTitle = "** Suchergebnisse ** ";
        
           /* Determine the type of output for debug messages from the global
              variable g_nDebugMessage: 1 ... debug to output window, 2 ... debug
              to message dialog, all others ... no debug messages. If the global
              variable g_nDebugMessage does not exist, display the debug message
              as popup message in a dialog box (value 2). */
           var nOutputType = (typeof(g_nDebugMessage) == "number") ? g_nDebugMessage : 2;
        
           if (typeof(nFileList) != "number" || nFileList < 0 || nFileList > 4) nFileList = 0;
        
           if (nFileList == 0)    // Search in a specified directory?
           {
              // If no directory specified, use current working directory.
              if (typeof(sDirectory) != "string" || sDirectory == "" ) sDirectory = ".\\";
              // Append a backslash if it is missing at end of the directory string.
              else if (sDirectory[sDirectory.length-1] != "\\") sDirectory += "\\";
              // Search for all files if no file type is specified.
              if (typeof(sFileType) != "string" || sFileType == "") sFileType = "*";
              if (typeof(bSubDirs) != "boolean") bSubDirs = false;
           }
           else
           {
              sDirectory = "";    // For the list of open, favorite, project
              sFileType = "";     // or solution files the other 3 parameters
              bSubDirs = false;   // have always the same default values.
           }
        
           // Remember current regular expression engine.
           var nRegexEngine = UltraEdit.regexMode;
           /* A regular expression engine must be defined or the find
              for the last line in the Unicode results could fail. */
           UltraEdit.ueReOn();
        
           /* Run a Find In Files with an empty search string to get the
              list of files stored in the specified directory in an edit
              window and delete the last line with the summary info. */
           UltraEdit.frInFiles.directoryStart=sDirectory;
           UltraEdit.frInFiles.filesToSearch=nFileList;
           UltraEdit.frInFiles.matchCase=false;
           UltraEdit.frInFiles.matchWord=false;
           UltraEdit.frInFiles.regExp=false;
           UltraEdit.frInFiles.searchInFilesTypes=sFileType;
           UltraEdit.frInFiles.searchSubs=bSubDirs;
           UltraEdit.frInFiles.unicodeSearch=false;
           UltraEdit.frInFiles.useOutputWindow=false;
           if (typeof(UltraEdit.frInFiles.openMatchingFiles) == "boolean")
           {
              UltraEdit.frInFiles.openMatchingFiles=false;
           }
           UltraEdit.frInFiles.find("");
        
           /* If the Find In Files results window was open already the results
              of the search above are appended, but the results document does
              not get automatically the focus as it does if there was no results
              document open from a previous search. Therefore care must be taken
              that the document with the Find In Files results is the active
              document after the search to continue on correct file. */
           var bListCreated = false;
           if (UltraEdit.activeDocument.path == sResultsDocTitle) bListCreated = true;
           else
           {
              for (var nDocIndex = 0; nDocIndex < UltraEdit.document.length; nDocIndex++)
              {
                 if (UltraEdit.document[nDocIndex].path == sResultsDocTitle)
                 {
                    UltraEdit.document[nDocIndex].setActive();
                    bListCreated = true;
                    break;
                 }
              }
           }
           if (bListCreated == true && sSummaryInfo.length)
           {
              // Search for the summary info at bottom of the results.
              UltraEdit.activeDocument.findReplace.searchDown=false;
              UltraEdit.activeDocument.findReplace.matchCase=true;
              UltraEdit.activeDocument.findReplace.matchWord=false;
              UltraEdit.activeDocument.findReplace.regExp=false;
              UltraEdit.activeDocument.findReplace.find(sSummaryInfo);
              bListCreated = UltraEdit.activeDocument.isFound();
           }
           UltraEdit.activeDocument.findReplace.searchDown=true;
           switch (nRegexEngine)     // Restore original regular expression engine.
           {
              case 1:  UltraEdit.unixReOn(); break;
              case 2:  UltraEdit.perlReOn(); break;
              default: UltraEdit.ueReOn();   break;
           }
           /* Check now if the Find above has had success finding the last line in
              the active document which should contain the Find In Files results. */
           if (bListCreated == false)
           {
              if (nOutputType == 2)
              {
                 UltraEdit.messageBox("There is a problem with frInFiles command or the strings of the script variables\n\"sSummaryInfo\" or \"sResultsDocTitle\" are not adapted to your version of UE/UES!","GetListOfFiles Error");
              }
              else if (nOutputType == 1)
              {
                 if (UltraEdit.outputWindow.visible == false) UltraEdit.outputWindow.showWindow(true);
                 UltraEdit.outputWindow.write("GetListOfFiles: There is a problem with frInFiles command or the strings of the script variables");
                 UltraEdit.outputWindow.write("                \"sSummaryInfo\" or \"sResultsDocTitle\" are not adapted to your version of UE/UES!");
              }
              return false;
           }
           /* If last line with summary info found, delete this line. Next convert
              the file into an ASCII text file for better handling of the file
              names. Unicode file names are not supported by this script function.
              If the file with the results is already an ASCII file from a
              previous execution, there is no need for the conversion. */
           if (sSummaryInfo.length) UltraEdit.activeDocument.deleteLine();
           UltraEdit.activeDocument.top();
           UltraEdit.activeDocument.key("RIGHT ARROW");
           if (UltraEdit.activeDocument.currentPos > 1) UltraEdit.activeDocument.unicodeToASCII();
           else UltraEdit.activeDocument.top();
        
           // If top of file is also end of file, no files were found.
           if (UltraEdit.activeDocument.isEof())
           {
              if (nOutputType == 2)
              {
                 var sMessage = "";
                 switch (nFileList)
                 {
                    case 0: sMessage = "No file "+sFileType+" was found in directory\n\n"+sDirectory; break;
                    case 1: sMessage = "There are no opened files."; break;
                    case 2: sMessage = "There are no favorite files."; break;
                    case 3: sMessage = "There are no project files or no project is opened."; break;
                    case 4: sMessage = "There are no solution files or no solution is opened."; break;
                 }
                 UltraEdit.messageBox(sMessage,"GetListOfFiles Error");
              }
              else if (nOutputType == 1)
              {
                 var sMessage = "";
                 switch (nFileList)
                 {
                    case 0: sMessage = "No file "+sFileType+" was found in directory "+sDirectory; break;
                    case 1: sMessage = "There are no opened files."; break;
                    case 2: sMessage = "There are no favorite files."; break;
                    case 3: sMessage = "There are no project files or no project is opened."; break;
                    case 4: sMessage = "There are no solution files or no solution is opened."; break;
                 }
                 if (UltraEdit.outputWindow.visible == false) UltraEdit.outputWindow.showWindow(true);
                 UltraEdit.outputWindow.write("GetListOfFiles: "+sMessage);
              }
              UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
              return false;
           }
           return true;
        }  // End of function GetListOfFiles
        
        But the replace loop does not work properly.

        Could you please help me fix this script?

        Thanks Samir

          Jun 14, 2019#4

          Hi Samir,

          I can't help you with the script, but your regex can be simplified to a slightly more legible form:
          F: (?s)(<(div)\b(?(3)[^>]*+|( class="sec[A-Z]"))>((?>(?:(?!<\2\b)(?!</\2\b).)++|(?1))*+)</\2>)
          R: <section\3>\4</section>

          BR, Fleggy

            Jun 15, 2019#5

            Thanks, Fleggy, for the simplified pattern.
            I built the pattern above based on your earlier pattern,
            but I'm still waiting for Mofi's reply.

              Jun 16, 2019#6

              The script below was rewritten for this task using the Perl regular expression posted by Fleggy. It was tested only with UEStudio v15.20 on Windows XP.

              var sDirectory = "";
              var sFileExtension = "";
              
              while (!sDirectory.length)
              {
                 sDirectory = UltraEdit.getString("Enter directory path of files to modify:",1);
              }
              if (sDirectory[sDirectory.length-1] != "\\") sDirectory += "\\";
              
              while (!sFileExtension.length)
              {
                 sFileExtension = UltraEdit.getString("Enter file name pattern (extension) like *.htm:",1);
              }
              if (sFileExtension[0] != '*') sFileExtension = '*' + sFileExtension;
              if (sFileExtension[1] != '.') sFileExtension = "*." + sFileExtension.substr(1);
              
              /* Untested backup of all files in a directory matching the wildcard pattern.
                 This simplified solution does not work for files in an entire directory tree.
                 It would be better to use xcopy or robocopy to copy all files of a directory
                 or a directory tree to modify to a backup directory with the same file names.
              UltraEdit.newFile();
              UltraEdit.activeDocument.write('copy /Y "' + sDirectory + sFileExtension + '" "' + sDirectory + '*.bak"');
              UltraEdit.saveAs("C:\\Temp\\bak.bat");
              UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
              UltraEdit.runTool("backup");
              */
              
              // ***** main update *****
              
              UltraEdit.selectClipboard(9);
              UltraEdit.perlReOn();
              UltraEdit.frInFiles.regExp=true;
              UltraEdit.frInFiles.matchCase=true;
              UltraEdit.frInFiles.logChanges=true;
              UltraEdit.frInFiles.filesToSearch=0;
              UltraEdit.frInFiles.searchInFilesTypes=sFileExtension;
              UltraEdit.frInFiles.directoryStart=sDirectory;
              UltraEdit.frInFiles.ignoreHiddenSubs=true;
              UltraEdit.frInFiles.searchSubs=false;
              UltraEdit.frInFiles.useEncoding=false;    // Using UTF-8 encoding for replace in files
              // UltraEdit.frInFiles.useEncoding=true;  // is not really needed in this case with
              // UltraEdit.frInFiles.encoding=65001;    // find and replace string not containing
              UltraEdit.frInFiles.matchWord=false;      // any non-ASCII character.
              UltraEdit.frInFiles.preserveCase=false;
              UltraEdit.frInFiles.openMatchingFiles=false;
              
              // Run the Perl regular expression replace in files in a loop until nothing
              // is replaced anymore in any of the matched files. This is necessary because
              // the first replaced div element often ends in the file below other div
              // elements to replace. After a replace, UltraEdit does not seek back to the
              // beginning of the found string before it continues the search for the next
              // occurrence of a string (block) matching the search expression. So the
              // replace in files must be run multiple times until all div elements of
              // class secX, from the outermost to the innermost, are modified to section
              // elements.
              
              do
              {
                 UltraEdit.outputWindow.clear();
                 UltraEdit.frInFiles.replace('(?s)(<(div)\\b(?(3)[^>]*+|( class="sec[A-Z]"))>((?>(?:(?!<\\2\\b)(?!</\\2\\b).)++|(?1))*+)</\\2>)','<section\\3>\\4</section>');
                 UltraEdit.outputWindow.copy();
              }
              while(UltraEdit.clipboardContent.search(/\b0\b items replaced/) < 0);
              
              UltraEdit.clearClipboard();
              UltraEdit.selectClipboard(0);
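
Why the replace has to run in a loop can be illustrated with a small plain JavaScript simulation. It is only a sketch under assumptions: replacePass and replaceUntilStable are hypothetical names, the matching end tag is found by depth counting because JavaScript regular expressions do not support the recursion used in the Perl expression, and the scan is assumed to resume after the end of each replaced element, as the comments in the script describe.

```javascript
// Hypothetical stand-in for one "replace in files" run on a single string.
// Each qualifying div found while scanning forward is renamed together with
// its matching end tag; scanning then resumes AFTER that end tag, so nested
// qualifying divs are left for a later pass.
function replacePass(text) {
  var startPattern = /<div( class="sec[A-Z]")>/g;  // Start tags to convert.
  var tagPattern = /<div\b[^>]*>|<\/div>/g;        // Any div start or end tag.
  var result = "";
  var nPos = 0;      // End of the text already copied into result.
  var nCount = 0;    // Number of elements replaced in this pass.
  var start;

  while ((start = startPattern.exec(text)) !== null) {
    // Find the matching end tag by counting nesting depth.
    var nDepth = 1;
    var tag = null;
    tagPattern.lastIndex = startPattern.lastIndex;
    while (nDepth > 0 && (tag = tagPattern.exec(text)) !== null) {
      if (tag[0].charAt(1) === "/") nDepth--;
      else if (tag[0].slice(-2) !== "/>") nDepth++;
    }
    if (nDepth > 0) continue;   // No matching end tag: leave this element as is.

    result += text.slice(nPos, start.index) + "<section" + start[1] + ">" +
              text.slice(startPattern.lastIndex, tag.index) + "</section>";
    nPos = tagPattern.lastIndex;
    startPattern.lastIndex = nPos;   // Continue behind the replaced element.
    nCount++;
  }
  return { text: result + text.slice(nPos), count: nCount };
}

// Repeat the pass until nothing is replaced anymore, like the do ... while
// loop in the script that waits for "0 items replaced".
function replaceUntilStable(text) {
  var nPasses = 0;
  var pass;
  do {
    pass = replacePass(text);
    text = pass.text;
    nPasses++;
  } while (pass.count > 0);
  return { text: text, passes: nPasses };
}

// Usage with a two-level nesting; the outer element is converted first.
var outcome = replaceUntilStable('<div class="secA"><div class="secB">x</div></div>');
console.log(outcome.text, outcome.passes);
```

The final pass always replaces nothing, which mirrors the last Replace in Files run of the script reporting zero replaced items.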
              
              Best regards from an UC/UE/UES for Windows user from Austria

                Jun 17, 2019#7

                Thanks again Mofi,

                The script worked fine with UEStudio v12.20.