Indexing a word in a file

SouravN · Feb 13, 2017, 07:54 UTC · #1

Dear Mofi,

I have a problem with an XHTML file (it was in the attached RAR archive, which was later deleted).
First, I pick the text between the tags of a link such as <a href="#ind@@">state of exception</a>, which here is the text "state of exception".
Then this text must be found at the place in the file where an id tag stands in front of it. The tag looks like <a id="ind226"/>,
so the find would be something like <a id="ind226"/>text(?!</a>).

I wrote a find pattern for the id tag which I hope helps you: <[^<>]*?id="ind\w+"[^<>]*?>
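As a rough check of what this pattern selects, it can be tried as a plain JavaScript regular expression outside UltraEdit. This is only a sketch: the sample text and the variable names rIdTag, sSample and asMatches are made up for illustration and are not taken from the original file.

```javascript
// Pattern from the post, written as a JavaScript regular expression literal.
// The g flag is added here only to collect all matches at once.
var rIdTag = /<[^<>]*?id="ind\w+"[^<>]*?>/g;

// Hypothetical sample text, not taken from the original XHTML file.
var sSample = '<p><a href="#ind@@">state of exception</a></p>\n' +
              '<p><a id="ind226"/>state of exception</p>';

// match() returns null if nothing is found, therefore fall back to an empty array.
var asMatches = sSample.match(rIdTag) || [];

// Only the tag with the id attribute is matched, not the link with href.
console.log(asMatches);
```

The link tag is skipped because href="#ind@@" does not contain the literal text id="ind.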

All emphasis tags and page tags should be removed before the find, but the main file must not be changed by this. All hex character references in the file represent Unicode characters.

I attached a file which we edited manually in the way the script should do it. I hope this sample helps you.

Mofi · Feb 19, 2017, 12:18 UTC · #2

Here is a script for this task.

if (UltraEdit.document.length > 0)
{
   UltraEdit.insertMode();
   UltraEdit.columnModeOff();

   // Load entire file contents into memory as string to search later
   // in this string for the identifier before each referenced string
   // to avoid searching in document window with display updates.
   UltraEdit.activeDocument.selectAll();
   var sFileContents = UltraEdit.activeDocument.selection;
   UltraEdit.activeDocument.top();

   // Use a case-sensitive Perl regular expression Find to search for
   // the string references in active file from top to bottom of file.
   UltraEdit.perlReOn();
   UltraEdit.activeDocument.findReplace.mode=0;
   UltraEdit.activeDocument.findReplace.matchCase=true;
   UltraEdit.activeDocument.findReplace.matchWord=false;
   UltraEdit.activeDocument.findReplace.regExp=true;
   UltraEdit.activeDocument.findReplace.searchDown=true;
   UltraEdit.activeDocument.findReplace.searchInColumn=false;

   // Search in file for the referenced strings.
   var nUpdated = 0;
   while(UltraEdit.activeDocument.findReplace.find('<a href="#ind.+?">.+?</a>'))
   {
      // Get referenced string with removing all tags in referenced
      // string and with trimming leading and trailing spaces.
      var sText = UltraEdit.activeDocument.selection.replace(/<a href="#ind.+?">(.+?)<\/a>/,"$1");
      sText = sText.replace(/<\/?.+?>/g,"");
      sText = sText.replace(/^[\t ]*(.+?)[\t ]*$/,"$1");

      // Build a JavaScript regular expression search object with
      // that string to find the string with its identifier before.
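      // Regular expression metacharacters in the referenced string
      // are escaped so that they are matched literally.
      var sRegSearch = '<a id="ind[^"]+"\/>' + sText.replace(/[.*+?^${}()|[\]\\]/g,"\\$&");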
      var rRegSearch = new RegExp(sRegSearch,"i");

      // Find in entire file contents with that regular expression.
      var nFoundPosition = sFileContents.search(rRegSearch);

      // Is referenced string with its identifier found in file?
      if (nFoundPosition >= 0)
      {
         // Get identifier tag from entire file contents.
         var nEndPosition = sFileContents.indexOf(">",nFoundPosition);
         var sId = sFileContents.substring(nFoundPosition,nEndPosition+1);
         // Convert the identifier tag into a reference string.
         sId = sId.replace(/<a id="ind(.+?)"\/>$/,"#ind$1");
         // Replace the reference in the string found in the file.
         var sWrite = UltraEdit.activeDocument.selection.replace(/#ind[^\"]+/,sId);
         // Is found and selected string different to the string with updated identifier?
         if (sWrite != UltraEdit.activeDocument.selection)
         {
            // Update the found and still selected string with updated identifier.
            UltraEdit.activeDocument.write(sWrite);
            nUpdated++;
         }
         else
         {  // There is nothing to change, cancel selection.
            UltraEdit.activeDocument.key("LEFT ARROW");
         }
      }
      else
      {  // The referenced string with an identifier is not found in
         // entire file. It is not defined what to do with the reference
         // in this case and therefore just cancel the selection.
         UltraEdit.activeDocument.key("LEFT ARROW");
      }
   }

   // Write a single line task summary to output window without explicitly
   // showing the output window because this information is needed only at
   // beginning on testing the script.
   if (nUpdated)  // Was there any reference updated?
   {
      UltraEdit.save();
      var sPluralS = (nUpdated != 1) ? "s" : "";
      UltraEdit.outputWindow.write("Updated " + nUpdated + " reference" + sPluralS + " in active file.");
   }
   else
   {
      UltraEdit.outputWindow.write("No reference needed to be updated in active file.");
   }
}
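
The string handling done for each found reference can also be tried without UltraEdit, for example in Node.js or a browser console. The following is only a sketch of those steps under some assumptions: the functions escapeRegExp and updateReference are hypothetical helpers and not part of the UltraEdit scripting API, the sample strings are invented, and the identifier is taken from a capturing group instead of indexOf and substring.

```javascript
// Hypothetical helper: escape regular expression metacharacters so that
// the referenced string is matched literally.
function escapeRegExp(sText) {
   return sText.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return the link with its reference updated to the
// identifier found before the referenced string in sFileContents, or the
// unchanged link if the referenced string is not found with an identifier.
function updateReference(sLink, sFileContents) {
   // Get the referenced string, remove all tags in it and trim spaces/tabs.
   var sText = sLink.replace(/<a href="#ind.+?">(.+?)<\/a>/, "$1");
   sText = sText.replace(/<\/?.+?>/g, "");
   sText = sText.replace(/^[\t ]*(.+?)[\t ]*$/, "$1");

   // Search case-insensitive for the identifier tag followed by the string.
   var rRegSearch = new RegExp('<a id="(ind[^"]+)"/>' + escapeRegExp(sText), "i");
   var aMatch = rRegSearch.exec(sFileContents);
   if (aMatch === null) return sLink;

   // Replace the old reference by the identifier captured in group 1.
   return sLink.replace(/#ind[^"]+/, "#" + aMatch[1]);
}

// Invented sample data for a quick test.
var sTestFile = '<p><a id="ind226"/>state of exception (law)</p>';
var sTestLink = '<a href="#ind@@"><em>state of exception (law)</em></a>';
console.log(updateReference(sTestLink, sTestFile));
```

The parentheses in the sample string show why escaping matters: without it, (law) would be interpreted as a capturing group and the literal parentheses in the file would not match.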

It was not clear to me whether the referenced string should be searched only from the current position to the end of the file or in the entire file to get its identifier. In general this should not make a difference, but the attached file contains several identical referenced strings with different identifiers. The script as posted here always searches for the referenced string and its identifier in the entire file (loaded into memory), so the first occurrence in the file is the one that is used.
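
If the search should start at the position of the reference instead, a variant like the following could be used on the string in memory. This is only a sketch to show the difference between the two behaviors: findIdFrom is a hypothetical function, the sample text is invented, and how to get the start position from the UltraEdit document is not covered here.

```javascript
// Hypothetical helper: return the identifier found before sText in
// sFileContents, searching from nStartPosition to the end of the string,
// or null if nothing is found.
function findIdFrom(sFileContents, sText, nStartPosition) {
   // Escape regular expression metacharacters in the referenced string.
   var sEscaped = sText.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
   // The g flag is needed so that exec() respects lastIndex.
   var rRegSearch = new RegExp('<a id="(ind[^"]+)"/>' + sEscaped, "gi");
   rRegSearch.lastIndex = nStartPosition;
   var aMatch = rRegSearch.exec(sFileContents);
   return aMatch ? aMatch[1] : null;
}

// Invented sample with the same referenced string under two identifiers.
var sDuplicates = '<a href="#ind@@">foo</a> <a id="ind1"/>foo ' +
                  '<a href="#ind@@">foo</a> <a id="ind2"/>foo';

// Searching the entire string finds the first identifier.
var sFromTop = findIdFrom(sDuplicates, "foo", 0);

// Searching from the second reference finds the identifier after it.
var nSecondLink = sDuplicates.lastIndexOf('<a href=');
var sFromLink = findIdFrom(sDuplicates, "foo", nSecondLink);

console.log(sFromTop, sFromLink);
```

With this sample the two calls return different identifiers, which is exactly the case of identical referenced strings with different identifiers.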