Replace alphanumeric string with another (less one)

    Aug 13, 2014#1

    I have a large text file similar to the below and need to edit using PERL REGEX in UltraEdit:

    þABC0118789þ|þABC0118789þ|þABC0118789þ|þABC0118789þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ
    þABC0118793þ|þABC0118793þ|þABC0118793þ|þABC0118793þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ
    þABC0118807þ|þABC0118807þ|þABC0118807þ|þABC0118807þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ
    þABC0118808þ|þABC0118808þ|þABC0118808þ|þABC0118808þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ

    Need to replace the 2nd and 4th ABCxxxxxxx string with the ABCxxxxxxx string at the beginning of the next line less one number as follows:

    þABC0118789þ|þABC0118792þ|þABC0118789þ|þABC0118792þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ
    þABC0118793þ|þABC0118806þ|þABC0118793þ|þABC0118806þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ
    þABC0118807þ|þABC0118807þ|þABC0118807þ|þABC0118807þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ
    þABC0118808þ|þABC0118808þ|þABC0118808þ|þABC0118808þ|þþ|þþ|þþ|þþ|þþ|þþ|þþ|þþ

    Your help is tremendously appreciated! Thanks.

      Aug 13, 2014#2

      A regular expression replace can only search for a string and replace it with another one. It cannot convert a string to an integer, perform arithmetic on it such as subtracting one, and convert the result back to a string.

      However, this is possible with an UltraEdit script, and I will code one for you if you need such a script and do not want to write it yourself.
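To illustrate the difference, here is a minimal sketch in plain JavaScript (runnable in Node, not tied to UltraEdit). It assumes IDs that end in a run of digits, as in the sample; the helper name `decrementId` is hypothetical. A script can compute the replacement text, which a fixed replace string cannot:

```javascript
// Hypothetical helper, not an UltraEdit function: subtract 1 from the
// trailing digits of an ID such as "ABC0118793" and keep the width.
function decrementId(id) {
  return id.replace(/\d+$/, function (digits) {
    // Convert to an integer, subtract one, convert back to a string.
    var next = String(parseInt(digits, 10) - 1);
    // Restore the leading zeros lost by the integer conversion.
    while (next.length < digits.length) {
      next = "0" + next;
    }
    return next;
  });
}

console.log(decrementId("ABC0118793")); // ABC0118792
```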

      How large is the file in MB?
      Best regards from an UC/UE/UES for Windows user from Austria

        Aug 13, 2014#3

        The file I'm working on now is less than 1 MB and needs at least 1,000 occurrences of the find/replace operation. I will also be working on 2 more, much larger files that need the same find/replace and contain 40,000+ occurrences. I'm thinking of running the following once for each alphanumeric ending in 9, then 8, then 7, and so on for all digits down to 0:

        Find:

        (þ)(NUMAL\d\d\d\d\d\d\d)(þþ)(NUMAL\d\d\d\d\d\d\d)(þþ)(NUMAL\d\d\d\d\d\d\d)(þþ)(NUMAL\d\d\d\d\d\d\d)(þþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþþ
        þ)(NUMAL\d\d\d\d\d\d)(9)

        Replace:

        \1\2\3\10need_to_insert_the_number_8_here\5\6\7\10need_to_insert_the_number_8_here\9\10\11

        Not sure how to make the replace put an 8 after the 10th pattern (\10). Do you understand what I'm thinking? That should work, right? Is there a better way?

          Aug 14, 2014#4

          Your approach breaks down once you take into account what must be done to convert a number string like 1000000 to 0999999: every digit changes, not just the last one.
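The borrow is what a last-digit replace cannot express. As a rough sketch in plain JavaScript (the function name `decrementDigits` is hypothetical), subtracting one from a digit string means turning every trailing 0 into 9 and then lowering the first non-zero digit to the left:

```javascript
// Subtract 1 from a string of decimal digits without converting it to a
// number, to show how far a borrow can reach. Returns null for all zeros.
function decrementDigits(digits) {
  var chars = digits.split("");
  var i = chars.length - 1;
  // Every trailing 0 becomes 9 and passes the borrow one place left.
  while (i >= 0 && chars[i] === "0") {
    chars[i] = "9";
    i--;
  }
  if (i < 0) {
    return null; // all digits were 0, nothing to borrow from
  }
  chars[i] = String(Number(chars[i]) - 1);
  return chars.join("");
}

console.log(decrementDigits("0118793")); // 0118792
console.log(decrementDigits("1000000")); // 0999999
```

A replace that only swaps the final digit covers the first call but not the second, where all seven digits change.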

          Here is a script that works even for large files, but it will take some time to finish because of the large number of display updates.

          Save this code into a file with the extension js and add the file to the list of scripts via Scripting - Scripts. Open your CSV file and run the script from the Scripting menu, or double-click the script in the script list opened with View - Views/Lists - Script List.

          Code:

          if (UltraEdit.document.length > 0)  // Is any file opened?
          {
             // Define environment for this script.
             UltraEdit.insertMode();
             if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
             else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
          
             // Define all parameters for the regular expression find/replace.
             UltraEdit.perlReOn();
             UltraEdit.activeDocument.findReplace.mode=0;
             UltraEdit.activeDocument.findReplace.matchCase=true;
             UltraEdit.activeDocument.findReplace.matchWord=false;
             UltraEdit.activeDocument.findReplace.regExp=true;
             UltraEdit.activeDocument.findReplace.searchDown=true;
             if (typeof(UltraEdit.activeDocument.findReplace.searchInColumn) == "boolean") {
                UltraEdit.activeDocument.findReplace.searchInColumn=false;
             }
             UltraEdit.activeDocument.findReplace.preserveCase=false;
             UltraEdit.activeDocument.findReplace.replaceAll=false;
             UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
          
             // Move caret to top of the active file.
             UltraEdit.activeDocument.top();
             // This string is necessary to keep length of number
             // string by inserting right number of leading zeros.
             var sLeadingZeros = "00000000";
          
             while(UltraEdit.activeDocument.findReplace.find("^(?:.*?\\|){4}.*\r?\n(?:.*?\\|){4}"))
             {
                var sLines = UltraEdit.activeDocument.selection;
                // Get number in data column 2 in line 2.
                var sReadNumber1 = sLines.replace(/^.*\r?\n.*?\|.*?ABC(\d+)[\s\S]+$/,"$1");
                // Get number in data column 4 in line 2.
                var sReadNumber2 = sLines.replace(/^.*\r?\n(?:.*?\|){3}.*?ABC(\d+).*$/,"$1");
                // Convert the 2 number strings to integer numbers and subtract 1.
                var nIntNumber1 = parseInt(sReadNumber1,10) - 1;
                var nIntNumber2 = parseInt(sReadNumber2,10) - 1;
                // Convert the 2 integer numbers to number strings.
                var sWriteNumber1 = nIntNumber1.toString(10);
                var sWriteNumber2 = nIntNumber2.toString(10);
                // Insert leading zeros if new number string is shorter than it should be.
                if (sWriteNumber1.length < sReadNumber1.length)
                {
                   sWriteNumber1 = sLeadingZeros.substr(0,sReadNumber1.length-sWriteNumber1.length) + sWriteNumber1;
                }
                if (sWriteNumber2.length < sReadNumber2.length)
                {
                   sWriteNumber2 = sLeadingZeros.substr(0,sReadNumber2.length-sWriteNumber2.length) + sWriteNumber2;
                }
                // Build the string for the regular expression replace.
                var sReplace = "\\1" + sWriteNumber1 + "\\2" + sWriteNumber2;
                // Switch to a find and replace in selected text.
                UltraEdit.activeDocument.findReplace.mode=1;
                // Replace the 2 numbers to replace in first line of selection.
                UltraEdit.activeDocument.findReplace.replace("^(.*?\\|.*?ABC)\\d+(.*?\\|.*?\\|.*?ABC)\\d+",sReplace);
                // Switch back to a find in current file.
                UltraEdit.activeDocument.findReplace.mode=0;
             }
             UltraEdit.activeDocument.top();
          }
          PS: No number string should be equal to 0, because subtracting 1 would then most likely produce unexpected output.
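To see what the zero case does, the conversion steps of the script can be isolated in plain JavaScript. This is only a sketch for testing outside UltraEdit; the wrapper name `decrementPadded` is hypothetical, while the body mirrors the script's own statements:

```javascript
// Same parseInt / toString / leading-zero steps as in the script,
// wrapped in a function so they can be tried on single values.
function decrementPadded(sReadNumber) {
  var sLeadingZeros = "00000000";
  var sWriteNumber = (parseInt(sReadNumber, 10) - 1).toString(10);
  // Insert leading zeros if the new string is shorter than the old one.
  if (sWriteNumber.length < sReadNumber.length) {
    sWriteNumber = sLeadingZeros.substr(0, sReadNumber.length - sWriteNumber.length) + sWriteNumber;
  }
  return sWriteNumber;
}

console.log(decrementPadded("0118793")); // 0118792
console.log(decrementPadded("0000000")); // 00000-1
```

With an all-zero input, the result is -1 padded with zeros, which is why such values should not occur in the file. Note also that `sLeadingZeros` holds 8 zeros, so at most 8 missing digits can be restored.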

            Aug 14, 2014#5

            Hi Mofi,

            Thanks so much for your replies and for the script. Yes, it hit me that numbers ending in one or more zeros would have to be dealt with specially. It's clear that your script will be much more efficient and effective. I'm assuming I'll need to make some edits to it to get it to work correctly in my environment, right? Is there a place online where I can learn more about scripting in UltraEdit? I appreciate your time and assistance.

            Anthony

              Aug 14, 2014#6

              admullins wrote: I'm assuming I'll need to make some edits to it to get it to work correctly in my environment, right?
              I don't think so. I ran it on the snippets you posted and verified the result. You should only need to do what I described above the script in my previous post: copy and paste the code, save it, add it to the scripts list, and run it while the file to modify is open and active.
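For an independent cross-check of the expected result outside UltraEdit, the whole transformation can be sketched in plain JavaScript on an array of lines. This is not the UltraEdit script; the function name `fixRanges` is hypothetical, and the sketch assumes the `|` separator and `ABC` prefix from the posted sample (lines shortened to five fields here):

```javascript
// For every line except the last, replace the number in fields 2 and 4
// with the number of the next line's first field minus one.
function fixRanges(lines) {
  return lines.map(function (line, i) {
    if (i + 1 >= lines.length) return line; // last line has no successor
    var m = /ABC(\d+)/.exec(lines[i + 1]);
    var fields = line.split("|");
    if (!m || fields.length < 4) return line; // leave unexpected lines alone
    var end = String(parseInt(m[1], 10) - 1);
    while (end.length < m[1].length) end = "0" + end; // keep the width
    fields[1] = fields[1].replace(/ABC\d+/, "ABC" + end);
    fields[3] = fields[3].replace(/ABC\d+/, "ABC" + end);
    return fields.join("|");
  });
}

var sample = [
  "þABC0118789þ|þABC0118789þ|þABC0118789þ|þABC0118789þ|þþ",
  "þABC0118793þ|þABC0118793þ|þABC0118793þ|þABC0118793þ|þþ",
  "þABC0118807þ|þABC0118807þ|þABC0118807þ|þABC0118807þ|þþ"
];
console.log(fixRanges(sample).join("\n"));
// þABC0118789þ|þABC0118792þ|þABC0118789þ|þABC0118792þ|þþ
// þABC0118793þ|þABC0118806þ|þABC0118793þ|þABC0118806þ|þþ
// þABC0118807þ|þABC0118807þ|þABC0118807þ|þABC0118807þ|þþ
```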
              admullins wrote: Is there a place online that I can learn more about scripting in UltraEdit?
              On the Support - Power Tips & Tutorials page there is a link to the Integrated scripting engine tutorial, which explains with screenshots what I described briefly in my previous post.

              Also take a look at the forum announcement topic JavaScript tutorial, power tips and more, which can always be found at the top of the Scripts forum.

              It is also helpful to open the tag list with View - Views/Lists - Tag List and select the tag group UE/UES Script Commands, which contains all UltraEdit-specific methods and properties with a short description. The UltraEdit help has a page Scripting commands with more information about those functions and properties.

              The JavaScript core features are not described in the UltraEdit help, as there are many public web pages with lessons on how to use the JavaScript core objects and their methods and properties.