Replace multiple numeric characters with single alpha character

2
Newbie
2

    Aug 21, 2015#1

    Hi

    I've recently started using UE [v22.10/Win7x64] and am having problems with a data file that I need to import into another system. The file is 1.8M lines long but contains several thousand rogue entries, so it's not really something I want to fix manually!! 8O

    It's a postcode file; the correct values [second field onwards] are all alpha characters, but the second field in the CSV file can contain various numeric values (some also contain a decimal point).

    I need to replace these rogue entries with a single 'R' (my current attempts produce multiple 'R's). This is a sample of the data:


    BB1 1RY,B,B,G,G
    BB1 1RZ,B,B,G,G
    BB1 1SA,B,B,G,G
    BB1 1SB,427398,B,B,G
    BB1 1SD,B,B,G,G
    BB1 1SL,427723,B,B,G
    BB1 1SU,B,B,G,G
    BB1 1SW,B,B,G,G
    BB1 1SX,B,B,G,G
    BB1 1SY,B,B,G,G
    BB1 1SZ,B,B,G,G
    BB1 1TA,B,B,G,G
    BB1 1TB,427348.3,B,B,G
    BB1 1TD,B,B,G,G
    BB1 1TH,B,B,G,G
    BB1 1TJ,B,B,G,G
    BB1 1TL,427678.3,B,B,G
    BB1 1UE,B,B,G,G
    BB1 1UF,B,B,G,G
    BB1 1UG,B,B,G,G
    BB1 1UH,B,B,G,G
    BB1 1UJ,B,B,G,G
    BB1 1UL,B,B,G,G
    BB1 1UN,428011.9,B,B,G
    BB1 1UP,428066.5,B,B,G
    TIA
    Michael

    115
    Power User
    115

      Aug 21, 2015#2

      This is simple with REGEX.

      This is the Find String for Unix REGEX or Perl REGEX
      ,[0-9.]+,
      This is the Find String for UltraEdit REGEX
      ,[0-9.]+,
      and this is the Replace String
      ,R,

      This works on any numeric-only value between two commas, not just in the second field. Given the example you provided, there is at least one alphabetic character in the other fields so they will be ignored. If you do have other all numeric fields that need to remain, then a different REGEX is needed. Search this forum and you should be able to find examples from Mofi on how to find the start of a line, save the first field, change the second field, and restore the first field.
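For anyone who wants to try the two approaches outside UltraEdit, here is a minimal sketch in Python, whose `re` module uses Perl-style syntax. The first pattern is the Find String from this post. The second is only one possible way to restrict the replacement to the second field; it is an assumption on my part, not the expression from Mofi's posts. The `ZZ9 9ZZ` row is a made-up line added to show the difference; the other rows come from the sample data.

```python
import re

# Three rows from the sample data plus one hypothetical row (ZZ9 9ZZ)
# whose THIRD field is numeric, to show how the two patterns differ.
sample = "\n".join([
    "BB1 1SA,B,B,G,G",
    "BB1 1SB,427398,B,B,G",
    "BB1 1TB,427348.3,B,B,G",
    "ZZ9 9ZZ,B,12345,G,G",
])

# Pattern from the post: any run of digits/periods between two commas.
any_field = re.compile(r",[0-9.]+,")
result_any = any_field.sub(",R,", sample)

# Assumed variant: capture the first field at the start of the line,
# replace only a numeric second field, then restore the first field.
second_field = re.compile(r"^([^,\n]*),[0-9.]+,", re.MULTILINE)
result_second = second_field.sub(r"\1,R,", sample)

print(result_any)
print("---")
print(result_second)
```

With the first pattern the hypothetical row is changed as well; with the anchored variant it is left alone, which is the behaviour you would want if other all-numeric fields have to stay.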

      2
      Newbie
      2

        Aug 21, 2015#3

        Thank you! That worked perfectly.
        I was along the right lines, but not quite there. I think it was the + (or lack of it) that was messing me up.

        6,685587
        Grand Master
        6,685587

          Aug 21, 2015#4

          Mick, I modified both search expressions in your post which is the reason why both are now identical.

          Within a character set definition, which means within [...], no character that has a special meaning outside the square brackets needs to be escaped with \ (Unix/Perl) or ^ (UltraEdit), except:
          1. The character set closing character ], so that it is interpreted as a literal character instead of the closing bracket.
            I usually also escape [ within a character set definition for easier reading, although that is not necessary.
          2. The escape character \ (Unix/Perl) or ^ (UltraEdit), which must be escaped if it should be interpreted as a literal character in the character set.
          3. ^ (Unix/Perl) or ~ (UltraEdit), when it is the first character after the opening [ and should be interpreted as a literal character.
            Otherwise [^...] or [~...], respectively, would be a negative character set definition.
            ^ (Unix/Perl) or ~ (UltraEdit) need not be escaped to be interpreted as a literal character when it is not the first character in the square brackets.
          4. And last, the hyphen character -, which must be escaped when it should be interpreted as a literal character and not as meaning FROM-TO.
            The hyphen need not be escaped when it is the first or last character within the square brackets, as FROM-TO is not possible in this case and each regular expression engine then interprets the character literally.
            But I also escape a hyphen at the beginning or end of the character list, as this makes it clear to the regexp engine as well as to humans that it should be interpreted as a literal character.
          All other characters that have a special meaning outside a character set definition are always interpreted as literal characters inside it. So escaping them is not necessary, and leaving them unescaped makes the expression easier to read. It is of course never wrong to escape a character which need not be escaped in a character set definition.
          Best regards from an UC/UE/UES for Windows user from Austria
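The escaping rules above can be checked quickly with a short Python sketch. Python's `re` module follows the Unix/Perl conventions, so this only illustrates the `\` and `^` variants; the UltraEdit-syntax equivalents (`^` as escape, `~` for negation) are not covered here. The helper name `found` and the test strings are made up for illustration.

```python
import re

def found(char_set, text):
    """Return every match of the pattern in text, in order."""
    return re.findall(char_set, text)

# A period needs no escape inside [...]; escaping it changes nothing.
plain_dot = found(r"[0-9.]+", "x427348.3y")
escaped_dot = found(r"[0-9\.]+", "x427348.3y")

# Rules 1 and 2: ] and the escape character \ must be escaped.
bracket_and_backslash = found(r"[\]\\]", "a]b\\c")

# Rule 3: ^ is special only as the first character after [.
negated = found(r"[^a]", "a^b")          # everything except 'a'
caret_escaped = found(r"[\^a]", "a^b")   # literal ^ or a
caret_not_first = found(r"[a^]", "a^b")  # literal ^ or a, no escape needed

# Rule 4: - means FROM-TO only between two characters.
hyphen_range = found(r"[a-z]", "a-mz")
hyphen_escaped = found(r"[a\-z]", "a-mz")
hyphen_first = found(r"[-az]", "a-mz")
hyphen_last = found(r"[az-]", "a-mz")

# Other metacharacters are literal inside a character set.
other_specials = found(r"[.+*?$|(){}]", "1.5+2*3")

for name in ("plain_dot", "escaped_dot", "bracket_and_backslash", "negated",
             "caret_escaped", "caret_not_first", "hyphen_range",
             "hyphen_escaped", "hyphen_first", "hyphen_last",
             "other_specials"):
    print(name, globals()[name])
```

The escaped and unescaped forms return the same matches in each pair, which is why the extra backslash in a set like `[0-9\.]` is harmless but unnecessary.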

          115
          Power User
          115

            Aug 21, 2015#5

            Thanks, Mofi, for the correction on escaping the period inside the brackets. I was in a hurry (which often leads to mistakes), so I just did a quick test of the REGEX strings, and they worked the way I expected. I should have remembered that I didn't need to escape certain special characters inside brackets, though. I'm the section code reviewer and standards guy, so I've reminded others in my office when they should and shouldn't escape characters, and then I made the same mistake myself.