Making regular expression for finding IP addresses

3
Newbie
3

    Nov 09, 2010#1

    Hello all,

    I am quite new to UltraEdit and have a big database. Within that database I want to find a range of IP addresses, let's say 21.111.22.1 to 21.111.27.8. How can I make a regular expression for this? I have tried a lot already, but it doesn't seem to work. So I start with 'Search', 'Find in Files', and then what do I type?

    Sorry if this question is already posted on here: I couldn't find it.

    Thanks so much in advance for your help.

    With kind regards,
    Leeuwarden

    6,686585
    Grand Master
    6,686585

      Nov 09, 2010#2

      With the UltraEdit regexp engine search for [0-9]+.[0-9]+.[0-9]+.[0-9]+ to find any "Number.Number.Number.Number" string. With the Unix/Perl regexp engine the same search can be done with \d+\.\d+\.\d+\.\d+

      You can replace parts of the general expression by fixed parts, for example for your address range use 21.111.2[2-7].[0-9]+ (UE engine) or 21\.111\.2[2-7]\.\d+ (Unix/Perl engine).
      Best regards from an UC/UE/UES for Windows user from Austria
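For readers who want to try the Unix/Perl-style pattern outside UltraEdit, here is a minimal sketch in Python, whose `re` module accepts the same `\d` and `\.` syntax. The sample text is made up for illustration and is not taken from the original database.

```python
import re

# Perl-style pattern from the post above: third octet 22-27, any last octet.
pattern = re.compile(r"21\.111\.2[2-7]\.\d+")

# Hypothetical sample data, invented for this sketch.
sample = "a 21.111.22.1 b 21.111.28.5 c 21.111.27.8 d 10.0.0.1"

# findall returns every non-overlapping match as a string.
matches = pattern.findall(sample)
print(matches)
```

Only the addresses whose third octet lies between 22 and 27 are returned. Note that the pattern does not restrict the last octet, so it approximates the requested range rather than matching it exactly.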

      901
      Master
      901

        Nov 09, 2010#3

        If you want the four octets to be limited to 3 numeric digits, you can use this Perl regular expression:

        (\d{1,3}\.){3}\d{1,3}

        \d{1,3}\. matches 1 to 3 numeric digits (one octet) followed by a period
        Surrounding the whole thing in parentheses and adding {3} repeats the pattern 3 times (three octets followed by periods)
        The \d{1,3} at the end adds the final 1 to 3 digit octet after the last period
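The breakdown above can be checked with a short Python sketch (Python's `re` module is assumed here as a stand-in for the Perl engine; the test strings are invented). It shows that the pattern checks only the shape of the address, not the value of each octet.

```python
import re

# Three "1-3 digits plus dot" groups, then a final 1-3 digit octet.
pattern = re.compile(r"(\d{1,3}\.){3}\d{1,3}")

def has_ip_shape(text: str) -> bool:
    # fullmatch requires the whole string to fit the pattern.
    return pattern.fullmatch(text) is not None

print(has_ip_shape("21.111.22.1"))      # four octets of 1-3 digits
print(has_ip_shape("999.999.999.999"))  # shape fits, values are not checked
print(has_ip_shape("21.111.22"))        # only three octets
print(has_ip_shape("1234.1.1.1"))       # first octet has four digits
```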

        If you want to validate each octet to ensure that they fall between 0 and 255, you can do that as well...although it makes for a pretty long regular expression:

        (([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])

        [01]?[0-9]?[0-9] validates the numbers 0 through 199 with optional leading zeros
        2[0-4][0-9] validates the numbers 200 through 249
        25[0-5] validates the numbers 250 through 255
        The | character in between them acts as an OR

        Replace each [0-9] with \d to snug things up a little more:

        (([01]?\d?\d|2[0-4]\d|25[0-5])\.){3}([01]?\d?\d|2[0-4]\d|25[0-5])
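As a rough illustration, the octet-validating expression can be assembled from a single octet sub-pattern in Python. The names `OCTET` and `is_valid_ipv4` are made up for this sketch. One caveat worth seeing in practice: without anchoring, a search can still find a valid-looking address inside an invalid one, so the sketch uses `fullmatch` for validation.

```python
import re

# One octet: 0-199 (optional leading zeros), 200-249, or 250-255.
OCTET = r"([01]?\d?\d|2[0-4]\d|25[0-5])"

# Three "octet plus dot" groups followed by a final octet.
pattern = re.compile("(" + OCTET + r"\.){3}" + OCTET)

def is_valid_ipv4(text: str) -> bool:
    # The whole string must be exactly one dotted quad.
    return pattern.fullmatch(text) is not None

print(is_valid_ipv4("255.255.255.255"))
print(is_valid_ipv4("001.002.003.004"))
print(is_valid_ipv4("256.1.1.1"))

# Unanchored search skips the first digit and matches the rest.
partial = pattern.search("999.1.1.1").group(0)
print(partial)
```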

        3
        Newbie
        3

          Nov 10, 2010#4

          Thank you so much for your quick and very helpful responses! I tried both suggestions, but stumbled upon another (small) problem. When I use, for example, 21.111.2[2-7].[0-9]+, UltraEdit also finds 255.21.111.22.2. How can I make the software find only the addresses that BEGIN with 21.111.xxx? I tried it with %, so that would make %(21.111.2[2-7].[0-9]+) or %21.111.2[2-7].[0-9]+, right? But that doesn't work!

          Again, many thanks for your help!

          Kind regards,
          Leeuwarden

          EDIT: I thought about it a bit longer and understand it better now. I searched for the IP address at the start of a line, but in this data the addresses are not at the beginning of a line. So it should become something like: [NO digit/point OR digit directly before it, but an empty space]21.111.2[2-7].[0-9]+. Am I on the right track? ;)

          6,686585
          Grand Master
          6,686585

            Nov 10, 2010#5

            Don't make it more complicated than necessary. If there is always a space character to the left of the IP addresses you want to find, just use a space character as the first character of the regular expression search string.

            Or, with the UltraEdit regexp engine, search for [~.0-9]21.111.2[2-7].[0-9]+, which is a simplified form of what you want: the character to the left of the fixed string 21.111.2 must be any character except a dot or a digit. Of course, this character is also selected. A negative look-behind is not possible with the UltraEdit regexp engine.
            Best regards from an UC/UE/UES for Windows user from Austria

            901
            Master
            901

              Nov 10, 2010#6

              Yes, you are on the right track. My previous Perl examples were more general in nature (validating any IP address). Here is another Perl regular expression for you to consider:

              (?<![\d\.])21\.111\.2[2-7]\.\d{1,3}(?![\d\.])

              If you don't know what will come before or after your IP address, then you can specify what should not be there using negative look-arounds:

              (?<![\d\.]) is a negative look-behind that avoids strings preceded by a numeric digit or a period
              (?![\d\.]) is a negative look-ahead that avoids strings followed by a numeric digit or a period
              The ! character indicates that the look-around is negative (looking for the absence of the character rather than the presence of the character)
              The only difference between the two is the < character which indicates that one of them is looking behind while the other is looking ahead
              \d{1,3} for the last octet looks for any number up to 3 digits long. The [0-9]+ in your regular expression will match numbers of any length

              If you want to ensure that your last octet is a valid number between 0 and 255, then you can get that from yesterday's post
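To see the look-arounds at work, here is a small Python sketch (Python's `re` module supports the same look-behind and look-ahead syntax; the sample line is invented). It should skip both the address embedded after "255." and the one with an overlong last octet.

```python
import re

# Negative look-behind and look-ahead: no digit or dot may touch the match.
pattern = re.compile(r"(?<![\d\.])21\.111\.2[2-7]\.\d{1,3}(?![\d\.])")

# Hypothetical sample line, invented for this sketch.
sample = "x 21.111.22.2 y 255.21.111.22.2 z 21.111.23.4567 w 21.111.27.8"

found = pattern.findall(sample)
print(found)
```

Unlike the leading-character approach, the look-arounds do not become part of the match, so only the address itself is selected.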

              30
              Basic User
              30

                Nov 10, 2010#7

                Leeuwarden wrote: I tried it with %, so that would make %(21.111.2[2-7].[0-9]+) or %21.111.2[2-7].[0-9]+, right? But that doesn't work!

                EDIT: So it should become something like: [NO digit/point OR digit directly before it, but an empty space]21.111.2[2-7].[0-9]+. Am I on the right track?

                There are a few problems with the RegEx you tried. The first problem is that you are using the "%" character - I'm not sure what the purpose of that is, but what you want to use is "\b" to indicate a word boundary. That's all you need on either side of the expression.

                The second problem is that you're not escaping your periods. That's why you're finding IPs different than the pattern you're trying to get.


                In addition, the problem with some of the other examples that have been given in this thread is that they will catch octets like "999", which would be an invalid IP address.


                If you want IP addresses that always start with "21.111", then use this (Perl):

                \b21\.111\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b


                If you want to grab any IP address, use this (Perl):

                \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

                901
                Master
                901

                  Nov 10, 2010#8

                  Bracket wrote: The second problem is that you're not escaping your periods. That's why you're finding IPs different than the pattern you're trying to get.
                  Not true, Bracket. The reason this user is not escaping his periods is that he is using the UltraEdit form of regular expressions instead of Perl regular expressions. By the way, the original poster has already been given Perl regular expressions (shorter ones) to validate an entire IP address, and your addition of \b boundary markers does not fix the last issue raised by the user, namely the regular expression matching what appears to be a valid IP address immediately following "255." This is why Mofi suggested preceding the expression with a space and I instructed the user on the use of negative look-arounds.
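The point about \b can be checked with a short Python sketch (Python's `re` is assumed as a stand-in for the Perl engine). A dot is a non-word character, so a word boundary exists between "255." and "21", and the \b version still matches inside the longer string, while the look-around version does not.

```python
import re

text = "255.21.111.22.2"

# The "21.111" pattern with \b markers, as posted in this thread.
octet = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
with_boundary = re.compile(r"\b21\.111\." + octet + r"\." + octet + r"\b")

# The look-around pattern, as posted in this thread.
with_lookaround = re.compile(r"(?<![\d\.])21\.111\.2[2-7]\.\d{1,3}(?![\d\.])")

boundary_hit = with_boundary.search(text)
lookaround_hit = with_lookaround.search(text)

print(boundary_hit.group(0) if boundary_hit else None)
print(lookaround_hit)
```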

                  3
                  Newbie
                  3

                    Nov 11, 2010#9

                    Again, thank you all for the replies! I agree with Bracket that you should not count on my intelligence, especially not in this particular situation ;) However, all replies were helpful in understanding UE a little bit better. Funny how the simplest reply fixed my last problem: just add a space before the string.

                    This is the winning string: ' 21.111.2[2-7].[0-9]+'

                    By the way: in this set of data it's impossible to find invalid IP addresses, because the data only has valid addresses. When I have new questions, I will post them!
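For anyone repeating this outside UltraEdit, here is a hedged Python sketch of the same idea. It assumes, as stated above, that the data holds only valid addresses and that each address is preceded by a space. Because the pattern covers the whole 21.111.22.x to 21.111.27.x block, the sketch adds an optional numeric filter with the standard `ipaddress` module to narrow the hits to the exact range from the first post. The sample line and variable names are invented.

```python
import ipaddress
import re

# Leading space as in the winning string; the group captures only the address.
pattern = re.compile(r" (21\.111\.2[2-7]\.\d+)")

# Exact bounds from the original question.
low = ipaddress.IPv4Address("21.111.22.1")
high = ipaddress.IPv4Address("21.111.27.8")

# Hypothetical sample line, invented for this sketch.
sample = "host 21.111.22.1 peer 255.21.111.22.2 gw 21.111.27.9 dns 21.111.25.200"

# Step 1: the regex finds candidates preceded by a space.
candidates = pattern.findall(sample)

# Step 2: numeric comparison keeps only addresses inside the exact range.
in_range = [c for c in candidates if low <= ipaddress.IPv4Address(c) <= high]

print(candidates)
print(in_range)
```

An address at the very start of a line has no space before it and would be missed by this pattern, which is the same limitation the space-prefixed UltraEdit search has.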

                    901
                    Master
                    901

                      Nov 11, 2010#10

                      I'm glad Mofi's solution worked for you. ;) As he so well put it, no need to make it more complicated than necessary.