Regular expression to search for attributes with a missing quote

Advanced User

    Oct 13, 2021 #1

    I need a Perl regular expression that selects content only if it is missing either the beginning quote or the ending quote. The beginning quote is always preceded by an equals sign (=). The ending quote can be followed by a space, more text, or a carriage return. A single line can contain many attributes (quote pairs) to check.

    I tried (?<!")(.*?)" but that was a disaster. I thought maybe I could use a simple regular expression that finds the equals sign, looks at the next character, and checks whether it is a quote followed by text and an end quote. If the quote at the beginning or end of the text is missing, it should be added.

    Note that the text between the quotes is always character data. There are no symbols or spaces.

    Code:

    <table pgwide="0" id="dvr_config_firmware>
    <title>DFR Firmware</title>
    <tgroup cols="2">
    <colspec colname="col1>
    <colspec colname="col2">

    I tried this regular expression, which works in regex101, but UES does not find any match with it.

    Code:

    /=(?|(")([^"<>\s]*)()(?=[\s>]|\/>)|(?!")()([^"<>\s]*)("))/

    You can see the regex working at https://regex101.com/r/2qvpLr/1

    Thank you for the help. Max

    Grand Master

      Oct 14, 2021 #2

      The Perl regular expression search string works fine in UltraEdit if the / at the beginning and at the end is removed. Those slashes are delimiters used by applications such as Linux sed, by a JavaScript RegExp object, or in a Perl script to mark where the regular expression begins and ends; they are not part of the expression. In UltraEdit, enter just the regular expression itself, that is, the part between the first and last /.
      Best regards from an UC/UE/UES for Windows user from Austria
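For anyone pasting patterns from regex101 or a script, the point about the surrounding slashes can be sketched in Python. The helper name strip_regex_delimiters is hypothetical and only for illustration; it is not an UltraEdit feature. It treats the first / and the last / as delimiters and returns what lies between them.

```python
def strip_regex_delimiters(literal):
    """Return the bare pattern from a /pattern/ or /pattern/flags literal.

    Hypothetical helper for illustration only.
    """
    last = literal.rfind("/")
    if literal.startswith("/") and last > 0:
        return literal[1:last]
    return literal  # no delimiters found, assume the pattern is already bare


# The search string from post #1, including the delimiters.
pasted = r'/=(?|(")([^"<>\s]*)()(?=[\s>]|\/>)|(?!")()([^"<>\s]*)("))/'
bare = strip_regex_delimiters(pasted)
print(bare)  # the text to enter in the Find field
```

The escaped \/ inside the pattern is left alone because only the first and last slash are treated as delimiters.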

      Advanced User

        Oct 14, 2021 #3

        It works, but I found a new case: some attributes do contain spaces. For example:

        Code:

        <applicdef verdate="18 Jan 2019 verstatus="ver">

        The regular expression matches only up to

        Code:

        <applicdef verdate="18

        But I need to evaluate each attribute as a whole. In this instance, '18 Jan 2019' should be matched as needing a quote added. Is there some way to check whether an attribute is missing its quote before the next attribute or the closing > of the tag?

        thank you,
        Max
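A small Python sketch of why the match stops at 18: the value part of the pattern from post #1 is the character class [^"<>\s]*, which excludes whitespace. Only that fragment is used here, because the full pattern relies on the branch reset group (?|...), which Python's re module does not support. The variable names are illustrative.

```python
import re

line = '<applicdef verdate="18 Jan 2019 verstatus="ver">'

# Value class taken from the original pattern: no quotes, angle brackets, or whitespace.
value_class = r'[^"<>\s]*'

# Equals sign, opening quote, then as many value characters as the class allows.
match = re.search(r'="(' + value_class + r')', line)
print(match.group(1))  # 18
```

The space after 18 ends the run, so an attribute value with spaces can never be captured as a whole by this class.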

        Grand Master

          Oct 14, 2021 #4

          To modify the invalid XML block

          Code:

          <table pgwide=0" id="dvr_config_firmware>
          <title>DFR Firmware</title>
          <tgroup cols="3">
          <colspec colname=col1>
          <colspec colname="col2">
          <colspec colname="col3 attrib="xyz">
          <applicdef verdate="18 Jan 2019 verstatus="ver">

          to

          Code:

          <table pgwide="0" id="dvr_config_firmware">
          <title>DFR Firmware</title>
          <tgroup cols="3">
          <colspec colname="col1">
          <colspec colname="col2">
          <colspec colname="col3" attrib="xyz">
          <applicdef verdate="18 Jan 2019" verstatus="ver">

          first run a Perl regular expression Replace All with the search string \w=\K([^"=>]+)(?=>) and the replace string "$1" or "\1". This inserts " around attribute values such as col1 in the example above, where both double quotes are missing.

          Next, run a Perl regular expression Replace All with the search string \w=\K(?:(?!")|"[^">]*\K(?=>)|"[^ >"]++(?= \w+=)\K|"(?:[^ >"]++(?![>"])(?! \w+=) )+[^ ">]+\K) and just " as the replace string. This inserts the other four missing double quotes in the example block.

          I tested both Perl regular expression replaces with UltraEdit v28.20.0.44, which has enhanced support for Perl regular expression finds/replaces compared with earlier versions. I do not know which earlier versions of UltraEdit support the second search expression at all.
          Best regards from an UC/UE/UES for Windows user from Austria
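The first of the two replaces above can be tried outside UltraEdit with a short Python sketch. This is an adaptation, not the exact search string: Python's re module has no \K, so the fixed-width lookbehind (?<=\w=) stands in for \w=\K. The second search string is not reproduced here because it uses \K in several alternatives.

```python
import re

broken = """<table pgwide=0" id="dvr_config_firmware>
<title>DFR Firmware</title>
<tgroup cols="3">
<colspec colname=col1>
<colspec colname="col2">
<colspec colname="col3 attrib="xyz">
<applicdef verdate="18 Jan 2019 verstatus="ver">"""

# Value with no quote at all, directly followed by the closing > of the tag.
both_missing = re.compile(r'(?<=\w=)([^"=>]+)(?=>)')

after_first_pass = both_missing.sub(r'"\1"', broken)
print(after_first_pass)
```

On this sample only the colname=col1 line changes; the values with a single missing quote are left for the second replace.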

          Master

            Oct 21, 2021 #5

            Good news: the next public UE build, 28.20.0.70, fixes an issue that I found while testing several variations of Perl regular expressions and reported to UltraEdit support. :)

            The following single Perl replace will work after updating to this version. (I tested it with the non-public user verification build UE 28.20.0.68.)

            F: \w=\K(")?([\w ]+)(?(1)(?!")|"?)(?!\w*[="])
            R: "\2"

            BR, Fleggy
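For readers who want to try the single replace outside UltraEdit, here is a Python sketch. It is an adaptation under one assumption: Python's re module lacks \K, so (?<=\w=) is used in its place. The conditional group (?(1)...|...) is supported by re as written. The sample text is the invalid block from post #4.

```python
import re

broken = """<table pgwide=0" id="dvr_config_firmware>
<title>DFR Firmware</title>
<tgroup cols="3">
<colspec colname=col1>
<colspec colname="col2">
<colspec colname="col3 attrib="xyz">
<applicdef verdate="18 Jan 2019 verstatus="ver">"""

expected = """<table pgwide="0" id="dvr_config_firmware">
<title>DFR Firmware</title>
<tgroup cols="3">
<colspec colname="col1">
<colspec colname="col2">
<colspec colname="col3" attrib="xyz">
<applicdef verdate="18 Jan 2019" verstatus="ver">"""

# Group 1: optional opening quote. Group 2: the value (word characters and spaces).
# If group 1 matched, the value must not be followed by a quote; otherwise an
# optional closing quote is consumed. The last lookahead stops the value before
# the next attribute name.
single_pass = re.compile(r'(?<=\w=)(")?([\w ]+)(?(1)(?!")|"?)(?!\w*[="])')

fixed = single_pass.sub(r'"\2"', broken)
print(fixed == expected)  # True
```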

            Grand Master

              Oct 31, 2021 #6

              Yes, the single Perl regular expression replace posted by Fleggy does a fantastic job of inserting the missing double quotes around the attribute values with UltraEdit for Windows v28.20.0.70 and with UEStudio v21.10.0.24. Just avoid running it a second time on the same file; the same applies to my much longer search expression for finding a single missing double quote. Otherwise an attribute value such as "18 Jan 2019" that is already enclosed in double quotes would be modified again by wrongly inserting one more double quote.
              Best regards from an UC/UE/UES for Windows user from Austria
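The second-run problem can be reproduced with a Python sketch of the single replace from post #5. As an assumption for this port, (?<=\w=) replaces \w=\K, which Python's re module does not have. The input line is already correctly quoted.

```python
import re

single_pass = re.compile(r'(?<=\w=)(")?([\w ]+)(?(1)(?!")|"?)(?!\w*[="])')

already_fixed = '<applicdef verdate="18 Jan 2019" verstatus="ver">'
second_run = single_pass.sub(r'"\2"', already_fixed)
print(second_run)  # <applicdef verdate="18 Jan" 2019" verstatus="ver">
```

The engine backtracks inside the value until it finds a shorter piece, "18 Jan, that is followed by a space rather than a quote, and then adds a closing quote there. Values without spaces offer no such stopping point, which is why only the date is damaged.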

              Master

                Nov 01, 2021 #7

                Hmm, I forgot to make the regexp safe, thanks Mofi. It's time to use some control verbs :)
                The safe version:

                F: \w=\K(")?([\w ]+)(?(1)(?(?=")(*SKIP)(*FAIL))|"?)(?!\w*=)
                R: "\2"

                BR, Fleggy
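Python's re module has neither \K nor the control verbs (*SKIP)(*FAIL), so the safe version cannot be ported literally. The sketch below swaps in a different technique with a similar aim: an extra first alternative consumes values that already have both quotes, and a replacement function returns those unchanged. The names safe and add_quotes are hypothetical, and the second alternative is the pattern from post #5 with (?<=\w=) in place of \w=\K.

```python
import re

# Branch 1 consumes a value that already has both quotes (no capture groups).
# Branch 2 is the earlier single-pass pattern; group 2 holds the bare value.
safe = re.compile(
    r'(?<=\w=)(?:"[\w ]+"|(")?([\w ]+)(?(1)(?!")|"?)(?!\w*[="]))'
)


def add_quotes(match):
    value = match.group(2)
    if value is None:  # branch 1 matched: the value is already well quoted
        return match.group(0)
    return '"' + value + '"'


broken = '<applicdef verdate="18 Jan 2019 verstatus="ver">'
once = safe.sub(add_quotes, broken)
twice = safe.sub(add_quotes, once)
print(once)           # <applicdef verdate="18 Jan 2019" verstatus="ver">
print(once == twice)  # True
```

Because a well-quoted value is matched whole by the first alternative, the engine never gets the chance to retry inside it, so running the replace again leaves the text as it is.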

                Advanced User

                  Nov 08, 2021 #8

                  Thank you for the response. The code works great!