Find blanks in XML



    6:08 - Jul 22#1

    I have a folder structure with a folder called tables and subfolders table 1, table 2, etc.
    Each subfolder contains an XML file called table[Folder].xml.

    In total there are more than 600,000 lines.

    A few of them look like this:

    <c13>                    hej</c13>

    My problem is the blanks after '>', where the number of blanks can be anything (less than 1000).

    So the example should be changed to:

    <c13>hej</c13>
    The blanks can occur on any line, so I should search for >blank(s), not just <c13>blank(s).

    The ideal solution would process all XML files in the subfolders in one go, but as only a few lines are affected, opening one XML file at a time is OK.

    Best regards

    Edvard Korsbæk


      9:18 - Jul 22#2

      Hi Edvard,

      it is not clear what the c13 element (or similar elements that should be corrected) can look like. I assume the simplest scenario: there are no nested elements and no EOLs inside the element body. Here is a Perl regexp that finds such lines and removes the leading spaces from the body.

      Find: (?<TAG><(?<TAGNAME>\w+\d+)\b[^>]*+>) +(?<BODY>(?:(?!<|</).)++)(?=</\k<TAGNAME>>)
      Replace: $+{TAG}$+{BODY}

      This regexp matches elements whose name consists of word characters (letters, digits, underscores) and ends with at least one digit. Neither < nor </ is allowed inside the element body.
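
To try the same idea outside the editor, here is a rough Python translation of the pattern above. It is only a sketch under the same assumptions (no nested elements, no line breaks inside the element body). Python's `re` module writes named groups as `(?P<name>...)`, and the possessive quantifiers are dropped so that it also runs on Python versions older than 3.11. The function name `strip_leading_blanks` is made up for this illustration.

```python
import re

# Python version of the Find pattern; same assumptions as in the post above.
PATTERN = re.compile(
    r"(?P<tag><(?P<tagname>\w+\d+)\b[^>]*>)"  # opening tag, e.g. <c13>
    r" +"                                     # the leading spaces to drop
    r"(?P<body>[^<\n]+)"                      # body: no '<' and no line break
    r"(?=</(?P=tagname)>)"                    # must be followed by the matching closing tag
)


def strip_leading_blanks(line: str) -> str:
    """Remove the spaces between an opening tag and its text (hypothetical helper)."""
    # Equivalent of Replace: $+{TAG}$+{BODY}
    return PATTERN.sub(r"\g<tag>\g<body>", line)


print(strip_leading_blanks("<c13>                    hej</c13>"))
```

Elements whose name does not end in a digit, or whose body contains another tag, are left untouched by this pattern.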
      Let me know if it does not work as you expect.

      BR, Fleggy
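
For the "all XML files in one go" part of the question, a small script outside the editor is one possible route. The following Python sketch is not the method from the post above: it uses a simpler, broader pattern that removes spaces after any '>' when they are followed by text on the same line. It assumes that "blanks" means space characters only (not tabs), that the files use an ASCII-compatible encoding such as UTF-8 or Latin-1, and that no '>' followed by spaces has to be preserved (for example inside comments or CDATA). The names `strip_blanks` and `fix_folder` are made up for this illustration.

```python
import re
from pathlib import Path

# '>' followed by spaces, but only when real text follows on the same line.
# Working on bytes keeps the file encoding and line endings untouched.
BLANKS_AFTER_TAG = re.compile(rb">[ ]+(?=[^ <\r\n])")


def strip_blanks(data: bytes) -> bytes:
    """Drop the spaces between '>' and the text that follows it."""
    return BLANKS_AFTER_TAG.sub(b">", data)


def fix_folder(root: str) -> int:
    """Rewrite every *.xml file below root that needs fixing; return how many changed."""
    changed = 0
    for path in sorted(Path(root).rglob("*.xml")):
        original = path.read_bytes()
        cleaned = strip_blanks(original)
        if cleaned != original:  # leave clean files alone
            path.write_bytes(cleaned)
            changed += 1
    return changed
```

Calling `fix_folder("tables")` would walk all subfolders of tables. Since the files are rewritten in place, it is safer to run it on a copy of the folder first.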