Remove all spaces within an xml file except for within tags

    Jul 17, 2007#1

    Hi everyone,

    I hope you can help. I have been through most of the postings and couldn't find anything that addresses this problem. I'm not sure whether it can be done, so any help would be greatly appreciated.

    I need to clean up a large number of XML files and combine them all into one big file (I can create the macro to go through them all), but I need some help with the regexp.

    I want to eliminate all whitespace characters (tabs, spaces and carriage returns) between tags, but keep any spaces in the text between a start tag and its end tag.
    For example:
    <starttag>this is a test</starttag>
    <start2>helpappreciated</start2>

    Should become:
    <starttag>this is a test</starttag><start2>helpappreciated</start2>

    Thanks for your help.
    I'm using UE 12.00+3 and UltraEdit-style regular expressions.

      Jul 17, 2007#2

      Since version 12.10, UE comes with the XML parser XML Lint integrated. That of course does not help you directly.

      But as a user of version 12.00+3, you could download XML Lint yourself and install it as an external tool.

      See this post for further details: XML Lint - how to install

      I ran XML Lint with the option --noblanks ("drop ignorable blank spaces") and it did exactly what you want: it dropped all tabs, spaces and carriage returns between tags and between comments and tags. "Inline" spaces, those within the text of an element, survive.
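
For anyone without XML Lint at hand, the same idea (drop text nodes that consist only of whitespace) can be sketched in Python with the standard library. This is only an approximation, not XML Lint's actual algorithm: the function name `collapse` and the `<root>` wrapper element are assumptions for illustration, and a real parser needs a single root element, which the example in the question does not have.

```python
from xml.dom import Node, minidom


def drop_blank_text_nodes(node):
    """Recursively remove child text nodes that contain only whitespace."""
    for child in list(node.childNodes):  # copy, because we modify the list
        if child.nodeType == Node.TEXT_NODE and child.data.strip() == "":
            node.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            drop_blank_text_nodes(child)


def collapse(xml_text):
    """Parse xml_text, drop whitespace-only text nodes, return the root as a string."""
    doc = minidom.parseString(xml_text)
    drop_blank_text_nodes(doc.documentElement)
    return doc.documentElement.toxml()


# Hypothetical input: the two lines from the question wrapped in an assumed <root>.
sample = (
    "<root>\n"
    "  <starttag>this is a test</starttag>\n"
    "\t<start2>helpappreciated</start2>\n"
    "</root>"
)
print(collapse(sample))
```

Spaces inside "this is a test" are kept because that text node is not whitespace-only. Note that this sketch removes every whitespace-only text node, which may be more aggressive than --noblanks in mixed content.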

        Jul 17, 2007#3

        Reedited between 00:00 and 00:15 CET.

        Back again. Of course, with UE-style regex a simpler approach could be taken:

        Search for
        >[^t^p ]++<

        Replace with
        ><

        (This has only very minor disadvantages: for example, badly written empty tags that contain only blanks, such as <tag1> </tag1>, are collapsed to <tag1></tag1>.)
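
For readers outside UltraEdit, here is a rough Python equivalent of the search and replace above, as a minimal sketch. It assumes that ^t, ^p and the blank in the UE pattern correspond to tab, line break (CR and/or LF) and space; the function name `strip_between_tags` is made up for illustration.

```python
import re

# A run of zero or more tabs, line breaks or spaces sitting between the ">"
# that ends one tag and the "<" that starts the next one.
BETWEEN_TAGS = re.compile(r">[ \t\r\n]*<")


def strip_between_tags(xml_text):
    """Remove whitespace between adjacent tags; text inside elements is untouched."""
    return BETWEEN_TAGS.sub("><", xml_text)


sample = "<starttag>this is a test</starttag>\r\n\t<start2>helpappreciated</start2>\n"
print(strip_between_tags(sample))
```

Like the UE pattern, this is purely textual: it also rewrites a whitespace-only run between ">" and "<" inside comments or CDATA sections, and it leaves whitespace after the last tag in place because no "<" follows it.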