Replacing text between two sets of tags


    Apr 13, 2018 #1

    I have 200+ webpages in a folder. What I want to do is replace the text between the <h1></h1> tags with what is between the <title></title> tags so that both are the same. Is there an easy way to do this?

      Apr 14, 2018 #2

      Yes, this is possible, for example with a Perl regular expression Replace in Files using the search string (?s)(?<=<title>)(.+?)(</title>.+?<h1>).*?(?=</h1>) and the replace string \1\2\1, as long as the title element appears above the h1 element in every file.
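For readers without UltraEdit at hand, the same search and replace strings can be tried in Python, whose re module accepts this particular expression unchanged. This is only a rough sketch and not part of the original answer: the function names sync_h1_with_title and sync_folder, the *.html file pattern, and the UTF-8 encoding are assumptions made for illustration.

```python
import re
from pathlib import Path

# Search and replace strings exactly as given in the post above.
SEARCH = re.compile(r"(?s)(?<=<title>)(.+?)(</title>.+?<h1>).*?(?=</h1>)")
REPLACE = r"\1\2\1"


def sync_h1_with_title(html: str) -> str:
    """Return html with the h1 text replaced by the title text."""
    return SEARCH.sub(REPLACE, html)


def sync_folder(folder: str, pattern: str = "*.html", encoding: str = "utf-8") -> int:
    """Rewrite every matching file in folder; return how many files changed.

    The file pattern and encoding are assumptions; adjust them to the real files.
    """
    changed = 0
    for path in Path(folder).glob(pattern):
        # newline="" keeps the existing line endings (CRLF or LF) untouched.
        with path.open("r", encoding=encoding, newline="") as f:
            original = f.read()
        updated = sync_h1_with_title(original)
        if updated != original:
            with path.open("w", encoding=encoding, newline="") as f:
                f.write(updated)
            changed += 1
    return changed
```

Files without a title element, or with h1 above title, produce no match and are left untouched, which mirrors the condition stated above. Working on a copy of the folder first is advisable.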

      (?s) ... makes the dot also match newline characters; see the topic "." (dot) in Perl regular expressions doesn't include newline characters CRLF? for details.

      (?<=<title>) ... a positive lookbehind which requires <title> immediately before the match without matching it as part of the found string.

      (...) ... first marking/capturing group. The string found by the expression inside can be back-referenced in search or replace string with \1 (or $1).

      .+? ... find one or more characters non-greedy. This expression matches the string between the start and end tag of element title.

      (...) ... second marking/capturing group. The string found by the expression inside can be back-referenced in search or replace string with \2 (or $2).

      </title>.+?<h1> ... matches everything from the beginning of the end tag of element title to the end of the start tag of element h1, including newline characters because of (?s) at the beginning of the search string. The size of the block this expression can match is limited. But I suppose the block size is no problem for your task, as HTML/XHTML files usually don't have several MiB between element title and element h1.

      .*? ... find zero or more characters non-greedy. This expression matches the string between the start and end tag of element h1, which can also be an empty string if <h1></h1> is present in the file.

      (?=</h1>) ... a positive lookahead to stop matching zero or more characters on finding the end tag of element h1 without matching it as part of the found string.
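The breakdown above can also be written as one annotated expression. The following is a small sketch in Python rather than UltraEdit, using the re.VERBOSE flag so that each part carries its own comment; the name ANNOTATED and the sample string are made up for illustration.

```python
import re

# Same expression as the search string, split into its commented parts.
# re.DOTALL plays the role of (?s); re.VERBOSE allows whitespace and comments.
ANNOTATED = re.compile(
    r"""
    (?<=<title>)        # lookbehind: <title> must precede, but is not matched
    (.+?)               # group 1: the title text, non-greedy
    (</title>.+?<h1>)   # group 2: everything from </title> up to and including <h1>
    .*?                 # the current h1 text, possibly empty
    (?=</h1>)           # lookahead: stop before </h1> without matching it
    """,
    re.DOTALL | re.VERBOSE,
)

sample = "<title>Contact</title>\n<body>\n<h1></h1>"  # hypothetical input
match = ANNOTATED.search(sample)

print(repr(match.group(1)))  # 'Contact'
print(repr(match.group(2)))  # '</title>\n<body>\n<h1>'
print(ANNOTATED.sub(r"\1\2\1", sample))
```

Because the found string starts after <title> and ends before </h1>, the replace string \1\2\1 only has to rebuild the title text, the block in between, and the title text again as the new h1 content.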
      Best regards from an UC/UE/UES for Windows user from Austria

        Apr 14, 2018 #3

        Brilliant! Just ran it in "Replace in Files" and it worked a treat. Thank you so much.