First I run a Find in Files to search for the titles with the find string <p class="(ch-title|h1(\w+)?")
Please help me generate a table of contents from the search results.

Below are the Find in Files results, followed by the expected output file.

Code:
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\04_Chapter01.xhtml':
D:\Somenath\Project\Text\04_Chapter01.xhtml(11): <p class="ch-title">VIJF- EN -TWINTIG JAREN.</p>
D:\Somenath\Project\Text\04_Chapter01.xhtml(26): <p class="h1" id="h1_1"><a id="page_2"></a>NA DEN STORM!</p>
D:\Somenath\Project\Text\04_Chapter01.xhtml(107): <p class="h1" id="h1_2">Mr. S. M S. MODDERMAN,</p>
D:\Somenath\Project\Text\04_Chapter01.xhtml(177): <p class="h1" id="h1_3"><a id="page_9"></a>De ramp op Ceram.</p>
D:\Somenath\Project\Text\04_Chapter01.xhtml(235): <p class="h1" id="h1_4">De Zuster-Republieken in Zuid-Afrika.</p>
D:\Somenath\Project\Text\04_Chapter01.xhtml(271): <p class="h1" id="h1_5">Een nachtelijke aanval op Soestdijk.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\05_Chapter02.xhtml':
D:\Somenath\Project\Text\05_Chapter02.xhtml(11): <p class="ch-title">NA DEN STORM!</p>
D:\Somenath\Project\Text\05_Chapter02.xhtml(63): <p class="h1" id="h1_6">Naar den kratef Van den Gedeh.</p>
D:\Somenath\Project\Text\05_Chapter02.xhtml(82): <p class="h1" id="h1_7">De Zuster-Republieken in Zuid-Afrika.</p>
D:\Somenath\Project\Text\05_Chapter02.xhtml(121): <p class="h1" id="h1_8">Op Weg naar Alaska&#x2019;s goudvelden.</p>
D:\Somenath\Project\Text\05_Chapter02.xhtml(167): <p class="h1" id="h1_9">Een wandeling in het Velpsche Broek.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\06_Chapter03.xhtml':
D:\Somenath\Project\Text\06_Chapter03.xhtml(11): <p class="ch-title">NA DEN STORM!</p>
D:\Somenath\Project\Text\06_Chapter03.xhtml(52): <p class="h1" id="h1_10"><a id="page_36"></a>Een Veteraan in de Tropen</p>
D:\Somenath\Project\Text\06_Chapter03.xhtml(81): <p class="h1" id="h1_11">Een wandeling in het Velpsche Broek.</p>
D:\Somenath\Project\Text\06_Chapter03.xhtml(136): <p class="h1" id="h1_12">Op Weg naar Alaska&#x2019;s goudvelden.</p>
D:\Somenath\Project\Text\06_Chapter03.xhtml(209): <p class="h1" id="h1_13"><a id="page_45"></a>&#x201E;LONG TOM.&#x201D;</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\07_Chapter04.xhtml':
D:\Somenath\Project\Text\07_Chapter04.xhtml(11): <p class="ch-title">Muskus-Menschen.</p>
D:\Somenath\Project\Text\07_Chapter04.xhtml(96): <p class="h1" id="h1_14">De schutterij in Nederlandsch-Indi&#x00EB;</p>
D:\Somenath\Project\Text\07_Chapter04.xhtml(128): <p class="h1" id="h1_15">Maretakken</p>
D:\Somenath\Project\Text\07_Chapter04.xhtml(146): <p class="h1" id="h1_16">Een wandeling in het Velpsche Broek.</p>
D:\Somenath\Project\Text\07_Chapter04.xhtml(205): <p class="h1" id="h1_17">Een sollicitatie.</p>
D:\Somenath\Project\Text\07_Chapter04.xhtml(281): <p class="h1" id="h1_18">Aardschuivingen in de Preanger,</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\08_Chapter05.xhtml':
D:\Somenath\Project\Text\08_Chapter05.xhtml(11): <p class="ch-title">Muskas-Menschen.</p>
D:\Somenath\Project\Text\08_Chapter05.xhtml(136): <p class="h1" id="h1_19"><a id="page_69"></a>GENERAAL VAN DER HEIJDEN.</p>
D:\Somenath\Project\Text\08_Chapter05.xhtml(183): <p class="h1" id="h1_20">Een sollicitatie.</p>
D:\Somenath\Project\Text\08_Chapter05.xhtml(252): <p class="h1" id="h1_21">De Zuster-Republieken in Zuid-Afrika.</p>
D:\Somenath\Project\Text\08_Chapter05.xhtml(293): <p class="h1" id="h1_22">Een Engelsch kanon in Boerenhanden.</p>
D:\Somenath\Project\Text\08_Chapter05.xhtml(320): <p class="h1" id="h1_23">Herinnering aan Soekaboemi</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\09_Chapter06.xhtml':
D:\Somenath\Project\Text\09_Chapter06.xhtml(11): <p class="ch-title">Muskus-Menschen.</p>
D:\Somenath\Project\Text\09_Chapter06.xhtml(132): <p class="h1" id="h1_24">Ludwig Knaus.</p>
D:\Somenath\Project\Text\09_Chapter06.xhtml(148): <p class="h1" id="h1_25"><a id="page_87"></a>In verre zee&#x00EB;n voor driehonderd jaar</p>
D:\Somenath\Project\Text\09_Chapter06.xhtml(170): <p class="h1" id="h1_26">De Zuster-Republieken in Zuid-Afrika.</p>
D:\Somenath\Project\Text\09_Chapter06.xhtml(199): <p class="h1" id="h1_27">Uit het leven van een Zeeofficier in Indi&#x00EB;.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\10_Chapter07.xhtml':
D:\Somenath\Project\Text\10_Chapter07.xhtml(11): <p class="ch-title">Muskus-Menschen.</p>
D:\Somenath\Project\Text\10_Chapter07.xhtml(90): <p class="h1" id="h1_28">Biddende Vrouw van Nicolaes Maes.</p>
D:\Somenath\Project\Text\10_Chapter07.xhtml(101): <p class="h1" id="h1_29">Het stationschip in West-Indi&#x00EB;</p>
D:\Somenath\Project\Text\10_Chapter07.xhtml(164): <p class="h1" id="h1_30">In verre zee&#x00EB;n voor driehonderd jaar</p>
D:\Somenath\Project\Text\10_Chapter07.xhtml(180): <p class="h1" id="h1_31">Dr. L. A. J. Burgersdijk.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\11_Chapter08.xhtml':
D:\Somenath\Project\Text\11_Chapter08.xhtml(11): <p class="ch-title">Muskus-Menschen.</p>
D:\Somenath\Project\Text\11_Chapter08.xhtml(91): <p class="h1" id="h1_32">Kijkjes in de Koninklijke Marine</p>
D:\Somenath\Project\Text\11_Chapter08.xhtml(135): <p class="h1" id="h1_33">Het sprookje Van de robijnen.</p>
D:\Somenath\Project\Text\11_Chapter08.xhtml(245): <p class="h1" id="h1_34">De Z&#x00FC;ster-Hepublieken in Zuid-Afrika.</p>
D:\Somenath\Project\Text\11_Chapter08.xhtml(276): <p class="h1" id="h1_35">In verre zee&#x00EB;n voor driehonderd jaar</p>
D:\Somenath\Project\Text\11_Chapter08.xhtml(294): <p class="h1" id="h1_36">Impressie</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\12_Chapter09.xhtml':
D:\Somenath\Project\Text\12_Chapter09.xhtml(11): <p class="ch-title">Muskus-Menschen.</p>
D:\Somenath\Project\Text\12_Chapter09.xhtml(95): <p class="h1" id="h1_37">De Drinkebroer van Frans Hals.</p>
D:\Somenath\Project\Text\12_Chapter09.xhtml(131): <p class="h1" id="h1_38">Het sprookje van de robijnen.</p>
D:\Somenath\Project\Text\12_Chapter09.xhtml(216): <p class="h1" id="h1_39">Het nieuwe Slachthuis te Roermond.</p>
D:\Somenath\Project\Text\12_Chapter09.xhtml(273): <p class="h1" id="h1_40"><a id="page_139"></a>Kijkjes bij de Schutterij</p>
D:\Somenath\Project\Text\12_Chapter09.xhtml(338): <p class="h1" id="h1_41">Anna van Nievelt.</p>
D:\Somenath\Project\Text\12_Chapter09.xhtml(369): <p class="h1" id="h1_42">Fusains</p>
D:\Somenath\Project\Text\12_Chapter09.xhtml(387): <p class="h1" id="h1_43">Het Sneeuwklokje.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 8 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\13_Chapter10.xhtml':
D:\Somenath\Project\Text\13_Chapter10.xhtml(11): <p class="ch-title">Benjamin-af,</p>
D:\Somenath\Project\Text\13_Chapter10.xhtml(97): <p class="h1" id="h1_44"><a id="page_148"></a>F. Adama van Scheltema.</p>
D:\Somenath\Project\Text\13_Chapter10.xhtml(119): <p class="h1" id="h1_45">Van een Engelschman die Ladysmith niet vinden kon.</p>
D:\Somenath\Project\Text\13_Chapter10.xhtml(153): <p class="h1" id="h1_46"><a id="page_154"></a>Het sprookje van de robijnen.</p>
D:\Somenath\Project\Text\13_Chapter10.xhtml(271): <p class="h1" id="h1_47">De Narcis.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\14_Chapter11.xhtml':
D:\Somenath\Project\Text\14_Chapter11.xhtml(11): <p class="ch-title">Benjamin-af,</p>
D:\Somenath\Project\Text\14_Chapter11.xhtml(114): <p class="h1" id="h1_48">De Stadhouders van Friesland</p>
D:\Somenath\Project\Text\14_Chapter11.xhtml(143): <p class="h1" id="h1_49">Waarmee onze vaderen maaltijd hielden en hoe.</p>
D:\Somenath\Project\Text\14_Chapter11.xhtml(204): <p class="h1" id="h1_50">Soerabaia in vogelvlucht.</p>
D:\Somenath\Project\Text\14_Chapter11.xhtml(249): <p class="h1" id="h1_51">De Zusten-Republieken in Zuid-Afrika.</p>
D:\Somenath\Project\Text\14_Chapter11.xhtml(283): <p class="h1" id="h1_52">Van een Engelschman die Ladysmith niet vinden kon.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\15_Chapter12.xhtml':
D:\Somenath\Project\Text\15_Chapter12.xhtml(11): <p class="ch-title">Nog bijtijds</p>
D:\Somenath\Project\Text\15_Chapter12.xhtml(60): <p class="h1" id="h1_53">De Stadhouders van Friesland</p>
D:\Somenath\Project\Text\15_Chapter12.xhtml(106): <p class="h1" id="h1_54">Soerabaia in vogelvlucht.</p>
D:\Somenath\Project\Text\15_Chapter12.xhtml(156): <p class="h1" id="h1_55">Waarmee onze vaderen maaltijd hielden en hoe.</p>
D:\Somenath\Project\Text\15_Chapter12.xhtml(188): <p class="h1" id="h1_56">De Zuster-Republieken in Zuid-Afrika.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\16_Chapter13.xhtml':
D:\Somenath\Project\Text\16_Chapter13.xhtml(11): <p class="ch-title"><a id="page_193"></a>Maartje Bot</p>
D:\Somenath\Project\Text\16_Chapter13.xhtml(173): <p class="h1" id="h1_57">De Grot van ledjoe (Java).</p>
D:\Somenath\Project\Text\16_Chapter13.xhtml(242): <p class="h1" id="h1_58">Waarmee onze vaderen maaltijd hielden en hoe.</p>
D:\Somenath\Project\Text\16_Chapter13.xhtml(261): <p class="h1" id="h1_59">&#x201E;Baldadig.&#x201D;</p>
D:\Somenath\Project\Text\16_Chapter13.xhtml(289): <p class="h1" id="h1_60">Hoe en Waarom slapen de planten?</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\17_Chapter14.xhtml':
D:\Somenath\Project\Text\17_Chapter14.xhtml(11): <p class="ch-title">Maartje Bot</p>
D:\Somenath\Project\Text\17_Chapter14.xhtml(154): <p class="h1" id="h1_61">Het Ontzaglijke.</p>
D:\Somenath\Project\Text\17_Chapter14.xhtml(162): <p class="h1" id="h1_62"><a id="page_214"></a>Waarmee onze vaderen maaltijd hielden en hoe.</p>
D:\Somenath\Project\Text\17_Chapter14.xhtml(245): <p class="h1" id="h1_63">De Zuster-Republieken in Zuid-Afrika.</p>
D:\Somenath\Project\Text\17_Chapter14.xhtml(274): <p class="h1" id="h1_64">Uit de vogelenwereld.</p>
D:\Somenath\Project\Text\17_Chapter14.xhtml(311): <p class="h1" id="h1_65">De Siboga-expeditie.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\18_Chapter15.xhtml':
D:\Somenath\Project\Text\18_Chapter15.xhtml(11): <p class="ch-title">Maartje Bot</p>
D:\Somenath\Project\Text\18_Chapter15.xhtml(218): <p class="h1" id="h1_66">Batikken.</p>
D:\Somenath\Project\Text\18_Chapter15.xhtml(250): <p class="h1" id="h1_67"><a id="page_232"></a>Alleen op de wereld.</p>
D:\Somenath\Project\Text\18_Chapter15.xhtml(275): <p class="h1" id="h1_68">Uit de vogelenwereld.</p>
D:\Somenath\Project\Text\18_Chapter15.xhtml(321): <p class="h1" id="h1_69">Van &#x2019;t oude en &#x2019;t nieuwe Zeist.</p>
D:\Somenath\Project\Text\18_Chapter15.xhtml(375): <p class="h1" id="h1_70">De reuzenkijker van de Parijsche tentoonstelling.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\19_Chapter16.xhtml':
D:\Somenath\Project\Text\19_Chapter16.xhtml(11): <p class="ch-title">Een mooie werkavond</p>
D:\Somenath\Project\Text\19_Chapter16.xhtml(96): <p class="h1" id="h1_71">Van &#x2019;t oude en &#x2019;t nieuwe Zeist.</p>
D:\Somenath\Project\Text\19_Chapter16.xhtml(152): <p class="h1" id="h1_72">Eene schutjaspartij in Friesland.</p>
D:\Somenath\Project\Text\19_Chapter16.xhtml(186): <p class="h1" id="h1_73">Om een kooltje vuur.</p>
D:\Somenath\Project\Text\19_Chapter16.xhtml(193): <p class="h1" id="h1_74"><a id="page_250"></a>Sneeuwdag.</p>
D:\Somenath\Project\Text\19_Chapter16.xhtml(246): <p class="h1" id="h1_75">De Zending in Java&#x2019;s Oosthoek.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\20_Chapter17.xhtml':
D:\Somenath\Project\Text\20_Chapter17.xhtml(11): <p class="ch-title">Verborgen Ego&#x00EF;sme.</p>
D:\Somenath\Project\Text\20_Chapter17.xhtml(130): <p class="h1" id="h1_76">Een pionier voor onze West-Indische Koloni&#x00EB;n.</p>
D:\Somenath\Project\Text\20_Chapter17.xhtml(151): <p class="h1" id="h1_77">De Zending in Java&#x2019;s Oosthoek.</p>
D:\Somenath\Project\Text\20_Chapter17.xhtml(180): <p class="h1" id="h1_78">Gouden Bruiloft.</p>
D:\Somenath\Project\Text\20_Chapter17.xhtml(241): <p class="h1" id="h1_79">Van &#x2019;t oude en &#x2019;t nieuwe Zeist.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\21_Chapter18.xhtml':
D:\Somenath\Project\Text\21_Chapter18.xhtml(11): <p class="ch-title">BERGAF!</p>
D:\Somenath\Project\Text\21_Chapter18.xhtml(88): <p class="h1" id="h1_80">Van oude zeelui.</p>
D:\Somenath\Project\Text\21_Chapter18.xhtml(130): <p class="h1" id="h1_81">Herinneringen aan Holland in Spanje.</p>
D:\Somenath\Project\Text\21_Chapter18.xhtml(138): <p class="h1" id="h1_82"><a id="page_281"></a>Onder de Palmen.</p>
D:\Somenath\Project\Text\21_Chapter18.xhtml(146): <p class="h1" id="h1_83">Gouden Bruiloft.</p>
D:\Somenath\Project\Text\21_Chapter18.xhtml(288): <p class="h1" id="h1_84">Op bezoek bij een Hollandsche schilder in de Saharah.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 6 time(s).
----------------------------------------
Find '<p class="(ch-tit|h1(\w+)?")' in 'D:\Somenath\Project\Text\22_Chapter19.xhtml':
D:\Somenath\Project\Text\22_Chapter19.xhtml(11): <p class="ch-title">BERGAF!</p>
D:\Somenath\Project\Text\22_Chapter19.xhtml(73): <p class="h1" id="h1_85"><a id="page_292"></a>Uit het land der Salanganen.</p>
D:\Somenath\Project\Text\22_Chapter19.xhtml(107): <p class="h1" id="h1_86"><a id="page_296"></a>Varnde Liute.</p>
D:\Somenath\Project\Text\22_Chapter19.xhtml(131): <p class="h1" id="h1_87">Een herinnering aan de koninginne week te Amsterdam.</p>
D:\Somenath\Project\Text\22_Chapter19.xhtml(180): <p class="h1" id="h1_88">HET LOT.</p>
Found '<p class="(ch-tit|h1(\w+)?")' 5 time(s).

Below is the expected output file:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
     <head>
        <meta http-equiv="default-style" content="text/html; charset=utf-8"/>
        <title>XXX</title>
        <link rel="stylesheet" href="../Styles/stylesheet.css" type="text/css"/>
    </head>
    <body>
    <nav id="toc" epub:type="toc">
      <h1>Inhoud</h1>
<ol>
<li id="NavPoint-1"><a href="02_Titlepage.xhtml">Titelpagina</a></li>
<li id="NavPoint-2"><a href="04_Chapter01.xhtml#Chapter01">VIJF- EN -TWINTIG JAREN.</a>
<ol>
<li id="NavPoint-1"><a href="04_Chapter01.xhtml#h1_1">NA DEN STORM!</a></li>
<li id="NavPoint-2"><a href="04_Chapter01.xhtml#h1_2">Mr. S. M S. MODDERMAN,</a></li>
<li id="NavPoint-3"><a href="04_Chapter01.xhtml#h1_3">De ramp op Ceram.</a></li>
<li id="NavPoint-4"><a href="04_Chapter01.xhtml#h1_4">De Zuster-Republieken in Zuid-Afrika.</a></li>
<li id="NavPoint-5"><a href="04_Chapter01.xhtml#h1_5">Een nachtelijke aanval op Soestdijk.</a></li></ol></li>
<li id="NavPoint-6"><a href="05_Chapter02.xhtml#Chapter02">NA DEN STORM!</a>
<ol>
<li id="NavPoint-7"><a href="05_Chapter02.xhtml#h1_6">Naar den kratef Van den Gedeh.</a></li>
<li id="NavPoint-8"><a href="05_Chapter02.xhtml#h1_7">De Zuster-Republieken in Zuid-Afrika.</a></li>
<li id="NavPoint-9"><a href="05_Chapter02.xhtml#h1_8">Op Weg naar Alaska’s goudvelden.</a></li>
<li id="NavPoint-10"><a href="05_Chapter02.xhtml#h1_9">Een wandeling in het Velpsche Broek.</a></li></ol></li>
<li id="NavPoint-11"><a href="06_Chapter03.xhtml#Chapter03">NA DEN STORM!</a>
<ol>
<li id="NavPoint-12"><a href="06_Chapter03.xhtml#h1_10">Een Veteraan in de Tropen</a></li>
<li id="NavPoint-13"><a href="06_Chapter03.xhtml#h1_11">Een wandeling in het Velpsche Broek.</a></li>
<li id="NavPoint-14"><a href="06_Chapter03.xhtml#h1_12">Op Weg naar Alaska’s goudvelden.</a></li>
<li id="NavPoint-15"><a href="06_Chapter03.xhtml#h1_13">„LONG TOM.”</a></li></ol></li>
<li id="NavPoint-16"><a href="07_Chapter04.xhtml#Chapter04">Muskus-Menschen.</a>
<ol>
<li id="NavPoint-17"><a href="07_Chapter04.xhtml#h1_14">De schutterij in Nederlandsch-Indi&#x00EB;</a></li>
<li id="NavPoint-18"><a href="07_Chapter04.xhtml#h1_15">Maretakken</a></li>
<li id="NavPoint-19"><a href="07_Chapter04.xhtml#h1_16">Een wandeling in het Velpsche Broek.</a></li>
<li id="NavPoint-20"><a href="07_Chapter04.xhtml#h1_17">Een sollicitatie.</a></li>
<li id="NavPoint-21"><a href="07_Chapter04.xhtml#h1_18">Aardschuivingen in de Preanger,</a></li></ol></li>
<li id="NavPoint-22"><a href="08_Chapter05.xhtml#Chapter05">Muskas-Menschen.</a>
<ol>
<li id="NavPoint-23"><a href="08_Chapter05.xhtml#h1_19">GENERAAL VAN DER HEIJDEN.</a></li>
<li id="NavPoint-24"><a href="08_Chapter05.xhtml#h1_20">Een sollicitatie.</a></li>
<li id="NavPoint-25"><a href="08_Chapter05.xhtml#h1_21">De Zuster-Republieken in Zuid-Afrika.</a></li>
<li id="NavPoint-26"><a href="08_Chapter05.xhtml#h1_22">Een Engelsch kanon in Boerenhanden.</a></li>
<li id="NavPoint-27"><a href="08_Chapter05.xhtml#h1_23">Herinnering aan Soekaboemi</a></li></ol></li>
<li id="NavPoint-28"><a href="09_Chapter06.xhtml#Chapter06">Muskus-Menschen.</a>
<ol>
<li id="NavPoint-29"><a href="09_Chapter06.xhtml#h1_24">Ludwig Knaus.</a></li>
<li id="NavPoint-30"><a href="09_Chapter06.xhtml#h1_25">In verre zee&#x00EB;n voor driehonderd jaar</a></li>
<li id="NavPoint-31"><a href="09_Chapter06.xhtml#h1_26">De Zuster-Republieken in Zuid-Afrika.</a></li>
<li id="NavPoint-32"><a href="09_Chapter06.xhtml#h1_27">Uit het leven van een Zeeofficier in Indi&#x00EB;.</a></li></ol></li>
<li id="NavPoint-33"><a href="10_Chapter07.xhtml#Chapter07">Muskus-Menschen.</a>
<ol>
<li id="NavPoint-34"><a href="10_Chapter07.xhtml#h1_28">Biddende Vrouw van Nicolaes Maes.</a></li>
<li id="NavPoint-35"><a href="10_Chapter07.xhtml#h1_29">Het stationschip in West-Indi&#x00EB;</a></li>
<li id="NavPoint-36"><a href="10_Chapter07.xhtml#h1_30">In verre zee&#x00EB;n voor driehonderd jaar</a></li>
<li id="NavPoint-37"><a href="10_Chapter07.xhtml#h1_31">Dr. L. A. J. Burgersdijk.</a></li></ol></li>
<li id="NavPoint-38"><a href="11_Chapter08.xhtml#Chapter08">Muskus-Menschen.</a>
<ol>
<li id="NavPoint-39"><a href="11_Chapter08.xhtml#h1_32">Kijkjes in de Koninklijke Marine</a></li>
<li id="NavPoint-40"><a href="11_Chapter08.xhtml#h1_33">Het sprookje Van de robijnen.</a></li>
<li id="NavPoint-41"><a href="11_Chapter08.xhtml#h1_34">De Z&#x00FC;ster-Hepublieken in Zuid-Afrika.</a></li>
<li id="NavPoint-42"><a href="11_Chapter08.xhtml#h1_35">In verre zee&#x00EB;n voor driehonderd jaar</a></li>
<li id="NavPoint-43"><a href="11_Chapter08.xhtml#h1_36">Impressie</a></li></ol></li>
<li id="NavPoint-44"><a href="12_Chapter09.xhtml#Chapter09">Muskus-Menschen.</a>
<ol>
<li id="NavPoint-45"><a href="12_Chapter09.xhtml#h1_37">De Drinkebroer van Frans Hals.</a></li>
<li id="NavPoint-46"><a href="12_Chapter09.xhtml#h1_38">Het sprookje van de robijnen.</a></li>
<li id="NavPoint-47"><a href="12_Chapter09.xhtml#h1_39">Het nieuwe Slachthuis te Roermond.</a></li>
<li id="NavPoint-48"><a href="12_Chapter09.xhtml#h1_40">Kijkjes bij de Schutterij</a></li>
<li id="NavPoint-49"><a href="12_Chapter09.xhtml#h1_41">Anna van Nievelt.</a></li>
<li id="NavPoint-50"><a href="12_Chapter09.xhtml#h1_42">Fusains</a></li>
<li id="NavPoint-51"><a href="12_Chapter09.xhtml#h1_43">Het Sneeuwklokje.</a></li></ol></li>
<li id="NavPoint-52"><a href="13_Chapter10.xhtml#Chapter10">Benjamin-af,</a>
<ol>
<li id="NavPoint-53"><a href="13_Chapter10.xhtml#h1_44">F. Adama van Scheltema.</a></li>
<li id="NavPoint-54"><a href="13_Chapter10.xhtml#h1_45">Van een Engelschman die Ladysmith niet vinden kon.</a></li>
<li id="NavPoint-55"><a href="13_Chapter10.xhtml#h1_46">Het sprookje van de robijnen.</a></li>
<li id="NavPoint-56"><a href="13_Chapter10.xhtml#h1_47">De Narcis.</a></li></ol></li>
<li id="NavPoint-57"><a href="14_Chapter11.xhtml#Chapter11">Benjamin-af,</a>
<ol>
<li id="NavPoint-58"><a href="14_Chapter11.xhtml#h1_48">De Stadhouders van Friesland</a></li>
<li id="NavPoint-59"><a href="14_Chapter11.xhtml#h1_49">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
<li id="NavPoint-60"><a href="14_Chapter11.xhtml#h1_50">Soerabaia in vogelvlucht.</a></li>
<li id="NavPoint-61"><a href="14_Chapter11.xhtml#h1_51">De Zusten-Republieken in Zuid-Afrika.</a></li>
<li id="NavPoint-62"><a href="14_Chapter11.xhtml#h1_52">Van een Engelschman die Ladysmith niet vinden kon.</a></li></ol></li>
<li id="NavPoint-63"><a href="15_Chapter12.xhtml#Chapter12">Nog bijtijds</a>
<ol>
<li id="NavPoint-64"><a href="15_Chapter12.xhtml#h1_53">De Stadhouders van Friesland</a></li>
<li id="NavPoint-65"><a href="15_Chapter12.xhtml#h1_54">Soerabaia in vogelvlucht.</a></li>
<li id="NavPoint-66"><a href="15_Chapter12.xhtml#h1_55">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
<li id="NavPoint-67"><a href="15_Chapter12.xhtml#h1_56">De Zuster-Republieken in Zuid-Afrika.</a></li></ol></li>
<li id="NavPoint-68"><a href="16_Chapter13.xhtml#Chapter13">Maartje Bot</a>
<ol>
<li id="NavPoint-69"><a href="16_Chapter13.xhtml#h1_57">De Grot van ledjoe (Java).</a></li>
<li id="NavPoint-70"><a href="16_Chapter13.xhtml#h1_58">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
<li id="NavPoint-71"><a href="16_Chapter13.xhtml#h1_59">„Baldadig.”</a></li>
<li id="NavPoint-72"><a href="16_Chapter13.xhtml#h1_60">Hoe en Waarom slapen de planten?</a></li></ol></li>
<li id="NavPoint-73"><a href="17_Chapter14.xhtml#Chapter14">Maartje Bot</a>
<ol>
<li id="NavPoint-74"><a href="17_Chapter14.xhtml#h1_61">Het Ontzaglijke.</a></li>
<li id="NavPoint-75"><a href="17_Chapter14.xhtml#h1_62">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
<li id="NavPoint-76"><a href="17_Chapter14.xhtml#h1_63">De Zuster-Republieken in Zuid-Afrika.</a></li>
<li id="NavPoint-77"><a href="17_Chapter14.xhtml#h1_64">Uit de vogelenwereld.</a></li>
<li id="NavPoint-78"><a href="17_Chapter14.xhtml#h1_65">De Siboga-expeditie.</a></li></ol></li>
<li id="NavPoint-79"><a href="18_Chapter15.xhtml#Chapter15">Maartje Bot</a>
<ol>
<li id="NavPoint-80"><a href="18_Chapter15.xhtml#h1_66">Batikken.</a></li>
<li id="NavPoint-81"><a href="18_Chapter15.xhtml#h1_67">Alleen op de wereld.</a></li>
<li id="NavPoint-82"><a href="18_Chapter15.xhtml#h1_68">Uit de vogelenwereld.</a></li>
<li id="NavPoint-83"><a href="18_Chapter15.xhtml#h1_69">Van ’t oude en ’t nieuwe Zeist.</a></li>
<li id="NavPoint-84"><a href="18_Chapter15.xhtml#h1_70">De reuzenkijker van de Parijsche tentoonstelling.</a></li></ol></li>
<li id="NavPoint-85"><a href="19_Chapter16.xhtml#Chapter16">Een mooie werkavond</a>
<ol>
<li id="NavPoint-86"><a href="19_Chapter16.xhtml#h1_71">Van ’t oude en ’t nieuwe Zeist.</a></li>
<li id="NavPoint-87"><a href="19_Chapter16.xhtml#h1_72">Eene schutjaspartij in Friesland.</a></li>
<li id="NavPoint-88"><a href="19_Chapter16.xhtml#h1_73">Om een kooltje vuur.</a></li>
<li id="NavPoint-89"><a href="19_Chapter16.xhtml#h1_74">Sneeuwdag.</a></li>
<li id="NavPoint-90"><a href="19_Chapter16.xhtml#h1_75">De Zending in Java’s Oosthoek.</a></li></ol></li>
<li id="NavPoint-91"><a href="20_Chapter17.xhtml#Chapter17">Verborgen Ego&#x00EF;sme.</a>
<ol>
<li id="NavPoint-92"><a href="20_Chapter17.xhtml#h1_76">Een pionier voor onze West-Indische Koloni&#x00EB;n.</a></li>
<li id="NavPoint-93"><a href="20_Chapter17.xhtml#h1_77">De Zending in Java’s Oosthoek.</a></li>
<li id="NavPoint-94"><a href="20_Chapter17.xhtml#h1_78">Gouden Bruiloft.</a></li>
<li id="NavPoint-95"><a href="20_Chapter17.xhtml#h1_79">Van ’t oude en ’t nieuwe Zeist.</a></li></ol></li>
<li id="NavPoint-96"><a href="21_Chapter18.xhtml#Chapter18">BERGAF!</a>
<ol>
<li id="NavPoint-97"><a href="21_Chapter18.xhtml#h1_80">Van oude zeelui.</a></li>
<li id="NavPoint-98"><a href="21_Chapter18.xhtml#h1_81">Herinneringen aan Holland in Spanje.</a></li>
<li id="NavPoint-99"><a href="21_Chapter18.xhtml#h1_82">Onder de Palmen.</a></li>
<li id="NavPoint-100"><a href="21_Chapter18.xhtml#h1_83">Gouden Bruiloft.</a></li>
<li id="NavPoint-101"><a href="21_Chapter18.xhtml#h1_84">Op bezoek bij een Hollandsche schilder in de Saharah.</a></li>
</ol>
</li>
<li id="NavPoint-102"><a href="22_Chapter19.xhtml#Chapter19">BERGAF!</a>
<ol>
<li id="NavPoint-103"><a href="22_Chapter19.xhtml#h1_85">Uit het land der Salanganen.</a></li>
<li id="NavPoint-104"><a href="22_Chapter19.xhtml#h1_86">Varnde Liute.</a></li>
<li id="NavPoint-105"><a href="22_Chapter19.xhtml#h1_87">Een herinnering aan de koninginne week te Amsterdam.</a></li>
<li id="NavPoint-106"><a href="22_Chapter19.xhtml#h1_88">HET LOT.</a></li>
</ol>
</li>
</ol>
    </nav>
  </body>
</html>
Here is a script which runs the Find in Files (not tested by me) and reformats the results file into the expected output file. I made some minor changes to the output for a better structure. Read the comments in the script code for details.

Code:
// Define line ending/termination of results file as JavaScript
// string and as Perl regular expression search string.
var sStringLineEnd = "\r\n";
var sRegExpLineEnd = "\\r\\n";

// Define indent tabs/spaces, here 4 spaces.
// Can be also \t for a horizontal tab.
var sIndent1 = "    ";
var sIndent2 = sIndent1 + sIndent1;
var sIndent3 = sIndent2 + sIndent1;

// Define directory containing the *.xhtml files.
var sDirectory = "D:\\Somenath\\Project\\Text\\";

// Define Find in Files summary information and title of results file.
var sSummaryInfo = "Search complete, found";
var sResultsDocTitle = "** Find Results ** ";  // Note the space at end!

// Define environment for this script.
UltraEdit.insertMode();
UltraEdit.columnModeOff();

// Find all chapter titles and headings in paragraphs of specific class.
UltraEdit.perlReOn();
UltraEdit.frInFiles.filesToSearch=0;
UltraEdit.frInFiles.useOutputWindow=false;
UltraEdit.frInFiles.searchInFilesTypes="*.xhtml";
UltraEdit.frInFiles.directoryStart=sDirectory;
UltraEdit.frInFiles.displayLinesDoNotMatch=false;
UltraEdit.frInFiles.openMatchingFiles=false;
UltraEdit.frInFiles.ignoreHiddenSubs=true;
UltraEdit.frInFiles.reverseSearch=false;
UltraEdit.frInFiles.searchSubs=false;
UltraEdit.frInFiles.matchWord=false;
UltraEdit.frInFiles.useEncoding=true;
UltraEdit.frInFiles.encoding=65001;
UltraEdit.frInFiles.matchCase=true;
UltraEdit.frInFiles.regExp=true;
UltraEdit.frInFiles.find('<p class="(?:ch-title|h1\\w*?)"');

// The results file becomes the active file if it was not already opened before.
var bListCreated = false;
if (UltraEdit.activeDocument.path == sResultsDocTitle) bListCreated = true;
else
{
   for (var nDocIndex = 0; nDocIndex < UltraEdit.document.length; nDocIndex++)
   {
      if (UltraEdit.document[nDocIndex].path == sResultsDocTitle)
      {
         UltraEdit.document[nDocIndex].setActive();
         bListCreated = true;
         break;
      }
   }
}

if (bListCreated)
{
   UltraEdit.activeDocument.findReplace.mode=0;
   UltraEdit.activeDocument.findReplace.matchCase=true;
   UltraEdit.activeDocument.findReplace.matchWord=false;
   UltraEdit.activeDocument.findReplace.regExp=false;
   UltraEdit.activeDocument.findReplace.searchInColumn=false;

   if (sSummaryInfo.length)
   {
      // Search for the summary info at bottom of the Find in Files
      // results and delete this summary if found at all as expected.
      UltraEdit.activeDocument.findReplace.searchDown=false;
      if (UltraEdit.activeDocument.findReplace.find(sSummaryInfo))
      {
         UltraEdit.activeDocument.key("HOME");
         UltraEdit.activeDocument.selectToBottom();
         UltraEdit.activeDocument.deleteText();
      }
   }

   // Move caret to top of the file.
   UltraEdit.activeDocument.top();

   // Clean up the results file with replaces.
   UltraEdit.activeDocument.findReplace.regExp=true;
   UltraEdit.activeDocument.findReplace.searchDown=true;
   UltraEdit.activeDocument.findReplace.preserveCase=false;
   UltraEdit.activeDocument.findReplace.replaceAll=true;
   UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
   UltraEdit.activeDocument.findReplace.replace("^(?:-----|Find |Found).+" + sRegExpLineEnd,"");
   UltraEdit.activeDocument.top();

   // Insert list item start tag with an identifier containing an incrementing number.
   UltraEdit.activeDocument.write(sStringLineEnd);
   UltraEdit.activeDocument.top();
   UltraEdit.columnModeOn();
   UltraEdit.activeDocument.columnInsert('<li id="NavPoint-">');
   UltraEdit.activeDocument.gotoLine(1,18);
   UltraEdit.activeDocument.columnInsertNum(1,1,true,false);
   UltraEdit.columnModeOff();
   UltraEdit.activeDocument.top();
   UltraEdit.activeDocument.findReplace.replace('(?<=<li id="NavPoint-)0+',"");

   // Reformat chapter titles to list items of level 1.
   sDirectory   = sDirectory.replace(/\\/g,"\\\\");
   var sSearch  = '(<li id="NavPoint-\\d+">)' + sDirectory + '(\\d+_)(.+?)(\\..+?)\\(\\d+\\):[\\t ]*<p class="ch-title">(?:<a id=".+?"></a>)?(.+?)</p>';
   var sReplace = sIndent1 + '\\1<a href="\\2\\3\\4#\\3">\\5</a></li>';
   UltraEdit.activeDocument.findReplace.replace(sSearch,sReplace);
   UltraEdit.activeDocument.top();

   // Reformat headings to list items of level 2.
   sSearch  = '(<li id="NavPoint-\\d+">)' + sDirectory + '(.+?)\\(\\d+\\):[\\t ]*<p class="h.+" id="(.+?)">(?:<a id=".+?"></a>)?(.+?)</p>';
   sReplace = sIndent3 + '\\1<a href="\\2#\\3">\\4</a></li>';
   UltraEdit.activeDocument.findReplace.replace(sSearch,sReplace);
   UltraEdit.activeDocument.top();

   // Replace end list tag by an ordered list start tag of level 2
   // wherever a list item of level 2 follows a list item of level 1.
   sSearch  = "^(" + sIndent1 + "<li.+?)</li>(?=" + sRegExpLineEnd + sIndent3 + "<li)";
   sReplace = "\\1" + sRegExpLineEnd + sIndent2 + "<ol>";
   UltraEdit.activeDocument.findReplace.replace(sSearch,sReplace);
   UltraEdit.activeDocument.top();

   // Insert an ordered list end tag of level 2 and a list item end tag of
   // level 1 wherever a list item of level 1 follows a list item of level 2.
   sSearch  = "^(" + sIndent3 + "<li.+?)$(?=" + sRegExpLineEnd + sIndent1 + "<li)";
   sReplace = "\\1" + sRegExpLineEnd + sIndent2 + "</ol>" + sRegExpLineEnd + sIndent1 + "</li>";
   UltraEdit.activeDocument.findReplace.replace(sSearch,sReplace);
   UltraEdit.activeDocument.top();

   // Insert an ordered list end tag of level 2 and a list item end tag of level 1
   // before last line of file if table of contents ends with list item of level 2.
   sSearch  = "^(" + sIndent3 + "<li.+?)$(?=" + sRegExpLineEnd + "<li)";
   sReplace = "\\1" + sRegExpLineEnd + sIndent2 + "</ol>" + sRegExpLineEnd + sIndent1 + "</li>";
   UltraEdit.activeDocument.findReplace.replace(sSearch,sReplace);
   UltraEdit.activeDocument.top();

   // Replace the first line by header of XHTML TOC file.
   UltraEdit.activeDocument.selectLine();
   UltraEdit.activeDocument.write(
      '<?xml version="1.0" encoding="UTF-8"?>' + sStringLineEnd +
      '<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">' + sStringLineEnd +
      '<head>' + sStringLineEnd + sIndent1 +
      '<meta http-equiv="default-style" content="text/html; charset=utf-8"/>' + sStringLineEnd + sIndent1 +
      '<title>XXX</title>' + sStringLineEnd + sIndent1 +
      '<link rel="stylesheet" href="../Styles/stylesheet.css" type="text/css"/>' + sStringLineEnd +
      '</head>' + sStringLineEnd +
      '<body>' + sStringLineEnd +
      '<nav id="toc" epub:type="toc">' + sStringLineEnd +
      '<h1>Inhoud</h1>' + sStringLineEnd +
      '<ol>' + sStringLineEnd + sIndent1 +
      '<li id="NavPoint-1"><a href="02_Titlepage.xhtml">Titelpagina</a></li>' + sStringLineEnd
   );

   // Replace the last line by footer of XHTML TOC file.
   UltraEdit.activeDocument.bottom();
   UltraEdit.activeDocument.selectLine();

   UltraEdit.activeDocument.write(
      '</ol>'   + sStringLineEnd +
      '</nav>'  + sStringLineEnd +
      '</body>' + sStringLineEnd +
      '</html>' + sStringLineEnd
   );
   UltraEdit.activeDocument.top();

   // Convert the results file from UTF-16 LE to UTF-8. UltraEdit detects
   // automatically that the active file is UTF-16 LE encoded and not an
   // ASCII/ANSI file and makes the conversion right, i.e. it changes nothing
   // other than the information to save the file next time with UTF-8
   // instead of UTF-16 LE encoding.
   UltraEdit.activeDocument.ASCIIToUTF8();
}

Please note that each backslash of a regular expression must be written as two backslashes in a JavaScript string passed to the Perl regular expression find/replace function of UltraEdit/UEStudio, because the backslash is also the escape character in JavaScript strings. This makes the regular expression search strings a bit difficult to read in the script code.
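
As a minimal sketch of this doubling, the following standalone JavaScript (runnable outside UltraEdit, for example in Node.js) prints what the regular expression engine actually receives. The sample pattern is a shortened part of the search strings in the script; the standard JavaScript RegExp object is used here only as a stand-in for the Perl regular expression engine of UltraEdit, which is an assumption made for illustration.

```javascript
// Regular expression as the engine must receive it:  (\d+_)(.+?)\(\d+\):
// In a JavaScript string literal each of these backslashes is written twice.
var sPattern = '(\\d+_)(.+?)\\(\\d+\\):';
console.log(sPattern);      // printed with single backslashes

// A directory path used inside a search string is escaped twice: once for
// the JavaScript string literal and once more for the regular expression.
var sDirectory = "D:\\Somenath\\Project\\Text\\";
var sEscapedDir = sDirectory.replace(/\\/g, "\\\\");
console.log(sDirectory);    // path with single backslashes
console.log(sEscapedDir);   // path with doubled backslashes for the regex

// Stand-in check with the standard RegExp object (illustration only).
var rLine = new RegExp("^" + sEscapedDir + sPattern);
var sLine = 'D:\\Somenath\\Project\\Text\\04_Chapter01.xhtml(11): <p class="ch-title">VIJF- EN -TWINTIG JAREN.</p>';
console.log(rLine.test(sLine));
```

The same reasoning explains the line sDirectory.replace(/\\/g,"\\\\") in the script: the directory path is a plain string first and must be escaped once more before it becomes part of a regular expression search string.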

The output file created for the posted Find in Files results is:

Code: Select all
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
    <meta http-equiv="default-style" content="text/html; charset=utf-8"/>
    <title>XXX</title>
    <link rel="stylesheet" href="../Styles/stylesheet.css" type="text/css"/>
</head>
<body>
<nav id="toc" epub:type="toc">
<h1>Inhoud</h1>
<ol>
    <li id="NavPoint-1"><a href="02_Titlepage.xhtml">Titelpagina</a></li>
    <li id="NavPoint-2"><a href="04_Chapter01.xhtml#Chapter01">VIJF- EN -TWINTIG JAREN.</a>
        <ol>
            <li id="NavPoint-3"><a href="04_Chapter01.xhtml#h1_1">NA DEN STORM!</a></li>
            <li id="NavPoint-4"><a href="04_Chapter01.xhtml#h1_2">Mr. S. M S. MODDERMAN,</a></li>
            <li id="NavPoint-5"><a href="04_Chapter01.xhtml#h1_3">De ramp op Ceram.</a></li>
            <li id="NavPoint-6"><a href="04_Chapter01.xhtml#h1_4">De Zuster-Republieken in Zuid-Afrika.</a></li>
            <li id="NavPoint-7"><a href="04_Chapter01.xhtml#h1_5">Een nachtelijke aanval op Soestdijk.</a></li>
        </ol>
    </li>
    <li id="NavPoint-8"><a href="05_Chapter02.xhtml#Chapter02">NA DEN STORM!</a>
        <ol>
            <li id="NavPoint-9"><a href="05_Chapter02.xhtml#h1_6">Naar den kratef Van den Gedeh.</a></li>
            <li id="NavPoint-10"><a href="05_Chapter02.xhtml#h1_7">De Zuster-Republieken in Zuid-Afrika.</a></li>
            <li id="NavPoint-11"><a href="05_Chapter02.xhtml#h1_8">Op Weg naar Alaska’s goudvelden.</a></li>
            <li id="NavPoint-12"><a href="05_Chapter02.xhtml#h1_9">Een wandeling in het Velpsche Broek.</a></li>
        </ol>
    </li>
    <li id="NavPoint-13"><a href="06_Chapter03.xhtml#Chapter03">NA DEN STORM!</a>
        <ol>
            <li id="NavPoint-14"><a href="06_Chapter03.xhtml#h1_10">Een Veteraan in de Tropen</a></li>
            <li id="NavPoint-15"><a href="06_Chapter03.xhtml#h1_11">Een wandeling in het Velpsche Broek.</a></li>
            <li id="NavPoint-16"><a href="06_Chapter03.xhtml#h1_12">Op Weg naar Alaska’s goudvelden.</a></li>
            <li id="NavPoint-17"><a href="06_Chapter03.xhtml#h1_13">„LONG TOM.”</a></li>
        </ol>
    </li>
    <li id="NavPoint-18"><a href="07_Chapter04.xhtml#Chapter04">Muskus-Menschen.</a>
        <ol>
            <li id="NavPoint-19"><a href="07_Chapter04.xhtml#h1_14">De schutterij in Nederlandsch-Indi&#x00EB;</a></li>
            <li id="NavPoint-20"><a href="07_Chapter04.xhtml#h1_15">Maretakken</a></li>
            <li id="NavPoint-21"><a href="07_Chapter04.xhtml#h1_16">Een wandeling in het Velpsche Broek.</a></li>
            <li id="NavPoint-22"><a href="07_Chapter04.xhtml#h1_17">Een sollicitatie.</a></li>
            <li id="NavPoint-23"><a href="07_Chapter04.xhtml#h1_18">Aardschuivingen in de Preanger,</a></li>
        </ol>
    </li>
    <li id="NavPoint-24"><a href="08_Chapter05.xhtml#Chapter05">Muskas-Menschen.</a>
        <ol>
            <li id="NavPoint-25"><a href="08_Chapter05.xhtml#h1_19">GENERAAL VAN DER HEIJDEN.</a></li>
            <li id="NavPoint-26"><a href="08_Chapter05.xhtml#h1_20">Een sollicitatie.</a></li>
            <li id="NavPoint-27"><a href="08_Chapter05.xhtml#h1_21">De Zuster-Republieken in Zuid-Afrika.</a></li>
            <li id="NavPoint-28"><a href="08_Chapter05.xhtml#h1_22">Een Engelsch kanon in Boerenhanden.</a></li>
            <li id="NavPoint-29"><a href="08_Chapter05.xhtml#h1_23">Herinnering aan Soekaboemi</a></li>
        </ol>
    </li>
    <li id="NavPoint-30"><a href="09_Chapter06.xhtml#Chapter06">Muskus-Menschen.</a>
        <ol>
            <li id="NavPoint-31"><a href="09_Chapter06.xhtml#h1_24">Ludwig Knaus.</a></li>
            <li id="NavPoint-32"><a href="09_Chapter06.xhtml#h1_25">In verre zee&#x00EB;n voor driehonderd jaar</a></li>
            <li id="NavPoint-33"><a href="09_Chapter06.xhtml#h1_26">De Zuster-Republieken in Zuid-Afrika.</a></li>
            <li id="NavPoint-34"><a href="09_Chapter06.xhtml#h1_27">Uit het leven van een Zeeofficier in Indi&#x00EB;.</a></li>
        </ol>
    </li>
    <li id="NavPoint-35"><a href="10_Chapter07.xhtml#Chapter07">Muskus-Menschen.</a>
        <ol>
            <li id="NavPoint-36"><a href="10_Chapter07.xhtml#h1_28">Biddende Vrouw van Nicolaes Maes.</a></li>
            <li id="NavPoint-37"><a href="10_Chapter07.xhtml#h1_29">Het stationschip in West-Indi&#x00EB;</a></li>
            <li id="NavPoint-38"><a href="10_Chapter07.xhtml#h1_30">In verre zee&#x00EB;n voor driehonderd jaar</a></li>
            <li id="NavPoint-39"><a href="10_Chapter07.xhtml#h1_31">Dr. L. A. J. Burgersdijk.</a></li>
        </ol>
    </li>
    <li id="NavPoint-40"><a href="11_Chapter08.xhtml#Chapter08">Muskus-Menschen.</a>
        <ol>
            <li id="NavPoint-41"><a href="11_Chapter08.xhtml#h1_32">Kijkjes in de Koninklijke Marine</a></li>
            <li id="NavPoint-42"><a href="11_Chapter08.xhtml#h1_33">Het sprookje Van de robijnen.</a></li>
            <li id="NavPoint-43"><a href="11_Chapter08.xhtml#h1_34">De Z&#x00FC;ster-Hepublieken in Zuid-Afrika.</a></li>
            <li id="NavPoint-44"><a href="11_Chapter08.xhtml#h1_35">In verre zee&#x00EB;n voor driehonderd jaar</a></li>
            <li id="NavPoint-45"><a href="11_Chapter08.xhtml#h1_36">Impressie</a></li>
        </ol>
    </li>
    <li id="NavPoint-46"><a href="12_Chapter09.xhtml#Chapter09">Muskus-Menschen.</a>
        <ol>
            <li id="NavPoint-47"><a href="12_Chapter09.xhtml#h1_37">De Drinkebroer van Frans Hals.</a></li>
            <li id="NavPoint-48"><a href="12_Chapter09.xhtml#h1_38">Het sprookje van de robijnen.</a></li>
            <li id="NavPoint-49"><a href="12_Chapter09.xhtml#h1_39">Het nieuwe Slachthuis te Roermond.</a></li>
            <li id="NavPoint-50"><a href="12_Chapter09.xhtml#h1_40">Kijkjes bij de Schutterij</a></li>
            <li id="NavPoint-51"><a href="12_Chapter09.xhtml#h1_41">Anna van Nievelt.</a></li>
            <li id="NavPoint-52"><a href="12_Chapter09.xhtml#h1_42">Fusains</a></li>
            <li id="NavPoint-53"><a href="12_Chapter09.xhtml#h1_43">Het Sneeuwklokje.</a></li>
        </ol>
    </li>
    <li id="NavPoint-54"><a href="13_Chapter10.xhtml#Chapter10">Benjamin-af,</a>
        <ol>
            <li id="NavPoint-55"><a href="13_Chapter10.xhtml#h1_44">F. Adama van Scheltema.</a></li>
            <li id="NavPoint-56"><a href="13_Chapter10.xhtml#h1_45">Van een Engelschman die Ladysmith niet vinden kon.</a></li>
            <li id="NavPoint-57"><a href="13_Chapter10.xhtml#h1_46">Het sprookje van de robijnen.</a></li>
            <li id="NavPoint-58"><a href="13_Chapter10.xhtml#h1_47">De Narcis.</a></li>
        </ol>
    </li>
    <li id="NavPoint-59"><a href="14_Chapter11.xhtml#Chapter11">Benjamin-af,</a>
        <ol>
            <li id="NavPoint-60"><a href="14_Chapter11.xhtml#h1_48">De Stadhouders van Friesland</a></li>
            <li id="NavPoint-61"><a href="14_Chapter11.xhtml#h1_49">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
            <li id="NavPoint-62"><a href="14_Chapter11.xhtml#h1_50">Soerabaia in vogelvlucht.</a></li>
            <li id="NavPoint-63"><a href="14_Chapter11.xhtml#h1_51">De Zusten-Republieken in Zuid-Afrika.</a></li>
            <li id="NavPoint-64"><a href="14_Chapter11.xhtml#h1_52">Van een Engelschman die Ladysmith niet vinden kon.</a></li>
        </ol>
    </li>
    <li id="NavPoint-65"><a href="15_Chapter12.xhtml#Chapter12">Nog bijtijds</a>
        <ol>
            <li id="NavPoint-66"><a href="15_Chapter12.xhtml#h1_53">De Stadhouders van Friesland</a></li>
            <li id="NavPoint-67"><a href="15_Chapter12.xhtml#h1_54">Soerabaia in vogelvlucht.</a></li>
            <li id="NavPoint-68"><a href="15_Chapter12.xhtml#h1_55">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
            <li id="NavPoint-69"><a href="15_Chapter12.xhtml#h1_56">De Zuster-Republieken in Zuid-Afrika.</a></li>
        </ol>
    </li>
    <li id="NavPoint-70"><a href="16_Chapter13.xhtml#Chapter13">Maartje Bot</a>
        <ol>
            <li id="NavPoint-71"><a href="16_Chapter13.xhtml#h1_57">De Grot van ledjoe (Java).</a></li>
            <li id="NavPoint-72"><a href="16_Chapter13.xhtml#h1_58">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
            <li id="NavPoint-73"><a href="16_Chapter13.xhtml#h1_59">„Baldadig.”</a></li>
            <li id="NavPoint-74"><a href="16_Chapter13.xhtml#h1_60">Hoe en Waarom slapen de planten?</a></li>
        </ol>
    </li>
    <li id="NavPoint-75"><a href="17_Chapter14.xhtml#Chapter14">Maartje Bot</a>
        <ol>
            <li id="NavPoint-76"><a href="17_Chapter14.xhtml#h1_61">Het Ontzaglijke.</a></li>
            <li id="NavPoint-77"><a href="17_Chapter14.xhtml#h1_62">Waarmee onze vaderen maaltijd hielden en hoe.</a></li>
            <li id="NavPoint-78"><a href="17_Chapter14.xhtml#h1_63">De Zuster-Republieken in Zuid-Afrika.</a></li>
            <li id="NavPoint-79"><a href="17_Chapter14.xhtml#h1_64">Uit de vogelenwereld.</a></li>
            <li id="NavPoint-80"><a href="17_Chapter14.xhtml#h1_65">De Siboga-expeditie.</a></li>
        </ol>
    </li>
    <li id="NavPoint-81"><a href="18_Chapter15.xhtml#Chapter15">Maartje Bot</a>
        <ol>
            <li id="NavPoint-82"><a href="18_Chapter15.xhtml#h1_66">Batikken.</a></li>
            <li id="NavPoint-83"><a href="18_Chapter15.xhtml#h1_67">Alleen op de wereld.</a></li>
            <li id="NavPoint-84"><a href="18_Chapter15.xhtml#h1_68">Uit de vogelenwereld.</a></li>
            <li id="NavPoint-85"><a href="18_Chapter15.xhtml#h1_69">Van ’t oude en ’t nieuwe Zeist.</a></li>
            <li id="NavPoint-86"><a href="18_Chapter15.xhtml#h1_70">De reuzenkijker van de Parijsche tentoonstelling.</a></li>
        </ol>
    </li>
    <li id="NavPoint-87"><a href="19_Chapter16.xhtml#Chapter16">Een mooie werkavond</a>
        <ol>
            <li id="NavPoint-88"><a href="19_Chapter16.xhtml#h1_71">Van ’t oude en ’t nieuwe Zeist.</a></li>
            <li id="NavPoint-89"><a href="19_Chapter16.xhtml#h1_72">Eene schutjaspartij in Friesland.</a></li>
            <li id="NavPoint-90"><a href="19_Chapter16.xhtml#h1_73">Om een kooltje vuur.</a></li>
            <li id="NavPoint-91"><a href="19_Chapter16.xhtml#h1_74">Sneeuwdag.</a></li>
            <li id="NavPoint-92"><a href="19_Chapter16.xhtml#h1_75">De Zending in Java’s Oosthoek.</a></li>
        </ol>
    </li>
    <li id="NavPoint-93"><a href="20_Chapter17.xhtml#Chapter17">Verborgen Ego&#x00EF;sme.</a>
        <ol>
            <li id="NavPoint-94"><a href="20_Chapter17.xhtml#h1_76">Een pionier voor onze West-Indische Koloni&#x00EB;n.</a></li>
            <li id="NavPoint-95"><a href="20_Chapter17.xhtml#h1_77">De Zending in Java’s Oosthoek.</a></li>
            <li id="NavPoint-96"><a href="20_Chapter17.xhtml#h1_78">Gouden Bruiloft.</a></li>
            <li id="NavPoint-97"><a href="20_Chapter17.xhtml#h1_79">Van ’t oude en ’t nieuwe Zeist.</a></li>
        </ol>
    </li>
    <li id="NavPoint-98"><a href="21_Chapter18.xhtml#Chapter18">BERGAF!</a>
        <ol>
            <li id="NavPoint-99"><a href="21_Chapter18.xhtml#h1_80">Van oude zeelui.</a></li>
            <li id="NavPoint-100"><a href="21_Chapter18.xhtml#h1_81">Herinneringen aan Holland in Spanje.</a></li>
            <li id="NavPoint-101"><a href="21_Chapter18.xhtml#h1_82">Onder de Palmen.</a></li>
            <li id="NavPoint-102"><a href="21_Chapter18.xhtml#h1_83">Gouden Bruiloft.</a></li>
            <li id="NavPoint-103"><a href="21_Chapter18.xhtml#h1_84">Op bezoek bij een Hollandsche schilder in de Saharah.</a></li>
        </ol>
    </li>
    <li id="NavPoint-104"><a href="22_Chapter19.xhtml#Chapter19">BERGAF!</a>
        <ol>
            <li id="NavPoint-105"><a href="22_Chapter19.xhtml#h1_85">Uit het land der Salanganen.</a></li>
            <li id="NavPoint-106"><a href="22_Chapter19.xhtml#h1_86">Varnde Liute.</a></li>
            <li id="NavPoint-107"><a href="22_Chapter19.xhtml#h1_87">Een herinnering aan de koninginne week te Amsterdam.</a></li>
            <li id="NavPoint-108"><a href="22_Chapter19.xhtml#h1_88">HET LOT.</a></li>
        </ol>
    </li>
</ol>
</nav>
</body>
</html>

The script does not encode each non-ASCII character as a hexadecimal HTML entity, as this is not necessary for a file which is declared as UTF-8 encoded and really is UTF-8 encoded.
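
If the entities are wanted nevertheless, the conversion could be done in a separate step. The following is only a sketch in standard JavaScript and not part of the posted script; the function name encodeNonAscii is hypothetical, and the four-digit uppercase format is an assumption taken from entities like &#x2019; and &#x00EB; in the posted find results.

```javascript
// Hypothetical helper: replace every character above U+007F by a
// hexadecimal character reference with at least four hex digits.
function encodeNonAscii(sText) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(sText, function (sChar) {
    var nCodePoint = sChar.codePointAt(0);
    if (nCodePoint <= 0x7F) return sChar;   // ASCII is kept as is
    var sHex = nCodePoint.toString(16).toUpperCase();
    while (sHex.length < 4) sHex = "0" + sHex;
    return "&#x" + sHex + ";";
  }).join("");
}

console.log(encodeNonAscii("Op Weg naar Alaska\u2019s goudvelden."));
console.log(encodeNonAscii("De schutterij in Nederlandsch-Indi\u00EB"));
```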
Best regards from Austria
Thanks thanks and thanks a lot. It's really working.
First I run a Find in Files to search for those titles with the find string class="(article-full-headline|article-full-subhead|article-full-body)" and copy the search results into a new file in UEStudio.

Code: Select all
Find 'class="(article-full-headline|article-full-subhead|article-full-body)"' in 'C:\Project\article_13-1.xml':
C:\Project\article_13-1.xml(18): <h2 class="article-full-headline">Room to roam</h2>
C:\Project\article_13-1.xml(19): <p class="article-full-subhead">Whatever the adventure, the beautifully spacious Nissan Pulsar makes life on the road comfortable, fun and easy</p>
C:\Project\article_13-1.xml(26): <p class="article-full-body">There&#x2019;s no need for compromise with your choice of family car &#x2013; the Nissan Pulsar is bold, adventure-ready and seriously spacious. Whether you&#x2019;re off on a romantic getaway, doing the weekly grocery shop or ferrying the kids around, the Nissan Pulsar is the family car you&#x2019;ve been waiting for. With an abundance of head and leg room in the cabin, and a boot that&#x2019;s biggest in its class, the Nissan Pulsar has plenty of room for all life&#x2019;s great adventures.</p>
C:\Project\article_13-1.xml(27): <p class="article-full-body"><span class="ld_bold"><span class="ld_underline">INNOVATIVE TECHNOLOGY YOU&#x2019;LL ACTUALLY USE:</span></span></p>
C:\Project\article_13-1.xml(28): <p class="article-full-body">Make and take calls safely via the Bluetooth<span class="ld_superscript">&#x00AE;</span> system when you link your compatible phone. And in the ST-L and SSS Sedan models, the advanced sat-nav system features a 5.8&#x201D; colour screen with 3D bird&#x2019;s eye view graphics and a reversing camera, to help keep young ones safe.</p>
C:\Project\article_13-1.xml(32): <p class="article-full-body"><span class="ld_bold"><span class="ld_underline">CLASS LEADING BOOT SPACE:</span></span></p>
C:\Project\article_13-1.xml(33): <p class="article-full-body">With the largest boot in its segment at 510 litres, the Nissan Pulsar has loads of room for everything you need for your next outing &#x2013; be it picnicware, shopping or sports gear</p>
C:\Project\article_13-1.xml(40): <p class="article-full-body"><span class="ld_bold"><span class="ld_underline">SAFE TRAVELS:</span></span></p>
C:\Project\article_13-1.xml(41): <p class="article-full-body">The Nissan Pulsar is built to keep you and your loved ones safe. It features dual front, dual side and dual curtain SRS airbags, as well as ABS, brake assist, electronic stability control and more.</p>
C:\Project\article_13-1.xml(42): <p class="article-full-body"><span class="ld_bold"><span class="ld_underline">SUPERIOR COMFORT:</span></span></p>
C:\Project\article_13-1.xml(43): <p class="article-full-body">The Nissan Pulsar is spacious and comfortable, making road trips a breeze. Large windows allow higher visibility, leather accented seats add luxury in the SSS model and theatre view seating in the rear provides a higher ride position, making it easier to get in and out.</p>
Found 'class="(article-full-headline|article-full-subhead|article-full-body)"' 11 time(s).
----------------------------------------
Find 'class="(article-full-headline|article-full-subhead|article-full-body)"' in 'C:\Project\article_17-1.xml':
C:\Project\article_17-1.xml(18): <h2 class="article-full-headline">WE CHALLENGED AUSTRALIAN WOMEN TO PUT THE #1 ANTI-WRINKLE MOISTURISER TO THE TEST.*</h2>
C:\Project\article_17-1.xml(25): <p class="article-full-body">PROJECT MANAGER</p>
C:\Project\article_17-1.xml(26): <p class="article-full-body"><span class="ld_bold">What were your anti-ageing concerns before trying Revitalift? And what results were you hoping to see?</span></p>
C:\Project\article_17-1.xml(27): <p class="article-full-body">Probably around my eyes. Fine lines around the eyes is where I expected to see results, and that&#x2019;s where I did see a vast improvement. I&#x2019;m always quite skeptical of anti-ageing creams. I must say I thought the results were quite amazing. This one I can honestly say does work!</p>
C:\Project\article_17-1.xml(31): <p class="article-full-body"><span class="ld_bold">What did you think of the Revitalift 14 Day Challenge?</span></p>
C:\Project\article_17-1.xml(32): <p class="article-full-body">I actually noticed results straight away so I&#x2019;ve become a big fan. My skin looks a lot smoother, particularly around my eyes. I was surprised. Before the Challenge I was around a 5 or 4 on the wrinkle reader. In just 2 weeks of using Revitalift day moisturiser, I had probably changed to a 2.</p>
C:\Project\article_17-1.xml(33): <p class="article-full-body"><span class="ld_bold">And what did you think about the product?</span></p>
C:\Project\article_17-1.xml(34): <p class="article-full-body">I was delighted with the rapid results and thrilled with the improved appearance of my skin. It was so easy to apply and the cream absorbed into my skin quickly without leaving any greasy residue. And the big bonus - a little goes a long way!</p>
C:\Project\article_17-1.xml(35): <p class="article-full-body"><span class="ld_bold">How do you feel after using Revitalift?</span></p>
C:\Project\article_17-1.xml(36): <p class="article-full-body">Revitalift has made me more confident, definitely. I wouldn&#x2019;t say it if I didn&#x2019;t believe it. I think I&#x2019;m going to be a life-long user now!</p>
C:\Project\article_17-1.xml(42): <p class="article-full-body">ORGANIC GARDENER</p>
C:\Project\article_17-1.xml(43): <p class="article-full-body"><span class="ld_bold">What did you think of the Revitalift 14 Day Challenge?</span></p>
C:\Project\article_17-1.xml(44): <p class="article-full-body">It&#x2019;s been great actually, doing the Challenge. After a couple of days I had people saying &#x201C;Wow your skin&#x2019;s looking really good&#x201D;. The results were much better than I expected.</p>
C:\Project\article_17-1.xml(45): <p class="article-full-body"><span class="ld_bold">What sort of results did you see with Revitalift?</span></p>
C:\Project\article_17-1.xml(46): <p class="article-full-body">Seeing the visible improvement with Revitalift was absolutely fantastic. I started at about a 6 on the wrinkle reader, but after the 14 Day Challenge, Revitalift actually took me up to about a 2. The wrinkles seemed to just disappear! I can say it really does work!</p>
C:\Project\article_17-1.xml(47): <p class="article-full-body"><span class="ld_bold">How do you feel after using Revitalift?</span></p>
C:\Project\article_17-1.xml(48): <p class="article-full-body">I feel fantastic! Immediately my skin felt better and I can see the results. Now I&#x2019;m feeling much better about my age and what woman doesn&#x2019;t want that! I would buy Revitalift again in a heartbeat, I really would. It does what it says.</p>
C:\Project\article_17-1.xml(50): <p class="article-full-body"><span class="ld_bold">Revitalift&#x2019;s powerful formula with advanced Pro-Retinol A and Fibrelastyl visibly reduces wrinkles and improves skin&#x2019;s elasticity, leaving no wrinkle untouched.</span></p>
Found 'class="(article-full-headline|article-full-subhead|article-full-body)"' 19 time(s).
Search complete, found 'class="(article-full-headline|article-full-subhead|article-full-body)"' 29 time(s). (2 file(s)).

The example output below shows what we need: each Article element should get the respective file name and ID, the title should be taken from the respective <h2 class="article-full-headline">, and the Summary should be taken from the corresponding <p class="article-full-subhead"> or, if that is not available, from the first occurrence of <p class="article-full-body">.
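
The selection rule for the Summary can be written down as a small sketch in standard JavaScript. The function name pickSummary is hypothetical, and the function assumes that it receives the Find in Files result lines of one file only; the sample arrays are excerpts of the result lines posted above.

```javascript
// Hypothetical helper: asLines holds the result lines of ONE file.
// Returns the subhead text if present, else the first body paragraph text.
function pickSummary(asLines) {
  function firstTextOf(sClass) {
    var rParagraph = new RegExp('<p class="article-full-' + sClass + '">(.+)</p>');
    for (var i = 0; i < asLines.length; i++) {
      var aMatch = rParagraph.exec(asLines[i]);
      if (aMatch) return aMatch[1];
    }
    return null;
  }
  var sSubhead = firstTextOf("subhead");
  return (sSubhead !== null) ? sSubhead : firstTextOf("body");
}

// Excerpt of the results for article_13-1.xml (has a subhead).
var asArticle13 = [
  'C:\\Project\\article_13-1.xml(18): <h2 class="article-full-headline">Room to roam</h2>',
  'C:\\Project\\article_13-1.xml(19): <p class="article-full-subhead">Whatever the adventure, the beautifully spacious Nissan Pulsar makes life on the road comfortable, fun and easy</p>',
  'C:\\Project\\article_13-1.xml(40): <p class="article-full-body"><span class="ld_bold"><span class="ld_underline">SAFE TRAVELS:</span></span></p>'
];

// Excerpt of the results for article_17-1.xml (has no subhead).
var asArticle17 = [
  'C:\\Project\\article_17-1.xml(18): <h2 class="article-full-headline">WE CHALLENGED AUSTRALIAN WOMEN TO PUT THE #1 ANTI-WRINKLE MOISTURISER TO THE TEST.*</h2>',
  'C:\\Project\\article_17-1.xml(25): <p class="article-full-body">PROJECT MANAGER</p>'
];

console.log(pickSummary(asArticle13));
console.log(pickSummary(asArticle17));
```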

Expected output file after reformatting the Find in Files results:

Code: Select all
<Articles>
<Article id="article_13-1" file="article_13-1.xml">
<title ref="headline">Room to roam</title>
<Summary>Whatever the adventure, the beautifully spacious Nissan ...</Summary>
</Article>
<Article id="article_17-1" file="article_17-1.xml">
<title ref="headline">WE CHALLENGED AUSTRALIAN WOMEN TO PUT THE #1 ANTI-WRINKLE MOISTURISER TO THE TEST.*</title>
<Summary>PROJECT MANAGER</Summary>
</Article>

...
</Articles>

The input file looks like:

Code: Select all
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="nl">
<head>
<title>BetterHome</title>
<link rel="stylesheet" type="text/css" href="css/TablesAndFloats.css"/>
<link rel="page-template" type="application/vnd.adobe-page-template+xml" href="page-template.xpgt"/>
</head>
<body>
<div class="clean"/>
<div id="header" class="masthead">
<div class="masthead-text">
<div id="header_title" class="masthead-section">
</div>
</div>
</div>
<div id="article_13-1">
<h2 class="article-full-headline">Room to roam</h2>
<p class="article-full-subhead">Whatever the adventure, the beautifully spacious Nissan Pulsar makes life on the road comfortable, fun and easy</p>
<div class="article-in-image" id="image_13_1">
<img class="article-in-image" alt="" id="image_13_1_img" src="images/page_13_1.jpg"/>
</div>
<div class="article-in-image" id="image_13_2">
<img class="article-in-image" alt="" id="image_13_2_img" src="images/page_13_2.jpg"/>
</div>
<p class="article-full-body">There&#x2019;s no need for compromise with your choice of family car &#x2013; the Nissan Pulsar is bold, adventure-ready and seriously spacious. Whether you&#x2019;re off on a romantic getaway, doing the weekly grocery shop or ferrying the kids around, the Nissan Pulsar is the family car you&#x2019;ve been waiting for. With an abundance of head and leg room in the cabin, and a boot that&#x2019;s biggest in its class, the Nissan Pulsar has plenty of room for all life&#x2019;s great adventures.</p>
<p class="article-full-body"><span class="ld_bold"><span class="ld_underline">INNOVATIVE TECHNOLOGY YOU&#x2019;LL ACTUALLY USE:</span></span></p>
<p class="article-full-body">Make and take calls safely via the Bluetooth<span class="ld_superscript">&#x00AE;</span> system when you link your compatible phone. And in the ST-L and SSS Sedan models, the advanced sat-nav system features a 5.8&#x201D; colour screen with 3D bird&#x2019;s eye view graphics and a reversing camera, to help keep young ones safe.</p>
<div class="article-in-image" id="image_14_1">
<img class="article-in-image" alt="" id="image_14_1_img" src="images/page_14_1.jpg"/>
</div>
<p class="article-full-body"><span class="ld_bold"><span class="ld_underline">CLASS LEADING BOOT SPACE:</span></span></p>
<p class="article-full-body">With the largest boot in its segment at 510 litres, the Nissan Pulsar has loads of room for everything you need for your next outing &#x2013; be it picnicware, shopping or sports gear</p>
<div class="article-in-image" id="image_14_2">
<img class="article-in-image" alt="" id="image_14_2_img" src="images/page_14_2.jpg"/>
</div>
<div class="article-in-image" id="image_14_3">
<img class="article-in-image" alt="" id="image_14_3_img" src="images/page_14_3.jpg"/>
</div>
<p class="article-full-body"><span class="ld_bold"><span class="ld_underline">SAFE TRAVELS:</span></span></p>
<p class="article-full-body">The Nissan Pulsar is built to keep you and your loved ones safe. It features dual front, dual side and dual curtain SRS airbags, as well as ABS, brake assist, electronic stability control and more.</p>
<p class="article-full-body"><span class="ld_bold"><span class="ld_underline">SUPERIOR COMFORT:</span></span></p>
<p class="article-full-body">The Nissan Pulsar is spacious and comfortable, making road trips a breeze. Large windows allow higher visibility, leather accented seats add luxury in the SSS model and theatre view seating in the rear provides a higher ride position, making it easier to get in and out.</p>
<div class="article-in-image" id="image_14_4">
<img class="article-in-image" alt="" id="image_14_4_img" src="images/page_14_4.jpg"/>
</div>
</div>
</body>
</html>
Here is the code for this task, which runs a Find in Files and reformats the results file to the wanted output.

Code: Select all
// Define line ending/termination of results file as JavaScript
// string and as Perl regular expression search string.
var sStringLineEnd = "\r\n";
var sRegExpLineEnd = "\\r\\n";

// Define directory containing the *.xml files.
var sDirectory = "C:\\Project\\";

// Define Find in Files summary information and title of results file.
var sSummaryInfo = "Search complete, found";
var sResultsDocTitle = "** Find Results ** ";  // Note the space at end!

// Define environment for this script.
UltraEdit.insertMode();
UltraEdit.columnModeOff();

// Find all paragraphs of specific classes as defined below.
UltraEdit.perlReOn();
UltraEdit.frInFiles.filesToSearch=0;
UltraEdit.frInFiles.useOutputWindow=false;
UltraEdit.frInFiles.searchInFilesTypes="*.xml";
UltraEdit.frInFiles.directoryStart=sDirectory;
UltraEdit.frInFiles.displayLinesDoNotMatch=false;
UltraEdit.frInFiles.openMatchingFiles=false;
UltraEdit.frInFiles.ignoreHiddenSubs=true;
UltraEdit.frInFiles.reverseSearch=false;
UltraEdit.frInFiles.searchSubs=false;
UltraEdit.frInFiles.matchWord=false;
UltraEdit.frInFiles.useEncoding=true;
UltraEdit.frInFiles.encoding=65001;
UltraEdit.frInFiles.matchCase=true;
UltraEdit.frInFiles.regExp=true;
UltraEdit.frInFiles.find('class="article-full-(?:headline|subhead|body)');

// The results file becomes the active file if it was not already opened before.
var bListCreated = false;
if (UltraEdit.activeDocument.path == sResultsDocTitle) bListCreated = true;
else
{
   for (var nDocIndex = 0; nDocIndex < UltraEdit.document.length; nDocIndex++)
   {
      if (UltraEdit.document[nDocIndex].path == sResultsDocTitle)
      {
         UltraEdit.document[nDocIndex].setActive();
         bListCreated = true;
         break;
      }
   }
}

if (bListCreated)
{
   UltraEdit.activeDocument.findReplace.mode=0;
   UltraEdit.activeDocument.findReplace.matchCase=true;
   UltraEdit.activeDocument.findReplace.matchWord=false;
   UltraEdit.activeDocument.findReplace.regExp=false;
   UltraEdit.activeDocument.findReplace.searchInColumn=false;

   if (sSummaryInfo.length)
   {
      // Search for the summary info at bottom of the Find in Files
      // results and delete this summary if found at all as expected.
      UltraEdit.activeDocument.bottom();
      UltraEdit.activeDocument.findReplace.searchDown=false;
      if (UltraEdit.activeDocument.findReplace.find(sSummaryInfo))
      {
         UltraEdit.activeDocument.key("HOME");
         UltraEdit.activeDocument.selectToBottom();
         UltraEdit.activeDocument.deleteText();
      }
   }

   // Move caret to bottom of the file and append there the Articles end tag.
   UltraEdit.activeDocument.bottom();
   UltraEdit.activeDocument.write("</Articles>" + sStringLineEnd);

   // Move caret to top of the file and insert there the Articles start tag.
   UltraEdit.activeDocument.top();
   UltraEdit.activeDocument.write("<Articles>" + sStringLineEnd);

   // Reformat the results file.
   UltraEdit.activeDocument.findReplace.regExp=true;
   UltraEdit.activeDocument.findReplace.searchDown=true;
   UltraEdit.activeDocument.findReplace.preserveCase=false;
   UltraEdit.activeDocument.findReplace.replaceAll=true;
   UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;

   // Remove the separator lines.
   UltraEdit.activeDocument.findReplace.replace("^----+" + sRegExpLineEnd,"");
   UltraEdit.activeDocument.top();

   // Reformat begin of find results of a file to Article start tag.
   UltraEdit.activeDocument.findReplace.replace("^Find '.+?' in '.+\\\\(.+)(\\..+)':",'<Article id="\\1" file="\\1\\2">');
   UltraEdit.activeDocument.top();

   // Reformat end of find results of a file to Article end tag.
   UltraEdit.activeDocument.findReplace.replace("^Found.+$","</Article>");
   UltraEdit.activeDocument.top();

   // Reformat each heading of class headline to title element with value.
   UltraEdit.activeDocument.findReplace.replace('^.+?<h2 class="article-full-headline">(.+)</h2>','<title ref="headline">\\1</title>');
   UltraEdit.activeDocument.top();

   // Reformat each paragraph of class subhead to Summary element with value.
   UltraEdit.activeDocument.findReplace.replace('^.+?<p class="article-full-subhead">(.+)</p>','<Summary>\\1</Summary>');
   UltraEdit.activeDocument.top();

   // Reformat each first paragraph of class body of a file without a
   // Summary element already present to Summary element with value.
   UltraEdit.activeDocument.findReplace.replace("(</title>" + sRegExpLineEnd + ').+<p class="article-full-body">(.+)</p>','\\1<Summary>\\2</Summary>');
   UltraEdit.activeDocument.top();

   // Remove all remaining paragraphs of class body.
   UltraEdit.activeDocument.findReplace.replace('^(?:.+<p class="article-full-body">.+' + sRegExpLineEnd + ')+',"");
   UltraEdit.activeDocument.top();

   // Remove inline elements.
   UltraEdit.activeDocument.findReplace.replace("(?:</?(?:[biu]|span|strong|em)[^<>]*>)+","");
   UltraEdit.activeDocument.top();

   // Modify start and end tag of summaries with 0 to 60 characters.
   UltraEdit.activeDocument.findReplace.replace("<(Summary>.{0,60})</S","<#\\1</!S");
   UltraEdit.activeDocument.top();

   // Truncate summaries with more than 60 characters where the 56th
   // character of the summary is the last character of a word, i.e.
   // a word ends exactly after the first 56 characters.
   UltraEdit.activeDocument.findReplace.replace("(<Summary>.{56}\\>).+</Sum","\\1</#!#!Sum");
   UltraEdit.activeDocument.top();

   // Truncate all other summaries with more than 60 characters to the first
   // 56 characters plus all remaining word characters of the last word, and
   // append the marker #!#!.
   UltraEdit.activeDocument.findReplace.replace("(<Summary>.{56}\\w*).+(?=</Summary>)","\\1#!#!");
   UltraEdit.activeDocument.top();

   // Next remove all word characters left of #!#!, together with any
   // space(s) before that last word, in the still too long summaries and
   // replace #!#! with a space and three dots.
   UltraEdit.activeDocument.findReplace.replace(" *\\w+#!#!(?=</Summary>)"," ...");
   UltraEdit.activeDocument.top();

   // Then replace each remaining #!#! that has a non-word character to its
   // left at the end of a summary, and each #!#! inside the end tag of a
   // summary whose last word ends at the 56th character, with a space and
   // three dots left of the end tag of the Summary element.
   UltraEdit.activeDocument.findReplace.replace(" *#!#!</Sum|</#!#!Sum"," ...</Sum");
   UltraEdit.activeDocument.top();

   // Finally, restore the tags of summaries with 0 to 60 characters.
   UltraEdit.activeDocument.findReplace.replace("<#(.+)</!S","<\\1</S");

   // Convert the results file from UTF-16 LE to UTF-8. UltraEdit/UEStudio
   // automatically detects that the active file is UTF-16 LE encoded and
   // not an ASCII/ANSI file and performs the conversion correctly, i.e. it
   // changes nothing except the information to save the file next time
   // with UTF-8 instead of UTF-16 LE encoding.
   UltraEdit.activeDocument.ASCIIToUTF8();
   UltraEdit.activeDocument.top();
}
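
The Article start tag conversion near the top of the replace chain can also be tried outside UltraEdit. The following is a minimal sketch in plain JavaScript, assuming the pattern behaves the same way in the JavaScript regular expression engine as in the Perl engine of UltraEdit. The function name toArticleStartTag and the search string in the sample header are made up for illustration and are not part of the script.

```javascript
// Hypothetical helper for illustration only; it is not part of the
// UltraEdit script and uses the JavaScript regex engine.
function toArticleStartTag(line) {
  // Group 1: file name without extension, group 2: extension with the dot.
  return line.replace(/^Find '.+?' in '.+\\(.+)(\..+)':/,
                      '<Article id="$1" file="$1$2">');
}

// Header line in the format written by Find in Files (search string made up).
const header = "Find 'class=\"article-full-' in 'C:\\Project\\article_24-1.xml':";
console.log(toArticleStartTag(header));
```

Lines that do not start with a Find in Files header are returned unchanged, which mirrors how the replace in the script only touches the header lines.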

I modified the Perl regular expression search string to class="article-full-(?:headline|subhead|body) as this makes the Find in Files a little bit faster.
Best regards from Austria
Thanks for all your support, it's working fine.

There are a few issues; kindly help us resolve them.

  1. In the find results, all emphasis tags matching </?[iub]>, both opening and closing, should be deleted, and also the tags matching </?span[^<>]*>.
  2. After running the script, we get the extra data shown below, which should be deleted.
Code: Select all
C:\Project\article_24-1.xml(36): <p class="article-full-body">Varsha is one among many in this situation. Across India, where for centuries a life&#x2019;s possibilities were circumscribed by the caste into which you were born, housemaids, sharecroppers and bricklayers are sending their children to school like never before. In primary school, there is almost universal enrollment, and for the first time in the country&#x2019;s history, girls are as likely to be enrolled in primary school as boys. On every reporting trip across India, I am struck by this remarkable shift, and when I ask their mothers why they bother, I hear answers as vague as this: I will educate my daughter because I want her life to be different from mine.</p>
C:\Project\article_24-1.xml(103): <p class="article-full-body"><span class="ld_bold">IN MAY 2013,</span> Varsha nails her Class 10 exams, earning the second highest score in her class. Papa agrees to let her enroll in a local high school, but insists that she wear only trousers. He softens that stance after a few months, but keeps a tight leash on her. No going to the library after school. No going to a friend&#x2019;s house, not even to do homework together.</p>
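
The first request, removing the emphasis and span tags, can be sketched outside UltraEdit in plain JavaScript. The pattern is the one used by the "Remove inline elements" replace in the script. The function name stripInlineTags is hypothetical, the sample is shortened and adapted from the find result above, and the JavaScript regex engine is assumed to treat this pattern the same way as the Perl engine.

```javascript
// Hypothetical helper for illustration; not part of the UltraEdit script.
function stripInlineTags(text) {
  // One or more consecutive start or end tags whose name begins with
  // b, i, u, span, strong or em, including any attributes.
  return text.replace(/(?:<\/?(?:[biu]|span|strong|em)[^<>]*>)+/g, "");
}

// Sample adapted from the find result above; the <i> element is made up.
const sample = '<p class="article-full-body"><span class="ld_bold">IN MAY 2013,</span> Varsha nails her <i>Class 10</i> exams.</p>';
console.log(stripInlineTags(sample));
```

Because [^<>]* accepts any characters after the tag name, the pattern as written also matches tags such as <br>. Inserting \b after the name group would be one way to restrict it if that matters for the input files.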

And one more issue: when the <Summary> data is taken from <p class="article-full-body">, it must not be longer than 60 characters, with ( ...) at the end of the summary data. See the example below.

Code: Select all
<Summary>For her new book, Nancy Jo Sales spoke to 200 teen ...</Summary>
I enhanced the script code in my previous post for those 2 new requirements. Truncating the too long summaries was not easy, as can be seen in the code above by looking at the last 6 Perl regular expression replaces. It took me quite a long time to find this solution.
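
As a cross-check of what those six replaces are meant to produce, the truncation rule can be written directly in plain JavaScript. This is only a sketch based on one reading of the comments in the script, not the method the script uses. The function name truncateSummary and the sample sentence are made up, and \w in JavaScript covers only ASCII letters, digits and the underscore, which may differ from the Perl engine in UltraEdit.

```javascript
// Hypothetical helper for illustration; not part of the UltraEdit script.
// Summaries of up to maxLength characters are returned unchanged. Longer
// ones are cut to at most keepLength characters without splitting a word
// and get " ..." appended, so the result has at most keepLength + 4 characters.
function truncateSummary(text, maxLength = 60, keepLength = 56) {
  if (text.length <= maxLength) {
    return text;
  }
  let head = text.slice(0, keepLength);
  // The cut splits a word if the characters on both sides of it are word characters.
  const cutsWord = /\w/.test(text.charAt(keepLength - 1)) &&
                   /\w/.test(text.charAt(keepLength));
  if (cutsWord) {
    head = head.replace(/\w+$/, ""); // drop the partial last word
  }
  // Remove trailing spaces before appending the space and three dots.
  return head.replace(/ +$/, "") + " ...";
}

// Made-up sample sentence, longer than 60 characters.
console.log(truncateSummary(
  "For her new book, Nancy Jo Sales spoke to 200 teen subscribers across the country."));
```

In UltraEdit itself the marker-based replace chain is still needed, because the replaces run over the whole file at once and cannot call a function per match.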
Best regards from Austria
Thanks a lot, Mofi, it's working fine. There were some issues, but I solved them by creating a macro. Thanks for your support.