User to user discussion and support for UltraEdit, UEStudio, UltraCompare, and other IDM applications.

Find, replace, find in files, replace in files, regular expressions
2 posts Page 1 of 1

Code:
<p>Agassiz, Louis, 1840, Etudes sur les glaciers: Neuchatel, privately published, 346 p.</p>
<p>Andrews, Edmund, 1870, The North American lakes considered as chronometers of post-glacial time: Chicago Acad. Sci. Trans., v. 2, 1870, p. 1-23</p>
<p>Baker, F. C, 1920, The life of the Pleistocene or glacial period as recorded in the deposits laid down by the great ice sheets: Univ. Illinois Bull., v. 17, no. 41, 476 p.</p>
<p>Capps, S. R., 1915, <p>An estimate of the age of the last great glaciation in Alaska: Washington Acad. Sci. J., v. 5, p. 108-115</p>
<p>Chamberlin, T. C, 1877, Geology of eastern Wisconsin, in Geology of Wisconsin, survey of 1873-1877, v. 2: Madison, Commissioners of Public Printing, p. 97-246</p>
<p>—1878, on the extent and significance of the Wisconsin kettle moraine: Wisconsin Acad. Sci., Arts Lett. Trans., v. 4 (1876-77), p. 201-234</p>
<p>—1883, Terminal moraine<p> of the second <p>glacial epoch: U.S. Geol. Surv. Ann. Rep. 3, p. 291-402</p>
<p>—1888, The rock-scorings of the great ice invasions: U.S. Geol. Surv., Ann. Rep. 7, p. 147-248</p>
<p>—1895, The classification of American glacial deposits: J. Geol., v. 3, p. 270-277</p>
<p>—1896, Nomenclature of glacial formations: J. Geol., v. 4, p. 872-876</p>
<p>Conrad, T. A., 1839, Notes on American geology: Amer. J. Sci., v. 35, p. 237-251</p>
<p>Daly, R. A., 1910, Pleistocene glaciation and the coral reef problem: Amer. J. Sci., v. 30, p. 297-308</p>
<p>—1915, The glacial-control theory of coral reefs: Amer. Acad. Arts Sci. Proc, v. 51, p. 157-251</p>
<p>Dana, J. D., 1863, Manual of geology: Philadelphia, Theodore Bliss & Co., 1st ed., 798 p.
<p>—1873, On the Glacial and Champlain eras in New England: Amer. J. Sci., v. 5, p. 198-211</p>
<p>—1875, Manual of geology: New York, Ivison, Blake-man, Taylor & Co., 2d. ed., 828 p.</p>
<p>—1895, Manual of geology: New York, American Book Co., 4th ed., 1087 p.</p>
<p>Darton, N. H., 1902, Description of the Norfolk quadrangle: U.S. Geol. Surv. Geol. Atlas, Folio 80, 4 p.</p>
<p>Dobson, Peter, 1826, <c><p>Remarks on bowlders</p></c>: Amer. J. Sci., v. 10, p. 217-218</p>
<p>Gale, H. S., 1914, Salines in the Owens, Searles, and Pana-mint basins, southeastern California: U.S. Geol. Surv. Bull. 580, p. 251-323</p>
<p>Geikie, James, 1874, The great ice age and its relation to the antiquity of man: London, W. Isbister, 575 p.</p>
<p>—1894, The great ice age and its relation to the antiquity of man: London, Stanford, 3d. ed., 850 p.</p>
<p>Gilbert, G. K., 1871, On certain glacial and post-glacial phenomena of the Maumee Valley: Amer. J. Sci., v. 1, p. 339-345</p>
<p>—1890, Lake Bonneville: U.S. Geol. Surv. Monogr. 1, 438 p.</p>
<p>Gray, Asa, 1878, Forest geography and archaeology: Amer. J. Sci., v. 16, p. 85-94, 183-196</p>
<p>Hay, O. P., 1914, The Pleistocene mammals of Iowa: Iowa Geol. Surv., v. 42, Ann. Rep. for 1912, p. 1-662</p>
<p>—1923, 1924, 1927, The Pleistocene of North America and its vertebrated animals . . . : Carnegie Instn. Publ. 322, 322A, 322B (3 v.)</p>
<p>Hitchcock, C. H., 1878, Surface geology, in The geology of New Hampshire: Concord, v. 3, pt. 3, 340 p.</p>
<p>Hitchcock, Edward, 1841, First anniversary address before the Association of American Geologists . . . : Amer. J. Sci., v. 41, p. 232-275</p>

I am looking for a regex that finds only the strings "<p>...<p>" that contain no "</p>" tag between the two start tags.

I'm using the Perl expression "<p>.*(?<!</p>)?<p>", which partially does the job, but it doesn't find the following multi-line string, even after adding "(?s)" before the expression:

Code:
<p>Dana, J. D., 1863, Manual of geology: Philadelphia, Theodore Bliss & Co., 1st ed., 798 p.

It also doesn't work properly on the string

Code:
<p>—1883, Terminal moraine<p> of the second <p>

where I want it to find "<p>—1883, Terminal moraine<p>" and "<p> of the second <p>" as two separate matches, not as one combined match.

Can anyone help me with this?
The Perl regular expression search string (?s)<p\b(?:.(?!</p>)(?!<p\b))+.(?:</p>)? finds paragraphs with or without </p> at the end.

The Perl regular expression search string (?s)<p\b(?:.(?!</p>)(?!<p\b))+.(?=<p\b) is the one you need: it finds a paragraph that has no end tag before the start tag of the next paragraph.

(?s) ... the dot (.) also matches newline characters.

<p\b ... a paragraph start tag with or without attributes. The character p must be a complete word, which is verified by \b (word boundary).

(?:...)+ ... the expression in this non-capturing group must match 1 or more times.

.(?!</p>)(?!<p\b) ... find any character that is followed by neither a paragraph end tag nor a new paragraph start tag, verified by two zero-width negative lookaheads.

. ... the last character before the paragraph end / start tag is excluded by the lookaheads above, so it must be matched additionally in any case.

In the first search string only:

(?:</p>)? ... also match the end tag if it is found next (optional).

In the second search string only:

(?=<p\b) ... the entire search succeeds only if the last character of the paragraph is followed by a paragraph start tag, verified by a zero-width positive lookahead.
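For anyone who wants to try the two search strings outside the editor, here is a minimal sketch in Python. It assumes Python's re module behaves like the Perl regex engine for these particular constructs ((?s), \b, and lookaheads), which should hold here but is not guaranteed in general. The names ANY_PARAGRAPH, UNCLOSED_PARAGRAPH, SAMPLE and find_unclosed are made up for the illustration, and SAMPLE is a shortened version of the lines from the question.

```python
import re

# Both patterns are copied unchanged from the post above.
# Assumption: Python's re engine is used as a stand-in for the Perl engine.
ANY_PARAGRAPH = re.compile(r"(?s)<p\b(?:.(?!</p>)(?!<p\b))+.(?:</p>)?")
UNCLOSED_PARAGRAPH = re.compile(r"(?s)<p\b(?:.(?!</p>)(?!<p\b))+.(?=<p\b)")

# Shortened sample: one paragraph missing </p> at the end of a line,
# one well-formed paragraph, and one line with stray <p> tags inside.
SAMPLE = (
    "<p>Dana, J. D., 1863, Manual of geology, 798 p.\n"
    "<p>—1873, On the Glacial and Champlain eras</p>\n"
    "<p>—1883, Terminal moraine<p> of the second <p>glacial epoch</p>\n"
)


def find_unclosed(text):
    """Return every paragraph that runs into the next <p> without a </p>."""
    return [m.group(0) for m in UNCLOSED_PARAGRAPH.finditer(text)]


unclosed = find_unclosed(SAMPLE)
for hit in unclosed:
    print(repr(hit))
```

Note that each match stops just before the next <p>, because the start tag is only checked by the lookahead and is not part of the match. The multi-line case is found because (?s) lets the final dot match the newline in front of the next start tag.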
Best regards from Austria