I'm trying to figure out how I could go through an entire site and search for a string in the <body> of each page. If the string is found, I would like to output the page <title>, the date modified (found in a meta tag matching the pattern in the first code block below), and the file name with full path and extension.
For example, a case-insensitive search for "fast food" would output results like the ones listed further below, either to a file or to the results window.
I'm pretty sure it involves using FileNameFunctions.js and Find in Files, but we don't really want to look for the title and date unless the string is found. So basically: IF "fast food" is found after <body>, THEN get the title and date and insert them on a new line in a file or in the results window.
Maybe the script would first use getlistoffiles to get the files containing the string, and then process that list to extract the title, date and filename?
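To make the IF/THEN idea concrete, here is a rough sketch in plain JavaScript. It only covers the text processing; it assumes the file contents have already been read into strings by some other means, and it does not call any UltraEdit scripting functions. The names extractRecord, buildReport and trimText are hypothetical, and the date pattern is the partial one from the code block in this post, so it may need adjusting to the real meta tag.

```javascript
// Sketch only: these helpers are hypothetical, not part of UltraEdit or FileNameFunctions.js.

// Remove leading and trailing whitespace (written out for older JavaScript engines).
function trimText(s) {
  return s.replace(/^\s+|\s+$/g, "");
}

// Returns "title;date;path" if needle occurs after the <body> tag, otherwise null.
function extractRecord(html, path, needle) {
  var bodyTag = /<body[^>]*>/i.exec(html);
  if (!bodyTag) return null;                          // page has no <body> tag

  // Case-insensitive search restricted to the text after <body ...>.
  var body = html.substring(bodyTag.index + bodyTag[0].length).toLowerCase();
  if (body.indexOf(needle.toLowerCase()) < 0) return null;

  // Only now look for the title and the modified date.
  var title = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  var date = /modified"\s+scheme="W3CDTF"\s+content="(\d{4}-\d{2}-\d{2})"/i.exec(html);

  return [title ? trimText(title[1]) : "", date ? date[1] : "", path].join(";");
}

// Second step: run extractRecord over a list of { path, text } objects
// and keep only the files where the string was found.
function buildReport(files, needle) {
  var lines = [];
  for (var i = 0; i < files.length; i++) {
    var record = extractRecord(files[i].text, files[i].path, needle);
    if (record !== null) lines.push(record);
  }
  return lines;
}
```

A call such as buildReport(files, "fast food") would return one line per matching page. A page that mentions the string only in its <title> or <head> is skipped, because the search starts after the <body> tag.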
Any help would be much appreciated once again.
I'm using UE 14.20.
Code:
modified" scheme="W3CDTF" content="[\d]{4}-[\d]{2}-[\d]{2}"
I don't need to display the lines where the string is found, just the title, date and filename. File extensions could be .htm, .html or .asp, and the fields could be separated by ; for importing into a spreadsheet:
MacDiddles made record profits again;2012-11-08;\news\food\restaurants\somefile.htm
Government of lala land goes on battle against obesity;2010-10-25;\government\policies\obesityxyz.html
...
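One thing to watch with this output format: a title that itself contains a ; would split into two columns on import. Below is a small sketch of how a line could be built safely, assuming the spreadsheet accepts the common CSV convention of wrapping such fields in double quotes and doubling any embedded quotes. The helper names quoteField and toLine are hypothetical.

```javascript
// Sketch only: hypothetical helpers for writing one semicolon-separated line.

// Quote a field only if it contains the separator, a double quote or a line break.
function quoteField(value) {
  var s = String(value);
  if (/[;"\r\n]/.test(s)) {
    return '"' + s.replace(/"/g, '""') + '"';
  }
  return s;
}

// Build one output line: title;date;path
function toLine(title, date, path) {
  return [quoteField(title), quoteField(date), quoteField(path)].join(";");
}
```

Fields without special characters pass through unchanged, so the sample lines above would look exactly the same.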
Code:
<html>
<head>
<title>MacDiddles made record profits again</title>
.....
<meta ...modified" scheme="W3CDTF" content="2012-11-08">
......
</head>
<body>
............... The fast food giant reports.....
</body>
</html>