Next up...
martix wrote:...
The next part is optional reading, but I believe if I'm not mistaken that it shows a bug: ...
Code: Select all
var txt = "Sample (Entry 1)- 1st line lalala\n2nd line from entry 1 to match\nSecond line to match (Entry 2)- one liner test";
//var RE = new RegExp("^.*?\\(.*?[0-9]\\).*?(\\n((?!\\(.*?[0-9]\\)).)*$|$)", "gm");
var RE = /^.*?\(.*?[0-9]\).*?(\n((?!\(.*?[0-9]\)).)*$|$)/gm;
var Mucho = RE.exec(txt);
UltraEdit.outputWindow.write(Mucho[0]);
var i = Mucho.length;
UltraEdit.outputWindow.showWindow(true);
UltraEdit.outputWindow.write("Matches found:\r\n" + i);
UltraEdit.newFile();
for (var x = 0; x<i; x++){
UltraEdit.activeDocument.write("{"+Mucho[x]+"}");
UltraEdit.activeDocument.write("[Match "+x+"]")
}
...
No bug. Your script runs just fine and its output is correct (although at first glance, it may appear to be giving erroneous results). And both forms of the regex ("string" and /RegExp/) work equally well. Running this script from UE v14 produces the following in the output window:
Code: Select all
Running script: martix_20090823.js
========================================================================================================
Sample (Entry 1)- 1st line lalala
2nd line from entry 1 to match
Matches found:
3
Script succeeded.
And this is what appears in the newly created "UE.activeDocument":
Code: Select all
{Sample (Entry 1)- 1st line lalala
2nd line from entry 1 to match}[Match 0]{
2nd line from entry 1 to match}[Match 1]{h}[Match 2]
Now you are probably thinking: "3 matches? WTF? The regex should have found only two matches!" And... "just what the heck is that third "h" match[2] anyway?!" Well, the problem here is that you are using the RE.exec() method, which always returns an array containing all the details of just one match. That is, when RE.exec() is run, it finds the first match (which is the first two whole lines of the subject string), and returns an array with the details of that match having the following members:
Code: Select all
Mucho[0] = the overall match = (the first two lines from the subject).
Mucho[1] = the contents of capturing group 1 = "\n2nd line from entry 1 to match"
Mucho[2] = the contents of capturing group 2 = "h"
These are indeed the correct details of the first match of the given regex and subject string. Your regex has two capturing groups: one that matches the optional second line, and another that repeatedly matches one char of the second line (so it ends up holding only the last char it matched, the "h" of "match"). Thus the array has 3 members, which your script reports as 3 "Matches". (If your regex had 4 capturing groups, the array returned by RE.exec() would have 5 members.)
Now, if you run RE.exec() a second time, it will find the second overall match and return the details of that match in another array. This works because your regex has the "g" flag set: the RE object then uses its "lastIndex" property, which holds the position just past the end of the last match, to decide where to begin the next search on the target string. (If you inspect RE.lastIndex with your given script, you will see that it is set to 64 after the first and only run of RE.exec().)
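To make the lastIndex behaviour concrete, here is a minimal sketch that calls exec() in a loop on the same subject string and regex. The names "results" and "m" are made up for illustration, and the values are collected in an array instead of being written to the UltraEdit output window, so it should run in any JavaScript engine.

```javascript
var txt = "Sample (Entry 1)- 1st line lalala\n2nd line from entry 1 to match\nSecond line to match (Entry 2)- one liner test";
var RE = /^.*?\(.*?[0-9]\).*?(\n((?!\(.*?[0-9]\)).)*$|$)/gm;

var results = [];   // hypothetical collector, one entry per overall match
var m;

// Each call to exec() resumes at RE.lastIndex (this needs the "g" flag)
// and returns null once no further match is found.
while ((m = RE.exec(txt)) !== null) {
    results.push({
        match: m[0],               // the overall match
        group1: m[1],              // capturing group 1
        group2: m[2],              // capturing group 2 (undefined if it did not take part)
        members: m.length,         // always 3 here: overall match + 2 groups
        lastIndex: RE.lastIndex    // 64 after the first pass
    });
}
```

Each pass through the loop yields one array of the kind described above. When exec() finally returns null, it also resets RE.lastIndex to 0, so the same RE object can be reused on another string.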
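To get an array containing all the matches in your subject text, use the string "match()" method (instead of the RegExp "exec()" method). With the "g" flag set, as it is in your regex, match() returns every overall match and leaves out the capturing groups. In other words, change this:
Code: Select all
var Mucho = RE.exec(txt);
To this:
Code: Select all
var Mucho = txt.match(RE);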
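As a rough sketch of what the match() approach gives you, the snippet below runs it on the same subject string and builds the "{...}[Match n]" strings that the original loop writes to the new file. The names "all" and "lines" are made up for illustration, and the strings are kept in an array rather than passed to UltraEdit.activeDocument.write(), so the snippet can be tried outside UltraEdit.

```javascript
var txt = "Sample (Entry 1)- 1st line lalala\n2nd line from entry 1 to match\nSecond line to match (Entry 2)- one liner test";
var RE = /^.*?\(.*?[0-9]\).*?(\n((?!\(.*?[0-9]\)).)*$|$)/gm;

// With the "g" flag, match() returns only the overall matches,
// one array member per match, and no capturing-group details.
var all = txt.match(RE);

var lines = [];   // hypothetical stand-in for the document output
for (var x = 0; x < all.length; x++) {
    lines.push("{" + all[x] + "}[Match " + x + "]");
}
// all.length is 2: the two-line entry 1 and the one-line entry 2
```

Note that match() returns null (not an empty array) when nothing matches, so a real script would want to check for that before reading "length".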
I think that you will find that this change provides the results you expected. For more detailed information about the regular expression methods available in JavaScript (the core ones are str.search(), str.replace(), str.match(), str.split(), re.exec() and re.test()), I highly recommend reading "JavaScript: The Definitive Guide - 5th Edition" by David Flanagan. In fact, you can read most of chapter 11 (which covers regex) for free at this link (start on page 199).