About functions within a block comment:
As I wrote, code in comments is scanned by the regular expressions as well. The function string search running in the background would simply become slower if the entire file first had to be scanned for block and line comments and only the remaining parts were then scanned by the regular expressions. In other words, users would have to wait longer before the functions are displayed in the function list view. Do we want that just to avoid seeing commented-out functions?
My C/C++ files contain many, many comments, but no commented-out functions. I do not want to wait longer to see the functions after opening a file or switching to another file just because UltraEdit first runs a comment recognition scan. For my smaller projects I have also often enabled showing the functions of all project files. Showing this list would take even longer if comments had to be scanned first.
Also, UltraEdit is a general text editor and is not used only for programming languages. Others, like me, also use it for other text files and have created syntax highlighting languages with function strings for those files. Some of those files can be very large (several hundred MB or even GB). Scanning those files for comments first (comments exist in those files, too) and then running the regular expressions on the remaining text is a challenge for such large files and would result in a dramatic decrease in speed.
I know that C/C++ compilers remove all comments in the preprocessing step before the code is actually compiled. But compilers usually do not have to deal with files of several 100 MB as UltraEdit has to (not source files, but other file types). In some languages block comments can also be nested, which makes it considerably more difficult (= more time-consuming) to find comments and remove them before scanning for function names (or for whatever the function strings are defined). Sure, UltraEdit already finds comments, as they are highlighted as block and line comments, and therefore with some enhancements it would be possible to run the regular expression searches only on the remaining code.
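To make the trade-off concrete, here is a minimal Python sketch. It is not UltraEdit's implementation. The sample text, the simplified function regex and the naive comment stripper are assumptions for illustration only; the stripper handles neither comment markers inside strings nor nested block comments, which a real scan would have to do.

```python
import re

# Hypothetical sample file: two real functions, two commented-out ones.
SOURCE = (
    "int used(void) { return 1; }\n"
    "/* int disabled(void) { return 2; } */\n"
    "// int also_disabled(void) { return 3; }\n"
    "int other(void) { return 4; }\n"
)

# Simplified stand-in for a function string (assumption, not a real wordfile entry).
FUNC_RE = re.compile(r"\bint\s+(\w+)\s*\(")

# Naive comment recognition: block comments and line comments only.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.S)


def single_pass(text):
    """Run the function regex on the raw text, as the function list does now."""
    return FUNC_RE.findall(text)


def two_pass(text):
    """First remove comments, then run the function regex on what remains."""
    return FUNC_RE.findall(COMMENT_RE.sub("", text))


print(single_pass(SOURCE))  # includes the commented-out functions
print(two_pass(SOURCE))     # only the active functions
```

The second variant has to walk through the whole text twice, which is exactly the additional waiting time I am worried about for very large files.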
But what if a user also wants a function within a block comment to appear in the function list because the function is only temporarily commented out?
In my point of view, functions which are not used should be commented out only temporarily, not permanently. If an entire function is of no use (anymore), it should be deleted completely from the file and not just commented out. For C/C++ programmers writing secure code this is highly recommended according to MISRA C and MISRA C++. Unused functions can be stored just as well in a separate file that is not part of the project as in a project file.
About parameters of PHP functions:
First, thanks for adding parameters to the test file test.php5. It has made my work much easier.
A regular expression search does not work like a compiler or interpreter, which takes comments, strings and hierarchical information (= the context of the text) into account. Function definitions with string parameters whose default values contain 1 or more commas, parentheses, braces or other characters with a special meaning outside a string are therefore a huge problem. It is perhaps impossible to define regular expressions that take such string values into account. I have not even tried to find one.
In many programming languages it is not allowed to initialize a function parameter in the function definition line. In C it is not possible at all. In C++ it is possible to define functions with default values, but the default values are usually specified in the function declaration and must not be repeated in a separate function definition. PHP appears to be different because default values for function parameters are specified within the function definition. Perhaps good for PHP programmers, but bad for correct detection of function parameters with regular expression searches.
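A small Python sketch of the problem, using an approximate translation of a "skip whitespace, then match everything up to the next comma" expression. The argument list is a made-up example, and Python's regex engine is only a stand-in for UltraEdit's.

```python
import re

# Approximate Python equivalent of a "whitespace, then non-comma characters"
# parameter expression (an assumption, not an exact translation).
PARAM_RE = re.compile(r"\s*([^,]+)")


def naive_parameters(args):
    """Split an argument list at every comma, ignoring string context."""
    return PARAM_RE.findall(args)


# Hypothetical PHP argument list with a comma inside a string default value.
args = '$sep = ", ", $count = 3'
print(naive_parameters(args))
```

The expression has no idea that the first comma is inside a string, so the 2 real parameters come out as 3 fragments.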
Arrays within the argument list of a function, defined with matching ( ... ), are also a problem because of the parentheses and the additional commas. I don't know how often this occurs in PHP files. If it can be avoided by defining such functions differently (= using a different coding style), it should be.
We have to take 2 different use cases into account.
The first one is the use of a hierarchical function list view which displays the parameters of a function in a subgroup below the function name.
The advantage here is that UltraEdit in this case searches first for the opening tag ( and next for the matching closing tag ). Therefore a definition of 1 or more arrays with parentheses within the function definition is no problem. String values containing both parentheses are no problem either. Only a string value containing just ) is a problem, as it is interpreted as the matching closing tag. But I'm quite sure that string values containing only ) within a function definition are very, very rare in PHP files. So it does not matter that the search for the closing tag returns the wrong result in this special case.
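The behavior described above can be sketched in Python with a simple depth counter. This is only my assumption of how such a tag matching works in principle, not UltraEdit's actual code, and the three function definitions are made-up examples.

```python
def find_matching_close(text, open_pos):
    """Return the index of the ) matching the ( at open_pos, or -1.

    Parentheses are counted without any knowledge of strings.
    """
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1


def argument_list(definition):
    """Return the text between the first ( and its matching )."""
    start = definition.index("(")
    end = find_matching_close(definition, start)
    return definition[start + 1:end]


# Array with parentheses: matched correctly.
print(argument_list("function a($foo = array(1, 2), $bar = 3) {"))
# String containing both parentheses: still balanced, so also correct.
print(argument_list('function b($s = "(x)", $t = 1) {'))
# String containing only ): taken as the closing tag, wrong result.
print(argument_list('function c($s = ")", $t = 1) {'))
```

Only the third definition is cut off too early, which is the rare special case mentioned above.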
The following function string definitions result in a quite good hierarchical function list.
Code: Select all
/TGBegin "Function"
/TGFindStr = "%[^t ]++function[^t^p ]+^([a-z_-ÿ][a-z0-9_-ÿ]++^)[^t^p ]++("
/TGFindStr = "%[^t ]++[afps][birtu][abinos][altv][aeilr][^t act][a-filnopr-v ^t]++function[^t^p ]+^([a-z_-ÿ][a-z0-9_-ÿ]++^)[^t^p ]++("
/TGBegin "Parameter"
/TGFindStr = "[^t ^p]++^([~,]+^)"
/TGFindBStart = "("
/TGFindBEnd = ")"
/TGEnd
/TGBegin "Variable"
/TGFindStr = "%[^t ]++^(^$[a-z_-ÿ][a-z0-9_-ÿ]++^)[^t ]++=*;"
/TGFindBStart = "{"
/TGFindBEnd = "}"
/TGEnd
/TGEnd
The only problems I can see are those functions in test.php5 having 1 or more initialized arrays, as the array values are also separated by commas like the function parameters. In particular, phpArguments_13 is a function with an argument list which I think can never be listed correctly in the function list view because of an initialized array containing 2 initialized arrays. And that's just the simplest possible array definition with more than 1 dimension. No regular expression search will ever be able to correctly detect initialized multi-dimensional arrays.
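For illustration, a Python sketch comparing a split at every comma with a split that counts parentheses. The argument list is a made-up example in the style of a 2-dimensional array default value, not the actual content of test.php5, and the helper names are hypothetical.

```python
def split_at_commas(args):
    """What a 'match up to the next comma' expression effectively does."""
    return [part.strip() for part in args.split(",")]


def split_top_level(args):
    """Split only at commas that are outside of all parentheses.

    This needs a counter, which is the kind of context a plain
    regular expression search does not have.
    """
    parts, current, depth = [], [], 0
    for ch in args:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


args = "$a = array(array(1, 2), array(3, 4)), $b = 5"
print(split_at_commas(args))   # 5 fragments
print(split_top_level(args))   # the 2 real parameters
```

Like the tag matching of the hierarchical function list, the counting variant would still be fooled by an unbalanced parenthesis inside a string value.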
BTW: One ) was missing at function phpArguments_12 in the supplied test.php5. I corrected this mistake, packed test.php5 and replaced test.zip in the above post with the corrected version.
rhapdog, please note that I have also slightly changed the function string for local variables within a PHP function according to the general naming rule of PHP.
However, what can we do to improve the function list view result for those functions with initialized arrays? Well, we can add an additional regular expression to the subgroup Parameter like the following:
Code: Select all
/TGBegin "Parameter"
/TGFindStr = "[^t ^p]++^(^$[~,()]+(*)^)"
/TGFindBStart = "("
/TGFindBEnd = ")"
/TGFindStr = "[^t ^p]++^([~,]+^)"
/TGFindBStart = "("
/TGFindBEnd = ")"
/TGEnd
Now simple initialized arrays are at least also displayed completely in the function list view, in addition to the wrong parameter list. For example phpArguments_11 and phpArguments_12:
Code: Select all
phpArguments_11
   Parameter
      $foo = array(1,2,3,4)
      $foo = array(1
      2
      3
      4)
phpArguments_12
   Parameter
      $foo = array('test' => 5, 'hello', 'foo' => "hehe")
      $foo = array('test' => 5
      'hello'
      'foo' => "hehe")
Unfortunately, UltraEdit does not exclude strings already found within the block defined by the opening and matching closing tag. If it did, the second regular expression could not find those strings a second time.
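The duplicate entries can be reproduced with a Python approximation of the two parameter expressions. The translations to Python syntax are my assumption and Python's engine is only a stand-in; the point is just that both expressions are run independently over the same argument list.

```python
import re

# Approximation of the expression for a parameter with a simple array value.
ARRAY_PARAM_RE = re.compile(r"\s*(\$[^,()]+\(.*\))")
# Approximation of the general "everything up to the next comma" expression.
ANY_PARAM_RE = re.compile(r"\s*([^,]+)")


def parameter_entries(args):
    """Collect the matches of both expressions, one after the other.

    Text matched by the first expression is not removed before the
    second expression runs, so it is found again in fragments.
    """
    entries = []
    for pattern in (ARRAY_PARAM_RE, ANY_PARAM_RE):
        entries.extend(m.group(1) for m in pattern.finditer(args))
    return entries


for entry in parameter_entries("$foo = array(1,2,3,4)"):
    print(entry)
```

The first line printed is the complete parameter, the remaining lines are the fragments produced by the general expression.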
I will think about redefining the second expression to be less general so that it does not find strings already found by the first regular expression for parameters with a simple initialized array. Perhaps with 2 or more regular expressions that do not match parameters with an array we can avoid that garbage in the parameter list in the function list view.
The second use case is a flat list, which is the only list available in UltraEdit prior to v16.00. We do not have many options here because a regular expression cannot search for a matching ) without additional code support like that available for the hierarchical function list view.
I can only offer
Code: Select all
/Function String = "%[^t ]++function[^t^p ]+^([a-z_-ÿ][a-z0-9_-ÿ]++[^t^p ]++([~)]++)^)"
/Function String 1 = "%[^t ]++[afps][birtu][abinos][altv][aeilr][^t act][a-filnopr-v ^t]++function[^t^p ]+^([a-z_-ÿ][a-z0-9_-ÿ]++[^t^p ]++([~)]++)^)"
which stops matching characters after ( at the first ) found instead of at the really matching ), or
Code: Select all
/Function String = "%[^t ]++function[^t^p ]+^([a-z_-ÿ][a-z0-9_-ÿ]++[^t^p ]++([~{]+^)"
/Function String 1 = "%[^t ]++[afps][birtu][abinos][altv][aeilr][^t act][a-filnopr-v ^t]++function[^t^p ]+^([a-z_-ÿ][a-z0-9_-ÿ]++[^t^p ]++([~{]+^)"
which matches everything after ( up to the beginning of the function body. UltraEdit removes line terminating characters from the found string, resulting in a single-line string for the function list view.
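The difference between the two variants can be seen with simplified Python translations of the first function string of each pair. These are approximations under my own assumptions (ASCII identifiers only, no modifier keywords), and the whitespace collapsing stands in for UltraEdit's removal of line terminators.

```python
import re

FLAGS = re.IGNORECASE | re.MULTILINE

# Variant 1: stop at the first ) after the opening (.
UP_TO_FIRST_CLOSE = re.compile(
    r"^[ \t]*function\s+([a-z_][a-z0-9_]*\s*\([^)]*\))", FLAGS)
# Variant 2: take everything up to the { starting the function body.
UP_TO_BODY = re.compile(
    r"^[ \t]*function\s+([a-z_][a-z0-9_]*\s*\([^{]+)", FLAGS)


def flat_entry(pattern, source):
    """Return the function list entry as a single line, or None."""
    match = pattern.search(source)
    if match is None:
        return None
    # Collapse line breaks and repeated whitespace into single spaces.
    return " ".join(match.group(1).split())


# Hypothetical PHP function with an array default value.
source = "function f($foo = array(1, 2), $bar = 3)\n{\n}\n"
print(flat_entry(UP_TO_FIRST_CLOSE, source))  # cut off inside the argument list
print(flat_entry(UP_TO_BODY, source))         # complete argument list
```

The first variant gives a short but truncated entry for functions with array default values, the second gives the complete argument list, including anything else that stands between ) and {.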