Nested function list for all modules in all files of a project

harleygnuya · Oct 06, 2017 · #1 · 2017-10-06T00:32+00:00

I'm writing macros for Excel workbooks. I have a lot of Functions, Subs and Properties declared, which is making the function list rather long and difficult to navigate. For example, other than actually double-clicking on the function, there's no way to tell which module a particular function is in if functions defined in two different modules have the same name.

To make it easier to navigate the function list, I'd like to add a level to the hierarchy to group the functions into their respective modules, so the list would look something like this:


Modules
    - Module1
        - Functions
            - Function1
                - Parameters
                    Param1
            + Function2
        + Subs
            .
            .
            .
    + ModuleN

This not only groups them nicely, but would also allow me to collapse the modules in which I have no interest, while keeping visible those in which I do have an interest. (That is more flexible than deselecting List for all project files.) It would also be useful for object-oriented languages, in that it would allow one to group methods and their attendant parameters by class.

Looking at the Modify Groups dialog and at the .UEW file itself, it appears that one can nest the groups to more than the two levels that I've seen in all the .UEW files I've looked at, so I tried the following, first by using the Modify Groups dialog, then later by modifying the .UEW file directly. This is the code I wound up with:


/TGBegin "Modules"
    /TGFindStr = "^\s*Attribute VB_Name\s+=\s+.*\<(\w+)\W\s*$"
    /TGBegin "Functions"
        /TGFindStr = "^[\t ]*(?:public[\t ]+|private[\t ]+)*function[\t ]+([a-z][0-9a-z_]*)[\t ]*\("
        /TGBegin "Parameters"
            /TGFindStr = "\s*([^,]+)"
            /TGFindBStart = "\("
            /TGFindBEnd = "\)"
        /TGEnd
        /TGBegin "Return value"
            /TGFindStr = "[\t ]*(.+)"
            /TGFindBStart = "\)[\t ]+as[\t ]+"
            /TGFindBEnd = "$"
        /TGEnd
    /TGEnd
    /TGBegin "Subs"
        /TGFindStr = "^[\t ]*(?:public[\t ]+|private[\t ]+)*sub[\t ]+([a-z][0-9a-z_]*)[\t ]*\("
        /TGBegin "Parameters"
            /TGFindStr = "\s*([^,]+)"
            /TGFindBStart = "\("
            /TGFindBEnd = "\)"
        /TGEnd
    /TGEnd
    /TGBegin "Properties"
        /TGFindStr = "^[\t ]*(?:public[\t ]+|private[\t ]+|friend[\t ]+)*(?:static[\t ]+)*property[\t ]+get[\t ]+([a-z][a-z0-9_]*)[\t ]*\("
        /TGBegin "Parameters"
            /TGFindStr = "\s*([^,]+)"
            /TGFindBStart = "\("
            /TGFindBEnd = "\)"
        /TGEnd
    /TGEnd
/TGEnd

Unfortunately, when I apply the .UEW file and refresh my function list, the modules show up, but none of the underlying Functions, Subs or Properties. When I remove the top layer (Modules) from the .UEW file, it goes back to correctly showing the Functions, Subs and Properties.

I'm pretty sure that I know what the various /TGx statements do, but I don't feel confident about my understanding of how they work together to create a hierarchy that works. For example, what is required at each level? What are the subtleties of what TGFindBStart and TGFindBEnd actually do?

Is there a limit that prevents one from nesting the groups to more than two levels? If not, what do I need to do to my code above to make it work the way I want it to? Is there some documentation somewhere that discusses this? If so, I've not been able to find it.

Thanks

Mofi · Oct 06, 2017 · #2 · 2017-10-06T05:12+00:00

I would need an example file that is syntax highlighted with the wordfile containing the posted regular expressions.

I suppose that the wordfile also contains /Regexp Type = Perl, as otherwise the Perl regular expression strings would not work at all.

I already have one hint:
Don't use \s as a replacement for [\t ]. \s matches any whitespace character according to the Unicode standard, which includes the newline characters carriage return and line feed. For that reason a Perl regular expression starting with ^\s*Attribute VB_Name is problematic, because it also matches all empty and blank lines above the line beginning with Attribute VB_Name. So for UltraEdit the found item does not necessarily start on the line beginning with Attribute VB_Name, but on the first of the empty/blank lines directly above it.
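
The effect can be reproduced outside UltraEdit. The following is only a minimal sketch in Python, under the assumption that Python's re module treats \s and ^ (with re.MULTILINE) the same way as the Perl engine does for this simple case. The sample text and the helper first_match_line are made up for illustration.

```python
import re

# Two empty lines followed by the line of interest (hypothetical sample text).
SAMPLE = '\n\nAttribute VB_Name = "Module1"\n'


def first_match_line(pattern: str, text: str = SAMPLE) -> int:
    """Return the 1-based number of the line on which the first match starts."""
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError("pattern did not match")
    # Count the newlines in front of the match start to get the line number.
    return text.count("\n", 0, match.start()) + 1


# \s* also consumes the newlines of the empty lines above the attribute line.
loose = first_match_line(r"^\s*Attribute VB_Name")
# [\t ]* can only consume tabs and spaces, so the match stays on one line.
strict = first_match_line(r"^[\t ]*Attribute VB_Name")

print(loose, strict)
```

With \s* the match already begins on the first empty line, while with [\t ]* it begins on the line that really contains Attribute VB_Name.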

harleygnuya · Oct 06, 2017 · #3 · 2017-10-06T16:37+00:00

Thanks for responding, Mofi.

I made the change to the \s that you suggested. It's nice to know the difference between the two, but it made no difference in the functioning in this case. I used the visualbasic.uew wordfile as a template and modified it to get what I have now.

I've attached a zip file that contains my .uew file, two sample .bas files and also my project definition files (.prj, .pui).

Again, what I'm trying to do is nest things in this way:


Modules (each file is a module)
    Functions
    Subs
    Properties

As I am sending it to you, the wordfile has all of these active, and all that shows up in the function list are the modules. If you comment out lines 2, 3 and 33, which removes the Modules level, the function list does properly show all of the Functions, Subs and Properties.

Yes, the wordfile does have the /Regexp Type = Perl statement.

Thanks for your help on this.

Mofi · Oct 07, 2017 · #4 · 2017-10-07T13:01+00:00

Thanks for the example files and the project/workspace, which were very helpful for understanding the problem.

The definitions for Functions, Subs and Properties are missing the definitions for the Open tag and Close tag, i.e. the /TGFindBStart = and /TGFindBEnd = definitions for those subgroups within the group Modules. A subgroup definition without the regular expressions that define the begin and end of the character stream to search results in an empty character stream, so the regular expression searches of the subgroup are not executed at all.

/TGFindBStart = must be a regular expression string. For the groups Functions, Subs and Properties it is executed from the start of the line on which the module string was found. /TGFindBEnd = must be a regular expression string which is executed from the end of the string found by /TGFindBStart =. The character stream between the matches of those two regular expressions defines the block for the regular expression searches of the subgroup.

That explains why \s*([^,]+) works for Parameters: the block searched by this expression is just the character stream between the opening parenthesis ( and the next closing parenthesis ).

Use the following definition in the wordfile to get your grouped function list working as expected with the List for all project files option of the Function List enabled.
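
The interplay of the three expressions can be modeled roughly as follows. This is only a sketch in Python of the behavior described above, not UltraEdit's actual implementation. The function items_in_block and the sample line are hypothetical, and the case-insensitive, multi-line flags are assumptions.

```python
import re
from typing import List


def items_in_block(text: str, block_start: str, block_end: str, item: str) -> List[str]:
    """Collect group 1 of every `item` match between block_start and block_end."""
    flags = re.IGNORECASE | re.MULTILINE
    opened = re.compile(block_start, flags).search(text)
    if opened is None:
        return []  # no block start found, so nothing is searched
    # The end expression is searched from the end of the start match.
    closed = re.compile(block_end, flags).search(text, opened.end())
    if closed is None:
        return []
    # Only the characters between the two matches are searched for items.
    block = text[opened.end():closed.start()]
    return [m.group(1) for m in re.compile(item, flags).finditer(block)]


line = "Public Function Add(a As Long, b As Long) As Long"
params = items_in_block(line, r"\(", r"\)", r"\s*([^,]+)")
print(params)
```

In this model a subgroup without a block start expression simply has nothing to search in, which corresponds to the empty function list below Modules.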


/TGBegin "Modules"
    /TGFindStr = "^[\t ]*Attribute VB_Name[\t ]+=[\t ]+"\<(\w+)"[\t ]*$"
    /TGBegin "Functions"
        /TGFindStr = "^[\t ]*(?:public[\t ]+|private[\t ]+)*function[\t ]+([a-z][0-9a-z_]*)[\t ]*\("
        /TGFindBStart = "Attribute VB_Name"
        /TGFindBEnd = "(?=Attribute VB_Name)|\z"
        /TGBegin "Parameters"
            /TGFindStr = "\s*([^,]+)"
            /TGFindBStart = "\("
            /TGFindBEnd = "\)"
        /TGEnd
        /TGBegin "Return value"
            /TGFindStr = "[\t ]*(.+)"
            /TGFindBStart = "\)[\t ]+as[\t ]+"
            /TGFindBEnd = "$"
        /TGEnd
    /TGEnd
    /TGBegin "Subs"
        /TGFindStr = "^[\t ]*(?:public[\t ]+|private[\t ]+)*sub[\t ]+([a-z][0-9a-z_]*)[\t ]*\("
        /TGFindBStart = "Attribute VB_Name"
        /TGFindBEnd = "(?=Attribute VB_Name)|\z"
        /TGBegin "Parameters"
            /TGFindStr = "\s*([^,]+)"
            /TGFindBStart = "\("
            /TGFindBEnd = "\)"
        /TGEnd
    /TGEnd
    /TGBegin "Properties"
        /TGFindStr = "^[\t ]*(?:public[\t ]+|private[\t ]+|friend[\t ]+)*(?:static[\t ]+)*property[\t ]+get[\t ]+([a-z][a-z0-9_]*)[\t ]*\("
        /TGFindBStart = "Attribute VB_Name"
        /TGFindBEnd = "(?=Attribute VB_Name)|\z"
        /TGBegin "Parameters"
            /TGFindStr = "\s*([^,]+)"
            /TGFindBStart = "\("
            /TGFindBEnd = "\)"
        /TGEnd
    /TGEnd
/TGEnd

I specified the string Attribute VB_Name, which is interpreted case-insensitively, as the begin of the block in which to search for Functions, Subs and Properties.

The regular expression (?=Attribute VB_Name)|\z for the end of the block does not match any character. It is an OR expression with two alternatives. The first alternative is a positive lookahead for Attribute VB_Name. So when one more line with Attribute VB_Name is found in a file, the position after the character before it (a whitespace character) is interpreted as the end of the current module in the file. The second alternative is \z, which means end of buffer.

If it is impossible for a *.bas file to contain more than one module, replace (?=Attribute VB_Name)|\z with just \z in all three places to speed up the search for the end of a module block.
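
The zero-width end expression can be tried out in isolation. The following Python sketch is an approximation only: Python's re module writes the end-of-buffer anchor as \Z where Boost uses \z, the two-module buffer is made up, and the loop in module_blocks is an assumed model of the block search, not UltraEdit's real algorithm.

```python
import re
from typing import List

# Hypothetical buffer holding two modules, to exercise both alternatives.
BUFFER = (
    'Attribute VB_Name = "Module1"\n'
    "Function F1()\n"
    'Attribute VB_Name = "Module2"\n'
    "Sub S1()\n"
)

BLOCK_START = re.compile(r"Attribute VB_Name", re.IGNORECASE)
# Python spells the end-of-buffer anchor \Z; Boost (UltraEdit) spells it \z.
BLOCK_END = re.compile(r"(?=Attribute VB_Name)|\Z", re.IGNORECASE)


def module_blocks(text: str) -> List[str]:
    """Split text into the blocks delimited by BLOCK_START and BLOCK_END."""
    blocks = []
    pos = 0
    while True:
        start = BLOCK_START.search(text, pos)
        if start is None:
            break
        # Always succeeds, because \Z matches at the end of the text.
        end = BLOCK_END.search(text, start.end())
        blocks.append(text[start.end():end.start()])
        # The end match is zero-width, so the next block start is found
        # exactly at the position where this block ended.
        pos = end.end()
    return blocks


blocks = module_blocks(BUFFER)
print(len(blocks))
```

The first block ends directly in front of the second Attribute VB_Name via the lookahead, and the last block ends at the end of the buffer via the second alternative.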

Some additional information about the Perl regular expression escape sequences \A and \z.

UltraEdit and UEStudio use the Boost Perl regular expression library which explains \A and \z as follows:

\A Matches at the start of a buffer only (the same as \`).
\z Matches at the end of a buffer only (the same as \').

But what is a buffer and how large is it? Does \A mean the beginning of the file and \z the end of the file?

Well, the buffer is whatever the application using the library has defined internally as the block in memory that holds the characters of a text file or the bytes of a binary file and that is passed to the library function.

But UltraEdit can be used for viewing and editing anything from small files of just a few KiB to really huge files of more than 8 GiB, even on computers that have less RAM installed than the size of the opened file. UltraEdit loads large files into memory only in chunks/blocks, as otherwise it would fail to edit files larger than the total installed memory, and of course also files larger than the largest free block available in RAM managed by the operating system. The largest free memory block available in RAM is nearly always much smaller than the total free memory, as the memory becomes fragmented very quickly when applications are started and exited.

For that reason, running a Perl regular expression search with \A or with \z could result in a positive match more than once per file prior to UltraEdit for Windows v24.00 and UEStudio v17.00. This means that on using, for example, \z in a Perl regular expression Find/Replace on an opened large file or in Find/Replace in Files, the entire search string can result in a positive match more than once and not just at the end of the file.

So for UE < 24.00 and UES < 17.00, \A really means begin of the current file buffer and \z means end of the current file buffer. How large the current file buffer is when running Find/Replace or Find/Replace in Files with \A or \z is unpredictable. But for files no larger than 512 KiB I have never seen \A fail to match the beginning of the file or \z fail to match the end of the file. Please note that I have not run extensive tests with \A or \z on various file sizes with various versions of UE/UES on various computers with different amounts of installed RAM and with various other applications running to find out the minimum size of a file buffer. Well, the Windows kernel functions for buffered file access use a buffer of at least 64 KiB. That means a file no larger than 64 KiB can always be modified safely with a replace using \A or \z.
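
Why a chunked buffer produces more than one match can be shown with a toy model. This Python sketch does not reflect how UltraEdit manages its file buffers. The function count_end_matches and the chunk sizes are invented for illustration, and Python's \Z stands in for Boost's \z.

```python
import re

# Python's \Z corresponds to \z in the Boost Perl syntax: end of buffer only.
END_OF_BUFFER = re.compile(r"\Z")


def count_end_matches(text: str, chunk_size: int) -> int:
    """Count end-of-buffer matches when text is searched chunk by chunk."""
    # Each chunk is handed to the regex engine as a separate buffer.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return sum(1 for chunk in chunks if END_OF_BUFFER.search(chunk))


content = "abcdefghij"
chunked = count_end_matches(content, 4)   # buffers: "abcd", "efgh", "ij"
whole = count_end_matches(content, 100)   # one buffer holding the whole text
print(chunked, whole)
```

As soon as the content is split over several buffers, the end-of-buffer anchor matches once per buffer instead of once per file.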

The file changes.txt in the program files folder of UltraEdit contains the following line for v24.00:

Added support for Perl regexp buffer boundaries (\A, \z, etc.)

The same line can be found for v17.00 in changes_ues.txt in the program files folder of UEStudio.

What does this somewhat unclear note mean?

It means that with UltraEdit v24.00 or UEStudio v17.00 or any later version of UE/UES, the buffer boundary escape sequence \A always matches the beginning of a file and \z always matches the end of a file, independent of file size, because extra code was added to UE/UES to guarantee that even for large and huge files. So it is now possible with UE v24.00+ and UES v17.00+ to use \A to find/replace something at the beginning of a file or \z to find/replace something at the end of a file, independent of the file size and the file buffer sizes.

But please take into account that the number of bytes which can be kept in memory is still limited. So a Perl regular expression replace with the search string (?s)\A(.+)\z and the replace string \1\1 cannot be used to duplicate the contents of a file of any size.
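
For a small text that fits in memory, this replace behaves as in the following Python sketch. It only mirrors the search and replace strings above in Python's re syntax, where \Z takes the place of \z, and the helper duplicate_contents is hypothetical.

```python
import re


def duplicate_contents(text: str) -> str:
    """Append a copy of the whole text to itself with a single regex replace."""
    # (?s) lets . match newlines, so group 1 captures the entire text at once.
    # This is exactly why the whole content has to be held in memory.
    return re.sub(r"(?s)\A(.+)\Z", r"\1\1", text)


doubled = duplicate_contents("line 1\nline 2\n")
print(repr(doubled))
```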

harleygnuya · Oct 07, 2017 · #5 · 2017-10-07T14:03+00:00

Thanks, Mofi. It works perfectly!

I also really appreciate your explaining how and why it works, and the extra information you provided. This is exactly what I was looking for.