Mofi wrote: I finally got a working wordfile using the grouped function list feature for your example file with the additional code examples you posted. In the RAR file there are:
vhdl93.uew - the wordfile for VHDL for all UltraEdit versions supporting the grouped (hierarchical) function list.
vhdl93_old_style.uew - the wordfile with old style function strings for older versions of UltraEdit.
Well, Mr. Mofi, you're a genius.
THANK YOU!!
It's working very well, and you did more than I had hoped for at the beginning (like separating the parameters of procedures and processes).
I want to ask you something more. I don't have much time now, but soon I will try to understand what you did so that I can reuse it for other small things. May I ask you for help if I don't understand some of the tips or code you used?
The first regular expression search string is for finding procedures with arguments in round brackets. The second one is for finding procedures without arguments.
% means start of line.
[ ^t]++ is for finding zero or more spaces/tabs. As you can see, this expression is used several times in all the regexp search strings. It is used wherever spaces/tabs may exist but need not be present. After % this expression matches leading whitespace at the beginning of a line.
A procedure definition line must contain next the string procedure (in any case, because UE always runs these searches case-insensitively).
[ ^t]+ matches one or more spaces/tabs. So there must be a space character between the string procedure and the next word.
[0-9a-z_]+ matches the name of the procedure consisting of any letter in any case, any digit and optionally 1 or more underscores. This expression is enclosed by ^(...^) which tells UltraEdit that this part of the found string should be displayed in the function list view.
The second search string searches next again for 1 or more spaces/tabs and then the next word must be is to identify the found string as procedure definition line without arguments in round brackets.
The first string continues with [ ^t]++ matching optional spaces/tabs before next ( must be found.
Your examples showed me that the arguments can span over multiple lines. Therefore I made it simple and used [~)]+ to match a string with 1 or more characters up to ). The negative character set definition includes also line ending characters.
After closing round bracket there must be 1 or more spaces/tabs before the word is to identify the found string as procedure definition line with arguments in round brackets.
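For readers who do not know the UltraEdit regexp syntax, here is a rough Python translation of the two procedure search strings as they are described above. It is only a sketch: the two patterns are rebuilt from the description, not copied from the wordfile, and the VHDL sample and the helper name list_procedures are made up for illustration. The mapping assumed is % to ^ (multiline), [ ^t]++ to [ \t]*, [ ^t]+ to [ \t]+, ^(...^) to a capture group and [~)]+ to [^)]+.

```python
import re

# UE function list searches are not case-sensitive; % anchors at line start.
FLAGS = re.IGNORECASE | re.MULTILINE

# Procedure with arguments in round brackets (arguments may span lines,
# because the negated set [^)] also matches line ending characters).
PROC_WITH_ARGS = re.compile(
    r"^[ \t]*procedure[ \t]+([0-9a-z_]+)[ \t]*\([^)]+\)[ \t]+is", FLAGS)

# Procedure without arguments: the word "is" follows the name directly.
PROC_NO_ARGS = re.compile(
    r"^[ \t]*procedure[ \t]+([0-9a-z_]+)[ \t]+is", FLAGS)

# Hypothetical sample, not taken from the original example file.
SAMPLE = """\
  procedure reset_all is
  begin
  end procedure;

  PROCEDURE write_reg (
      addr : in integer;
      data : in integer) is
  begin
  end procedure;
"""

def list_procedures(text):
    """Return the captured procedure names, first search string first."""
    names = [m.group(1) for m in PROC_WITH_ARGS.finditer(text)]
    names += [m.group(1) for m in PROC_NO_ARGS.finditer(text)]
    return names

print(list_procedures(SAMPLE))
```

Only the captured group (the procedure name) is returned, which corresponds to the part tagged with ^(...^) for display in the function list view.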
/TGBegin "Argument"
/TGFindStr = "^([0-9a-z_]+ +:[ in]+[a-z_]+^)"
/TGFindBStart = "procedure"
/TGFindBEnd = "begin"
The first problem I had to solve for listing the arguments of a procedure was that there are procedures without arguments in round brackets. Therefore ( as block start string and ) as block end string could not be used, because for such a procedure UltraEdit would search further down in the file until it finds an opening and a closing bracket, and would then run the defined regexp search string on that wrong block.
But I could see that all procedures have in common that there is the word procedure and the line below the procedure definition line has the word begin.
So now just a regular expression search string is needed which finds procedure arguments and ignores everything else between these 2 words. For the procedure arguments that was not really difficult.
[0-9a-z_]+ matches again a string with 1 or more digits, letters or underscores.
I decided to simply search next for 1 or more spaces, using the space character followed by +, instead of using [ ^t]+ or [ ^t]++. I have not seen in your examples a procedure with no space before the colon or with a tab before the colon. Therefore just 1 or more spaces are searched for.
The next character must be a colon.
In your examples the colon is always followed by either a space, the word in and another space, or just a space without in. Again I kept it simple and search with [ in]+ for a string consisting of 1 or more spaces or letter i or letter n (both letters in any case).
[a-z_]+ matches a string consisting of letters and underscores. This matches the type of the argument, and I have not seen in your examples a type string that contains a digit.
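The block search for procedure arguments can be imitated in Python as well. This is a sketch under assumptions: the block limits procedure and begin are modelled with a non-greedy search, the argument pattern follows the description above (name, 1 or more spaces, colon, [ in]+, type), and the sample procedure and the name procedure_arguments are invented for the demonstration.

```python
import re

# Block from the word "procedure" up to the next "begin" (like
# /TGFindBStart and /TGFindBEnd); DOTALL lets the block span lines.
BLOCK = re.compile(r"procedure(.*?)begin", re.IGNORECASE | re.DOTALL)

# name, 1 or more spaces, colon, spaces and/or "in", then the type.
ARG = re.compile(r"([0-9a-z_]+ +:[ in]+[a-z_]+)", re.IGNORECASE)

# Hypothetical sample, not taken from the original example file.
SAMPLE = """\
procedure write_reg (
    addr : in integer;
    data : std_logic) is
begin
end procedure;
"""

def procedure_arguments(text):
    """Collect every argument string found inside a procedure block."""
    found = []
    for block in BLOCK.finditer(text):
        found.extend(ARG.findall(block.group(1)))
    return found

print(procedure_arguments(SAMPLE))
```

Note that [ in]+ is a character set, not the word in, so it also swallows a leading i or n of the type name. That does no harm here because the rest of the type is still matched by [a-z_]+.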
/TGBegin "Process"
/TGFindStr = "%[ ^t]++^(process^)[ ^t]++([~)]+)"
/TGFindStr = "%[ ^t]++^([0-9a-z_]+^)[ ^t]++:[ ^t]++process"
The first string is for finding processes without a name but with arguments; therefore the word process itself is tagged for being listed in the function list view. The second string finds processes with a name to the left of the keyword process, with a colon as separator. The 2 strings contain no expression that has not already been explained, so no more details are needed.
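As a cross-check, the two process search strings translate to Python roughly as follows. This is a sketch only; the sample text and the helper list_processes are assumptions, and the argument list given to p_Period_MUX is invented.

```python
import re

FLAGS = re.IGNORECASE | re.MULTILINE

# Unnamed process with arguments: the keyword itself is captured.
UNNAMED = re.compile(r"^[ \t]*(process)[ \t]*\([^)]+\)", FLAGS)

# Named process: name, optional spaces/tabs, colon, keyword process.
NAMED = re.compile(r"^[ \t]*([0-9a-z_]+)[ \t]*:[ \t]*process", FLAGS)

# Hypothetical sample, not taken from the original example file.
SAMPLE = """\
p_Period_MUX : process (clk, reset, sel)
begin
end process;

  process (clk)
  begin
  end process;
"""

def list_processes(text):
    """Return the captured strings of both searches in file order."""
    hits = [(m.start(), m.group(1))
            for pattern in (UNNAMED, NAMED)
            for m in pattern.finditer(text)]
    return [name for _, name in sorted(hits)]

print(list_processes(SAMPLE))
```

The lines with end process are not listed because neither pattern allows the word end between the start of the line and the keyword.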
/TGBegin "Argument"
/TGFindStr = "^([a-z0-9_]+ +: +[a-z_]+^)^{;^}^{[ ^t]++)^}"
/TGFindBStart = "process"
/TGFindBEnd = "begin"
/TGFindStr = "^([a-z0-9_]+^),"
/TGFindBStart = "process"
/TGFindBEnd = "begin"
/TGFindStr = "[(,][ ^t]++^([a-z0-9_]+^)[ ^t]++)"
/TGFindBStart = "process"
/TGFindBEnd = "begin"
Finding the arguments of processes was really tricky for me because of the 2 different kinds of arguments: simple words separated by commas, and more complex strings with name + colon + type.
It was clear to me that at least 2 search strings are needed to find the 2 different kinds of arguments. The hard part was making sure that parts of the complex strings are not also found by the simpler expression searching for the comma separated words.
The first regexp search string is similar to the search string used to find procedure arguments. The first difference is that after the colon just a space followed by + is used, to find only 1 or more spaces, instead of [ in]+ which optionally includes the word in. The second difference is that there is an OR expression appended to match either the semicolon or the closing round bracket of the process after optional spaces/tabs. The second argument of the OR expression is for the last argument inside the round brackets of a process, which has no semicolon.
The second regular expression search string executed on the same block as the first one simply finds words consisting of digits, letters or underscores with a comma following. That matches all arguments of process p_Period_MUX, except the last argument because that has no comma following.
So I needed a third search executed on the same block as the other 2 searches which finds either the last argument of a comma separated list of process arguments, or the only argument of a process having only 1 argument. It took me a while to find this expression, although it is very simple, because the regular expressions I tried before always also found words in argument lists with a colon.
[(,] matches an opening round bracket (for a process with a single argument) or a comma. That expression excludes the words of the arguments separated by a semicolon. Easy, isn't it? But it took me a while to see that.
[ ^t]++ is well known, it matches 0 or more spaces/tabs.
[a-z0-9_]+ matches the argument name which should be displayed in the function list view.
Then again spaces/tabs can optionally follow, matched by [ ^t]++.
The last character must be a closing round bracket. That is important because it excludes the comma separated arguments found already by the second search string and therefore matches only the last argument of a list of arguments or the only argument of a process.
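The interplay of the three searches on the same process ... begin block can be tried out with this Python sketch. It is an approximation under assumptions: the three patterns are translated from the description above (with 1 or more spaces around the colon in the first one), and all sample argument lists as well as the names TYPED, WITH_COMMA, LAST_OR_ONLY and process_arguments are invented.

```python
import re

I = re.IGNORECASE

# Block from the word "process" up to the next "begin".
BLOCK = re.compile(r"process(.*?)begin", I | re.DOTALL)

# 1st search: name : type, followed by ";" or by optional blanks and ")".
TYPED = re.compile(r"([a-z0-9_]+ +: +[a-z_]+)(?:;|[ \t]*\))", I)
# 2nd search: a word directly followed by a comma.
WITH_COMMA = re.compile(r"([a-z0-9_]+),", I)
# 3rd search: a word between "(" or "," and the closing ")".
LAST_OR_ONLY = re.compile(r"[(,][ \t]*([a-z0-9_]+)[ \t]*\)", I)

def process_arguments(text):
    """Run all three searches on every process block, in that order."""
    found = []
    for block in BLOCK.finditer(text):
        body = block.group(1)
        for pattern in (TYPED, WITH_COMMA, LAST_OR_ONLY):
            found.extend(pattern.findall(body))
    return found

# Hypothetical samples, not taken from the original example file.
print(process_arguments("p_Period_MUX : process (clk, reset, sel)\nbegin\n"))
print(process_arguments("process (cnt : integer; flag : boolean)\nbegin\n"))
print(process_arguments("process (clk)\nbegin\n"))
```

In the second sample the third pattern finds nothing, because after ( and the word cnt a colon follows instead of a closing bracket. That is the exclusion of the colon style arguments described above.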
But one question might remain: Why 2 regular expressions for matching the comma separated arguments? Why not simply use a combination of second and third search string which would be [(,][ ^t]++^([a-z0-9_]+^)[ ^t]++[,)] ?
Well, the reason is quite simple. That would require that UltraEdit goes back after every found argument, which UltraEdit does not do, and lookahead as supported by the Perl regexp engine is not supported by the UE regexp engine. Therefore this search string would match from
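The effect of the consumed separator can be reproduced in Python, whose re module also continues each search after the end of the previous match. This is only a sketch with an invented argument list; the lookahead variant is shown for comparison and is a Python (Perl style) feature, not something the UE regexp engine offers.

```python
import re

# Hypothetical argument list, not taken from the original example file.
ARGS = "(clk, reset, sel, en)"

# Combined pattern: the trailing [,)] is part of the match, so the comma
# is used up and cannot start the match for the next word.
COMBINED = re.compile(r"[(,][ \t]*([a-z0-9_]+)[ \t]*[,)]", re.IGNORECASE)

# Same idea with a lookahead: the trailing separator is checked but not
# consumed, so it is still available for the next match.
LOOKAHEAD = re.compile(r"[(,][ \t]*([a-z0-9_]+)[ \t]*(?=[,)])", re.IGNORECASE)

print(COMBINED.findall(ARGS))   # every second argument is skipped
print(LOOKAHEAD.findall(ARGS))  # all arguments are found
```

With the combined pattern only clk and sel are reported, which is why two separate search strings are used instead.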
A small change in 2 function strings results in finding also procedures and processes with opening round bracket on next line. ^p is for line ending characters (carriage return and/or line-feed).
Please note that ^p usually means DOS line terminator (carriage return + line-feed). But in the regular expression search strings for the function list view UltraEdit interprets ^p as definition for any line terminator type: DOS, UNIX or MAC.
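The changed function strings themselves are not shown here, so the following Python sketch is an assumption about what such a change could look like: line ending characters are simply allowed in addition to spaces/tabs between the procedure name and the opening round bracket. The set [ \t\r\n] stands in for what ^p covers in the function list (DOS, UNIX or MAC line terminators); the sample line and the name PROC_NEXT_LINE are invented.

```python
import re

FLAGS = re.IGNORECASE | re.MULTILINE

# Between the name and "(" allow spaces, tabs and any line ending
# characters (\r\n for DOS, \n for UNIX, \r for MAC).
PROC_NEXT_LINE = re.compile(
    r"^[ \t]*procedure[ \t]+([0-9a-z_]+)[ \t\r\n]*\([^)]+\)[ \t]+is", FLAGS)

# Hypothetical sample, built once per line terminator type.
for eol in ("\r\n", "\n", "\r"):
    sample = f"procedure write_reg{eol}    (addr : in integer) is{eol}begin{eol}"
    print(repr(eol), PROC_NEXT_LINE.findall(sample))
```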
Well, for your example with the round brackets inside the round brackets on the last argument, there is a solution. But if the round brackets were anywhere else in the arguments list and not on the last argument, it would be necessary to simplify the first UE regular expression string to just
Well, try this block for procedures in vhdl93.uew. But please note that there are limitations for finding variations of strings. A regular expression engine is not an interpreter which uses program code with lots of if ... else ... conditions to interpret a source code. As you can see I needed to expand the first search string for procedure arguments to exclude substrings of strings found by second regular expression for arguments with round brackets. The question for me is: Is it really important to see also (5 downto 0) in the function list?
I know that regular expressions have limitations, but as I don't know this tool very well, I can't say what is possible and what is not.
I tried your block, and it seems to work.
As for your question about whether to show (5 downto 0) or not: it's not really important, but it can be helpful in some cases.