There is no difference in PHP function list behavior between 32-bit and 64-bit UltraEdit.
But I could reproduce the issue with your configuration and found the cause and a workaround.
There are two issues:
- UTF-8 is configured as Default encoding (for new files and file open when auto-detect fails) at Advanced - Settings or Configuration - File handling - Encoding. As a result, the file oxid.php, which contains only ASCII characters and has no BOM, is loaded as a UTF-8 encoded file. Auto-detection of the encoding fails here because a file containing only ASCII characters and no BOM can be interpreted as either UTF-8 or ANSI encoded. The PHP file would be interpreted as ANSI encoded if it contained at least one non-ASCII character with a code value greater than 0x7F, like an ANSI encoded German umlaut.
The default for this configuration setting is ANSI, in which case this file is interpreted as Windows-1252 encoded according to the Windows region and language settings on my Windows computers.
- The Perl regular expression engine fails to find any string in Unicode files with the regular expressions in the standard wordfile php.uew that contain \x7f-\xff. This is really unexpected, as the file does not contain any non-ASCII character. This is the real issue causing an empty function list for all PHP files that are UTF-8 or UTF-16 encoded.
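The ambiguity behind the first issue can be illustrated with a small Python sketch. It is not how UltraEdit detects encodings; it only assumes that "ANSI" means Windows-1252 (as on my computers) and checks which of the two encodings can decode a given byte sequence. The function name possible_encodings and the sample contents are made up for this illustration.

```python
def possible_encodings(data):
    """Return the candidate encodings that can decode the bytes without error."""
    candidates = []
    for name in ("utf-8", "cp1252"):
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        candidates.append(name)
    return candidates


# Hypothetical file contents: ASCII only, no BOM.
ascii_only = b"<?php function oxid() { return 1; }"

# Hypothetical file contents with German umlauts stored as Windows-1252 bytes.
with_umlaut = "<?php // Gr\u00fc\u00dfe".encode("cp1252")

print(possible_encodings(ascii_only))   # both encodings are possible
print(possible_encodings(with_umlaut))  # only Windows-1252 is possible
```

For the ASCII-only content both encodings decode to the same text, so the configured default encoding decides. A single ANSI encoded umlaut byte is not valid UTF-8 and removes the ambiguity.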
The quick solution is to download the attached ZIP file and extract the included
php.uew into the directory
%APPDATA%\IDMComp\UltraEdit\wordfiles, overwriting the already existing file.
This wordfile contains
\x{007f}-\x{00ff} instead of
\x7f-\xff in all regular expressions containing this character range definition. With this change the Perl regular expression search works for ANSI
and Unicode encoded PHP files. I have also made some minor improvements to other Perl regular expressions.
UTF-8 selected for
Default encoding (for new files and file open when auto-detect fails) can be kept because your configuration always saves UTF-8 encoded files without BOM. So there is no difference in the binary representation of the characters for PHP files containing only ASCII characters, and you obviously prefer UTF-8 encoding anyhow.
I will report the issue to IDM support by email.
What is unclear to me is the basic syntax description for PHP labels (function names, variable names, ...).
The PHP manual for user-defined functions explains:
Function names follow the same rules as other labels in PHP. A valid function name starts with a letter or underscore, followed by any number of letters, numbers, or underscores. As a regular expression, it would be expressed thus: [a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*.
Is this regular expression meant for the byte stream read by the PHP interpreter, or for characters with the character encoding taken into account?
A function name like
OmegaΩ in a UTF-8 encoded file would be encoded with the bytes 0x4F 0x6D 0x65 0x67 0x61 0xE2 0x84 0xA6 (assuming Ω is the ohm sign U+2126; the Greek capital letter omega U+03A9 would be encoded as 0xCE 0xA9). So the case-sensitive regular expression
[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* matches this function name when run on the byte stream. But the same regular expression does not match
OmegaΩ when run on a character stream, and neither does the case-insensitive
[a-z_\x{007f}-\x{00ff}][0-9a-z_\x{007f}-\x{00ff}]* used in
php.uew in the attached ZIP file.
I think the right expression for
php.uew would be
[a-z_\x{007f}-\x{fffd}][0-9a-z_\x{007f}-\x{fffd}]* to match any Unicode character with a code value in the range U+007F to U+FFFD. I know this is more or less theoretical because most labels in PHP files most likely use only ASCII characters with a code value lower than 0x7F. However, if we change the Perl regular expressions in
php.uew now, we should change them to the correct expression so that unusual function and variable names are matched as well.
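The difference between matching on bytes and matching on characters can be tried out with the following Python sketch. It uses Python's re module, not the Perl regular expression engine of UltraEdit, so it only illustrates the principle. It assumes that the last character of the name is the ohm sign U+2126, and it spells out a-zA-Z instead of using a case-insensitive flag. The constant names are made up for this illustration.

```python
import re

# Byte-oriented pattern as quoted from the PHP manual.
BYTE_LABEL = re.compile(rb"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")

# Character-oriented patterns with the upper bound U+00FF and U+FFFD.
CHAR_LABEL_00FF = re.compile(r"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")
CHAR_LABEL_FFFD = re.compile(r"[a-zA-Z_\x7f-\ufffd][a-zA-Z0-9_\x7f-\ufffd]*")

# "Omega" followed by the ohm sign U+2126 (assumption, see above).
NAME = "Omega\u2126"
NAME_BYTES = NAME.encode("utf-8")

# The UTF-8 bytes of the name in hexadecimal notation.
print(NAME_BYTES.hex(" "))  # 4f 6d 65 67 61 e2 84 a6

# Every byte of the multi-byte character is in the range 0x7F to 0xFF.
print(BYTE_LABEL.fullmatch(NAME_BYTES) is not None)   # True

# U+2126 is greater than U+00FF, so the whole name is not matched.
print(CHAR_LABEL_00FF.fullmatch(NAME) is not None)    # False

# U+2126 is within U+007F to U+FFFD, so the whole name is matched.
print(CHAR_LABEL_FFFD.fullmatch(NAME) is not None)    # True
```

So the range from the PHP manual accepts the name only when applied to bytes, while a character-based search needs the larger upper bound to accept it.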
Could a PHP programmer test if a function name
OmegaΩ in a UTF-8 encoded PHP file really works by writing a function with that name which outputs something when called?
PS: I have also added the function name
mysqli_query to color group 6, as the name of this built-in function, available since PHP 5.0, was missing in the wordfile.
Update 1: I modified
php.uew once more and replaced all occurrences of
\x{00ff} by
\x{fffd}.
Update 2: The attached wordfile
php.uew was once more updated on 2019-05-08 with three additional functions added to color group 6 and with some words moved to other color groups.
Update 3: The wordfile
php.uew in the attached ZIP file is installed by default with UltraEdit for Windows since v26.10.0.72 and UEStudio since v19.10.0.46 in the subdirectory
wordfiles in the program files directory of UltraEdit/UEStudio. When default paths are used, this updated
php.uew just has to be copied manually after an update or upgrade of UltraEdit/UEStudio from
- %ProgramFiles%\IDM Computer Solutions\UltraEdit\wordfiles
or
%ProgramFiles(x86)%\IDM Computer Solutions\UltraEdit\wordfiles
or
%ProgramFiles%\IDM Computer Solutions\UEStudio\wordfiles
or
%ProgramFiles(x86)%\IDM Computer Solutions\UEStudio\wordfiles
to
- %APPDATA%\IDMComp\UltraEdit\wordfiles
or
%APPDATA%\IDMComp\UEStudio\wordfiles
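
The manual copy can also be scripted. The following Python sketch only assumes the default directories listed above. The function names candidate_sources, user_wordfiles_dir and copy_wordfile are made up for this illustration and are not part of UltraEdit or UEStudio.

```python
import os
import shutil
from pathlib import Path


def candidate_sources(product="UltraEdit"):
    """Return the default wordfiles directories in the program files directories."""
    dirs = []
    for var in ("ProgramFiles", "ProgramFiles(x86)"):
        base = os.environ.get(var)
        if base:
            dirs.append(Path(base) / "IDM Computer Solutions" / product / "wordfiles")
    return dirs


def user_wordfiles_dir(product="UltraEdit"):
    """Return the default wordfiles directory of the user, or None without APPDATA."""
    appdata = os.environ.get("APPDATA")
    if not appdata:
        return None
    return Path(appdata) / "IDMComp" / product / "wordfiles"


def copy_wordfile(source_dirs, dest_dir, name="php.uew"):
    """Copy the wordfile from the first source directory containing it.

    Returns the path of the copy, or None if no source directory has the file.
    An already existing file in the destination directory is overwritten.
    """
    for src_dir in source_dirs:
        src = Path(src_dir) / name
        if src.is_file():
            dest = Path(dest_dir)
            dest.mkdir(parents=True, exist_ok=True)
            return Path(shutil.copy2(src, dest / name))
    return None
```

On Windows with default paths the call would be copy_wordfile(candidate_sources("UltraEdit"), user_wordfiles_dir("UltraEdit")), or with "UEStudio" instead of "UltraEdit" for UEStudio. UltraEdit/UEStudio should not be running during the copy, which is my assumption and not something verified here.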