
Help Needed: Script Running Slow with Large Files

Post #1, Nov 12, 2025 (Advanced User)

Hello everyone,

I've developed a JavaScript script for HTML/XML validation, and while the core logic is fast, I'm facing a significant performance bottleneck when reading large files or processing files in batch. I'm hoping to get some advice on a more efficient approach.

My Current (Slow) Method

Currently, I read file content by loading the file into the active document and using `selectAll()`:

```javascript
// This is the part of the script that is slow
UltraEdit.activeDocument.selectAll();
var fileContent = UltraEdit.activeDocument.selection;

// ... validation logic using fileContent runs here ...
```

This method works, but it struggles under these conditions:
- Processing larger files
- Running batch validation on many files (100+ files)

In those cases the script takes minutes instead of seconds.

Script Overview
To provide context, my script performs the following tasks:
- Validates HTML/XML files for errors
- Checks table structures, links, and references  
- Generates detailed error reports
- Can process single files or entire folders

What I Need Help With:

1. Is there a faster way to read file content? The `selectAll()` method seems to be the bottleneck.

2. Can I read files without opening them in the editor? When processing 100 files, opening and closing each one takes a lot of time.

3. Any tips for handling large files better? Some files I work with are 5MB+.

Important Note: UEStudio vs. UltraEdit Behavior

I've discovered a critical difference in behavior between UEStudio and UltraEdit:
- Works: The script runs correctly in UEStudio v12.20.0.1004.
- Fails: The script is non-functional in UltraEdit v27.00.0.22.

This suggests there might be a difference in the scripting engines or available objects between the two applications that could be related to my issue.

The validation logic itself runs fast; the bottleneck is purely in how I'm getting the data into the script. I'm looking for any method or trick to read files faster in UltraEdit's JavaScript. I have attached my script for your review.

Any advice on alternative functions or a better overall strategy for this task would be greatly appreciated.

Thank you
Attachment: Validator_HTML_XML.zip (32.22 KiB)

Post #2, Nov 12, 2025 (Grand Master)

1. Is there a faster way to read file content? The `selectAll()` method seems to be the bottleneck.

The selectAll() scripting command is the only scripting command that selects the entire contents of a file so that it can be loaded as a string into the memory of the built-in JavaScript engine for processing. You could try whether the following code reduces the script's execution time when running it on multiple files in a folder.

```javascript
// At the top of function main, select user clipboard 9:
UltraEdit.selectClipboard(9);

// Replace these lines:
//   UltraEdit.activeDocument.selectAll();
//   var fileContent = UltraEdit.activeDocument.selection;

// with these lines:
UltraEdit.activeDocument.selectAll();
UltraEdit.activeDocument.copy();
var fileContent = UltraEdit.clipboardContent;
```

That is perhaps faster, especially in old versions of UEStudio. I have never run a performance test comparing these two methods of loading the file contents into a JavaScript String object.

Note: UltraEdit.clipboardContent is a JavaScript String object like fileContent. It is therefore possible to use UltraEdit.clipboardContent directly wherever fileContent is currently used in function main.

Calling UltraEdit.activeDocument.top() to discard the selection after the file contents have been loaded into the memory of the JavaScript engine is definitely counterproductive here, as it causes an extra update of the document window and the status bar. That command should be removed from the script.

For the single file option, it would make sense to get the current line number and column number before using UltraEdit.activeDocument.selectAll() and to move the caret back to the original position before exiting the script.
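
The caret handling for the single file option could be sketched as follows. This is a minimal sketch, not code from the attached script: the function name `readAllKeepingCaret` and passing the `UltraEdit` object in as parameter `ue` are assumptions, made so the logic can also be tried with a stub outside the editor. The column correction for versions without `activeDocumentIdx` follows the version difference described further down in this post.

```javascript
// Hypothetical helper: load the whole active document into a string and
// put the caret back where it was before selectAll().
function readAllKeepingCaret(ue) {
  var doc = ue.activeDocument;
  var line = doc.currentLineNum;
  var column = doc.currentColumnNum;
  // Versions without activeDocumentIdx report the column as 0 to n-1,
  // but gotoLine always expects 1 to n.
  if (typeof ue.activeDocumentIdx == "undefined") column++;

  doc.selectAll();
  var content = doc.selection;

  // Moving the caret with gotoLine should also discard the selection,
  // so no extra top() call is used here.
  doc.gotoLine(line, column);
  return content;
}
```

Inside UltraEdit this would be called as `var fileContent = readAllKeepingCaret(UltraEdit);`.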

2. Can I read files without opening them in the editor? When processing 100 files, opening and closing each one takes a lot of time.

The only scripting command capable of processing files without opening them in the GUI, and therefore without applying all text editing features, is UltraEdit.frInFiles.replace().
But as far as I can see from the script code, the Replace in Files scripting command cannot be used in this case to reformat all the HTML files.

Writing a large string into a file with something like UltraEdit.activeDocument.write(html.join('\n')) is certainly slow in older versions of UE/UES. In the past I used the following workaround for a much faster write operation:

```javascript
// Uses user clipboard 9 as selected at the top of function main.
UltraEdit.clipboardContent = html.join('\n');
UltraEdit.activeDocument.paste();

// In this case, the last lines of function main should be:
UltraEdit.clearClipboard();
UltraEdit.selectClipboard(0);
```

3. Any tips for handling large files better? Some files I work with are 5 MB+.

When processing a file with an UltraEdit script, it is in general more efficient to avoid updates of the document window or the entire application window (status bar, document map view, etc.) as much as possible during script execution.

It is also useful to disable all features that UltraEdit usually applies on opening an HTML file, such as syntax highlighting, all features depending on syntax highlighting like code folding, counting all lines for the line number display, etc. All UltraEdit features that parse the entire file contents before displaying the first part of the opened file are counterproductive when processing a folder with hundreds of files. Newer versions of UE/UES have moved some of these parsing routines into threads so that dozens of files open more quickly. Most users working with many files open at once have the document windows maximized, so the parsing for code folding and the line counting can be done in the background after all files have been opened, while just a part of the active file is shown in the application window.

That would require running UES or UE from the command line with a special INI file, specified with the appropriate command line option, in which all features not needed for processing multiple files are disabled. Other command line options, like forcing the use of a new instance and automatically executing the script, would also be needed when launching UE/UES just to process all HTML files in a folder (tree).

Updates of the output window during script execution are also counterproductive. Every graphical window update takes several milliseconds, which add up to many seconds of total script execution time.
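
One way to keep output window updates out of the processing loop is to collect the messages in memory and write them only after all files are done. The sketch below assumes a hypothetical helper `flushReport` that receives the `UltraEdit` object as parameter `ue`. Hiding the output window while writing is an assumption about what reduces repaints, not a measured result.

```javascript
// Hypothetical helper: write all collected messages after processing has
// finished instead of updating the output window for every error found.
function flushReport(ue, messages) {
  var out = ue.outputWindow;
  out.showStatus = false;  // no script status information in the output
  out.showWindow(false);   // assumption: a hidden window repaints less
  out.clear();
  for (var i = 0; i < messages.length; i++) {
    out.write(messages[i]);
  }
  out.showWindow(true);    // show the finished report once
}
```

During validation the script would only do `messages.push(...)` and call `flushReport(UltraEdit, messages)` once at the end.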

Script interpreters like Node.js or PowerShell process files much faster because of not showing anything in a graphical window during script execution, perhaps with exception of short text information written into a console window. The Windows console window is extremely optimized by Microsoft's console developers for fast updates. There are some very interesting Microsoft developer blogs which explain the efforts made by the Microsoft developers to make the console window really as fast as possible.
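
For comparison, reading a folder of files outside the editor could look like the following Node.js sketch. It only covers getting the file contents into strings. The function name `readHtmlFiles`, the `.html`/`.xml` filter and the assumption that the files are UTF-8 encoded are illustrative and not taken from the attached script.

```javascript
// Node.js sketch (not UltraEdit scripting): read every .html/.xml file
// of one folder into memory without opening anything in an editor.
var fs = require("fs");
var path = require("path");

function readHtmlFiles(folder) {
  var result = {};
  var names = fs.readdirSync(folder);
  for (var i = 0; i < names.length; i++) {
    var ext = path.extname(names[i]).toLowerCase();
    if (ext !== ".html" && ext !== ".xml") continue;
    var fullPath = path.join(folder, names[i]);
    if (!fs.statSync(fullPath).isFile()) continue;  // skip subfolders
    // Assumption: the files are UTF-8 encoded.
    result[fullPath] = fs.readFileSync(fullPath, "utf8");
  }
  return result;
}
```

The returned object maps each full file path to its content, so existing validation functions that work on a string could be applied to each value.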

UEStudio vs. UltraEdit behavior

There have been several changes in behavior of the UE/UES scripting commands between UEStudio v12.20.0.1004 and UltraEdit v27.00.0.22.

One example is the UltraEdit document property currentColumnNum, which reports the first column in a line as 0 in some versions and as 1 in others, while the command gotoLine always requires column numbers in the range 1 to n. For that reason I use the following in my publicly available scripts:

```javascript
var nColumnNumber = UltraEdit.activeDocument.currentColumnNum;
if (typeof(UltraEdit.activeDocumentIdx) == "undefined") nColumnNumber++;
```

In UE/UES versions that do not support activeDocumentIdx, the property currentColumnNum has the value 0 to n-1 for the columns in a line, while gotoLine and other commands accepting a column number must nevertheless be called with a column number in the range 1 to n.

The caret position after making a selection differs in some use cases between older and newer versions of UE/UES. In some use cases the caret is at the beginning of the selection and in others at the end of the selection, depending on the version of UE/UES and how the selection was created.

There are some more small differences which must be considered on writing a script for use by old and new versions of UE/UES.

The function padLeft has a third parameter named char, which is a reserved word in JavaScript. Rename that function parameter to something else that is not a reserved word.
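
A minimal sketch of such a rename is shown below. Only the fact that the third parameter is called char comes from the script. The first two parameter names, the replacement name `padChar` and the function body are assumptions for illustration, as the real padLeft may differ.

```javascript
// Hypothetical version of padLeft with the third parameter renamed
// from the reserved word char to padChar.
function padLeft(value, length, padChar) {
  var text = String(value);
  // Assumption: pad with a single character, defaulting to a space.
  var fill = (padChar === undefined || padChar === "") ? " " : String(padChar).charAt(0);
  while (text.length < length) {
    text = fill + text;
  }
  return text;
}
```

With this sketch, `padLeft(7, 3, "0")` returns `"007"`, and a value that is already long enough is returned unchanged.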

It is impossible for me to tell you why the script fails with UltraEdit v27.00.0.22 without having an HTML example file which can be successfully processed by the script with UEStudio v12.20.0.1004 but not with UltraEdit v27.00.0.22.

Post #3, Nov 13, 2025 (Advanced User)

Hello Mofi,

Thank you so much for taking your valuable time to help with this issue! Your suggestions worked perfectly and made a significant difference.

What I implemented based on your advice:

1. Clipboard method (`selectClipboard(9)` with `copy()/clipboardContent`) - This improved performance noticeably, especially for batch processing
2. Removed unnecessary `top()` commands - Eliminated the UI updates that were slowing things down  
3. Fixed the `padLeft` parameter - Changed 'char' to 'padChar' to avoid the reserved word issue
4. Optimized file handling - Now closing files immediately after copying to clipboard, then validating from memory
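
A minimal sketch of point 4 (close each file immediately, then validate from memory) follows. The helper names `validateFile` and `validateBatch` and a `validate` callback returning an array of error strings are hypothetical, and the attached script may structure this differently. The `UltraEdit` object is passed in as `ue` so the loop can be tried with a stub outside the editor.

```javascript
// Hypothetical batch step: open one file, copy its contents into the
// active clipboard, close it right away and validate the string.
function validateFile(ue, filePath, validate) {
  ue.open(filePath);
  ue.activeDocument.selectAll();
  ue.activeDocument.copy();
  var fileContent = ue.clipboardContent;
  // Second argument 2: close without saving.
  ue.closeFile(ue.activeDocument.path, 2);
  return validate(fileContent, filePath);
}

function validateBatch(ue, filePaths, validate) {
  var errors = [];
  ue.selectClipboard(9);   // user clipboard 9, Windows clipboard untouched
  for (var i = 0; i < filePaths.length; i++) {
    errors = errors.concat(validateFile(ue, filePaths[i], validate));
  }
  ue.clearClipboard();
  ue.selectClipboard(0);   // back to the Windows clipboard
  return errors;
}
```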

Results:
- The script now works perfectly in both UEStudio v12.20.0.1004 and UltraEdit v27.00.0.22 ✓
- Batch processing is approximately 30-40% faster
- Single file validation also shows improved performance
- UI freezing during large batch operations has been reduced compared to before

I've attached the updated script (v6.5) which incorporates all your suggestions. The performance improvement is remarkable - batch processing 100+ files that used to take 5+ minutes now completes in about 3 minutes.

One small question:
When clicking on errors in the Output Window (formatted as `'filepath(line:column): ERROR: message'`), UltraEdit only navigates to the start of the error line, not to the exact column position. Is there a way to make the output window navigation jump to the exact line AND column position? Or is this a limitation of the output window formatting?

Thank you again for your expert guidance. Your explanation about the differences between UEStudio and UltraEdit versions, especially regarding the scripting engine behavior, was very educational. The clipboard method tip was particularly valuable - I wouldn't have thought of that approach myself.

If you notice any areas for further improvement in the attached script, I'd greatly appreciate your suggestions.

Best regards and thanks again for your help!
Attachment: Validator_HTML_XML_v6.5.zip (33.38 KiB)

Post #4, Nov 13, 2025 (Grand Master)

Samir, good job, congratulations. The modifications are well done. 👏

UltraEdit/UEStudio 2025.1.0.20 supports parsing an output window line for both line and column number if the line has the format:

C:\Path\to\file.html(line/column): text

But this requires at least UltraEdit v28.10.0.98 or UEStudio v21.00.0.90, which introduced this enhancement: the first number is interpreted as the line number and the second number after the slash as the column number, and the caret is positioned in the file accordingly. Earlier versions like UltraEdit v28.10.0.26 or UEStudio v21.00.0.66 parse an output window line only for the line number, according to my tests with various versions of UltraEdit and UEStudio.
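
Building an output line in that format could look like this small sketch. The function name `formatOutputLine` and its parameters are hypothetical, and the example path and message are placeholders.

```javascript
// Hypothetical formatter for the "file(line/column): text" pattern.
function formatOutputLine(filePath, line, column, message) {
  return filePath + "(" + line + "/" + column + "): " + message;
}

// Example: the resulting string would be passed to the output window.
var exampleLine = formatOutputLine("C:\\Path\\to\\file.html", 12, 5, "ERROR: message");
```

Compared with the `(line:column)` format used so far, only the separator between the two numbers changes from a colon to a slash.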