Editing very large files

Newbie (4)

    May 28, 2009#1

    Wow... I thought this little exercise would be no problem for "ULTRA"edit. I am using UE 15.
    I had to retrieve approx 1 million rows of data (only approx 120 char per record{row}) into a flat file (or files) and email it. It was data by month for a whole year.
    Ugh!... but I had to get it done, bosses didn't want to hear why this was stupid.

    Data output from SQL2008 created a 65MB csv file.
    Zipping the file created an 11MB file. We have a 10MB limit on sending email attachments. So, I thought simple... Suck a copy of the file into Ultraedit, find the halfway mark in the file (the first 6 month mark), zap the remainder of the file (the last 6 months of data) and save the truncated file under a new file name. This would give me a file for half the year that would zip to less than 10MB. I would do this again for the other half of the data and then send the zipped files in two separate emails. (Not pretty, but I was in a just "get'er done" mode with a deadline.)
    The big 65MB file loaded into Ultraedit. It was pretty slow but it did all load. (Running XP SP3 with 3GB.) I did a seek to the bottom of the file (to see the last rows in the file) and everything appeared normal. Approx 1.1 million rows of data were in the file I had loaded in Ultraedit. I found the halfway mark, selected the 500,000 rows from there to the bottom of the file and deleted them. I then saved this adjusted file under a new file name. The new file was created and its size seemed reasonable (approx 1/2 the number of bytes of the original file). I closed UltraEdit and then re-opened the original 65MB file to repeat the process, this time creating a file with the 2nd half of the year.
    Well... the 65MB original file (which was never saved or adjusted in UE) now only had half of the data in it. The first half of the year was all that was in the file. I closed UE and tried again... same thing. I opened the original file in Notepad (that's a long wait) and it only had half the data. So I pulled a fresh copy of the data from the SQL server into a new CSV file and tried the whole process all over again. Same results.

    Conclusion... Ultraedit seems to really choke on files this big.
    It doesn't make sense to me.
    I seem to remember being able to do this type of work with V11. I may reload an old copy of UE to test it out.
    Has anyone else had problems editing very large files?

    Grand Master

      May 28, 2009#2

      As documented in the help of UltraEdit and also in the power tip "Large file text editor and viewer", opening a large file without a temp file means that you are editing the original file directly, so any changes are permanent even if you never save the file.

      Open your 65 MB file with a temp file and you can work as normal.

      Or select the lower block, copy it to a new file and save it. Select the upper block, copy it to a new file and save the second file too. In other words, do not modify the original file.
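The same "copy both halves, never touch the original" idea can also be done outside the editor. The following is only a minimal Python sketch: the function name `split_in_half` is made up for illustration, it assumes one record per line, and it halves by line count rather than at the exact six-month mark.

```python
from pathlib import Path


def split_in_half(source: Path, first: Path, second: Path) -> tuple[int, int]:
    """Copy the lines of `source` into two new files; `source` is only read."""
    # First pass: count the lines so the halfway mark is known.
    with source.open("rb") as src:
        total = sum(1 for _ in src)
    half = total // 2

    # Second pass: stream every line into the first or the second output file.
    with source.open("rb") as src, first.open("wb") as out_a, second.open("wb") as out_b:
        for index, line in enumerate(src):
            (out_a if index < half else out_b).write(line)

    # Number of lines written to each output file.
    return half, total - half
```

Because the source is opened read-only and streamed line by line, a 65 MB file never has to fit into an editor buffer, and the original stays intact whatever happens to the two copies.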

      BTW: WinRAR has a special compression algorithm for text files. Prior to v4.20 it was used by default on text files like a CSV file when using good or best compression ratio, and it can surely compress your CSV file below the 10 MB limit. You can make a self-extracting RAR file if the receiver can't unpack RAR archives. In later versions of WinRAR the special text compression feature must be explicitly enabled when creating a RAR archive.
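Whether a given file fits under an attachment limit with a stronger compressor can be checked before sending anything. The sketch below is an assumption-laden illustration: WinRAR's text compression is not available in Python's standard library, so `zlib`, `bz2` and `lzma` serve as stand-ins, and the 10 MB limit is taken here as 10 * 1024 * 1024 bytes.

```python
import bz2
import lzma
import zlib

# Assumed interpretation of the 10 MB attachment limit from the first post.
LIMIT_BYTES = 10 * 1024 * 1024


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the compressed size in bytes for three standard-library codecs."""
    return {
        "zlib": len(zlib.compress(data, 9)),  # deflate, the method ZIP normally uses
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


def fits_limit(data: bytes, limit: int = LIMIT_BYTES) -> dict[str, bool]:
    """Report per codec whether the compressed data stays within the limit."""
    return {name: size <= limit for name, size in compressed_sizes(data).items()}
```

Calling `fits_limit(open("export.csv", "rb").read())` (the file name is hypothetical) shows at a glance whether any of these codecs gets the whole file under the limit, in which case no splitting is needed at all.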
      Best regards from an UC/UE/UES for Windows user from Austria

      Newbie (4)

        May 28, 2009#3

        Well, I checked my settings and I did have "Use temporary file for editing (normal operation)" selected under "File Handling - Temporary Files". I tried the whole file editing process again, but this time I looked for the 'temp' files that should have been created when I opened the large csv file. The temp file was nowhere to be found. I had the same results as before when editing and saving the large csv file. (The original file was modified.) So I went into UE, went to "File Handling - Temporary Files" and selected "Open without temp file and NO PROMPT". I clicked Apply and closed UE. I then opened UE, went back to "File Handling - Temporary Files" and changed it back to "Use temporary file for editing (normal operation)". Clicked Apply. Closed UE, opened it again and then tried the whole process again. Things worked, but in a weird way. As I said, this time it did work (it created a backup "temp" copy of my BIG file in my temp file location) and it let me edit the BIG file and save the edited (reduced in size) version under a new name while the original file remained intact.
        Here is what is weird... while I was opening the original big csv file I was presented with a message box that said... "The file you're opening is 62.51 MB in size. This exceeds the limit specified in configuration for temporary file threshold and may reduce editing performance of this file. Please choose one of the options below." The three radio button options were "1) Disable temporary files when opening large files (greater than 47MB) for this edit session only", "2) Disable temporary files when opening large files (greater than 47MB) permanently and don't ask again" or "3) Don't change anything; continue to use temporary files". This last option was pre-selected.

        Here's my question(s).... (and comments)

        It's strange that the "Use temporary file for editing (normal operation)" option was not working, even though it was selected.
        It seems that I had to 'reset' it to make it work properly.
        I upgraded to UE15 (from UE14) a couple of months ago (I've had UE since V9). Could the upgrade, installed over my previous install, have caused this break?
        I haven't done any 'large' file editing in that time.

        Next, where does the 47MB value come from? I have looked at all my configuration settings and I don't see anything that would set that particular value. Nothing is mentioned in Help about this 47MB limit either. If this is a built-in warning / constraint, why can't I find any reference to it?

        IMHO, these are still unexpected results from UE.

        Grand Master

          May 29, 2009#4

          Configuration - File Handling - Temporary Files - Threshold for above (KB) specifies the limit in kilobytes for opening a file without temporary file according to the selection made above this setting. The help for this configuration dialog contains further information. See also the power tip Working with temporary files in UltraEdit/UEStudio.

          And the blog post "UltraEdit v14.20 Just Released..." says:

          Enhanced large file detection and handling
          - Prompt for each large file opened to open without temp files, line numbers, etc.

          The prompt is shown only when the file to load has a file size equal to or larger than the specified threshold limit and the first option is chosen. 47 MB seems to be the default now. I have used 4096 KB for years, so I don't really know what the default is in UE v15.00 or was in previous versions. My UE v15.00 showed 4 MB in the dialog when I tried it, in line with my threshold setting.
          Best regards from an UC/UE/UES for Windows user from Austria

          Newbie (4)

            May 29, 2009#5

            Thanks Mofi... great info...

            I guess that power tip link explains the 47MB value.
            Interesting though... the message box shown in the power tip (for UE14) is very different from the one I see in UE15

            The UE14 power tip message box is phrased as "Do you want to disable the creation of temporary files..." with you having to select 'No' to keep the temporary file option turned on (otherwise they are turned off for the file you are about to open). The new message box has a default value of "Don't change anything; continue to use temporary files" which is a smarter approach.

            My configuration preference is to just leave the "Use temporary file for editing (normal operation)" option set.
            And not bother with a threshold value.

            I don't click on files to open them (in any application) unless I have a clear understanding of what I am opening and what I expect to do with it. In other words, I don't open really big files by mistake or without knowing the impact.

            As it turns out, the only hiccup in all this was my UE15 not recognizing my pre-existing configuration of "Use temporary file for editing (normal operation)" until I reset it.

            Grand Master

              Re: Editing very large files - settings explained

              May 29, 2009#6

              That question is also normal. Let me explain the 3 options of Advanced - Configuration - File Handling - Temporary Files and the resulting behavior.
              • All files with a file size lower than the specified threshold limit are always opened with a temporary file.
                So for small files it does not matter which of the 3 options is selected in the dialog. Therefore I suggest not entering too large a value for the threshold.
              • Option Use temporary file for editing (normal operation) is selected (default). If a file is opened with a file size equal to or greater than the threshold value, the following dialog is shown:

                large_file_message_1.png (3.82KiB)
                Option dialog shown before a large file is opened for the first time.

                If the first option is used, this file and all other files with a file size greater than the threshold value are opened without creating a temporary file and without showing any message box until UltraEdit is closed (or the user changes the configuration setting for using temp files). This option does not change anything in the configuration (INI file or registry).

                If the second option is used, this file and all other files with a file size greater than the threshold value are opened without creating a temporary file and without showing any message box. Additionally, this option selects in the configuration the option Open file without temp file but NO Prompt, which is saved into the INI file or registry when UltraEdit is closed.

                If the third option is used, the file is opened with a temporary file. Nothing is changed in the configuration (INI file or registry). Additionally, all other large files are also opened with a temporary file in this instance of UltraEdit until UltraEdit is closed (or the user changes the configuration setting for using temp files).
              • Option Open file without temp file but prompt for each file is selected. If a file is opened with a file size equal to or greater than the threshold value, the following dialog is shown:

                large_file_message_2.png (4.63KiB)
                Warning with question before opening a large file without temp file.

                The user now has the choice to open the file without a temporary file by selecting Yes or with a temp file by selecting No. Additionally, the message warns the user that all changes are permanent and applied to the file directly. Regardless of which option the user selects now, the same message box is shown again the next time the user opens a file with a size equal to or larger than the threshold limit. That's the reason why I prefer this configuration setting. I'm always warned that if I continue with Yes, I have to take care what I do because Undo is not possible.
              • Option Open file without temp file but NO Prompt is selected. All files with a file size equal to or greater than the threshold are always opened without a temp file. No warning is ever shown that the user is now editing a large file directly and that all changes are therefore permanent. This is definitely a setting only for users who work daily with large files, or who run scripts or macros daily on large files in background tasks and additionally use a separate INI file just for this task with this configuration, while the standard INI for UltraEdit uses one of the 2 other options.
              I hope that this explanation is now clear enough for everyone.
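The decision described in the list above can be condensed into a small model. This is a Python sketch of the documented behavior only, not UltraEdit's actual code: the names `TempFileOption` and `uses_temp_file` are invented for illustration, and the model ignores that an answer given in the first dialog is remembered for the rest of the session (or written to the configuration).

```python
from enum import Enum


class TempFileOption(Enum):
    """The three settings under File Handling - Temporary Files."""
    USE_TEMP_FILE = "Use temporary file for editing (normal operation)"
    NO_TEMP_WITH_PROMPT = "Open file without temp file but prompt for each file"
    NO_TEMP_NO_PROMPT = "Open file without temp file but NO Prompt"


def uses_temp_file(size_kb: int, threshold_kb: int, option: TempFileOption,
                   user_disables_temp: bool = False) -> bool:
    """Return True if the file would be opened through a temporary file.

    `user_disables_temp` stands for the answer in the dialog: option 1 or 2
    in the first dialog, or Yes in the second one.
    """
    if size_kb < threshold_kb:
        return True  # files below the threshold always get a temp file
    if option is TempFileOption.NO_TEMP_NO_PROMPT:
        return False  # large file is edited in place, no warning shown
    # Both remaining settings show a dialog; the user's answer decides.
    return not user_disables_temp
```

With a threshold of 4096 KB, a 64,000 KB file under the NO Prompt setting gives `False`: every edit goes straight to the original file on disk, which matches the data loss described in the first post.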
              Best regards from an UC/UE/UES for Windows user from Austria

              Basic User (13)

                Jun 01, 2009#7

                Hi Gregsoc,

                I had the same thing on 14.20 and reported this to Support: after long trials, I could trace back that I had a corrupted configuration file.
                • with that config, UE would simply open large files without temp files without asking (despite the default settings being used). So it would simply overwrite the file on disk without warning.
                • with a new, clean config, UE would behave as described by Mofi above
                I could never trace back why I had a corrupted config. I just used the various dialogs to change config settings and apparently, along the way, corrupted my config files. So there must be a hidden bug someplace.

                I find it also interesting to note that IDM chooses to bring up the "Temporary file handling" dialog box if the user uses the default (Use temporary files). This is VISTA behavior: Are you sure you want to move the mouse cursor to the left?

                Support explained that they did this because they wanted to make customers aware that there are faster ways of dealing with large files, i.e., with temp files disabled. So they are targeting people who haven't read the manual. As a punishment, the dialog box does not explain the consequences of opening without temp files. :twisted:

                As if they were concerned about performance: If one opens a file read only, the temp file behavior is the same as if one opens a file for editing. I've yet to understand what the purpose of a temp file is in the case of read only access. :lol:

                Newbie (4)

                  Jun 01, 2009#8

                  Hi Stefan_E

                  Good information...

                  I suspect my configuration file was somehow corrupt... because I had to re-apply the "Use temporary file for editing (normal operation)" option to make it work as expected. That's my only point.

                  To me, it seems that Mofi is convinced there is some "pure and justified logic" to how IDM has set this configuration up, but I agree with you, the prompting does seem backwards.

                  I set the option to 'always use temporary files'... so just do it (and don't bug me again!!). And I am also not going to try and pick some 'magical' threshold number that will determine if I should or shouldn't use temp files. My use of UE is wide and varied, so that number would just become meaningless.

                  That's interesting about the 'read only' file open feature still trying to use temp files. They must have missed that part of their review on 'performance options' for users. If they are going to have a dialog box, maybe they should make one of the options "open as read only", which would forgo the temp files and just let the user troll through the file. That's mostly what I use UE for when I do open really large files. I am not usually interested in editing these files. It's mostly for just finding some pattern or string (e.g. in big log files or data dumps from external systems).
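For the pure "find a pattern in a big log or data dump" case, a read-only streaming search needs no temp file at all. The following is a minimal Python sketch; `find_lines` is a hypothetical helper, and UTF-8 with replacement of undecodable bytes is an assumption about the file's encoding.

```python
import re
from pathlib import Path
from typing import Iterator


def find_lines(path: Path, pattern: str) -> Iterator[tuple[int, str]]:
    """Yield (line number, line text) for every line matching `pattern`.

    The file is opened read-only and read one line at a time, so memory use
    stays small regardless of the file size.
    """
    regex = re.compile(pattern)
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                yield number, line.rstrip("\n")
```

Since the function is a generator, results appear as soon as they are found, and the search can be stopped early without scanning the rest of the file.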

                  Thanks again...

                  Newbie (1)

                    Jul 22, 2009#9

                    The performance in 15.10 seems to be slower than 14.x by a large margin.

                    I am using temp files (as I did previously), but now it takes about 40-45 sec to open a 77MB text file. With 14.x it was noticeably faster.

                    It takes twice as long on a 400 MB file... I've worked with larger files in previous versions of UE and it was never this slow.

                    Any options I can use to speed it up? I don't want to permanently edit the file (too much chance of an OOPS).

                    I've set in the config to use large memory buffers (I am running 64-bit XP with 4GB ram on Dual Core)...

                    Grand Master

                      Jul 22, 2009#10

                      Copying the large file to the temp folder should need the same time as before.

                      In general, using no syntax highlighting for such large files is always good. No syntax highlighting also means no code folding, no function string parsing and, in case of an XML, XSL or XHTML file, no XML parsing. Well, function list parsing has been done in a background thread since UE v14.20.0. Syntax highlighting has also been done in a background thread since UE v15.00, and folding processing has been done in its own thread since UE v15.10.

                      You may play with the XML Manager options of UE v15.10 if your file is an XML, XSL or XHTML file and syntax highlighting for this file is enabled.

                      Enabling the configuration setting Disable line numbers at Configuration - Editor Display - Miscellaneous is in general also useful when working with large files. But in your case, with temporary files in use, it should not make a difference, especially if you had also enabled this setting in UE v14.x.

                      I have not analyzed whether UE v14.20.1 is faster at opening large files than UE v15.10.0. Maybe something other than the new GUI, which is definitely slower than the old one, makes working with large files slower. But the new GUI should not have a big influence on the opening time of a large file. However, lots of entries in the INI file of UltraEdit from previous versions are of no use after updating to UE v15.10. So you might want to back up your configuration files, for example by using Advanced - Export Settings or Advanced - Backup/Restore User Customizations (or simply by renaming the directory %appdata%\IDMComp\UltraEdit\ while no UE is running), then delete uedit32.ini and start UltraEdit, which creates a new uedit32.ini with the default settings.
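The manual "back up, then delete uedit32.ini" route can be scripted so that the backup step is never skipped. This is only a sketch: `backup_and_reset` is a hypothetical helper, the backup naming scheme is an assumption, and it must be run while no UltraEdit instance is open.

```python
import shutil
import time
from pathlib import Path


def backup_and_reset(config_dir: Path, ini_name: str = "uedit32.ini") -> Path:
    """Copy `config_dir` to a timestamped sibling folder, then remove the INI.

    Returns the path of the backup folder. The application is expected to
    recreate the INI with default settings on its next start.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_dir = config_dir.with_name(f"{config_dir.name}-backup-{stamp}")
    shutil.copytree(config_dir, backup_dir)  # full copy first, nothing deleted yet

    ini_file = config_dir / ini_name
    if ini_file.exists():
        ini_file.unlink()  # only the INI is removed; other files stay in place
    return backup_dir
```

On Windows the directory from the post would be passed as `Path(os.environ["APPDATA"]) / "IDMComp" / "UltraEdit"` (this needs `import os`). If the fresh configuration does not help, copying the INI back from the backup folder restores the old state.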
                      Best regards from an UC/UE/UES for Windows user from Austria

                      Newbie (1)

                        Jul 22, 2009#11

                        I was just considering updating from 10.10 to 15+, but after reading this and several other recent posts, there doesn't seem to be much point to do so. I occasionally work with large 100+ mb data files to clean up for data analysis programs, and 10.10 may still be a better tool.

                        Advanced User (80)

                          Jul 23, 2009#12

                          I'm still at version 14 and the function list being in a separate thread makes a huge difference for large files. Of course that assumes the large files are in a known language and not plain text, in which case there is no function list. So 14 and 15 should be much faster than 10 for files that aren't just plain text.

                          Newbie (3)

                            May 18, 2010#13

                            Hi Guys,

                            I am reusing this old discussion because I observed that v16.00.0.1040 (latest) seems not to have a separate thread for the function list. On some (complex) C files I am using, I need to wait for the function list to finish parsing before I am able to edit the file. As the parsing occurs each time I move to a different file, it is kind of annoying. I was using v12.10b on my previous machine, which was much faster.

                            Do you get the same behavior? Any tips?

                            Thanks a lot

                            Grand Master

                              May 18, 2010#14

                              I reported to IDM by email that scanning an example file (HTML) for the function strings takes much longer in UE v16.00 (about 6 seconds) than in v15.20 (less than 1 second), using the same wordfile and the same computer for both. This performance problem was confirmed by IDM support and forwarded to the developers. However, the fact that the function list scan takes longer and now shows its progress with a progress bar at the bottom of the function list does not prevent me from editing the file. There should also be a caching of the found function strings which prevents rescanning a file when just re-activating it after looking at or editing another file.

                              Can you really not edit your C file while UE scans for the function strings or do you just always wait with editing until the functions are displayed in the function list?
                              Best regards from an UC/UE/UES for Windows user from Austria

                              Newbie (3)

                                May 18, 2010#15

                                Thanks Mofi for the prompt reply.

                                No, I can't edit the file while the scan is occurring. I get the hourglass pointer, and the entire UE is locked until the list is populated.

                                I can also report that the slider on the right of the function list window starts small in the top right corner and slowly grows while the list is getting populated. Not sure you understand my poor English, but anyway, this is slow and annoying ;-)

                                Thanks
