Merge / combine / copy the contents of text or CSV files into a new file

Grand Master

    Feb 19, 2009 #1


    Do you sometimes need to copy the contents of several text files in a folder together into one new file?

    Well, that can be done with the Windows copy command, using something like:

    copy /A C:\Temp\*.txt C:\Temp\AllTextContent.txt

    But there are some disadvantages to using the copy command:
    • The text files found are concatenated in the order in which the file system returns them, which is not necessarily the alphabetical order often expected. NTFS always returns the list of files matching a wildcard pattern like *.txt in alphabetical order, but other file systems like FAT16, FAT32 and exFAT do not.
    • The copy command does not check whether the last line of a text file ends with a line ending, and does not add a missing one before appending the content of the next text file.
    • There is no way to insert text between the contents of the text files.
    • The copy command appends the SUBSTITUTE control character (hexadecimal code 1A) to the end of the file, which can be avoided by using /B instead of /A.
    • The copy command does not do any conversion if the text files have different encodings (Unicode variants, ASCII/ANSI) or different line endings (DOS, UNIX).
    The main advantages of the copy command: it is very fast and easy to use.
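
    To illustrate the first three points outside of UltraEdit, here is a minimal sketch for Node.js. It is not the code of MergeTextFiles.js. The function names mergeTextFiles and withFinalLineEnding are made up for this example, and it assumes that all files are UTF-8 encoded, so it does no encoding conversion at all.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Appends a line ending when the last line of the text lacks one.
function withFinalLineEnding(text, eol) {
    if (text.length === 0 || text.endsWith("\n") || text.endsWith("\r")) {
        return text;
    }
    return text + eol;
}

// Merges all *.txt files of a folder in alphabetical order into one string.
// separator is optional text inserted between the contents of two files.
// Assumption: every file is UTF-8 encoded.
function mergeTextFiles(folder, separator = "", eol = "\r\n") {
    const names = fs.readdirSync(folder)
        .filter((name) => name.toLowerCase().endsWith(".txt"))
        .sort((a, b) => a.localeCompare(b, undefined, { sensitivity: "base" }));
    return names
        .map((name) => fs.readFileSync(path.join(folder, name), "utf8"))
        .map((text) => withFinalLineEnding(text, eol))
        .join(separator);
}

// Small demonstration in a temporary folder.
const demoFolder = fs.mkdtempSync(path.join(os.tmpdir(), "merge-demo-"));
fs.writeFileSync(path.join(demoFolder, "b.txt"), "second\r\n");
fs.writeFileSync(path.join(demoFolder, "a.txt"), "first"); // no final line ending
const merged = mergeTextFiles(demoFolder, "-----\r\n");
console.log(JSON.stringify(merged));
fs.rmSync(demoFolder, { recursive: true, force: true });
```

    The file a.txt is merged first although it was written last, it gets the missing line ending, and the separator line ends up between the two contents.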

    The script MergeTextFiles.js, available for download on the Macros & Scripts page, does not have the limitations of the copy command. It copies the contents of all files in a specified folder that match a file specification into a new file, taking into account a missing line ending on the last line as well as various file formats and encodings.

    The encoding and the line ending format of the new file are defined in the configuration dialog of UltraEdit / UEStudio with the settings:

    Advanced - Settings or Configuration - File Handling - Encoding - Default encoding (for new files and file open when auto-detect fails)
    Advanced - Settings or Configuration - File Handling - DOS/Unix/Mac handling - Default file type for new files

    But it is also possible to define the encoding and line ending format in the script itself by inserting the appropriate conversion commands after the command UltraEdit.newFile();
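
    As a rough sketch of what that could look like: the lines below create the new file and then convert it to UTF-8 with Unix line endings. The wrapper function createOutputFile is made up for this example and is not part of the script. ASCIIToUTF8 and dosToUnix are conversion commands of the UltraEdit scripting interface as far as I know them, so please check the names against the scripting help of your version before relying on them.

```javascript
// Sketch only: creates the output file and forces a fixed encoding and
// line ending format instead of using the configured defaults.
// ue is the UltraEdit scripting object, passed in so that the function
// can also be tried outside of UltraEdit with a stand-in object.
function createOutputFile(ue) {
    ue.newFile();
    // Assumed conversion commands; pick the ones matching the wanted format.
    ue.activeDocument.ASCIIToUTF8();   // new file becomes a UTF-8 file
    ue.activeDocument.dosToUnix();     // new file gets Unix line endings
}

// Inside UltraEdit / UEStudio the global UltraEdit object exists.
if (typeof UltraEdit !== "undefined") {
    createOutputFile(UltraEdit);
}
```

    In the real script only the conversion lines would be inserted directly below the existing UltraEdit.newFile(); line.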

    The format of the line endings in the text files does not matter as long as the following configuration setting, which is enabled by default, stays enabled:

    Advanced - Settings or Configuration - File Handling - Conversions - On paste convert line endings to destination type (Unix/Mac/DOS)

    Okay, but maybe you don't want to copy the contents of files in a folder into a new file. Maybe you want to copy the contents of all open files into a new file, in the order you see on the open file tabs bar of UltraEdit / UEStudio. The script MergeTextFiles can be used in this case too, because that is also supported.

    And how to use it?

    That's quite simple. Download it from the server and store it in your scripts folder. Open the script file and append at the bottom or insert at the top the code lines of the function GetListOfFiles, which can also be downloaded from the Macros & Scripts page. This function is only needed for merging the contents of files found in a folder. It is not required for copying the contents of all open files into a new file. Save the script file and add it to your script list. That's it. It is ready for use.

    On execution you will be asked for the main options of the script. Other options, like listing the processed files in the output window or defining the text to insert above every text copied from a file, must be enabled by modifying the script. The script is completely commented, so it should be no problem even for script newbies to make personal adaptations.

    Please read the comment at top of the script file for further details.

    While MergeTextFiles is for general text files, there is a second script named MergeCSVFiles.js (Macros & Scripts) which is specialized in merging the content of CSV files (comma-separated values). It merges CSV files without a header line as well as CSV files with a header line. Optionally it can also insert the file name of the source CSV file as additional text at the start or end of every copied data row.
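
    To show the idea of merging CSV files with a header line, here is a small stand-alone sketch in plain JavaScript. It is not the code of MergeCSVFiles.js. The function mergeCsv, its parameters and the column title "File" are made up for this example. It splits the data simply on line endings, so it does not handle quoted fields containing line breaks, and it inserts the file name only at the start of a row.

```javascript
// files: array of { name, text } objects, one per CSV file.
// hasHeader: true if every file starts with the same header line.
// addFileName: true to insert the source file name as first column.
function mergeCsv(files, hasHeader, addFileName) {
    const rows = [];
    let headerWritten = false;
    for (const file of files) {
        // Accept DOS, UNIX and MAC line endings and skip empty lines.
        const lines = file.text.split(/\r\n|\n|\r/).filter((line) => line.length > 0);
        if (hasHeader && lines.length > 0) {
            const header = lines.shift();
            if (!headerWritten) {
                // Keep the header line only once for the merged file.
                rows.push(addFileName ? "File," + header : header);
                headerWritten = true;
            }
        }
        for (const line of lines) {
            rows.push(addFileName ? file.name + "," + line : line);
        }
    }
    return rows.length > 0 ? rows.join("\r\n") + "\r\n" : "";
}

const demoFiles = [
    { name: "jan.csv", text: "id,amount\r\n1,10\r\n" },
    { name: "feb.csv", text: "id,amount\n2,20" }
];
console.log(mergeCsv(demoFiles, true, true));
```

    The second file uses UNIX line endings and has no line ending on its last line. Both are normalized in the merged result, and the header line of the second file is dropped.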

    This script requires the function GetListOfFiles (Macros & Scripts), which must be appended at the bottom or inserted at the top of the script. Further, if the option to insert the source file names is used, it requires the function GetNameOfFile or GetFileName, available in the script file FileNameFunctions.js, which can also be downloaded from the Macros & Scripts page.

    Please read the comment at top of the script file for further details.

    Both script files were last updated on 2016-02-18.

    If you have found a mistake in the script code, a typo or grammar mistake, or something is not clear, or if there is a problem with one of the two scripts, then please post a reply here and let me know. Suggestions for further enhancements are also welcome.

    If somebody translates one or both scripts into a different language (= localizes the script), it would be great to post a reply here and attach the script as a ZIP file. I would look into it, test whether the localized script still works, and send it to IDM to upload it to their server with a reference on the Macros & Scripts page. Please insert at the top of the localized script, below the copyright line, a line with your (alias) name for the translation, something like Translated by: xxx in your language.

    The line and block comments can be removed from both script files by running a replace all (from top of file) searching with a Perl regular expression for ^ *//.+[\r\n]+|^ */\*[\s\S]+?\*/[\r\n]+| +//.+$ and using an empty replace string. The first of the three alternatives in this OR expression matches entire lines containing only a line comment, the second matches block comments, and the third matches line comments to the right of code. Removing the comments makes frequently used scripts a bit more efficient because the JavaScript interpreter has to process fewer characters and lines.
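
    The effect of this expression can be tried on a small sample, for example with the following sketch in plain JavaScript. The function name stripComments and the sample lines are made up for this example. The expression is the same as above, only written as a JavaScript regular expression literal with escaped slashes and the flags g (replace all) and m (^ and $ match at every line).

```javascript
// The three alternatives: full line comments, block comments,
// and line comments to the right of code.
const commentPattern = /^ *\/\/.+[\r\n]+|^ *\/\*[\s\S]+?\*\/[\r\n]+| +\/\/.+$/gm;

function stripComments(source) {
    return source.replace(commentPattern, "");
}

const sample = [
    "// Full line comment",
    "/* Block comment",
    "   over two lines */",
    "var nCount = 0;   // trailing comment",
    "nCount++;",
    ""
].join("\r\n");

console.log(stripComments(sample));
```

    Only the two code lines remain. Be aware that the third alternative matches any // that follows at least one space, so a string in the code containing such a sequence would be cut off too.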