It looks like the file is a CSV file using unusually a colon as value separator with two completely useless empty data columns at end of each row. The date/time values are formatted wrong as the time contains also colons and therefore the entire date/time value must be enclosed in
" to be formatted valid in this CSV file.
UltraEdit has built-in the command
Convert to fixed-width to scan a CSV file with a separator defined by the user for maximum length of all data values within all data columns in the CSV and then convert the CSV file to a fixed-width file. But this command can be used only after enclosing the date/time values in
" (and removing the useless colons at the end).
A case-sensitive Perl regular expression replace all with the search string
\b(\d+/
\d+/
\d+ [0-1][0-9]:
[0-5][0-9]:
[0-5][0-9] [AP
]M
):: and the replace string
"\1" can be used to enclose the date/time value in double quotes and to remove the two colons at end of each row.
I wrote already once a script version of command
Convert to fixed-width, see
Converting CSV data to fixed width data? For that reason it was very easy for me to write the script for this task as there needed to be modified the character code value of the separator at top and insert the case-sensitive Perl regular expression replace all to enclose the date/time values in double quotes and remove the two colons at end of each line before using the already existing code.
Code: Select all
// Character code value of separator. Use 0x2C for comma, 0x3B for
// semicolon, 0x7C for vertical bar, or 0x09 for a horizontal tab.
var nSeparatorCode = 0x3A; // Colon is unusually the value separator.
// Array of comma separated list of field widths. For example [5,10,20]
// means first data column should have a width of 5 characters, second
// of 10 characters and third of 20 characters. The number of values in
// first line of CSV file determine the number of data columns. If the
// field widths array is empty or contains less field width numbers than
// values in first line of CSV file, the script automatically determines
// the required width for each data column depending on longest value in
// each data column on which the field width is not defined here at top.
var anFieldWidths = [];
// Number of spaces to add to automatic determined width of a data column.
// For example if longest value in a data column has 22 characters and
// the value for this variable is 3, the field width of this data column
// is 25. The variable value must be 0 or greater. This variable is used
// only for field widths automatic determined by the script, not for the
// widths defined in the array above. The number of additional spaces is
// never added to last field width.
var nAutoWidthSpaces = 3;
// Character as replacement for a carriage return and character as
// replacement for a line-feed in a field value. Each string can be also
// an empty string to remove all carriage returns and line-feeds from
// field values. Also a string with more than one character is possible.
var sCarriageReturn = "«";
var sLineFeed = "»";
// Write the reformatted lines into a new file or overwrite active file.
var bNewFile = true;
if (UltraEdit.document.length > 0) // Is any file opened?
{
// Define environment for this script.
UltraEdit.insertMode();
if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
// Run first a case-sensitive Perl regular expression replace all
// searching for date/time with one more date appended and split
// that data value up into two data values if existing at all.
UltraEdit.activeDocument.top();
UltraEdit.perlReOn();
UltraEdit.activeDocument.findReplace.mode=0;
UltraEdit.activeDocument.findReplace.matchCase=true;
UltraEdit.activeDocument.findReplace.matchWord=false;
UltraEdit.activeDocument.findReplace.regExp=true;
UltraEdit.activeDocument.findReplace.searchDown=true;
if (typeof(UltraEdit.activeDocument.findReplace.searchInColumn) == "boolean")
{
UltraEdit.activeDocument.findReplace.searchInColumn=false;
}
UltraEdit.activeDocument.findReplace.preserveCase=false;
UltraEdit.activeDocument.findReplace.replaceAll=true;
UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
UltraEdit.activeDocument.findReplace.replace('"?\\b(\\d+/\\d+/\\d+ [0-1][0-9]:[0-5][0-9]:[0-5][0-9] [AP]M)[\\t ":]*?(\\d+/\\d+/\\d+):*$', '"\\1":\\2');
if (UltraEdit.activeDocument.isFound())
{
UltraEdit.activeDocument.top();
// Enclose the date/time values in double quotes and
// replace all colons at end of each line by one colon.
UltraEdit.activeDocument.findReplace.replace('"?\\b(\\d+/\\d+/\\d+ [0-1][0-9]:[0-5][0-9]:[0-5][0-9] [AP]M)"?:+$', '"\\1":');
}
else // Different data with no additional date column.
{
// Enclose the date/time values in double quotes and
// remove the last two colons at end of each line.
UltraEdit.activeDocument.findReplace.replace('"?\\b(\\d+/\\d+/\\d+ [0-1][0-9]:[0-5][0-9]:[0-5][0-9] [AP]M)"?::', '"\\1"');
}
UltraEdit.activeDocument.selectAll();
if (UltraEdit.activeDocument.isSel())
{
// Parse the selected CSV data character by character to support also
// 1 or more delimiters as well as newline characters within a double
// quoted value. Additionally just store in memory on which indices in
// CSV character stream the data values start and end instead of using
// array of strings to avoid copying the strings in memory as much as
// possible for more efficient processing of the CSV data.
var bShowOutput = false; // Show output window after finishing conversion.
var bNewDataRow = false; // For detection of a new data row.
var bWithinQuotes = false; // For detection of double quoted values.
var anValueIndices = [0,0]; // Start and end indices of value in stream.
var anRowValueIndices = []; // Array of value indices pairs in a data row.
var anAllValueIndices = []; // Array of all data row arrays in entire file.
var nMultiLineValue = 0; // Is there any multi-line value in CSV file?
var nCharIndex = 0;
do
{
var nCharCode = UltraEdit.activeDocument.selection.charCodeAt(nCharIndex);
if (!bWithinQuotes)
{
// Is the current character the separator?
if (nCharCode == nSeparatorCode)
{
anValueIndices[1] = nCharIndex;
anRowValueIndices.push(anValueIndices);
anValueIndices = [nCharIndex+1,nCharIndex+1];
}
// Is the current character a carriage return or a line-feed?
else if ((nCharCode == 0x0D) || (nCharCode == 0x0A))
{
if (!bNewDataRow)
{
bNewDataRow = true;
anValueIndices[1] = nCharIndex;
anRowValueIndices.push(anValueIndices);
anAllValueIndices.push(anRowValueIndices);
anRowValueIndices = [];
}
anValueIndices = [nCharIndex+1,nCharIndex+1];
continue;
}
else if (nCharCode == 0x22)
{
bWithinQuotes = true;
}
bNewDataRow = false;
}
else // Everything enclosed in double quotes must be interpreted
{ // as part of data value including delimiter and newline
// character until next double quote character is found.
if (nCharCode == 0x22)
{
bWithinQuotes = false;
}
else if (nCharCode == 0x0D)
{
nMultiLineValue |= 1;
}
else if (nCharCode == 0x0A)
{
nMultiLineValue |= 2;
}
}
}
while (++nCharIndex < UltraEdit.activeDocument.selection.length);
// Does the file not end with newline characters?
if (!bNewDataRow)
{
anValueIndices[1] = nCharIndex;
anRowValueIndices.push(anValueIndices);
anAllValueIndices.push(anRowValueIndices);
}
// UltraEdit.outputWindow.showWindow(true);
// UltraEdit.outputWindow.write("\nDump of array anAllValueIndices:\n");
// var_dump(anAllValueIndices);
// for (var nLine = 0; nLine < anAllValueIndices.length; nLine++)
// {
// var sPluralS = (anAllValueIndices[nLine].length != 1) ? "s" : " ";
// var sLineData = "The " + anAllValueIndices[nLine].length.toString(10)+
// " value indices pair" + sPluralS + " of data row " +
// (nLine+1).toString(10) + ": ";
// for (var nPair = 0; nPair < anAllValueIndices[nLine].length; nPair++)
// {
// if (nPair != 0) sLineData += ", ";
// sLineData += anAllValueIndices[nLine][nPair][0].toString(10);
// sLineData += '-';
// sLineData += anAllValueIndices[nLine][nPair][1].toString(10);
// }
// UltraEdit.outputWindow.write(sLineData);
// }
// UltraEdit.outputWindow.write("");
// The number of values in first row is the number of data columns.
// The number of rows in file is the number of data rows in output.
var nColCount = anAllValueIndices[0].length;
var nRowCount = anAllValueIndices.length;
var nFieldWidth;
var sFieldValue;
var nCol = anFieldWidths.length;
// Extend the array with field width numbers with automatic determined
// field widths if the array with the number of field widths contains
// less values than data values in first row of CSV file.
while (nCol < nColCount)
{
nFieldWidth = 0;
for (nRow = 0; nRow < nRowCount; nRow++)
{
if (nCol < anAllValueIndices[nRow].length)
{
sFieldValue = UltraEdit.activeDocument.selection.substring(anAllValueIndices[nRow][nCol][0],anAllValueIndices[nRow][nCol][1]);
sFieldValue = sFieldValue.replace(/^\"(.*)\"$/,"$1");
sFieldValue = sFieldValue.replace(/\"\"/g,"");
if (nMultiLineValue) // Is there any multi-line value in file?
{
sFieldValue = sFieldValue.replace(/\r/g,sCarriageReturn);
sFieldValue = sFieldValue.replace(/\n/g,sLineFeed);
}
if (sFieldValue.length > nFieldWidth)
{
nFieldWidth = sFieldValue.length;
}
}
}
if (nCol < (nColCount-1)) // Add number of additional spaces
{ // except on last data column.
nFieldWidth += nAutoWidthSpaces;
}
anFieldWidths.push(nFieldWidth);
nCol++;
bShowOutput = true;
UltraEdit.outputWindow.write("Field width for data column " + nCol +
" determined with " + nFieldWidth + " characters.");
}
// Build string with just spaces for each data column for fast
// extending with spaces later each value to required length.
var asFieldWidths = new Array(nColCount);
for (nCol = 0; nCol < nColCount; nCol++)
{
var sSpaces = "";
nFieldWidth = anFieldWidths[nCol];
while (nFieldWidth)
{
sSpaces += ' ';
nFieldWidth--;
}
asFieldWidths[nCol] = sSpaces;
}
// Convert the lines in CSV file to fixed width lines now.
var sValues = (nColCount != 1) ? " values." : " value.";
var asRows = new Array(nRowCount);
for (nRow = 0; nRow < nRowCount; nRow++)
{
// Verify the number of values in that row.
var nValueCount = nColCount;
if (nValueCount > anAllValueIndices[nRow].length)
{
bShowOutput = true;
nValueCount = anAllValueIndices[nRow].length;
UltraEdit.outputWindow.write("Warning: Line " + (nRow+1) +
" has just " + nValueCount +
" instead of " + nColCount + sValues);
}
else if (nValueCount < anAllValueIndices[nRow].length)
{
bShowOutput = true;
UltraEdit.outputWindow.write("Warning: Line " + (nRow+1) + " has " +
anAllValueIndices[nRow].length +
" instead of " + nColCount + sValues);
}
var sDataRow = "";
for (nCol = 0; nCol < nValueCount; nCol++)
{
nFieldWidth = anFieldWidths[nCol];
sFieldValue = UltraEdit.activeDocument.selection.substring(anAllValueIndices[nRow][nCol][0],anAllValueIndices[nRow][nCol][1]);
sFieldValue = sFieldValue.replace(/^\"(.*)\"$/,"$1");
sFieldValue = sFieldValue.replace(/\"\"/g,"");
if (nMultiLineValue) // Is there any multi-line value in file?
{
var sReplacedValue;
if (nMultiLineValue & 1)
{
sReplacedValue = sFieldValue.replace(/\r/g,sCarriageReturn);
if (sReplacedValue != sFieldValue)
{
bShowOutput = true;
UltraEdit.outputWindow.write("Replaced in data row " + (nRow+1) + " in value " +
(nCol+1) + " at least one carriage return by '" +
sCarriageReturn + "'.");
sFieldValue = sReplacedValue;
}
}
if (nMultiLineValue & 2)
{
sReplacedValue = sFieldValue.replace(/\n/g,sLineFeed);
if (sReplacedValue != sFieldValue)
{
bShowOutput = true;
UltraEdit.outputWindow.write("Replaced in data row " + (nRow+1) + " in value " +
(nCol+1) + " at least one line-feed by '" +
sLineFeed + "'.");
sFieldValue = sReplacedValue;
}
}
}
if (sFieldValue.length > nFieldWidth)
{
bShowOutput = true;
UltraEdit.outputWindow.write("Warning: Truncated in data row " + (nRow+1) +
" value " + (nCol+1) + " from " +
sFieldValue.length + " to " +
nFieldWidth + " character" +
((nFieldWidth != 1) ? "s." : "."));
sFieldValue = (nFieldWidth > 0) ? sFieldValue.substr(0,nFieldWidth) : "";
}
if (sFieldValue.length < nFieldWidth)
{
sFieldValue += asFieldWidths[nCol].substr(sFieldValue.length);
}
sDataRow += sFieldValue;
}
while (nCol < nColCount)
{
// Add spaces for each value missing in that row.
sDataRow += asFieldWidths[nCol];
nCol++;
}
// Add this row to the array of reformatted rows.
asRows[nRow] = sDataRow;
}
// Determine line terminator type from active file.
var sLineTerm;
if (UltraEdit.activeDocument.lineTerminator < 1) sLineTerm = "\r\n";
else if (UltraEdit.activeDocument.lineTerminator == 1) sLineTerm = "\n";
else sLineTerm = "\r";
if (bNewFile) // Write the reformatted lines into a new file?
{
// Replace the file extension of active file by ".txt".
// This quick and dirty solution works only for files
// which really have a file extension.
var sNewFileName = UltraEdit.activeDocument.path.replace(/\.[^.\\]+$/,".txt");
// Append .txt if new file name is identical to active file name.
if (sNewFileName == UltraEdit.activeDocument.path) sNewFileName += ".txt";
// Cancel the selection in input file.
UltraEdit.activeDocument.top();
// Create a new file with same line ending type as input file.
UltraEdit.newFile();
UltraEdit.activeDocument.unixMacToDos();
if (sLineTerm == "\n") UltraEdit.activeDocument.dosToUnix();
else if (sLineTerm == "\r") UltraEdit.activeDocument.dosToMac();
UltraEdit.activeDocument.write(asRows.join(sLineTerm)+sLineTerm);
UltraEdit.saveAs(sNewFileName);
UltraEdit.closeFile(UltraEdit.activeDocument.path,2);
}
else
{
// Overwrite the entire data in input file.
UltraEdit.activeDocument.write(asRows.join(sLineTerm)+sLineTerm);
// Close file with saving it.
UltraEdit.closeFile(UltraEdit.activeDocument.path,1);
}
if (bShowOutput) UltraEdit.outputWindow.showWindow(true);
}
}
Please read the comments at top of the script if you want to make easily changes on the output.
The active file on execution of the script is first modified by the Perl regular expression replace all and then a new file is created with the data converted to fixed-width. The new file is saved with same name as active file into the same directory as active file with file extension replaced by .txt respectively with .txt appended if the active file has the file extension .txt. The new file with fixed-width data is closed automatically by the script and so just the active file remains opened with the modification made by the case-sensitive Perl regular expression replace all.