
How to split a string of text by character and get character counts/array?


Post 13:47 - 2 days ago #1

Hi, I'm new to UltraEdit. The systems people recommended this software to my team, but said they have no instructions for us, lol. I'm a finance person, but I deal with large raw employee data files that are staged and loaded into my employer's financial database. These files can contain in excess of 500K rows of data. We recently transitioned to a new database and are still working out system kinks. As a result, we frequently have to refer back to the raw data files to confirm that the data in the system matches what was received in the file. Each character or group of characters in a row means something and ends up being loaded into various tables in the database. I'm not sure if I'm using the correct terminology, but can UltraEdit split the text into characters and label each character with its position? I think this may be referred to as a character array or character index. Or is there an easy way to cut and paste an employee's data into UltraEdit and quickly see what number each character is?

For example, each employee may have 3 or 4 rows of data. (This is not the same format, just an example.) There are spaces at the end of some of the rows that need to be counted as characters, and the count needs to continue from one row to the next. So if the last character in row 1 is character 50, the first character in row 2 is character 51. In this example, characters 22-27 would translate to the employee's DOB.

Code:

12221DOEJOHN111234567102186CA900021789AG___
12222DOEJOHNBCBS23498HEALTHEMP02LEGAL245K98435___
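
The continuous count described above can be sketched outside UltraEdit in a few lines of Python. This is only an illustration, not an UltraEdit script. It assumes the two sample rows form one employee's record, that the trailing underscores stand in for the trailing spaces, and that line endings are not counted. The helper name `field` is made up for this example.

```python
# Sample rows copied from the post. The trailing underscores are assumed
# to be placeholders for trailing spaces and are counted as characters.
rows = [
    "12221DOEJOHN111234567102186CA900021789AG___",
    "12222DOEJOHNBCBS23498HEALTHEMP02LEGAL245K98435___",
]


def field(rows, first, last):
    """Return characters first..last (1-based, inclusive).

    The count runs continuously across rows; line endings are ignored.
    """
    record = "".join(rows)
    return record[first - 1:last]


print(field(rows, 22, 27))  # 102186 (the DOB positions from the example)
print(field(rows, 44, 48))  # 12222 (row 2 starts at position 44)
```

Joining the rows first keeps the position arithmetic in one place, so a field that spans two rows is handled the same way as any other.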
I've been playing around with some text-split formulas in Excel, and I think Excel might be the easier way to do this, but I still have not figured out how to continue the count on the next row. I'm still wondering whether this can be done with UltraEdit. It came highly recommended, so I'm assuming it must be able to do something similar.

So far I've tried a script that I found, but the output was still difficult to read.

For example, the output looks similar to this, where the number in parentheses is the character count.

Code:

1(1) 2(2) 2(3) 2(4) 1(5) D(6)
I was wondering whether there is any way to list it, or to put the character number above the character, similar to the screenshot provided.
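
One possible reading of "number above the character" is a ruler-style layout: a line of position numbers with the matching characters directly underneath, wrapped every few characters. The Python sketch below is a guess at that layout, since the screenshot is not reproduced here. The function name `ruler` and the `per_line` parameter are hypothetical, and line endings are again left out of the count.

```python
def ruler(rows, per_line=10):
    """Build a text block with position numbers above each character.

    Positions are 1-based and continue across rows.
    """
    record = "".join(rows)
    # Column width: enough digits for the largest position, plus one space.
    width = len(str(len(record))) + 1
    lines = []
    for start in range(0, len(record), per_line):
        chunk = record[start:start + per_line]
        numbers = "".join(
            str(start + i + 1).rjust(width) for i in range(len(chunk))
        )
        chars = "".join(ch.rjust(width) for ch in chunk)
        # A blank line separates each numbers/characters pair.
        lines.extend([numbers, chars, ""])
    return "\n".join(lines)


print(ruler(["12221D"], per_line=6))
```

For the six characters from the earlier output sample this prints the numbers 1 to 6 on one line and `1 2 2 2 1 D` aligned below them.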

Can anyone offer any advice? Is this something UltraEdit can do or is Excel the better option?
split_text_string_to_character_array_hu_c1b5a5d2def303df.webp (11.3KiB)


Post 20:47 - 2 days ago #2

I am 100% sure that an UltraEdit script can produce the data you want, as far as I understand the task description. But I do not yet understand what you really want and how the real data is structured. The small example is a bit too simplified for me to write the script for you.

The first posted example data line has 43 characters and the second line has 49 characters. The total number of characters for DOEJOHN is therefore 43 + 49 = 92. You wrote that there can be three or four data rows for each employee in the database file.
  1. What is the data structure in the real database file?
  2. How can the data rows that belong together as the data of one employee be identified?
  3. Where should the number of characters of each employee's data be written: into the source database file, into a new file, or into the output window of UltraEdit?
  4. How should the output data created by the script be structured?
    Should it contain only the character counts of the employees, or also other data from the source database file?
  5. Should the newline characters of the data rows belonging to one employee be included in the character count too?
    If the answer to this question is yes, what is the line ending type: DOS/Windows (carriage return + line-feed), Unix (just line-feed), or, improbably, Mac (just carriage return)?
    The automatically recognized line ending type is displayed in the status bar at the bottom of the UltraEdit main application window.
  6. How large is the source database file in MB with the 500K data rows?
    The size of the file is displayed by UltraEdit when the mouse pointer is positioned over the file tab of the opened file.
  7. Or should the UltraEdit script, on execution, just open a small dialog window to enter (or paste) the name of an employee, search for that name in the opened database file, and, if it is found, select those data rows and display a message box with the number of characters of the selected data of the found employee?
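
To show why question 5 matters, here is a small Python sketch (not an UltraEdit script) that reproduces the 43 + 49 = 92 count and shows how the total shifts once line endings are included. It assumes the two posted rows are the complete record and that every row, including the last one, ends with a line terminator. The names `EOL_LENGTH` and `record_length` are made up for this example.

```python
rows = [
    "12221DOEJOHN111234567102186CA900021789AG___",
    "12222DOEJOHNBCBS23498HEALTHEMP02LEGAL245K98435___",
]

# Number of characters per line terminator for each line ending type.
EOL_LENGTH = {"dos": 2, "unix": 1, "mac": 1}


def record_length(rows, line_ending=None):
    """Count the characters in rows.

    With line_ending set to "dos", "unix" or "mac", one terminator per
    row is added to the count; with None, terminators are ignored.
    """
    total = sum(len(row) for row in rows)
    if line_ending is not None:
        total += EOL_LENGTH[line_ending] * len(rows)
    return total


print([len(row) for row in rows])   # [43, 49]
print(record_length(rows))          # 92
print(record_length(rows, "dos"))   # 96
print(record_length(rows, "unix"))  # 94
```

With DOS/Windows line endings, every position after the first row would shift by two per preceding row, so the answer decides whether character 44 is the first character of row 2 or not.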