How to split a single file into multiple text files based on first string on each line?

Post #1, Jul 06, 2018 (Newbie)

Since I visit and search the UltraEdit forum so often, I need to bookmark the topics in a text file.
I prefer a txt file because it's very simple and easy to search and manage. My UE version is 25.10.0.32.

So here is my question.

Before:
Bookmark(unique number)_TITLE    URL

After:
Bookmark(unique number)_TITLE.txt

Example:

Before:

Code:

Bookmark0_forums.ultraedit.com index.html http://forums.ultraedit.com/
Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page1     http://forums.ultraedit.com/viewtopic.php?t=3609
Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page5 http://forums.ultraedit.com/viewtopic.php?t=3609
Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page2 http://forums.ultraedit.com/viewtopic.php?t=3609
Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page80000  http://forums.ultraedit.com/viewtopic.php?t=3609
Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page00     http://forums.ultraedit.com/viewtopic.php?t=3609
Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page010       http://forums.ultraedit.com/viewtopic.php?t=3609
Bookmark2_filtered-Bookmarks [1]          http://forums.ultraedit.com/copy-filtered-Bookmarks-to-clipboard-via-macro-t17765.html
Bookmark2_filtered Bookmarks (02)          http://forums.ultraedit.com/copy-filtered-Bookmarks-to-clipboard-via-macro-t17765.html
Bookmark310005_to clipboard_via_macro                   http://forums.ultraedit.com/copy-filtered-Bookmarks-to-clipboard-via-macro-t17765.html
Bookmark12_..(saved bookmark & added) how to remove dup!          http://forums.ultraedit.com/how-do-i-remove-duplicate-Bookmarks-t2282.html
Bookmark05_Find, replace, find in files, replace in files, regular expressions    http://forums.ultraedit.com/find-replace-regular-expressions-f8/
Bookmark00007_000003 next page via cliboard           https://forums.ultraedit.com/copy-filtered-BOOKMARKs-to-clipboard-via-macro-t17765.html
Bookmark00007_01 next page        https://forums.ultraedit.com/copy-filtered-BOOKMARKs-to-clipboard-via-macro-t17765.html
Bookmark00007_2 previous page https://forums.ultraedit.com/copy-filtered-BOOKMARKs-to-clipboard-via-macro-t17765.html
Bookmark010_question!(@board3) from                     https://forums.ultraedit.com/regular-expression-to-extract-url-strings-from-a-x-t17790.html
Bookmark010_extract url strings from    https://forums.ultraedit.com/regular-expression-to-extract-url-strings-from-a-x-t17790-s15.html
Bookmark010_answer cf links http://forums.ultraedit.com/regular-expression-to-extract-url-strings-from-a-x-t17790-s15.html

After:

So in this case 8 txt files should be created.
==> and *** each text file should contain all the URLs copied from the lines with the same number *** (and the title of the first of those lines should be the file name of each txt file)

Bookmark0_forums.ultraedit.com index.html.txt
Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page1.txt
Bookmark2_filtered-Bookmarks [1].txt
Bookmark310005_to clipboard_via_macro.txt
Bookmark12_..(saved bookmark & added) how to remove dup!.txt
Bookmark05_Find, replace, find in files, replace in files, regular expressions.txt
Bookmark00007_000003 next page via cliboard.txt
Bookmark010_question!(@board3) from.txt
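
The grouping asked for above can be sketched outside UltraEdit as well. This is only a minimal illustration in Python, not macro code, and it assumes that lines with the same "Bookmark(number)_" prefix always follow each other directly, as in the example list. The names `bookmark_key` and `group_bookmarks` are made up for this sketch.

```python
import re
from itertools import groupby

# "Bookmark", one or more digits, then an underscore at the start of a line.
PREFIX = re.compile(r"^(Bookmark\d+_)")

def bookmark_key(line):
    """Return the 'Bookmark<number>_' prefix of a line, or None if there is none."""
    match = PREFIX.match(line)
    return match.group(1) if match else None

def group_bookmarks(lines):
    """Collect runs of consecutive lines sharing the same prefix, in input order."""
    groups = []
    for key, block in groupby(lines, key=bookmark_key):
        if key is not None:  # skip lines without a bookmark prefix
            groups.append((key, list(block)))
    return groups

# Four lines taken from the example list above.
sample = [
    "Bookmark0_forums.ultraedit.com index.html http://forums.ultraedit.com/",
    "Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page1     http://forums.ultraedit.com/viewtopic.php?t=3609",
    "Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page5 http://forums.ultraedit.com/viewtopic.php?t=3609",
    "Bookmark010_answer cf links http://forums.ultraedit.com/regular-expression-to-extract-url-strings-from-a-x-t17790-s15.html",
]

for key, block in group_bookmarks(sample):
    print(key, len(block))
```

Because the underscore is part of the prefix, Bookmark0_, Bookmark01_ and Bookmark010_ are treated as three different groups.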

This is my poor, so-called "copied and pasted" macro,
with credit to all the users who asked questions in this forum :)

It isn't working.
It's funny how things never happen the way I expect them to. :)
I use "Bookmark" and the unique number followed by "_" to tell it where to break up the files.
Please help me! :)

Code:

Loop 0
Find MatchCase "^c"
EndSelect
IfNotFound
ExitLoop
EndIf
Key HOME
Find MatchCase RegExp "Bookmark+\d_"
Clipboard 9
Copy
EndSelect
Top
Find MatchCase "^c"
EndSelect
Key HOME
PerlReOn
Find MatchCase RegExp "(Bookmark+\d_).*\r?\n(?:\1.*\r?\n)*"
Copy
EndSelect
NewFile
Paste
Top
SelectLine
Copy
Top
Find MatchCase RegExp "^.*?(http|https)"
Replace All "\1"
Bottom
Paste
Key UP ARROW
SelectLine
UltraEditReOn
Find MatchCase RegExp SelectText "^{http^}^*$"
Replace All ""
Bottom
Key UP ARROW
SelectLine
Find MatchCase RegExp SelectText "[~^r^n0-9A-Za-z]+"
Replace All ""
EndSelect
Bottom
Key UP ARROW
StartSelect
Key END
Copy
EndSelect
DeleteLine
SaveAs "^c.txt"
CloseFile NoSave
ClearClipboard
Clipboard 8
UltraEditReOn
IfEof
ExitLoop
EndIf
EndLoop
ClearClipboard

Post #2, Jul 06, 2018 (Grand Master)

The macro code for this task is:

Code:

InsertMode
ColumnModeOff
HexOff
Bottom
IfColNumGt 1
InsertLine
EndIf
Top
PerlReOn
Clipboard 9
Loop 0
Find MatchCase RegExp "^(Bookmark\d+_).+(?:\r?\n)?(?:\1.+(?:\r?\n)?)*"
IfNotFound
ExitLoop
EndIf
Copy
NewFile
UnixMacToDos
Paste
Top
Find MatchCase RegExp "([\t ]+(?:http|ftp))"
Replace "\r\n\1"
Top
StartSelect
Key END
EndSelect
Copy
Find MatchCase RegExp SelectText "[\"*/:<>?\\\|]"
Replace All "_"
Clipboard 8
Copy
Clipboard 9
Paste
Delete
Clipboard 8
SaveAs "^c.txt"
CloseFile NoSave
ClearClipboard
Clipboard 9
EndSelect
Key HOME
EndLoop
ClearClipboard
Clipboard 0
Top

I added extra code to make sure the string used for the file name does not contain characters which are not allowed in file names; each of them is replaced with an underscore.
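
The file name clean-up can be shown in isolation. The following Python lines are only a sketch of the same idea, using the same set of characters as the character class in the macro's search string; `safe_file_name` is a name made up for this illustration, and the second input is a hypothetical title, not one from the list.

```python
import re

# The characters replaced by the macro: " * / : < > ? \ |
INVALID = re.compile(r'["*/:<>?\\|]')

def safe_file_name(title):
    """Replace each character not allowed in a file name with "_" and append .txt."""
    return INVALID.sub("_", title) + ".txt"

# A title from the example list: it contains none of the characters above.
print(safe_file_name("Bookmark12_..(saved bookmark & added) how to remove dup!"))

# A hypothetical title with a slash, a colon and a question mark.
print(safe_file_name("Bookmark9_what is a/b: test?"))
```

Characters such as `&`, `!`, `(` and `[` are left alone because they are valid in file names.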

Post #3, Jul 06, 2018 (Newbie)

Thank you very much. I really appreciate it. Your macro works flawlessly and fast,
and it's very different from what I thought it would be :)

By the way, while studying your macro I found the word "ftp" in the code. Does that mean this macro also works with URLs starting with ftp:// ?
Thanks again, have a wonderful time!

Post #4, Jul 06, 2018 (Grand Master)

I needed something to find out where the bookmark text ends and where the URL begins. So I decided that the occurrence of either http or ftp marks the beginning of the URL. This is very simple and not fail-safe if the bookmark text also contains http or ftp somewhere, but there seems to be nothing better, except using a more complex Perl regular expression to determine the beginning of a URL.
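
To illustrate both the rule and its weak spot, here is a small Python sketch (not macro code). It splits a line at the first run of spaces or tabs that is followed by http or ftp. The function name `split_bookmark` and the second input line are hypothetical and only serve the illustration.

```python
import re

# Spaces/tabs directly followed by "http" or "ftp" mark the start of the URL.
URL_START = re.compile(r"[\t ]+(?=(?:http|ftp))")

def split_bookmark(line):
    """Return (title, url); url is empty if no URL start is found."""
    parts = URL_START.split(line, maxsplit=1)
    title = parts[0]
    url = parts[1] if len(parts) > 1 else ""
    return title, url

# A line from the example list: the split happens where expected.
print(split_bookmark(
    "Bookmark05_Find, replace, find in files, replace in files, regular expressions    http://forums.ultraedit.com/find-replace-regular-expressions-f8/"
))

# Hypothetical line whose title contains the word "http": the split comes too early.
print(split_bookmark("Bookmark7_how to open http links   http://forums.ultraedit.com/"))
```

In the second call the title is cut off before "http links", which is the not fail-safe case described above.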

Post #5, Jul 06, 2018 (Newbie)

@Mofi Can this macro put only the URLs into the text files?

And is there any way to stop a macro while it is running?
I often end up with an error while testing my poor macros, and there's a nag screen with a "Cancel" button,
but sometimes I can't even click on "Cancel", especially when the macro is looping fast,
so I had to stop UE using the Windows Task Manager :)

Thanks, Mofi! I'm very scared of modifying your exclusive and amazing macro.
It feels like a sort of sin to me :)

Post #6, Jul 06, 2018 (Grand Master)

Here is the macro rewritten to have only the URLs in the text files, sorted alphabetically with duplicate lines removed.

Code:

InsertMode
ColumnModeOff
HexOff
Bottom
IfColNumGt 1
InsertLine
EndIf
Top
PerlReOn
Clipboard 9
Loop 0
Find MatchCase RegExp "^(Bookmark\d+_).+(?:\r?\n)?(?:\1.+(?:\r?\n)?)*"
IfNotFound
ExitLoop
EndIf
Copy
NewFile
UnixMacToDos
Paste
Top
Find MatchCase RegExp "[\t ]+((?:http|ftp)s?://)"
Replace "\r\n\1"
Top
StartSelect
Key END
EndSelect
Find MatchCase RegExp SelectText "[\"*/:<>?\\\|]"
Replace All "_"
Copy
DeleteLine
Find MatchCase RegExp "^(?:(?!(?:http|ftp)s?://).)+"
Replace All ""
SortAsc RemoveDup 1 -1 0 0 0 0 0 0
SaveAs "^c.txt"
CloseFile NoSave
EndSelect
Key HOME
EndLoop
ClearClipboard
Clipboard 0
Top

The macro also has an improved detection of the beginning of a URL: it searches for http:// or https:// or ftp:// or ftps://.
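
The effect of the URL-only variant on one block of lines can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the macro: `urls_only` is a made-up name, the URL is taken to run to the end of the line, and Python's `sorted` compares by character code, which may not match UltraEdit's sort order in every detail.

```python
import re

# Spaces/tabs followed by http://, https://, ftp:// or ftps://, then the rest of the line.
URL = re.compile(r"[\t ]+((?:http|ftp)s?://.*)$")

def urls_only(block):
    """Return the URLs of a block of bookmark lines, sorted, without duplicates."""
    urls = set()
    for line in block:
        match = URL.search(line)
        if match:
            urls.add(match.group(1).rstrip())
    return sorted(urls)

# The three Bookmark00007_ lines from the example list share one URL.
block = [
    "Bookmark00007_000003 next page via cliboard           https://forums.ultraedit.com/copy-filtered-BOOKMARKs-to-clipboard-via-macro-t17765.html",
    "Bookmark00007_01 next page        https://forums.ultraedit.com/copy-filtered-BOOKMARKs-to-clipboard-via-macro-t17765.html",
    "Bookmark00007_2 previous page https://forums.ultraedit.com/copy-filtered-BOOKMARKs-to-clipboard-via-macro-t17765.html",
]

for url in urls_only(block):
    print(url)
```

For this block only a single URL remains, because all three lines point to the same page.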

To break a running macro, press and hold the ESC key.

Post #7, Jul 06, 2018 (Newbie)

@Mofi, thank you so much!

The latter version even seems to remove duplicate URLs, right? Which is, I think, very useful for managing a bookmark list.
Yet I like both versions!

Can you check it with the bookmark list txt file which I attached to this message?
It's kind of weird, because both your macros work fine and fast with the example above.
But in this case, they don't work properly.
Could you confirm whether it is just me or whether it also happens on your end?

If possible, please adjust both of them to this unusual case.
Attachment: UE_bookmark_list.txt (81.62 KiB)

Post #8, Jul 07, 2018 (Grand Master)

The first problem is caused by the empty line at line 213, which breaks up the lines starting with Bookmark9502_ into two blocks according to the Perl regular expression used in the macros to select lines starting with the same bookmark number string. Well, that could be easily solved by first deleting all empty lines before starting the loop.
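
The effect of such an empty line is easy to reproduce. The following Python sketch is only an illustration: the two bookmark lines are taken from the example list in the first post, the empty line between them is inserted on purpose, and the helper names are made up.

```python
import re
from itertools import groupby

PREFIX = re.compile(r"^(Bookmark\d+_)")

def key_of(line):
    """Return the 'Bookmark<number>_' prefix of a line, or None."""
    match = PREFIX.match(line)
    return match.group(1) if match else None

def count_blocks(lines):
    """Count runs of consecutive lines that share a bookmark prefix."""
    return sum(1 for key, _ in groupby(lines, key=key_of) if key is not None)

def drop_empty_lines(lines):
    """Remove lines that are empty or contain only whitespace."""
    return [line for line in lines if line.strip()]

lines = [
    "Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page1     http://forums.ultraedit.com/viewtopic.php?t=3609",
    "",  # empty line inserted for this illustration
    "Bookmark01_copy.filtered.Bookmarks to clipboard via macro.page5 http://forums.ultraedit.com/viewtopic.php?t=3609",
]

print(count_blocks(lines))                    # 2
print(count_blocks(drop_empty_lines(lines)))  # 1
```

With the empty line in place the same bookmark number is found as two separate blocks; after removing empty lines it is one block again.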

But the main problem is that UltraEdit v25.10.0.50 and also former versions like v22.20.0.49 select the wrong lines starting with Bookmark9501_ or Bookmark9502_, depending on the version of UltraEdit. I have just reported this bug by email to IDM support.

So I had to decide between two options: working around this bug as done many years ago with UltraEdit versions not supporting Perl regular expressions (which are what make it possible at all to select multiple lines starting with the same string), or writing an UltraEdit script for this task which does as much as possible in memory. I decided to write a script, as doing as much as possible in memory avoids UltraEdit window refreshes, so the task finishes in a shorter time than with a macro solution. And a script makes it possible to add better error handling and better user information.

The attached ZIP file contains the two commented UltraEdit scripts to split the active file with bookmarks into multiple files containing either the entire bookmark lines or just the URLs, sorted alphabetically and case-sensitively with duplicate lines removed. The two scripts are nearly identical; just a few lines are different.

Both scripts write the names of the saved files, with the full path of the active file (or no path if the active file is a new, unnamed file), into the output window and append a summary line. The output window is opened automatically if at least one file was created by the script. A double click on a file name in the output window opens the appropriate file.
Attachment: split_bookmarks_scripts.zip (3.11 KiB)
This ZIP file contains the two UltraEdit scripts to write the bookmarks into several text files.

Post #9, Jul 08, 2018 (Newbie)

@Mofi thank you! I really appreciate it.
You didn't have to go out of your way to make these scripts for this unusual case, because the two macros you made before still work in most cases :)
I could wait for the next update with bug fixes,
and I hope the next version of UE works fine with the macros you made.

By the way, I would suggest dividing the macro section by version.
I mean, I search and find some useful tips or macros in the forum, but most of them are not that compatible with the UE version I'm using.
Of course, advanced users won't have any problem, because they are able to modify or adjust them freely,
but newbies like me end up disappointed when they are out of date, not available, or not compatible with whatever version they have.
So how about having some sections like "before version 17xxxx" and "after version 25xxxx"?
Just my 2 cents :)