I am working with a large text file, and need to insert a line break after every character in the text. Unlike other text tools I have tried, UltraEdit is able to handle the large amount of text really quickly (pretty much everything else I have tried times out).
I can use (Perl) regex to find each character using either
Code:
(.)
or
Code:
(\X)
and replace with
Code:
$1\n
This inserts the line breaks, but UltraEdit loses any characters with diacritics in the original string, replacing them with replacement characters. So, for example, if my input string is
Code:
Union à Dieuchez Denys l'Aréopagite
the output of the find/replace operation is
Code:
U
n
i
o
n
�
�
D
i
e
u
c
h
e
z
D
e
n
y
s
l
'
A
r
�
�
o
p
a
g
i
t
e
The accented characters à and é are each being replaced by a sequence of two U+FFFD REPLACEMENT CHARACTER codes.
Is there a way to prevent this? I tested short strings like this in TextMate and Sublime Text, and they didn’t mess up the diacritics like this, but they can’t handle my large text file.
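For reference, two U+FFFD per accented letter is what you would expect if the line break were inserted after every UTF-8 byte instead of after every character, because à and é each occupy two bytes in UTF-8. The Python sketch below reproduces that symptom and shows a streaming, code-point-aware alternative. It is only an illustration under that assumption, not a description of UltraEdit's internals. The function names `split_bytewise` and `split_codepoints` are made up for this example.

```python
import io

# Sample from the post, with the accented letters written as escapes so
# they are guaranteed to be the precomposed forms U+00E0 and U+00E9.
SAMPLE = "Union \u00e0 Dieuchez Denys l'Ar\u00e9opagite"


def split_bytewise(text: str) -> str:
    """Put every UTF-8 byte on its own line, decoding each byte separately.

    Assumption: this mimics a tool that breaks the data per byte. A lone
    byte of a multi-byte sequence is invalid UTF-8, so it decodes to U+FFFD.
    """
    raw = text.encode("utf-8")
    return "\n".join(bytes([b]).decode("utf-8", errors="replace") for b in raw)


def split_codepoints(src, dst, chunk_chars: int = 1 << 16) -> None:
    """Stream text from src to dst, adding a newline after each code point.

    Existing newlines are copied unchanged, like a regex '.' that does
    not match line breaks.
    """
    while True:
        chunk = src.read(chunk_chars)  # reads characters, not bytes
        if not chunk:
            break
        dst.write("".join(ch if ch == "\n" else ch + "\n" for ch in chunk))


if __name__ == "__main__":
    # Two bytes each for U+00E0 and U+00E9, so this prints 4.
    print(split_bytewise(SAMPLE).count("\ufffd"))

    out = io.StringIO()
    split_codepoints(io.StringIO(SAMPLE), out)
    print(out.getvalue())
```

For a real file, `src` and `dst` would be file objects opened in text mode with `encoding="utf-8"`, so the input is read in chunks and never has to fit in memory at once. Note that this splits per code point, like `(.)`. If the text contains decomposed sequences (a base letter followed by a combining mark), a grapheme-aware match such as `\X` would be needed to keep them together.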