UTF-8 problems again *sigh*
I nearly went mad trying to convert a messed-up file from ASCII to UTF-8 again: I converted it to UTF-8, repaired the messed-up umlaut characters, and saved it. Just to make sure, I reopened the file and... it's back to ASCII, and the characters I had just entered are messed up again!
So I did the same again and again: always the same result. I tried the same thing with a small test file: all worked fine! So what is this, I thought?
Then I noticed that the first multibyte character in the problem file sits very far toward the end of the file. So I ran a test; try it yourself:
- create a new file, convert it to UTF-8 and enter an umlaut (e.g. ä), save and close
- reopen the file: it's still UTF-8. So far so good. Now enter at least 10 KB of ASCII characters before the umlaut (in my test file it's 11116 "-" characters), save and close
- reopen the file: it's ASCII again, and the umlaut is messed up!
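For anyone who wants to reproduce this without typing, here is a rough Python sketch of the steps above. It builds the same bytes as my test file and runs them through a made-up detector that only inspects the first part of the data. That detector is purely my guess at what might be going on, not the editor's actual code; `SCAN_LIMIT`, `build_test_bytes` and `guess_encoding` are names I invented for illustration.

```python
# Hypothetical: how many leading bytes a detector might inspect.
# The real editor's limit (if it has one) is unknown to me.
SCAN_LIMIT = 10 * 1024


def build_test_bytes(padding: int = 11116) -> bytes:
    """Return `padding` ASCII dashes followed by one umlaut, as UTF-8 without BOM."""
    umlaut = "\u00e4"  # the character "ä"
    return ("-" * padding + umlaut).encode("utf-8")


def guess_encoding(data: bytes, scan_limit: int = SCAN_LIMIT) -> str:
    """Guess the encoding by looking only at the first `scan_limit` bytes.

    This is an assumed heuristic, written to illustrate the symptom.
    """
    head = data[:scan_limit]
    if all(byte < 0x80 for byte in head):
        # No multibyte sequence seen in the inspected part: treated as ASCII.
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"
```

With no padding, `guess_encoding(build_test_bytes(0))` gives `"utf-8"`; with the 11116 dashes in front it gives `"ascii"`, which matches what I see. Writing `build_test_bytes()` to disk with `open(path, "wb")` produces a file you can open in the editor to check.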
Is this a convention/standard? Does this have to do with a setting? I have UTF-8 auto-detection turned on... so this rather looks like a bug to me.
negg